
Interpolating missing temperature data in Python

Python

阿晨1998 2023-07-05 16:11:42

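I have monthly temperature data for several stations in eastern Siberia. However, one station that is essential to my work is missing a large amount of data, while other nearby stations have good coverage. Is there a way to fill in the missing data based on the behaviour of another dataset? I can't provide any code because I don't know where to start. The dataset looks like this: the red dots are data from the station with missing values, and the green line is data from a station with good coverage. I'd be grateful if someone could point me in the right direction.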


2 Answers

慕工程0101907


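There are ways to do this, for example applying an FFT to the well-covered dataset and checking how well it fits the poorly covered dataset once the high-frequency terms are removed.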
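A minimal sketch of that FFT idea, under stated assumptions: the data below is synthetic (10 years of monthly values with an annual cycle), the helper name `lowpass_fft` and the cutoff `keep=15` are illustrative choices, and the low-pass-filtered reference is mapped onto the sparse station with a simple linear fit on the months where the sparse station has observations.

```python
import numpy as np

def lowpass_fft(values, keep):
    """Keep only the `keep` lowest-frequency terms of an evenly sampled real series."""
    spectrum = np.fft.rfft(values)
    spectrum[keep:] = 0.0                      # drop the high-frequency terms
    return np.fft.irfft(spectrum, n=len(values))

# Synthetic stand-ins (assumption): 120 months, annual cycle plus noise.
rng = np.random.default_rng(0)
months = np.arange(120)
reference = -5.0 + 20.0 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 1.0, months.size)
sparse = reference + 2.0 + rng.normal(0, 0.5, months.size)   # same signal, offset by 2
observed = rng.random(months.size) < 0.3                     # only ~30% of months observed
sparse_obs = np.where(observed, sparse, np.nan)

# Smooth the well-covered series; the annual cycle sits at frequency index 10, below the cutoff.
smooth = lowpass_fft(reference, keep=15)

# Fit sparse ≈ a * smooth + b using only the observed months.
a, b = np.polyfit(smooth[observed], sparse_obs[observed], deg=1)
estimate = a * smooth + b

# Check the fit against the observations that do exist.
rmse = np.sqrt(np.mean((estimate[observed] - sparse_obs[observed]) ** 2))
print(f"a={a:.3f}, b={b:.3f}, rmse={rmse:.3f}")
```

The cutoff controls how much of the reference station's short-term variability is carried over; a cutoff that removes the annual cycle itself would make the fit meaningless.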

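However, I very much doubt that this would be useful: the dataset with high coverage already fits the dataset with low coverage almost perfectly. Whatever method you apply, the best function that resembles the high-coverage dataset while also fitting the low-coverage dataset is the high-coverage dataset itself.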
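One way to check that claim on your own data is to measure how closely the two stations agree where both have values, and then fill the gaps directly from the well-covered station. The sketch below uses synthetic series; the names `good` and `gappy` are hypothetical, and shifting the reference by the mean difference on the overlap is an added assumption, not something stated in the answer.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins (assumption): monthly data for two nearby stations.
rng = np.random.default_rng(1)
idx = pd.date_range("2000-01-01", periods=120, freq="MS")
seasonal = 20.0 * np.sin(2 * np.pi * np.arange(idx.size) / 12)
good = pd.Series(-5.0 + seasonal + rng.normal(0, 1.0, idx.size), index=idx)
gappy = good + 1.5 + rng.normal(0, 0.5, idx.size)
gappy[rng.random(idx.size) < 0.6] = np.nan        # drop ~60% of the values

# Agreement on the months where the gappy station has data.
both = gappy.notna()
corr = gappy[both].corr(good[both])               # Pearson correlation on the overlap
offset = (gappy[both] - good[both]).mean()        # mean difference between the stations

# Fill the gaps with the reference series shifted by that mean difference.
filled = gappy.fillna(good + offset)
print(f"correlation={corr:.3f}, offset={offset:.2f}")
```

If the correlation on the overlap is close to 1, a more elaborate method is unlikely to add much beyond this.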

2023-07-05

斯蒂芬大帝


Let's create a trial dataset to work on your problem:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# 30 sinusoidal cycles sampled 48 times each (one cycle per day at 30-minute steps):
t = np.linspace(0, 30*2*np.pi, 30*24*2)
# '30min' replaces the deprecated '30T' alias:
td = pd.date_range("2020-01-01", freq='30min', periods=t.size)

# Four correlated "stations" with different amplitudes, offsets and noise levels:
T0 = np.sin(t)*8 - 15 + np.random.randn(t.size)*0.2
T1 = np.sin(t)*7 - 13 + np.random.randn(t.size)*0.1
T2 = np.sin(t)*9 - 10 + np.random.randn(t.size)*0.3
T3 = np.sin(t)*8.5 - 11 + np.random.randn(t.size)*0.5

T = np.vstack([T0, T1, T2, T3]).T
features = pd.DataFrame(T, columns=["s1", "s2", "s3", "s4"], index=td)

It looks like:

axe = features[:"2020-01-04"].plot()
axe.legend()
axe.grid()

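Then, if your time series are well linearly correlated, you can simply predict the missing values by means of Ordinary Least Squares regression. SciKit-Learn provides a handy interface for this kind of computation: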

from sklearn import linear_model
from sklearn.model_selection import train_test_split

# Remove target site from features:
target = features.pop("s4")

# Split dataset into train (actual data) and test (missing temperatures):
x_train, x_test, y_train, y_test = train_test_split(features, target, train_size=0.25, random_state=123)

# Create a Linear Regressor and train it:
reg = linear_model.LinearRegression()
reg.fit(x_train, y_train)

# Assess regression score (R^2) with test data
# (the exact value varies between runs because the noise above is unseeded):
reg.score(x_test, y_test) # 0.9926150729585087

# Predict missing values:
ypred = reg.predict(x_test)
ypred = pd.DataFrame(ypred, index=x_test.index, columns=["s4p"])

The result looks like this:

axe = features[:"2020-01-04"].plot()
target[:"2020-01-04"].plot(ax=axe)
ypred[:"2020-01-04"].plot(ax=axe, linestyle='None', marker='.')
axe.legend()
axe.grid()

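# Prediction error on the held-out points; sort by time because
# train_test_split shuffles the rows:
error = (y_test - ypred.squeeze()).sort_index()
axe = error.plot()
axe.legend(["Prediction Error"])
axe.grid()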
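With real station data there is nothing to split at random: the rows where the target station has a value form the training set, and the rows where it is NaN are the ones to predict. A minimal sketch of that variant follows, assuming a DataFrame with one column per station in which the neighbouring stations have no gaps of their own; the column names and the synthetic data are placeholders, and `truth` exists only so the sketch can check itself.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in (assumption): monthly data for four stations.
rng = np.random.default_rng(2)
idx = pd.date_range("2000-01-01", periods=120, freq="MS")
cycle = np.sin(2 * np.pi * np.arange(idx.size) / 12)
df = pd.DataFrame({
    "s1": 8.0 * cycle - 15 + rng.normal(0, 0.2, idx.size),
    "s2": 7.0 * cycle - 13 + rng.normal(0, 0.1, idx.size),
    "s3": 9.0 * cycle - 10 + rng.normal(0, 0.3, idx.size),
    "s4": 8.5 * cycle - 11 + rng.normal(0, 0.5, idx.size),
}, index=idx)
truth = df["s4"].copy()                      # kept only to evaluate the sketch
missing = rng.random(idx.size) < 0.6
df.loc[missing, "s4"] = np.nan               # simulate the gaps at the target station

neighbours = ["s1", "s2", "s3"]
known = df["s4"].notna()

# Train on the rows where the target station has data.
reg = LinearRegression().fit(df.loc[known, neighbours], df.loc[known, "s4"])

# Predict only the missing rows; observed values are left untouched.
df["s4_filled"] = df["s4"]
df.loc[~known, "s4_filled"] = reg.predict(df.loc[~known, neighbours])

# Error on the filled rows (only possible here because the data is synthetic).
rmse = np.sqrt(((df.loc[~known, "s4_filled"] - truth[~known]) ** 2).mean())
print(f"filled {int((~known).sum())} values, rmse={rmse:.3f}")
```

On real data the error on the gaps is unknown, so it is worth holding back some observed values, as the `train_test_split` step above does, to estimate how well the regression generalises.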

2023-07-05
