How to smooth a curve the right way?

Let's assume we have a dataset that is approximately given by

import numpy as np
x = np.linspace(0,2*np.pi,100)
y = np.sin(x) + np.random.random(100) * 0.2

So the dataset has a variation of about 20%. My first idea was to use the UnivariateSpline function of scipy, but the problem is that it does not handle the small noise well. If you consider the frequencies, the background is much smaller than the signal, so a spline of only the frequencies below a cutoff might be an idea, but that would involve a back-and-forth Fourier transformation, which might result in bad behaviour. Another way would be a moving average, but this would also need the right choice of the delay.

Any hints, books, or links on how to tackle this problem?

Current answer

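For a project of mine, I needed to create intervals for time-series modeling, and to make the procedure more efficient I created tsmoothie: a Python library for time-series smoothing and outlier detection in a vectorized way.

It provides different smoothing algorithms together with the possibility to compute intervals.

Here I use a ConvolutionSmoother, but you can also test the others.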

import numpy as np
import matplotlib.pyplot as plt
from tsmoothie.smoother import ConvolutionSmoother

x = np.linspace(0,2*np.pi,100)
y = np.sin(x) + np.random.random(100) * 0.2

# operate smoothing
smoother = ConvolutionSmoother(window_len=5, window_type='ones')
smoother.smooth(y)

# generate intervals
low, up = smoother.get_intervals('sigma_interval', n_sigma=2)

# plot the smoothed timeseries with intervals
plt.figure(figsize=(11,6))
plt.plot(smoother.smooth_data[0], linewidth=3, color='blue')
plt.plot(smoother.data[0], '.k')
plt.fill_between(range(len(smoother.data[0])), low[0], up[0], alpha=0.3)

I would also point out that tsmoothie can smooth multiple time series in a vectorized way.

2020-08-24 11:30:27

Other answers

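If you are interested in a "smooth" version of a signal that is periodic (like your example), then an FFT is the right way to go. Take the Fourier transform and remove the low-contributing frequencies: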

import numpy as np
import scipy.fftpack

N = 100
x = np.linspace(0,2*np.pi,N)
y = np.sin(x) + np.random.random(N) * 0.2

w = scipy.fftpack.rfft(y)
f = scipy.fftpack.rfftfreq(N, x[1]-x[0])
spectrum = w**2

cutoff_idx = spectrum < (spectrum.max()/5)
w2 = w.copy()
w2[cutoff_idx] = 0

y2 = scipy.fftpack.irfft(w2)

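Even if your signal is not completely periodic, this does a good job of removing white noise. There are many types of filters to use (high-pass, low-pass, and so on); the appropriate one depends on what you are looking for.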
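As an illustration of the low-pass case mentioned above, here is a minimal sketch that zeroes every component above a fixed frequency instead of thresholding on amplitude. The helper name `lowpass_fft`, the cutoff of 0.5 cycles per unit of x, and the fixed random seed are assumptions made for this example, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
N = 100
x = np.linspace(0, 2 * np.pi, N)
y = np.sin(x) + rng.random(N) * 0.2


def lowpass_fft(signal, sample_spacing, cutoff):
    """Zero every Fourier component whose frequency exceeds `cutoff`."""
    coeffs = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=sample_spacing)
    coeffs[freqs > cutoff] = 0
    return np.fft.irfft(coeffs, n=signal.size)


# The sine has a frequency of 1/(2*pi), about 0.16 cycles per unit of x.
# The cutoff of 0.5 is an assumed value that keeps it and drops the rest.
y_low = lowpass_fft(y, x[1] - x[0], cutoff=0.5)
```

A hard cutoff like this treats the data as one period of a periodic signal, so it can distort the ends of a curve whose first and last values differ noticeably.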

2013-12-16 19:24:44

A simple function in scipy.ndimage also works well for me:

from scipy.ndimage import uniform_filter1d

y_smooth = uniform_filter1d(y,size=15)
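To make clearer what that call computes, the sketch below rebuilds the question's data and checks one interior point by hand. `uniform_filter1d` is a centred moving average; its `mode` argument (default `'reflect'`) only decides how the signal is padded at the two ends. The fixed seed and the choice of index 50 are assumptions made for this example.

```python
import numpy as np
from scipy.ndimage import uniform_filter1d

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x) + rng.random(100) * 0.2

size = 15
half = size // 2

# Default boundary handling: the signal is reflected at both ends.
y_reflect = uniform_filter1d(y, size=size)
# Alternative: pad with the first/last sample instead.
y_nearest = uniform_filter1d(y, size=size, mode='nearest')

# Away from the edges, each output value is the plain mean of the
# `size` samples centred on it, whatever the boundary mode.
i = 50
window_mean = y[i - half : i + half + 1].mean()
```

The two results differ only within `half` samples of each end, which is where the choice of `mode` matters.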

2022-10-07 19:11:38

Another option is to use KernelReg in statsmodels:

from statsmodels.nonparametric.kernel_regression import KernelReg
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0,2*np.pi,100)
y = np.sin(x) + np.random.random(100) * 0.2

# The third parameter specifies the type of the variable x;
# 'c' stands for continuous
kr = KernelReg(y,x,'c')
plt.plot(x, y, '+')
y_pred, y_std = kr.fit(x)

plt.plot(x, y_pred)
plt.show()

2014-09-30 18:29:06

Fitting a moving average to your data would smooth out the noise; see this answer for how to do that.
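The linked answer is not reproduced here, so the following is only one common way to write a moving average, using `np.convolve`. The helper name `moving_average`, the window of 5 samples, and the fixed seed are assumptions made for this example.

```python
import numpy as np


def moving_average(values, window):
    """Unweighted moving average; returns len(values) - window + 1 points."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x) + rng.random(100) * 0.2

window = 5  # assumed value; larger windows smooth more but flatten peaks
y_avg = moving_average(y, window)

# mode="valid" drops (window - 1) points in total, so trim x to match
# the positions of the window centres.
x_avg = x[(window - 1) // 2 : len(x) - window // 2]
```

Using `mode="valid"` avoids the edge artefacts that zero padding would introduce, at the cost of a slightly shorter output.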

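If you would like to use LOWESS to fit your data (it is similar to a moving average but more sophisticated), you can do that with the statsmodels library: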

import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

x = np.linspace(0,2*np.pi,100)
y = np.sin(x) + np.random.random(100) * 0.2
lowess = sm.nonparametric.lowess(y, x, frac=0.1)

plt.plot(x, y, '+')
plt.plot(lowess[:, 0], lowess[:, 1])
plt.show()

Finally, if you know the functional form of your signal, you could fit a curve to your data, which would probably be the best thing to do.
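For the example data, such a fit could be sketched with `scipy.optimize.curve_fit`. The model below (a sine of known frequency with free amplitude, phase, and offset), the starting values in `p0`, and the fixed seed are assumptions made for this illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def model(x, amplitude, phase, offset):
    """Assumed functional form: a sine of known frequency plus an offset."""
    return amplitude * np.sin(x + phase) + offset


rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x) + rng.random(100) * 0.2

# p0 is a rough initial guess for (amplitude, phase, offset).
params, covariance = curve_fit(model, x, y, p0=[1.0, 0.0, 0.0])
amplitude, phase, offset = params

# The fitted curve is smooth by construction.
y_fit = model(x, *params)
```

Because `np.random.random` draws from [0, 1), the noise in the example has a mean of about 0.1 rather than zero, so the fitted offset should come out near 0.1.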

2013-12-16 19:36:27
