Plotting a fast Fourier transform in Python


Best answer

So I run a functionally equivalent form of your code in an IPython notebook:

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack


# Number of samplepoints
N = 600
# sample spacing
T = 1.0 / 800.0
x = np.linspace(0.0, N*T, N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = scipy.fftpack.fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N//2)


fig, ax = plt.subplots()
ax.plot(xf, 2.0/N * np.abs(yf[:N//2]))
plt.show()

I get what I believe to be very reasonable output.


It's been longer than I care to admit since I was in engineering school thinking about signal processing, but spikes at 50 and 80 are exactly what I would expect. So what's the issue?

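In response to the raw data and comments being posted

The problem here is that you don't have periodic data. You should always inspect the data that you feed into any algorithm to make sure that it's appropriate.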

import pandas
import matplotlib.pyplot as plt
#import seaborn
%matplotlib inline


# the OP's data
x = pandas.read_csv('http://pastebin.com/raw.php?i=ksM4FvZS', skiprows=2, header=None).values
y = pandas.read_csv('http://pastebin.com/raw.php?i=0WhjjMkb', skiprows=2, header=None).values
fig, ax = plt.subplots()
ax.plot(x, y)


The important thing about the FFT is that it can only be applied to data in which the timestamps are uniform (i.e., uniform sampling in time, like what you have shown above).

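In case of non-uniform sampling, fit a function to the data. There are several tutorials and functions to choose from:

https://github.com/tiagopereira/python_tips/wiki/Scipy%3A-curve-fitting
http://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html

If fitting is not an option, you can directly use some form of interpolation to bring the data onto a uniform sampling:

https://docs.scipy.org/doc/scipy-0.14.0/reference/tutorial/interpolate.html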
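As a rough illustration of the interpolation route, here is a minimal sketch that uses piecewise-linear interpolation (np.interp) to move unevenly sampled data onto a uniform grid before taking the FFT. The 5 Hz test signal, the random sample times, and the variable names are assumptions made up for this example; they are not taken from the question's data.

```python
import numpy as np

# Hypothetical unevenly sampled signal: a 5 Hz sine observed at random times.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 1.0, 400))
y = np.sin(2.0 * np.pi * 5.0 * t)

# Uniform grid over the same time span; endpoint=False keeps the
# spacing equal to span / n, which is what the FFT assumes.
n = t.size
t_uniform = np.linspace(t[0], t[-1], n, endpoint=False)
y_uniform = np.interp(t_uniform, t, y)  # piecewise-linear interpolation

# FFT of the uniformly resampled data.
dt = t_uniform[1] - t_uniform[0]
Y = np.fft.fft(y_uniform)
freq = np.fft.fftfreq(n, d=dt)

# Strongest component among the non-negative frequencies.
half = n // 2
peak_freq = freq[np.argmax(np.abs(Y[:half]))]
print(peak_freq)
```

Linear interpolation is only one possible choice; it acts as a mild low-pass filter, so a smoother interpolant may be preferable when the signal has content close to the Nyquist frequency.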

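Once you have uniform samples, you only have to worry about the time delta (t[1] - t[0]) of your samples. In this case, you can directly use the fft functions: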

import numpy
import pylab

# y is the uniformly sampled signal and t the matching time stamps
Y    = numpy.fft.fft(y)
freq = numpy.fft.fftfreq(len(y), t[1] - t[0])


pylab.figure()
pylab.plot( freq, numpy.abs(Y) )
pylab.figure()
pylab.plot(freq, numpy.angle(Y) )
pylab.show()

This should solve your problem.


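The high spike that you have is due to the DC (non-varying, i.e. freq = 0) portion of your signal. It's an issue of scale. If you want to see the non-DC frequency content, for visualization you may need to plot from offset 1 of the FFT of the signal rather than from offset 0.

Modifying the example given above by @PaulH: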

import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack


# Number of samplepoints
N = 600
# sample spacing
T = 1.0 / 800.0
x = np.linspace(0.0, N*T, N)
y = 10 + np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = scipy.fftpack.fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N//2)


plt.subplot(2, 1, 1)
plt.plot(xf, 2.0/N * np.abs(yf[0:N//2]))
plt.subplot(2, 1, 2)
plt.plot(xf[1:], 2.0/N * np.abs(yf[0:N//2])[1:])

The output plots: the FFT signal plotted with the DC component, and then with it removed (skipping freq = 0).

Another way is to visualize the data on a log scale.

Using:

plt.semilogy(xf, 2.0/N * np.abs(yf[0:N//2]))

will show the same spectrum on a logarithmic y-axis.


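Just as a complement to the answers already given, I would like to point out that it is often important to play with the number of bins used for the FFT. It makes sense to test a range of values and pick the one that suits your application best. Often it is of the same order of magnitude as the number of samples. Most of the answers here assume exactly that, and it produces good and reasonable results. In case one wants to explore this, here is my version of the code: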

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack


fig = plt.figure(figsize=[14,4])
N = 600           # Number of samplepoints
Fs = 800.0
T = 1.0 / Fs      # Sample spacing (sample period); N*T is the total duration.
N_fft = 80        # Number of bins (chooses granularity)
x = np.linspace(0, N*T, N)     # the interval
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)   # the signal


# removing the mean of the signal
mean_removed = np.ones_like(y)*np.mean(y)
y = y - mean_removed


# Compute the fft.
yf = scipy.fftpack.fft(y,n=N_fft)
xf = np.arange(0,Fs,Fs/N_fft)


##### Plot the fft #####
ax = plt.subplot(121)
pt, = ax.plot(xf,np.abs(yf), lw=2.0, c='b')
p = plt.Rectangle((Fs/2, 0), Fs/2, ax.get_ylim()[1], facecolor="grey", fill=True, alpha=0.75, hatch="/", zorder=3)
ax.add_patch(p)
ax.set_xlim((ax.get_xlim()[0],Fs))
ax.set_title('FFT', fontsize= 16, fontweight="bold")
ax.set_ylabel('FFT magnitude (power)')
ax.set_xlabel('Frequency (Hz)')
plt.legend((p,), ('mirrored',))
ax.grid()


##### Close up on the graph of fft#######
# This is the same plot as above, but truncated at the max frequency + an offset.
offset = 1    # just to help the visualization. Nothing important.
ax2 = fig.add_subplot(122)
ax2.plot(xf,np.abs(yf), lw=2.0, c='b')
ax2.set_xticks(xf)
ax2.set_xlim(-1,int(Fs/6)+offset)
ax2.set_title('FFT close-up', fontsize= 16, fontweight="bold")
ax2.set_ylabel('FFT magnitude (power) - log')
ax2.set_xlabel('Frequency (Hz)')
ax2.grid()


plt.yscale('log')

the output plots:


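There are already great solutions on this page, but all of them assume that the dataset is uniformly sampled. I will try to provide a more general example with randomly sampled data. I will also use this MATLAB tutorial as an example:

Adding the required modules: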

import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack
import scipy.signal

Generating sample data:

N = 600 # Number of samples
t = np.random.uniform(0.0, 1.0, N) # Assuming the time start is 0.0 and time end is 1.0
S = 1.0 * np.sin(50.0 * 2 * np.pi * t) + 0.5 * np.sin(80.0 * 2 * np.pi * t)
X = S + 0.01 * np.random.randn(N) # Adding noise

Sorting the data set:

order = np.argsort(t)
ts = np.array(t)[order]
Xs = np.array(X)[order]

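Resampling:

T = (t.max() - t.min()) / N # Average period
Fs = 1 / T # Average sample rate frequency
f = Fs * np.arange(0, N // 2 + 1) / N # Resampled frequency vector
# Note: scipy.signal.resample treats `ts` as equally spaced positions, which only holds approximately here
X_new, t_new = scipy.signal.resample(Xs, N, ts)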

Plotting the data and the resampled data:

plt.xlim(0, 0.1)
plt.plot(t_new, X_new, label="resampled")
plt.plot(ts, Xs, label="org")
plt.legend()
plt.ylabel("X")
plt.xlabel("t")

Now calculating the FFT:

Y = scipy.fftpack.fft(X_new)
P2 = np.abs(Y / N)
P1 = P2[0 : N // 2 + 1]
P1[1 : -1] = 2 * P1[1 : -1]  # double every bin except DC and Nyquist


plt.ylabel("Y")
plt.xlabel("f")
plt.plot(f, P1)

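I finally got time to implement a more canonical algorithm for computing the Fourier transform of unevenly distributed data. The code, a description, and an example Jupyter notebook are available.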
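The linked implementation is not reproduced on this page, so the following is only a sketch of one simple way to handle unevenly spaced samples, and not necessarily the algorithm referred to above: evaluate the Fourier sum directly at the actual sample times for a grid of trial frequencies. The trial-frequency grid, the seed, and the variable names are assumptions chosen for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 600
t = rng.uniform(0.0, 1.0, N)  # random sample times in [0, 1)
x = np.sin(50.0 * 2 * np.pi * t) + 0.5 * np.sin(80.0 * 2 * np.pi * t)

# Trial frequencies in Hz (an assumed grid, not dictated by the data).
freqs = np.arange(1.0, 150.0, 0.5)

# Direct Fourier sum at the true sample times:
# one row per trial frequency, one column per sample.
kernel = np.exp(-2j * np.pi * freqs[:, None] * t[None, :])
amplitude = 2.0 / N * np.abs(kernel @ x)  # approximate single-sided amplitude

peak_freq = freqs[np.argmax(amplitude)]
print(peak_freq)
```

This costs O(N * number of frequencies) instead of O(N log N), and with random sampling the amplitudes are only approximate, but no resampling step is needed.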


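I've built a function that deals with plotting the FFT of real signals. The extra bonus of my function relative to the previous answers is that you get the actual amplitude of the signal.

Also, because a real signal is assumed, the FFT is symmetric, so we can plot only the positive side of the x-axis: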

import matplotlib.pyplot as plt
import numpy as np
import warnings


def fftPlot(sig, dt=None, plot=True):
    # Here it's assumes analytic signal (real signal...) - so only half of the axis is required

    if dt is None:
        dt = 1
        t = np.arange(0, sig.shape[-1])
        xLabel = 'samples'
    else:
        t = np.arange(0, sig.shape[-1]) * dt
        xLabel = 'freq [Hz]'

    if sig.shape[0] % 2 != 0:
        warnings.warn("signal preferred to be even in size, autoFixing it...")
        t = t[0:-1]
        sig = sig[0:-1]

    sigFFT = np.fft.fft(sig) / t.shape[0]  # Divided by size t for coherent magnitude

    freq = np.fft.fftfreq(t.shape[0], d=dt)

    # Plot analytic signal - right half of frequence axis needed only...
    firstNegInd = np.argmax(freq < 0)
    freqAxisPos = freq[0:firstNegInd]
    sigFFTPos = 2 * sigFFT[0:firstNegInd]  # *2 because of magnitude of analytic signal

    if plot:
        plt.figure()
        plt.plot(freqAxisPos, np.abs(sigFFTPos))
        plt.xlabel(xLabel)
        plt.ylabel('mag')
        plt.title('Analytic FFT plot')
        plt.show()

    return sigFFTPos, freqAxisPos


if __name__ == "__main__":
    dt = 1 / 1000

    # Build a signal within Nyquist - the result will be the positive FFT with actual magnitude
    f0 = 200  # [Hz]
    t = np.arange(0, 1 + dt, dt)
    sig = (
        1 * np.sin(2 * np.pi * f0 * t)
        + 10 * np.sin(2 * np.pi * f0 / 2 * t)
        + 3 * np.sin(2 * np.pi * f0 / 4 * t)
        + 10 * np.sin(2 * np.pi * (f0 * 2 + 0.5) * t)  # <--- not sampled on grid so the peak will not be actual height
    )

    # Result in frequencies
    fftPlot(sig, dt=dt)
    # Result in samples (if the frequencies axis is unknown)
    fftPlot(sig)
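For comparison, NumPy also provides np.fft.rfft and np.fft.rfftfreq, which return only the non-negative-frequency half for real input. The sketch below is an alternative to the function above, not a rewrite of it; the two-tone test signal is an assumption picked so that both tones fall exactly on FFT bins.

```python
import numpy as np

dt = 1 / 1000
t = np.arange(1000) * dt  # 1000 samples, exactly 1 s of data
sig = 10 * np.sin(2 * np.pi * 100 * t) + 3 * np.sin(2 * np.pi * 50 * t)

spec = np.fft.rfft(sig) / sig.size      # non-negative frequencies only
freq = np.fft.rfftfreq(sig.size, d=dt)  # 0, 1, ..., 500 Hz

amp = np.abs(spec)
amp[1:-1] *= 2  # double every bin except DC and Nyquist (even-length input)

print(freq[100], amp[100])
print(freq[50], amp[50])
```

Because both tones complete an integer number of periods in the record, the amplitudes at the 100 Hz and 50 Hz bins come out very close to the 10 and 3 used to build the signal.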


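I am writing this additional answer to explain where the diffusion of the spikes comes from when using the FFT, and in particular to discuss the scipy.fftpack tutorial, with which I disagree on some points.

In this example, the recording time is tmax=N*T=0.75. The signal is sin(50*2*pi*x) + 0.5*sin(80*2*pi*x). The frequency spectrum should contain two spikes at frequencies 50 and 80 with amplitudes 1 and 0.5. However, if the analysed signal does not contain an integer number of periods, diffusion can appear because the signal is truncated:

Spike 1: 50*tmax=37.5 => frequency 50 is not a multiple of 1/tmax => diffusion is present because the signal is truncated at this frequency.

Spike 2: 80*tmax=60 => frequency 80 is a multiple of 1/tmax => no diffusion due to signal truncation at this frequency.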
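The arithmetic above can be checked numerically. The following is a small sketch, with a hypothetical helper name (peak_amplitude), that measures the height of the largest FFT bin for a single unit-amplitude sine: when the record holds an integer number of periods the peak reaches 1, otherwise the energy spreads over neighbouring bins and the peak is lower.

```python
import numpy as np


def peak_amplitude(f, tmax, N=600):
    """Largest single-sided FFT amplitude of a unit sine of frequency f.

    Hypothetical helper for illustration; samples are taken at
    t = 0, T, ..., (N-1)*T with T = tmax / N.
    """
    T = tmax / N
    x = T * np.arange(N)
    y = np.sin(2.0 * np.pi * f * x)
    return (2.0 / N * np.abs(np.fft.fft(y)[:N // 2])).max()


for f in (50.0, 80.0):
    for tmax in (0.75, 1.0):
        periods = f * tmax  # number of periods contained in the record
        print(f, tmax, periods, peak_amplitude(f, tmax))
```

Only the combination f=50, tmax=0.75 (37.5 periods) gives a peak clearly below 1; the other three combinations contain whole numbers of periods.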

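Here is code that analyses the same signal as in the tutorial (sin(50*2*pi*x) + 0.5*sin(80*2*pi*x)), but with slight differences:

1. The original scipy.fftpack example.

2. The original scipy.fftpack example with an integer number of signal periods (tmax=1.0 instead of 0.75, to avoid truncation diffusion).

3. The original scipy.fftpack example with an integer number of signal periods, and with the dates and frequencies taken from FFT theory.

The code: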

import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

# 1. Linspace
N = 600
# Sample spacing
tmax = 3/4
T = tmax / N  # =1.0 / 800.0
x1 = np.linspace(0.0, N*T, N)
y1 = np.sin(50.0 * 2.0*np.pi*x1) + 0.5*np.sin(80.0 * 2.0*np.pi*x1)
yf1 = scipy.fftpack.fft(y1)
xf1 = np.linspace(0.0, 1.0/(2.0*T), N//2)

# 2. Integer number of periods
tmax = 1
T = tmax / N  # Sample spacing
x2 = np.linspace(0.0, N*T, N)
y2 = np.sin(50.0 * 2.0*np.pi*x2) + 0.5*np.sin(80.0 * 2.0*np.pi*x2)
yf2 = scipy.fftpack.fft(y2)
xf2 = np.linspace(0.0, 1.0/(2.0*T), N//2)

# 3. Correct positioning of dates relatively to FFT theory ('arange' instead of 'linspace')
tmax = 1
T = tmax / N  # Sample spacing
x3 = T * np.arange(N)
y3 = np.sin(50.0 * 2.0*np.pi*x3) + 0.5*np.sin(80.0 * 2.0*np.pi*x3)
yf3 = scipy.fftpack.fft(y3)
xf3 = 1/(N*T) * np.arange(N)[:N//2]

fig, ax = plt.subplots()
# Plotting only the left part of the spectrum to not show aliasing
ax.plot(xf1, 2.0/N * np.abs(yf1[:N//2]), label='fftpack tutorial')
ax.plot(xf2, 2.0/N * np.abs(yf2[:N//2]), label='Integer number of periods')
ax.plot(xf3, 2.0/N * np.abs(yf3[:N//2]), label='Correct positioning of dates')
plt.legend()
plt.grid()
plt.show()

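Output:

As can be seen here, even with an integer number of periods, some diffusion still remains. This behaviour is due to a bad positioning of dates and frequencies in the scipy.fftpack tutorial. Hence, in the theory of discrete Fourier transforms: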

the signal should be evaluated at dates t=0,T,...,(N-1)*T where T is the sampling period and the total duration of the signal is tmax=N*T. Note that we stop at tmax-T.

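the associated frequencies are f=0,df,...,(N-1)*df where df=1/tmax=1/(N*T) is the frequency step (the sampling interval in the frequency domain). All harmonics of the signal should be multiples of this frequency step to avoid diffusion.

In the example above, you can see that using arange instead of linspace avoids the additional diffusion in the frequency spectrum. Moreover, the linspace version also shifts the spikes to slightly higher frequencies than they should have, as can be seen in the first picture, where the spikes sit a little to the right of the frequencies 50 and 80.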
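To make the difference between the two grids concrete, here is a short sketch that compares the time and frequency axes built with linspace and with arange, using the N and T of the tutorial. The variable names and the comparison against np.fft.fftfreq are additions for illustration.

```python
import numpy as np

N = 600
T = 1.0 / 800.0

# Time axes
x_lin = np.linspace(0.0, N * T, N)  # includes the endpoint N*T
x_ar = T * np.arange(N)             # stops at (N-1)*T

dt_lin = x_lin[1] - x_lin[0]        # N*T/(N-1), slightly larger than T
dt_ar = x_ar[1] - x_ar[0]           # exactly T up to rounding

# Frequency axes
xf_lin = np.linspace(0.0, 1.0 / (2.0 * T), N // 2)  # ends exactly at 1/(2T)
xf_ar = np.arange(N // 2) / (N * T)                 # step df = 1/(N*T)
xf_ref = np.fft.fftfreq(N, d=T)[:N // 2]            # NumPy's own bin frequencies

print(dt_lin, dt_ar)
print(xf_lin[-1], xf_ar[-1], xf_ref[-1])
```

The arange-based frequency axis matches np.fft.fftfreq, while the linspace-based one is stretched so that its last point lands on 1/(2T). Passing endpoint=False to np.linspace would be another way to get the arange spacing.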

I'll just conclude that the usage example should be replaced by the following code (which is less misleading in my opinion):

import numpy as np
from scipy.fftpack import fft

# Number of sample points
N = 600
T = 1.0 / 800.0
x = T*np.arange(N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = fft(y)
xf = 1/(N*T)*np.arange(N//2)
import matplotlib.pyplot as plt
plt.plot(xf, 2.0/N * np.abs(yf[0:N//2]))
plt.grid()
plt.show()

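Output (the second spike is not diffused anymore):

I think this answer still adds some explanation of how to apply the discrete Fourier transform correctly. Obviously, my answer is too long and there is always more to say (ewerlopes briefly mentioned aliasing, for instance, and a lot could be said about windowing), so I'll stop here.

I think it is very important to understand the principles of the discrete Fourier transform deeply when applying it, because we all know so many people who add factors here and there in order to obtain what they want.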