亚洲在线久爱草,狠狠天天香蕉网,天天搞日日干久草,伊人亚洲日本欧美

為了賬號安全,請及時綁定郵箱和手機立即綁定
已解決430363個問題,去搜搜看,總會有你想問的

如何在時域計算音高基頻 f( 0) )?

如何在時域計算音高基頻 f( 0) )?

qq_花開花謝_0 2022-11-01 14:04:55
我是 DSP 的新手,試圖為f(0)音頻文件的每個分段幀計算基頻( )。F0估計的方法可以分為三類:基于信號時域的時間動態;基于頻率結構頻域,以及混合方法。大多數示例都是基于頻率結構頻域估計基頻,我正在尋找基于信號時域的時間動態。本文提供了一些信息,但我仍然不清楚如何在時域中計算它?https://gist.github.com/endolith/255291這是我發現的到目前為止使用的代碼:def freq_from_autocorr(sig, fs):    """    Estimate frequency using autocorrelation    """    # Calculate autocorrelation and throw away the negative lags    corr = correlate(sig, sig, mode='full')    corr = corr[len(corr)//2:]    # Find the first low point    d = diff(corr)    start = nonzero(d > 0)[0][0]    # Find the next peak after the low point (other than 0 lag).  This bit is    # not reliable for long signals, due to the desired peak occurring between    # samples, and other peaks appearing higher.    # Should use a weighting function to de-emphasize the peaks at longer lags.    peak = argmax(corr[start:]) + start    px, py = parabolic(corr, peak)    return fs / px如何在時域進行估計?提前致謝!
查看完整描述

1 回答

?
慕容3067478

TA貢獻1773條經驗 獲得超3個贊

這是一個正確的實現。不是很健壯,但肯定有效。為了驗證這一點,我們可以生成一個已知頻率的信號,看看我們會得到什么結果:


import numpy as np

from scipy.io import wavfile

from scipy.signal import correlate, fftconvolve

from scipy.interpolate import interp1d


fs = 44100

frequency = 440

length = 0.01 # in seconds


t = np.linspace(0, length, int(fs * length)) 

y = np.sin(frequency * 2 * np.pi * t)


def parabolic(f, x):

    xv = 1/2. * (f[x-1] - f[x+1]) / (f[x-1] - 2 * f[x] + f[x+1]) + x

    yv = f[x] - 1/4. * (f[x-1] - f[x+1]) * (xv - x)

    return (xv, yv)


def freq_from_autocorr(sig, fs):

    """

    Estimate frequency using autocorrelation

    """

    corr = correlate(sig, sig, mode='full')

    corr = corr[len(corr)//2:]

    d = np.diff(corr)

    start = np.nonzero(d > 0)[0][0]

    peak = np.argmax(corr[start:]) + start

    px, py = parabolic(corr, peak)


    return fs / px

結果

運行freq_from_autocorr(y, fs)得到我們~442.014 Hz,大約 0.45% 的錯誤。


獎金 - 我們可以改進

我們可以通過稍微多一點的編碼使它更精確和健壯:


def indexes(y, thres=0.3, min_dist=1, thres_abs=False):

    """Peak detection routine borrowed from 

    https://bitbucket.org/lucashnegri/peakutils/src/master/peakutils/peak.py

    """

    if isinstance(y, np.ndarray) and np.issubdtype(y.dtype, np.unsignedinteger):

        raise ValueError("y must be signed")


    if not thres_abs:

        thres = thres * (np.max(y) - np.min(y)) + np.min(y)


    min_dist = int(min_dist)


    # compute first order difference

    dy = np.diff(y)


    # propagate left and right values successively to fill all plateau pixels (0-value)

    zeros, = np.where(dy == 0)


    # check if the signal is totally flat

    if len(zeros) == len(y) - 1:

        return np.array([])


    if len(zeros):

        # compute first order difference of zero indexes

        zeros_diff = np.diff(zeros)

        # check when zeros are not chained together

        zeros_diff_not_one, = np.add(np.where(zeros_diff != 1), 1)

        # make an array of the chained zero indexes

        zero_plateaus = np.split(zeros, zeros_diff_not_one)


        # fix if leftmost value in dy is zero

        if zero_plateaus[0][0] == 0:

            dy[zero_plateaus[0]] = dy[zero_plateaus[0][-1] + 1]

            zero_plateaus.pop(0)


        # fix if rightmost value of dy is zero

        if len(zero_plateaus) and zero_plateaus[-1][-1] == len(dy) - 1:

            dy[zero_plateaus[-1]] = dy[zero_plateaus[-1][0] - 1]

            zero_plateaus.pop(-1)


        # for each chain of zero indexes

        for plateau in zero_plateaus:

            median = np.median(plateau)

            # set leftmost values to leftmost non zero values

            dy[plateau[plateau < median]] = dy[plateau[0] - 1]

            # set rightmost and middle values to rightmost non zero values

            dy[plateau[plateau >= median]] = dy[plateau[-1] + 1]


    # find the peaks by using the first order difference

    peaks = np.where(

        (np.hstack([dy, 0.0]) < 0.0)

        & (np.hstack([0.0, dy]) > 0.0)

        & (np.greater(y, thres))

    )[0]


    # handle multiple peaks, respecting the minimum distance

    if peaks.size > 1 and min_dist > 1:

        highest = peaks[np.argsort(y[peaks])][::-1]

        rem = np.ones(y.size, dtype=bool)

        rem[peaks] = False


        for peak in highest:

            if not rem[peak]:

                sl = slice(max(0, peak - min_dist), peak + min_dist + 1)

                rem[sl] = True

                rem[peak] = False


        peaks = np.arange(y.size)[~rem]


    return peaks


def freq_from_autocorr_improved(signal, fs):

    signal -= np.mean(signal)  # Remove DC offset

    corr = fftconvolve(signal, signal[::-1], mode='full')

    corr = corr[len(corr)//2:]


    # Find the first peak on the left

    i_peak = indexes(corr, thres=0.8, min_dist=5)[0]

    i_interp = parabolic(corr, i_peak)[0]


    return fs / i_interp, corr, i_interp


運行freq_from_autocorr_improved(y, fs)產量~441.825 Hz,大約 0.41% 的誤差。這種方法在更復雜的情況下會表現得更好,并且需要兩倍的時間來計算。


通過更長的采樣時間(即設置length為例如 0.1 秒),我們將獲得更準確的結果。


查看完整回答
反對 回復 2022-11-01
  • 1 回答
  • 0 關注
  • 145 瀏覽
慕課專欄
更多

添加回答

舉報

0/150
提交
取消
微信客服

購課補貼
聯系客服咨詢優惠詳情

幫助反饋 APP下載

慕課網APP
您的移動學習伙伴

公眾號

掃描二維碼
關注慕課網微信公眾號