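I'll get straight to the point. The most critical problem in the code above concerns this test:
sqrt(n-len(k)-3)*abs(z(sigma_inverse[i][j])) <= phi(1-alpha/2)
I had misunderstood what n means: it is not the size of the precision matrix but the total number of multivariate observations (in my case 10000, not 5). My other wrong assumption was that z(sigma_inverse[i][j]) must give the partial correlation of i and j given all the remaining variables. That is not correct: z is the Fisher transform of the partial correlation of i and j given K, which is estimated from the precision matrix of the appropriate subset of variables (i, j and K). The correct test is as follows: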
# imports needed by the function body below
import math
import numpy as np
from scipy.stats import norm

if len(K) == 0:  # no conditioning variables; CM is the correlation matrix
    r = CM[i, j]  # plain correlation of i and j
elif len(K) == 1:  # one conditioning variable: closed form, starting from the correlation matrix (pandas can compute it from a DataFrame)
    k = K[0]  # index with the scalar so r is a float, not a 1-element array
    r = (CM[i, j] - CM[i, k] * CM[j, k]) / math.sqrt((1 - CM[j, k] ** 2) * (1 - CM[i, k] ** 2))  # partial correlation of i and j given k
else:  # more than one conditioning variable
    CM_SUBSET = CM[np.ix_([i] + [j] + K, [i] + [j] + K)]  # sub-matrix of the correlation matrix for i, j and K
    PM_SUBSET = np.linalg.pinv(CM_SUBSET)  # precision matrix of that subset
    r = -1 * PM_SUBSET[0, 1] / math.sqrt(abs(PM_SUBSET[0, 0] * PM_SUBSET[1, 1]))  # partial correlation of i and j given K
r = min(0.999999, max(-0.999999, r))  # keep r away from +/-1 so the log stays finite
res = math.sqrt(n - len(K) - 3) * 0.5 * math.log1p((2 * r) / (1 - r))  # Fisher transform of r, since log1p(2r/(1-r)) = log((1+r)/(1-r)), scaled by sqrt(n - |K| - 3)
return 2 * (1 - norm.cdf(abs(res)))  # two-sided p-value
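To see why n has to be the number of observations, here is a minimal self-contained sketch of the same test run on simulated data. The function name partial_corr_pvalue, its signature, and the toy data are my own assumptions for illustration and are not from the original code. It uses the precision-matrix route for every size of K (for an empty or one-element K this should agree with the closed forms above) and math.atanh as the Fisher transform.

```python
import math

import numpy as np
from scipy.stats import norm


def partial_corr_pvalue(CM, i, j, K, n):
    """Two-sided p-value for H0: i and j are uncorrelated given K.

    Hypothetical helper: CM is a correlation matrix (NumPy array), K is a
    list of conditioning column indices, n is the number of observations.
    """
    idx = [i, j] + list(K)
    # Precision matrix of the sub-correlation matrix for i, j and K.
    pm = np.linalg.pinv(CM[np.ix_(idx, idx)])
    r = -pm[0, 1] / math.sqrt(abs(pm[0, 0] * pm[1, 1]))
    r = min(0.999999, max(-0.999999, r))  # avoid atanh(+/-1)
    z = math.atanh(r)  # Fisher transform: 0.5 * log((1 + r) / (1 - r))
    stat = math.sqrt(n - len(K) - 3) * z
    return 2 * (1 - norm.cdf(abs(stat)))


# Toy data (assumed): x drives both y and w, so y and w are correlated
# marginally but should be independent once x is conditioned on.
rng = np.random.default_rng(0)
n = 10000  # number of observations, NOT the number of variables (3)
x = rng.normal(size=n)
y = x + rng.normal(size=n)
w = x + rng.normal(size=n)
CM = np.corrcoef(np.column_stack([x, y, w]), rowvar=False)

p_marginal = partial_corr_pvalue(CM, 1, 2, [], n)   # y vs w, no conditioning
p_given_x = partial_corr_pvalue(CM, 1, 2, [0], n)   # y vs w given x
print(p_marginal, p_given_x)
```

With this setup the marginal p-value should be essentially zero, while the p-value given x is expected to be unremarkable in most runs. Passing the matrix size instead of the sample size for n would shrink the test statistic by a large factor and hide the marginal dependence.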
I hope someone finds this helpful.