We were given a file.tsv and need to build a function around it. One requirement is to delete every row in which a column (called "low_confidence_variant" here) is True. I am struggling with this part. Also, are there any suggestions for optimization? From the result we need to make a Miami plot. This is what I have so far. Any tips would be useful:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def read_file(file, chromosome):
    df = pd.read_csv(file, sep='\t', usecols=['chromosome', 'position', 'pval', 'low_confidence_variant'])
    df.drop(['low_confidence_variant'], True)
    df.dropna()
    sub_data = df.replace({'pval': 0}, 1e-274)
    sub_data['log10'] = -np.log10(sub_data['pval'])
    chr_group = sub_data.groupby(['chromosome'])
    chromosome = chr_group.get_group(chromosome)
    return chromosome

df1 = read_file('vitamin_d.females.tsv.gz', 1)
df2 = read_file('vitamin_d.males.tsv.gz', 1)

xa = df2['position']
ya = df2['log10']
xb = df1['position']
yb = df1['log10'] * -1

fig, (ax1, ax2) = plt.subplots(nrows=2, ncols=1, sharex=True, figsize=(12, 4))
ax1.scatter(xa, ya, s=1, c="tab:blue")
ax1.set_ylabel('males $\it{-log_{10}(pval)}$')
ax1.set_title('vitamin D (nmol/L)', fontweight='bold')
ax1.axhline(-np.log10(5*10**-8), c ='darkgray', ls='--')
ax2.scatter(xb, yb, s=1, c="tab:blue")
ax2.set_ylabel('females $\it{log_{10}(pval)}$')
ax2.axhline(np.log10(5*10**-8), c ='darkgray', ls='--')
plt.xlabel('Chromosome 1 positions')
plt.subplots_adjust(hspace=.0)
plt.show()
fig.savefig(fname='miami.png', dpi=300, bbox_inches='tight', format='png')
1 Answer

largeQ
I'm not entirely sure what you mean.
Say df =
A   B   low_confidence_variant
10  20  True
2   4   False
6   0   False
So after deleting the rows with low_confidence_variant = True, you should have
df =
A   B   low_confidence_variant
2   4   False
6   0   False
Is that right?
If that is what you mean:
### Add below line
df = df[df['low_confidence_variant'] != True]
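As a quick check, here is a minimal sketch that applies that line to the toy table above. The columns A and B are just the illustrative ones from the example, and the `~` variant assumes the flag column really has boolean dtype.

```python
import pandas as pd

# Toy frame matching the example table above
df = pd.DataFrame({
    "A": [10, 2, 6],
    "B": [20, 4, 0],
    "low_confidence_variant": [True, False, False],
})

# Keep only the rows where the flag is not True
filtered = df[df["low_confidence_variant"] != True]

# Equivalent form when the column is boolean dtype
filtered_alt = df[~df["low_confidence_variant"]]

print(filtered)
```

Both forms return a new DataFrame and leave the original `df` untouched, which is why the result has to be assigned.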
And delete this line from your code:
### Delete this line from the code
df.drop(['low_confidence_variant'], True)
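The second argument there is read as the axis, so that call targets the entire column itself rather than any rows. It also returns a new DataFrame, so without assigning the result, df is left unchanged.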
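Putting this together, one possible version of read_file might look like the sketch below. It is only a sketch under some assumptions: the chromosome column is assumed to hold values that compare equal to the argument you pass in, the 1e-274 floor for zero p-values is taken from the question, and the small in-memory TSV is hypothetical data standing in for the real file. Selecting the chromosome with a mask before computing the log avoids doing work on rows that are thrown away, which may help a little with large files.

```python
import io

import numpy as np
import pandas as pd


def read_file(file, chromosome):
    """Load one chromosome, dropping low-confidence and incomplete rows."""
    df = pd.read_csv(
        file,
        sep="\t",
        usecols=["chromosome", "position", "pval", "low_confidence_variant"],
    )
    # Row filter: keep variants that are not flagged as low confidence
    df = df[df["low_confidence_variant"] != True]
    # dropna returns a new frame, so the result has to be assigned
    df = df.dropna()
    # Select the chromosome first so the log is computed on fewer rows
    df = df[df["chromosome"] == chromosome].copy()
    # Same floor as in the question: p == 0 becomes 1e-274 before the log
    df["log10"] = -np.log10(df["pval"].replace(0, 1e-274))
    return df


# Hypothetical in-memory TSV standing in for the real file
demo = io.StringIO(
    "chromosome\tposition\tpval\tlow_confidence_variant\n"
    "1\t100\t0.01\tFalse\n"
    "1\t200\t0\tFalse\n"
    "1\t300\t0.5\tTrue\n"
    "2\t400\t0.05\tFalse\n"
)
result = read_file(demo, 1)
print(result)
```

The rest of the plotting code can stay as it is, since the returned frame still has the position and log10 columns.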