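2 Answers
Contributed 1856 experience points · received more than 5 likes
Here is how I solved it with .explode() and .value_counts(). You can also assign the result to a column or use the output however you like. In one line: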
print(df.explode('column')['column'].value_counts())
Full example:
import pandas as pd
data_1 = {'index':[0,1,2,3],'column':[['abc','mno'],['mno','pqr'],['abc','mno'],['mno','pqr']]}
df = pd.DataFrame(data_1)
df = df.set_index('index')
print(df)
           column
index
0      [abc, mno]
1      [mno, pqr]
2      [abc, mno]
3      [mno, pqr]
Here, .explode() turns every list element into its own row, and value_counts() then counts how many times each unique value occurs:
df_new = df.explode('column')
print(df_new['column'].value_counts())
Output:
mno 4
abc 2
pqr 2
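The remark above about assigning the result to a column can be sketched roughly as follows. This is only one possible reading of it: the names counts_df and counts_per_row, and the column labels "value" and "count", are made up for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    {"column": [["abc", "mno"], ["mno", "pqr"], ["abc", "mno"], ["mno", "pqr"]]}
)

# Count every list element across the whole column
counts = df.explode("column")["column"].value_counts()

# Option 1: keep the counts as a tidy two-column frame
# ("value" and "count" are illustrative labels)
counts_df = counts.rename_axis("value").reset_index(name="count")

# Option 2: attach, for each row, the overall count of each of its elements
# ("counts_per_row" is an illustrative column name)
df["counts_per_row"] = df["column"].apply(
    lambda items: [int(counts[item]) for item in items]
)

print(counts_df)
print(df)
```

Option 1 is convenient when the counts feed a later merge or plot; option 2 keeps the original row layout intact.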
Contributed 1825 experience points · received more than 4 likes
Use collections.Counter:
from collections import Counter
from itertools import chain
Counter(chain.from_iterable(df.column))
Out[196]: Counter({'abc': 2, 'mno': 4, 'pqr': 2})
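If the counts should come out sorted the way value_counts() prints them, Counter.most_common() is one way to get there. A minimal sketch, using a plain list of lists (rows, a stand-in for df.column) so that it runs without pandas:

```python
from collections import Counter
from itertools import chain

# Stand-in for df.column: one list of labels per row
rows = [["abc", "mno"], ["mno", "pqr"], ["abc", "mno"], ["mno", "pqr"]]

# Flatten the nested lists lazily and count each label
counts = Counter(chain.from_iterable(rows))

# most_common() returns (label, count) pairs, highest count first
ranked = counts.most_common()
print(ranked)

# most_common(1) returns only the single most frequent label
print(counts.most_common(1))
```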
Timing with %timeit:
df1 = pd.concat([df]*10000, ignore_index=True)
In [227]: %timeit pd.Series(Counter(chain.from_iterable(df1.column)))
14.3 ms ± 279 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [228]: %timeit df1.column.explode().value_counts()
127 ms ± 3.06 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
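Outside IPython, a similar comparison can be approximated with the standard timeit module. The sketch below is an assumption-laden rerun, not the measurement shown above: the helper names count_with_counter and count_with_explode are invented here, the frame is repeated 1000 times rather than 10000 to keep it quick, and the timings will differ by machine and pandas version. It also checks that both approaches return the same counts.

```python
import timeit
from collections import Counter
from itertools import chain

import pandas as pd

df = pd.DataFrame(
    {"column": [["abc", "mno"], ["mno", "pqr"], ["abc", "mno"], ["mno", "pqr"]]}
)
# Smaller than the 10000 repeats used above, to keep the run short
df1 = pd.concat([df] * 1000, ignore_index=True)


def count_with_counter(frame):
    # Flatten the lists in pure Python, then wrap the counts in a Series
    return pd.Series(Counter(chain.from_iterable(frame["column"])))


def count_with_explode(frame):
    # Let pandas expand the lists into rows, then count them
    return frame["column"].explode().value_counts()


# Both approaches should agree on the counts
a = count_with_counter(df1).to_dict()
b = count_with_explode(df1).to_dict()
same = a == b

t_counter = timeit.timeit(lambda: count_with_counter(df1), number=10)
t_explode = timeit.timeit(lambda: count_with_explode(df1), number=10)
print(f"same result: {same}")
print(f"Counter: {t_counter:.4f} s, explode: {t_explode:.4f} s (10 runs each)")
```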
