2 Answers

Contributed 1806 experience points · Received over 8 likes
Here is a faster approach using a list comprehension and collections.Counter:
from collections import Counter
import pandas as pd
# Build the vocabulary of valid ids (a set makes membership tests fast)
vocab = set(df1['Id'].astype(int))
# For each row of df2, keep only the tag ids found in the vocabulary;
# tuples are hashable, so the combinations can be counted directly
filtered = [tuple(int(j) for j in i.split(',') if int(j) in vocab) for i in df2['Tag Id']]
# Count each combination and display the result as a DataFrame
counts = sorted(Counter(filtered).items())
pd.DataFrame(counts, columns=['Combinations', 'Counts'])
      Combinations  Counts
0       (181, 987)       2
1  (300, 653, 987)       1
2           (456,)       1
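To try the idea end to end, here is a minimal self-contained sketch. The sample frames `df1` and `df2` and the helper name `count_combinations` are made up for illustration and are not the data from the question. Like the code above, it counts an empty combination for any row in which no id matches.

```python
from collections import Counter

import pandas as pd

# Hypothetical sample data (an assumption, not the question's real frames)
df1 = pd.DataFrame({"Id": [181, 300, 456, 653, 987]})
df2 = pd.DataFrame({"Tag Id": ["181,987", "300,653,987", "181,987,999", "456,111"]})


def count_combinations(ids, tag_strings):
    """Count, per row, the combination of tag ids that also occur in `ids`."""
    vocab = {int(i) for i in ids}
    combos = [
        tuple(int(t) for t in tags.split(",") if int(t) in vocab)
        for tags in tag_strings
    ]
    return Counter(combos)


counts = count_combinations(df1["Id"], df2["Tag Id"])
result = pd.DataFrame(sorted(counts.items()), columns=["Combinations", "Counts"])
print(result)
```

Ids 999 and 111 are not in `df1`, so the third row collapses to the same combination as the first and the fourth row reduces to a single id.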

Contributed 1851 experience points · Received over 5 likes
Let's split the Tag Id column of df2 and explode it, then merge the result with df1 and count the combinations:
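# Split each 'Tag Id' string and give every tag its own row;
# reset_index() keeps the original row number in an 'index' column
s = (df2['Tag Id'].str.split(',')
       .explode()
       .reset_index()
    )
# The inner merge keeps only tags present in df1['Id'] (assumed to hold strings)
(df1.merge(s, left_on='Id', right_on='Tag Id')
    .sort_values('Tag Id')
    .groupby('index')
    .agg(Combination=('Id', ','.join))  # one combination string per original row
    ['Combination']
    .value_counts()
    .rename_axis('Combination')
    .reset_index(name='Counts')
)
Output:
   Combination  Counts
0      181,987       2
1  653,987,300       1
2          456       1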
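The same pipeline can be run on its own with a small example. This is only a sketch: the sample frames and the column name `row` are assumptions made for illustration, and `df1['Id']` is stored as strings so that it matches the split tags and can be joined with commas.

```python
import pandas as pd

# Hypothetical sample data (an assumption, not the question's real frames)
df1 = pd.DataFrame({"Id": ["181", "300", "456", "653", "987"]})
df2 = pd.DataFrame({"Tag Id": ["181,987", "300,653,987", "181,987,999", "456,111"]})

# One row per (original row, tag); "row" remembers which df2 row a tag came from
exploded = (
    df2["Tag Id"].str.split(",")
    .explode()
    .rename_axis("row")
    .reset_index()
)

# The inner merge drops tags that do not appear in df1["Id"]
matched = exploded.merge(df1, left_on="Tag Id", right_on="Id")

# Rebuild one comma-separated combination per original row, then count them
combos = (
    matched.sort_values(["row", "Tag Id"])
    .groupby("row")["Id"]
    .agg(",".join)
)
counts = combos.value_counts()
print(counts)
```

Sorting by `row` as well as `Tag Id` keeps each row's tags together and in a fixed order, so identical sets of tags always produce the same combination string.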