1 Answer

Set up the initial DataFrame df from the corresponding dictionary:
import pandas as pd
df = pd.DataFrame({'urls': list(dictionary.keys()), 'strings': list(dictionary.values())})
pattern = '|'.join(phrases)
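Note that '|'.join(phrases) treats each phrase as a regular expression, so a phrase containing characters such as "?" or "." would not be matched literally. A minimal sketch of one way to guard against that, assuming a hypothetical phrases list (the question's actual data is not shown here):

```python
import re

# Hypothetical phrase list, used only for illustration.
phrases = ["i am good", "what?", "a.b"]

# Escape each phrase so "?" and "." match literally, and try longer
# phrases first so they take priority over shorter overlapping ones.
pattern = "|".join(re.escape(p) for p in sorted(phrases, key=len, reverse=True))

matches = re.findall(pattern, "what? i am good, axb a.b")
print(matches)
```

Here "axb" is not matched, because the escaped "." no longer stands for "any character".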
Process the DataFrame:
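# Find every phrase occurrence per row; pop() also removes the raw 'strings' column
s = df.pop('strings').str.findall(pattern)
df = df.assign(phrasecount=s.str.len(), phrase=s.map(', '.join))
# Keep rows with at least one match; if nothing matched anywhere, keep a single zero-count row
df = df.drop_duplicates(subset='phrasecount') if df['phrasecount'].eq(0).all() else df[df['phrasecount'].ne(0)]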
Result:
# print(df)
                      urls  phrasecount                               phrase
0  http://www.firsturl.com            2  going to the market, eating cookies
2  http://www.thirdurl.com            1                            i am good
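For reference, the steps above can be put together into one self-contained script. This is only a sketch: the contents of dictionary and phrases below are hypothetical sample data, chosen so that the result has the same shape as the output shown above.

```python
import pandas as pd

# Hypothetical sample data (not from the original question).
dictionary = {
    "http://www.firsturl.com": "i am going to the market and eating cookies",
    "http://www.secondurl.com": "nothing of interest here",
    "http://www.thirdurl.com": "tomorrow i am good",
}
phrases = ["going to the market", "eating cookies", "i am good"]

# Build the initial frame and the alternation pattern.
df = pd.DataFrame({"urls": list(dictionary.keys()), "strings": list(dictionary.values())})
pattern = "|".join(phrases)

# Collect the matches per row, then derive the count and the joined phrases.
s = df.pop("strings").str.findall(pattern)
df = df.assign(phrasecount=s.str.len(), phrase=s.map(", ".join))

# Keep only rows with matches; if no row matched, keep one zero-count row.
if df["phrasecount"].eq(0).all():
    df = df.drop_duplicates(subset="phrasecount")
else:
    df = df[df["phrasecount"].ne(0)]

print(df)
```

The row for the second URL is dropped because none of the phrases occur in its text, which is why the index jumps from 0 to 2.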