2 Answers

Contributor with 1836 experience points, 13+ upvotes
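Based on the explanation at the end of the question, it appears that both columns are of str type and need to be converted to list type.
Use .applymap together with ast.literal_eval. If only one column is of str type, use df[col] = df[col].apply(literal_eval) instead.
The lists in each column must then be extracted with pandas.DataFrame.explode. The outer (second) explode converts the values from single-element lists to scalars (i.e. [0.4] becomes 0.4).
Once the values are on separate rows, use Boolean indexing to select the data in the desired range.
If you want to combine df with df_new, use df.join(df_new, rsuffix='_extracted'). Note that join aligns on the index, so the exploded rows need to keep their original row labels (that is, explode without ignore_index=True) for the rows to match up.
Tested in python 3.10, pandas 1.4.3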
import pandas as pd
from ast import literal_eval
# setup the test data: this data is lists
# data = {'c1': [['abc', 'bcd', 'dog'], ['cat', 'bcd', 'def']], 'c2': [[[.4], [.5], [.9]], [[.9], [.5], [.4]]]}
# setup the test data: this data is strings
data = {'c1': ["['abc', 'bcd', 'dog', 'cat']", "['cat', 'bcd', 'def']"], 'c2': ["[[.4], [.5], [.9], [1.0]]", "[[.9], [.5], [.4]]"]}
# create the dataframe
df = pd.DataFrame(data)
# the description leads me to think the data is columns of strings, not lists
# convert the columns from string type to list type
# the following line is only required if the columns are strings
# (pandas 2.1+ deprecates applymap in favor of DataFrame.map)
df = df.applymap(literal_eval)
# explode the lists in each column, and then explode the remaining lists in 'c2'
df_new = df.explode(['c1', 'c2'], ignore_index=True).explode('c2')
# use Boolean Indexing to select the desired data
df_new = df_new[df_new['c2'] >= 0.9]
# display(df_new)
    c1   c2
2  dog  0.9
3  cat  1.0
4  cat  0.9
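The two hints that the code above does not exercise (converting a single str column, and joining the result back onto df) can be sketched as follows. This is a minimal sketch under assumptions: the sample data is hypothetical, with only 'c2' stored as strings, and ignore_index=True is deliberately left out so that each exploded row keeps its original row label and join has something to align on. Passing several columns to explode requires pandas 1.3 or later.

```python
import pandas as pd
from ast import literal_eval

# hypothetical test data: 'c1' already holds lists, 'c2' holds their string form
data = {
    'c1': [['abc', 'bcd', 'dog', 'cat'], ['cat', 'bcd', 'def']],
    'c2': ["[[.4], [.5], [.9], [1.0]]", "[[.9], [.5], [.4]]"],
}
df = pd.DataFrame(data)

# only 'c2' is str type, so convert that single column
df['c2'] = df['c2'].apply(literal_eval)

# explode without ignore_index so each new row keeps its original row label
df_new = df.explode(['c1', 'c2']).explode('c2')

# keep only the rows in the desired range
df_new = df_new[df_new['c2'] >= 0.9]

# join aligns on the index; overlapping column names from df_new get the suffix
combined = df.join(df_new, rsuffix='_extracted')
```

In combined, each original row is repeated once per matching extracted value, with the extracted values in the c1_extracted and c2_extracted columns.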

Contributor with 1884 experience points, 4+ upvotes
You can use list comprehensions to populate new columns based on your condition.
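# keep the values in col1 whose matching score in col2 is at least 0.9
df['col3'] = [
    [value for value, score in zip(c1, c2) if score[0] >= 0.9]
    for c1, c2 in zip(df['col1'], df['col2'])
]
# keep the scores themselves, unwrapped from their single-element lists
df['col4'] = [
    [score[0] for score in c2 if score[0] >= 0.9]
    for c2 in df['col2']
]
Output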
              col1                   col2   col3   col4
0  [abc, bcd, dog]  [[0.4], [0.5], [0.9]]  [dog]  [0.9]
1  [cat, bcd, def]  [[0.9], [0.5], [0.4]]  [cat]  [0.9]
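To try the comprehensions end to end, here is a minimal self-contained sketch. The sample frame is reconstructed from the output shown above, and the helper name add_filtered_columns and its threshold parameter are hypothetical, introduced only so the cutoff is not hard-coded. It assumes both columns already hold real lists, not strings.

```python
import pandas as pd


def add_filtered_columns(df, threshold=0.9):
    """Return a copy of df with col3/col4 holding the entries at or above threshold.

    Hypothetical helper: assumes 'col1' holds lists of labels and 'col2' holds
    equally long lists of single-element score lists.
    """
    out = df.copy()
    # labels whose paired score passes the threshold
    out['col3'] = [
        [value for value, score in zip(c1, c2) if score[0] >= threshold]
        for c1, c2 in zip(out['col1'], out['col2'])
    ]
    # the passing scores, unwrapped from their single-element lists
    out['col4'] = [
        [score[0] for score in c2 if score[0] >= threshold]
        for c2 in out['col2']
    ]
    return out


# sample data reconstructed from the output shown above
df = pd.DataFrame({
    'col1': [['abc', 'bcd', 'dog'], ['cat', 'bcd', 'def']],
    'col2': [[[0.4], [0.5], [0.9]], [[0.9], [0.5], [0.4]]],
})
result = add_filtered_columns(df)
```

Unlike the explode approach, this keeps one row per original row, so no join is needed afterwards.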