2 Answers

User with 1824 experience points, 8+ upvotes
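Use find from numpy's char module (np.char, the public alias of np.core.defchararray):
import numpy as np
s1 = df.plan_identifier.values.astype(str)
s2 = df.wellthie_issuer_identifier.values.astype(str)
# find returns the index of the first match, or -1 when s2 is not in s1
np.char.find(s1, s2) != -1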
Out[64]: array([False, True, True, True, True, False, True])
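Because find returns a position rather than a boolean, the comparison you apply decides whether you test "contains" or "starts with". A minimal sketch of the difference, using made-up sample values (not the question's data):

```python
import numpy as np

# Hypothetical sample values, chosen so one match is not at the start.
haystack = np.array(["UNM99902AK", "XXUNM99902", "UNM99901AL"])
needle = np.array(["UNM99902", "UNM99902", "UNM99902"])

# Element-wise index of the first match, or -1 when there is none.
pos = np.char.find(haystack, needle)

contains = pos != -1     # match anywhere in the string
starts_with = pos == 0   # match only at the beginning

print(pos)          # [ 0  2 -1]
print(contains)     # [ True  True False]
print(starts_with)  # [ True False False]
```

Note that converting pos to bool and negating it gives the same result as pos == 0, so it only reports matches at the very start of the string.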

User with 1797 experience points, 6+ upvotes
String methods in pandas are often slow. You can use a list comprehension instead. IIUC:
>>> [i in p for p,i in zip(df['plan_identifier'],df['wellthie_issuer_identifier'])]
[False, True, True, True, True, False, True]
# or assign to new column:
df['new_column'] = [i in p for p,i in zip(df['plan_identifier'],df['wellthie_issuer_identifier'])]
>>> df
         plan_identifier  wellthie_issuer_identifier  new_column
0  UNM99901AL0000001-DEN                    UNM99902       False
1  UNM99902AK0000001-DEN                    UNM99902        True
2  UNM99904AZ0000001-DEN                    UNM99904        True
3  UNM99905AR0000001-DEN                    UNM99905        True
4  UNM99906CA0000001-DEN                    UNM99906        True
5  UNM99908CO0000001-DEN                    UNM99909       False
6  UNM99909CT0000001-DEN                    UNM99909        True
[Edit] In the comments, you said you are only interested in the beginning of the string. In that case, you can use startswith instead:
[p.startswith(i) for p,i in zip(df['plan_identifier'],df['wellthie_issuer_identifier'])]
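The two comprehensions only disagree when the issuer identifier appears somewhere other than the start of the plan identifier. A small sketch, using plain lists as hypothetical stand-ins for the two DataFrame columns:

```python
# Hypothetical values standing in for df['plan_identifier'] and
# df['wellthie_issuer_identifier']; the second plan has the issuer mid-string.
plans = ["UNM99902AK0000001-DEN", "XX-UNM99902-DEN", "UNM99908CO0000001-DEN"]
issuers = ["UNM99902", "UNM99902", "UNM99909"]

# Substring test: True if the issuer occurs anywhere in the plan.
contains = [i in p for p, i in zip(plans, issuers)]

# Prefix test: True only if the plan begins with the issuer.
starts = [p.startswith(i) for p, i in zip(plans, issuers)]

print(contains)  # [True, True, False]
print(starts)    # [True, False, False]
```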