4 Answers

Contributed 1836 experience points · received 3+ likes
You can use a list comprehension with if/else to get the output:
df['gender'] = ['women' if 'women' in word
                else "men" if "men" in word
                else "unisex"
                for word in df.product_name.str.lower()]
df
    product_name  price  gender
0   Women's pant   20.0   women
1    Men's Shirt   30.0     men
2  Women's Dress   40.0   women
3     Blue Shirt   30.0  unisex
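Alternatively, you can use numpy select to get the same result. The "women" condition must come first, because "women" also contains "men" and np.select takes the first condition that matches: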
import numpy as np

cond1 = df.product_name.str.lower().str.contains("women")
cond2 = df.product_name.str.lower().str.contains("men")
condlist = [cond1, cond2]
choicelist = ["women", "men"]
df["gender"] = np.select(condlist, choicelist, default="unisex")
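For string data, plain Python iteration (such as the list comprehension) is often faster than the pandas .str methods, so time both approaches on your own data.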
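One way to run that comparison is with the standard-library timeit module. This is a minimal sketch, not a benchmark: the sample frame (the four product names repeated 1000 times) and the helper names with_comprehension and with_select are assumptions made for illustration, and the timings will vary with data size and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed sample data: the four rows from the answer, repeated so timing is meaningful.
df = pd.DataFrame({
    "product_name": ["Women's pant", "Men's Shirt", "Women's Dress", "Blue Shirt"] * 1000,
})


def with_comprehension(frame):
    # Plain Python iteration over the lowercased names.
    return ["women" if "women" in word
            else "men" if "men" in word
            else "unisex"
            for word in frame.product_name.str.lower()]


def with_select(frame):
    # Vectorized conditions; "women" is checked first because it contains "men".
    lowered = frame.product_name.str.lower()
    condlist = [lowered.str.contains("women"), lowered.str.contains("men")]
    return np.select(condlist, ["women", "men"], default="unisex")


# Both approaches must agree before their timings are worth comparing.
assert list(with_select(df)) == with_comprehension(df)

for func in (with_comprehension, with_select):
    seconds = timeit.timeit(lambda: func(df), number=20)
    print(f"{func.__name__}: {seconds:.4f} s for 20 runs")
```

No expected timings are shown here on purpose, since they depend on the machine and on the data.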

Contributed 1790 experience points · received 9+ likes
Try converting your for statement into a function and using apply. Something like:
def label_gender(product_name):
    '''product_name is a str'''
    if 'women' in product_name.lower():
        return 'women'
    elif 'men' in product_name.lower():
        return 'men'
    else:
        return 'unisex'

df['gender'] = df.apply(lambda x: label_gender(x['product_name']), axis=1)
A detailed breakdown of apply/lambda usage can be found here: https://towardsdatascience.com/apply-and-lambda-usage-in-pandas-b13a1ea037f7
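Because label_gender only needs the product name, the same function can also be applied to that single column with Series.apply instead of row by row over the whole frame. The following is a small self-contained sketch of that variation; the sample frame is reconstructed from the output shown in the other answers and is an assumption about the asker's data.

```python
import pandas as pd

# Assumed sample data, reconstructed from the outputs shown in this thread.
df = pd.DataFrame({
    "product_name": ["Women's pant", "Men's Shirt", "Women's Dress", "Blue Shirt"],
    "price": [20.0, 30.0, 40.0, 30.0],
})


def label_gender(product_name):
    """Return 'women', 'men' or 'unisex' for a product name (a str)."""
    name = product_name.lower()
    # "women" is checked first because it also contains "men".
    if "women" in name:
        return "women"
    elif "men" in name:
        return "men"
    return "unisex"


# Series.apply passes each product name straight to the function.
df["gender"] = df["product_name"].apply(label_gender)
print(df)
```

This avoids the lambda and the axis=1 argument; the labels produced are the same as in the row-wise version.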

Contributed 1817 experience points · received 14+ likes
You can also use np.where + Series.str.contains:
import numpy as np

df['gender'] = (
    np.where(df.product_name.str.contains("women", case=False), 'women',
             np.where(df.product_name.str.contains("men", case=False), "men", 'unisex'))
)
    product_name  price  gender
0   Women's pant   20.0   women
1    Men's Shirt   30.0     men
2  Women's Dress   40.0   women
3     Blue Shirt   30.0  unisex
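One caveat with substring matching is that "men" also appears inside unrelated words such as "Garment". If that matters for your data, a possible variation is to match whole words with the regex word boundary \b. This is only a sketch: the "Linen Garment" row is a hypothetical example that is not in the original question.

```python
import numpy as np
import pandas as pd

# "Linen Garment" is a hypothetical row added to show the substring pitfall.
df = pd.DataFrame({
    "product_name": ["Women's pant", "Men's Shirt", "Linen Garment"],
})

# \b matches a word boundary, so "men" inside "Garment" or "Women" does not count.
is_women = df.product_name.str.contains(r"\bwomen\b", case=False)
is_men = df.product_name.str.contains(r"\bmen\b", case=False)

df["gender"] = np.where(is_women, "women", np.where(is_men, "men", "unisex"))
print(df)
```

With whole-word patterns the two conditions no longer overlap, so the result does not depend on which one is tested first.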

Contributed 1833 experience points · received 4+ likes
Use np.where with .str.contains, and a regex that captures the first word of the phrase, like so:
# np.where(product_name contains "Women" or "Men", first word of the phrase, otherwise "Unisex")
df['Gender'] = np.where(df.product_name.str.contains('Women|Men'),
                        df.product_name.str.split(r'(^[\w]+)').str[1],
                        'Unisex')
    product_name  price  Gender
0   Women's pant   20.0   Women
1    Men's Shirt   30.0     Men
2  Women's Dress   40.0   Women
3     Blue Shirt   30.0  Unisex
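A related option, shown here as a hedged alternative and not as this answer's method, is Series.str.extract, which captures the leading word directly and leaves NaN where nothing matches. The sample frame is again an assumption reconstructed from the outputs in this thread.

```python
import pandas as pd

# Assumed sample data, reconstructed from the outputs shown in this thread.
df = pd.DataFrame({
    "product_name": ["Women's pant", "Men's Shirt", "Women's Dress", "Blue Shirt"],
    "price": [20.0, 30.0, 40.0, 30.0],
})

# Capture a leading "Women" or "Men"; rows without a match become NaN, then "Unisex".
df["Gender"] = (
    df.product_name.str.extract(r"^(Women|Men)", expand=False).fillna("Unisex")
)
print(df)
```

Unlike the contains-then-split approach, this only matches at the start of the name, so a name such as "Blue Men's Shirt" would be labeled "Unisex" here, whereas the split version would return its first word, "Blue".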