4 Answers

Contributed 1854 experience points, received over 8 upvotes
You can do it like this:
import difflib
city_list = ["Stevenage", "Essex", "Coventry", "Chester"]
def get_match(row):
    # Split the address into words; adjust this preprocessing as needed
    words = row["col 2"].replace(",", " ").split()
    for c in city_list:
        # get_close_matches takes the word first, then the list of candidates
        if difflib.get_close_matches(c, words):
            return c
    return ""
df["col 3"] = df.apply(get_match, axis=1)
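One thing to watch with fuzzy matching: at the default cutoff of 0.6, "Chesterton" is close enough to "Chester" to count as a match. Below is a minimal standalone sketch of the same idea with a stricter cutoff. The helper name closest_city and the value 0.85 are my own assumptions for illustration, so tune the cutoff against your real data.

```python
import difflib

city_list = ["Stevenage", "Essex", "Coventry", "Chester"]

def closest_city(address, cutoff=0.85):
    """Return the first city that closely matches a word in the address, else ""."""
    words = address.replace(",", " ").split()
    for city in city_list:
        # Signature: get_close_matches(word, possibilities, n, cutoff)
        if difflib.get_close_matches(city, words, n=1, cutoff=cutoff):
            return city
    return ""

print(closest_city("538 Walton Avenue, Chester, FY6 7NP"))        # Chester
print(closest_city("42 Chesterton Road, Peterborough, FR7 2NY"))  # empty string
print(closest_city("1 High St, Chestr"))                          # Chester (typo tolerated)
```

A higher cutoff rejects "Chesterton" while still tolerating small typos such as "Chestr".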

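Contributed 1785 experience points, received over 8 upvotes
Take a look at str.contains, which tests whether a pattern matches each string in a Series: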
import pandas as pd
df = pd.DataFrame([[59, '538 Walton Avenue, Chester,', 'FY6 7NP'],
                   [62, '42 Chesterton Road, Peterborough', '4HG 3HT'],
                   [179, '3 Wallbridge Street, Essex', '4HG 3HT'],
                   [180, '6 Stevenage Avenue, Coventry', '7PY 9NP']])
city_list = ["Stevenage", "Essex", "Coventry", "Chester"]
for city in city_list:
    # Label every row whose address (column 1) contains the city name;
    # a later city in the list overwrites an earlier match on the same row
    df.loc[df[1].str.contains(city), 'match'] = city
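Note that a plain substring test also labels "42 Chesterton Road" as Chester. If that is not wanted, here is a hedged sketch using word boundaries with str.findall. The column name address is hypothetical, and taking the last match assumes the city comes after the street name.

```python
import re
import pandas as pd

df = pd.DataFrame({"address": ["538 Walton Avenue, Chester,",
                               "42 Chesterton Road, Peterborough",
                               "3 Wallbridge Street, Essex",
                               "6 Stevenage Avenue, Coventry"]})
city_list = ["Stevenage", "Essex", "Coventry", "Chester"]

# \b keeps "Chester" from matching inside "Chesterton"
pattern = r"\b(?:" + "|".join(re.escape(c) for c in city_list) + r")\b"

# findall returns a list of matches per row; keep the last one, or "" if none
matches = df["address"].str.findall(pattern)
df["match"] = matches.apply(lambda m: m[-1] if m else "")
print(df)
```

With this sample data the match column comes out as Chester, empty, Essex, Coventry.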

Contributed 1856 experience points, received over 17 upvotes
Try this:
def aux_func(address):
    aux_list = ['Stevenage', 'Essex', 'Coventry', 'Chester']
    # split the address on commas
    address = address.split(',')
    # avoid matches with the first part of the address (the street)
    if len(address) > 1:
        # drop the first element of the address
        address = address[1:]
    for v in aux_list:
        for chunk in address:
            if v in chunk:
                return v
    return ""
df['col 3'] = [aux_func(address) for address in df['col 2']]

Contributed 1811 experience points, received over 4 upvotes
Rely on a helper function like this:
import pandas as pd
df = pd.DataFrame({'col 1': [59, 62, 179, 180],
                   'col 2': ['538 Walton Avenue, Chester, FY6 7NP',
                             '42 Chesterton Road, Peterborough, FR7 2NY',
                             '3 Wallbridge Street, Essex, 4HG 3HT',
                             '6 Stevenage Avenue, Coventry, 7PY 9NP'
                             ]})
def aux_func(x):
    # split by comma and select the interesting part ([1])
    x = x.split(',')
    x = x[1]
    aux_list = ['Stevenage', 'Essex', 'Coventry', 'Chester']
    for v in aux_list:
        if v in x:
            return v
    return ""
df['col 3'] = [aux_func(name) for name in df['col 2']]
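Be aware that x[1] raises IndexError when an address contains no comma. A slightly more defensive sketch is below. The name city_from_address is made up for illustration, and it assumes the city is exactly the second comma-separated field, so it does an exact comparison rather than a substring test.

```python
CITIES = ("Stevenage", "Essex", "Coventry", "Chester")

def city_from_address(address, cities=CITIES):
    """Return the second comma-separated field if it is a known city, else ""."""
    parts = [p.strip() for p in address.split(",")]
    if len(parts) < 2:
        # No comma in the address, so there is no city field to inspect
        return ""
    return parts[1] if parts[1] in cities else ""

print(city_from_address("538 Walton Avenue, Chester, FY6 7NP"))  # Chester
print(city_from_address("Essex"))                                # empty string
```

It can be applied the same way as above, for example with a list comprehension over df['col 2'].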