1 回答

TA貢獻2003條經驗 獲得超2個贊
首先Series.str.split通過asfor 字典創建字典:
df3 = df[['tableNameFrom','tableNameJoin']].stack().str.split(' as ', expand=True).dropna()
d = dict(zip(df3[1], df3[0]))
print (d)
{'i': 'inventory', 'r': 'rental'}
將 index 參數添加到DataFrame構造函數并刪除 last reset_index:
df1 = (pd.DataFrame(df['attributeName'].values.tolist(), index=df.index)
.stack()
.str.split('.', expand=True))
df1.columns = ['tableName','attributeName']
print(df1)
tableName attributeName
1 0 film languageId
1 language languageID
2 film filmID
2 0 i inventoryId
1 r filmId
僅選擇列date和DataFrame.join新的DataFrame:
df2 = df[['date']].join(df1.reset_index(level=1, drop=True))
最后Series.replace是字典:
df2['tableName'] = df2['tableName'].replace(d)
df2 = df2.reset_index(drop=True)
print (df2)
date tableName attributeName
0 29-03-2019 film languageId
1 29-03-2019 language languageID
2 29-03-2019 film filmID
3 30-03-2019 inventory inventoryId
4 30-03-2019 rental filmId
添加回答
舉報