I need to read a list of HTML files into pandas DataFrames. Each HTML file contains several DataFrames (I use pd.concat to combine them). The HTML file name contains a string that I want to add as a column.

# Read all files into a list
files = glob.glob('monthly_*.html')
# Zip the dfs with the desired string segment
zipped_dfs = [zip(pd.concat(pd.read_html(file)), file.split('_')[1]) for file in files]

I am having trouble unpacking the zipped list of (df, product) pairs.

dfs = []
# Loop through the list of zips
for _zip in zipped_dfs:
    # Unpack the zip
    for _df, product in _zip:
        # Adding the product string as a new column
        _df['Product'] = product
        dfs.append(_df)

However, I get the error 'str' object does not support item assignment. Can someone explain the best way to add the new column?
1 Answer

繁華開滿天機
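You should remove zip from the list comprehension. Iterating over a DataFrame yields its column labels, so zip(df, product) pairs column labels with the characters of the product string rather than pairing the DataFrame with the name. If you want tuples of the concatenated DataFrame and the product name, you should write: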
zipped_dfs = [(pd.concat(pd.read_html(file)), file.split('_')[1])
              for file in files]
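To see where the error message comes from, here is a small sketch using a made-up in-memory DataFrame in place of the parsed HTML tables. The column names `Month` and `Sales` and the product name `widgets` are assumptions for illustration only, not taken from the question.

```python
import pandas as pd

# Hypothetical stand-in for pd.concat(pd.read_html(file))
df = pd.DataFrame({"Month": ["Jan", "Feb"], "Sales": [10, 20]})
# Hypothetical stand-in for file.split('_')[1]
product = "widgets"

# Iterating over a DataFrame yields its column labels, so zip pairs
# each column label with one character of the product string.
pairs = list(zip(df, product))

try:
    label, char = pairs[0]
    # Same operation as _df['Product'] = product in the question,
    # but here the left-hand side is a plain str (a column label).
    label["Product"] = char
except TypeError as exc:
    error_message = str(exc)

print(pairs)
print(error_message)
```

With string column labels, the item assignment is attempted on a str, which appears to be what produces the error in the question.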
However, the intermediate step of creating a list of tuples is not needed. The whole approach can be simplified as follows:
import glob
import pandas as pd

dfs = []
for file in glob.glob('monthly_*.html'):
    # NOTE: your code seemingly keeps .html in the product name
    # so I modified the split operation
    df = pd.concat(pd.read_html(file))
    df['Product'] = file.split('.html')[0].split('_')[1]
    dfs.append(df)
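As a variation on the split above, the extension can also be dropped with `pathlib`. This is only a sketch: `product_from_filename` is a hypothetical helper that is not part of the original answer, and the file names in the usage lines are made up.

```python
from pathlib import Path


def product_from_filename(filename: str) -> str:
    """Return the segment after the first underscore, without the extension."""
    # Path.stem drops any directory part and the final suffix:
    # 'monthly_widgets.html' -> 'monthly_widgets'
    stem = Path(filename).stem
    # Same indexing as file.split('_')[1] in the answer
    return stem.split("_")[1]


print(product_from_filename("monthly_widgets.html"))
print(product_from_filename("reports/monthly_gadgets.html"))
```

Inside the loop this could replace the chained split, e.g. `df['Product'] = product_from_filename(file)`. Because `stem` ignores the directory part, an underscore in a folder name would not shift the index, although that does not arise with the glob pattern used here.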