I've built a script that extracts information from my website's blog (the URLs are in an Excel file, so I read them from there). I'd like to put the extracted information for each URL into its own .txt file (so far I've only managed to get everything into one). How can I do that? I don't know where to start and I'm pretty lost here :( Any help would be much appreciated.

import urllib.request
from bs4 import BeautifulSoup
import pandas as pd
import time

i = []
crawl = pd.read_excel('C:/Users/Acer/Desktop/internal_all2.xlsx')
addresses = crawl['Address'].tolist()

for row in addresses:
    url = row
    time.sleep(5)
    response = urllib.request.urlopen(url)
    soup = BeautifulSoup(response, 'html.parser')
    content = soup.find_all('p')
    for content2 in content:
        print(url, content2)
        i.append([url, content2])

df = pd.DataFrame(i)
df.to_csv('C:/Users/Acer/Desktop/scripts/content/test.txt', index=False)
1 Answer

翻翻過去那場雪
Just append a string to the filename. Two more fixes are needed for one file per URL: reset the list and write the file inside the loop (otherwise each file accumulates every previous page as well), and strip characters like `:` and `/` from the URL, since they are not legal in Windows filenames:
import urllib.request
from bs4 import BeautifulSoup
import pandas as pd
import time

crawl = pd.read_excel('C:/Users/Acer/Desktop/internal_all2.xlsx')
addresses = crawl['Address'].tolist()

for url in addresses:
    time.sleep(5)
    response = urllib.request.urlopen(url)
    soup = BeautifulSoup(response, 'html.parser')
    content = soup.find_all('p')

    i = []  # fresh list per URL, so each file only holds that page's paragraphs
    for content2 in content:
        print(url, content2)
        i.append([url, content2])

    # URLs contain characters that are illegal in filenames (':', '/'),
    # so replace them before building the path
    safe_name = url.replace('://', '_').replace('/', '_')
    df = pd.DataFrame(i)
    df.to_csv(f'C:/Users/Acer/Desktop/scripts/content/test_{safe_name}.txt', index=False)
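If you want cleaner filenames, a more robust option is to parse the URL and keep only filename-safe characters. This is a minimal sketch using only the standard library; the helper name `url_to_filename` is my own, not from the answer above:

```python
import re
from urllib.parse import urlparse

def url_to_filename(url, ext='.txt'):
    """Turn a URL into a filesystem-safe filename.

    Keeps the host and path, replacing every run of characters that
    is not a letter, digit, dot, or hyphen with a single underscore.
    """
    parsed = urlparse(url)
    raw = parsed.netloc + parsed.path      # e.g. 'example.com/blog/post-1/'
    safe = re.sub(r'[^A-Za-z0-9.-]+', '_', raw).strip('_')
    return safe + ext

# Example: each blog URL maps to a distinct, valid filename
print(url_to_filename('https://example.com/blog/post-1/'))
# → example.com_blog_post-1.txt
```

You could then write `df.to_csv(f'C:/Users/Acer/Desktop/scripts/content/{url_to_filename(url)}', index=False)` inside the loop.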