2 回答
TA貢獻1863條經驗 獲得超2個贊
無需在此處使用 Selenium,因為您可以從 API 訪問信息。這拉動了 247 條公告:
import requests
from pandas.io.json import json_normalize
url = 'https://api.bseindia.com/BseIndiaAPI/api/AnnGetData/w'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36'}
payload = {
'strCat': '-1',
'strPrevDate': '20190423',
'strScrip': '',
'strSearch': 'P',
'strToDate': '20190423',
'strType': 'C'}
jsonData = requests.get(url, headers=headers, params=payload).json()
df = json_normalize(jsonData['Table'])
df['ATTACHMENTNAME'] = '=HYPERLINK("https://www.bseindia.com/xml-data/corpfiling/AttachLive/' + df['ATTACHMENTNAME'] + '")'
df.to_csv('C:/filename.csv', index=False)
輸出:
...
GYSCOAL ALLOYS LTD. - 533275 - Announcement under Regulation 30 (LODR)-Code of Conduct under SEBI (PIT) Regulations, 2015
https://www.bseindia.com/xml-data/corpfiling/AttachLive/82f18673-de98-4a88-bbea-7d8499f25009.pdf
INDIAN SUCROSE LTD. - 500319 - Certificate Under Regulation 40(9) Of Listing Regulation For The Half Year Ended 31.03.2019
https://www.bseindia.com/xml-data/corpfiling/AttachLive/2539d209-50f6-4e56-a123-8562067d896e.pdf
Dhanvarsha Finvest Ltd - 540268 - Reply To Clarification Sought From The Company
https://www.bseindia.com/xml-data/corpfiling/AttachLive/f8d80466-af58-4336-b251-a9232db597cf.pdf
Prabhat Telecoms (India) Ltd - 540027 - Signing Of Framework Supply Agreement With METRO Cash & Carry India Private Limited
https://www.bseindia.com/xml-data/corpfiling/AttachLive/acfb1f72-efd3-4515-a583-2616d2942e78.pdf
...
TA貢獻1856條經驗 獲得超17個贊
有關您的用例的更多信息將有助于回答您的問題。但是,要提取有關您可以訪問該網站的網站內頁面總數的信息,請單擊帶有文本作為下一步的項目并提取所需的數據,您可以使用以下解決方案:
代碼塊:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument("--disable-extensions")
# options.add_argument('disable-infobars')
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get("https://www.bseindia.com/corporates/ann.html")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[text()='Disclaimer']//following::div[1]//li[@class='pagination-last ng-scope']/a[@class='ng-binding' and text()='Last']"))).click()
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[text()='Disclaimer']//following::div[1]//li[@class='pagination-page ng-scope active']/a[@class='ng-binding']"))).get_attribute("innerHTML"))
控制臺輸出:
17
添加回答
舉報
