
Eliminating duplicate links on a web page and avoiding stale element errors

Java

達令說 2023-07-13 13:49:16

I have a list of 20 links, some of which are duplicates. I click the first link, which takes me to the next page, and I download some files from that page.

Page 1
Link 1
Link 2
Link 3
Link 1
Link 3
Link 4
Link 2

Link 1 (click) --> (opens) Page 2
Page 2 (click the browser's back button) --> (returns to) Page 1

Now I click Link 2 and repeat the same steps.

System.setProperty("webdriver.chrome.driver", "C:\\chromedriver.exe");
String fileDownloadPath = "C:\\Users\\Public\\Downloads";

// Set properties to suppress popups
Map<String, Object> prefsMap = new HashMap<String, Object>();
prefsMap.put("profile.default_content_settings.popups", 0);
prefsMap.put("download.default_directory", fileDownloadPath);
prefsMap.put("plugins.always_open_pdf_externally", true);
prefsMap.put("safebrowsing.enabled", "false");

// assign driver properties
ChromeOptions option = new ChromeOptions();
option.setExperimentalOption("prefs", prefsMap);
option.addArguments("--test-type");
option.addArguments("--disable-extensions");
option.addArguments("--safebrowsing-disable-download-protection");
option.addArguments("--safebrowsing-disable-extension-blacklist");

WebDriver driver = new ChromeDriver(option);
driver.get("http://www.mywebpage.com/");

List<WebElement> listOfLinks = driver.findElements(By.xpath("//a[contains(@href,'Link')]"));
Thread.sleep(500);
int pageSize = listOfLinks.size();
System.out.println("The number of links in the page is: " + pageSize);

// iterate through all the links on the page
String linkText;
for (int i = 0; i < pageSize; i++) {
    System.out.println("Clicking on link: " + i);
    try {
        linkText = listOfLinks.get(i).getText();
        listOfLinks.get(i).click();
    }
    // (rest of the loop not shown)
}

This code works well: it clicks every link and downloads the files. Now I need to improve the logic so that duplicate links are skipped. I tried filtering the duplicates out of the list, but I'm not sure how to handle org.openqa.selenium.StaleElementReferenceException. The solution I'm looking for clicks the first occurrence of a link and does not click it when it shows up again.

(This is part of a more complex flow for downloading multiple files from a portal I have no control over. So please don't come back with questions like why the page has duplicate links in the first place.)


3 Answers

ibeautiful

Contributed 1993 experience points · received 6+ likes

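First, I don't recommend issuing the same request to WebDriver (findElements) over and over. Down that path you will run into performance problems, especially if you have many links and pages.

Also, if you always do everything in the same tab, you have to wait for two page loads per link (the link page and the download page). If you open each link in a new tab instead, you only wait for the page you want to download from.

My suggestion is to filter out the duplicate links as @supputuri described, and to open each link in a new tab. That way you don't have to deal with stale elements, you don't have to search the page for the links every time, and you don't have to wait for the link page to reload on every iteration.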

List<WebElement> uniqueLinks = driver.findElements(By.xpath("//a[contains(@href,'Link')][not(@href = following::a/@href)]"));
for (int i = 0; i < uniqueLinks.size(); i++)
{
    new Actions(driver)
        .keyDown(Keys.CONTROL)
        .click(uniqueLinks.get(i))
        .keyUp(Keys.CONTROL)
        .build()
        .perform();
    // if you want you can create the array here on this line instead of create inside the method below.
    driver.switchTo().window(new ArrayList<>(driver.getWindowHandles()).get(1));
    //do your wait stuff.
    driver.findElement(By.xpath("//span[contains(@title,'download')]")).click();
    //do your wait stuff.
    driver.close();
    driver.switchTo().window(new ArrayList<>(driver.getWindowHandles()).get(0));
}

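I can't test this code properly right now. If there is any problem with it, just leave a comment and I'll update the answer. The idea is sound, though, and quite simple.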

2023-07-13

叮當貓咪

Contributed 1776 experience points · received 12+ likes

First, let's look at the XPath.

Sample HTML:

<!DOCTYPE html>
<html>
    <body>
    <div>
        <a >Google</a>
        <a >Yahoo</a>
        <a >Google</a>
        <a >MSN</a>
    </div>
    </body>
</html>

Here is the XPath that returns the distinct links from the markup above.

//a[not(@href = following::a/@href)]

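The logic of the XPath is that a link is returned only if its href does not match the href of any later link. If it does match, the link is treated as a duplicate and the XPath does not return that element. Note that this keeps the last occurrence of each href; to keep the first occurrence instead, use the preceding:: axis in place of following::.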
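To see which elements the expression actually keeps, here is a small sketch that runs it against an in-memory document using the JDK's built-in XPath engine, with no browser involved. The sample markup, the href values, and the class and method names are made up for illustration; only the XPath expressions come from the answer.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathDedupDemo {
    // Hypothetical sample markup; the href values are invented for this demo.
    static final String SAMPLE =
        "<div>"
        + "<a href='/google'>Google-1</a>"
        + "<a href='/yahoo'>Yahoo</a>"
        + "<a href='/google'>Google-2</a>"
        + "<a href='/msn'>MSN</a>"
        + "</div>";

    // Evaluates the expression against SAMPLE and returns the text of each matched <a>.
    static List<String> select(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(SAMPLE)), XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // following:: drops a link when a later link has the same href
        System.out.println(select("//a[not(@href = following::a/@href)]"));
        // preceding:: drops a link when an earlier link has the same href
        System.out.println(select("//a[not(@href = preceding::a/@href)]"));
    }
}
```

With this sample, the following:: form returns the second Google link and the preceding:: form returns the first one, which is the variant to pick if the first occurrence is the one that should be clicked.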

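Stale elements: now it's time to deal with the stale element problem in your code. As soon as you click Link 1, every reference stored in listOfLinks becomes invalid, because Selenium assigns new references to the elements each time they are loaded on the page. When you try to access an element through an old reference, you get a stale element exception. Below is a piece of code that should give you the idea.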

List<WebElement> listOfLinks = driver.findElements(By.xpath("//a[contains(@href,'Link')][not(@href = following::a/@href)]"));
Thread.sleep(500);
int pageSize = listOfLinks.size();
System.out.println("The number of links in the page is: " + pageSize);
// iterate through all the links on the page
for (int i = 0; i < pageSize; i++)
{
    // ===> consider adding an explicit wait (WebDriverWait) until the link elements matching "//a[contains(@href,'Link')][not(@href = following::a/@href)]" are present
    // don't hard code the sleep
    // ===> added this line: re-locate the link on every iteration so the reference is fresh
    WebElement link = driver.findElements(By.xpath("//a[contains(@href,'Link')][not(@href = following::a/@href)]")).get(i);
    System.out.println("Clicking on link: " + i);
    // ===> updated next 2 lines
    String linkText = link.getText();
    link.click();
    // ===> consider adding an explicit wait using WebDriverWait to make sure the span exists before clicking.
    driver.findElement(By.xpath("//span[contains(@title,'download')]")).click();
    // ===> check this answer (https://stackoverflow.com/questions/34548041/selenium-give-file-name-when-downloading/56570364#56570364) to make sure the download is completed before clicking on browser back rather than sleeping for x seconds.
    driver.navigate().back();
    // ===> removed hard coded wait time (sleep)
}
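Another way to sidestep stale references, sketched here as an assumption rather than as part of the answer, is to read every href into a plain string before the first click. Strings never go stale, and a LinkedHashSet keeps the first occurrence of each value in page order. In the Selenium loop the strings would come from link.getAttribute("href"), and each unique one could then be opened with driver.get(href); that only works if the links navigate through their href rather than through a JavaScript click handler. The class and method names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

final class UniqueHrefs {
    // Drops duplicate hrefs while preserving the order of first occurrences.
    static List<String> firstOccurrences(List<String> hrefs) {
        return new ArrayList<>(new LinkedHashSet<>(hrefs));
    }

    public static void main(String[] args) {
        // Same link order as "Page 1" in the question; the paths are placeholders.
        List<String> onPage = List.of(
            "/Link1", "/Link2", "/Link3", "/Link1", "/Link3", "/Link4", "/Link2");
        for (String href : firstOccurrences(onPage)) {
            // In the real test this is where driver.get(href) and the download click would go.
            System.out.println("Would visit: " + href);
        }
    }
}
```

Because the deduplication happens on strings, it does not matter how many times the page is reloaded in between.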

If you want to open the links in a new window, use the following logic.

// Selenium 3 constructor; Selenium 4 takes Duration.ofSeconds(20) instead of 20
WebDriverWait wait = new WebDriverWait(driver, 20);
wait.until(ExpectedConditions.presenceOfAllElementsLocatedBy(By.xpath("//a[contains(@href,'Link')][not(@href = following::a/@href)]")));
List<WebElement> listOfLinks = driver.findElements(By.xpath("//a[contains(@href,'Link')][not(@href = following::a/@href)]"));
JavascriptExecutor js = (JavascriptExecutor) driver;
for (WebElement link : listOfLinks) {
    // get the href
    String href = link.getAttribute("href");
    // open the link in a new tab; passing href as an argument avoids quoting problems in the script
    js.executeScript("window.open(arguments[0])", href);
    // switch to new tab
    ArrayList<String> tabs = new ArrayList<String>(driver.getWindowHandles());
    driver.switchTo().window(tabs.get(1));
    // click on download
    // close the new tab
    driver.close();
    // switch to parent window
    driver.switchTo().window(tabs.get(0));
}
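One caveat about the tab switching: getWindowHandles() returns a Set<String>, and as far as I know the WebDriver API does not promise that the new tab sits at index 1. A more defensive approach is to take a snapshot of the handles before opening the tab and pick the one handle that was added afterwards. The helper below is a hypothetical, Selenium-free sketch of that set difference; in real code the two sets would be the results of driver.getWindowHandles() before and after the window.open call.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class NewTabHandle {
    // Returns the single handle that is in 'after' but not in 'before'.
    static String findNewHandle(Set<String> before, Set<String> after) {
        Set<String> added = new LinkedHashSet<>(after);
        added.removeAll(before);
        if (added.size() != 1) {
            throw new IllegalStateException(
                "Expected exactly one new window, found " + added.size());
        }
        return added.iterator().next();
    }

    public static void main(String[] args) {
        // Placeholder handle strings; real handles are opaque IDs issued by the driver.
        Set<String> before = Set.of("parent");
        Set<String> after = Set.of("parent", "child");
        System.out.println(findNewHandle(before, after));
    }
}
```

The parent handle can be remembered the same way, with driver.getWindowHandle() before the loop, so switching back does not depend on list order either.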

2023-07-13

慕仙森

Contributed 1827 experience points · received 8+ likes

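You can do it like this.

Save the index of each element of the list in a hash table, keyed by the link text
If the hash table already contains that key, skip the element
Once this is done, the hash table holds only the unique elements, i.e. the first ones found

The values of the hash table are the indices into listOfLinks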

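        // requires: import java.util.Hashtable;
        Hashtable<String, Integer> hs1 = new Hashtable<>();
        for (int i = 0; i < listOfLinks.size(); i++) {
            String text = listOfLinks.get(i).getText();
            // store only the first index seen for each link text
            if (!hs1.containsKey(text)) {
                hs1.put(text, i);
            }
        }
        // note: Hashtable does not keep insertion order, so the links are not visited top to bottom
        for (int i : hs1.values()) {
            // if a click navigates away, re-locate the links before the next get(i) to avoid a stale reference
            listOfLinks.get(i).click();
        }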

2023-07-13
