Check whether an array element matches one of the following

海綿寶寶撒 2023-03-18 17:15:44
I am building a service that parses PDFs into text. Once I have the text, I have to match it against a set of words, and each match increments a counter. So far, so good. The difficulty is that, once the PDF has been parsed to text, I cannot tell which page I am on. I have noticed that every occurrence of two consecutive newlines (\n\n) in the text means a page break occurred. What I want is to detect when the page changes and, besides counting how many times a word was found in total, also report which pages it was found on. Example:

let data =  `resignations / resignations. adm. mancom .: berenguer llinares
appointments. adm. unique: calvo valenzuela. other concepts: change of the administrative body:
joint administrators to sole administrator. change of registered office. ptda colomer, 6
Official Gazette of the Commercial Registry
no. 182 Friday, September 18, 2020 p. 33755
cve: borme-a-2020-182-03 verifiable in
sarria). registry data. t 2257, f 100, s 8, h a 54815, i / a 4 (10.09.20) .`

let wordsToSearch = ['resignations', "administrators"]

wordsToSearch.forEach((word) => {
    // inside of here would like to have track of the page as well
    let stringArray = data.split(' ');
    let count = 0;
    let result = ""
    for (var i = 0; i < stringArray.length; i++) {
        let wordText = stringArray[i];
        if (new RegExp(word).test(wordText)) {
            count++
        }
    }
    // the expected result would word has appeared count times in the pages etc
    result += `${word} has appeared ${count} times\n`
    console.log(result)
    /*
    resignations has appeared 2 times
    administrators has appeared 1 times
    */
})

If anyone can think of another approach, that would be great too.

1 answer

莫回無


You can split the text at those double newlines and then analyse each page separately. Here is how I would do it:


let data = `resignations / Friday resignations. adm. mancom .: berenguer llinares
            appointments. adm. unique: calvo Friday valenzuela. other concepts: change of the administrative body:
            joint administrators to sole administrator. change of registered office. ptda colomer, 6, Friday

            Official Gazette of the Commercial Registry
            no. 182 Friday, September 18, 2020 p. 33755
            cve: borme-a-2020-182-03 verifiable in
            sarria). registry data. t 2257, f 100, s 8, h a 54815, i / a 4 (10.09.20) .`

// Split the text into pages at each double newline and analyse every page.
function analyseText(text, wordsToFind) {
    // Use the `text` parameter, not the global `data`.
    const pages = text.split("\n\n");
    const result = {};
    for (let pageIndex = 0; pageIndex < pages.length; pageIndex++) {
        analysePage({
            pageIndex,
            pageText: pages[pageIndex]
        }, wordsToFind, result);
    }
    return Object.values(result);
}

// Count each word on one page and accumulate into `result`, keyed by word.
function analysePage(page, wordsToFind, result) {
    const {
        pageText,
        pageIndex
    } = page;
    wordsToFind.forEach(word => {
        // The word is used as a regular expression, so special characters are not escaped.
        const count = (pageText.match(new RegExp(word, 'g')) || []).length;
        if (count > 0) {
            if (!result[word]) {
                result[word] = {
                    name: word,
                    pageIndices: [],
                    count: 0
                };
            }
            // pageIndex is zero-based: the first page is 0.
            result[word].pageIndices.push(pageIndex);
            result[word].count += count;
        }
    });
}

const result = analyseText(data, ['resignations', "administrators", "Friday"]);
console.log(result);

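In this example I collect, for each word, the pages it appears on and its total count, and then simply print the result; you can of course build a result object that also stores the results for each individual page.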
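A result object that keeps a separate count per page, as suggested above, might look like the following minimal sketch. The names `countWordsPerPage`, `escapeRegExp`, `report` and `sample` are hypothetical and not part of the original answer. It also makes two assumptions the original does not: search terms are matched literally as whole words, and pages are numbered from 1.

```javascript
// Escape regex metacharacters so a search term is matched literally.
function escapeRegExp(term) {
    return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: returns { word: { total, pages: { pageNumber: count } } }.
// Assumes pages are separated by a double newline, as described in the question.
function countWordsPerPage(text, wordsToFind, pageSeparator = "\n\n") {
    const pages = text.split(pageSeparator);
    const report = {};
    for (const word of wordsToFind) {
        // \b restricts matches to whole words; drop it to match substrings too.
        const pattern = new RegExp(`\\b${escapeRegExp(word)}\\b`, "g");
        const entry = { total: 0, pages: {} };
        pages.forEach((pageText, index) => {
            const count = (pageText.match(pattern) || []).length;
            if (count > 0) {
                entry.pages[index + 1] = count; // 1-based page number (assumption)
                entry.total += count;
            }
        });
        report[word] = entry;
    }
    return report;
}

// Small made-up sample with three pages.
const sample = "resignations and more resignations\n\njoint administrators\n\nresignations again";
const report = countWordsPerPage(sample, ["resignations", "administrators"]);
console.log(JSON.stringify(report));
// {"resignations":{"total":3,"pages":{"1":2,"3":1}},"administrators":{"total":1,"pages":{"2":1}}}
```

Keying `pages` by page number keeps both questions cheap to answer: `Object.keys(entry.pages)` lists the pages a word appears on, and `entry.total` gives the overall count.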



2023-03-18