
Regular expression to get a <script> tag


慕雪6442864 2023-07-14 16:34:30
I am trying to target the whole script tag that contains "@type": "NewsArticle". My current attempt looks like this:

    <script type="application\/ld\+json">[^\{]*?{(.*?)\}[^\}]*?<\/script>

This regular expression matches the topmost script tag, but I want the NewsArticle JSON. In this example it is the second tag, but some pages have four or more application/ld+json tags. "@type": "NewsArticle" is present on every page regardless, so I am looking for a pattern that targets that specific script. Thanks for the help.

    <script type="application/ld+json">
    {
        "@context": "http://schema.org",
        "@type": "Organization",
        "@id": "https://www.givemesport.com/#gms",
        "name": "GiveMeSport",
        "url": "https://www.givemesport.com",
        "logo": {
            "@type": "ImageObject",
            "url": "https://gmsrp.cachefly.net/v4/images/logo-gms-black.png"
        },
        "sameAs": [
            "https://www.facebook.com/GiveMeSport",
            "https://www.instagram.com/givemesport",
            "https://twitter.com/GiveMeSport",
            "https://www.youtube.com/user/GiveMeSport"
        ]
    }
    </script>
    <script type="application/ld+json">
    {
        "@context": "http://schema.org",
        "@type": "NewsArticle",
        "mainEntityOfPage": "https://www.givemesport.com/1612447-man-uniteds-scott-mctominay-delighted-fans-with-reaction-after-third-goal-vs-rb-leipzig",
        "url": "https://www.givemesport.com/1612447-man-uniteds-scott-mctominay-delighted-fans-with-reaction-after-third-goal-vs-rb-leipzig",
        "headline": "Man United's Scott McTominay delighted fans with reaction after third goal vs RB Leipzig",
        "datePublished": "2020-10-30T21:52:48.3510000Z",
        "dateModified": "2020-10-30T21:52:48.3510000Z",
        "description": "Man United's Scott McTominay delighted fans with reaction after third goal vs RB Leipzig",
        "articleSection": "Football",
        "keywords": ["Football","Manchester United","Marcus Rashford","RB Leipzig","Scott McTominay","UEFA Champions"],
        "creator": ["Scott Wilson"],
        "thumbnailUrl": "https://gmsrp.cachefly.net/images/20/10/30/03a426c8204af5c8d02282afaeed6189/144.jpg",
        "author": {
            "@type": "Person",
            "name": "Scott Wilson",
            "sameAs": "https://www.givemesport.com/scott-wilson-1"
        },

1 Answer

森欄

Contributed 1810 experience points, received more than 5 likes

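I'm sorry to hear that you don't want to follow best practice: parsing HTML with regular expressions is fraught with problems. However, if you want a quick and dirty workaround, use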

<script type="application\/ld\+json">((?:(?!<\/?script)[\w\W])*?"@type":\s*"NewsArticle"[\w\W]*?)<\/script>
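As a rough illustration of how this pattern might be applied, here is a minimal Python sketch. The sample HTML is a shortened, hypothetical stand-in for the page in the question, and `find_news_article` is a helper name made up for this example. It assumes the `type` attribute is written exactly as in the pattern (one space after `<script`, double quotes, no other attributes).

```python
import json
import re

# The pattern from the answer, split across lines for readability.
NEWS_ARTICLE_RE = re.compile(
    r'<script type="application\/ld\+json">'
    r'((?:(?!<\/?script)[\w\W])*?"@type":\s*"NewsArticle"[\w\W]*?)'
    r'<\/script>'
)

# Hypothetical, shortened stand-in for the page in the question.
SAMPLE_HTML = '''
<script type="application/ld+json">
{"@context": "http://schema.org", "@type": "Organization", "name": "GiveMeSport"}
</script>
<script type="application/ld+json">
{"@context": "http://schema.org", "@type": "NewsArticle", "articleSection": "Football"}
</script>
'''


def find_news_article(html):
    """Return the parsed NewsArticle JSON-LD object, or None if no tag matches."""
    match = NEWS_ARTICLE_RE.search(html)
    if match is None:
        return None
    # Group 1 holds everything between the opening and closing script tags.
    return json.loads(match.group(1))


article = find_news_article(SAMPLE_HTML)
print(article["@type"], article["articleSection"])
```

The negative look-ahead is what stops the match from starting in the Organization tag and running on into the NewsArticle tag: the lazy group cannot step over a `<script` or `</script`, so only a tag that itself contains the marker can match.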

Explanation

--------------------------------------------------------------------------------
  <script                  '<script type="application'
  type="application
--------------------------------------------------------------------------------
  \/                       '/'
--------------------------------------------------------------------------------
  ld                       'ld'
--------------------------------------------------------------------------------
  \+                       '+'
--------------------------------------------------------------------------------
  json">                   'json">'
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    (?:                      group, but do not capture (0 or more
                             times (matching the least amount
                             possible)):
--------------------------------------------------------------------------------
      (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
        <                        '<'
--------------------------------------------------------------------------------
        \/?                      '/' (optional (matching the most
                                 amount possible))
--------------------------------------------------------------------------------
        script                   'script'
--------------------------------------------------------------------------------
      )                        end of look-ahead
--------------------------------------------------------------------------------
      [\w\W]                   any character of: word characters (a-
                               z, A-Z, 0-9, _), non-word characters
                               (all but a-z, A-Z, 0-9, _)
--------------------------------------------------------------------------------
    )*?                      end of grouping
--------------------------------------------------------------------------------
    "@type":                 '"@type":'
--------------------------------------------------------------------------------
    \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    "NewsArticle"            '"NewsArticle"'
--------------------------------------------------------------------------------
    [\w\W]*?                 any character of: word characters (a-z,
                             A-Z, 0-9, _), non-word characters (all
                             but a-z, A-Z, 0-9, _) (0 or more times
                             (matching the least amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  <                        '<'
--------------------------------------------------------------------------------
  \/                       '/'
--------------------------------------------------------------------------------
  script>                  'script>'

Answered 2023-07-14