
Converting web-scraped data into a DataFrame


紫衣仙女 2023-09-19 14:56:51
I scraped some data and tried to convert it to JSON, but it does not seem to work. I want to turn the result into a dictionary of keys and values and then convert that into a DataFrame.

from bs4 import BeautifulSoup
import bs4
import requests
import json
from urllib.request import Request, urlopen

# url is defined elsewhere (not shown in the question)
req = Request(url, headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6)'})
webpage = urlopen(req).read().decode("utf-8")
webpage = json.loads(webpage)

Output:

{'data': [{'id': 'GILD',
   'attributes': {'longDesc': "Gilead Sciences, Inc., a research-based biopharmaceutical company, discovers, develops, and commercializes medicines in the areas of unmet medical needs in the United States, Europe, and internationally. It was founded in 1987 and is headquartered in Foster City, California.",
    'sectorname': 'Health Care',
    'sectorgics': 35,
    'primaryname': 'Biotechnology',
    'primarygics': 35201010,
    'numberOfEmployees': 11800.0,
    'yearfounded': 1987,
    'streetaddress': '333 Lakeside Drive',
    'streetaddress2': None,
    'streetaddress3': None,
    'streetaddress4': None,
    'city': 'Foster City',
    'peRatioFwd': 9.02045209903122,
    'lastClosePriceEarningsRatio': None,
    'divRate': 2.72,
    'divYield': 4.33,
    'shortIntPctFloat': 1.433,
    'impliedMarketCap': None,
    'marketCap': 78796576654.0,
    'divTimeFrame': 'forward'}}]}

The result I want is:

df = {'id': 'GILD', 'longDesc': 'Gilead...}

1 Answer

當年話下

Contributed 1890 experience points, received more than 9 likes

Based on your data, you could try the following as part of your code:


d = {
    'data': [
        {
            'id': 'GILD',
            'attributes': {
                'longDesc': "Gilead Sciences, Inc., a research-based biopharmaceutical company, discovers, develops, and commercializes medicines in the areas of unmet medical needs in the United States, Europe, and internationally. It was founded in 1987 and is headquartered in Foster City, California.",
                'sectorname': 'Health Care',
                'sectorgics': 35,
                'primaryname': 'Biotechnology',
                'primarygics': 35201010,
                'numberOfEmployees': 11800.0,
                'yearfounded': 1987,
                'streetaddress': '333 Lakeside Drive',
                'streetaddress2': None,
                'streetaddress3': None,
                'streetaddress4': None,
                'city': 'Foster City',
                'peRatioFwd': 9.02045209903122,
                'lastClosePriceEarningsRatio': None,
                'divRate': 2.72,
                'divYield': 4.33,
                'shortIntPctFloat': 1.433,
                'impliedMarketCap': None,
                'marketCap': 78796576654.0,
                'divTimeFrame': 'forward'
            }
        }
    ]
}

try:
    # Pull the two fields of interest out of the first record.
    _id = d['data'][0]['id']
    ld = d['data'][0]['attributes']['longDesc']
except (KeyError, IndexError) as error:
    # A missing key or an empty 'data' list ends up here.
    print(f"Failed to load data: {error}")
else:
    # Build and print the result only when both lookups succeeded,
    # so df is never referenced while undefined.
    df = {"id": _id, 'longDesc': ld}
    print(df)

Output:


{'id': 'GILD', 'longDesc': 'Gilead Sciences, Inc., a research-based biopharmaceutical company, discovers, develops, and commercializes medicines in the areas of unmet medical needs in the United States, Europe, and internationally. It was founded in 1987 and is headquartered in Foster City, California.'}

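Note: df usually refers to a DataFrame, which is typically created with the pandas module. What you have, however, is probably the JSON object returned by the request you made. That said, the output you want is actually a dictionary, but I have kept your naming convention.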
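If you want one flat dictionary that holds the id together with every attribute, and not only longDesc, a minimal sketch could look like this. The helper name flatten_record and the trimmed sample payload are assumptions made for illustration; the sample keeps only a few fields from the output in the question.

```python
def flatten_record(record):
    """Merge a record's 'id' with its nested 'attributes' into one flat dict."""
    # 'id' comes first; the attribute keys follow in their original order.
    return {"id": record["id"], **record.get("attributes", {})}


# Trimmed sample of the payload from the question (illustrative only).
payload = {
    "data": [
        {
            "id": "GILD",
            "attributes": {
                "sectorname": "Health Care",
                "yearfounded": 1987,
                "city": "Foster City",
            },
        }
    ]
}

flat = flatten_record(payload["data"][0])
print(flat)
# {'id': 'GILD', 'sectorname': 'Health Care', 'yearfounded': 1987, 'city': 'Foster City'}
```

Be aware that if the attributes ever contained their own 'id' key, it would overwrite the top-level one in this sketch.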


Edit:


To convert your dict into a df, just do the following:


import pandas as pd

d = {'id': 'GILD', 'longDesc': 'Gilead Sciences, Inc., a research-based biopharmaceutical company, discovers, develops, and commercializes medicines in the areas of unmet medical needs in the United States, Europe, and internationally. It was founded in 1987 and is headquartered in Foster City, California.'}
# Each (key, value) pair becomes one row, with the key in column 0 and the value in column 1.
df = pd.DataFrame(d.items())
print(df)

This outputs:

          0                                                  1
0        id                                               GILD
1  longDesc  Gilead Sciences, Inc., a research-based biopha...
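If you would prefer one row per record, with the dictionary keys as column names, you can pass a list of flat dictionaries to pd.DataFrame instead. This is only a sketch under assumptions: the payload is a trimmed stand-in for the response in the question, and the names rows and df_wide are made up for the example.

```python
import pandas as pd

# Trimmed sample of the payload from the question (illustrative only).
payload = {
    "data": [
        {
            "id": "GILD",
            "attributes": {
                "sectorname": "Health Care",
                "yearfounded": 1987,
                "city": "Foster City",
            },
        }
    ]
}

# One flat dict per record: the id plus all of its attributes.
rows = [{"id": rec["id"], **rec["attributes"]} for rec in payload["data"]]

# A list of dicts yields one row per dict and one column per key.
df_wide = pd.DataFrame(rows)

print(df_wide.shape)          # (1, 4)
print(list(df_wide.columns))  # ['id', 'sectorname', 'yearfounded', 'city']
```

With more than one entry in 'data', each entry would simply become another row.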


Answered 2023-09-19