
Pandas: using netaddr on two DataFrames to check whether an IP column falls within a CIDR range, with a boolean result

Python

慕容森 2021-11-23 16:29:17

I am using the netaddr Python library. I have two DataFrames: one holds IP ranges that can be converted to CIDR notation, and the other holds IP addresses that I want to check against those ranges.

Creating the range DataFrame:

    import pandas as pd
    import netaddr
    from netaddr import *

    a = {'StartAddress': ['65.14.88.64', '148.77.37.88', '65.14.41.128', '65.14.40.0'],
         'EndAddress': ['65.14.88.95', '148.77.37.95', '65.14.41.135', '65.14.40.255']}
    df1 = pd.DataFrame(data=a)

    # Convert range to netaddr cidr format
    def rangetocidr(row):
        return netaddr.iprange_to_cidrs(row.StartAddress, row.EndAddress)

    df1["CIDR"] = df1.apply(rangetocidr, axis=1)
    df1

       StartAddress    EndAddress                CIDR
    0   65.14.88.64   65.14.88.95    [65.14.88.64/27]
    1  148.77.37.88  148.77.37.95   [148.77.37.88/29]
    2  65.14.41.128  65.14.41.135   [65.14.41.128/29]
    3    65.14.40.0  65.14.40.255     [65.14.40.0/24]

    df1["CIDR"].iloc[0]
    [IPNetwork('65.14.88.64/27')]

Creating the IP DataFrame:

    b = {'IP': ['65.13.88.64', '148.65.37.88', '65.14.88.65', '148.77.37.93', '66.15.41.132']}
    df2 = pd.DataFrame(data=b)

    # Convert ip to netaddr format
    def iptonetaddrformat(row):
        return netaddr.IPAddress(row.IP)

    df2["IP_Format"] = df2.apply(iptonetaddrformat, axis=1)
    df2

                 IP     IP_Format
    0   65.13.88.64   65.13.88.64
    1  148.65.37.88  148.65.37.88
    2   65.14.88.65   65.14.88.65
    3  148.77.37.93  148.77.37.93
    4  66.15.41.132  66.15.41.132

    df2["IP_Format"].iloc[0]
    IPAddress('65.13.88.64')

I would like to add a column to df2 that says whether each IP address falls inside one of the CIDR blocks in df1. The result should look like this:

    df2

                 IP     IP_Format  IN_CIDR
    0   65.13.88.64   65.13.88.64    False
    1  148.65.37.88  148.65.37.88    False
    2   65.14.88.65   65.14.88.65     True
    3  148.77.37.93  148.77.37.93     True
    4  66.15.41.132  66.15.41.132    False

I would prefer to do this using only the columns of the two DataFrames. I did try converting the columns to lists and using the following, but it does not seem to work:

    df2list = repr(df2[['IP_Format']])
    df1list = df[['CIDR']]

    def ipincidr(row):
        return netaddr.largest_matching_cidr(df2list, df1list)

    df2['INRANGE'] = df2.apply(ipincidr, axis=1)


1 Answer

開滿天機


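The solution below assumes that only the fourth octet of the IP varies within a range, while the first three octets stay the same, as in the data shown in the question.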

    # Split each IP into two parts: the first three octets (__.__.__) and the last octet.
    # Do this for the IPs in df2 and for the Start and End columns of df1.
    ip = df2.IP.str.rsplit('.', n=1, expand=True)
    ip.columns = ['IP_init', 'IP_last']
    start = df1.StartAddress.str.rsplit('.', n=1, expand=True)
    start.columns = ['start_init', 'start_last']
    end = df1.EndAddress.str.rsplit('.', n=1, expand=True)
    end.columns = ['end_init', 'end_last']

    # Compare the last octet as an integer; compared as strings, '9' would sort after '88'.
    ip['IP_last'] = ip['IP_last'].astype(int)
    start['start_last'] = start['start_last'].astype(int)
    end['end_last'] = end['end_last'].astype(int)

    df = pd.concat([ip, start, end], axis=1)

    # Check whether each IP belongs to any of the given blocks; if so, record its index.
    index = []
    for idx in ip.index:
        for i in start.index:
            if df.loc[idx, 'IP_init'] == df.loc[i, 'start_init']:
                if df.loc[i, 'start_last'] <= df.loc[idx, 'IP_last'] <= df.loc[i, 'end_last']:
                    index.append(idx)
                    break

    # Create column IN_CIDR and mark True for the rows that fall inside an IP block.
    df2['IN_CIDR'] = False
    df2.loc[index, 'IN_CIDR'] = True
    df2

                 IP     IP_Format  IN_CIDR
    0   65.13.88.64   65.13.88.64    False
    1  148.65.37.88  148.65.37.88    False
    2   65.14.88.65   65.14.88.65     True
    3  148.77.37.93  148.77.37.93     True
    4  66.15.41.132  66.15.41.132    False
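If the ranges may span more than the last octet, the assumption above no longer holds. The following is a minimal sketch of a more general check, assuming it is acceptable to swap in the standard library `ipaddress` module instead of netaddr; `summarize_address_range` plays the same role here as `iprange_to_cidrs` in the question. The names `ranges`, `ips`, `networks` and `in_any_network` are illustrative and do not come from the original answer.

```python
import ipaddress

# Ranges and addresses copied from the question.
ranges = [
    ("65.14.88.64", "65.14.88.95"),
    ("148.77.37.88", "148.77.37.95"),
    ("65.14.41.128", "65.14.41.135"),
    ("65.14.40.0", "65.14.40.255"),
]
ips = ["65.13.88.64", "148.65.37.88", "65.14.88.65", "148.77.37.93", "66.15.41.132"]

# Expand every start/end pair into the CIDR blocks that cover it exactly.
networks = [
    net
    for start, end in ranges
    for net in ipaddress.summarize_address_range(
        ipaddress.ip_address(start), ipaddress.ip_address(end)
    )
]


def in_any_network(ip: str) -> bool:
    """Return True if ip falls inside at least one of the networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


in_cidr = [in_any_network(ip) for ip in ips]
print(in_cidr)  # [False, False, True, True, False]
```

With the DataFrames from the question, the same function could be applied column-wise, for example `df2['IN_CIDR'] = df2['IP'].map(in_any_network)`. This scans every network for every address, which is fine for a handful of ranges but may be slow for very large tables.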

Note: you can also use np.where(df.IP_init.isin(df.start_init), True, False), which yields [False, False, True, True, False], to skip the first iteration. You can then focus only on the True rows, which reduces overhead.
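The pre-filter described in the note can be sketched as follows. This is a hedged illustration rather than the answerer's exact code: it rebuilds the two DataFrames from the question, and the names `ip_prefix`, `start_prefix` and `candidates` are assumptions. Since `isin` already returns a boolean Series, wrapping it in `np.where(..., True, False)` is optional.

```python
import pandas as pd

df1 = pd.DataFrame({
    "StartAddress": ["65.14.88.64", "148.77.37.88", "65.14.41.128", "65.14.40.0"],
    "EndAddress": ["65.14.88.95", "148.77.37.95", "65.14.41.135", "65.14.40.255"],
})
df2 = pd.DataFrame({
    "IP": ["65.13.88.64", "148.65.37.88", "65.14.88.65", "148.77.37.93", "66.15.41.132"],
})

# First three octets of each address (everything before the last dot).
ip_prefix = df2["IP"].str.rsplit(".", n=1).str[0]
start_prefix = df1["StartAddress"].str.rsplit(".", n=1).str[0]

# True where the prefix occurs in at least one range; only these rows
# need the more expensive last-octet comparison.
candidates = ip_prefix.isin(start_prefix)
print(candidates.tolist())  # [False, False, True, True, False]
```

For this data the mask happens to equal the final answer, but in general a True here only means the row is worth checking further, not that the address is inside a range.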

2021-11-23
