I have a table of "BwrPersonId" and "LoanId".

BwrPersonId  LoanId
113225       16330
113225       27073
113225       68842
113253       16341
113269       16348
113285       16354
113289       26768
113297       16360
113299       16361
113319       16369
113418       16403
113418       26854

I want to know which loans belong to the same borrower, so I tried to group by "BwrPersonId" and "LoanId". The result I expect has one row per borrower, with that borrower's loans in the columns Loan1 to Loan5.

This is my code, but it doesn't work.

grouped = pd.DataFrame()
unique = loan['BwrPersonId'].unique()
grouped['BwrPersonId'] = ''*len(loan['BwrPersonId'].unique())
grouped['Loan1'] = ''
grouped['Loan2'] = ''
grouped['Loan3'] = ''
grouped['Loan4'] = ''
grouped['Loan5'] = ''
grouped.iloc[:,0] = unique
for i in grouped.index:
    idloan = loan.loc[loan['BwrPersonId'] == unique[i], 'LoanId']
    grouped.iloc[i,1:len(idloan)+1] = idloan
    print(i)

What should I do now? Is there another way to simplify the code? Thank you very much for your help.
1 Answer

一只甜甜圈
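Basically, you need to build a temporary array. Number each borrower, number each loan within its borrower, and then use those two numbers as the row and column at which each LoanId is placed.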
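import numpy as np
import pandas as pd
from collections import defaultdict
from itertools import count

# `loan` is the DataFrame from the question (columns BwrPersonId and LoanId).
# Number each borrower 0..n-1 in order of first appearance.
rows, borrowers = pd.factorize(loan['BwrPersonId'])
# Number each loan within its borrower: 0 for the first loan, 1 for the second, ...
counters = defaultdict(count)
cols = np.array([next(counters[r]) for r in rows])
n_rows, n_cols = len(borrowers), cols.max() + 1
# Object array (initialized to None) with one row per borrower and one column per loan slot.
temp = np.empty((n_rows, n_cols), dtype=object)
temp[rows, cols] = loan['LoanId'].to_numpy()
df1 = pd.DataFrame({'BwrPersonId': borrowers})
# Name the columns Loan1..LoanN from the actual maximum number of loans per borrower.
df2 = pd.DataFrame(temp, columns=[f'Loan{k + 1}' for k in range(n_cols)])
final = df1.join(df2)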
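
On the question of whether the code can be simplified: the sketch below is a pandas-only alternative, not the method used in the answer above. It numbers each borrower's loans with `groupby(...).cumcount()` and then pivots that number into columns. The sample data is copied from the table in the question; the names `loan_no` and `wide` are chosen here purely for illustration.

```python
import pandas as pd

# Sample data copied from the table in the question.
loan = pd.DataFrame({
    "BwrPersonId": [113225, 113225, 113225, 113253, 113269, 113285,
                    113289, 113297, 113299, 113319, 113418, 113418],
    "LoanId": [16330, 27073, 68842, 16341, 16348, 16354,
               26768, 16360, 16361, 16369, 16403, 26854],
})

wide = (
    # loan_no is the position of each loan within its borrower: 1, 2, 3, ...
    loan.assign(loan_no=loan.groupby("BwrPersonId").cumcount() + 1)
        # One row per borrower, one column per loan position.
        .pivot(index="BwrPersonId", columns="loan_no", values="LoanId")
        # Turn the integer column labels 1, 2, 3 into Loan1, Loan2, Loan3.
        .add_prefix("Loan")
        .rename_axis(columns=None)
        .reset_index()
)

print(wide)
```

Borrowers with fewer loans than the maximum get NaN in the unused columns, so the loan columns come out as floats. If integer display matters, converting those columns with `astype("Int64")` should keep the gaps as missing values.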