I have a df column like this:

col1
[[0.73, 0.43, 0.5, 0.0], [0.39, 0.5], [0.37], [0.38, 0.51, 0.0, 0.2]]
[[0.53, 0.33, 0.2, 0.0], [0.79, 0.5], [0.96], [0.88, 0.21, 0.0, 0.0]]

The sublists can be any size. I'm trying to convert the numbers in the sublists to floats (they are strings), then create a column that sums each sublist and divides by the number of items in that sublist.

So for row 1:
(.73 + .43 + .5 + 0) / 4 = .415
(.39 + .5) / 2 = .445
(.37) / 1 = .37
(.38 + .51 + 0.0 + .2) / 4 = .2725

For row 2:
(.53 + .33 + .2 + 0) / 4 = .265
(.79 + .5) / 2 = .645
(.96) / 1 = .96
(.88 + .21 + 0.0 + 0.0) / 4 = .2725

Result:

new_col
[[.415], [.445], [.37], [.2725]]
[[.265], [.645], [.96], [.2725]]

I've tried a lot of things:

    # something like this, where it creates a column of the number of elements
    # in each sublist and then uses that to divide the sum of each sublist
    # this didn't work - it just grabbed the first list's size
    df1['words_in_company_name'] = df1['children_org_name_sublists'].str.len()

    # this doesn't really work - it shows the numbers per list,
    # I'm just not sure where to go from here
    for i in df1.func_scores:
        length = []
        for j in i:
            print(j)
1 Answer

幕布斯6054654
Just use apply with np.mean on each sublist:
import numpy as np
df['new_col'] = df.col.apply(lambda x: [[np.mean(y)] for y in x])
df
Out[17]:
                                                 col                               new_col
0  [[0.73, 0.43, 0.5, 0.0], [0.39, 0.5], [0.37], ...  [[0.415], [0.445], [0.37], [0.2725]]
1  [[0.53, 0.33, 0.2, 0.0], [0.79, 0.5], [0.96], ...  [[0.265], [0.645], [0.96], [0.2725]]
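
The question says the numbers are stored as strings, and np.mean will not average a list of str values, so a conversion step is probably needed first. Below is a minimal sketch, assuming each cell is a Python list of lists of numeric strings and the column is named col1 as in the question. The helper name sublist_means is hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question, with the numbers stored as strings
# (assumption: each cell is a list of lists of numeric strings).
df = pd.DataFrame({
    "col1": [
        [["0.73", "0.43", "0.5", "0.0"], ["0.39", "0.5"], ["0.37"],
         ["0.38", "0.51", "0.0", "0.2"]],
        [["0.53", "0.33", "0.2", "0.0"], ["0.79", "0.5"], ["0.96"],
         ["0.88", "0.21", "0.0", "0.0"]],
    ]
})


def sublist_means(cell):
    """Hypothetical helper: cast each sublist to floats, return [[mean], ...]."""
    return [[float(np.mean([float(v) for v in sub]))] for sub in cell]


# One list of single-element lists per row, matching the desired new_col shape.
df["new_col"] = df["col1"].apply(sublist_means)
print(df["new_col"].tolist())
```

If a sublist can be empty, np.mean returns nan and emits a warning, so such rows may need separate handling.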