2 Answers

Contributor with 1821 experience points · over 5 upvotes
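Generally you don't need to do this by hand, because pandas already does it for you.
In this case what you want is the unique method. You can call it directly on a Series (pd.Series is, among other things, the abstraction that represents a column), and it returns a numpy array containing the unique values in that Series.
If you want the unique values of several columns, you can do the following: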
which_columns = ... # specify the columns whose unique values you want here
uniques = {col: df[col].unique() for col in which_columns}
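To make the snippet above concrete, here is a minimal runnable sketch. The DataFrame contents and the column names `pets` and `location` are made up for illustration; substitute your own data.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "pets": ["cat", "dog", "cat", "monkey"],
    "location": ["San_Diego", "New_York", "New_York", "San_Diego"],
})

# unique() on a single column returns the distinct values
# in order of first appearance (it does not sort them).
pet_values = df["pets"].unique()

# The same call applied to several columns at once.
which_columns = ["pets", "location"]
uniques = {col: df[col].unique() for col in which_columns}

print(pet_values)
print(uniques["location"])
```

Each value in `uniques` is an array, so wrap it in `list(...)` or `sorted(...)` if you need a plain or ordered Python list.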

Contributor with 1836 experience points · over 5 upvotes
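If you are working with categorical columns, the following code is very useful.
It prints not only the unique values but also the count of each unique value.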
categorical_columns = ['col1', 'col2', 'col3']  # ... and so on, up to 'coln'
# Print the frequency of each category (df is your DataFrame)
for col in categorical_columns:
    print('\nFrequency of Categories for variable %s' % col)
    print(df[col].value_counts())
Example:
df
     pets   location     owner
0     cat  San_Diego     Champ
1     dog   New_York       Ron
2     cat   New_York     Brick
3  monkey  San_Diego     Champ
4     dog  San_Diego  Veronica
5     dog   New_York       Ron
categorical_columns = ['pets', 'owner', 'location']
# Print the frequency of each category
for col in categorical_columns:
    print('\nFrequency of Categories for variable %s' % col)
    print(df[col].value_counts())
Output:
#
# Frequency of Categories for variable pets
# dog       3
# cat       2
# monkey    1
# Name: pets, dtype: int64
#
# Frequency of Categories for variable owner
# Champ       2
# Ron         2
# Brick       1
# Veronica    1
# Name: owner, dtype: int64
#
# Frequency of Categories for variable location
# New_York     3
# San_Diego    3
# Name: location, dtype: int64
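If you would rather keep the counts than print them, one possible variation is sketched below. It rebuilds the example DataFrame from the table above; the names `counts` and `n_distinct` are my own and not part of the original answer.

```python
import pandas as pd

# The example DataFrame from this answer, rebuilt so the sketch runs on its own.
df = pd.DataFrame({
    "pets": ["cat", "dog", "cat", "monkey", "dog", "dog"],
    "location": ["San_Diego", "New_York", "New_York",
                 "San_Diego", "San_Diego", "New_York"],
    "owner": ["Champ", "Ron", "Brick", "Champ", "Veronica", "Ron"],
})
categorical_columns = ["pets", "owner", "location"]

# Store one value_counts() Series per column instead of printing it.
counts = {col: df[col].value_counts() for col in categorical_columns}

# nunique() gives just the number of distinct values in each column.
n_distinct = df[categorical_columns].nunique()

print(counts["pets"]["dog"])
print(n_distinct)
```

Each entry in `counts` is a Series indexed by category, so an individual count can be looked up by label, as in `counts["pets"]["dog"]`.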