If you run
frequency, bins = np.histogram(latlong['Lat'], bins=20)
print(frequency)
print(bins)
you get
[ 1 7 12 18 301 35831 504342 22081 1256 580
63 12 8 1 2 0 0 0 0 1]
[40.07 40.1725 40.275 40.3775 40.48 40.5825 40.685 40.7875 40.89
40.9925 41.095 41.1975 41.3 41.4025 41.505 41.6075 41.71 41.8125
41.915 42.0175 42.12 ]
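As you can see, some bins lie far from the mean and hold very few counts: almost all rides fall between roughly 40.58 and 40.89, while the outer bins are nearly empty.
You can leave out those far-away bins by clipping the variable of interest between a chosen minimum and maximum and then plotting the histogram, as follows. Note that np.clip does not discard values: anything outside the limits is set to the nearest limit, so those rides are counted in the first and last bins.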
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
#Loading data
url = 'https://raw.githubusercontent.com/diggledoot/dataset/master/uber-raw-data-apr14.csv'
latlong = pd.read_csv(url)
#Plot
plt.figure(figsize=(8,6))
plt.title('Rides based on latitude')
plt.hist(np.clip(latlong['Lat'], 40.6, 40.9),bins=50,color='cyan')
plt.xlabel('Latitude')
plt.ylabel('Frequency')
plt.show()
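If you would rather drop the out-of-range rides than fold them into the edge bins, one option is the `range` argument, which both `np.histogram` and `plt.hist` accept. The sketch below compares the two approaches. It uses synthetic latitudes (an assumption, standing in for `latlong['Lat']`) so that it runs without downloading the CSV; the limits 40.6 and 40.9 are the ones used above.

```python
import numpy as np

# Synthetic latitudes standing in for latlong['Lat'] (hypothetical data,
# not the Uber file): a tight cluster plus two far-away outliers.
rng = np.random.default_rng(0)
lat = np.concatenate([rng.normal(40.74, 0.04, 10_000), [40.07, 42.12]])

lo, hi = 40.6, 40.9

# np.clip keeps every value; out-of-range values are moved onto the limits,
# so they end up in the first and last bins.
clipped_counts, _ = np.histogram(np.clip(lat, lo, hi), bins=50)

# range= ignores values outside [lo, hi] instead of moving them.
ranged_counts, edges = np.histogram(lat, bins=50, range=(lo, hi))

print(clipped_counts.sum(), ranged_counts.sum())
```

With the real data, the equivalent plotting call would be `plt.hist(latlong['Lat'], bins=50, range=(40.6, 40.9), color='cyan')`. Which behavior you want depends on whether the edge bins should show how many rides lie beyond the limits.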