3 Answers

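Contributed 1803 experience points · received over 6 likes
I would aggregate your dataset down to one row per author using groupby and use that to draw the bars, then join it back to the original data to get the values for plotting the books, e.g.: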
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame([
['foo', 1950, 1990, 1980],
['foo', 1950, 1990, 1985],
['bar', 1930, 2000, 1970],
], columns=['author', 'born', 'died', 'published'])
This pulls in the packages and creates a dummy dataset. Next we reduce it to one row per author, taking when they were born and when they died:
# select several columns with a list; tuple-style ['born', 'died'] fails on pandas >= 1.0
agg = df.groupby('author')[['born', 'died']].agg('min').reset_index()
agg['auth_num'] = range(len(agg))
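The reset_index turns author back into a regular column, and we then create an arbitrary auth_num column. You could call sort_values before assigning it if you want to order the authors by something other than name (I'd suggest that alphabetical order is usually not the most useful).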
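As a minimal sketch of that sorting step, the snippet below orders the authors before numbering them. Sorting by the `died` column is purely an illustrative assumption; any column in `agg` would work the same way.

```python
import pandas as pd

df = pd.DataFrame([
    ['foo', 1950, 1990, 1980],
    ['foo', 1950, 1990, 1985],
    ['bar', 1930, 2000, 1970],
], columns=['author', 'born', 'died', 'published'])

# One row per author, as in the answer.
agg = df.groupby('author')[['born', 'died']].agg('min').reset_index()

# Order the authors by year of death (illustrative choice), then number them.
agg = agg.sort_values('died').reset_index(drop=True)
agg['auth_num'] = range(len(agg))
print(agg)
```

With this ordering, `foo` gets row 0 and `bar` gets row 1, the reverse of the alphabetical order that groupby produces by default.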
Next, we can join this back to the original dataset to get the author number for each book:
df2 = pd.merge(df, agg[['author', 'auth_num']], on='author')
Finally, plot it all:
plt.barh(agg.auth_num, agg.died - agg.born, left=agg.born, zorder=-1, alpha=0.5)
plt.yticks(agg.auth_num, agg.author)
plt.scatter(df2.published, df2.auth_num)
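This gives a chart with one horizontal bar per author spanning their lifetime, and each book overlaid as a point at its publication year.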

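Contributed 1875 experience points · received over 5 likes
(Next time, please include an example dataframe!)
I would use the handy numpy.unique function to perform the grouping.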
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
dataset = pd.DataFrame({'BORN': [1900, 1920, 1900],
'DIED': [1980, 1978, 1980],
'AUTHOR': ['foo', 'bar', 'foo'],
'YEAR (BOOK)': [1950, 1972, 1961]})
# --group by author
unique_authors, index, reverse_index = np.unique(dataset.AUTHOR.values, return_index=True, return_inverse=True)
authors_df = dataset.loc[index, ['AUTHOR', 'BORN', 'DIED']]
dataset['AUTHOR_IDX'] = reverse_index # remember the index
# dataframe columns to arrays.
begin = authors_df.BORN.values
end = authors_df.DIED.values
authors = authors_df.AUTHOR.values
# --Author data to a barh graph (sideways bar)
plt.barh(range(len(begin)), end-begin, left=begin, zorder=2, color='#007acc', alpha=0.8, linewidth=5)
# Sets the titles of the y-axis.
plt.yticks(range(len(begin)), authors)
# Sets start and end of the x-axis.
plt.xlim([1835, 2019])
# --Overlay book information
# dataframe columns to arrays
book = dataset['YEAR (BOOK)'].values
# Plots the books in a scatterplot. Changes marker color and shape.
plt.scatter(book, reverse_index, color='purple', s=30, marker='D', zorder=3)
# Shows the plt
plt.show()
Running this produces the author bars with the book markers overlaid.
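The grouping above depends on what numpy.unique returns when `return_index` and `return_inverse` are set. A small sketch on the same three author values, shown in isolation:

```python
import numpy as np

authors = np.array(['foo', 'bar', 'foo'])
unique_authors, index, reverse_index = np.unique(
    authors, return_index=True, return_inverse=True)

print(unique_authors)  # sorted unique author names
print(index)           # position of the first occurrence of each unique name
print(reverse_index)   # for each original row, its position in unique_authors

# Indexing the unique values with the inverse rebuilds the original column.
print(unique_authors[reverse_index])
```

Here `index` picks one representative row per author (used for BORN and DIED), while `reverse_index` doubles as the y position of each book in the scatter plot.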

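Contributed 1828 experience points · received over 4 likes
Sure, you have several options. You could create a separate array for each author's 1st, 2nd, and 3rd book. Or you could create a dictionary or a list of arrays to plot the books for each author.
I have put together an example below using dummy data.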
import matplotlib.pyplot as plt
import numpy as np
fig,axs = plt.subplots(1,1,figsize=(10,10))
# dummy birth and death years for 50 authors (replace with your dataframe columns)
begin = np.arange(1900,1950)
end = np.arange(1975,2025)
# create two random arrays for your book dates
book1 = np.array(np.random.randint(low=1950, high=1970, size=50))
book2 = np.array(np.random.randint(low=1950, high=1970, size=50))
# add some author names
author_names = [f'Author_{x+1}' for x in range(50)]
# Data to a barh graph (sideways bar)
axs.barh(range(len(begin)), end-begin, left=begin, zorder=2,
color='#007acc', alpha=0.8, linewidth=5)
# Plots the books in a scatterplot. Changes marker color and shape.
axs.scatter(book1, range(len(begin)), color='purple', s=30, marker='D', zorder=3, label='1st Book')
# second array of books
axs.scatter(book2, range(len(begin)), color='yellow', s=30, marker='D', zorder=3, label='2nd Book')
# or plot a custom array of books
# you could do this in a for loop for all authors
axs.scatter(x=[1980,2005], y=[10,45], color='red', s=50, marker='X', zorder=3, label='3rd Book')
# Sets the titles of the y-axis.
axs.set_yticks(range(len(begin)))
axs.set_yticklabels(author_names)
# Add legend
axs.legend()
# Sets start and end of the x-axis.
axs.set_xlim([1895, 2025])
axs.set_ylim([-1, 50])
plt.show()
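The dictionary option mentioned at the start of this answer could look roughly like the sketch below. The `lifespans` and `books` dictionaries and their values are hypothetical, made up only to show authors with different numbers of books.

```python
import matplotlib.pyplot as plt

# Hypothetical data: (born, died) per author and a variable-length book list.
lifespans = {'Author_1': (1900, 1975), 'Author_2': (1910, 1990)}
books = {'Author_1': [1950, 1961], 'Author_2': [1972]}

fig, ax = plt.subplots()
names = list(lifespans)
xs, ys = [], []
for row, name in enumerate(names):
    born, died = lifespans[name]
    # One horizontal bar per author, spanning their lifetime.
    ax.barh(row, died - born, left=born, color='#007acc', alpha=0.8, zorder=2)
    # Collect one (year, row) point for every book by this author.
    for year in books.get(name, []):
        xs.append(year)
        ys.append(row)

# Draw all books in a single scatter call on top of the bars.
ax.scatter(xs, ys, color='purple', s=30, marker='D', zorder=3)
ax.set_yticks(range(len(names)))
ax.set_yticklabels(names)
```

Call `plt.show()` or `fig.savefig(...)` afterwards to see the result. Collecting the points first keeps the scatter to one call regardless of how many books each author has.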