
Cluster Analysis with Iris Dataset

Data Science Day 19:

In Supervised Learning, we specify the possible categorical values and train models for pattern recognition. However, *what if we don't have existing labeled data to learn from?*

[Image: Radfotosonn / Pixabay]

Unsupervised Learning is the case where we model the data in order to discover the way it clusters, based on certain attributes.

Cluster Analysis is one of these unsupervised techniques: rather than learning by example, it learns by observation.

There are three general types of clustering methods: Partitioning, Hierarchical, and Density-based clustering.

1. Partitioning: n objects are grouped into k ≤ n disjoint clusters.
   Partitioning methods are based on a distance measure; they apply iterative relocation until some distance-based error metric is minimized.

2. Hierarchical: clusters are either combined (agglomerative) or split (divisive) in a stepwise fashion, based on some measure (distance, density, or continuity).

Agglomerative clustering starts with each point in its own cluster and combines them step by step; divisive clustering starts with all the data in one cluster and divides it up.

3. Density-based: clusters are formed as dense regions of points separated by sparser regions; density itself serves as the measure of cluster "goodness".
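To make the "iterative relocation" in the partitioning description concrete, here is a from-scratch NumPy sketch of K-Means. This is a toy illustration only, not scikit-learn's implementation; in particular, the farthest-first initialization below is a simplified stand-in for the k-means++ seeding that real libraries use:

```python
import numpy as np

def kmeans(points, k, n_iter=100):
    """Toy K-Means: alternate assignment and centroid relocation."""
    # Farthest-first initialization: start from the first point, then
    # repeatedly pick the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    for _ in range(1, k):
        d = np.linalg.norm(points[:, None] - np.asarray(centroids)[None],
                           axis=2).min(axis=1)
        centroids.append(points[int(d.argmax())])
    centroids = np.asarray(centroids, dtype=float)

    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        dists = np.linalg.norm(points[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        # Relocation step: each centroid moves to the mean of its points.
        new_centroids = np.array([points[labels == j].mean(axis=0)
                                  for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break  # the distance-based error can no longer decrease
        centroids = new_centroids
    return labels, centroids

# Two well-separated toy blobs should land in two different clusters.
pts = np.vstack([np.zeros((10, 2)), np.full((10, 2), 5.0)])
labels, centroids = kmeans(pts, k=2)
print(labels)
```

Each pass either reduces the total within-cluster squared distance or leaves it unchanged, which is why the loop terminates.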

Example with Iris Dataset

  1. Partitioning: K-Means with k = 3
```python
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.cluster import KMeans

# Iris dataset
iris = datasets.load_iris()
x = iris.data
y = iris.target

# Fit K-Means with 3 clusters; the fitted labels drive the plot colors
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(x)
labels = km.labels_

# Plotting
fig = plt.figure(1, figsize=(7, 7))
ax = fig.add_subplot(111, projection="3d")
ax.view_init(elev=48, azim=134)
ax.scatter(x[:, 3], x[:, 0], x[:, 2],
           c=labels.astype(float), edgecolor="k", s=50)
ax.set_xlabel("Petal width")
ax.set_ylabel("Sepal length")
ax.set_zlabel("Petal length")
plt.title("Iris Clustering K Means=3", fontsize=14)
plt.show()
```
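The run above fixes k = 3, matching the three iris species. When k is unknown, one common heuristic (not from the original notes) is the elbow method: plot `KMeans.inertia_`, the distance-based error being minimized, against k and look for where the curve flattens. A minimal sketch:

```python
from sklearn import datasets
from sklearn.cluster import KMeans

x = datasets.load_iris().data

# inertia_ is the sum of squared distances to the nearest centroid;
# it always shrinks as k grows, so look for the "elbow" where gains flatten.
inertias = {k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(x).inertia_
            for k in range(1, 7)}
for k, sse in inertias.items():
    print(k, round(sse, 1))
```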

2. **Hierarchical**


```python
# Hierarchical clustering
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn import datasets

x = datasets.load_iris().data

hier = linkage(x, "ward")  # Ward linkage: merge the pair that least increases variance
max_d = 7.08               # height at which to draw the cut line

plt.figure(figsize=(25, 10))
plt.title('Iris Hierarchical Clustering Dendrogram')
plt.xlabel('Species')
plt.ylabel('distance')
dendrogram(
    hier,
    truncate_mode='lastp',  # show only the last p merged clusters
    p=50,
    leaf_rotation=90.,
    leaf_font_size=8.,
)
plt.axhline(y=max_d, c='k')
plt.show()
```
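The horizontal line drawn at max_d = 7.08 is just a visual cut; scipy's `fcluster` turns the same cut into flat cluster labels. A small follow-up sketch, with the cut height taken from the code above:

```python
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn import datasets

x = datasets.load_iris().data
hier = linkage(x, "ward")

# Undo every merge above distance 7.08, leaving flat clusters
# (the same cut the dendrogram's horizontal line shows).
labels = fcluster(hier, t=7.08, criterion="distance")
print(sorted(set(labels.tolist())))
```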

3. Density-based: DBSCAN


```python
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA

x = datasets.load_iris().data

# Cluster with DBSCAN (default eps=0.5, min_samples=5)
dbscan = DBSCAN()
dbscan.fit(x)

# Project the 4-D features to 2-D with PCA for plotting
pca = PCA(n_components=2).fit(x)
pca_2d = pca.transform(x)

for i in range(pca_2d.shape[0]):
    if dbscan.labels_[i] == 0:
        c1 = plt.scatter(pca_2d[i, 0], pca_2d[i, 1], c='r', marker='+')
    elif dbscan.labels_[i] == 1:
        c2 = plt.scatter(pca_2d[i, 0], pca_2d[i, 1], c='g', marker='o')
    elif dbscan.labels_[i] == -1:  # -1 marks noise points
        c3 = plt.scatter(pca_2d[i, 0], pca_2d[i, 1], c='b', marker='*')

plt.legend([c1, c2, c3], ['Cluster 1', 'Cluster 2', 'Noise'])
plt.title('DBSCAN finds 2 clusters and Noise')
plt.show()
```
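Because the true iris species are known, we can sanity-check how well DBSCAN's unsupervised labels agree with them. A small sketch (not in the original post) using scikit-learn's adjusted Rand index, which is 1 for a perfect match and near 0 for random labels:

```python
from sklearn import datasets
from sklearn.cluster import DBSCAN
from sklearn.metrics import adjusted_rand_score

iris = datasets.load_iris()
x, y = iris.data, iris.target

# With the defaults (eps=0.5, min_samples=5), setosa separates cleanly,
# while versicolor and virginica partly end up merged or flagged as noise.
labels = DBSCAN().fit(x).labels_
ari = adjusted_rand_score(y, labels)
print(round(ari, 3))
```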

Thanks very much to Dr. Rumbaugh's clustering analysis notes!

Happy studying! 😊
