【火炉炼AI】Machine Learning 017: Using GridSearch to Find the Best Parameter Combination

(Python libraries and versions used in this article: Python 3.5, Numpy 1.14, scikit-learn 0.19, matplotlib 2.2)

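In an earlier article we used validation curves to tune a model's hyperparameters. Validation curves, however, make it hard to tune several parameters at once: each parameter has to be optimized on its own to find its best value. In a very good model, though, parameter B is not necessarily at its best value when parameter A is, because the parameters interact. This is the inherent weakness of the validation-curve approach.

The method introduced here, using GridSearch to search for the best parameter combination, avoids that weakness: GridSearch optimizes the values of several different parameters at the same time.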
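To make the idea concrete, here is a minimal sketch of what a grid search does under the hood: every combination of candidate values is scored with cross-validation, and the best-scoring combination wins. The synthetic dataset from `make_classification` and the small grid of `C` and `gamma` values are assumptions chosen for illustration; they are not the data or grid used in this article.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import ParameterGrid, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption, not the article's dataset).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# ParameterGrid expands the dict into every (C, gamma) combination,
# so the two parameters are tuned jointly rather than one at a time.
grid = ParameterGrid({'C': [1, 10], 'gamma': [0.01, 0.1]})

results = []
for params in grid:
    # Mean 5-fold cross-validation score for this combination.
    scores = cross_val_score(SVC(kernel='rbf', **params), X, y, cv=5)
    results.append((scores.mean(), params))

# Pick the combination with the highest mean score.
best_score, best_params = max(results, key=lambda item: item[0])
print(len(results), best_params, round(best_score, 3))
```

GridSearchCV wraps this loop and additionally refits the winning model on the full training set.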

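1. Preparing the dataset

The dataset is prepared exactly as in the earlier article, so the steps are not repeated here.

2. Using GridSearch to find the optimal parameters

To find the optimal parameters with GridSearch, first define the candidate values of the parameters to search, then define the evaluation metric used to judge how good a model is. GridSearch automatically evaluates every combination of the candidate values and returns the combination that maximizes the evaluation metric.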
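Before the search code, it may help to see what the two scoring names used in this article, 'precision' and 'recall_weighted', actually compute. The sketch below uses a tiny hand-made label vector (an assumption for illustration only) and the corresponding functions from sklearn.metrics.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hand-made labels, chosen only to keep the arithmetic easy to follow.
y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]

# 'precision': of the samples predicted as class 1, the fraction that are truly 1.
# Three samples are predicted 1, two of them correctly, so this is 2/3.
precision = precision_score(y_true, y_pred)

# 'recall_weighted': recall is computed per class (1/2 for class 0, 2/3 for class 1)
# and averaged with weights equal to each class's support (2 and 3 samples).
recall_weighted = recall_score(y_true, y_pred, average='weighted')

# Support-weighted recall always coincides with plain accuracy.
accuracy = accuracy_score(y_true, y_pred)

print(round(precision, 3), round(recall_weighted, 3), round(accuracy, 3))
```

Because 'precision' only looks at the positive class while 'recall_weighted' covers both classes, the two metrics can rank the same parameter combinations differently.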

# sklearn.grid_search and sklearn.cross_validation exist in scikit-learn 0.19 (deprecated since 0.18);
# they were removed in 0.20 in favour of sklearn.model_selection.
from sklearn import svm, grid_search, cross_validation
from sklearn.metrics import classification_report

# train_X, train_y, test_X, test_y come from the dataset preparation step above.

parameter_grid = [  {'kernel': ['linear'], 'C': [1, 10, 50, 600]}, # parameters to tune and their candidate values
                    {'kernel': ['poly'], 'degree': [2, 3]},
                    {'kernel': ['rbf'], 'gamma': [0.01, 0.001], 'C': [1, 10, 50, 600]},
                 ]

metrics = ['precision', 'recall_weighted'] # evaluation metrics used to rank the candidates
for metric in metrics:
    print("Searching optimal hyperparameters for: {}".format(metric))

    classifier = grid_search.GridSearchCV(svm.SVC(C=1), 
            parameter_grid, cv=5, scoring=metric)
    classifier.fit(train_X, train_y)

    print("\nScores across the parameter grid:") 
    for params, avg_score, _ in classifier.grid_scores_:  # print the mean score for each parameter combination
        print('{}: avg_scores: {}'.format(params,round(avg_score,3)))

    print("\nHighest scoring parameter set: {}".format(classifier.best_params_))

    y_pred = classifier.predict(test_X) # predict() uses the model refit with the best parameters (refit=True by default)
    print("\nFull performance report:\n {}".format(classification_report(test_y,y_pred)))

-------------------------------------Output--------------------------------

Searching optimal hyperparameters for: precision
Scores across the parameter grid:
{'C': 1, 'kernel': 'linear'}: avg_scores: 0.809
{'C': 10, 'kernel': 'linear'}: avg_scores: 0.809
{'C': 50, 'kernel': 'linear'}: avg_scores: 0.809
{'C': 600, 'kernel': 'linear'}: avg_scores: 0.809
{'degree': 2, 'kernel': 'poly'}: avg_scores: 0.859
{'degree': 3, 'kernel': 'poly'}: avg_scores: 0.852
{'C': 1, 'gamma': 0.01, 'kernel': 'rbf'}: avg_scores: 1.0
{'C': 1, 'gamma': 0.001, 'kernel': 'rbf'}: avg_scores: 0.0
{'C': 10, 'gamma': 0.01, 'kernel': 'rbf'}: avg_scores: 0.968
{'C': 10, 'gamma': 0.001, 'kernel': 'rbf'}: avg_scores: 0.855
{'C': 50, 'gamma': 0.01, 'kernel': 'rbf'}: avg_scores: 0.946
{'C': 50, 'gamma': 0.001, 'kernel': 'rbf'}: avg_scores: 0.975
{'C': 600, 'gamma': 0.01, 'kernel': 'rbf'}: avg_scores: 0.948
{'C': 600, 'gamma': 0.001, 'kernel': 'rbf'}: avg_scores: 0.968

Highest scoring parameter set: {'C': 1, 'gamma': 0.01, 'kernel': 'rbf'}

Full performance report:
             precision    recall  f1-score   support

          0       0.75      1.00      0.86        36
          1       1.00      0.69      0.82        39

avg / total       0.88      0.84      0.84        75

Searching optimal hyperparameters for: recall_weighted

Scores across the parameter grid:
{'C': 1, 'kernel': 'linear'}: avg_scores: 0.653
{'C': 10, 'kernel': 'linear'}: avg_scores: 0.653
{'C': 50, 'kernel': 'linear'}: avg_scores: 0.653
{'C': 600, 'kernel': 'linear'}: avg_scores: 0.653
{'degree': 2, 'kernel': 'poly'}: avg_scores: 0.889
{'degree': 3, 'kernel': 'poly'}: avg_scores: 0.884
{'C': 1, 'gamma': 0.01, 'kernel': 'rbf'}: avg_scores: 0.76
{'C': 1, 'gamma': 0.001, 'kernel': 'rbf'}: avg_scores: 0.507
{'C': 10, 'gamma': 0.01, 'kernel': 'rbf'}: avg_scores: 0.907
{'C': 10, 'gamma': 0.001, 'kernel': 'rbf'}: avg_scores: 0.658
{'C': 50, 'gamma': 0.01, 'kernel': 'rbf'}: avg_scores: 0.92
{'C': 50, 'gamma': 0.001, 'kernel': 'rbf'}: avg_scores: 0.72
{'C': 600, 'gamma': 0.01, 'kernel': 'rbf'}: avg_scores: 0.933
{'C': 600, 'gamma': 0.001, 'kernel': 'rbf'}: avg_scores: 0.902

Highest scoring parameter set: {'C': 600, 'gamma': 0.01, 'kernel': 'rbf'}

Full performance report:
             precision    recall  f1-score   support

          0       1.00      0.92      0.96        36
          1       0.93      1.00      0.96        39

avg / total       0.96      0.96      0.96        75

--------------------------------------------End-------------------------------------
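For readers on scikit-learn 0.20 or later, where sklearn.grid_search and the grid_scores_ attribute no longer exist, the same search can be sketched with sklearn.model_selection.GridSearchCV and its cv_results_ dictionary. The synthetic dataset, the split settings, and the reduced grid below are assumptions made to keep the example self-contained and fast; the scores will not match the output above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data and split (assumptions, not the article's dataset).
X, y = make_classification(n_samples=300, n_features=4, random_state=0)
train_X, test_X, train_y, test_y = train_test_split(
    X, y, test_size=0.25, random_state=5)

# A reduced version of the article's grid, to keep the run short.
parameter_grid = [
    {'kernel': ['linear'], 'C': [1, 10]},
    {'kernel': ['rbf'], 'gamma': [0.01, 0.001], 'C': [1, 10]},
]

search = GridSearchCV(SVC(), parameter_grid, cv=5, scoring='recall_weighted')
search.fit(train_X, train_y)

# cv_results_ is a dict of parallel arrays and replaces grid_scores_.
for params, mean_score in zip(search.cv_results_['params'],
                              search.cv_results_['mean_test_score']):
    print('{}: avg_scores: {}'.format(params, round(mean_score, 3)))

print("Highest scoring parameter set: {}".format(search.best_params_))
```

The loop prints one line per combination in the same format as the output above, so the two versions can be compared side by side.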

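########################Summary###############################

1. GridSearchCV from the grid search module can search for the best parameter combination, but the candidate parameter values and the evaluation metric must be specified.

2. The classifier.best_params_ attribute holds the best parameter combination and can be printed directly, which makes it easy to reuse those parameters later.

3. classifier.predict automatically uses the model refit with the best parameter combination, so it returns that model's predictions on the test set or the training set.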

#################################################################
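The points in the summary can be checked with a short sketch. It assumes a synthetic dataset and a small grid (both illustrative, not from the article) and uses the sklearn.model_selection version of GridSearchCV, where best_params_ is a plain attribute and predict delegates to best_estimator_, the model refit on the whole training set when refit=True (the default).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption, not the article's dataset).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

search = GridSearchCV(SVC(kernel='rbf'),
                      {'C': [1, 10], 'gamma': [0.01, 0.1]}, cv=5)
search.fit(X, y)

# best_params_ is an attribute (a dict), so it is read without parentheses.
print(search.best_params_)

# predict() on the search object forwards to the refit best estimator.
same = np.array_equal(search.predict(X), search.best_estimator_.predict(X))
print(same)
```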

To build an SVM model with the best parameters directly, the following code can be used:

# the full performance report above was indeed produced with this best parameter combination
best_classifier = svm.SVC(C=600, gamma=0.01, kernel='rbf')
best_classifier.fit(train_X, train_y)
y_pred = best_classifier.predict(test_X)
print("\nFull performance report:\n {}".format(classification_report(test_y, y_pred)))

The result is identical to the full performance report for recall_weighted above.
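Instead of copying the best values by hand, the best_params_ dict can be unpacked straight into the estimator. The sketch below assumes a synthetic dataset and a reduced grid (illustrative only) and checks that a fresh SVC built this way predicts the same labels as the search object.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data and split (assumptions, not the article's dataset).
X, y = make_classification(n_samples=300, n_features=4, random_state=0)
train_X, test_X, train_y, test_y = train_test_split(
    X, y, test_size=0.25, random_state=5)

grid = {'kernel': ['rbf'], 'C': [1, 10], 'gamma': [0.01, 0.001]}
search = GridSearchCV(SVC(), grid, cv=5, scoring='recall_weighted')
search.fit(train_X, train_y)

# Unpack the winning combination into a new SVC and fit it on the same data.
best_classifier = SVC(**search.best_params_)
best_classifier.fit(train_X, train_y)

# Both models were fit on train_X with identical parameters.
identical = np.array_equal(best_classifier.predict(test_X),
                           search.predict(test_X))
print(search.best_params_, identical)
```

This avoids transcription mistakes when the grid or the data changes and the best combination moves.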

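Note: all the code for this part has been uploaded to (my GitHub); you are welcome to download it.

References:

1. Python机器学习经典实例 (Python Machine Learning Cookbook), Prateek Joshi; translated by 陶俊杰 and 陈小莉

Author: 炼丹老顽童
Link: https://www.jianshu.com/p/15123a665c0c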
