How to Build a Random Forest Classification Model in Python

Time: 2024-11-15 12:38:00

This article walks through a simple random forest implementation in Python.

Tools/Materials

Python

sklearn (scikit-learn), together with pandas and matplotlib, which the steps also use

Method/Steps

1. Prepare the data as a pandas DataFrame.

    import pandas as pd

    df = pd.DataFrame({
        'name': ['Lily', 'Lucy', 'Jim', 'Tom', 'Anna', 'Jack', 'Sam'],
        'weight': [42, 38, 78, 67, 52, 80, 92],
        'height': [162, 158, 169, 170, 166, 175, 178],
        'is_fat': [0, 0, 1, 0, 1, 0, 1],
    })

2. Import the classifier from sklearn.

    from sklearn.ensemble import RandomForestClassifier

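3. Prepare the model inputs: the feature matrix X and the target y.

    X = df.loc[:, ['weight', 'height']]  # features
    y = df['is_fat']                     # class label (0 or 1)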

4. Create the model and train it.

    clf = RandomForestClassifier()  # default hyperparameters
    clf.fit(X, y)
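Because a random forest draws bootstrap samples and random feature subsets, two fits of the same data can differ. The sketch below shows one way to make the training step repeatable by fixing a seed. The values `random_state=0` and `n_estimators=100` are illustrative assumptions, not settings taken from the article.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Same toy data as in step 1 (the 'name' column is not needed for training)
df = pd.DataFrame({
    'weight': [42, 38, 78, 67, 52, 80, 92],
    'height': [162, 158, 169, 170, 166, 175, 178],
    'is_fat': [0, 0, 1, 0, 1, 0, 1],
})
X = df[['weight', 'height']]
y = df['is_fat']

# Assumed settings: a fixed seed and an explicit number of trees
clf_a = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
clf_b = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# With the same seed, both forests yield identical feature importances
same = bool((clf_a.feature_importances_ == clf_b.feature_importances_).all())
print(same)
```

Leaving `random_state` unset, as in the step above, is fine for a quick experiment, but the printed importances and predictions may then change from run to run.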

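5. Get the feature importances.

    X_importance = clf.feature_importances_
    print(X_importance)

In the author's run, the first variable, 'weight', has the highest importance. The exact values can differ from run to run, because the forest is built with random sampling.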
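The raw importance array carries no labels, so it helps to pair each value with its column name. The following is a minimal sketch under the assumption that a fixed `random_state=0` is acceptable; the Series name `importances` is introduced here for illustration only.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.DataFrame({
    'weight': [42, 38, 78, 67, 52, 80, 92],
    'height': [162, 158, 169, 170, 166, 175, 178],
    'is_fat': [0, 0, 1, 0, 1, 0, 1],
})
X = df[['weight', 'height']]
y = df['is_fat']

# random_state=0 is an assumption made here for repeatability
clf = RandomForestClassifier(random_state=0).fit(X, y)

# Label each importance with its feature name and sort, largest first
importances = pd.Series(clf.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(importances)
```

The importances are normalized, so they sum to 1 and can be read as relative shares rather than absolute effect sizes.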

6. Predict with the model. Here the prediction is made on the same data the model was trained on.

    y_pred = clf.predict(X)
    print(y_pred)
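Predictions on the training data tend to look better than the model really is. One possible check, sketched below, compares training accuracy with the out-of-bag (OOB) score, which evaluates each sample using only trees that did not see it during training. The use of `oob_score=True`, `n_estimators=100`, and `random_state=0` is an assumption added here; with only seven rows, either number is a rough indication at best.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

df = pd.DataFrame({
    'weight': [42, 38, 78, 67, 52, 80, 92],
    'height': [162, 158, 169, 170, 166, 175, 178],
    'is_fat': [0, 0, 1, 0, 1, 0, 1],
})
X = df[['weight', 'height']]
y = df['is_fat']

# oob_score=True asks the forest to keep an out-of-bag accuracy estimate
clf = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)
clf.fit(X, y)

# Accuracy on the rows the model was trained on (optimistic)
train_acc = accuracy_score(y, clf.predict(X))
# Accuracy estimated from trees that never saw each row
oob_acc = clf.oob_score_

print('training accuracy:', train_acc)
print('out-of-bag accuracy:', oob_acc)
```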

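7. Plot the predictions.

    import matplotlib.pyplot as plt

    plt.figure()
    df['is_fat_pred'] = y_pred
    df_0 = df[df['is_fat_pred'] == 0]  # rows predicted as class 0
    df_1 = df[df['is_fat_pred'] == 1]  # rows predicted as class 1
    plt.scatter(df_0['weight'], df_0['height'], c='y', s=50, label='normal')
    plt.scatter(df_1['weight'], df_1['height'], c='lightblue', s=100, label='fat')
    # Label each point with the person's name
    for k in range(len(X)):
        plt.text(X['weight'][k], X['height'][k], df['name'][k])
    plt.legend()
    plt.show()  # display the figure when running as a script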
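When the script runs without a display (for example on a server), the figure can be written to a file instead of shown. The sketch below is one way to do that. The Agg backend, the temporary output directory, the file name `rf_predictions.png`, and `random_state=0` are all assumptions made for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # non-interactive backend, assumed for headless runs
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.DataFrame({
    'name': ['Lily', 'Lucy', 'Jim', 'Tom', 'Anna', 'Jack', 'Sam'],
    'weight': [42, 38, 78, 67, 52, 80, 92],
    'height': [162, 158, 169, 170, 166, 175, 178],
    'is_fat': [0, 0, 1, 0, 1, 0, 1],
})
X = df[['weight', 'height']]
clf = RandomForestClassifier(random_state=0).fit(X, df['is_fat'])
df['is_fat_pred'] = clf.predict(X)

fig, ax = plt.subplots()
# One scatter call per predicted class, so each gets its own legend entry
for label, color, text in [(0, 'y', 'normal'), (1, 'lightblue', 'fat')]:
    part = df[df['is_fat_pred'] == label]
    ax.scatter(part['weight'], part['height'], c=color, s=50, label=text)
# Annotate every point with the person's name
for w, h, name in zip(df['weight'], df['height'], df['name']):
    ax.text(w, h, name)
ax.set_xlabel('weight')
ax.set_ylabel('height')
ax.legend()

# Hypothetical output location: a fresh temporary directory
out_path = os.path.join(tempfile.mkdtemp(), 'rf_predictions.png')
fig.savefig(out_path, dpi=150)
plt.close(fig)
print(out_path)
```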
