Random Forest | Qi

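Random forest is an ensemble learning method that combines many decision trees to improve predictive accuracy and generalization. The example below implements a random forest classifier in Python with the scikit-learn library.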

First, make sure scikit-learn is installed. You can install it with the following command:

pip install scikit-learn

Then run the following example:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load the iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split the data into training and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a random forest classifier with 100 trees
model = RandomForestClassifier(n_estimators=100, random_state=42)

# Train the model
model.fit(X_train, y_train)

# Predict on the test set
y_pred = model.predict(X_test)

# Compute the accuracy and the classification report
accuracy = accuracy_score(y_test, y_pred)
classification_rep = classification_report(y_test, y_pred, target_names=iris.target_names)

print("Accuracy:", accuracy)
print("Classification Report:\n", classification_rep)

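The code above loads the iris dataset and uses train_test_split to hold out 20% of it as a test set. It then creates a random forest classifier, where n_estimators sets the number of decision trees (100 here) and random_state fixes the seed so that results are reproducible. After fit trains the model, predict is applied to the test set, and accuracy_score and classification_report summarize the overall accuracy and the per-class precision, recall, and F1 score.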
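To make the "ensemble of trees" idea concrete, the sketch below refits the same classifier and looks inside it. It relies on two documented scikit-learn behaviors: the fitted trees are exposed through the estimators_ attribute, and the forest's predict_proba is the mean of the per-tree class probabilities. The variable names (tree_probas, manual_proba, forest_proba) are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Same data split and model settings as the main example
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# The fitted forest keeps its individual decision trees in estimators_
n_trees = len(model.estimators_)
print("Number of trees:", n_trees)

# Average the class probabilities predicted by each tree by hand
tree_probas = np.stack(
    [tree.predict_proba(X_test) for tree in model.estimators_]
)
manual_proba = tree_probas.mean(axis=0)

# Compare with the probabilities reported by the forest itself
forest_proba = model.predict_proba(X_test)
print("Manual average matches forest:", np.allclose(manual_proba, forest_proba))
```

Averaging over many trees, each trained on a different bootstrap sample, is what smooths out the mistakes of any single tree.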

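One advantage of random forests is that they handle both classification and regression problems (scikit-learn provides RandomForestRegressor for the latter). They are also less prone to overfitting than a single decision tree, because averaging many trees trained on bootstrap samples reduces variance. They can still overfit noisy data, so in practice you tune hyperparameters such as the number of trees (n_estimators) and the maximum depth (max_depth) to suit the problem.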
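One possible way to tune these hyperparameters is a cross-validated grid search. The sketch below uses scikit-learn's GridSearchCV on the same iris split. The candidate values in param_grid and the choice of 5 folds are arbitrary assumptions for illustration, not recommended settings.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Illustrative grid only: 2 tree counts x 3 depth limits = 6 candidates
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 3, 5],
}

# 5-fold cross-validation on the training set, scored by accuracy
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)

# By default the best model is refit on the full training set,
# so it can be scored directly on the held-out test data
print("Test accuracy:", search.score(X_test, y_test))
```

Keeping the test set out of the search means the final test accuracy remains an honest estimate. On a dataset as small and clean as iris, the differences between candidates are likely to be minor.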