Model Slicing

Slice tree model

When booster is set to gbtree or dart, XGBoost builds a tree model, which is a list of trees and can be sliced into multiple sub-models.

import xgboost as xgb
from sklearn.datasets import make_classification
num_classes = 3
X, y = make_classification(n_samples=1000, n_informative=5,
                           n_classes=num_classes)
dtrain = xgb.DMatrix(data=X, label=y)
num_parallel_tree = 4
num_boost_round = 16
# total number of built trees is num_parallel_tree * num_classes * num_boost_round

# We build a boosted random forest for classification here.
booster = xgb.train(
    {'num_parallel_tree': num_parallel_tree, 'subsample': 0.5,
     'num_class': num_classes},
    num_boost_round=num_boost_round, dtrain=dtrain)

# This is the sliced model, containing the forests from rounds [3, 7).
# A step is also supported, with some limitations: for example, a negative step is invalid.
sliced: xgb.Booster = booster[3:7]

# Access individual tree layer
trees = [_ for _ in booster]
assert len(trees) == num_boost_round

library(xgboost)
data(agaricus.train, package = "xgboost")
dm <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)

model <- xgb.train(
  params = xgb.params(objective = "binary:logistic", max_depth = 4),
  data = dm,
  nrounds = 20
)
sliced <- model[seq(3, 7)]
print(sliced)
##### xgb.Booster
# of features: 126
# of rounds:  5
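
The tree-count comment in the Python example can be made concrete with a little index arithmetic. The sketch below uses only the standard library and assumes the trees are stored round by round, with num_classes * num_parallel_tree trees added per boosting round. The helper tree_index_range is hypothetical and not part of the XGBoost API.

```python
def tree_index_range(begin_round, end_round, num_classes, num_parallel_tree):
    """Return the half-open range of tree indices for rounds [begin_round, end_round).

    Hypothetical helper for illustration; assumes a round-by-round tree layout.
    """
    trees_per_round = num_classes * num_parallel_tree
    return range(begin_round * trees_per_round, end_round * trees_per_round)


num_classes, num_parallel_tree, num_boost_round = 3, 4, 16

# Total number of trees built over all boosting rounds.
total_trees = num_parallel_tree * num_classes * num_boost_round

# Trees that a slice over rounds [3, 7) would cover under the assumed layout.
sliced_trees = tree_index_range(3, 7, num_classes, num_parallel_tree)
```

Under these assumptions, a slice over four rounds covers four forests per class, not four individual trees.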

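The sliced model is a copy of the selected trees, which means the original model is not modified by slicing. This feature is the basis of the save_best option in the early stopping callback. See "Demo for prediction using individual trees and model slices" for a worked example of how to combine prediction with sliced trees.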

Note

The returned model slice does not contain attributes such as best_iteration and best_score.