Main Components
Boosting
void GBDT::Init(const Config* gbdt_config, const Dataset* train_data, const ObjectiveFunction* objective_function, const std::vector<const Metric*>& training_metrics) override
Initialization: creates the sample sampling strategy data_sample_strategy_, sets the objective function objective_function_, creates tree_learner_, creates train_score_updater_, and configures training_metrics_.
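The members this sets up can be pictured as a small struct. A minimal sketch, assuming simplified types (raw pointers to forward-declared classes stand in for LightGBM's actual member types):

#include <vector>

class SampleStrategy;     // row sampling: bagging / GOSS
class ObjectiveFunction;  // supplies per-sample gradients and Hessians
class TreeLearner;        // fits one tree on the current gradients
class ScoreUpdater;       // maintains the running scores f(x) on a dataset
class Metric;

struct GbdtState {
  SampleStrategy* data_sample_strategy_;
  const ObjectiveFunction* objective_function_;
  TreeLearner* tree_learner_;
  ScoreUpdater* train_score_updater_;
  std::vector<const Metric*> training_metrics_;
};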
void GBDT::Train(int snapshot_freq, const std::string& model_output_path) override
The top-level training loop: runs TrainOneIter repeatedly and periodically writes model snapshots (controlled by snapshot_freq and model_output_path); see the consolidated sketch after this method list.
bool GBDT::TrainOneIter(const score_t* gradients, const score_t* hessians) override
Performs a single boosting iteration; when gradients/hessians are passed as nullptr, they are computed internally via Boosting().
void GBDT::Boosting()
Computes the gradients and Hessians from the objective function.
void UpdateScore(const Tree* tree, const int cur_tree_id)
Updates the scores after a tree has been trained.
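How these methods fit together is easiest to see in a self-contained toy version of the loop. This is a sketch under heavy simplification, not LightGBM's implementation: a "tree" is collapsed into a single constant leaf fitted by the Newton step -sum(g)/sum(h), there is one tree per iteration, and snapshotting, sampling, and shrinkage are all omitted.

#include <cmath>
#include <cstdio>
#include <vector>

// Toy model mirroring the call structure Train -> TrainOneIter -> Boosting ->
// UpdateScore described above. A "tree" here is a single constant leaf value.
struct ToyGBDT {
  std::vector<double> labels_;   // y in {-1, +1}
  std::vector<double> scores_;   // running f(x) per sample (train_score_updater_)
  std::vector<double> gradients_;
  std::vector<double> hessians_;

  explicit ToyGBDT(std::vector<double> labels)
      : labels_(std::move(labels)),
        scores_(labels_.size(), 0.0),
        gradients_(labels_.size()),
        hessians_(labels_.size()) {}

  // Boosting(): gradients/Hessians of binary log loss at the current scores.
  void Boosting() {
    for (size_t i = 0; i < labels_.size(); ++i) {
      const double p = 1.0 / (1.0 + std::exp(labels_[i] * scores_[i]));
      gradients_[i] = -labels_[i] * p;  // dL/df
      hessians_[i] = p * (1.0 - p);     // d^2L/df^2
    }
  }

  // TrainOneIter(): fit one "tree" (a constant Newton step), then update scores.
  void TrainOneIter() {
    Boosting();
    double g = 0.0, h = 0.0;
    for (size_t i = 0; i < labels_.size(); ++i) {
      g += gradients_[i];
      h += hessians_[i];
    }
    UpdateScore(-g / (h + 1e-12));
  }

  // UpdateScore(): add the new tree's output to every sample's running score.
  void UpdateScore(double leaf) {
    for (double& s : scores_) s += leaf;
  }

  // Train(): the outer iteration loop.
  void Train(int num_iterations) {
    for (int iter = 0; iter < num_iterations; ++iter) TrainOneIter();
  }
};

int main() {
  ToyGBDT model({+1.0, +1.0, -1.0, +1.0});
  model.Train(10);
  std::printf("score of sample 0 after 10 iterations: %f\n", model.scores_[0]);
  return 0;
}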
TreeLearner
Objective Function: ObjectiveFunction
Binary Log Loss
It is commonly defined as
$L(y, f(x)) = \log(1 + \exp(-y \cdot f(x)))$
where $y$ is the label, taking values in $\{-1, 1\}$, and $f(x)$ is the score output by the model. Letting $z = y \cdot f(x)$, the loss becomes $L = \log(1 + \exp(-z))$.
Differentiating with respect to $z$ gives $\frac{\partial L}{\partial z} = \frac{-\exp(-z)}{1 + \exp(-z)} = -\frac{1}{1 + \exp(z)}$, so by the chain rule the derivative with respect to $f(x)$ is
$\frac{\partial L}{\partial f(x)} = \frac{\partial L}{\partial z} \cdot \frac{\partial z}{\partial f(x)} = -\frac{y}{1 + \exp(y \cdot f(x))}$
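This closed form is easy to sanity-check numerically. A short self-contained check, comparing it against a central finite difference (the grid of test points and the 1e-6 tolerance are arbitrary choices for illustration):

#include <cassert>
#include <cmath>
#include <cstdio>

// Binary log loss L(y, f) = log(1 + exp(-y * f)), with y in {-1, +1}.
static double Loss(double y, double f) { return std::log(1.0 + std::exp(-y * f)); }

// Closed-form derivative derived above: dL/df = -y / (1 + exp(y * f)).
static double Grad(double y, double f) { return -y / (1.0 + std::exp(y * f)); }

int main() {
  const double eps = 1e-6;
  for (double y : {-1.0, 1.0}) {
    for (double f : {-2.0, -0.5, 0.0, 1.3}) {
      const double numeric = (Loss(y, f + eps) - Loss(y, f - eps)) / (2.0 * eps);
      std::printf("y=%+.0f f=%+.1f closed=%.8f numeric=%.8f\n", y, f, Grad(y, f), numeric);
      assert(std::fabs(Grad(y, f) - numeric) < 1e-6);  // closed form matches
    }
  }
  return 0;
}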
In BinaryLogloss, a scaling factor sigmoid_ (denoted $\sigma$) is applied inside the loss, i.e.
$L(y, f(x)) = \log(1 + \exp(-y \cdot \sigma \cdot f(x)))$
Differentiating with respect to $f(x)$:
$\frac{\partial L}{\partial f(x)} = -\frac{y \cdot \sigma}{1 + \exp(y \cdot \sigma \cdot f(x))}$
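GBDT also needs second derivatives (Boosting() above computes both). Differentiating once more, a step the original derivation leaves out, and using $y^2 = 1$:
$\frac{\partial^2 L}{\partial f(x)^2} = \frac{\sigma^2 \cdot \exp(y \cdot \sigma \cdot f(x))}{\left(1 + \exp(y \cdot \sigma \cdot f(x))\right)^2} = \sigma^2 \cdot p \cdot (1 - p), \quad \text{where } p = \frac{1}{1 + \exp(y \cdot \sigma \cdot f(x))}$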
When computing the gradient, BinaryLogloss additionally applies the per-sample weight weights_[i] and the label weight label_weight:
const double response = -label * sigmoid_ / (1.0f + std::exp(label * sigmoid_ * score[i]));  // dL/df(x) for sample i
gradients[i] = static_cast<score_t>(response * label_weight * weights_[i]);  // scale by label and sample weights
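The same loop also fills in the Hessian. The lines below are an assumption about that step (consistent with the second derivative above, quoted from memory of binary_objective.hpp rather than copied verbatim): since $|\text{response}| = \sigma \cdot p$, the product $|\text{response}| \cdot (\sigma - |\text{response}|)$ equals $\sigma^2 \cdot p \cdot (1 - p)$, exactly the second derivative derived earlier.

// Assumed companion Hessian computation (not verbatim source):
const double abs_response = std::fabs(response);  // equals sigma * p
hessians[i] = static_cast<score_t>(abs_response * (sigmoid_ - abs_response) * label_weight * weights_[i]);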