Donald Lee, Applied Data Science Seminar, 17 Hillhouse Avenue, 3rd Floor

Monday, October 9, 2017 - 4:15pm to 5:30pm
Associate Professor of Operations, Yale University School of Management
Boosting hazard regression with time-varying covariates
Consider a left-truncated, right-censored survival process whose evolution depends on time-varying covariates. Given functional data samples from the process, we propose a practical boosting procedure for estimating its log-intensity function. Our method does not require separability assumptions such as Cox proportional hazards or Aalen additive hazards, so it can flexibly capture time-covariate interactions. The estimator is consistent if the model is correctly specified; alternatively, an oracle inequality can be demonstrated for tree-based models. We use the procedure to shed new light on a question from the operations literature concerning the effect of workload on service rates in an emergency department.

To avoid overfitting, boosting employs several regularization devices. One of them is step-size restriction, but its rationale is somewhat mysterious from the viewpoint of consistency: in theoretical treatments of classification and regression problems, unrestricted greedy step-sizes appear to suffice. Given that the partial log-likelihood functional for hazard regression has unbounded curvature, our work suggests that step-size restriction may be a mechanism for preventing the curvature of the risk from derailing convergence.

Joint work with Ningyuan Chen (HKUST).
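The step-size restriction ("shrinkage") discussed in the abstract can be illustrated with a minimal sketch of generic gradient boosting. This is not the speaker's hazard-regression procedure; it is ordinary squared-error boosting with stump base learners, where the hypothetical parameter `nu` plays the role of the restricted step-size. All function names here are illustrative assumptions.

```python
# Illustrative sketch only (not the speaker's method): step-size restriction
# in gradient boosting for squared-error regression with stump base learners.
import numpy as np

def fit_stump(x, r):
    """Fit a one-split regression stump to residuals r; return a predictor."""
    best = None
    for s in np.unique(x):
        left, right = r[x <= s], r[x > s]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, s, left.mean(), right.mean())
    _, s, lm, rm = best
    return lambda z: np.where(z <= s, lm, rm)

def boost(x, y, n_rounds=200, nu=0.1):
    """Gradient boosting; nu in (0, 1] is the restricted step-size."""
    pred = np.zeros_like(y, dtype=float)
    for _ in range(n_rounds):
        resid = y - pred          # negative gradient of the squared loss
        h = fit_stump(x, resid)   # greedy base-learner fit
        pred += nu * h(x)         # shrunken (restricted) step
    return pred

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=200)
mse = np.mean((boost(x, y, nu=0.1) - y) ** 2)
```

With `nu = 1` each round takes the full greedy step; shrinking `nu` slows the fit down, which is the regularization device whose consistency role the talk examines for hazard regression.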
Yale Institute for Network Science, 17 Hillhouse Avenue, 3rd Floor