Citation: ZHANG Xin, ZHAO Junlong. Influential point diagnosis for high-dimensional linear models[J]. Journal of Beijing Normal University (Natural Science), 2023, 59(2): 313-318. DOI: 10.12202/j.0476-0301.2022308

Influential point diagnosis for high-dimensional linear models

Abstract: This paper reviews advances in influential point diagnosis from two perspectives: single influential points and multiple influential points. It highlights several recently developed methods for detecting influential points in high-dimensional data. These methods apply when the number of predictors far exceeds the sample size, and they can be regarded as generalizations of the classical Cook's distance to high-dimensional data. Cook's distance quantifies the influence of an individual observation on the least squares coefficient estimates, whereas the new methods capture its influence on the marginal correlations, which in turn has important consequences for variable selection and other downstream analyses. Numerical simulations confirm the feasibility and effectiveness of the new methods.
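The contrast drawn in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `cooks_distance`, `marginal_correlations` and `loo_marginal_influence` are hypothetical, and the leave-one-out score used here (the average squared change in the marginal correlations when one observation is deleted) is an assumed, simplified stand-in for the kind of statistic the abstract describes. The simulated data and the contamination of observation 0 are likewise made up for illustration.

```python
import numpy as np


def cooks_distance(X, y):
    """Classical Cook's distance for OLS with an intercept (requires n > p + 1)."""
    n = X.shape[0]
    Z = np.column_stack([np.ones(n), X])      # design matrix with intercept
    k = Z.shape[1]                            # number of fitted coefficients
    Q, _ = np.linalg.qr(Z)                    # reduced QR: hat matrix is Q Q^T
    h = np.sum(Q**2, axis=1)                  # leverages h_ii
    resid = y - Q @ (Q.T @ y)                 # OLS residuals
    s2 = resid @ resid / (n - k)              # residual variance estimate
    return resid**2 * h / (k * s2 * (1.0 - h) ** 2)


def marginal_correlations(X, y):
    """Pearson correlation between y and each column of X."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))


def loo_marginal_influence(X, y):
    """Assumed illustrative score: mean squared change in the marginal
    correlations when observation i is deleted. Works even when p >> n."""
    n = X.shape[0]
    rho = marginal_correlations(X, y)
    scores = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        rho_i = marginal_correlations(X[keep], y[keep])
        scores[i] = np.mean((rho - rho_i) ** 2)
    return scores


# Simulated high-dimensional data (p >> n) with one contaminated observation.
rng = np.random.default_rng(0)
n, p = 50, 500
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 1.0
y = X @ beta + rng.standard_normal(n)
X[0] += 5.0      # shift every predictor of observation 0
y[0] += 30.0     # and shift its response

scores = loo_marginal_influence(X, y)
flagged = int(np.argmax(scores))
print("most influential observation:", flagged)
```

Cook's distance needs the least squares fit and therefore more observations than coefficients, so `cooks_distance` cannot be applied to the simulated data above; the marginal-correlation score only needs one predictor at a time, which is why it remains computable when the number of predictors exceeds the sample size.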

     
