Hausman and Ramsey tests, overfitting, weak instruments and overidentification, model selection, and resampling

Original: 计量圈计量经济圈 2019-06-30

《Main text》

1.Hausman specification test

The test evaluates the consistency of an estimator when compared to an alternative, less efficient estimator which is already known to be consistent. It helps one evaluate if a statistical model corresponds to the data.

Use 1: testing a variable for endogeneity. This test can be used to check for the endogeneity of a variable (by comparing instrumental variable (IV) estimates to ordinary least squares (OLS) estimates).

Use 2: testing whether an additional instrument is valid. It can also be used to check the validity of extra instruments by comparing IV estimates using a full set of instruments Z to IV estimates that use a proper subset of Z. Note that in order for the test to work in the latter case, we must be certain of the validity of the subset of Z, and that subset must have enough instruments to identify the parameters of the equation.

Use 3: choosing between fixed effects and random effects in panel data. The Hausman test can also be used to differentiate between the fixed effects model and the random effects model in panel data. In this case, random effects (RE) is preferred under the null hypothesis due to higher efficiency, while under the alternative fixed effects (FE) is at least consistent and thus preferred.
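A minimal sketch of use 1 with NumPy only. The simulated data, the helper name `hausman_ols_vs_iv`, the restriction to one regressor and one instrument, and the choice to use the IV residual variance in both covariance matrices are assumptions made for illustration; they do not come from the original article.

```python
import numpy as np

def hausman_ols_vs_iv(y, x, z):
    """Hausman contrast between OLS and 2SLS for the slope on x.

    Hypothetical helper: one regressor x, one instrument z, plus a constant.
    """
    n = len(y)
    X = np.column_stack([np.ones(n), x])   # regressors
    Z = np.column_stack([np.ones(n), z])   # instruments
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    # First stage: project the regressors onto the instrument space.
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    b_iv = np.linalg.solve(X_hat.T @ X_hat, X_hat.T @ y)
    # Use the IV residual variance in both covariance matrices so that
    # Var(IV) - Var(OLS) cannot be negative.
    resid_iv = y - X @ b_iv
    s2 = resid_iv @ resid_iv / (n - X.shape[1])
    var_ols = s2 * np.linalg.inv(X.T @ X)[1, 1]
    var_iv = s2 * np.linalg.inv(X_hat.T @ X_hat)[1, 1]
    h_stat = (b_iv[1] - b_ols[1]) ** 2 / (var_iv - var_ols)
    return b_ols[1], b_iv[1], h_stat

# Simulated example (assumed data-generating process, true slope = 2).
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)
v = rng.normal(size=n)
e = 0.8 * v + 0.6 * rng.normal(size=n)   # error correlated with x through v
x = z + v
y = 1.0 + 2.0 * x + e

b_ols, b_iv, h_stat = hausman_ols_vs_iv(y, x, z)
print(f"OLS slope {b_ols:.3f}, IV slope {b_iv:.3f}, Hausman H {h_stat:.1f}")
```

Under the null of exogeneity, H is approximately chi-squared with one degree of freedom here, so a value above roughly 3.84 rejects exogeneity at the 5% level.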

2.Ramsey RESET test

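Specification error occurs when an independent variable is correlated with the error term. The Ramsey RESET test looks for such error by adding powers of the fitted values to the regression and testing whether they are jointly significant. There are several different causes of specification error: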

Use 1: detecting an incorrect functional form. An incorrect functional form could be employed;

Use 2: detecting an omitted important variable. A variable omitted from the model may have a relationship with both the dependent variable and one or more of the independent variables (omitted-variable bias);

Use 3: detecting an irrelevant included variable. An irrelevant variable may be included in the model;

Use 4: detecting simultaneity bias. The dependent variable may be part of a system of simultaneous equations (simultaneity bias);

Use 5: detecting measurement error. Measurement errors may affect the independent variables.
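The RESET idea can be sketched as follows: fit the model, add powers of the fitted values as extra regressors, and F-test their joint significance. The helper name `reset_f_stat`, the use of squared and cubed fitted values, and the simulated data are assumptions for illustration only.

```python
import numpy as np

def reset_f_stat(y, X, max_power=3):
    """RESET F statistic: joint test on powers 2..max_power of the fitted values.

    X must already contain a constant column. Hypothetical helper.
    """
    n, k = X.shape
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    fitted = X @ b
    rss_restricted = np.sum((y - fitted) ** 2)
    # Augment the regression with powers of the fitted values.
    extra = np.column_stack([fitted ** p for p in range(2, max_power + 1)])
    X_aug = np.column_stack([X, extra])
    b_aug = np.linalg.lstsq(X_aug, y, rcond=None)[0]
    rss_unrestricted = np.sum((y - X_aug @ b_aug) ** 2)
    q = extra.shape[1]
    df_resid = n - k - q
    f_stat = ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / df_resid)
    return f_stat, q, df_resid

rng = np.random.default_rng(2)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
noise = rng.normal(size=n)
y_lin = 1.0 + x + noise                   # linear model is correctly specified
y_quad = 1.0 + x + 0.5 * x ** 2 + noise   # linear model omits the x^2 term

f_lin, q, df_resid = reset_f_stat(y_lin, X)
f_quad, _, _ = reset_f_stat(y_quad, X)
print(f"F (correct form) = {f_lin:.2f}, F (misspecified) = {f_quad:.2f}, df = ({q}, {df_resid})")
```

The statistic is compared with an F(q, n - k - q) distribution; a large value signals that the chosen functional form misses something.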

3.Overfitting

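Overfitting has two common causes:

1. The feature distributions of the training set and the test set differ (white swans versus black swans).

2. The model is too complex (it memorizes every exam question) while the sample is too small.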

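In regression, let n be the number of observations and p the number of parameters:

When n is much larger than p, least squares estimates have low variance.
When n is close to p, and in particular when n = p, least squares is prone to overfitting.

Explanatory power: every model involves a trade-off between variance and bias. If a model tracks the data it was fitted to very closely (small in-sample error), it is likely to show a large error in out-of-sample prediction; this is the problem we meet in overfitting.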

In order to avoid overfitting, it is necessary to use additional techniques (e.g. cross-validation, regularization, early stopping, pruning, Bayesian priors on parameters, model comparison or dropout) that can indicate when further training is not resulting in better generalization. For these remedies for overfitting, see http://dwz.cn/6uAcog (copy the link into your browser).

The basis of some techniques is either (1) to explicitly penalize overly complex models, or (2) to test the model's ability to generalize by evaluating its performance on a set of data not used for training, which is assumed to approximate the typical unseen data that a model will encounter.
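Approach (2) can be illustrated with a small simulation: polynomials of increasing degree are fitted to a training set and scored on data not used for fitting. The cubic data-generating process, the sample sizes, and the degrees tried are all assumptions chosen for this sketch; degree 9 on 10 training points gives the n = p case.

```python
import numpy as np

rng = np.random.default_rng(1)

def true_f(x):
    # Assumed data-generating process for the illustration (a cubic).
    return 1.0 + 2.0 * x - 3.0 * x ** 3

def one_replication(degrees, n_train=10, n_test=200, sigma=0.3):
    """Return an array of (training MSE, held-out MSE) rows, one per degree."""
    x_tr = np.linspace(-1.0, 1.0, n_train)
    y_tr = true_f(x_tr) + sigma * rng.normal(size=n_train)
    x_te = rng.uniform(-1.0, 1.0, size=n_test)
    y_te = true_f(x_te) + sigma * rng.normal(size=n_test)
    out = []
    for d in degrees:
        coefs = np.polyfit(x_tr, y_tr, d)
        mse_tr = np.mean((y_tr - np.polyval(coefs, x_tr)) ** 2)
        mse_te = np.mean((y_te - np.polyval(coefs, x_te)) ** 2)
        out.append((mse_tr, mse_te))
    return np.array(out)

degrees = [1, 3, 9]   # degree 9 has p = 10 coefficients, so n = p
# Average over repeated simulated datasets to smooth out sampling noise.
results = np.mean([one_replication(degrees) for _ in range(200)], axis=0)
for d, (mse_tr, mse_te) in zip(degrees, results):
    print(f"degree {d}: train MSE {mse_tr:.4f}, held-out MSE {mse_te:.4f}")
```

The training error keeps falling as the degree rises and reaches essentially zero at n = p, while the held-out error is what reveals the overfitted model.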

4.Weak instruments and overidentification test

4.1.“Weak Instruments” (weak instruments damage the efficiency, and even the consistency, of the estimates)

• If cov(z, x) is small, i.e. the instrument z is only weakly correlated with the endogenous regressor x, IV loses its desirable asymptotic properties

• IV estimates are not unbiased, and the bias tends to be larger when instruments are weak (even with very large datasets)

• Weak instruments tend to bias the results towards the OLS estimates

• Adding more and more instruments to improve asymptotic efficiency does not solve the problem.

Recommendation: always test the ‘strength’ of your instrument(s) by reporting the F-test on the instruments in the first-stage regression. (If, in the first-stage regression of the endogenous variable X on the instruments Z, the F statistic is greater than 10, the instruments are conventionally not regarded as weak.)
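The first-stage F-test in this recommendation can be sketched as below. The helper name `first_stage_f`, the simulated strong and weak instruments, and the absence of other exogenous controls are assumptions of this illustration.

```python
import numpy as np

def first_stage_f(x, Z):
    """F statistic for the joint significance of the instruments Z
    in the first-stage regression of x on a constant and Z.

    Hypothetical helper; assumes there are no other exogenous regressors.
    """
    n = len(x)
    Zc = np.column_stack([np.ones(n), Z])
    b = np.linalg.lstsq(Zc, x, rcond=None)[0]
    rss_unrestricted = np.sum((x - Zc @ b) ** 2)
    rss_restricted = np.sum((x - x.mean()) ** 2)   # constant-only model
    q = Zc.shape[1] - 1                            # number of instruments
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / (n - q - 1))

rng = np.random.default_rng(3)
n = 500
z = rng.normal(size=n)
x_strong = 1.0 * z + rng.normal(size=n)    # instrument strongly related to x
x_weak = 0.02 * z + rng.normal(size=n)     # instrument barely related to x

f_strong = first_stage_f(x_strong, z)
f_weak = first_stage_f(x_weak, z)
print(f"first-stage F: strong = {f_strong:.1f}, weak = {f_weak:.2f} (rule of thumb: F > 10)")
```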

4.2.Overidentification test (when there are more instruments than endogenous variables, tests whether those instruments are exogenous)

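The Sargan test constructs an approximately chi-squared statistic under the null hypothesis that all instruments are exogenous. If the null is violated, 2SLS is biased, the estimated disturbances are biased as well, and the statistic no longer follows a chi-squared distribution. The test only considers how extreme the statistic is under the null: if the chi-squared statistic is large, reject the null and conclude that some instrument is endogenous; otherwise we cannot conclude that the instruments are endogenous (nor, of course, can we be sure that they are exogenous). Because exogeneity is the null hypothesis, the test cannot establish that the instruments are exogenous.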
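A minimal sketch of the Sargan statistic in its common n times R-squared form: estimate by 2SLS, regress the residuals on all instruments, and multiply the R-squared by n. The helper name `sargan_stat` and the simulated data (two instruments for one endogenous regressor, with one instrument made invalid in the second dataset) are assumptions for illustration.

```python
import numpy as np

def sargan_stat(y, X, Z):
    """Sargan statistic n * R^2 and its degrees of freedom.

    X (regressors) and Z (instruments) must both contain the constant column.
    Hypothetical helper.
    """
    n = len(y)
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]   # first stage
    b_iv = np.linalg.lstsq(X_hat, y, rcond=None)[0]    # 2SLS coefficients
    u = y - X @ b_iv                                   # 2SLS residuals
    u_fit = Z @ np.linalg.lstsq(Z, u, rcond=None)[0]   # residuals regressed on Z
    r2 = 1.0 - np.sum((u - u_fit) ** 2) / np.sum((u - u.mean()) ** 2)
    df = Z.shape[1] - X.shape[1]   # number of overidentifying restrictions
    return n * r2, df

rng = np.random.default_rng(6)
n = 2000
z1, z2, v = rng.normal(size=(3, n))
e = 0.5 * v + rng.normal(size=n)        # endogeneity: e is correlated with x
x = z1 + z2 + v
X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z1, z2])

y_ok = 1.0 + 2.0 * x + e                # both instruments are exogenous
y_bad = 1.0 + 2.0 * x + 1.0 * z2 + e    # z2 affects y directly, so it is invalid

j_ok, df = sargan_stat(y_ok, X, Z)
j_bad, _ = sargan_stat(y_bad, X, Z)
print(f"Sargan J: valid = {j_ok:.2f}, invalid = {j_bad:.1f}, df = {df}")
```

With one overidentifying restriction the 5% chi-squared critical value is about 3.84. A small statistic, as the text notes, fails to reject exogeneity but does not prove it.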

5.Criteria for model selection

Akaike information criterion

Bayes factor

Bayesian information criterion

Cross-validation

Deviance information criterion

False discovery rate

Focused information criterion

Likelihood-ratio test

Mallows's Cp

Minimum description length (Algorithmic information theory)

Minimum message length (Algorithmic information theory)

Structural Risk Minimization

Stepwise regression

The most commonly used criteria are (i) the Akaike information criterion and (ii) the Bayes factor and/or the Bayesian information criterion (which to some extent approximates the Bayes factor).
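For concreteness, here is a sketch of AIC and BIC for least-squares fits under a Gaussian likelihood, with AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The helper name `gaussian_aic_bic`, the polynomial candidate models, and the simulated quadratic data are assumptions of this example.

```python
import numpy as np

def gaussian_aic_bic(y, fitted, n_coefs):
    """AIC and BIC of a least-squares fit under Gaussian errors.

    k counts the regression coefficients plus the error variance.
    Hypothetical helper.
    """
    n = len(y)
    rss = np.sum((y - fitted) ** 2)
    k = n_coefs + 1
    # -2 * maximized log-likelihood with sigma^2 estimated by rss / n.
    neg2_loglik = n * (np.log(2.0 * np.pi * rss / n) + 1.0)
    return neg2_loglik + 2.0 * k, neg2_loglik + k * np.log(n)

rng = np.random.default_rng(4)
n = 200
x = rng.uniform(-2.0, 2.0, size=n)
y = 1.0 + x - 2.0 * x ** 2 + rng.normal(scale=0.5, size=n)   # true degree is 2

scores = {}
for degree in range(1, 7):
    fitted = np.polyval(np.polyfit(x, y, degree), x)
    scores[degree] = gaussian_aic_bic(y, fitted, degree + 1)

best_aic = min(scores, key=lambda d: scores[d][0])
best_bic = min(scores, key=lambda d: scores[d][1])
print(f"degree chosen by AIC: {best_aic}, by BIC: {best_bic}")
```

Because ln n exceeds 2 once n is larger than about 7, BIC penalizes each extra parameter more heavily than AIC and never selects a larger model than AIC among the same candidates.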

6.Bootstrap, Jackknife and Permutation test

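Bootstrap

In statistics, the bootstrap (bootstrap method, bootstrapping, or bootstrap sampling) can refer to any uniform sampling with replacement: whenever an observation is drawn, it remains equally likely to be drawn again and added to the resampled set again. The bootstrap gives fairly good estimates of the accuracy of a sample estimate (standard error, confidence interval and bias), and it can be applied to a statistic with essentially any sampling distribution.

The bootstrap comes in two forms, nonparametric and parametric, but the basic idea of both is simulation. The parametric bootstrap assumes that the population distribution, or its functional form, is known; the distribution parameters are estimated from the sample and new samples are then drawn from the fitted distribution, much as in Monte Carlo (MC) simulation. The nonparametric bootstrap resamples from the sample itself rather than from a distribution function.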
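Both forms can be sketched in a few lines. The exponential sample, the helper names, the number of resamples B, and the choice of the mean as the statistic are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
data = rng.exponential(scale=2.0, size=100)   # assumed example sample
B = 2000                                      # number of bootstrap resamples

def nonparametric_bootstrap(data, stat, B, rng):
    """Resample the observed data with replacement and recompute the statistic."""
    n = len(data)
    return np.array([stat(rng.choice(data, size=n, replace=True)) for _ in range(B)])

def parametric_bootstrap_exponential(data, stat, B, rng):
    """Fit an exponential distribution, then draw new samples from the fit."""
    n = len(data)
    scale_hat = data.mean()   # maximum-likelihood estimate of the scale
    return np.array([stat(rng.exponential(scale=scale_hat, size=n)) for _ in range(B)])

np_means = nonparametric_bootstrap(data, np.mean, B, rng)
p_means = parametric_bootstrap_exponential(data, np.mean, B, rng)

se_np = np_means.std(ddof=1)               # bootstrap standard error
se_p = p_means.std(ddof=1)
ci = np.percentile(np_means, [2.5, 97.5])  # percentile confidence interval
bias = np_means.mean() - data.mean()       # bootstrap estimate of bias

print(f"SE nonparametric {se_np:.3f}, SE parametric {se_p:.3f}")
print(f"95% CI [{ci[0]:.3f}, {ci[1]:.3f}], bias estimate {bias:.4f}")
```

For the sample mean the nonparametric standard error should land close to the textbook value s / sqrt(n), which makes the mean a convenient sanity check before moving to statistics without a closed-form standard error.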

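Jackknife

The jackknife, named after the folding pocket knife, is an estimation method in statistical analysis: a resampling-based inference technique that uses the observations from a single sample to construct an unbiased, or only slightly biased, estimator of an unknown parameter. Each time, one observation is removed from the original sample, giving a new sample of size n-1 called a jackknife sample. There are n such samples, and the estimate computed from each is called a jackknife estimate. The method was proposed by Quenouille in 1956. Because the resulting estimator has little or no bias, it is valuable in fields that demand high precision.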
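The leave-one-out procedure can be sketched as follows. The helper name `jackknife` and the simulated sample are assumptions; the bias and standard-error formulas are the standard delete-one jackknife ones.

```python
import numpy as np

def jackknife(data, stat):
    """Delete-one jackknife: bias-corrected estimate, bias estimate, standard error."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    theta_hat = stat(data)
    # One estimate per jackknife sample (the data with observation i removed).
    loo = np.array([stat(np.delete(data, i)) for i in range(n)])
    loo_mean = loo.mean()
    bias = (n - 1) * (loo_mean - theta_hat)
    se = np.sqrt((n - 1) / n * np.sum((loo - loo_mean) ** 2))
    return theta_hat - bias, bias, se

rng = np.random.default_rng(8)
data = rng.normal(loc=5.0, scale=2.0, size=25)   # assumed example sample

# The plug-in variance (dividing by n) is biased; the jackknife corrects it.
corrected_var, bias_var, _ = jackknife(data, np.var)
_, _, se_mean = jackknife(data, np.mean)

print(f"plug-in variance {np.var(data):.4f}, jackknife-corrected {corrected_var:.4f}")
print(f"jackknife SE of the mean {se_mean:.4f}")
```

Two known identities make this easy to check: the jackknife-corrected plug-in variance equals the usual unbiased sample variance (dividing by n-1), and the jackknife standard error of the mean equals s / sqrt(n).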

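Efron's 1979 paper set out the relationship between the bootstrap and the jackknife. First, the bootstrap builds a "bootstrap world" from the empirical distribution function, turning the ill-posed problem of estimating a probability distribution into resampling from the given sample. Second, the bootstrap can handle non-smooth statistics: the jackknife fails for them, whereas the bootstrap still gives a usable estimate for the median. Third, expanding the bootstrap estimate in a Taylor series shows that the jackknife is a first-order approximation to the bootstrap. Fourth, for the variance of a linear statistic, the jackknife and the bootstrap give the same result; for the variance of a nonlinear statistic, however, the jackknife depends heavily on how well the statistic is approximated by a linear one, so it is far less effective than the bootstrap.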
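The second point can be seen directly in a small experiment on the median. The simulated sample, its even size, and the number of bootstrap resamples are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(7)
data = rng.normal(size=50)   # assumed example sample with an even number of points
n = len(data)

# Jackknife: the median of each leave-one-out sample.
loo_medians = np.array([np.median(np.delete(data, i)) for i in range(n)])
loo_mean = loo_medians.mean()
se_jack = np.sqrt((n - 1) / n * np.sum((loo_medians - loo_mean) ** 2))

# Bootstrap: the median of each resample drawn with replacement.
boot_medians = np.array(
    [np.median(rng.choice(data, size=n, replace=True)) for _ in range(1000)]
)
se_boot = boot_medians.std(ddof=1)

print(f"distinct leave-one-out medians: {len(np.unique(loo_medians))}")
print(f"distinct bootstrap medians:     {len(np.unique(boot_medians))}")
print(f"SE jackknife {se_jack:.4f}, SE bootstrap {se_boot:.4f}")
```

With an even sample size, deleting one observation can only move the median to one of the two middle order statistics, so the jackknife sees just two distinct values. That coarse picture of the sampling variability is why it breaks down for non-smooth statistics, while the bootstrap medians spread over many values.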

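Permutation test (a nonparametric test)

When the sample is not large enough and the population distribution is unknown, a permutation test simulates the distribution of the sample mean and the observed value is then compared with that distribution.

In detail:

Two groups of data: A with sample size n and B with sample size m, for a pooled sample of size n+m.

Draw n values at random from the n+m pooled observations and compute their sample mean; repeat this i times (e.g. i=1000) to obtain the distribution of the sample mean, then compare the sample mean of A with that distribution. This gives a hypothesis test.

Alternatively, draw n of the n+m observations at random as A and treat the remaining m as B, compute the difference between the two groups, and repeat this i times to obtain the distribution of the difference; then compare the actual difference with that distribution.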
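The second procedure above (reassigning group labels and recomputing the difference) can be sketched as follows. The helper name `permutation_test_mean_diff`, the two-sided p-value with a +1 correction, and the simulated groups are assumptions for illustration.

```python
import numpy as np

def permutation_test_mean_diff(a, b, n_perm=1000, rng=None):
    """Two-sided permutation test for a difference in group means."""
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    n = len(a)
    diffs = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = rng.permutation(pooled)
        # The first n shuffled values play the role of A, the remaining m of B.
        diffs[i] = shuffled[:n].mean() - shuffled[n:].mean()
    # Share of permuted differences at least as extreme as the observed one;
    # the +1 terms count the observed labelling itself.
    p_value = (1 + np.sum(np.abs(diffs) >= abs(observed))) / (n_perm + 1)
    return observed, diffs, p_value

rng = np.random.default_rng(9)
group_a = rng.normal(loc=0.0, scale=1.0, size=30)   # n = 30
group_b = rng.normal(loc=2.0, scale=1.0, size=40)   # m = 40

observed, diffs, p_value = permutation_test_mean_diff(group_a, group_b, n_perm=1000, rng=rng)
print(f"observed difference {observed:.3f}, permutation p-value {p_value:.4f}")
```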

Note: simulating data rests on an idea similar to the permutation test: it removes confounding factors.

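See also: a summary of variable endogeneity and instrumental variables.

For instrumental-variable regression, 计量经济圈 recommends this classic reading: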

https://pan.baidu.com/s/1c1OK37M

《END》
