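This report studies the entrance examination scores and admission outcomes of the 1999 graduate applicants to the Institute of International Economics at Nankai University. Admission status (admitted or not admitted) is introduced as a dummy variable, and the linear probability model, the Probit model, and the Logit model are compared by their rate of correct prediction.

II. Model Specification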
Variable definitions: SCORE is the applicant's examination score; Y equals 1 if the applicant was admitted and 0 otherwise.

obs  Y  SCORE     obs  Y  SCORE     obs  Y  SCORE
  1  1    401      34  0    332      67  0    275
  2  1    401      35  0    332      68  0    273
  3  1    392      36  0    332      69  0    273
  4  1    387      37  0    331      70  0    272
  5  1    384      38  0    330      71  0    267
  6  1    379      39  0    328      72  0    266
  7  1    378      40  0    328      73  0    263
  8  1    378      41  0    328      74  0    261
  9  1    376      42  0    321      75  0    260
 10  1    371      43  0    321      76  0    256
 11  1    362      44  0    318      77  0    252
 12  1    362      45  0    318      78  0    252
 13  1    361      46  0    316      79  0    245
 14  0    359      47  0    308      80  0    243
 15  0    358      48  0    308      81  0    242
 16  1    356      49  0    304      82  0    241
 17  0    356      50  0    303      83  0    239
 18  0    355      51  0    303      84  0    235
 19  0    354      52  0    299      85  0    232
 20  0    354      53  0    297      86  0    228
 21  0    353      54  0    294      87  0    219
 22  0    350      55  0    293      88  0    219
 23  0    349      56  0    293      89  0    214
 24  0    349      57  0    292      90  0    210
 25  0    348      58  0    291      91  0    204
 26  0    347      59  0    291      92  0    198
 27  0    347      60  0    287      93  0    189
 28  0    344      61  0    286      94  0    188
 29  0    339      62  0    286      95  0    182
 30  0    338      63  0    282      96  0    166
 31  0    338      64  0    282      97  0    123
 32  0    336      65  0    282
 33  0    334      66  0    278

The table above lists the sample observations.

1. Linear probability model
Dependent Variable: Y
Method: Least Squares
Date: 12/10/10 Time: 20:38
Sample: 1 97
Included observations: 97

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C            -0.847407      0.159663      -5.307476     0.0000
SCORE         0.003297      0.000521       6.325970     0.0000

R-squared             0.296390    Mean dependent var       0.144330
Adjusted R-squared    0.288983    S.D. dependent var       0.353250
S.E. of regression    0.297866    Akaike info criterion    0.436060
Sum squared resid     8.428818    Schwarz criterion        0.489147
Log likelihood       -19.14890    F-statistic              40.01790
Durbin-Watson stat    0.359992    Prob(F-statistic)        0.000000
The estimated equation is:

$$\hat{Y}_i = -0.847407 + 0.003297\,SCORE_i$$

Se = (0.159663) (0.000521)
t = (-5.307476) (6.325970)
p = (0.0000) (0.0000)
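To make the estimated line concrete, the following minimal sketch evaluates it at the lowest and highest scores in the sample (123 and 401). It uses only the coefficients reported above. The names `lpm_fitted` and `cutoff_score`, and the 0.5 classification cutoff, are assumptions introduced for illustration and are not stated in the report.

```python
# Reported OLS estimates of the linear probability model.
B0, B1 = -0.847407, 0.003297

def lpm_fitted(score):
    """Fitted value of Y, read as an admission probability, for a given score."""
    return B0 + B1 * score

lowest, highest = 123, 401  # minimum and maximum SCORE in the sample

fit_low = lpm_fitted(lowest)    # falls below 0, so it is not a valid probability
fit_high = lpm_fitted(highest)  # stays below 0.5 even for the top score

# Score at which the fitted line would reach an ASSUMED 0.5 cutoff.
cutoff_score = (0.5 - B0) / B1

print(f"fitted at {lowest}: {fit_low:.4f}")
print(f"fitted at {highest}: {fit_high:.4f}")
print(f"score needed to reach 0.5: {cutoff_score:.1f}")
```

Under these assumptions the fitted value is negative at the bottom of the score range, and the line reaches 0.5 only at a score above the sample maximum, so a 0.5 cutoff would classify no applicant as admitted. Both are familiar limitations of a linear probability model.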
Forecast: YF
Actual: Y
Forecast sample: 1 97
Included observations: 97

Root Mean Squared Error           0.294780
Mean Absolute Error               0.233437
Mean Absolute Percentage Error    8.689503
Theil Inequality Coefficient      0.475786
     Bias Proportion              0.000000
     Variance Proportion          0.294987
     Covariance Proportion
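The forecast evaluation measures can be sketched as follows. The functions use the usual textbook definitions; the Theil inequality coefficient is taken here as RMSE divided by the sum of the root mean squares of the forecast and actual series, which is assumed to match the definition behind the output above. The function names and the four-point toy series are illustrative only.

```python
import math

def rmse(actual, forecast):
    """Root mean squared forecast error."""
    n = len(actual)
    return math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / n)

def mae(actual, forecast):
    """Mean absolute forecast error."""
    return sum(abs(f - a) for a, f in zip(actual, forecast)) / len(actual)

def theil_u(actual, forecast):
    """Theil inequality coefficient (assumed definition, bounded by 0 and 1)."""
    n = len(actual)
    rms_forecast = math.sqrt(sum(f * f for f in forecast) / n)
    rms_actual = math.sqrt(sum(a * a for a in actual) / n)
    return rmse(actual, forecast) / (rms_forecast + rms_actual)

# Toy series, for illustration only (not the report's data).
toy_actual = [1, 0, 0, 1]
toy_forecast = [0.8, 0.2, 0.4, 0.6]
print(rmse(toy_actual, toy_forecast), mae(toy_actual, toy_forecast),
      theil_u(toy_actual, toy_forecast))

# Cross-check: for in-sample OLS forecasts, RMSE = sqrt(SSR / n),
# using the reported sum of squared residuals and sample size.
rmse_from_ssr = math.sqrt(8.428818 / 97)
print(round(rmse_from_ssr, 6))
```

The last line links the regression output to the forecast output: the reported sum of squared residuals and sample size give back the reported root mean squared error.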
Dependent Variable: Y
Method: ML - Binary Logit (Quadratic hill climbing)
Date: 12/10/10 Time: 21:38
Sample: 1 97
Included observations: 97
Convergence achieved after 11 iterations
Covariance matrix computed using second derivatives

Variable    Coefficient    Std. Error    z-Statistic    Prob.
C            -243.7362      125.5564      -1.941248     0.0522
SCORE                                                   0.0526

Mean dependent var       0.144330    S.D. dependent var       0.353250
S.E. of regression       0.115440    Akaike info criterion    0.123553
Sum squared resid        1.266017    Schwarz criterion        0.176640
Log likelihood          -3.992330    Hannan-Quinn criter.     0.145019
Restr. log likelihood   -40.03639    Avg. log likelihood     -0.041158
LR statistic (1 df)      72.08812    McFadden R-squared       0.900282
Probability(LR stat)     0.000000

Obs with Dep=0    83    Total obs    97
Obs with Dep=1    14
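Several of the Logit summary statistics follow directly from the two log likelihoods and the sample counts. The sketch below recomputes them using the standard formulas, with per-observation information criteria (assumed to be the convention behind the output above). All variable names are illustrative.

```python
import math

n = 97             # total observations
n_admitted = 14    # observations with Y = 1 (97 total minus 83 with Y = 0)
k = 2              # estimated parameters: intercept and SCORE
ll_full = -3.992330  # reported log likelihood of the fitted Logit model

# Restricted model: intercept only, so every fitted probability is the sample share.
p_bar = n_admitted / n
ll_restricted = (n_admitted * math.log(p_bar)
                 + (n - n_admitted) * math.log(1 - p_bar))

lr_stat = 2 * (ll_full - ll_restricted)          # likelihood ratio statistic, 1 df
mcfadden_r2 = 1 - ll_full / ll_restricted        # McFadden R-squared
aic = (-2 * ll_full + 2 * k) / n                 # Akaike criterion, per observation
schwarz = (-2 * ll_full + k * math.log(n)) / n   # Schwarz criterion, per observation

print(f"restricted log likelihood: {ll_restricted:.5f}")
print(f"LR statistic: {lr_stat:.5f}")
print(f"McFadden R-squared: {mcfadden_r2:.6f}")
print(f"AIC: {aic:.6f}  Schwarz: {schwarz:.6f}")
```

The recomputed values agree with the reported output to the printed precision, which also confirms that the restricted log likelihood corresponds to 14 admitted applicants out of 97.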
$$p_i = F(y_i) = \frac{1}{1 + e^{-(-243.7362 + 0.6794\,x_i)}}$$

Inflection point: (358.7, 0.5)
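As a rough illustration of how the fitted logistic curve can be used for prediction, the sketch below evaluates it on the sample, locates the score at which the fitted probability equals 0.5, and counts correct classifications. The 0.5 cutoff is an assumption (the report does not state its classification rule), the coefficients are the rounded values from the equation above, and the names `logit_prob`, `SCORES`, and `ADMITTED_OBS` are illustrative.

```python
import math

A, B = -243.7362, 0.6794  # intercept and SCORE coefficient from the fitted equation

def logit_prob(score):
    """Fitted admission probability 1 / (1 + exp(-(A + B * score)))."""
    z = A + B * score
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # avoids overflow when z is very negative
    return ez / (1.0 + ez)

# Sample scores in observation order 1..97, transcribed from the data table.
SCORES = [
    401, 401, 392, 387, 384, 379, 378, 378, 376, 371, 362, 362, 361, 359, 358, 356,
    356, 355, 354, 354, 353, 350, 349, 349, 348, 347, 347, 344, 339, 338, 338, 336,
    334, 332, 332, 332, 331, 330, 328, 328, 328, 321, 321, 318, 318, 316, 308, 308,
    304, 303, 303, 299, 297, 294, 293, 293, 292, 291, 291, 287, 286, 286, 282, 282,
    282, 278, 275, 273, 273, 272, 267, 266, 263, 261, 260, 256, 252, 252, 245, 243,
    242, 241, 239, 235, 232, 228, 219, 219, 214, 210, 204, 198, 189, 188, 182, 166,
    123,
]
ADMITTED_OBS = set(range(1, 14)) | {16}  # observations with Y = 1
Y = [1 if obs in ADMITTED_OBS else 0 for obs in range(1, len(SCORES) + 1)]

# Score where the fitted probability is exactly 0.5 (the inflection point).
x_star = -A / B

# Classify with the ASSUMED 0.5 cutoff and count agreements with Y.
predicted = [1 if logit_prob(s) > 0.5 else 0 for s in SCORES]
correct = sum(1 for y, p in zip(Y, predicted) if y == p)

print(f"inflection score: {x_star:.2f}")
print(f"correctly classified: {correct} of {len(Y)}")
```

Because the curve is very steep around the inflection score, the classification is almost entirely determined by whether a score lies above or below that point. Only observations close to it (a score of 359 without admission, a score of 356 with admission) end up on the wrong side.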
Comparison of the linear probability model, the Probit model, and the Logit model