Stata: 10 articles you won't regret reading, with programming code and annotations

计量经济圈, 2020-02-22

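Anyone who works in econometrics follows this account.

Email: econometrics666@sina.cn

All code from the 计量经济圈 methodology series, the macro and micro databases, and assorted software are kept in the community. You are welcome to visit the 计量经济圈 community to discuss them.

**Copyrights @计量经济圈 (ID: econometrics666)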

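1. Stata data-management programming: the secret manual

2. Everything about data cleaning and management in one place; mastering data management is within reach

3. 115 pages of tips for using Stata efficiently, as a printable PDF

4. Companion datasets for "Advanced Econometrics and Stata Applications" (高级计量经济学及Stata应用) and "Eighteen Lectures on Stata" (Stata十八讲)

5. What exactly are functions in a programming language? All Stata functions gathered here

6. The 500 most-used Stata programs worldwide; if you are not using them yet, you are falling behind

7. The most useful Stata points are all here, irreplaceable material

8. Stata program handbook for reg3, multiple regression, panel data, analysis of variance, and testing and correcting for heteroskedasticity and autocorrelation

9. Stata statistical features, graphics, learning resources and more: everything you wondered about in one article

10. 21 must-read articles for publishing in SSCI journals

If you do not understand what any of the programs below does, you can discuss it in the community.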

http://fmwww.bc.edu/repec/bocode/t/textEditors.html#notepadplus
http://personal.lse.ac.uk/lembcke/ecStata/2010/MResStataNotesOct2010PartA.pdf
http://homepages.rpi.edu/~simonk/pdf/UsefulStataCommands.pdf
http://statadaily.com/tag/notepad/
http://personal.lse.ac.uk/lembcke/ecStata/2009/MResStataNotesFeb2009PartB.pdf
http://fmwww.bc.edu/GStat/docs/StataMLNL.pdf

*Program 1
capture program drop argdisp
program argdisp
version 13
args first second third //assign local macros by argument position
display "1st argument = `first'"
display "2nd argument = `second'"
display "3rd argument = `third'"
end
argdisp cat dog mouse //first = cat, second = dog, third = mouse
argdisp 3.456 2+5-12 X3+cat //first = 3.456, second = 2+5-12, third = X3+cat

*Program 2
capture program drop myprog
program myprog
version 15
syntax varlist [if] [in] [, adjust(real 1) title(string)]
display
if "`title'" != "" {
display "`title':"
}
foreach var of local varlist {
quietly summarize `var' `if' `in'
display "`var'" " " r(mean)*`adjust'
}
end
webuse auto.dta, clear
myprog mpg price //a program that reports means
myprog mpg weight if foreign==1 //means conditional on foreign==1
myprog mpg weight if foreign==1, title("My title")
myprog mpg weight if foreign==1, title("My title") adjust(2)

**Program 3
capture program drop doavar
program doavar
version 15
args touse name value
qui summarize `name' if `touse'
display "`name'" " " r(mean)*`value'
end

**Program 4
capture program drop myprog
program myprog
version 15
syntax varlist [if] [in] [, adjust(real 1) title(string)]
marksample touse
display
if "`title'" != "" {
display "`title':"
}
foreach var of local varlist {
doavar `touse' `var' `adjust'
}
end
webuse auto.dta, clear
doavar mpg weight trunk //mean of weight (where mpg is nonzero) times trunk; display evaluates trunk at the first observation
myprog mpg weight trunk //means of these three variables

**Program 5
capture program drop lnsim
program lnsim
version 15
tempname sim //temporary name for the postfile handle
postfile `sim' mean var meansd sd using results, replace //four variables stored in results.dta
quietly {
forvalues i = 1/10000 {
drop _all
set obs 100
gen z = exp(rnormal()) //lognormal random draws
sum z
post `sim' (r(mean)) (r(Var)) (r(mean)/r(sd)) (r(sd)) //post the four statistics; summarize stores the variance as r(Var)
}
}
postclose `sim'
end
set seed 12345
lnsim
use results, clear
describe
sum
*Generate new variables or enter new values into variables
webuse genxmpl2, clear
generate str9 lastname = word(name, 2)

**Enter new variables with input
input x
1
2
end
input double (y z)
3 4
5 6
end
input str2 s
ab
cd
end

*Snapshots: a restore facility similar to preserve
webuse auto, clear
snapshot erase _all //first erase any existing snapshots
snapshot save, label("before changes")
generate gpm = 1/mpg //create a new variable
label variable gpm "gallons per mile"
snapshot save, label("after changes") //save the data after the change
drop gpm //now drop gpm
snapshot list //see how many snapshots exist
snapshot restore 2 //restore snapshot 2, which contains gpm
describe //inspect the restored data
snapshot restore 1
snapshot list
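
For contrast with the snapshot commands above, here is a minimal sketch of the same round trip using preserve and restore. It assumes only the auto dataset shipped with Stata. Unlike snapshots, preserve holds a single copy at a time and takes no label.

```stata
sysuse auto, clear
preserve                      // keep a temporary copy of the data
generate gpm = 1/mpg          // change the data in memory
drop if foreign == 1          // drop the foreign cars
count                         // only domestic cars remain
restore                       // bring back the preserved copy
describe, short               // gpm is gone and all observations are back
```

Snapshots are the more flexible choice when several named states need to be kept side by side; preserve is usually enough inside a program that must leave the user's data untouched.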

*Program-related features
program dir //list the programs currently loaded in memory
capture program drop rng
program rng
args n a b
if "`b'"=="" {
display "You must type three arguments: n a b"
exit
}
drop _all
set obs `n'
gen x = (_n-1)/(_N-1)*(`b'-`a')+`a'
end
rng 10 2 3 //run the program
list x in 1/10 //show the result

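**Program 6
capture program drop smooth
program smooth
args v1 v2
confirm variable `v1' //check that this variable exists in the dataset
confirm new variable `v2' //check that this variable does not exist yet
gen `v2' = cond(_n==1 | _n==_N, `v1', (`v1'[_n-1]+`v1'+`v1'[_n+1])/3) //cond(): three-point moving average, endpoints kept as is
end
webuse auto.dta, clear
smooth mpg new_mpg
list new_mpg in 1/10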

**Extended macro functions
local logitprops: properties logit //properties of the logit command
di "`logitprops'"

*Exporting with putexcel
putexcel set results, replace //use results.xlsx as the output Excel file
putexcel A1 = "Variable" B1 = "Mean" C1 = "Std. Dev.", border(bottom)
sysuse auto, clear
summarize mpg
return list
putexcel A2 = "mpg" B2 = `r(mean)' C2 = `r(sd)', nformat(number_d2)

*Export a tabulation table
sysuse auto, clear
putexcel set results, replace //use results.xlsx as the output Excel file
tab foreign, matcell(cell) matrow(rows)
putexcel A1=("Car type") B1=("Freq.")
putexcel A2=matrix(rows) B2=matrix(cell)
putexcel A4=("Total") B4=(r(N))

*Export individual regression results
sysuse auto.dta, clear
regress price turn gear
putexcel set "results.xls", sheet("regress results")
putexcel F1=("Number of obs") G1=(e(N))
putexcel F2=("F") G2=(e(F))
putexcel F3=("Prob > F") G3=(Ftail(e(df_m), e(df_r),e(F)))
putexcel F4=("R-squared") G4=(e(r2))
putexcel F5=("Adj R-squared") G5=(e(r2_a))
putexcel F6=("Root MSE") G6=(e(rmse))
matrix a=r(table)'
matrix a=a[.,1..6]
putexcel A8=matrix(a)

*Programming with quietly
capture program drop myprog
program myprog
quietly{
regress `1' `2'
predict resid, resid
sort resid
summarize resid, detail
}
list `1' `2' resid if resid<r(p5) | resid>r(p95)
drop resid
end
sysuse auto.dta, clear
myprog mpg price //lists both variables and the residual for observations in the two tails

*Show which variables are dropped because of collinearity
sysuse auto.dta, clear
gen tt= turn+ trunk
_rmcoll turn trunk tt
display r(varlist)
_rmcoll i.rep78
display r(varlist)
_rmcoll rep78#foreign
display r(varlist)
*Fragment for use inside an estimation program; the ... parts are left to fill in
syntax varlist [fweight iweight] ... [, noCONStant ... ]
marksample touse
if "`weight'" != "" {
tempvar w
quietly gen double `w' = `exp' if `touse'
local wgt [`weight'=`w']
}
else local wgt /* is nothing */
gettoken depvar xvars : varlist
_rmcoll `xvars' `wgt' if `touse', `constant'
local xvars `r(varlist)'

*Record how long a program takes to run
capture program drop tester
sysuse auto.dta, clear
program tester
version 15
timer clear 1
forvalues repeat=1(1)1{
timer on 1
logit foreign trunk price rep78 //the command being timed
timer off 1
}
timer list 1
end
sum turn
logit foreign trunk price if length > 190
gen byte touse = e(sample) //mark the estimation sample; marksample only works inside a program after syntax
reg headroom trunk price if touse==1

*Working with scalars
sysuse auto.dta, clear
sum mpg, meanonly
scalar m1=r(mean)
sum trunk, meanonly
scalar m2=r(mean)
scalar df=m1-m2
dis df
scalar list //display all scalars
gen newvar1=mpg*m1
dis newvar1
gen newvar2 = mpg*scalar(m1) //preferred: scalar() avoids confusion with a variable of the same name
dis newvar2

*Build a simple program
capture program drop mysub
sysuse auto.dta, clear
program mysub
args m1 m2 m3
logit `m1' `m2' `m3'
end

capture program drop myprog
program myprog
capture drop z //do not fail when z does not exist yet
capture drop m1
set obs 100
gen z=uniform()
sum z
gen m1 = r(mean)
mysub foreign m1 trunk
end

*Check whether the data have changed since estimation
sysuse auto.dta, clear
logit foreign trunk price, vce(cluster make)
predict xb
signestimationsample foreign trunk price
checkestimationsample //returns silently if the data have not changed

quietly tsset //time-series data
signestimationsample `r(timevar)' `lhsvar' `rhsvars' `othervars'
quietly xtset //panel data
signestimationsample `r(panelvar)' `r(timevar)' `lhsvar' `rhsvars' `clustervar'

*Make Stata wait 10 seconds before running the next command
sleep 10000

*SMCL: Stata markup and control language
display "{title: this is SMCL, too}"
display "now we will try {help summarize: clicking}"
display "You can also run Stata commands by {stata summarize mpg: clicking}"
display "{center: The use of {ul:SMCL} in help files}"
display "{text}the variable mpg has mean {result: 21.3} in the sample"
display "{text}mpg {c |} {result}21.3"
display "{text}mpg {c |} {result:21.3}"
display "{error:variable not found}"
display "{txt}the variable mpg has mean {res:21.3} in the sample"
display "When using the {cmd:summarize} command, specify"
display "{cmdab:su:mmarize}[{it:varlist}][{it:weight}][{cmdab:if} {it:exp}]"
display "{opt replace}"
display "{opt bseunit(varname)}"
display "{opt f:ormat}"
display "{opt sep:arator(#)}"
display "{hilite:[R] anova} for more details"
display "{* this text will be ignored}"

display "{hiline 20}"
display "{dup 20: A}"
display "{manhelpi mta M:Mata Reference Manual}"
display "{pstd}You can change the style of the text using the {cmd} directive; see {help example##cmd} below"
display "{help epitab}"
display "{newvar}"
display "{search anova: click here} for the latest info on ANOVA"
display `"you can {browse "http://www.stata.com":visit the Stata website}"'
display `"see {view "http://www.stata.com/man/readme.smcl"}"'

*A program that uses SMCL
capture program drop example2
program example2
display as text "{p}"
display "Below we will call a subroutine to contribute a sentence"
display "to this paragraph being constructed by example2:"
example2_subroutine
display "The text that example2_subroutine contributed became"
display "part of this single paragraph. Now we will end the paragraph."
display //a blank line ends paragraph mode
end

capture program drop example2_subroutine
program example2_subroutine
display "This sentence is being displayed by"
display "example2_subroutine"
end

*sortpreserve restores the original sort order of the data when the program ends
capture program drop myprog
program myprog, sortpreserve
args i j
sort `i' `j'
mysubcalculation `i' `j'
end

program mysubcalculation, sortpreserve
args i j
sort `j' `i'
end
sysuse auto.dta, clear
myprog mpg trunk

program myprog2, byable(recall) sortpreserve
syntax varname [if] [in]
marksample touse
sort `touse' `varlist' //syntax varname stores the variable in `varlist'
summarize `varlist' if `touse'
end
sysuse auto.dta, clear
myprog2 price

*byable lets a program be prefixed with by so that it runs group by group
program myprog1, byable(recall)
syntax [varlist] [if] [in]
marksample touse
summarize `varlist' if `touse'
end
sysuse auto.dta, clear
by foreign: myprog1 price trunk weight

program myprog3, byable(onecall) sortpreserve
syntax newvarname =exp [if] [in]
marksample touse, novarlist
tempvar rhs
quietly {
gen double `rhs' `exp' if `touse'
sort `touse' `_byvars' `rhs'
by `touse' `_byvars': gen `typlist' `varlist' = `rhs' - `rhs'[_n-1] if `touse'
}
end
myprog3 mpg_new= mpg^2

*The syntax command
capture program drop myprog
program myprog
version 15
syntax varlist [if] [in][,adjust(real 1) title(string)]
display "varlist contains |`varlist'|"
display " if contains |`if'|"
display " in contains |`in'|"
display "adjust contains |`adjust'|"
display "title contains |`title'|"
end
sysuse auto.dta, clear
myprog mpg weight if foreign in 1/20, title("My results") adjust(2.5) //run the program

capture program drop myprog
program myprog
version 15
syntax varlist [if] [in] [, adjust(real 1) title(string)]
marksample touse //mark the estimation sample
display
if "`title'"!="" {
display "`title':"
}
foreach var of local varlist {
quietly sum `var' if `touse'
display %9s "`var'" " " %9.0g r(mean)*`adjust'
}
end
sysuse auto.dta, clear
myprog mpg weight if foreign, title("My results") adjust(2.5) //run the program
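
The program above relies on marksample to build the sample marker. As a small illustration of what that marker contains, the sketch below counts usable observations; countsample is a hypothetical helper name introduced here, and the auto dataset is assumed. The marker is 1 only for observations that satisfy if and in and have no missing values in the listed variables.

```stata
capture program drop countsample
program countsample                 // hypothetical helper, for illustration only
    version 15
    syntax varlist [if] [in]
    marksample touse                // 0/1 marker: if/in satisfied and no missing values in varlist
    quietly count if `touse'
    display "usable observations: " r(N)
end

sysuse auto, clear
countsample mpg                     // all 74 cars
countsample mpg rep78               // 69, because rep78 is missing for 5 cars
countsample mpg rep78 if foreign    // foreign cars with a nonmissing rep78
```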

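*gettoken: pull tokens out of a macro
local str "cat+dog mouse++horse"
gettoken left: str //the first space-delimited token goes into left; str itself is unchanged
display `"`left'"'
display `"`str'"'
gettoken left str: str, parse(" +") //the token before + goes into left, the remainder back into str
display `"`left'"'
display `"`str'"'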

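In practice gettoken is usually called in a loop that consumes the string one token at a time. The following is a minimal sketch of that pattern, reusing the example string from above; the macro names rest, tok and n are chosen here for illustration. With parse(" +"), spaces only separate tokens while each + comes back as a token of its own, so the loop skips those.

```stata
local rest "cat+dog mouse++horse"
local n = 0
gettoken tok rest : rest, parse(" +")
while `"`tok'"' != "" {
    if `"`tok'"' != "+" {           // skip the + separators returned as tokens
        local ++n
        display "word `n' = `tok'"
    }
    gettoken tok rest : rest, parse(" +")
}
display "found `n' words"
```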
*Count the panels in which a variable stays constant over the tracked years
webuse nlswork, clear
xtset idcode
keep if idcode<=6
keep idcode year union
quietly {
local r=0
gen n=0
forvalues j=1/6 {
duplicates tag union if idcode==`j', gen(union`j')
tab union`j'
return list
scalar i`j'=r(r)
if i`j'==1 {
replace n = `r' + 1
local r=`r'+1
}
}
}
sum n
display as text "In total, " r(mean) " panels show no change"

*sysdir: system directories
sysdir
sysdir set OLDPLACE "d:\ado" //change the current OLDPLACE path
adopath //search path for ado-files
set adosize 1550 //enlarge the memory available for ado-files

*tabdisp displays tables, somewhat like list
webuse tabdxmpl1, clear
tabdisp a b, cell(c) //shows the value of c for each combination of a and b
sysuse auto2, clear
tabdisp make, cell(mpg weight displ rep78) //table of mpg, weight, displ and rep78 by make
collapse (mean) mpg, by(foreign rep78)
tabdisp foreign rep78, cell(mpg) //shows the collapsed means as a two-way table
tabdisp foreign rep78, cell(mpg) format(%9.2f) center //changes the number format

webuse tabdxmpl3, clear
tabdisp agecat sex party, c(reaction) center //a three-way stacked table, quite handy
webuse tabdxmpl4, clear
tabdisp sex response, cell(pop) missing //show missing values
webuse tabdxmpl5, clear
tabdisp sex response, cell(pop) total //add the totals

*Macros
local x 0 //start from a defined value
local ++x //same as local x=`x'+1
local x=`x'+1
sysuse auto.dta, clear
global x : type mpg //extended macro function: storage type of mpg
dis "$x"
global x2 : variable label mpg //extended macro function: variable label of mpg
dis "$x2"
constraint 1 price = weight //define constraint 1 as price=weight
constraint 2 mpg > 20
local myname : constraint 2 //store constraint 2 in a macro
macro list _myname //display the macro
local aname : constraint dir
macro list _aname
local today c(current_date) //today holds the text c(current_date)
dis `today' //evaluates to the current date

dis c(N)
dis c(current_time)
dis c(max_N_theory)
dis c(max_matsize)
dis c(max_macrolen)
dis c(mindouble)
dis c(Weekdays)
constraint 1 price = weight
local myname: constraint 1
macro list _myname
local lmyname: strlen local myname
macro list _lmyname
local string "a or b or c or d"
global newstr: subinstr local string "c" "sand"
display "$newstr"
local string2 : subinstr global newstr "or" "and", all count(local n)
display "`string2'"
local x 5
display "`x++'" //displays the current value of x, then increments it
display "`x'"
format `:format gear_ratio' headroom //give headroom the same display format as gear_ratio

*tempfile: temporary files (var1, var2 and xvar stand for your own variables)
preserve // preserve user's data
keep var1 var2 xvar
tempfile master part1 // declare temporary files
save "`master'"
drop var2
save "`part1'"
use "`master'", clear
drop var1
rename var2 var1
append using "`part1'"

*tokenize splits a string into the positional macros 1, 2, 3, ...
tokenize some words
display "1=|`1'|, 2=|`2'|, 3=|`3'|"
tokenize "some more words"
display "1=|`1'|, 2=|`2'|, 3=|`3'|, 4=|`4'|"
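
When the number of tokens is not known in advance, the positional macros can be walked with a counter. This is a minimal sketch of that idiom, assuming nothing beyond tokenize itself; the counter name i is arbitrary. The nested quotes first expand i to a number and then expand the positional macro with that number.

```stata
tokenize "some more words"
local i = 1
while "``i''" != "" {               // stop at the first empty positional macro
    display "token `i' = ``i''"
    local ++i
}
display "number of tokens: " `i' - 1
```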

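*Generate new variables
clear
set obs 100
gen x=uniform()
generate y = x[_n] //y is a copy of x
generate xlag = x[_n-1] //value of x one observation earlier, like L.x for time series
generate xlead = x[_n+1] //value of x one observation later, like F.x for time series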
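
The comments above compare explicit subscripts with the time-series operators. The sketch below puts the two side by side on a small simulated series; the variable t is a hypothetical time index created here. Subscripts depend on the current sort order, whereas L. and F. follow the time variable declared with tsset, so the two agree only when the data are sorted by time without gaps.

```stata
clear
set obs 10
set seed 12345
generate t = _n                     // hypothetical time index
generate x = runiform()
tsset t
generate xlag  = L.x                // lag taken along t
generate xlead = F.x                // lead taken along t
assert xlag  == x[_n-1]             // identical here: sorted by t, no gaps
assert xlead == x[_n+1]
list t x xlag xlead
```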

*Confidence intervals (cii is the immediate form)
sysuse auto, clear
ci means mpg price, level(90) //confidence intervals for the means of mpg and price, assuming normality
webuse petri, clear
ci means count, poisson //confidence interval for the mean of count, assuming a Poisson distribution
webuse promonone, clear
ci proportions promoted //confidence interval for the proportion of promoted, assuming a binomial distribution
ci proportions promoted, wilson
ci proportions promoted, agresti
ci proportions promoted, jeffreys
webuse peas_normdist, clear
ci variances weight //confidence interval for the variance of weight
ci variances weight, sd bonett level(90)
cii means 166 19509 4379 //confidence interval for the mean with 166 observations, mean 19509 and standard deviation 4379
cii means 166 19509 4379, level(90)
cii proportions 10 1
cii variances 15 2.1
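*Trace: find out which statement goes wrong
capture program drop myprog
program myprog
version 15
syntax varname , [Prefix(string)]
local newname "`prefix'`varname'"
//with trace on, the line above shows up expanded as: local newname "new"
//`varname' is empty because syntax varname stores the variable in `varlist'
end
sysuse auto.dta, clear
set trace on //shows where things go wrong
myprog mpg, prefix("new")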

**Nested programs
capture program drop simple
program simple //a simple program
version 15
args msg
if "`msg'"=="hello" {
display "you said hello"
}
else display "you did not say hello"
display "good-bye"
end
set trace on //trace the program as it runs
simple hello
simple no
capture program drop myprog2
program myprog2
args msg
simple "`msg'"
display "good"
end
capture program drop myprog1
program myprog1
args msg
myprog2 "`msg'"
display "bye"
end
set trace on //select and run the following lines together
set tracenumber on //show the nesting level as a number on each traced line
set tracedepth 2 //trace only down to nesting depth 2
myprog1 hello
set tracedepth 32000
set tracenumber off

*unab expands abbreviated variable names to full names (several calls below fail on purpose to show the error messages)
sysuse auto, clear
unab x : mpg wei for, name(myopt())
display "`x'"
unab x : junk
unab x : mpg wei, max(1) name(myopt())
unab x : mpg wei, max(1) name(myopt()) min(0)
unab x : mpg wei, min(3) name(myopt())
unab x : mpg wei, min(3) name(myopt()) max(10)
unab x : mpg wei, min(3) max(10)

gen time = _n //time-series data
tsset time
tsunab mylist : l(1/3).mpg
display "`mylist'"
tsunab mylist : l(1/3).(price turn displ)
di "`mylist'"
unab varn : mp
display "`varn'"
set varabbrev off //with varabbrev off, unab can no longer expand abbreviations
unab varn : mp
set varabbrev on
unab varn : mp
display "`varn'"

*unabcmd expands an abbreviated built-in command to its full name
unabcmd gen
return list //shows the full name
unabcmd kappa
return list

*viewsource shows the source of any ado or Mata file
viewsource ml.ado
viewsource xtreg.ado
viewsource panelsetup.mata

*Writing loops with while
capture program drop demo
program demo
local i=1
while `i'>0 {
display "i is now `i'"
local i=`i'-1 //can also be written as local --i
}
display "done"
end
set trace on
demo i=2 //the argument is ignored because the program sets i itself

*nobreak keeps a block from being interrupted; a break is Ctrl+Pause/Break
capture program drop breakprocess
program breakprocess
args myv
nobreak {
rename `myv' Result
list Result in 1/5
rename Result `myv'
}
end
sysuse auto.dta, clear
set trace on
breakprocess mpg

*Export the variance-covariance matrix of a regression
*Run after an estimation command with six regressors plus a constant
capture program drop yourprog
program yourprog
args var2 var3 var4 var5 var6 var7
local cols "B C D E F G"
matrix list e(V)
matrix x = e(V)
putexcel set var_cov, replace
forvalues i=2/7 {
local col : word `=`i'-1' of `cols'
putexcel A`i'=("`var`i''") `col'1=("`var`i''") //row label and column header for each regressor
}
putexcel A8=("_cons")
putexcel H1=("_cons")
putexcel A9=("variance-covariance matrix")
putexcel B2=matrix(x)
end

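The corresponding do-files are all kept in the 计量经济圈 community and can be downloaded for reference.

Recommended reading:

0. Release of spatial weight matrices of all kinds for every prefecture-level city in China

1. Complete program and data for the 160 main steps of matching the industrial enterprise database

2. Release of annual average PM2.5 data for Chinese prefecture-level cities, 1998-2016

3. Release of the authoritative version of the China marketization index, 1997-2014

4. Circulation of China's CO2 data by province and industry, 2005-2015

5. A how-to guide to matching methods: 16 articles worth bookmarking

6. A how-to guide to endogeneity problems: 22 widely circulated articles

7. A how-to guide to panel data models: 16 must-read articles

8. 135 articles used in empirical research: a toolkit for social scientists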
