The main sorting commands are:
1. sort: ascending order
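sysuse auto, clear
keep make price mpg length foreign
* Sort on one variable
sort make            // sort make in ascending order
list
sort mpg             // sort mpg in ascending order
list
* Sort on several variables
sort mpg length      // sort by mpg first, then by length within equal mpg
list
sort length mpg      // sort by length first, then by mpg within equal length
list
2. gsort: ascending or descending order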
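sysuse auto, clear
keep make price mpg length foreign
* Sort on one variable
gsort +make          // ascending; equivalent to gsort make and sort make
list
gsort -make          // descending
list
* Sort on several variables
gsort -mpg -length   // sort by mpg first, then by length (both descending)
list
gsort -length mpg    // sort by length first (descending), then by mpg (ascending)
list
3. bysort: grouping and sorting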
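sysuse auto, clear
by foreign: sum price mpg length weight rep78    // descriptive statistics by group (auto.dta ships sorted by foreign)
sort rep78
by rep78: sum price mpg length weight            // by requires the data to be sorted on the grouping variable first
by rep78, sort: sum price mpg length weight
bysort rep78: sum price mpg length weight
* bysort varlist: is equivalent to by varlist, sort: and can be abbreviated to bys
clear
input str2 v1 v2
A 3
B 4
A 1
A 1
A 2
B 5
end
bysort v1 v2: gen num1 = _N     // sort and group by v1 and v2; num1 is the number of observations in each (v1, v2) group
bysort v1 (v2): gen num2 = _N   // v2 is used for sorting only, not grouping; num2 is the number of observations in each v1 group
* After sorting, num1 is 2 2 1 1 1 1 and num2 is 4 4 4 4 2 2
* Example 1
* Event study: drop events that fall inside a trading suspension
use 事件列表, clear
joinby stkcd using 停复牌           // pairs each event with every suspension record of the same stock; unmatched stocks are dropped by default
gen date1 = date(date, "YMD")
bysort stkcd date: gen num1 = _N    // number of pairs per event before filtering
drop if date1 >= startdate & date1 <= enddate
bysort stkcd date: gen num2 = _N    // number of pairs per event after filtering
drop if num1 != num2                // a pair was removed, so the event fell inside a suspension
keep stkcd date
duplicates drop
* Example 2
* Computing the Herfindahl index (HHI)
use 赫芬达尔指数, clear
sort year industry
bysort year industry: egen sumsize = sum(资产总计)   // with egen, sum() returns the group total; with gen, sum() returns the running (cumulative) sum
gen ratio = (资产总计/sumsize)^2
bysort year industry: egen HHI = sum(ratio)
drop sumsize ratio
Sorting city names
use 城市列表, clear
replace city = ustrfrom(city, "gb18030", 1)
sort city    // in Stata 14 and Stata 15, Chinese characters are sorted by their UTF-8 encoding order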
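The comment in Example 2 contrasts sum() under egen with sum() under gen. The following is a minimal sketch of that difference on a toy dataset. The variable names g, x, run_sum and grp_total are hypothetical and chosen only for illustration. It uses egen's total(), which is the name under which recent Stata versions document this function; sum() inside egen, as in the example above, gives the same group total.

```stata
* Toy data: two groups, two observations each (hypothetical variables g and x)
clear
input str1 g x
A 1
A 2
B 3
B 4
end

* gen with sum(): running (cumulative) sum within each group, in order of x
bysort g (x): gen run_sum = sum(x)

* egen with total(): the group total, repeated on every row of the group
bysort g: egen grp_total = total(x)

* run_sum is 1 3 3 7 and grp_total is 3 3 7 7
list, sepby(g)
```

In the HHI calculation the group total is what is needed, because each firm's share is its assets divided by the industry-year total, which is why egen is used there rather than gen.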