Python Statsmodels MixedLM (Mixed Linear Model) Random Effects

印度阿三17 2019-10-09


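I'm a bit confused by the output of Statsmodels MixedLM, and I'm hoping someone can explain it.

I have a large dataset of single-family homes, including the previous two sale prices and sale dates for each property. I have geocoded the entire dataset and fetched the elevation of each property. I'm trying to understand how the relationship between elevation and property price appreciation varies between cities.

I used a statsmodels mixed linear model to regress price appreciation on elevation, holding a number of other factors constant, with city as the grouping variable.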

import statsmodels.formula.api as smf

md = smf.mixedlm('price_relative_ind ~ Elevation + YearBuilt + Sale_Amount_1 + LivingSqFt', data=Miami_SF, groups=Miami_SF['City'])

mdf = md.fit()

mdf.random_effects

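Typing mdf.random_effects returns a list of coefficients. Can I interpret this list as the slope for each city (i.e., the individual regression coefficient relating elevation to sale price appreciation)? Or are these results the intercepts for each city?

Answer:

I'm currently trying to get my head around random effects in MixedLM as well. Looking at the docs, it seems that using only the groups parameter, without exog_re or re_formula, simply adds a random intercept for each group. An example from the docs: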

# A basic mixed model with fixed effects for the columns of exog and a random intercept for each distinct value of group:

model = sm.MixedLM(endog, exog, groups)
result = model.fit()

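So in this case you would expect random_effects to return the random intercept for each city (that city's deviation from the overall fixed intercept), not coefficients/slopes.

To add a random slope for one of your other features, you can do something similar to this example from the statsmodels Jupyter tutorial, either with both a slope and an intercept: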
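To make this concrete, here is a minimal sketch on synthetic data. The column names y, x, g and the simulated group shifts are assumptions for illustration and are not part of the question's dataset. It fits a model with only groups specified and shows that random_effects holds a single value per group.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (hypothetical): 8 groups, each with its own intercept shift.
rng = np.random.default_rng(0)
n_groups, n_per_group = 8, 40
group = np.repeat(np.arange(n_groups), n_per_group)
group_shift = rng.normal(0.0, 2.0, size=n_groups)
x = rng.normal(size=group.size)
y = 1.0 + 0.5 * x + group_shift[group] + rng.normal(0.0, 1.0, size=group.size)
df = pd.DataFrame({"y": y, "x": x, "g": group})

# Only `groups` is given (no re_formula), so the model has a random intercept.
result = smf.mixedlm("y ~ x", data=df, groups=df["g"]).fit()

# One entry per group; each entry holds one value, the group's random intercept.
effects = result.random_effects
for label, effect in effects.items():
    print(label, np.asarray(effect).ravel())
```

The printed values should track the simulated group_shift values rather than the slope on x, which is shared by all groups as a fixed effect.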

model = sm.MixedLM.from_formula(
    "Y ~ X", data, re_formula="X", groups=data["C"])

Or with only a random slope:

model = sm.MixedLM.from_formula(
    "Y ~ X", data, re_formula="0 + X", groups=data["C"])
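The difference between the two re_formula settings shows up in the shape of random_effects. The sketch below uses synthetic data with the same Y, X, C column names as the tutorial snippet; the simulated values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (hypothetical): each group has its own intercept and slope.
rng = np.random.default_rng(1)
n_groups, n_per_group = 10, 50
c = np.repeat(np.arange(n_groups), n_per_group)
u0 = rng.normal(0.0, 1.0, size=n_groups)  # per-group intercept deviations
u1 = rng.normal(0.0, 0.5, size=n_groups)  # per-group slope deviations
X = rng.normal(size=c.size)
Y = 2.0 + u0[c] + (1.0 + u1[c]) * X + rng.normal(0.0, 0.5, size=c.size)
data = pd.DataFrame({"Y": Y, "X": X, "C": c})

# Random intercept and random slope on X.
both = smf.mixedlm("Y ~ X", data, re_formula="X", groups=data["C"]).fit()

# Random slope on X only (the "0 +" removes the random intercept).
slopes_only = smf.mixedlm(
    "Y ~ X", data, re_formula="0 + X", groups=data["C"]).fit()

# Number of random-effect values stored for the first group in each model.
n_both = np.asarray(both.random_effects[0]).ravel().size
n_slopes = np.asarray(slopes_only.random_effects[0]).ravel().size
print(n_both, n_slopes)
```

With a random slope in the model, each group's entry in random_effects includes that group's slope deviation, which is what the question was hoping to read off.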

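Looking at the documentation for random_effects, it says that it returns the conditional mean of the random effects for each group. Since the only random effect here is the intercept, that value is simply the estimated random intercept for the group.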

MixedLMResults.random_effects()
    The conditional means of random effects given the data.

    Returns:    
        random_effects : dict
        A dictionary mapping the distinct group values to the means of the random effects for the group.
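Because these values are deviations from the fixed part of the model, a group-specific intercept can be recovered by adding the fixed intercept to each group's random effect. The following is a rough sketch on synthetic data; the city labels, column names, and simulated shifts are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (hypothetical): the true intercept of city c is 1.0 + shift[c].
rng = np.random.default_rng(2)
labels = list("ABCDEF")
cities = np.repeat(labels, 60)
shift = dict(zip(labels, rng.normal(0.0, 2.0, size=len(labels))))
x = rng.normal(size=cities.size)
noise = rng.normal(size=cities.size)
y = 1.0 + 0.5 * x + np.array([shift[c] for c in cities]) + noise
df = pd.DataFrame({"y": y, "x": x, "city": cities})

result = smf.mixedlm("y ~ x", data=df, groups=df["city"]).fit()

# Fixed (population-level) intercept plus each city's random intercept.
fixed_intercept = result.params["Intercept"]
city_intercepts = {
    city: fixed_intercept + float(np.asarray(effect).ravel()[0])
    for city, effect in result.random_effects.items()
}
for city in labels:
    print(city, round(city_intercepts[city], 3))
```

The slope on x stays the same for every city in this model; only the intercept varies.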

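Some useful resources include:

> Docs for the formula version of MixedLM
> Docs for the results of MixedLM
> This Jupyter notebook with examples of using MixedLM (Python)
> Stanford tutorial on mixed models (R)
> Tutorial on fixed and random effects (R)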

Source: https://www./content-1-495501.html