I am running OLS on products by month. This works fine for a single product, but my dataframe contains many products. If I create a groupby object, OLS raises an error.
linear_regression_df:

      product_desc  period_num  TOTALS
    0    product_a           1      53
    3    product_a           2      52
    6    product_a           3      50
    1    product_b           1      44
    4    product_b           2      43
    7    product_b           3      41
    2    product_c           1      36
    5    product_c           2      35
    8    product_c           3      34

    from pandas import DataFrame, Series
    import statsmodels.api as sm

    linear_regression_grouped = linear_regression_df.groupby(['product_desc'])

    X = linear_regression_grouped['period_num']
    y = linear_regression_grouped['TOTALS']

    model = sm.OLS(y, X)
    results = model.fit()
I get this error on the sm.OLS() line:
ValueError: unrecognized data structures: <class 'pandas.core.groupby.SeriesGroupBy'>
So how can I go through my dataframe and apply sm.OLS() for each product_desc?
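sm.OLS() expects array-like data such as a Series or DataFrame, not a groupby object, so you need to fit one model per product. You could do something like this: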
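    import pandas as pd
    import statsmodels.api as sm

    # Fit a separate OLS model for each product
    for products in linear_regression_df.product_desc.unique():
        # Select only the rows for the current product
        tempdf = linear_regression_df[linear_regression_df.product_desc == products]
        X = tempdf['period_num']
        y = tempdf['TOTALS']

        # Note: sm.OLS does not add an intercept; use sm.add_constant(X) if you want one
        model = sm.OLS(y, X)
        results = model.fit()

        print(results.params)  # Or whatever summary info you want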