State space modeling in Python(View / download notebook | Report a problem)
Update: (June, 2016) The notebook has been updated to include recent changes to the state space library.
Update: (February, 2015) The pull request has been merged, and state space models will be included in Statsmodels beginning with version 0.7. The code, below, has been updated from the original post to reflect the current design.
A baseline version of state space models in Python is now ready as a pull request to the Statsmodels project, at https://github.com/statsmodels/statsmodels/pull/1698. Before it’s description, here is the general description of a state space model (See Durbin and Koopman 2012 for all notation):
Integrating state space modeling into Python required three elements (so far):
- An implementation of the Kalman filter
- A Python wrapper for easily building State space models to be filtered
- A Python wrapper for Maximum Likelihood estimation of state space models based on the likelihood evaluation performed as a byproduct of the Kalman filter.
These three are implemented in the pull request in the files
model.py. The first is a Cython implementation of the Kalman filter which does all of the heavy lifting. By taking advantage of static typing, compilation to C, and direct calls to underlying BLAS and LAPACK libraries, it achieves speeds that are an order of magnitude above a straightforward implementation of the Kalman filter in Python (at least in test cases I have performed so far).
The second handles setting and updating the state space representation matrices ($d_t, Z_t, H_t, c_t, T_t, R_t, Q_t$) , maintaining appropriate dimensions, and making sure that underlying datatypes are consistent (a requirement for the Cython Kalman filter - if the wrong datatype is passed directly to one of the underlying BLAS or LAPACK functions, it will cause an error in the best case and a segmentation fault in the worst).
The third introduces the idea that the representation matrices are composed of two types of elements: those that are known (often ones and zeros) and those that are unknown: parameters. It defines an interface for retrieving start parameters, transforming parameters (for example to induce stationarity in ARIMA models), and updating the corresponding elements of the state space matrices. Finally, it takes advantage of the updating structure to define a
fit method which uses
scipy.optimize methods to perform maximum likelihood estimation.
The pull request contains, right now, one example of a fully-fledged econometric model estimatable via state space methods. The Seasonal Autoregressive Integrated Moving Average with eXogenous regressors model is implemented in the
sarimax.py file. The bulk of the file is in describing the specific form of the state space matrices for the SARIMAX model, defining methods for finding good starting parameters, and updating the matrices appropriately when new parameters are tried.
In descending from the
Model class (3, above), it is able to ignore any of the intricacies of the actual optimization calls, or the construction of standard estimation output like the variance / covariance matrix, etc, summary tables, etc.
In descending from the
Representation class (2, above), it directly has a
filter method to apply the Kalman filter, it is able to ignore worries about dimensions and datatypes, and it gets all of the filter output “for free”. For example, the loglikelihood, residuals, and fitted values come directly from output from the filter. Finally, all prediction, dynamic prediction, and forecasting are performed in the generic representation results class and can be painlessly used by the SARIMAX model.
Two example notebooks using the resultant SARIMAX class:
- Replication of the Stata ARIMA examples: http://nbviewer.ipython.org/gist/ChadFulton/5127108f4c7025ed2648
- Demonstration of using missing data in SARIMAX model: http://nbviewer.ipython.org/gist/ChadFulton/9bfd32bd6e4e406fc351
Extensibility: Local Linear Trend Model
Of course, Statsmodels already has an ARIMAX class, so the marginal contribution of the SARIMAX model is mostly the ability to work with Seasonal (or arbitrary lag polynomial) models, and the ability to work with missing values. However it is just an example of the kind of models that can be easily produced from the given framework, which was specifically designed to be extensible.
As an example, I will present below the code for a full implementation of the Local Linear Trend model. This model has the form (see Durbin and Koopman 2012, Chapter 3.2 for all notation and details):
It is easy to see that this can be cast into state space form as:
Notice that much of the state space representation is composed of known values; in fact the only parts in which parameters to be estimated appear are in the variance / covariance matrices:
""" Univariate Local Linear Trend Model Author: Chad Fulton License: Simplified-BSD """ from __future__ import division, absolute_import, print_function import numpy as np import statsmodels.api as sm class LocalLinearTrend(sm.tsa.statespace.MLEModel): def __init__(self, endog, **kwargs): # Model order k_states = k_posdef = 2 # Initialize the statespace super(LocalLinearTrend, self).__init__( endog, k_states=k_states, k_posdef=k_posdef, **kwargs ) # Initialize the matrices self['design'] = np.r_[1, 0] self['transition'] = np.array([[1, 1], [0, 1]]) self['selection'] = np.eye(k_states) # Initialize the state space model as approximately diffuse self.initialize_approximate_diffuse() # Because of the diffuse initialization, burn first two # loglikelihoods self.loglikelihood_burn = 2 # Cache some indices idx = np.diag_indices(k_posdef) self._state_cov_idx = ('state_cov', idx, idx) # Setup parameter names self._param_names = ['sigma2.measurement', 'sigma2.level', 'sigma2.trend'] @property def start_params(self): # Simple start parameters: just set as 0.1 return np.r_[0.1, 0.1, 0.1] def transform_params(self, unconstrained): # Parameters must all be positive for likelihood evaluation. # This transforms parameters from unconstrained parameters # returned by the optimizer to ones that can be used in the model. return unconstrained**2 def untransform_params(self, constrained): # This transforms parameters from constrained parameters used # in the model to those used by the optimizer return constrained**0.5 def update(self, params, **kwargs): # The base Model class performs some nice things like # transforming the params and saving them params = super(LocalLinearTrend, self).update(params, **kwargs) # Extract the parameters measurement_variance = params level_variance = params trend_variance = params # Observation covariance self['obs_cov', 0, 0] = measurement_variance # State covariance self[self._state_cov_idx] = [level_variance, trend_variance]
Using this simple model, we can estimate the parameters from a local linear trend model. The following example is from Commandeur and Koopman (2007), section 3.4., modeling motor vehicle fatalities in Finland.
You can find the dataset here: http://www.ssfpack.com/CKbook.html
import pandas as pd from datetime import datetime # Load Dataset df = pd.read_table('NorwayFinland.txt', skiprows=1, header=None) df.columns = ['date', 'nf', 'ff'] df.index = pd.date_range(start=datetime(df.date, 1, 1), end=datetime(df.iloc[-1, 0], 1, 1), freq='AS') # Log transform df['lff'] = np.log(df['ff']) # Setup the model mod = LocalLinearTrend(df['lff']) # Fit it using MLE (recall that we are fitting the three variance parameters) res = mod.fit() print(res.summary())
Statespace Model Results ============================================================================== Dep. Variable: lff No. Observations: 34 Model: LocalLinearTrend Log Likelihood 26.740 Date: Sun, 22 Jan 2017 AIC -47.480 Time: 12:59:18 BIC -42.901 Sample: 01-01-1970 HQIC -45.919 - 01-01-2003 Covariance Type: opg ====================================================================================== coef std err z P>|z| [0.025 0.975] -------------------------------------------------------------------------------------- sigma2.measurement 0.0032 0.003 0.984 0.325 -0.003 0.010 sigma2.level 4.571e-10 0.006 7.12e-08 1.000 -0.013 0.013 sigma2.trend 0.0015 0.001 1.093 0.274 -0.001 0.004 =================================================================================== Ljung-Box (Q): 26.40 Jarque-Bera (JB): 0.64 Prob(Q): 0.70 Prob(JB): 0.72 Heteroskedasticity (H): 0.74 Skew: -0.22 Prob(H) (two-sided): 0.63 Kurtosis: 2.46 =================================================================================== Warnings:  Covariance matrix calculated using the outer product of gradients (complex-step).
%matplotlib inline import matplotlib.pyplot as plt fig, ax = plt.subplots(figsize=(10,4)) # Perform dynamic prediction and forecasting ndynamic = 5 predict_res = res.get_prediction(alpha=0.05, dynamic=df['lff'].shape-ndynamic) predict = predict_res.predicted_mean ci = predict_res.conf_int(alpha=0.05) # Plot the results ax.plot(df['lff'], 'k.', label='Observations'); ax.plot(df.index[:-ndynamic], predict[:-ndynamic], label='One-step-ahead Prediction'); ax.plot(df.index[:-ndynamic], ci.iloc[:-ndynamic], 'k--', alpha=0.5); ax.plot(df.index[-ndynamic:], predict[-ndynamic:], 'r', label='Dynamic Prediction'); ax.plot(df.index[-ndynamic:], ci.iloc[-ndynamic:], 'k--', alpha=0.5); # Cleanup the image ax.set_ylim((5, 8)); legend = ax.legend(loc='upper right'); legend.get_frame().set_facecolor('w')
Commandeur, Jacques J. F., and Siem Jan Koopman. 2007. An Introduction to State Space Time Series Analysis. Oxford ; New York: Oxford University Press. Durbin, James, and Siem Jan Koopman. 2012. Time Series Analysis by State Space Methods: Second Edition. Oxford University Press.