stats_model module
Time Series Statistical Modeling Module.
This module implements various time series models for analyzing and forecasting financial and economic data, with a focus on ARIMA for conditional mean modeling and GARCH for volatility modeling. It supports both univariate and multivariate approaches.
Key Components:
- ModelARIMA: ARIMA model for conditional mean forecasting
- ModelGARCH: GARCH model for volatility forecasting
- ModelMultivariateGARCH: Multivariate GARCH for correlation/covariance modeling
- ModelFactory: Factory pattern for creating appropriate model instances
Key Functions:
- run_arima: Convenience function for ARIMA modeling
- run_garch: Convenience function for GARCH modeling
- run_multivariate_garch: Function for multivariate GARCH analysis
- calculate_correlation_matrix: Compute correlation matrices
- calculate_portfolio_risk: Assess risk based on volatility and correlations
Supported Models:
- ARIMA(p,d,q): For modeling conditional means
- GARCH(p,q): For modeling conditional volatility
- CCC-GARCH: Constant Conditional Correlation
- DCC-GARCH: Dynamic Conditional Correlation with EWMA
Typical Usage Flow:
1. Start with prepared data from data_processor.py
2. Fit ARIMA models to capture conditional mean
3. Extract residuals and fit GARCH models for volatility
4. For multiple series, analyze correlations with multivariate GARCH
5. Generate forecasts and risk metrics
The models in this module follow standard econometric practices and use statsmodels and arch packages for the underlying implementations.
- class timeseries_compute.stats_model.ModelARIMA(data: DataFrame, order: Tuple[int, int, int] = (1, 1, 1), steps: int = 5)
Bases: object
Applies the ARIMA (AutoRegressive Integrated Moving Average) model to every column of a DataFrame.
- data
The input data on which ARIMA models will be applied.
- Type:
pd.DataFrame
- order
The (p, d, q) order of the ARIMA model.
- Type:
Tuple[int, int, int]
- steps
The number of steps to forecast.
- Type:
int
- models
A dictionary to store ARIMA models for each column.
- Type:
Dict[str, ARIMA]
- fits
A dictionary to store fitted ARIMA models for each column.
- Type:
Dict[str, ARIMA]
- fit() -> Dict[str, ARIMA]
Fits an ARIMA model to each column in the dataset.
- Returns:
A dictionary whose keys are column names and whose values are the fitted ARIMA models.
- Return type:
Dict[str, ARIMA]
- class timeseries_compute.stats_model.ModelFactory
Bases: object
Factory class for creating instances of different statistical models.
- create_model(model_type: str, **kwargs) -> Any
Static method that creates and returns a model instance based on the provided model_type.
- static create_model(model_type: str, data: DataFrame, order: Tuple[int, int, int] = (1, 1, 1), steps: int = 5, p: int = 1, q: int = 1, dist: str = 'normal', mv_model_type: str = 'cc') -> ModelARIMA | ModelGARCH | ModelMultivariateGARCH
Creates and returns an instance of a statistical model based on the specified type.
- Parameters:
model_type (str) – Type of model to create (“ARIMA”, “GARCH”, or “MVGARCH”)
data (pd.DataFrame) – Input data for the model
order (Tuple[int, int, int]) – (p,d,q) order for ARIMA models
steps (int) – Forecast horizon for ARIMA models
p (int) – GARCH order parameter
q (int) – ARCH order parameter
dist (str) – Error distribution for GARCH models
mv_model_type (str) – Type of multivariate GARCH model (“cc” or “dcc”)
- Returns:
The created model instance
- Return type:
Union[ModelARIMA, ModelGARCH, ModelMultivariateGARCH]
- Raises:
ValueError – If an unsupported model type is provided
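The dispatch described above can be sketched as follows. This is a minimal illustration of the factory pattern, not the module's implementation: the `Stub*` classes are hypothetical stand-ins for ModelARIMA, ModelGARCH, and ModelMultivariateGARCH, and the exact, case-sensitive matching of `model_type` is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Tuple


# Hypothetical stand-ins for the real model classes; they only store arguments.
@dataclass
class StubARIMA:
    data: Any
    order: Tuple[int, int, int] = (1, 1, 1)
    steps: int = 5


@dataclass
class StubGARCH:
    data: Any
    p: int = 1
    q: int = 1
    dist: str = "normal"


@dataclass
class StubMVGARCH:
    data: Any
    p: int = 1
    q: int = 1
    model_type: str = "cc"


def create_model(model_type, data, order=(1, 1, 1), steps=5,
                 p=1, q=1, dist="normal", mv_model_type="cc"):
    """Return a model instance for the requested type.

    Assumption: model_type is matched exactly against the documented strings.
    """
    if model_type == "ARIMA":
        return StubARIMA(data, order=order, steps=steps)
    if model_type == "GARCH":
        return StubGARCH(data, p=p, q=q, dist=dist)
    if model_type == "MVGARCH":
        return StubMVGARCH(data, p=p, q=q, model_type=mv_model_type)
    # Anything else is rejected, mirroring the documented ValueError.
    raise ValueError(f"Unsupported model type: {model_type!r}")


example = create_model("ARIMA", data=[0.01, -0.02, 0.015], order=(2, 0, 1))
print(example)
```

Each model type consumes only the keyword arguments relevant to it, which is why the signature carries ARIMA, GARCH, and multivariate parameters side by side.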
- class timeseries_compute.stats_model.ModelGARCH(data: DataFrame, p: int = 1, q: int = 1, dist: str = 'normal')
Bases: object
Represents a GARCH model for time series data.
- data
The input time series data.
- Type:
pd.DataFrame
- p
The order of the GARCH model for the lag of the squared residuals.
- Type:
int
- q
The order of the GARCH model for the lag of the conditional variance.
- Type:
int
- dist
The distribution to use for the GARCH model (e.g., ‘normal’, ‘t’).
- Type:
str
- models
A dictionary to store models for each column of the data.
- Type:
Dict[str, arch_model]
- fits
A dictionary to store fitted models for each column of the data.
- Type:
Dict[str, arch_model]
- fit() -> Dict[str, arch_model]
Fits a GARCH model to each column of the data.
- Returns:
A dictionary whose keys are column names and whose values are the fitted GARCH models.
- Return type:
Dict[str, arch_model]
- forecast(steps: int) -> Dict[str, float]
Generates forecasted variance for each fitted model.
- Parameters:
steps (int) – The number of steps ahead to forecast.
- Returns:
A dictionary where keys are column names and values are the forecasted variances for the specified horizon.
- Return type:
Dict[str, float]
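For intuition about what a multi-step variance forecast involves, the sketch below applies the standard GARCH(1,1) recursion by hand. It is an illustration under stated assumptions, not this method's code: the function name and parameter values are hypothetical, and how the method reduces a multi-step horizon to a single float per column is not stated in this reference.

```python
def garch11_variance_forecast(omega, alpha, beta, last_resid, last_variance, steps):
    """Forecast conditional variance 1..steps ahead for a GARCH(1,1) model.

    omega: constant term; alpha: coefficient on the lagged squared residual;
    beta: coefficient on the lagged conditional variance.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    # One step ahead uses the last observed residual and variance.
    variance = omega + alpha * last_resid ** 2 + beta * last_variance
    forecasts = [variance]
    # Further ahead, the expected squared residual equals the forecast variance,
    # so the recursion collapses to omega + (alpha + beta) * previous forecast.
    for _ in range(steps - 1):
        variance = omega + (alpha + beta) * variance
        forecasts.append(variance)
    return forecasts


# Illustrative parameters only; real values come from a fitted model.
path = garch11_variance_forecast(0.1, 0.1, 0.8, last_resid=2.0, last_variance=1.0, steps=3)
print(path)
```

When alpha + beta < 1 the forecasts decay toward the long-run variance omega / (1 - alpha - beta).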
- class timeseries_compute.stats_model.ModelMultivariateGARCH(data: DataFrame, p: int = 1, q: int = 1, model_type: str = 'cc')
Bases: object
Implements multivariate GARCH models including CC-GARCH and DCC-GARCH.
- timeseries_compute.stats_model.calculate_correlation_matrix(standardized_residuals: DataFrame) -> DataFrame
Calculate the constant conditional correlation matrix from standardized residuals.
- Parameters:
standardized_residuals (pd.DataFrame) – DataFrame of standardized residuals from GARCH models
- Returns:
Correlation matrix
- Return type:
pd.DataFrame
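As a rough sketch of the idea, a constant conditional correlation matrix is the sample correlation of residuals after each has been divided by its conditional volatility. The example below assumes a plain Pearson correlation via pandas; the helper name and the numbers are made up for illustration.

```python
import numpy as np
import pandas as pd


def constant_correlation(standardized_residuals: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation of standardized residuals (assumed estimator)."""
    return standardized_residuals.corr()


# Made-up residuals and conditional volatilities for two series.
residuals = pd.DataFrame({
    "A": [0.02, -0.01, 0.03, -0.02, 0.01],
    "B": [0.01, -0.02, 0.02, -0.01, 0.03],
})
volatilities = pd.DataFrame({
    "A": [0.020, 0.020, 0.030, 0.020, 0.020],
    "B": [0.020, 0.030, 0.020, 0.020, 0.030],
})

# Standardize: residual divided by its conditional volatility, element-wise.
standardized = residuals / volatilities
corr = constant_correlation(standardized)
print(corr)
```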
- timeseries_compute.stats_model.calculate_dynamic_correlation(ewma_cov: Series, ewma_vol1: Series, ewma_vol2: Series) -> Series
Calculate dynamic conditional correlation from EWMA covariance and volatilities.
- Parameters:
ewma_cov (pd.Series) – EWMA covariance between two series
ewma_vol1 (pd.Series) – EWMA volatility of first series
ewma_vol2 (pd.Series) – EWMA volatility of second series
- Returns:
Dynamic conditional correlation
- Return type:
pd.Series
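The relationship between these inputs is the usual definition of correlation applied at each time step: covariance divided by the product of the two volatilities. The sketch below assumes the volatility inputs are standard deviations rather than variances; the helper name and values are illustrative.

```python
import pandas as pd


def dynamic_correlation(ewma_cov: pd.Series, ewma_vol1: pd.Series,
                        ewma_vol2: pd.Series) -> pd.Series:
    """Element-wise correlation: cov_t / (vol1_t * vol2_t).

    Assumes the volatilities are standard deviations, not variances.
    """
    return ewma_cov / (ewma_vol1 * ewma_vol2)


cov = pd.Series([0.0002, 0.0003, -0.0001])
vol1 = pd.Series([0.02, 0.02, 0.01])
vol2 = pd.Series([0.02, 0.03, 0.02])
rho = dynamic_correlation(cov, vol1, vol2)
print(rho)
```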
- timeseries_compute.stats_model.calculate_portfolio_risk(weights: ndarray, cov_matrix: ndarray) -> tuple
Calculate portfolio variance and volatility for given weights and covariance matrix.
- Parameters:
weights (np.ndarray) – Array of portfolio weights
cov_matrix (np.ndarray) – Covariance matrix
- Returns:
(portfolio_variance, portfolio_volatility)
- Return type:
tuple
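The quantities returned here correspond to the standard quadratic form: portfolio variance is w' Σ w and volatility is its square root. A minimal NumPy sketch, with an illustrative helper name and made-up inputs:

```python
import numpy as np


def portfolio_risk(weights, cov_matrix):
    """Return (variance, volatility) for the given weights and covariance matrix."""
    w = np.asarray(weights, dtype=float)
    sigma = np.asarray(cov_matrix, dtype=float)
    variance = float(w @ sigma @ w)        # quadratic form w' Sigma w
    volatility = float(np.sqrt(variance))  # volatility is the square root
    return variance, volatility


weights = np.array([0.5, 0.5])
cov_matrix = np.array([[0.04, 0.01],
                       [0.01, 0.09]])
variance, volatility = portfolio_risk(weights, cov_matrix)
print(variance, volatility)
```

With equal weights, the variance is 0.25 * 0.04 + 0.25 * 0.09 + 2 * 0.25 * 0.01 = 0.0375.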
- timeseries_compute.stats_model.calculate_stats(series: Series) -> Dict[str, float]
Calculate comprehensive statistics for a time series.
- Parameters:
series (pd.Series) – Time series data to analyze
- Returns:
- Dictionary containing the following statistics:
’n’: Number of observations in the series
’mean’: Arithmetic mean of the series
’median’: Median value of the series
’min’: Minimum value in the series
’max’: Maximum value in the series
’std’: Standard deviation of the series
’skew’: Skewness of the distribution (asymmetry measure)
’kurt’: Kurtosis of the distribution (tail heaviness measure)
’annualized_vol’: Annualized volatility, calculated as standard deviation * sqrt(250) for daily data
- Return type:
Dict[str, float]
Example
>>> import pandas as pd
>>> from timeseries_compute.stats_model import calculate_stats
>>> series = pd.Series([1.2, 2.3, 3.4, 4.5, 5.6])
>>> stats = calculate_stats(series)
>>> print(f"Mean: {stats['mean']}, Std Dev: {stats['std']}")
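To make the listed statistics concrete, the sketch below computes the same keys with pandas. It is not the function's source: it assumes pandas defaults (sample standard deviation with ddof=1, bias-corrected skewness and excess kurtosis), and the helper name is hypothetical.

```python
import math

import pandas as pd


def summary_stats(series: pd.Series) -> dict:
    """Compute the documented statistics using pandas defaults (an assumption)."""
    std = series.std()  # sample standard deviation (ddof=1)
    return {
        "n": series.count(),
        "mean": series.mean(),
        "median": series.median(),
        "min": series.min(),
        "max": series.max(),
        "std": std,
        "skew": series.skew(),
        "kurt": series.kurt(),
        # Annualization as documented: daily std scaled by sqrt(250).
        "annualized_vol": std * math.sqrt(250),
    }


stats = summary_stats(pd.Series([1.2, 2.3, 3.4, 4.5, 5.6]))
print(stats)
```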
- timeseries_compute.stats_model.construct_covariance_matrix(volatilities: list, correlation: float) -> ndarray
Construct a 2x2 covariance matrix using volatilities and correlation.
- Parameters:
volatilities (list) – List of volatilities [vol1, vol2]
correlation (float) – Correlation coefficient
- Returns:
2x2 covariance matrix
- Return type:
np.ndarray
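The construction follows from cov(1,2) = correlation * vol1 * vol2, with the squared volatilities on the diagonal. A minimal sketch, assuming the volatilities are standard deviations; the helper name is illustrative.

```python
import numpy as np


def covariance_2x2(volatilities, correlation):
    """Build [[v1^2, rho*v1*v2], [rho*v1*v2, v2^2]] from two volatilities."""
    vol1, vol2 = volatilities
    off_diagonal = correlation * vol1 * vol2
    return np.array([[vol1 ** 2, off_diagonal],
                     [off_diagonal, vol2 ** 2]])


cov = covariance_2x2([0.2, 0.3], 0.5)
print(cov)
```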
- timeseries_compute.stats_model.run_arima(df_stationary: DataFrame, p: int = 1, d: int = 1, q: int = 1, forecast_steps: int = 5) -> Tuple[Dict[str, object], Dict[str, float | List[float]]]
Runs an ARIMA model on stationary time series data.
This function fits ARIMA(p,d,q) models to each column in the provided DataFrame and generates forecasts for the specified number of steps ahead. It logs only core information about the models and forecasts.
- Parameters:
df_stationary (pd.DataFrame) – The DataFrame with stationary time series data
p (int) – Autoregressive lag order, default=1
d (int) – Degree of differencing, default=1
q (int) – Moving average lag order, default=1
forecast_steps (int) – Number of steps to forecast, default=5
- Returns:
First element: Dictionary of fitted ARIMA models for each column
Second element: Dictionary of forecasted values for each column
- Return type:
Tuple[Dict[str, object], Dict[str, Union[float, List[float]]]]
- timeseries_compute.stats_model.run_garch(df_stationary: DataFrame, p: int = 1, q: int = 1, dist: str = 'normal', forecast_steps: int = 5) -> Tuple[Dict[str, Any], Dict[str, float]]
Runs the GARCH model on the provided stationary DataFrame.
This function fits GARCH(p,q) models to each column in the provided DataFrame and generates volatility forecasts. It logs only core information about the models and forecasts.
- Parameters:
df_stationary (pd.DataFrame) – The stationary time series data for GARCH modeling
p (int) – The GARCH lag order, default=1
q (int) – The ARCH lag order, default=1
dist (str) – The error distribution (‘normal’, ‘t’, etc.), default=‘normal’
forecast_steps (int) – The number of steps to forecast, default=5
- Returns:
First element: Dictionary of fitted GARCH models for each column
Second element: Dictionary of forecasted volatility values for each column
- Return type:
Tuple[Dict[str, Any], Dict[str, float]]
- timeseries_compute.stats_model.run_multivariate_garch(df_stationary: DataFrame, arima_fits: Dict[str, Any] | None = None, garch_fits: Dict[str, Any] | None = None, lambda_val: float = 0.95) -> Dict[str, Any]
Runs multivariate GARCH analysis on the provided stationary DataFrame.
This function implements both Constant Conditional Correlation (CCC) and Dynamic Conditional Correlation (DCC) GARCH models. It uses the supplied ARIMA and GARCH fits, or fits new models when none are provided.
- Parameters:
df_stationary (pd.DataFrame) – The stationary time series data for GARCH modeling
arima_fits (dict, optional) – Dictionary of fitted ARIMA models for each column
garch_fits (dict, optional) – Dictionary of fitted GARCH models for each column
lambda_val (float, optional) – EWMA decay factor for DCC model. Defaults to 0.95.
- Returns:
- Dictionary containing multivariate GARCH results:
’arima_residuals’: DataFrame of ARIMA residuals
’conditional_volatilities’: DataFrame of conditional volatilities
’standardized_residuals’: DataFrame of standardized residuals
’cc_correlation’: Constant conditional correlation matrix
’cc_covariance_matrix’: Covariance matrix using CCC
’dcc_correlation’: Series of dynamic conditional correlations
’dcc_covariance’: Series of dynamic conditional covariances
- Return type:
dict
Example
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from timeseries_compute.stats_model import run_multivariate_garch
>>> # Create stationary returns for two assets
>>> returns = pd.DataFrame({
...     'Asset1': [0.01, -0.02, 0.015, -0.01, 0.02],
...     'Asset2': [0.015, -0.01, 0.02, -0.015, 0.01]
... })
>>> # Run multivariate GARCH analysis
>>> results = run_multivariate_garch(returns)
>>> # Access the correlation matrix
>>> print(results['cc_correlation'])
>>> # Plot dynamic correlation over time
>>> plt.plot(results['dcc_correlation'])
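The role of lambda_val can be illustrated with a textbook EWMA recursion, cov_t = lambda * cov_(t-1) + (1 - lambda) * x_t * y_t. The sketch below is an assumption about the general technique rather than this function's code: the seeding with the first cross product, the zero-mean inputs, and the helper names are all illustrative choices.

```python
import math


def ewma_covariance(x, y, lambda_val=0.95):
    """EWMA of the cross products x_t * y_t for zero-mean inputs.

    Assumption: the recursion is seeded with the first cross product.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    cov = x[0] * y[0]
    path = [cov]
    for xi, yi in zip(x[1:], y[1:]):
        # Older information decays by lambda_val; the newest gets 1 - lambda_val.
        cov = lambda_val * cov + (1.0 - lambda_val) * xi * yi
        path.append(cov)
    return path


def ewma_correlation(x, y, lambda_val=0.95):
    """Dynamic correlation: EWMA covariance over the product of EWMA volatilities."""
    cov = ewma_covariance(x, y, lambda_val)
    var_x = ewma_covariance(x, x, lambda_val)
    var_y = ewma_covariance(y, y, lambda_val)
    result = []
    for c, vx, vy in zip(cov, var_x, var_y):
        denom = math.sqrt(vx * vy)
        # Guard against a zero variance, which leaves correlation undefined.
        result.append(c / denom if denom > 0 else math.nan)
    return result


z1 = [1.0, 2.0, -1.0, 0.5]
z2 = [1.0, -1.0, -0.5, 1.0]
print(ewma_covariance(z1, z2))
print(ewma_correlation(z1, z2))
```

A lambda_val closer to 1 makes the estimate smoother and slower to react; a smaller value weights recent observations more heavily.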