The class Dynamic Generalized Linear Model (DGLM), which is the core of the PyBATS package.

The PyBATS library supports many types of DGLMs - Poisson, Bernoulli, Normal (a DLM), and Binomial.

A Dynamic Generalized Linear Model is defined by:

• The exponential family of the observations $y_t$: Normal, Poisson, Bernoulli, or Binomial
• The components in the state vector: Trend, Regression, Seasonal, Holiday, and Latent Factor

A DGLM is a linear state space model, defined by the dynamic regression:

\begin{aligned} y_t &\sim p(\lambda_t) \\ \lambda_{t} &= F_{t}' \theta_{t} \end{aligned}

Where:

• $p(\lambda_t)$ is the forecast distribution for the observation $y_t$
• $\theta_t$ is the state vector, containing the model coefficients
• $F_t$ is the regression vector, containing all of the predictors $X_t$

A PyBATS DGLM has three core methods.

## classdglm[source]

dglm(a0=None, R0=None, nregn=0, ntrend=0, nlf=0, nhol=0, seasPeriods=[], seasHarmComponents=[], deltrend=1, delregn=1, dellf=1, delhol=1, delseas=1, rho=1, interpolate=True, adapt_discount=False, adapt_factor=0.5, discount_forecast=False)

Below is a very simple example of manually defining a DGLM. However, in most situations, DGLMs will be defined automatically through analysis or through define_models.

This model has 1 trend coefficent and 2 regression coefficients. The trend is simply an intercept. The 2 regression coefficients mean that we have 2 predictor variables.

import numpy as np
from pybats.dglm import pois_dglm

a = np.array([1, 1, 1])
R = np.eye(3)
mod = pois_dglm(a, R, ntrend=1, nregn=2, deltrend=1, delregn=.9)

mod.get_coef()

Mean Standard Deviation
Intercept 1 1.0
Regn 1 1 1.0
Regn 2 1 1.0

mod.get_coef provides the mean and standard deviation of the state vector $\theta$. The mean vector and full variance matrix is also available using mod.a and mod.R.

deltrend is the discount factor on the trend component. A discount factor of 1 indicates no expected change over time.

delregn is the discount factor on the regression component. A discount factor of 0.9 indicates that we expect about a 10% change in the coefficient every time step. This is a very low discount factor. Most of the time, discount factors should be set between $0.95 - 1$.

Discounting old information is what makes this model dynamic, because we are telling the model that the coefficients should change over time.

#### dglm.update[source]

dglm.update(y=None, X=None, phi_mu=None, phi_sigma=None, analytic=True, phi_samps=None, **kwargs)

Update the DGLM state vector mean and covariance after observing 'y', with covariates 'X'.

We will make up an observation $y$ and predictors $X$ and demonstrate how the model updates:

y = 5
X = np.array([1,2])
mod.update(y = y, X = X)


Below, we'll see how the mean and variance of the state vector $\theta_t$ has now changed from the initial definition.

mod.get_coef()

Mean Standard Deviation
Intercept 0.6 0.92
Regn 1 0.6 0.97
Regn 2 0.2 0.63

#### dglm.forecast_marginal[source]

dglm.forecast_marginal(k, X=None, nsamps=1, mean_only=False, phi_mu=None, phi_sigma=None, analytic=True, phi_samps=None, state_mean_var=False, y=None)

Simulate from the forecast distribution at time t+k.

We can forecast from this model using future values of our predictors, X_future. The output are simulated values from the forecast distribution.

X_future = np.array([2,1])
forecast_samples = mod.forecast_marginal(k = 1, X = X_future, nsamps=1000)
forecast_samples[:10]

array([17, 10, 51, 10, 21, 10, 21, 11, 19, 20])

#### dglm.forecast_path[source]

dglm.forecast_path(k, X=None, nsamps=1, copula=True, phi_mu=None, phi_sigma=None, phi_psi=None, analytic=True, phi_samps=None, **kwargs)

Simulate from the path (joint) forecast distribution from 1 to k steps ahead.

Path forecasting is used to forecast over multiple step steps at once. For example, let's forecast the next 3 time steps. X_future is still the future values of our predictors, but it now has 3 rows.

X_future = np.array([[2,1], [2,2], [1,1]])
forecast_samples = mod.forecast_path(k = 3, X = X_future, nsamps=1000)
forecast_samples[:10]

array([[47, 40, 12],
[25, 19,  4],
[54, 42,  6],
[20, 10,  6],
[ 3,  8,  4],
[10, 10, 10],
[40, 36,  8],
[ 5,  5,  1],
[ 0,  1,  1],
[11,  6,  2]])

## classbern_dglm[source]

bern_dglm(a0=None, R0=None, nregn=0, ntrend=0, nlf=0, nhol=0, seasPeriods=[], seasHarmComponents=[], deltrend=1, delregn=1, dellf=1, delhol=1, delseas=1, rho=1, interpolate=True, adapt_discount=False, adapt_factor=0.5, discount_forecast=False) :: dglm

A Bernoulli DGLM models $0-1$ observations:

\begin{aligned} y_t &\sim Bern(\pi_t) \\ \pi_t &= 1/(1 + e^{-\lambda_t}) \\ \lambda_{t} &= F_{t}' \theta_{t} \end{aligned}

Where

## classpois_dglm[source]

pois_dglm(a0=None, R0=None, nregn=0, ntrend=0, nlf=0, nhol=0, seasPeriods=[], seasHarmComponents=[], deltrend=1, delregn=1, dellf=1, delhol=1, delseas=1, rho=1, interpolate=True, adapt_discount=False, adapt_factor=0.5, discount_forecast=False) :: dglm

A Poisson DGLM models non-negative integers:

\begin{aligned} y_t &\sim Pois(\mu_t) \\ \mu_t &= e^{\lambda_t} \\ \lambda_{t} &= F_{t}' \theta_{t} \end{aligned}

Where

• Pois($\mu_t$) is the Poisson distribution with rate parameter $\mu_t$
• $\mu_t$ is exponent of $\lambda_t$

## classdlm[source]

dlm(*args, n0=1, s0=1, delVar=1, **kwargs) :: dglm

A normal DLM has a slightly different form than other DGLMs:

\begin{aligned} y_t &= \mu_t + \epsilon_t \\ \mu_t &= \lambda_{t} = F_{t}' \theta_{t} \\ \epsilon_t &\sim N(0, \sigma^2_t) \end{aligned}

Where

• N($0, \sigma^2_t$) is the Normal distribution with mean $0$ and variance $\sigma^2_t$
• $\sigma^2_t$ is also a dynamic parameter, with its own evolution through time. The discount factor mod.delVar is set by the modeler, and controls the rate at which $\sigma^2$ changes. Typically it is set between $0.95-1$. The mean of $\sigma_t^2$ is stored as mod.s

## classbin_dglm[source]

bin_dglm(a0=None, R0=None, nregn=0, ntrend=0, nlf=0, nhol=0, seasPeriods=[], seasHarmComponents=[], deltrend=1, delregn=1, dellf=1, delhol=1, delseas=1, rho=1, interpolate=True, adapt_discount=False, adapt_factor=0.5, discount_forecast=False) :: dglm

A Binomial DGLM models the sum of independent $0-1$ observations:

\begin{aligned} y_t &\sim Bin(n_t, pi_t) \\ \pi_t &= 1/(1 + e^{-\lambda_t}) \\ \lambda_{t} &= F_{t}' \theta_{t} \end{aligned}

Where