Approximating Continuous Shocks

Lifecycle models need stochastic state variables — income, health, productivity — that follow continuous distributions. Because the DP solution operates on finite grids, these distributions must be approximated by finite Markov chains. This page derives the formulas implemented in lcm.shocks and links each to its source code.

Notation follows Sargent & Stachurski (2024), Section 3.1.3.

Source
import jax

jax.config.update("jax_enable_x64", val=True)

import jax.numpy as jnp  # noqa: E402
import numpy as np  # noqa: E402
import plotly.graph_objects as go  # noqa: E402
from jax.scipy.stats.norm import cdf as Phi  # noqa: N812, E402
from jax.scipy.stats.norm import pdf as phi_pdf  # noqa: E402

from lcm.shocks._base import _gauss_hermite_normal  # noqa: E402

blue, orange, green = "#4C78A8", "#F58518", "#54A24B"

IID Shocks

IID shocks are draws from a fixed distribution each period — no persistence. pylcm provides Uniform, Normal, LogNormal, and NormalMixture in lcm.shocks.iid. Each must specify grid points and transition probabilities (weights). Because the draws are independent, every row of the transition matrix $P$ is identical.
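Because every row of $P$ equals the weight vector for an IID shock, the matrix can be built by tiling (a minimal sketch of the structure with made-up numbers, not the pylcm internals):

```python
import numpy as np

# Hypothetical 3-point IID shock: grid points and weights.
points = np.array([-1.0, 0.0, 1.0])
weights = np.array([0.25, 0.5, 0.25])

# For an IID shock, P[i, j] = w_j regardless of the current state i,
# so the transition matrix is the weight vector repeated in every row.
P = np.tile(weights, (len(points), 1))
print(P)  # each row equals the weight vector
```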

Uniform

The simplest case: $n$ equally spaced points on $[a, b]$ with equal weights $w_i = 1/n$.

Reference: iid.Uniform.compute_gridpoints (line 45 of iid.py) and compute_transition_probs (lines 48–53).
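The scheme fits in a few lines (a sketch of the formula above, not pylcm's implementation; the function name is illustrative):

```python
import numpy as np

def uniform_grid(a, b, n):
    """Equally spaced points on [a, b] with equal weights 1/n."""
    points = np.linspace(a, b, n)
    weights = np.full(n, 1.0 / n)
    return points, weights

points, weights = uniform_grid(0.0, 1.0, 5)
print(points)   # [0.   0.25 0.5  0.75 1.  ]
print(weights)  # five entries of 0.2
```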

Normal (Gauss-Hermite Quadrature)

For $\varepsilon \sim N(\mu, \sigma^2)$, Gauss-Hermite quadrature approximates integrals against the Gaussian weight function:

$$\int_{-\infty}^{\infty} f(x)\, e^{-x^2}\, dx \;\approx\; \sum_{i=1}^n w_i\, f(z_i)$$

where $\{z_i\}$ and $\{w_i\}$ are the physicists' Hermite nodes and weights.

To integrate against $N(\mu, \sigma^2)$ instead, apply the affine transformation $x_i = \mu + \sqrt{2}\,\sigma\,z_i$ and rescale the weights $\tilde{w}_i = w_i / \sqrt{\pi}$. Then

$$\int_{-\infty}^{\infty} f(x)\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}\, dx \;\approx\; \sum_{i=1}^n \tilde{w}_i\, f(x_i).$$

Reference: _base._gauss_hermite_normal (lines 16–29 of _base.py).
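The transformation can be checked with NumPy's physicists' Gauss-Hermite routine; since the rule is exact for polynomials up to degree $2n-1$, the discrete mean and variance should reproduce $\mu$ and $\sigma^2$ to machine precision (a sketch of the formulas above, not the pylcm implementation):

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def gauss_hermite_normal(n, mu, sigma):
    """Nodes and weights for integrating against N(mu, sigma^2)."""
    z, w = hermgauss(n)                # physicists' nodes/weights for e^{-x^2}
    x = mu + np.sqrt(2.0) * sigma * z  # affine transform of the nodes
    w_tilde = w / np.sqrt(np.pi)       # rescaled weights sum to 1
    return x, w_tilde

x, w = gauss_hermite_normal(7, mu=1.0, sigma=2.0)
print(w.sum())             # ~1.0
print(w @ x)               # ~1.0 (mean)
print(w @ (x - 1.0) ** 2)  # ~4.0 (variance)
```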

CDF-based binning (alternative)

When gauss_hermite=False, Normal uses $n$ equally spaced points spanning $\mu \pm m\sigma$ and assigns weights by CDF binning. Define midpoints $m_j = (x_j + x_{j+1})/2$ between consecutive grid points. Then

$$w_j = \begin{cases} \Phi\!\left(\dfrac{m_1 - \mu}{\sigma}\right) & j = 1 \\[6pt] \Phi\!\left(\dfrac{m_j - \mu}{\sigma}\right) - \Phi\!\left(\dfrac{m_{j-1} - \mu}{\sigma}\right) & 1 < j < n \\[6pt] 1 - \Phi\!\left(\dfrac{m_{n-1} - \mu}{\sigma}\right) & j = n \end{cases}$$

The first and last bins absorb the tails.

Reference: iid.Normal.compute_transition_probs (lines 110–125 of iid.py).

Source
n_points = 5
mu, sigma = 0.0, 1.0
gh_nodes, gh_weights = _gauss_hermite_normal(n_points=n_points, mu=mu, sigma=sigma)
n_std = 3.0
eq_nodes = jnp.linspace(mu - n_std * sigma, mu + n_std * sigma, n_points)
step = eq_nodes[1] - eq_nodes[0]
half_step = 0.5 * step
eq_weights = jnp.zeros(n_points)
eq_weights = eq_weights.at[1:].set(jnp.diff(Phi((eq_nodes + half_step - mu) / sigma)))
eq_weights = eq_weights.at[0].set(Phi((eq_nodes[0] + half_step - mu) / sigma))
eq_weights = eq_weights.at[-1].set(1 - Phi((eq_nodes[-1] - half_step - mu) / sigma))
fig = go.Figure()
fig.add_trace(
    go.Bar(
        x=np.asarray(eq_nodes),
        y=np.asarray(eq_weights),
        marker_color=orange,
        width=0.15,
        opacity=0.7,
        name="Equally spaced + CDF",
    ),
)
fig.add_trace(
    go.Bar(
        x=np.asarray(gh_nodes),
        y=np.asarray(gh_weights),
        marker_color=blue,
        width=0.15,
        opacity=0.7,
        name="Gauss-Hermite",
    ),
)
fig.update_layout(
    template="plotly_dark",
    title=f"Normal quadrature: \u03bc = {mu}, σ = {sigma}, n = {n_points}",
    xaxis_title="Node",
    yaxis_title="Weight",
    barmode="overlay",
    height=400,
    width=700,
    legend={"yanchor": "top", "y": 0.98, "xanchor": "right", "x": 0.98},
)
fig.show()

Gauss-Hermite nodes cluster near the mean where the density is highest, so the central nodes carry the largest weights. Equally spaced nodes spread uniformly across $\pm 3\sigma$; the outer nodes fall in the tails where little probability mass lives, so their CDF-binned weights are small. With these parameters the two schemes place nodes at quite different locations, though the exact placement depends on n_points and n_std. Gauss-Hermite is generally more accurate for smooth integrands.

LogNormal

If $X \sim N(\mu, \sigma^2)$, then $Y = e^X$ is log-normal. Grid points are $y_i = e^{x_i}$, where $\{x_i\}$ are the normal nodes (GH or equally spaced); weights are unchanged.

Reference: iid.LogNormal.compute_gridpoints (lines 163–169 of iid.py).

Source
mu_ln, sigma_ln = 0.0, 0.5
normal_nodes, normal_weights = _gauss_hermite_normal(
    n_points=n_points, mu=mu_ln, sigma=sigma_ln
)
lognormal_nodes = jnp.exp(normal_nodes)
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=np.asarray(normal_nodes),
        y=np.asarray(normal_weights),
        mode="markers+lines",
        marker={"color": blue, "size": 8},
        line={"color": blue, "width": 1, "dash": "dot"},
        name="Normal nodes",
    )
)
fig.add_trace(
    go.Scatter(
        x=np.asarray(lognormal_nodes),
        y=np.asarray(normal_weights),
        mode="markers+lines",
        marker={"color": orange, "size": 8},
        line={"color": orange, "width": 1, "dash": "dot"},
        name="LogNormal nodes",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=f"LogNormal: \u03bc = {mu_ln}, σ = {sigma_ln}, n = {n_points}",
    xaxis_title="Node value",
    yaxis_title="Weight",
    height=400,
    width=700,
)
fig.show()

The exponential transform $y_i = e^{x_i}$ shifts all nodes to the right and compresses negative normal nodes into $(0, 1)$ while stretching positive ones above 1. The weights stay the same — only the node positions change. This ensures that expectations $E[f(e^X)]$ are computed correctly for log-normal shocks using the same Gauss-Hermite machinery.

NormalMixture

When the shock follows a two-component normal mixture $\varepsilon \sim p_1\, N(\mu_1, \sigma_1^2) + (1-p_1)\, N(\mu_2, \sigma_2^2)$, NormalMixture uses equally spaced points spanning the mixture mean $\pm m$ mixture standard deviations and assigns weights via CDF binning with the mixture CDF in place of $\Phi$: the same approach described in the Non-Normal Innovations section below, but without persistence.

Reference: iid.NormalMixture in iid.py.
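The CDF-binning step with a mixture CDF can be sketched as follows (illustrative parameters and function names, not the pylcm API; `norm_cdf` is built from `math.erf` to stay dependency-free):

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(x, mu, sigma):
    """Scalar normal CDF via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

def mixture_cdf(x, p1, mu1, s1, mu2, s2):
    return p1 * norm_cdf(x, mu1, s1) + (1 - p1) * norm_cdf(x, mu2, s2)

p1, mu1, s1 = 0.9, 0.0, 0.1  # majority component
mu2, s2 = -0.5, 0.3          # minority component

# Mixture mean/variance, then equally spaced points over mean +/- 3 SD.
mean = p1 * mu1 + (1 - p1) * mu2
var = p1 * (s1**2 + mu1**2) + (1 - p1) * (s2**2 + mu2**2) - mean**2
n, m = 7, 3.0
points = np.linspace(mean - m * var**0.5, mean + m * var**0.5, n)

# CDF binning at midpoints, with the mixture CDF in place of Phi.
mid = (points[:-1] + points[1:]) / 2
cdf = np.array([mixture_cdf(x, p1, mu1, s1, mu2, s2) for x in mid])
weights = np.concatenate([[cdf[0]], np.diff(cdf), [1.0 - cdf[-1]]])
print(weights.sum())  # 1.0 by construction: the end bins absorb the tails
```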

AR(1) Processes

The canonical process (Sargent & Stachurski (2024) notation):

$$X_{t+1} = \rho\, X_t + b + \nu\,\varepsilon_{t+1}, \quad |\rho| < 1, \quad \varepsilon_t \stackrel{\text{IID}}{\sim} N(0, 1)$$

Stationary distribution: $\psi^* = N(\mu_x,\, \sigma_x^2)$ with $\mu_x = b/(1-\rho)$ and $\sigma_x^2 = \nu^2/(1-\rho^2)$.

The goal: construct a finite state space $X = \{x_1, \ldots, x_n\}$ and an $n \times n$ transition matrix $P$ with $P_{ij} = \Pr(X_{t+1} = x_j \mid X_t = x_i)$, such that the Markov chain $(X_t)$ on $X$ has moments close to those of the continuous process.

| Symbol | Meaning | pylcm name |
| --- | --- | --- |
| $b$ | intercept (drift) | mu |
| $\nu$ | innovation std dev | sigma |
| $\rho$ | persistence | rho |
| $\mu_x = b/(1-\rho)$ | unconditional mean | |
| $\sigma_x = \nu/\sqrt{1-\rho^2}$ | unconditional std | std_y in code |
| $m$ | grid half-span in SDs | n_std |
| $s$ | step size $(x_n - x_1)/(n-1)$ | step |
| $X = \{x_1, \ldots, x_n\}$ | discrete state space | grid points |
| $P$ | transition matrix | return of compute_transition_probs |
| $\Phi$ | standard normal CDF | jax.scipy.stats.norm.cdf |
| $\psi^*$ | stationary distribution | |
Source
def ar1_moments(rho, sigma, mu):
    """Compute theoretical moments of an AR(1) process."""

    mu_x = mu / (1 - rho)
    sigma_x = sigma / jnp.sqrt(1 - rho**2)
    return mu_x, sigma_x


def conditional_density(x_prime, x_i, rho, sigma, mu):
    """PDF of X_{t+1} | X_t = x_i for a normal-innovation AR(1)."""

    cond_mean = rho * x_i + mu
    return jnp.exp(-0.5 * ((x_prime - cond_mean) / sigma) ** 2) / (
        sigma * jnp.sqrt(2 * jnp.pi)
    )

Tauchen (1986)

Following Sargent & Stachurski (2024), Section 3.1.3:

1. Grid. Choose $m$ (number of unconditional SDs) and set

$$x_n = \mu_x + m\,\sigma_x, \qquad x_1 = \mu_x - m\,\sigma_x, \qquad s = \frac{x_n - x_1}{n - 1}.$$

2. Transition probabilities. The conditional distribution of $X_{t+1}$ given $X_t = x_i$ is $N(\rho\, x_i + b,\; \nu^2)$. Bin the probability mass at midpoints $x_j \pm s/2$:

$$P_{ij} = \begin{cases} \Phi\!\left(\dfrac{x_1 + s/2 - \rho\, x_i - b}{\nu}\right) & j = 1 \\[6pt] \Phi\!\left(\dfrac{x_j + s/2 - \rho\, x_i - b}{\nu}\right) - \Phi\!\left(\dfrac{x_j - s/2 - \rho\, x_i - b}{\nu}\right) & 1 < j < n \\[6pt] 1 - \Phi\!\left(\dfrac{x_n - s/2 - \rho\, x_i - b}{\nu}\right) & j = n \end{cases}$$

Note: the denominator is $\nu$ (innovation std), not $\sigma_x$ (unconditional std).

Reference: ar1.Tauchen.compute_gridpoints (lines 80–89 of ar1.py) and compute_transition_probs (lines 91–121).

Tauchen with Gauss-Hermite nodes

When gauss_hermite=True, the grid uses Gauss-Hermite quadrature nodes of $N(\mu_x, \sigma_x^2)$ instead of equally spaced points. Transition probabilities use CDF binning at midpoints between consecutive GH nodes (same idea, different node placement).

Reference: ar1.Tauchen.compute_gridpoints (lines 82–84 of ar1.py) and compute_transition_probs (lines 95–108).

Source
rho, sigma_ar, mu_ar = 0.85, 0.127, 0.0
mu_x, sigma_x = ar1_moments(rho, sigma_ar, mu_ar)

# Gauss-Hermite Tauchen grid
std_y = jnp.sqrt(sigma_ar**2 / (1 - rho**2))
gh_ar_nodes, _ = _gauss_hermite_normal(n_points=n_points, mu=mu_x, sigma=std_y)

# CDF-based transition probs at midpoints between GH nodes
midpoints = (gh_ar_nodes[:-1] + gh_ar_nodes[1:]) / 2
cdf_vals = Phi((midpoints[None, :] - rho * gh_ar_nodes[:, None]) / sigma_ar)
first_col = cdf_vals[:, :1]
last_col = 1 - cdf_vals[:, -1:]
P_gh = jnp.concatenate([first_col, jnp.diff(cdf_vals, axis=1), last_col], axis=1)
fig = go.Figure()
fig.add_trace(
    go.Bar(
        x=np.asarray(gh_ar_nodes),
        y=np.asarray(P_gh[0]),
        marker_color=blue,
        width=0.04,
        opacity=0.7,
        name=f"From state 0 (x = {float(gh_ar_nodes[0]):.2f})",
    )
)
fig.add_trace(
    go.Bar(
        x=np.asarray(gh_ar_nodes),
        y=np.asarray(P_gh[2]),
        marker_color=orange,
        width=0.04,
        opacity=0.7,
        name=f"From state 2 (x = {float(gh_ar_nodes[2]):.2f})",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=(
        f"Tauchen (GH): transition probabilities"
        f" (ρ = {rho}, σ = {sigma_ar}, n = {n_points})"
    ),
    xaxis_title="x′",
    yaxis_title="P[i, j]",
    barmode="overlay",
    height=400,
    width=700,
)
fig.show()

Two rows of the transition matrix are shown: from the lowest state (state 0, far left of the grid) and from the middle state (state 2). Starting from a low value of $x$, the AR(1) conditional mean $\rho x + b$ is negative, so the transition probabilities shift leftward. From the middle state the distribution is centred. The GH nodes are unevenly spaced — wider in the tails — so each bin captures more probability mass there than equally spaced bins would.

Source
# Tauchen: equally spaced grid and transition matrix
n_std_ar = 3
x_max = n_std_ar * sigma_x
x_tauchen = jnp.linspace(mu_x - x_max, mu_x + x_max, n_points)
step_t = (2 * x_max) / (n_points - 1)
z = x_tauchen[None, :] - rho * x_tauchen[:, None]
upper = Phi((z + 0.5 * step_t - mu_ar) / sigma_ar)
lower = Phi((z - 0.5 * step_t - mu_ar) / sigma_ar)
P_tauchen = upper - lower
P_tauchen = P_tauchen.at[:, 0].set(upper[:, 0])
P_tauchen = P_tauchen.at[:, -1].set(1 - lower[:, -1])

# Transition from middle state, overlaid with true conditional density
row_idx = n_points // 2
x_dense = jnp.linspace(float(x_tauchen[0]) - 0.3, float(x_tauchen[-1]) + 0.3, 300)
pdf_vals = conditional_density(x_dense, x_tauchen[row_idx], rho, sigma_ar, mu_ar)
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=np.asarray(x_dense),
        y=np.asarray(pdf_vals),
        mode="lines",
        line={"color": "gray", "width": 1},
        name=f"True density given x = {float(x_tauchen[row_idx]):.2f}",
    )
)
fig.add_trace(
    go.Bar(
        x=np.asarray(x_tauchen),
        y=np.asarray(P_tauchen[row_idx] / step_t),
        width=float(step_t) * 0.9,
        marker_color=blue,
        opacity=0.6,
        name="Tauchen probabilities (scaled)",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=f"Tauchen: transition from x = {float(x_tauchen[row_idx]):.2f}"
    f" (ρ = {rho}, σ = {sigma_ar}, n = {n_points})",
    xaxis_title="x′",
    yaxis_title="Density / Scaled probability",
    height=400,
    width=700,
)
fig.show()

The bars show Tauchen transition probabilities $P_{ij}$ from the middle grid point, scaled by $1/s$ so they can be compared with the continuous conditional density $N(\rho x_i + b,\, \nu^2)$ (gray curve). The match is reasonable for 5 points — each bar captures the probability mass in its bin.

Source
# Stationary distribution by repeated matrix multiplication
psi = jnp.ones(n_points) / n_points
for _ in range(1000):
    psi = psi @ P_tauchen

# Theoretical stationary density: N(mu_x, sigma_x^2)
x_stat = jnp.linspace(float(x_tauchen[0]) - 0.2, float(x_tauchen[-1]) + 0.2, 300)
pdf_stat = jnp.exp(-0.5 * ((x_stat - mu_x) / sigma_x) ** 2) / (
    sigma_x * jnp.sqrt(2 * jnp.pi)
)
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=np.asarray(x_stat),
        y=np.asarray(pdf_stat),
        mode="lines",
        line={"color": "gray", "width": 1},
        name=f"N({float(mu_x):.1f}, {float(sigma_x):.3f}²)",
    )
)
fig.add_trace(
    go.Bar(
        x=np.asarray(x_tauchen),
        y=np.asarray(psi / step_t),
        width=float(step_t) * 0.9,
        marker_color=blue,
        opacity=0.6,
        name="Tauchen stationary dist.",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=(
        f"Tauchen: stationary distribution"
        f" (ρ = {rho}, σ = {sigma_ar}, n = {n_points})"
    ),
    xaxis_title="x",
    yaxis_title="Density",
    height=400,
    width=700,
)
fig.show()

The stationary distribution of the Tauchen chain (bars) is compared with the theoretical stationary density $\psi^* = N(\mu_x, \sigma_x^2)$ of the continuous AR(1) process (gray curve). The chain's stationary vector $\psi$ is obtained by repeatedly multiplying an arbitrary initial distribution by $P$. For moderate persistence ($\rho = 0.85$) and 5 points, the approximation tracks the Gaussian well.

Limitations

Tauchen degrades for high persistence ($\rho \geq 0.97$) and few grid points, because the equally spaced grid cannot capture the tails of the near-unit-root process (Fella et al., 2019, Tables 1–5). Rouwenhorst addresses this.

Rouwenhorst (1995)

Following Kopecky & Suen (2010):

1. Grid. Set

$$x_1 = \mu_x - \frac{\nu\sqrt{n-1}}{\sqrt{1-\rho^2}}, \qquad x_n = \mu_x + \frac{\nu\sqrt{n-1}}{\sqrt{1-\rho^2}}$$

equally spaced. (Equivalently, the half-span is $\sigma_x\sqrt{n-1}$.)

2. Transition matrix. Define $q = (1+\rho)/2$. For $n = 2$:

$$P_2 = \begin{pmatrix} q & 1-q \\ 1-q & q \end{pmatrix}$$

For $n > 2$, apply the recursion (Kopecky & Suen (2010), Eq. 3):

$$P_n = q \begin{pmatrix} P_{n-1} & \mathbf{0} \\ \mathbf{0}' & 0 \end{pmatrix} + (1-q) \begin{pmatrix} \mathbf{0} & P_{n-1} \\ 0 & \mathbf{0}' \end{pmatrix} + (1-q) \begin{pmatrix} \mathbf{0}' & 0 \\ P_{n-1} & \mathbf{0} \end{pmatrix} + q \begin{pmatrix} 0 & \mathbf{0}' \\ \mathbf{0} & P_{n-1} \end{pmatrix}$$

with all rows except the first and last divided by 2 to normalize.

The closed-form expression (used in pylcm):

$$P_{ij} = \sum_k \binom{i}{k} \binom{n-1-i}{j-k}\, q^{\,n-1-i-j+2k}\, (1-q)^{\,i+j-2k}$$

Reference: ar1.Rouwenhorst.compute_gridpoints (lines 157–161 of ar1.py) and compute_transition_probs (lines 163–187).

Exact Moment Matching

The key advantage (Fella et al., 2019, Eqs. 9, 11; Kopecky & Suen, 2010, Propositions 1–2):

  • Conditional mean: $\sum_j P_{ij}\, x_j = \rho\, x_i + b$ (exact)

  • Conditional variance: $\sum_j P_{ij}\,(x_j - \rho\, x_i - b)^2 = \nu^2$ (exact)

These hold for any $n \geq 2$. Tauchen only approximates them.
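Both properties can be verified numerically with the closed-form binomial expression (a self-contained sketch with illustrative parameters and $b = 0$, so the conditional mean is $\rho x_i$):

```python
import numpy as np
from math import comb

def rouwenhorst(n, rho, nu, mu_x=0.0):
    """Rouwenhorst grid and closed-form transition matrix (b = 0 case)."""
    q = (1 + rho) / 2
    half_span = nu * np.sqrt(n - 1) / np.sqrt(1 - rho**2)
    x = np.linspace(mu_x - half_span, mu_x + half_span, n)
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # k runs over indices where both binomial coefficients are valid
            for k in range(max(0, i + j - n + 1), min(i, j) + 1):
                P[i, j] += (
                    comb(i, k) * comb(n - 1 - i, j - k)
                    * q ** (n - 1 - i - j + 2 * k)
                    * (1 - q) ** (i + j - 2 * k)
                )
    return x, P

x, P = rouwenhorst(7, rho=0.95, nu=0.1)
# Conditional mean error: should be zero up to floating-point roundoff.
print(np.max(np.abs(P @ x - 0.95 * x)))
# Conditional variance error: likewise ~0.
cond_var = (P * (x[None, :] - 0.95 * x[:, None]) ** 2).sum(axis=1)
print(np.max(np.abs(cond_var - 0.1**2)))
```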

Source
from math import comb

# Rouwenhorst grid
nu_r = jnp.sqrt((n_points - 1) / (1 - rho**2)) * sigma_ar
x_rouw = jnp.linspace(mu_x - nu_r, mu_x + nu_r, n_points)
step_r = x_rouw[1] - x_rouw[0]

# Transition matrix (closed-form binomial)
q = (rho + 1) / 2
C = jnp.array([[comb(nn, kk) for kk in range(n_points)] for nn in range(n_points)])
i = jnp.arange(n_points)[:, None, None]
j = jnp.arange(n_points)[None, :, None]
k = jnp.arange(n_points)[None, None, :]
valid = (k >= jnp.maximum(0, i + j - n_points + 1)) & (k <= jnp.minimum(i, j))
k_s = jnp.where(valid, k, 0)
jmk = jnp.where(valid, j - k, 0)
terms = (
    C[i, k_s]
    * C[n_points - 1 - i, jmk]
    * q ** (n_points - 1 - i - j + 2 * k_s)
    * (1 - q) ** (i + j - 2 * k_s)
)
P_rouw = jnp.where(valid, terms, 0.0).sum(axis=-1)

# Transition from middle state
pdf_rouw = conditional_density(x_dense, x_rouw[n_points // 2], rho, sigma_ar, mu_ar)
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=np.asarray(x_dense),
        y=np.asarray(pdf_rouw),
        mode="lines",
        line={"color": "gray", "width": 1},
        name=f"True density given x = {float(x_rouw[n_points // 2]):.2f}",
    )
)
fig.add_trace(
    go.Bar(
        x=np.asarray(x_rouw),
        y=np.asarray(P_rouw[n_points // 2] / step_r),
        width=float(step_r) * 0.9,
        marker_color=green,
        opacity=0.6,
        name="Rouwenhorst probabilities (scaled)",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=f"Rouwenhorst: transition from x = {float(x_rouw[n_points // 2]):.2f}"
    f" (ρ = {rho}, σ = {sigma_ar}, n = {n_points})",
    xaxis_title="x′",
    yaxis_title="Density / Scaled probability",
    height=400,
    width=700,
)
fig.show()

Same layout as the Tauchen transition plot above: Rouwenhorst probabilities from the middle state (scaled by $1/s$) overlaid on the true conditional density. The Rouwenhorst grid is wider than Tauchen's (half-span $\sigma_x\sqrt{n-1}$ vs $m\sigma_x$), so the bars extend further into the tails. At moderate persistence the visual fit is similar to Tauchen, but the exact moment-matching properties (shown next) make a real difference at high $\rho$.

Source
# High-persistence case: rho = 0.98
rho_hi = 0.98
mu_x_hi, sigma_x_hi = ar1_moments(rho_hi, sigma_ar, mu_ar)

# Tauchen
x_max_hi = n_std_ar * sigma_x_hi
x_t_hi = jnp.linspace(mu_x_hi - x_max_hi, mu_x_hi + x_max_hi, n_points)
step_t_hi = (2 * x_max_hi) / (n_points - 1)
z_hi = x_t_hi[None, :] - rho_hi * x_t_hi[:, None]
upper_hi = Phi((z_hi + 0.5 * step_t_hi - mu_ar) / sigma_ar)
lower_hi = Phi((z_hi - 0.5 * step_t_hi - mu_ar) / sigma_ar)
P_t_hi = upper_hi - lower_hi
P_t_hi = P_t_hi.at[:, 0].set(upper_hi[:, 0])
P_t_hi = P_t_hi.at[:, -1].set(1 - lower_hi[:, -1])

# Rouwenhorst
nu_r_hi = jnp.sqrt((n_points - 1) / (1 - rho_hi**2)) * sigma_ar
x_r_hi = jnp.linspace(mu_x_hi - nu_r_hi, mu_x_hi + nu_r_hi, n_points)
step_r_hi = x_r_hi[1] - x_r_hi[0]
q_hi = (rho_hi + 1) / 2
C_hi = jnp.array([[comb(nn, kk) for kk in range(n_points)] for nn in range(n_points)])
i_hi = jnp.arange(n_points)[:, None, None]
j_hi = jnp.arange(n_points)[None, :, None]
k_hi = jnp.arange(n_points)[None, None, :]
valid_hi = (k_hi >= jnp.maximum(0, i_hi + j_hi - n_points + 1)) & (
    k_hi <= jnp.minimum(i_hi, j_hi)
)
k_s_hi = jnp.where(valid_hi, k_hi, 0)
jmk_hi = jnp.where(valid_hi, j_hi - k_hi, 0)
terms_hi = (
    C_hi[i_hi, k_s_hi]
    * C_hi[n_points - 1 - i_hi, jmk_hi]
    * q_hi ** (n_points - 1 - i_hi - j_hi + 2 * k_s_hi)
    * (1 - q_hi) ** (i_hi + j_hi - 2 * k_s_hi)
)
P_r_hi = jnp.where(valid_hi, terms_hi, 0.0).sum(axis=-1)

# Stationary distributions
psi_t_hi = jnp.ones(n_points) / n_points
psi_r_hi = jnp.ones(n_points) / n_points
for _ in range(1000):
    psi_t_hi = psi_t_hi @ P_t_hi
    psi_r_hi = psi_r_hi @ P_r_hi

# Theoretical density
x_range = max(float(x_max_hi), float(nu_r_hi)) * 1.3
x_stat_hi = jnp.linspace(-x_range, x_range, 300)
pdf_stat_hi = jnp.exp(-0.5 * ((x_stat_hi - mu_x_hi) / sigma_x_hi) ** 2) / (
    sigma_x_hi * jnp.sqrt(2 * jnp.pi)
)

# Use x_t_hi as the common x-axis for both bar sets
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=np.asarray(x_stat_hi),
        y=np.asarray(pdf_stat_hi),
        mode="lines",
        line={"color": "gray", "width": 1},
        name="N(\u03bcₓ, σₓ²)",
    )
)
fig.add_trace(
    go.Bar(
        x=np.asarray(x_t_hi),
        y=np.asarray(psi_t_hi / step_t_hi),
        width=float(step_t_hi) * 0.85,
        marker_color=blue,
        opacity=0.5,
        name="Tauchen",
    )
)
fig.add_trace(
    go.Bar(
        x=np.asarray(x_r_hi),
        y=np.asarray(psi_r_hi / step_r_hi),
        width=float(step_r_hi) * 0.85,
        marker_color=green,
        opacity=0.5,
        name="Rouwenhorst",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=f"Stationary distributions (ρ = {rho_hi}, σ = {sigma_ar}, n = {n_points})",
    xaxis_title="x",
    yaxis_title="Density",
    barmode="overlay",
    height=400,
    width=700,
)
fig.show()

With $\rho = 0.98$ and only 5 grid points, Tauchen's stationary distribution (blue) concentrates too much mass on the extreme states — the equally spaced grid cannot span the wide unconditional distribution while also resolving the interior. Rouwenhorst (green) uses a wider grid (half-span $\sigma_x\sqrt{n-1}$) with exact moment matching and produces a much smoother approximation of the Gaussian target.

Source
# Conditional mean errors: Tauchen vs Rouwenhorst
cond_mean_t = P_tauchen @ x_tauchen
true_cond_mean_t = rho * x_tauchen + mu_ar
mean_err_t = jnp.abs(cond_mean_t - true_cond_mean_t)
cond_mean_r = P_rouw @ x_rouw
true_cond_mean_r = rho * x_rouw + mu_ar
mean_err_r = jnp.abs(cond_mean_r - true_cond_mean_r)
fig = go.Figure()
fig.add_trace(
    go.Bar(
        x=np.arange(n_points),
        y=np.asarray(mean_err_t),
        marker_color=blue,
        opacity=0.7,
        name="Tauchen",
    )
)
fig.add_trace(
    go.Bar(
        x=np.arange(n_points),
        y=np.asarray(mean_err_r),
        marker_color=green,
        opacity=0.7,
        name="Rouwenhorst",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=f"Conditional mean error (ρ = {rho}, σ = {sigma_ar}, n = {n_points})",
    xaxis_title="Grid point index i",
    yaxis_title="Absolute error",
    yaxis_type="log",
    yaxis_exponentformat="e",
    yaxis_range=[-16, None],
    barmode="overlay",
    height=400,
    width=700,
)
fig.show()

The plot shows the absolute error in the conditional mean at each grid point $x_i$:

$$\Bigl|\, \textstyle\sum_j P_{ij}\, x_j \;-\; (\rho\, x_i + b) \,\Bigr|$$

This is the difference between the Markov chain's expected next-period value and the true conditional mean of the AR(1) process. Tauchen only approximates the conditional mean, with errors around 1e-2 that vary across grid points. Rouwenhorst matches it exactly in exact arithmetic (Kopecky & Suen, 2010, Proposition 1); the residual errors visible here (around 1e-15) are floating-point roundoff at double-precision machine epsilon.

Non-Normal Innovations (FGP Section 4.3)

When $\varepsilon_t$ has a mixture-of-normals distribution:

$$\varepsilon_t \sim p_1\, N(\mu_1, \sigma_1^2) + (1 - p_1)\, N(\mu_2, \sigma_2^2)$$

the only change to Tauchen's method is replacing $\Phi$ with the mixture CDF (Fella et al., 2019, Section 4.3):

$$F_\varepsilon(x) = p_1\, \Phi\!\left(\frac{x - \mu_1}{\sigma_1}\right) + (1 - p_1)\, \Phi\!\left(\frac{x - \mu_2}{\sigma_2}\right)$$

The grid spans $\pm m\,\sigma_x$ as before, with the unconditional variance computed from the mixture:

$$\text{Var}(\varepsilon) = p_1(\sigma_1^2 + \mu_1^2) + (1 - p_1)(\sigma_2^2 + \mu_2^2) - [p_1\,\mu_1 + (1 - p_1)\,\mu_2]^2$$

The example below uses $p_1 = 0.9$, $\mu_1 = 0$, $\sigma_1 = 0.1$ for a tight “normal times” component and $\mu_2 = -0.5$, $\sigma_2 = 0.3$ for a rare bad-shock component — think of a 10% probability of job loss, where the new job typically pays less and the outcome is more uncertain. This produces a distribution with a heavier left tail but a lighter right tail than a Gaussian with the same variance: large negative shocks are more likely, large positive shocks less likely.
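The mixture mean and variance formulas can be sanity-checked by Monte Carlo with the example's parameters; the negative sample skewness confirms the heavier left tail (a quick sketch, independent of pylcm):

```python
import numpy as np

p1, mu1, s1 = 0.9, 0.0, 0.1
mu2, s2 = -0.5, 0.3

# Analytical mixture moments from the formulas above.
mean = p1 * mu1 + (1 - p1) * mu2
var = p1 * (s1**2 + mu1**2) + (1 - p1) * (s2**2 + mu2**2) - mean**2

# Monte Carlo: draw the component indicator, then the normal within it.
rng = np.random.default_rng(0)
use_first = rng.random(1_000_000) < p1
draws = np.where(
    use_first,
    rng.normal(mu1, s1, use_first.shape),
    rng.normal(mu2, s2, use_first.shape),
)
print(mean, draws.mean())  # analytical vs simulated mean
print(var, draws.var())    # analytical vs simulated variance
skew = ((draws - draws.mean()) ** 3).mean() / draws.std() ** 3
print(skew)                # negative: heavier left tail
```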

Reference: _base._mixture_cdf in _base.py and ar1.TauchenNormalMixture.compute_transition_probs in ar1.py. The IID variant iid.NormalMixture uses the same mixture CDF for CDF-binning weights without persistence.

Source
p1, mu1, sigma1 = 0.9, 0.0, 0.1
mu2, sigma2 = -0.5, 0.3
x_pdf = jnp.linspace(-1.5, 1.0, 300)

f_mix = p1 * phi_pdf(x_pdf, loc=mu1, scale=sigma1) + (1 - p1) * phi_pdf(
    x_pdf, loc=mu2, scale=sigma2
)

# Normal PDF with same mean and variance
mean_eps = p1 * mu1 + (1 - p1) * mu2
var_eps = p1 * (sigma1**2 + mu1**2) + (1 - p1) * (sigma2**2 + mu2**2) - mean_eps**2
std_eps = jnp.sqrt(var_eps)
f_normal = phi_pdf(x_pdf, loc=mean_eps, scale=std_eps)
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=np.asarray(x_pdf),
        y=np.asarray(f_normal),
        mode="lines",
        line={"color": "gray", "width": 1},
        name="Normal (same mean/var)",
    )
)
fig.add_trace(
    go.Scatter(
        x=np.asarray(x_pdf),
        y=np.asarray(f_mix),
        mode="lines",
        line={"color": orange, "width": 2},
        name="Mixture PDF",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=(
        f"Mixture vs Normal PDF (p\u2081 = {p1}, \u03bc\u2081 = {mu1},"
        f" σ\u2081 = {sigma1}, \u03bc\u2082 = {mu2}, σ\u2082 = {sigma2})"
    ),
    xaxis_title="ε",
    yaxis_title="f(ε)",
    height=400,
    width=700,
)
fig.show()

The gray curve is a normal PDF with the same mean and variance as the mixture, computed from the mixture parameters: $\mu_\varepsilon = p_1 \mu_1 + (1-p_1)\mu_2$ and $\text{Var}(\varepsilon) = p_1(\sigma_1^2 + \mu_1^2) + (1-p_1)(\sigma_2^2 + \mu_2^2) - \mu_\varepsilon^2$. Any difference between the two curves is therefore purely due to higher-order moments (skewness, kurtosis). The mixture has a sharper central peak (from the tight majority component) and a heavier left tail (from the wide, negatively shifted minority component), but a lighter right tail — large positive shocks are less likely than under a Gaussian with the same variance.

Source
# Mixture transition matrix vs normal-innovation Tauchen
rho_mix, mu_mix, n_std_mix = 0.85, 0.0, 3
sigma_eps_sq = p1 * (sigma1**2 + mu1**2) + (1 - p1) * (sigma2**2 + mu2**2) - mean_eps**2
std_y_mix = jnp.sqrt(sigma_eps_sq / (1 - rho_mix**2))
long_run_mean_mix = (mu_mix + mean_eps) / (1 - rho_mix)
x_max_mix = n_std_mix * std_y_mix
x_mix = jnp.linspace(
    long_run_mean_mix - x_max_mix, long_run_mean_mix + x_max_mix, n_points
)
step_mix = (2 * x_max_mix) / (n_points - 1)
half_step_mix = 0.5 * step_mix

# Mixture transition matrix
z_mix = x_mix[None, :] - mu_mix - rho_mix * x_mix[:, None]
upper_mix = p1 * Phi((z_mix + half_step_mix - mu1) / sigma1) + (1 - p1) * Phi(
    (z_mix + half_step_mix - mu2) / sigma2
)
lower_mix = p1 * Phi((z_mix - half_step_mix - mu1) / sigma1) + (1 - p1) * Phi(
    (z_mix - half_step_mix - mu2) / sigma2
)
P_mix = upper_mix - lower_mix
P_mix = P_mix.at[:, 0].set(upper_mix[:, 0])
P_mix = P_mix.at[:, -1].set(1 - lower_mix[:, -1])

# Normal-innovation Tauchen (same unconditional std for comparison)
sigma_normal_equiv = jnp.sqrt(sigma_eps_sq)
mu_x_norm, sigma_x_norm = ar1_moments(rho_mix, sigma_normal_equiv, mu_mix)
x_max_norm = n_std_mix * sigma_x_norm
x_norm = jnp.linspace(mu_x_norm - x_max_norm, mu_x_norm + x_max_norm, n_points)
step_norm = (2 * x_max_norm) / (n_points - 1)
z_norm = x_norm[None, :] - rho_mix * x_norm[:, None]
upper_norm = Phi((z_norm + 0.5 * step_norm - mu_mix) / sigma_normal_equiv)
lower_norm = Phi((z_norm - 0.5 * step_norm - mu_mix) / sigma_normal_equiv)
P_norm = upper_norm - lower_norm
P_norm = P_norm.at[:, 0].set(upper_norm[:, 0])
P_norm = P_norm.at[:, -1].set(1 - lower_norm[:, -1])

# Compare one row (middle state)
row_idx_mix = n_points // 2
fig = go.Figure()
fig.add_trace(
    go.Bar(
        x=np.arange(n_points),
        y=np.asarray(P_norm[row_idx_mix]),
        marker_color=blue,
        opacity=0.7,
        name="Normal innovations",
    )
)
fig.add_trace(
    go.Bar(
        x=np.arange(n_points),
        y=np.asarray(P_mix[row_idx_mix]),
        marker_color=orange,
        opacity=0.7,
        name="Mixture innovations",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=f"Transition from middle state (ρ = {rho_mix}, n = {n_points})",
    xaxis_title="Target state index j",
    yaxis_title="P[i, j]",
    barmode="overlay",
    height=400,
    width=700,
)
fig.show()

Both transition matrices use the same grid and the same unconditional variance, so they are directly comparable. The mixture-innovation row (orange) puts more probability on low target states and less on high target states than the normal-innovation row (blue). This reflects the left-skewed innovation distribution: conditional on the current state, large negative transitions are more likely and large positive transitions less likely than under a symmetric Gaussian. In a lifecycle context, this asymmetry matters beyond what the variance alone captures — agents face a higher probability of sharp income drops (job loss) with no corresponding upside, which can strengthen the precautionary savings motive.

References
  1. Sargent, T. J., & Stachurski, J. (2024). Dynamic Programming. QuantEcon. https://dp.quantecon.org/
  2. Fella, G., Gallipoli, G., & Pan, J. (2019). Markov-Chain Approximations for Life-Cycle Models. Review of Economic Dynamics, 34, 183–201. 10.1016/j.red.2019.03.013
  3. Kopecky, K. A., & Suen, R. M. H. (2010). Finite State Markov-Chain Approximations to Highly Persistent Processes. Review of Economic Dynamics, 13(3), 701–714. 10.1016/j.red.2010.02.002