Approximating Continuous Shocks

Lifecycle models need stochastic state variables — income, health, productivity — that follow continuous distributions. Because the DP solution operates on finite grids, these distributions must be approximated by finite Markov chains. This page derives the formulas implemented in lcm.shocks and links each to its source code.

Notation follows Sargent & Stachurski (2024), Section 3.1.3.

Source
import jax

jax.config.update("jax_enable_x64", val=True)

import jax.numpy as jnp  # noqa: E402
import numpy as np  # noqa: E402
import plotly.graph_objects as go  # noqa: E402
from jax.scipy.stats.norm import cdf as Phi  # noqa: N812, E402
from jax.scipy.stats.norm import pdf as phi_pdf  # noqa: E402

from lcm.shocks._base import _gauss_hermite_normal  # noqa: E402

blue, orange, green = "#4C78A8", "#F58518", "#54A24B"

IID Shocks

IID shocks are draws from a fixed distribution each period — no persistence. pylcm provides Uniform, Normal, LogNormal, and NormalMixture in lcm.shocks.iid. Each must specify grid points and transition probabilities (weights). Because the draws are independent, every row of the transition matrix $P$ is identical.
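Because every row of $P$ equals the weight vector for an IID shock, the matrix can be built by tiling (a minimal sketch of the structure with made-up numbers, not the pylcm internals):

```python
import numpy as np

# Hypothetical 3-point IID shock: grid points and weights.
points = np.array([-1.0, 0.0, 1.0])
weights = np.array([0.25, 0.5, 0.25])

# For an IID shock, P[i, j] = w_j regardless of the current state i,
# so the transition matrix is the weight vector repeated in every row.
P = np.tile(weights, (len(points), 1))
print(P)  # each row equals the weight vector
```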

Uniform

The simplest case: $n$ equally spaced points on $[a, b]$ with equal weights $w_i = 1/n$.

Reference: iid.Uniform.compute_gridpoints (line 45 of iid.py) and compute_transition_probs (lines 48–53).
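The scheme fits in a few lines (a sketch of the formula above, not pylcm's implementation; the function name is illustrative):

```python
import numpy as np

def uniform_grid(a, b, n):
    """Equally spaced points on [a, b] with equal weights 1/n."""
    points = np.linspace(a, b, n)
    weights = np.full(n, 1.0 / n)
    return points, weights

points, weights = uniform_grid(0.0, 1.0, 5)
print(points)   # [0.   0.25 0.5  0.75 1.  ]
print(weights)  # five entries of 0.2
```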

Normal (Gauss-Hermite Quadrature)

For $\varepsilon \sim N(\mu, \sigma^2)$, Gauss-Hermite quadrature approximates integrals against the Gaussian weight function:

$$\int_{-\infty}^{\infty} f(x)\, e^{-x^2}\, dx \;\approx\; \sum_{i=1}^n w_i\, f(z_i)$$

where $\{z_i\}$ and $\{w_i\}$ are the physicists' Hermite nodes and weights.

To integrate against $N(\mu, \sigma^2)$ instead, apply the affine transformation $x_i = \mu + \sqrt{2}\,\sigma\,z_i$ and rescale the weights $\tilde{w}_i = w_i / \sqrt{\pi}$. Then

$$\int_{-\infty}^{\infty} f(x)\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}\, dx \;\approx\; \sum_{i=1}^n \tilde{w}_i\, f(x_i).$$

Reference: _base._gauss_hermite_normal (lines 16–29 of _base.py).
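The transformation can be checked with NumPy's physicists' Gauss-Hermite routine; since the rule is exact for polynomials up to degree $2n-1$, the discrete mean and variance should reproduce $\mu$ and $\sigma^2$ to machine precision (a sketch of the formulas above, not the pylcm implementation):

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def gauss_hermite_normal(n, mu, sigma):
    """Nodes and weights for integrating against N(mu, sigma^2)."""
    z, w = hermgauss(n)                # physicists' nodes/weights for e^{-x^2}
    x = mu + np.sqrt(2.0) * sigma * z  # affine transform of the nodes
    w_tilde = w / np.sqrt(np.pi)       # rescaled weights sum to 1
    return x, w_tilde

x, w = gauss_hermite_normal(7, mu=1.0, sigma=2.0)
print(w.sum())             # ~1.0
print(w @ x)               # ~1.0 (mean)
print(w @ (x - 1.0) ** 2)  # ~4.0 (variance)
```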

CDF-based binning (alternative)

When gauss_hermite=False, Normal uses $n$ equally spaced points spanning $\mu \pm m\sigma$ and assigns weights by CDF binning. Define midpoints $m_j = (x_j + x_{j+1})/2$ between consecutive grid points. Then

$$w_j = \begin{cases} \Phi\!\left(\dfrac{m_1 - \mu}{\sigma}\right) & j = 1 \\[6pt] \Phi\!\left(\dfrac{m_j - \mu}{\sigma}\right) - \Phi\!\left(\dfrac{m_{j-1} - \mu}{\sigma}\right) & 1 < j < n \\[6pt] 1 - \Phi\!\left(\dfrac{m_{n-1} - \mu}{\sigma}\right) & j = n \end{cases}$$

The first and last bins absorb the tails.

Reference: iid.Normal.compute_transition_probs (lines 110–125 of iid.py).

Source
n_points = 5
mu, sigma = 0.0, 1.0
gh_nodes, gh_weights = _gauss_hermite_normal(n_points=n_points, mu=mu, sigma=sigma)
n_std = 3.0
eq_nodes = jnp.linspace(mu - n_std * sigma, mu + n_std * sigma, n_points)
step = eq_nodes[1] - eq_nodes[0]
half_step = 0.5 * step
eq_weights = jnp.zeros(n_points)
eq_weights = eq_weights.at[1:].set(jnp.diff(Phi((eq_nodes + half_step - mu) / sigma)))
eq_weights = eq_weights.at[0].set(Phi((eq_nodes[0] + half_step - mu) / sigma))
eq_weights = eq_weights.at[-1].set(1 - Phi((eq_nodes[-1] - half_step - mu) / sigma))
fig = go.Figure()
fig.add_trace(
    go.Bar(
        x=np.asarray(eq_nodes),
        y=np.asarray(eq_weights),
        marker_color=orange,
        width=0.15,
        opacity=0.7,
        name="Equally spaced + CDF",
    ),
)
fig.add_trace(
    go.Bar(
        x=np.asarray(gh_nodes),
        y=np.asarray(gh_weights),
        marker_color=blue,
        width=0.15,
        opacity=0.7,
        name="Gauss-Hermite",
    ),
)
fig.update_layout(
    template="plotly_dark",
    title=f"Normal quadrature: \u03bc = {mu}, σ = {sigma}, n = {n_points}",
    xaxis_title="Node",
    yaxis_title="Weight",
    barmode="overlay",
    height=400,
    width=700,
    legend={"yanchor": "top", "y": 0.98, "xanchor": "right", "x": 0.98},
)
fig.show()

Gauss-Hermite nodes cluster near the mean where the density is highest, so the central nodes carry the largest weights. Equally spaced nodes spread uniformly across $\pm 3\sigma$; the outer nodes fall in the tails where little probability mass lives, so their CDF-binned weights are small. With these parameters the two schemes place nodes at quite different locations, though the exact placement depends on n_points and n_std. Gauss-Hermite is generally more accurate for smooth integrands.

LogNormal

If $X \sim N(\mu, \sigma^2)$, then $Y = e^X$ is log-normal. Grid points are $y_i = e^{x_i}$, where $\{x_i\}$ are the normal nodes (GH or equally spaced); weights are unchanged.

Reference: iid.LogNormal.compute_gridpoints (lines 163–169 of iid.py).

Source
mu_ln, sigma_ln = 0.0, 0.5
normal_nodes, normal_weights = _gauss_hermite_normal(
    n_points=n_points, mu=mu_ln, sigma=sigma_ln
)
lognormal_nodes = jnp.exp(normal_nodes)
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=np.asarray(normal_nodes),
        y=np.asarray(normal_weights),
        mode="markers+lines",
        marker={"color": blue, "size": 8},
        line={"color": blue, "width": 1, "dash": "dot"},
        name="Normal nodes",
    )
)
fig.add_trace(
    go.Scatter(
        x=np.asarray(lognormal_nodes),
        y=np.asarray(normal_weights),
        mode="markers+lines",
        marker={"color": orange, "size": 8},
        line={"color": orange, "width": 1, "dash": "dot"},
        name="LogNormal nodes",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=f"LogNormal: \u03bc = {mu_ln}, σ = {sigma_ln}, n = {n_points}",
    xaxis_title="Node value",
    yaxis_title="Weight",
    height=400,
    width=700,
)
fig.show()

The exponential transform $y_i = e^{x_i}$ shifts all nodes to the right and compresses negative normal nodes into $(0, 1)$ while stretching positive ones above 1. The weights stay the same — only the node positions change. This ensures that expectations $E[f(e^X)]$ are computed correctly for log-normal shocks using the same Gauss-Hermite machinery.

NormalMixture

When the shock follows a two-component normal mixture $\varepsilon \sim p_1\, N(\mu_1, \sigma_1^2) + (1-p_1)\, N(\mu_2, \sigma_2^2)$, NormalMixture uses equally spaced points spanning the mixture mean $\pm m$ mixture standard deviations and assigns weights via CDF binning with the mixture CDF in place of $\Phi$: the same approach described in the Non-Normal Innovations section below, but without persistence.

Reference: iid.NormalMixture in iid.py.
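The CDF-binning step with a mixture CDF can be sketched as follows (illustrative parameters and function names, not the pylcm API; `norm_cdf` is built from `math.erf` to stay dependency-free):

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(x, mu, sigma):
    """Scalar normal CDF via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

def mixture_cdf(x, p1, mu1, s1, mu2, s2):
    return p1 * norm_cdf(x, mu1, s1) + (1 - p1) * norm_cdf(x, mu2, s2)

p1, mu1, s1 = 0.9, 0.0, 0.1  # majority component
mu2, s2 = -0.5, 0.3          # minority component

# Mixture mean/variance, then equally spaced points over mean +/- 3 SD.
mean = p1 * mu1 + (1 - p1) * mu2
var = p1 * (s1**2 + mu1**2) + (1 - p1) * (s2**2 + mu2**2) - mean**2
n, m = 7, 3.0
points = np.linspace(mean - m * var**0.5, mean + m * var**0.5, n)

# CDF binning at midpoints, with the mixture CDF in place of Phi.
mid = (points[:-1] + points[1:]) / 2
cdf = np.array([mixture_cdf(x, p1, mu1, s1, mu2, s2) for x in mid])
weights = np.concatenate([[cdf[0]], np.diff(cdf), [1.0 - cdf[-1]]])
print(weights.sum())  # 1.0 by construction: the end bins absorb the tails
```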

AR(1) Processes

The canonical process (Sargent & Stachurski (2024) notation):

$$X_{t+1} = \rho\, X_t + b + \nu\,\varepsilon_{t+1}, \quad |\rho| < 1, \quad \varepsilon_t \stackrel{\text{IID}}{\sim} N(0, 1)$$

Stationary distribution: $\psi^* = N(\mu_x,\, \sigma_x^2)$ with $\mu_x = b/(1-\rho)$ and $\sigma_x^2 = \nu^2/(1-\rho^2)$.

The goal: construct a finite state space $X = \{x_1, \ldots, x_n\}$ and an $n \times n$ transition matrix $P$ with $P_{ij} = \Pr(X_{t+1} = x_j \mid X_t = x_i)$, such that the Markov chain $(X_t)$ on $X$ has moments close to those of the continuous process.

| Symbol | Meaning | pylcm name |
| --- | --- | --- |
| $b$ | intercept (drift) | mu |
| $\nu$ | innovation std dev | sigma |
| $\rho$ | persistence | rho |
| $\mu_x = b/(1-\rho)$ | unconditional mean | |
| $\sigma_x = \nu/\sqrt{1-\rho^2}$ | unconditional std | std_y in code |
| $m$ | grid half-span in SDs | n_std |
| $s$ | step size $(x_n - x_1)/(n-1)$ | step |
| $X = \{x_1, \ldots, x_n\}$ | discrete state space | grid points |
| $P$ | transition matrix | return of compute_transition_probs |
| $\Phi$ | standard normal CDF | jax.scipy.stats.norm.cdf |
| $\psi^*$ | stationary distribution | |
Source
def ar1_moments(rho, sigma, mu):
    """Compute theoretical moments of an AR(1) process."""

    mu_x = mu / (1 - rho)
    sigma_x = sigma / jnp.sqrt(1 - rho**2)
    return mu_x, sigma_x


def conditional_density(x_prime, x_i, rho, sigma, mu):
    """PDF of X_{t+1} | X_t = x_i for a normal-innovation AR(1)."""

    cond_mean = rho * x_i + mu
    return jnp.exp(-0.5 * ((x_prime - cond_mean) / sigma) ** 2) / (
        sigma * jnp.sqrt(2 * jnp.pi)
    )

Tauchen (1986)

Following Sargent & Stachurski (2024), Section 3.1.3:

1. Grid. Choose $m$ (number of unconditional SDs) and set

$$x_n = \mu_x + m\,\sigma_x, \qquad x_1 = \mu_x - m\,\sigma_x, \qquad s = \frac{x_n - x_1}{n - 1}.$$

2. Transition probabilities. The conditional distribution of $X_{t+1}$ given $X_t = x_i$ is $N(\rho\, x_i + b,\; \nu^2)$. Bin the probability mass at midpoints $x_j \pm s/2$:

$$P_{ij} = \begin{cases} \Phi\!\left(\dfrac{x_1 + s/2 - \rho\, x_i - b}{\nu}\right) & j = 1 \\[6pt] \Phi\!\left(\dfrac{x_j + s/2 - \rho\, x_i - b}{\nu}\right) - \Phi\!\left(\dfrac{x_j - s/2 - \rho\, x_i - b}{\nu}\right) & 1 < j < n \\[6pt] 1 - \Phi\!\left(\dfrac{x_n - s/2 - \rho\, x_i - b}{\nu}\right) & j = n \end{cases}$$

Note: the denominator is $\nu$ (innovation std), not $\sigma_x$ (unconditional std).

Reference: ar1.Tauchen.compute_gridpoints (lines 80–89 of ar1.py) and compute_transition_probs (lines 91–121).

Tauchen with Gauss-Hermite nodes

When gauss_hermite=True, the grid uses Gauss-Hermite quadrature nodes of $N(\mu_x, \sigma_x^2)$ instead of equally spaced points. Transition probabilities use CDF binning at midpoints between consecutive GH nodes (same idea, different node placement).

Reference: ar1.Tauchen.compute_gridpoints (lines 82–84 of ar1.py) and compute_transition_probs (lines 95–108).

Source
rho, sigma_ar, mu_ar = 0.85, 0.127, 0.0
mu_x, sigma_x = ar1_moments(rho, sigma_ar, mu_ar)

# Gauss-Hermite Tauchen grid
std_y = jnp.sqrt(sigma_ar**2 / (1 - rho**2))
gh_ar_nodes, _ = _gauss_hermite_normal(n_points=n_points, mu=mu_x, sigma=std_y)

# CDF-based transition probs at midpoints between GH nodes
midpoints = (gh_ar_nodes[:-1] + gh_ar_nodes[1:]) / 2
cdf_vals = Phi((midpoints[None, :] - rho * gh_ar_nodes[:, None]) / sigma_ar)
first_col = cdf_vals[:, :1]
last_col = 1 - cdf_vals[:, -1:]
P_gh = jnp.concatenate([first_col, jnp.diff(cdf_vals, axis=1), last_col], axis=1)
fig = go.Figure()
fig.add_trace(
    go.Bar(
        x=np.asarray(gh_ar_nodes),
        y=np.asarray(P_gh[0]),
        marker_color=blue,
        width=0.04,
        opacity=0.7,
        name=f"From state 0 (x = {float(gh_ar_nodes[0]):.2f})",
    )
)
fig.add_trace(
    go.Bar(
        x=np.asarray(gh_ar_nodes),
        y=np.asarray(P_gh[2]),
        marker_color=orange,
        width=0.04,
        opacity=0.7,
        name=f"From state 2 (x = {float(gh_ar_nodes[2]):.2f})",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=(
        f"Tauchen (GH): transition probabilities"
        f" (ρ = {rho}, σ = {sigma_ar}, n = {n_points})"
    ),
    xaxis_title="x′",
    yaxis_title="P[i, j]",
    barmode="overlay",
    height=400,
    width=700,
)
fig.show()

Two rows of the transition matrix are shown: from the lowest state (state 0, far left of the grid) and from the middle state (state 2). Starting from a low value of $x$, the AR(1) conditional mean $\rho x + b$ is negative, so the transition probabilities shift leftward. From the middle state the distribution is centred. The GH nodes are unevenly spaced — wider in the tails — so each bin captures more probability mass there than equally spaced bins would.

Source
# Tauchen: equally spaced grid and transition matrix
n_std_ar = 3
x_max = n_std_ar * sigma_x
x_tauchen = jnp.linspace(mu_x - x_max, mu_x + x_max, n_points)
step_t = (2 * x_max) / (n_points - 1)
z = x_tauchen[None, :] - rho * x_tauchen[:, None]
upper = Phi((z + 0.5 * step_t - mu_ar) / sigma_ar)
lower = Phi((z - 0.5 * step_t - mu_ar) / sigma_ar)
P_tauchen = upper - lower
P_tauchen = P_tauchen.at[:, 0].set(upper[:, 0])
P_tauchen = P_tauchen.at[:, -1].set(1 - lower[:, -1])

# Transition from middle state, overlaid with true conditional density
row_idx = n_points // 2
x_dense = jnp.linspace(float(x_tauchen[0]) - 0.3, float(x_tauchen[-1]) + 0.3, 300)
pdf_vals = conditional_density(x_dense, x_tauchen[row_idx], rho, sigma_ar, mu_ar)
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=np.asarray(x_dense),
        y=np.asarray(pdf_vals),
        mode="lines",
        line={"color": "gray", "width": 1},
        name=f"True density given x = {float(x_tauchen[row_idx]):.2f}",
    )
)
fig.add_trace(
    go.Bar(
        x=np.asarray(x_tauchen),
        y=np.asarray(P_tauchen[row_idx] / step_t),
        width=float(step_t) * 0.9,
        marker_color=blue,
        opacity=0.6,
        name="Tauchen probabilities (scaled)",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=f"Tauchen: transition from x = {float(x_tauchen[row_idx]):.2f}"
    f" (ρ = {rho}, σ = {sigma_ar}, n = {n_points})",
    xaxis_title="x′",
    yaxis_title="Density / Scaled probability",
    height=400,
    width=700,
)
fig.show()

The bars show Tauchen transition probabilities $P_{ij}$ from the middle grid point, scaled by $1/s$ so they can be compared with the continuous conditional density $N(\rho x_i + b,\, \nu^2)$ (gray curve). The match is reasonable for 5 points — each bar captures the probability mass in its bin.

Source
# Stationary distribution by repeated matrix multiplication
psi = jnp.ones(n_points) / n_points
for _ in range(1000):
    psi = psi @ P_tauchen

# Theoretical stationary density: N(mu_x, sigma_x^2)
x_stat = jnp.linspace(float(x_tauchen[0]) - 0.2, float(x_tauchen[-1]) + 0.2, 300)
pdf_stat = jnp.exp(-0.5 * ((x_stat - mu_x) / sigma_x) ** 2) / (
    sigma_x * jnp.sqrt(2 * jnp.pi)
)
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=np.asarray(x_stat),
        y=np.asarray(pdf_stat),
        mode="lines",
        line={"color": "gray", "width": 1},
        name=f"N({float(mu_x):.1f}, {float(sigma_x):.3f}²)",
    )
)
fig.add_trace(
    go.Bar(
        x=np.asarray(x_tauchen),
        y=np.asarray(psi / step_t),
        width=float(step_t) * 0.9,
        marker_color=blue,
        opacity=0.6,
        name="Tauchen stationary dist.",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=(
        f"Tauchen: stationary distribution"
        f" (ρ = {rho}, σ = {sigma_ar}, n = {n_points})"
    ),
    xaxis_title="x",
    yaxis_title="Density",
    height=400,
    width=700,
)
fig.show()

The stationary distribution of the Tauchen chain (bars) is compared with the theoretical stationary density $\psi^* = N(\mu_x, \sigma_x^2)$ of the continuous AR(1) process (gray curve). The chain's stationary vector $\psi$ is obtained by repeatedly multiplying an arbitrary initial distribution by $P$. For moderate persistence ($\rho = 0.85$) and 5 points, the approximation tracks the Gaussian well.

Limitations

Tauchen degrades for high persistence ($\rho \geq 0.97$) and few grid points, because the equally spaced grid cannot capture the tails of the near-unit-root process (Fella et al., 2019, Tables 1–5). Rouwenhorst addresses this.

Rouwenhorst (1995)

Following Kopecky & Suen (2010):

1. Grid. Set

$$x_1 = \mu_x - \frac{\nu\sqrt{n-1}}{\sqrt{1-\rho^2}}, \qquad x_n = \mu_x + \frac{\nu\sqrt{n-1}}{\sqrt{1-\rho^2}}$$

equally spaced. (Equivalently, the half-span is $\sigma_x\sqrt{n-1}$.)

2. Transition matrix. Define $q = (1+\rho)/2$. For $n = 2$:

$$P_2 = \begin{pmatrix} q & 1-q \\ 1-q & q \end{pmatrix}$$

For $n > 2$, apply the recursion (Kopecky & Suen (2010), Eq. 3):

$$P_n = q \begin{pmatrix} P_{n-1} & \mathbf{0} \\ \mathbf{0}' & 0 \end{pmatrix} + (1-q) \begin{pmatrix} \mathbf{0} & P_{n-1} \\ 0 & \mathbf{0}' \end{pmatrix} + (1-q) \begin{pmatrix} \mathbf{0}' & 0 \\ P_{n-1} & \mathbf{0} \end{pmatrix} + q \begin{pmatrix} 0 & \mathbf{0}' \\ \mathbf{0} & P_{n-1} \end{pmatrix}$$

with all rows except the first and last divided by 2 to normalize.

The closed-form expression (used in pylcm):

$$P_{ij} = \sum_k \binom{i}{k} \binom{n-1-i}{j-k}\, q^{\,n-1-i-j+2k}\, (1-q)^{\,i+j-2k}$$

Reference: ar1.Rouwenhorst.compute_gridpoints (lines 157–161 of ar1.py) and compute_transition_probs (lines 163–187).

Exact Moment Matching

The key advantage (Fella et al., 2019, Eqs. 9, 11; Kopecky & Suen, 2010, Propositions 1–2):

  • Conditional mean: $\sum_j P_{ij}\, x_j = \rho\, x_i + b$ (exact)

  • Conditional variance: $\sum_j P_{ij}\,(x_j - \rho\, x_i - b)^2 = \nu^2$ (exact)

These hold for any $n \geq 2$. Tauchen only approximates them.
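Both properties can be verified numerically with the closed-form binomial expression (a self-contained sketch with illustrative parameters and $b = 0$, so the conditional mean is $\rho x_i$):

```python
import numpy as np
from math import comb

def rouwenhorst(n, rho, nu, mu_x=0.0):
    """Rouwenhorst grid and closed-form transition matrix (b = 0 case)."""
    q = (1 + rho) / 2
    half_span = nu * np.sqrt(n - 1) / np.sqrt(1 - rho**2)
    x = np.linspace(mu_x - half_span, mu_x + half_span, n)
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # k runs over indices where both binomial coefficients are valid
            for k in range(max(0, i + j - n + 1), min(i, j) + 1):
                P[i, j] += (
                    comb(i, k) * comb(n - 1 - i, j - k)
                    * q ** (n - 1 - i - j + 2 * k)
                    * (1 - q) ** (i + j - 2 * k)
                )
    return x, P

x, P = rouwenhorst(7, rho=0.95, nu=0.1)
# Conditional mean error: should be zero up to floating-point roundoff.
print(np.max(np.abs(P @ x - 0.95 * x)))
# Conditional variance error: likewise ~0.
cond_var = (P * (x[None, :] - 0.95 * x[:, None]) ** 2).sum(axis=1)
print(np.max(np.abs(cond_var - 0.1**2)))
```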

Source
from math import comb

# Rouwenhorst grid
nu_r = jnp.sqrt((n_points - 1) / (1 - rho**2)) * sigma_ar
x_rouw = jnp.linspace(mu_x - nu_r, mu_x + nu_r, n_points)
step_r = x_rouw[1] - x_rouw[0]

# Transition matrix (closed-form binomial)
q = (rho + 1) / 2
C = jnp.array([[comb(nn, kk) for kk in range(n_points)] for nn in range(n_points)])
i = jnp.arange(n_points)[:, None, None]
j = jnp.arange(n_points)[None, :, None]
k = jnp.arange(n_points)[None, None, :]
valid = (k >= jnp.maximum(0, i + j - n_points + 1)) & (k <= jnp.minimum(i, j))
k_s = jnp.where(valid, k, 0)
jmk = jnp.where(valid, j - k, 0)
terms = (
    C[i, k_s]
    * C[n_points - 1 - i, jmk]
    * q ** (n_points - 1 - i - j + 2 * k_s)
    * (1 - q) ** (i + j - 2 * k_s)
)
P_rouw = jnp.where(valid, terms, 0.0).sum(axis=-1)

# Transition from middle state
pdf_rouw = conditional_density(x_dense, x_rouw[n_points // 2], rho, sigma_ar, mu_ar)
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=np.asarray(x_dense),
        y=np.asarray(pdf_rouw),
        mode="lines",
        line={"color": "gray", "width": 1},
        name=f"True density given x = {float(x_rouw[n_points // 2]):.2f}",
    )
)
fig.add_trace(
    go.Bar(
        x=np.asarray(x_rouw),
        y=np.asarray(P_rouw[n_points // 2] / step_r),
        width=float(step_r) * 0.9,
        marker_color=green,
        opacity=0.6,
        name="Rouwenhorst probabilities (scaled)",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=f"Rouwenhorst: transition from x = {float(x_rouw[n_points // 2]):.2f}"
    f" (ρ = {rho}, σ = {sigma_ar}, n = {n_points})",
    xaxis_title="x′",
    yaxis_title="Density / Scaled probability",
    height=400,
    width=700,
)
fig.show()

Same layout as the Tauchen transition plot above: Rouwenhorst probabilities from the middle state (scaled by $1/s$) overlaid on the true conditional density. The Rouwenhorst grid is wider than Tauchen's (half-span $\sigma_x\sqrt{n-1}$ vs $m\sigma_x$), so the bars extend further into the tails. At moderate persistence the visual fit is similar to Tauchen, but the exact moment-matching properties (shown next) make a real difference at high $\rho$.

Source
# High-persistence case: rho = 0.98
rho_hi = 0.98
mu_x_hi, sigma_x_hi = ar1_moments(rho_hi, sigma_ar, mu_ar)

# Tauchen
x_max_hi = n_std_ar * sigma_x_hi
x_t_hi = jnp.linspace(mu_x_hi - x_max_hi, mu_x_hi + x_max_hi, n_points)
step_t_hi = (2 * x_max_hi) / (n_points - 1)
z_hi = x_t_hi[None, :] - rho_hi * x_t_hi[:, None]
upper_hi = Phi((z_hi + 0.5 * step_t_hi - mu_ar) / sigma_ar)
lower_hi = Phi((z_hi - 0.5 * step_t_hi - mu_ar) / sigma_ar)
P_t_hi = upper_hi - lower_hi
P_t_hi = P_t_hi.at[:, 0].set(upper_hi[:, 0])
P_t_hi = P_t_hi.at[:, -1].set(1 - lower_hi[:, -1])

# Rouwenhorst
nu_r_hi = jnp.sqrt((n_points - 1) / (1 - rho_hi**2)) * sigma_ar
x_r_hi = jnp.linspace(mu_x_hi - nu_r_hi, mu_x_hi + nu_r_hi, n_points)
step_r_hi = x_r_hi[1] - x_r_hi[0]
q_hi = (rho_hi + 1) / 2
C_hi = jnp.array([[comb(nn, kk) for kk in range(n_points)] for nn in range(n_points)])
i_hi = jnp.arange(n_points)[:, None, None]
j_hi = jnp.arange(n_points)[None, :, None]
k_hi = jnp.arange(n_points)[None, None, :]
valid_hi = (k_hi >= jnp.maximum(0, i_hi + j_hi - n_points + 1)) & (
    k_hi <= jnp.minimum(i_hi, j_hi)
)
k_s_hi = jnp.where(valid_hi, k_hi, 0)
jmk_hi = jnp.where(valid_hi, j_hi - k_hi, 0)
terms_hi = (
    C_hi[i_hi, k_s_hi]
    * C_hi[n_points - 1 - i_hi, jmk_hi]
    * q_hi ** (n_points - 1 - i_hi - j_hi + 2 * k_s_hi)
    * (1 - q_hi) ** (i_hi + j_hi - 2 * k_s_hi)
)
P_r_hi = jnp.where(valid_hi, terms_hi, 0.0).sum(axis=-1)

# Stationary distributions
psi_t_hi = jnp.ones(n_points) / n_points
psi_r_hi = jnp.ones(n_points) / n_points
for _ in range(1000):
    psi_t_hi = psi_t_hi @ P_t_hi
    psi_r_hi = psi_r_hi @ P_r_hi

# Theoretical density
x_range = max(float(x_max_hi), float(nu_r_hi)) * 1.3
x_stat_hi = jnp.linspace(-x_range, x_range, 300)
pdf_stat_hi = jnp.exp(-0.5 * ((x_stat_hi - mu_x_hi) / sigma_x_hi) ** 2) / (
    sigma_x_hi * jnp.sqrt(2 * jnp.pi)
)

# Use x_t_hi as the common x-axis for both bar sets
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=np.asarray(x_stat_hi),
        y=np.asarray(pdf_stat_hi),
        mode="lines",
        line={"color": "gray", "width": 1},
        name="N(\u03bcₓ, σₓ²)",
    )
)
fig.add_trace(
    go.Bar(
        x=np.asarray(x_t_hi),
        y=np.asarray(psi_t_hi / step_t_hi),
        width=float(step_t_hi) * 0.85,
        marker_color=blue,
        opacity=0.5,
        name="Tauchen",
    )
)
fig.add_trace(
    go.Bar(
        x=np.asarray(x_r_hi),
        y=np.asarray(psi_r_hi / step_r_hi),
        width=float(step_r_hi) * 0.85,
        marker_color=green,
        opacity=0.5,
        name="Rouwenhorst",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=f"Stationary distributions (ρ = {rho_hi}, σ = {sigma_ar}, n = {n_points})",
    xaxis_title="x",
    yaxis_title="Density",
    barmode="overlay",
    height=400,
    width=700,
)
fig.show()

With $\rho = 0.98$ and only 5 grid points, Tauchen's stationary distribution (blue) concentrates too much mass on the extreme states — the equally spaced grid cannot span the wide unconditional distribution while also resolving the interior. Rouwenhorst (green) uses a wider grid (half-span $\sigma_x\sqrt{n-1}$) with exact moment matching and produces a much smoother approximation of the Gaussian target.

Source
# Conditional mean errors: Tauchen vs Rouwenhorst
cond_mean_t = P_tauchen @ x_tauchen
true_cond_mean_t = rho * x_tauchen + mu_ar
mean_err_t = jnp.abs(cond_mean_t - true_cond_mean_t)
cond_mean_r = P_rouw @ x_rouw
true_cond_mean_r = rho * x_rouw + mu_ar
mean_err_r = jnp.abs(cond_mean_r - true_cond_mean_r)
fig = go.Figure()
fig.add_trace(
    go.Bar(
        x=np.arange(n_points),
        y=np.asarray(mean_err_t),
        marker_color=blue,
        opacity=0.7,
        name="Tauchen",
    )
)
fig.add_trace(
    go.Bar(
        x=np.arange(n_points),
        y=np.asarray(mean_err_r),
        marker_color=green,
        opacity=0.7,
        name="Rouwenhorst",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=f"Conditional mean error (ρ = {rho}, σ = {sigma_ar}, n = {n_points})",
    xaxis_title="Grid point index i",
    yaxis_title="Absolute error",
    yaxis_type="log",
    yaxis_exponentformat="e",
    yaxis_range=[-16, None],
    barmode="overlay",
    height=400,
    width=700,
)
fig.show()

The plot shows the absolute error in the conditional mean at each grid point $x_i$:

$$\Bigl|\, \textstyle\sum_j P_{ij}\, x_j \;-\; (\rho\, x_i + b) \,\Bigr|$$

This is the difference between the Markov chain's expected next-period value and the true conditional mean of the AR(1) process. Tauchen only approximates the conditional mean, with errors around 1e-2 that vary across grid points. Rouwenhorst matches it exactly in exact arithmetic (Kopecky & Suen, 2010, Proposition 1); the residual errors visible here (around 1e-15) are floating-point roundoff at double-precision machine epsilon.

Non-Normal Innovations (FGP Section 4.3)

When $\varepsilon_t$ has a mixture-of-normals distribution:

$$\varepsilon_t \sim p_1\, N(\mu_1, \sigma_1^2) + (1 - p_1)\, N(\mu_2, \sigma_2^2)$$

the only change to Tauchen's method is replacing $\Phi$ with the mixture CDF (Fella et al., 2019, Section 4.3):

$$F_\varepsilon(x) = p_1\, \Phi\!\left(\frac{x - \mu_1}{\sigma_1}\right) + (1 - p_1)\, \Phi\!\left(\frac{x - \mu_2}{\sigma_2}\right)$$

The grid spans $\pm m\,\sigma_x$ as before, with the unconditional variance computed from the mixture:

$$\text{Var}(\varepsilon) = p_1(\sigma_1^2 + \mu_1^2) + (1 - p_1)(\sigma_2^2 + \mu_2^2) - [p_1\,\mu_1 + (1 - p_1)\,\mu_2]^2$$

The example below uses $p_1 = 0.9$, $\mu_1 = 0$, $\sigma_1 = 0.1$ for a tight “normal times” component and $\mu_2 = -0.5$, $\sigma_2 = 0.3$ for a rare bad-shock component — think of a 10% probability of job loss, where the new job typically pays less and the outcome is more uncertain. This produces a distribution with a heavier left tail but a lighter right tail than a Gaussian with the same variance: large negative shocks are more likely, large positive shocks less likely.
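The mixture mean and variance formulas can be sanity-checked by Monte Carlo with the example's parameters; the negative sample skewness confirms the heavier left tail (a quick sketch, independent of pylcm):

```python
import numpy as np

p1, mu1, s1 = 0.9, 0.0, 0.1
mu2, s2 = -0.5, 0.3

# Analytical mixture moments from the formulas above.
mean = p1 * mu1 + (1 - p1) * mu2
var = p1 * (s1**2 + mu1**2) + (1 - p1) * (s2**2 + mu2**2) - mean**2

# Monte Carlo: draw the component indicator, then the normal within it.
rng = np.random.default_rng(0)
use_first = rng.random(1_000_000) < p1
draws = np.where(
    use_first,
    rng.normal(mu1, s1, use_first.shape),
    rng.normal(mu2, s2, use_first.shape),
)
print(mean, draws.mean())  # analytical vs simulated mean
print(var, draws.var())    # analytical vs simulated variance
skew = ((draws - draws.mean()) ** 3).mean() / draws.std() ** 3
print(skew)                # negative: heavier left tail
```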

Reference: _base._mixture_cdf in _base.py and ar1.TauchenNormalMixture.compute_transition_probs in ar1.py. The IID variant iid.NormalMixture uses the same mixture CDF for CDF-binning weights without persistence.

Source
p1, mu1, sigma1 = 0.9, 0.0, 0.1
mu2, sigma2 = -0.5, 0.3
x_pdf = jnp.linspace(-1.5, 1.0, 300)

f_mix = p1 * phi_pdf(x_pdf, loc=mu1, scale=sigma1) + (1 - p1) * phi_pdf(
    x_pdf, loc=mu2, scale=sigma2
)

# Normal PDF with same mean and variance
mean_eps = p1 * mu1 + (1 - p1) * mu2
var_eps = p1 * (sigma1**2 + mu1**2) + (1 - p1) * (sigma2**2 + mu2**2) - mean_eps**2
std_eps = jnp.sqrt(var_eps)
f_normal = phi_pdf(x_pdf, loc=mean_eps, scale=std_eps)
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=np.asarray(x_pdf),
        y=np.asarray(f_normal),
        mode="lines",
        line={"color": "gray", "width": 1},
        name="Normal (same mean/var)",
    )
)
fig.add_trace(
    go.Scatter(
        x=np.asarray(x_pdf),
        y=np.asarray(f_mix),
        mode="lines",
        line={"color": orange, "width": 2},
        name="Mixture PDF",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=(
        f"Mixture vs Normal PDF (p\u2081 = {p1}, \u03bc\u2081 = {mu1},"
        f" σ\u2081 = {sigma1}, \u03bc\u2082 = {mu2}, σ\u2082 = {sigma2})"
    ),
    xaxis_title="ε",
    yaxis_title="f(ε)",
    height=400,
    width=700,
)
fig.show()

The gray curve is a normal PDF with the same mean and variance as the mixture, computed from the mixture parameters: $\mu_\varepsilon = p_1 \mu_1 + (1-p_1)\mu_2$ and $\text{Var}(\varepsilon) = p_1(\sigma_1^2 + \mu_1^2) + (1-p_1)(\sigma_2^2 + \mu_2^2) - \mu_\varepsilon^2$. Any difference between the two curves is therefore purely due to higher-order moments (skewness, kurtosis). The mixture has a sharper central peak (from the tight majority component) and a heavier left tail (from the wide, negatively shifted minority component), but a lighter right tail — large positive shocks are less likely than under a Gaussian with the same variance.

Source
# Mixture transition matrix vs normal-innovation Tauchen
rho_mix, mu_mix, n_std_mix = 0.85, 0.0, 3
sigma_eps_sq = p1 * (sigma1**2 + mu1**2) + (1 - p1) * (sigma2**2 + mu2**2) - mean_eps**2
std_y_mix = jnp.sqrt(sigma_eps_sq / (1 - rho_mix**2))
long_run_mean_mix = (mu_mix + mean_eps) / (1 - rho_mix)
x_max_mix = n_std_mix * std_y_mix
x_mix = jnp.linspace(
    long_run_mean_mix - x_max_mix, long_run_mean_mix + x_max_mix, n_points
)
step_mix = (2 * x_max_mix) / (n_points - 1)
half_step_mix = 0.5 * step_mix

# Mixture transition matrix
z_mix = x_mix[None, :] - mu_mix - rho_mix * x_mix[:, None]
upper_mix = p1 * Phi((z_mix + half_step_mix - mu1) / sigma1) + (1 - p1) * Phi(
    (z_mix + half_step_mix - mu2) / sigma2
)
lower_mix = p1 * Phi((z_mix - half_step_mix - mu1) / sigma1) + (1 - p1) * Phi(
    (z_mix - half_step_mix - mu2) / sigma2
)
P_mix = upper_mix - lower_mix
P_mix = P_mix.at[:, 0].set(upper_mix[:, 0])
P_mix = P_mix.at[:, -1].set(1 - lower_mix[:, -1])

# Normal-innovation Tauchen (same unconditional std for comparison)
sigma_normal_equiv = jnp.sqrt(sigma_eps_sq)
mu_x_norm, sigma_x_norm = ar1_moments(rho_mix, sigma_normal_equiv, mu_mix)
x_max_norm = n_std_mix * sigma_x_norm
x_norm = jnp.linspace(mu_x_norm - x_max_norm, mu_x_norm + x_max_norm, n_points)
step_norm = (2 * x_max_norm) / (n_points - 1)
z_norm = x_norm[None, :] - rho_mix * x_norm[:, None]
upper_norm = Phi((z_norm + 0.5 * step_norm - mu_mix) / sigma_normal_equiv)
lower_norm = Phi((z_norm - 0.5 * step_norm - mu_mix) / sigma_normal_equiv)
P_norm = upper_norm - lower_norm
P_norm = P_norm.at[:, 0].set(upper_norm[:, 0])
P_norm = P_norm.at[:, -1].set(1 - lower_norm[:, -1])

# Compare one row (middle state)
row_idx_mix = n_points // 2
fig = go.Figure()
fig.add_trace(
    go.Bar(
        x=np.arange(n_points),
        y=np.asarray(P_norm[row_idx_mix]),
        marker_color=blue,
        opacity=0.7,
        name="Normal innovations",
    )
)
fig.add_trace(
    go.Bar(
        x=np.arange(n_points),
        y=np.asarray(P_mix[row_idx_mix]),
        marker_color=orange,
        opacity=0.7,
        name="Mixture innovations",
    )
)
fig.update_layout(
    template="plotly_dark",
    title=f"Transition from middle state (ρ = {rho_mix}, n = {n_points})",
    xaxis_title="Target state index j",
    yaxis_title="P[i, j]",
    barmode="overlay",
    height=400,
    width=700,
)
fig.show()

Both transition matrices use the same grid and the same unconditional variance, so they are directly comparable. The mixture-innovation row (orange) puts more probability on low target states and less on high target states than the normal-innovation row (blue). This reflects the left-skewed innovation distribution: conditional on the current state, large negative transitions are more likely and large positive transitions less likely than under a symmetric Gaussian. In a lifecycle context, this asymmetry matters beyond what the variance alone captures — agents face a higher probability of sharp income drops (job loss) with no corresponding upside, which can strengthen the precautionary savings motive.

References
  1. Sargent, T. J., & Stachurski, J. (2024). Dynamic Programming. QuantEcon. https://dp.quantecon.org/
  2. Fella, G., Gallipoli, G., & Pan, J. (2019). Markov-Chain Approximations for Life-Cycle Models. Review of Economic Dynamics, 34, 183–201. 10.1016/j.red.2019.03.013
  3. Kopecky, K. A., & Suen, R. M. H. (2010). Finite State Markov-Chain Approximations to Highly Persistent Processes. Review of Economic Dynamics, 13(3), 701–714. 10.1016/j.red.2010.02.002