From Stylised Facts to GARCH
Volatility, the tendency of asset prices to fluctuate, matters for:
Yet volatility is not directly observable: we must estimate it from price data.
By the end of this session, you should be able to:
Volatility is not merely hard to measure; it is fundamentally unobservable.
We have a sequence of daily returns. From these, we must infer the hidden variance process that generated them.
Every GARCH estimate carries irreducible inference error. This is not a limitation of the model; it is the nature of the problem.
GDP is measured with error, but it exists. In principle, sum every transaction and you have it.
Volatility \(\sigma_t\) is different. No amount of better data reveals it.
| Concept | Type of Problem |
|---|---|
| GDP | Measurement error: the true value exists; our instruments are imperfect |
| Inflation | Index construction: definition matters, but price changes are observable |
| \(\sigma_t\) | Latent state: the concept itself is model-dependent; no true value exists independently |
Every GARCH estimate is a belief, not a fact. Two economists with identical data but different models produce different estimates, and neither is wrong.
Here the textbook picture breaks down. GARCH measures volatility, but in modern financial systems it also causes it.
| Step | What Happens |
|---|---|
| 1. Shock | Adverse move raises GARCH-estimated volatility |
| 2. Signal | VaR limits are breached; positions must be cut |
| 3. Deleveraging | Institutions sell simultaneously, all running the same models |
| 4. Price impact | Coordinated selling drives prices lower |
| 5. Amplification | Lower prices generate new shocks → back to Step 1 |
Danielsson (2002) called this the emperor’s dilemma: the map changes the territory.
When all institutions share the same risk model, the model becomes the systemic risk (Danielsson, Shin, and Zigrand 2012).
Each episode shares the same amplification logic: different costumes, same mechanism.
2008 Global Financial Crisis
Bank VaR models simultaneously breached limits → mass deleveraging → forced selling of CDOs, MBS, and equities → further price falls → VaR breach again. The system tightened precisely when it needed to breathe.
February 2018: “Volmageddon”
Short-volatility ETNs (notably XIV) were contractually required to buy VIX futures when volatility spiked, a death spiral engineered into the product’s own hedging rules. XIV lost 93% of its value in a single session.
March 2020: COVID Crash
Margin calls forced simultaneous selling across equities, credit, and gold: assets with no fundamental correlation. The cross-asset contagion was a liquidity crisis, not a fundamental one. Cross-asset correlations moved sharply towards one.
The financial system has embedded a regulatory structure that forces procyclical behaviour during the very crises it is designed to contain.
Note
Macroprudential tools such as the Countercyclical Capital Buffer (CCyB) are a partial answer: releasing capital requirements in a downturn to reverse the deleveraging signal. The Bank of England deployed exactly this in March 2020.
We have now established three foundational arguments:
The rest of this session builds the tools (ARCH, GARCH, and their asymmetric extensions) that allow us to model, estimate, and forecast that hidden state. Understanding where these tools work well, and where the endogenous risk argument says they will fall short, is the animating question throughout.
Last week we modelled the mean of returns using ARIMA.
This week we model the variance, which turns out to be predictable even when returns are not.
| Week 3 Concept | Week 4 Extension |
|---|---|
| Stationarity | Conditional vs unconditional variance |
| ACF of returns | ACF of squared returns |
| ARMA for mean | GARCH for variance |
| AR(p) = lagged values | ARCH(q) = lagged squared errors |
Key insight: Returns show little or no autocorrelation (consistent with market efficiency), but volatility clusters.
Before building models, understand what we’re trying to capture:
These are empirical facts, not assumptions.
Tsay (2010): “Volatility is not constant over time. There are periods of high volatility alternating with periods of relative calm.”
Returns show no autocorrelation (consistent with efficiency). Squared returns show strong autocorrelation (volatility persistence).
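The contrast between the two autocorrelation functions can be checked numerically. The sketch below is a minimal illustration using only the standard library. The toy return series (alternating calm and turbulent blocks) and the helper name `autocorr` are assumptions made for the example, not data or code from the session.

```python
import random

def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom

# Toy returns (an assumption, not real data): blocks of 250 calm days
# (1% daily std) alternate with blocks of 250 turbulent days (3% daily std).
rng = random.Random(42)
returns = []
for day in range(10_000):
    sigma = 0.01 if (day // 250) % 2 == 0 else 0.03
    returns.append(sigma * rng.gauss(0.0, 1.0))

squared = [r ** 2 for r in returns]
acf_returns = autocorr(returns, 1)
acf_squared = autocorr(squared, 1)
print(f"lag-1 ACF of returns:         {acf_returns:+.3f}")
print(f"lag-1 ACF of squared returns: {acf_squared:+.3f}")
```

The sign of each return is unpredictable, so the first number should sit near zero. The size of each return depends on which block it falls in, so the second should be clearly positive. That is volatility clustering in its simplest form.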
Financial returns have fatter tails than Normal at every horizon examined.
The practitioner fix is to assume Student-\(t\) or GED (Generalised Error Distribution) innovations. The \(t\)-distribution with 5–8 degrees of freedom fits most equity series well. This is the default in Bloomberg’s GARCH estimation.
Negative returns increase volatility more than positive returns of the same magnitude.
First documented by Black (1976).
| Theory | Mechanism |
|---|---|
| Leverage hypothesis | Price falls → debt/equity rises → firm riskier |
| Volatility feedback | Expected volatility up → required return up → price falls |
| Risk premium | Higher expected volatility → higher risk premium |
| Behavioural | Investors react more strongly to losses |
| Margin constraints | Downturns trigger margin calls, forced selling |
All likely operate simultaneously.
Engle (1982) asked: What if variance depends on recent shocks?
\[r_t = \mu + \varepsilon_t, \quad \varepsilon_t = \sigma_t z_t, \quad z_t \sim N(0,1)\]
\[\sigma^2_t = \alpha_0 + \alpha_1 \varepsilon^2_{t-1} + \cdots + \alpha_q \varepsilon^2_{t-q}\]
Interpretation: Today’s volatility depends on recent surprises.
| Component | Meaning |
|---|---|
| \(\sigma^2_t\) | Conditional variance at time \(t\) |
| \(\alpha_0\) | Baseline variance |
| \(\alpha_i\) | Weight on shock from \(i\) periods ago |
| \(\varepsilon^2_{t-i}\) | Past squared shocks |
Problem: Need many lags to capture persistence → many parameters.
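To make the ARCH(q) recursion concrete, here is a minimal sketch that evaluates \(\sigma^2_t\) from a list of past shocks. The function name `arch_variance` and the numbers in the example are illustrative assumptions. In practice the \(\alpha_i\) would be estimated, not chosen by hand.

```python
def arch_variance(shocks, alpha0, alphas):
    """Conditional variances implied by an ARCH(q) model.

    shocks : past shocks eps_0, ..., eps_{n-1}
    alphas : [alpha_1, ..., alpha_q], the weight on the shock i periods ago
    Returns sigma^2_t for t = q, ..., n (the last entry is the forecast
    for the period after the final observed shock).
    """
    q = len(alphas)
    variances = []
    for t in range(q, len(shocks) + 1):
        var_t = alpha0 + sum(alphas[i] * shocks[t - 1 - i] ** 2 for i in range(q))
        variances.append(var_t)
    return variances

# Hypothetical ARCH(2) example: a 3% surprise followed by a quiet day.
shocks = [0.01, -0.03, 0.005]
variances = arch_variance(shocks, alpha0=1e-5, alphas=[0.3, 0.2])
print([f"{v:.6f}" for v in variances])
```

Each extra lag adds another entry to `alphas`, which is exactly the parameter-proliferation problem noted above.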
Bollerslev (1986) extended ARCH by including lagged conditional variances:
\[\sigma^2_t = \alpha_0 + \alpha_1 \varepsilon^2_{t-1} + \beta_1 \sigma^2_{t-1}\]
This is GARCH(1,1): one ARCH term, one GARCH term.
Brooks (2019): “A GARCH(1,1) model will be sufficient to capture the volatility clustering in the data.”
| Parameter | Name | Interpretation |
|---|---|---|
| \(\alpha_0\) | Constant | Floor of the conditional variance; pins down the long-run level \(\alpha_0 / (1 - \alpha_1 - \beta_1)\) |
| \(\alpha_1\) | ARCH term | Reaction to recent shocks |
| \(\beta_1\) | GARCH term | Persistence of volatility |
| \(\alpha_1 + \beta_1\) | Persistence | How long shocks affect volatility |
Tip
For most financial assets, \(\alpha_1 + \beta_1\) is close to (but less than) 1. Values above 0.9 are typical.
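A short simulation shows how the GARCH(1,1) recursion generates clustering. This is a sketch under stated assumptions: Normal innovations, zero mean, a start at the long-run variance, and illustrative parameter values. The function name `simulate_garch11` is hypothetical.

```python
import random

def simulate_garch11(n, alpha0, alpha1, beta1, seed=0):
    """Simulate n shocks and their conditional variances from GARCH(1,1)."""
    rng = random.Random(seed)
    var = alpha0 / (1.0 - alpha1 - beta1)  # start at the long-run variance
    shocks, variances = [], []
    for _ in range(n):
        eps = var ** 0.5 * rng.gauss(0.0, 1.0)  # eps_t = sigma_t * z_t
        shocks.append(eps)
        variances.append(var)
        # Next period's variance reacts to today's shock and today's variance.
        var = alpha0 + alpha1 * eps ** 2 + beta1 * var
    return shocks, variances

shocks, variances = simulate_garch11(1000, alpha0=1e-5, alpha1=0.08, beta1=0.90)
print(f"min daily vol: {min(variances) ** 0.5:.2%}")
print(f"max daily vol: {max(variances) ** 0.5:.2%}")
```

Because \(\beta_1\) is large, a single big shock raises the variance for many subsequent days, which is what a plot of `variances` would show as clusters.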
Typical GARCH(1,1) estimates on daily returns from major markets:
| Asset | \(\hat{\alpha}\) (ARCH) | \(\hat{\beta}\) (GARCH) | \(\hat{\alpha}+\hat{\beta}\) |
|---|---|---|---|
| S&P 500 | 0.07–0.09 | 0.88–0.91 | ~0.97 |
| FTSE 100 | 0.07–0.10 | 0.87–0.90 | ~0.97 |
| EUR/USD | 0.04–0.06 | 0.92–0.94 | ~0.98 |
| Gold | 0.04–0.06 | 0.91–0.93 | ~0.97 |
High persistence (\(\hat{\alpha}+\hat{\beta} \approx 0.97\)) means a shock today still explains roughly 63% of excess volatility three weeks (15 trading days) later, since \(0.97^{15} \approx 0.63\).
During the 2020 COVID crisis, S&P 500 persistence estimates briefly approached 0.999, not because the true process changed, but because extreme observations dominated the likelihood. This is the IGARCH warning signal in real data.
The unconditional variance (when \(\alpha_1 + \beta_1 < 1\)):
\[\sigma^2 = \frac{\alpha_0}{1 - \alpha_1 - \beta_1}\]
Once fitted, GARCH(1,1) produces \(h\)-step ahead forecasts through a simple mean-reversion recursion.
Let \(\bar{\sigma}^2 = \alpha_0 / (1 - \alpha_1 - \beta_1)\) be the long-run variance. Then:
\[E_t\left[\sigma^2_{t+h}\right] = \bar{\sigma}^2 + (\alpha_1 + \beta_1)^h \left(\sigma^2_t - \bar{\sigma}^2\right)\]
With persistence \(\alpha_1 + \beta_1 = 0.97\) and today’s variance doubled:
| Horizon | Excess variance remaining |
|---|---|
| 1 day | 97% |
| 1 week (5 days) | 86% |
| 1 month (21 days) | 53% |
| 3 months (63 days) | 15% |
| 6 months (126 days) | 2% |
A shock in a high-persistence market takes months to dissipate, not days. This is why central banks and regulators focus on persistence as a systemic indicator.
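The decay table follows directly from raising the persistence to the power of the horizon. The sketch below recomputes it and adds the implied half-life. The variable names are illustrative.

```python
import math

persistence = 0.97  # alpha_1 + beta_1, as in the table above
horizons = [("1 day", 1), ("1 week", 5), ("1 month", 21),
            ("3 months", 63), ("6 months", 126)]

# Share of the initial excess variance still present after h trading days.
remaining = {label: persistence ** h for label, h in horizons}
for label, share in remaining.items():
    print(f"{label:>9}: {100 * share:.0f}% of excess variance remains")

# Half-life: the horizon h at which persistence ** h equals 0.5.
half_life = math.log(0.5) / math.log(persistence)
print(f"half-life: {half_life:.1f} trading days")
```

With persistence of 0.97 the half-life is a little under 23 trading days, roughly one calendar month.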
Parameters: ω = 0.00001 (the constant, \(\alpha_0\) above), α = 0.08, β = 0.90
Persistence: α+β = 0.98
Unconditional std: 2.24%
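These figures can be reproduced from the unconditional variance formula. A minimal sketch, assuming ω plays the role of \(\alpha_0\) and returns are in decimal units:

```python
omega, alpha, beta = 0.00001, 0.08, 0.90

persistence = alpha + beta
# Unconditional variance exists only when persistence is below 1.
unconditional_var = omega / (1.0 - persistence)
unconditional_std = unconditional_var ** 0.5

print(f"Persistence: {persistence:.2f}")
print(f"Unconditional std: {100 * unconditional_std:.2f}%")
```

Note how sensitive the long-run level is to persistence: moving \(\alpha + \beta\) from 0.98 to 0.99 with the same ω doubles the unconditional variance.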
GARCH parameters are not computed from a formula; they are found by numerical search. The approach is Maximum Likelihood Estimation: find the parameters that make the observed return sequence most probable.
For GARCH with Normal innovations, the log-likelihood is:
\[\ell(\theta) = -\frac{1}{2} \sum_{t=1}^{T} \left[ \ln(2\pi) + \ln(\sigma^2_t(\theta)) + \frac{\varepsilon^2_t}{\sigma^2_t(\theta)} \right]\]
At each \(t\), the model asks: how surprising was this return, given what I estimated the variance to be?
A \(-5\%\) return when \(\hat{\sigma}_t = 1\%\) contributes a large negative term to the log-likelihood. The same return when \(\hat{\sigma}_t = 5\%\) is penalised far less: the model was already expecting turbulence.
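The two cases can be evaluated directly from the summand of the log-likelihood. A minimal sketch follows. The function name `gaussian_loglik_term` is an illustrative choice, and returns are expressed as decimals.

```python
import math

def gaussian_loglik_term(eps, sigma):
    """One observation's contribution to the Normal log-likelihood."""
    var = sigma ** 2
    return -0.5 * (math.log(2 * math.pi) + math.log(var) + eps ** 2 / var)

calm = gaussian_loglik_term(-0.05, 0.01)       # model expected 1% volatility
turbulent = gaussian_loglik_term(-0.05, 0.05)  # model expected 5% volatility
print(f"sigma = 1%: {calm:.2f}")       # about -8.81
print(f"sigma = 5%: {turbulent:.2f}")  # about +1.58
```

The gap of roughly ten log-likelihood units from a single observation is why a handful of crisis days can dominate the estimates.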
MLE is clean in theory. On real financial data, practitioners encounter:
Practitioner discipline: always try multiple starting values; constrain \(\alpha_1 + \beta_1 < 1\) explicitly; compare estimates across sub-samples. Two practitioners with the same data but different software can report different parameters, and both may be at local optima.
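To see what "searching numerically" means, the sketch below evaluates the Gaussian log-likelihood over a coarse parameter grid on synthetic data and enforces the stationarity constraint explicitly. It is a teaching illustration under stated assumptions (zero mean, variance initialised at the sample variance, hand-picked grids), not how production software optimises. Real packages use gradient-based optimisers. All function and variable names are hypothetical.

```python
import math
import random

def garch11_loglik(returns, alpha0, alpha1, beta1):
    """Gaussian log-likelihood of zero-mean returns under GARCH(1,1)."""
    var = sum(r * r for r in returns) / len(returns)  # initial variance
    loglik = 0.0
    for r in returns:
        loglik -= 0.5 * (math.log(2 * math.pi) + math.log(var) + r * r / var)
        var = alpha0 + alpha1 * r * r + beta1 * var
    return loglik

def grid_search(returns, alpha0_grid, alpha1_grid, beta1_grid):
    """Return (loglik, alpha0, alpha1, beta1) for the best grid point."""
    best = None
    for a0 in alpha0_grid:
        for a1 in alpha1_grid:
            for b1 in beta1_grid:
                if a1 + b1 >= 1.0:  # enforce stationarity explicitly
                    continue
                ll = garch11_loglik(returns, a0, a1, b1)
                if best is None or ll > best[0]:
                    best = (ll, a0, a1, b1)
    return best

# Synthetic returns from known parameters (alpha0=1e-5, alpha1=0.08, beta1=0.90).
rng = random.Random(1)
var = 1e-5 / (1.0 - 0.08 - 0.90)
returns = []
for _ in range(1500):
    eps = math.sqrt(var) * rng.gauss(0.0, 1.0)
    returns.append(eps)
    var = 1e-5 + 0.08 * eps ** 2 + 0.90 * var

alpha0_grid = [5e-6, 1e-5, 2e-5]
alpha1_grid = [round(0.02 * k, 2) for k in range(1, 8)]          # 0.02 .. 0.14
beta1_grid = [round(0.80 + 0.02 * k, 2) for k in range(0, 10)]   # 0.80 .. 0.98
best = grid_search(returns, alpha0_grid, alpha1_grid, beta1_grid)
print("best grid point (loglik, alpha0, alpha1, beta1):", best)
```

Printing the log-likelihood at neighbouring grid points is a quick way to see how flat the surface is along the \(\alpha_1 + \beta_1\) ridge, which is one reason different software can land on different estimates.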
A fitted GARCH model must pass two checks before being trusted.
Ljung-Box test on standardised residuals
If GARCH has captured all the volatility clustering, the standardised residuals \(\hat{z}_t = \hat{\varepsilon}_t / \hat{\sigma}_t\) should be approximately i.i.d. Test both \(\hat{z}_t\) (mean dynamics) and \(\hat{z}_t^2\) (variance dynamics). Significant autocorrelation in \(\hat{z}_t^2\) means the model has not captured all clustering; consider a higher-order specification or an asymmetric extension.
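The Ljung-Box statistic itself is simple to compute: \(Q = T(T+2)\sum_{k=1}^{m} \hat{\rho}_k^2 / (T-k)\). The sketch below implements it with the standard library and compares it with the 5% chi-square critical value for 10 lags (18.307). The function name and the stand-in residual series are assumptions for illustration. The reference distribution is also only approximate when the series are residuals from an estimated model.

```python
import random

def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic for autocorrelation up to max_lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    total = 0.0
    for k in range(1, max_lag + 1):
        num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
        rho_k = num / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total

CHI2_95_10DF = 18.307  # 5% critical value, chi-square with 10 degrees of freedom

# Stand-in for standardised residuals from a fitted model (i.i.d. Normal here).
rng = random.Random(7)
z = [rng.gauss(0.0, 1.0) for _ in range(2000)]
q_levels = ljung_box_q(z, 10)
q_squares = ljung_box_q([v ** 2 for v in z], 10)
print(f"Q(10) on z:   {q_levels:.2f}  (critical value {CHI2_95_10DF})")
print(f"Q(10) on z^2: {q_squares:.2f}  (critical value {CHI2_95_10DF})")
```

In a real workflow `z` would be the fitted model's \(\hat{z}_t\). A value of Q on \(\hat{z}_t^2\) well above the critical value is the signal to revisit the variance specification.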
AIC and BIC for model comparison
When choosing between competing specifications:
\[\text{AIC} = -2\hat{\ell} + 2k \qquad \text{BIC} = -2\hat{\ell} + k\ln(T)\]
where \(\hat{\ell}\) is the maximised log-likelihood and \(k\) is the number of parameters. BIC penalises complexity more heavily. For most daily equity series, GARCH(1,1) wins the BIC competition against higher-order alternatives.
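Both criteria are one-line calculations once the maximised log-likelihood is known. In the sketch below the log-likelihood values, parameter counts, and sample size are invented purely to illustrate the arithmetic. They are not estimates from any dataset.

```python
import math

def aic(loglik, k):
    """Akaike information criterion (lower is better)."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n_obs):
    """Bayesian information criterion (lower is better)."""
    return -2.0 * loglik + k * math.log(n_obs)

n_obs = 2500  # hypothetical: about ten years of daily returns
candidates = {
    "GARCH(1,1)": {"loglik": 3200.0, "k": 4},  # mu, alpha0, alpha1, beta1
    "GARCH(2,2)": {"loglik": 3201.5, "k": 6},  # two extra parameters
}
for name, fit in candidates.items():
    print(f"{name}: AIC = {aic(fit['loglik'], fit['k']):.2f}, "
          f"BIC = {bic(fit['loglik'], fit['k'], n_obs):.2f}")
```

In this made-up case the larger model improves the log-likelihood by only 1.5, which does not cover either penalty, so both criteria prefer GARCH(1,1). BIC's margin is wider because \(\ln(2500) \approx 7.8\) per parameter exceeds AIC's 2.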
Practical workflow: fit GARCH(1,1) → check Ljung-Box on \(\hat{z}_t^2\) → if autocorrelation remains, add asymmetry → compare AIC/BIC → stop when residuals are clean.
Standard GARCH treats positive and negative shocks symmetrically.
But the leverage effect says this is wrong.
Solutions:
Glosten, Jagannathan, and Runkle (1993) add an indicator for negative shocks:
\[\sigma^2_t = \alpha_0 + (\alpha_1 + \gamma I_{t-1}) \varepsilon^2_{t-1} + \beta_1 \sigma^2_{t-1}\]
where \(I_{t-1} = 1\) if \(\varepsilon_{t-1} < 0\) and \(I_{t-1} = 0\) otherwise.
| Parameter | Interpretation |
|---|---|
| \(\gamma\) | Additional impact of bad news |
| \(\gamma > 0\) | Leverage effect present |
| \(\gamma = 0\) | Symmetric GARCH |
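The asymmetry is easiest to see by feeding the GJR update a positive and a negative shock of equal size. A minimal sketch with illustrative (not estimated) parameter values follows. The function name is hypothetical.

```python
def gjr_next_variance(eps_prev, var_prev, alpha0, alpha1, gamma, beta1):
    """One-step GJR-GARCH(1,1) variance update."""
    indicator = 1.0 if eps_prev < 0 else 0.0  # I_{t-1}: 1 only after bad news
    return alpha0 + (alpha1 + gamma * indicator) * eps_prev ** 2 + beta1 * var_prev

params = dict(alpha0=1e-5, alpha1=0.03, gamma=0.10, beta1=0.90)
var_prev = 1e-4  # yesterday's variance (1% daily volatility)

after_gain = gjr_next_variance(+0.02, var_prev, **params)
after_loss = gjr_next_variance(-0.02, var_prev, **params)
print(f"after +2% shock: variance = {after_gain:.6f}")
print(f"after -2% shock: variance = {after_loss:.6f}")
```

With \(\gamma = 0.10\) the negative shock is weighted by \(\alpha_1 + \gamma = 0.13\) instead of 0.03, so the same-sized loss raises next-day variance noticeably more. Setting `gamma=0` recovers symmetric GARCH(1,1).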
Nelson (1991) uses logarithms:
\[\ln(\sigma^2_t) = \alpha_0 + \alpha_1 \left| \frac{\varepsilon_{t-1}}{\sigma_{t-1}} \right| + \gamma \frac{\varepsilon_{t-1}}{\sigma_{t-1}} + \beta_1 \ln(\sigma^2_{t-1})\]
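A sketch of the EGARCH update as written above is given below. Parameter values are illustrative assumptions. Note that in this parameterisation the leverage effect corresponds to \(\gamma < 0\), the opposite sign convention to GJR.

```python
import math

def egarch_next_variance(eps_prev, var_prev, alpha0, alpha1, gamma, beta1):
    """One-step EGARCH(1,1) update; the recursion is on log-variance."""
    z_prev = eps_prev / math.sqrt(var_prev)  # standardised shock
    log_var = (alpha0 + alpha1 * abs(z_prev) + gamma * z_prev
               + beta1 * math.log(var_prev))
    return math.exp(log_var)  # exponentiating guarantees a positive variance

params = dict(alpha0=-0.4, alpha1=0.15, gamma=-0.08, beta1=0.95)
var_prev = 1e-4

after_gain = egarch_next_variance(+0.02, var_prev, **params)
after_loss = egarch_next_variance(-0.02, var_prev, **params)
print(f"after +2% shock: daily vol = {math.sqrt(after_gain):.2%}")
print(f"after -2% shock: daily vol = {math.sqrt(after_loss):.2%}")
```

Because the model is specified for \(\ln(\sigma^2_t)\), no sign restrictions on the coefficients are needed to keep the variance positive.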
Advantages:
The GARCH-in-Mean model embeds a direct test of the risk-return trade-off:
\[r_t = \mu + \delta \sigma_{t-1} + \varepsilon_t\]
If \(\delta > 0\): higher expected volatility commands higher expected returns. The empirical evidence is genuinely mixed, and the reason it is mixed is instructive:
A more powerful approach uses the variance risk premium: the spread between implied variance (VIX²) and expected realised variance. Bollerslev, Tauchen, and Zhou (2009) show this predicts future excess returns with \(R^2\) around 4–7% at quarterly horizons, substantially stronger than GARCH-M’s \(\delta\).
When \(\alpha_1 + \beta_1 = 1\), we have Integrated GARCH (IGARCH): shocks persist forever, and the unconditional variance no longer exists.
This sounds alarming. In practice, it is almost always a diagnostic signal rather than a true finding:
If your estimated \(\hat{\alpha}+\hat{\beta} \geq 0.999\), the first question is not “is this IGARCH?” but “is there a structural break in my sample?”
For portfolios, we need covariances, not just variances.
Dynamic Conditional Correlation (DCC) (Engle 2002):
\[H_t = D_t R_t D_t\]
where \(D_t\) is a diagonal matrix of GARCH-estimated volatilities and \(R_t\) is the time-varying correlation matrix.
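The decomposition \(H_t = D_t R_t D_t\) can be illustrated for two assets. The sketch below only assembles the covariance matrix from given volatilities and correlations. It does not estimate the DCC correlation dynamics. The volatilities, correlations, and weights are hypothetical numbers chosen for the example.

```python
def dcc_covariance(vols, corr):
    """Build H_t = D_t R_t D_t from volatilities and a correlation matrix."""
    n = len(vols)
    return [[vols[i] * corr[i][j] * vols[j] for j in range(n)] for i in range(n)]

def portfolio_variance(weights, cov):
    """Portfolio variance w' H w."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j]
               for i in range(n) for j in range(n))

vols = [0.02, 0.01]    # hypothetical daily vols: equities, gold
weights = [0.5, 0.5]

calm = dcc_covariance(vols, [[1.0, -0.2], [-0.2, 1.0]])   # negative correlation
stress = dcc_covariance(vols, [[1.0, 0.6], [0.6, 1.0]])   # positive correlation

print(f"portfolio vol, calm:   {portfolio_variance(weights, calm) ** 0.5:.2%}")
print(f"portfolio vol, stress: {portfolio_variance(weights, stress) ** 0.5:.2%}")
```

Holding the individual volatilities fixed, the change in correlation alone raises portfolio variance substantially, which is the diversification loss a static correlation matrix would miss.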
In your Bloomberg lab, you observed the practical implication directly: the correlation between equities and gold was negative in January 2020 and strongly positive at the peak of the COVID sell-off. A static correlation matrix used in portfolio optimisation would have been deeply misleading. DCC captures exactly this instability.
No model is universally reliable. GARCH(1,1) has well-documented failure modes that every practitioner should know before relying on it.
| Failure Mode | The Problem | Practitioner Response |
|---|---|---|
| Regime changes | One parameter set cannot fit both calm and crisis | Rolling windows; Markov regime-switching |
| Jump risk | GARCH variance evolves smoothly from one period to the next; jumps and overnight gaps are not captured | Jump-GARCH extensions |
| Illiquid markets | Non-trading creates spurious return autocorrelation | Higher-frequency data; bid-ask correction |
| Long-sample contamination | Crises push persistence to near-1, dominating the likelihood | Sub-sample estimation; crisis dummies |
| Residual fat tails | Standardised GARCH residuals remain leptokurtic under Normal assumption | Student-\(t\) or GED error distributions |
The right response is not to abandon GARCH but to know when each failure mode is active.
VIX = market’s expectation of 30-day volatility (from options)
Realised volatility = actual volatility observed over 30 days
Comparing them reveals:
The premium is positive on average but inverts sharply during crisis episodes. The simulated series below illustrates both regimes; the real data in your Bloomberg lab will show the same pattern in 2008, 2018, and 2020.
Pre-crisis mean premium: -77.3%
Crisis-period mean premium: -145.4%
Days premium negative (full sample): 99%
Volatility modelling is fundamentally about quantifying uncertainty:
This connects directly to Week 1’s theme: data science as the study of variation and uncertainty.
GARCH models are parametric: they assume a specific functional form.
Modern extensions relax this:
| Classical | ML Extension |
|---|---|
| GARCH | Neural network volatility models |
| AR for returns | RNNs, LSTMs for sequences |
| Regime switching | Hidden Markov Models |
| Linear time series | Transformer architectures |
The principles (stationarity, persistence, asymmetry) remain the same.
Classical GARCH assumes a specific functional form for how past shocks feed into future variance. Can sequence learning do better?
| GARCH Approach | Sequence Learning Alternative |
|---|---|
| GARCH(1,1) with fixed parameters | LSTM that learns volatility dynamics |
| Regime-switching GARCH | Attention mechanisms for regime detection |
| Assumes specific functional form | Learns flexible non-linear mappings |
| Interpretable parameters (\(\alpha\), \(\beta\)) | Black-box but potentially more flexible |
| 3–5 parameters | Hundreds to thousands of parameters |
In rigorous out-of-sample forecasting comparisons, simple GARCH models regularly match or outperform LSTM-based alternatives. The Makridakis principle applies: simpler models generalise better on shorter samples with high noise-to-signal ratios.
GARCH advantages:
Sequence learning advantages:
The choice depends on your objectives: explanation vs pure prediction, data availability, whether interpretability is a regulatory or communication requirement. In most practitioner contexts, GARCH is the baseline that ML must demonstrably beat, and often does not.
Homework (Colab, complete before class):
Simulation-first approach: fit GARCH(1,1) to synthetic data where you know the true parameters, then apply the same workflow to real SPY returns from the shared Bloomberg database. Compare symmetric vs GJR-GARCH and evaluate model diagnostics.
In-Class Lab (Bloomberg Terminal, forensic investigation):
Working across three crisis episodes, you will use Bloomberg to ask whether the endogenous risk feedback discussed in the lecture is detectable in real data:
Core reading and tasks
Read Tsay (2010) Chapter 3 (volatility models) and Brooks (2019) Chapter 9 (GARCH in practice). Complete the homework lab before the Bloomberg session: the in-class investigation builds directly on it.
For the Bloomberg session
Before you arrive, form a view on each episode: which do you expect to show the highest persistence, the largest VRP sign reversal, and the clearest cross-asset contagion? Having a prior makes the data analysis active rather than passive.
Optional extension
Read Danielsson (2002) in full: it is short (25 pages) and more readable than most academic finance papers. Note where his 2002 critique did and did not anticipate the episodes you examined in the lab.
Answer these before you leave (one sentence each is sufficient):
Latent state: Why is estimating \(\sigma_t\) fundamentally different from measuring GDP? What does “irreducible inference error” mean in practice?
Persistence: Your GARCH estimate gives \(\hat{\alpha} + \hat{\beta} = 0.97\). A shock today doubles the conditional variance. Roughly how many trading days until variance returns to within 10% of its long-run level?
The feedback loop: Identify the single step in the endogenous risk spiral that regulatory capital reform could most plausibly interrupt. Justify your choice in one sentence.
Failure modes: You are fitting GARCH to a small-cap UK equity series from 2005 to 2024. Name the two failure modes from today’s session most likely to affect your estimates.
FinTech & Data Science