METHODOLOGY

How we calculate market risk.

Four layers. 23 KPIs, 38 sub-scores. All thresholds, weights and formulas — public.

This page documents the complete methodology behind the Boiling Frog risk score. We publish every weight, every threshold and every data source. Anyone with a spreadsheet can recompute our results.

Methodological framework

Before diving into the layers — a short note on methodology. Boiling Frog was not designed ad hoc. All thresholds, weights and calibrations are derived from established methods of statistical learning.

Reference

G. James, D. Witten, T. Hastie, R. Tibshirani — An Introduction to Statistical Learning, with Applications in R (ISL), 2nd Edition, Springer 2021. Four chapters are applied directly:

Chapter · Method · BFR application
ISL Ch. 4 · Classification (Logistic Regression) · Score thresholds 30/41/58 derived empirically from P(red | score), not set heuristically.
ISL Ch. 5 · Resampling (Walk-forward Validation) · Train 2010–2018, Test 2019–2026, with no data leakage between training and test period.
ISL Ch. 8 · Ensemble (Weight Optimization) · Grid search over layer weights; 40/30/30 tested against alternatives.
ISL Ch. 12 · Unsupervised Learning (GMM) · Independent confirmation of natural score clusters at 28.6 / 31.9 / 40.2.

Methodological discipline: the circular benchmark

VIX itself is an input to Layer C. A ROC-AUC analysis with VIX > 25 as crisis label is therefore partially circular: a higher Layer-C weight trivially improves apparent performance. We additionally use the forward drawdown (maximum S&P decline over the next 20 trading days) as a non-circular benchmark — and publish both results, even when they disagree.
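The forward-drawdown label can be sketched as follows. This is a minimal illustration, not the production code: the function name `forward_drawdown` is hypothetical, and it assumes the decline is measured from the current close to the lowest close within the next 20 trading days (the text does not say whether closes or intraday lows are used).

```python
from typing import List, Optional


def forward_drawdown(closes: List[float], horizon: int = 20) -> List[Optional[float]]:
    """Largest decline from closes[t] to the lowest close in the next `horizon` days.

    Returned as a non-positive fraction (e.g. -0.10 for a 10 % decline).
    Days without a full forward window get None (assumption: they are not labeled).
    """
    labels: List[Optional[float]] = []
    for t, price in enumerate(closes):
        window = closes[t + 1 : t + 1 + horizon]
        if len(window) < horizon:
            labels.append(None)
            continue
        worst = min(window)
        # A window that only rises yields 0.0, not a positive number.
        labels.append(min(0.0, worst / price - 1.0))
    return labels
```

Because the label only uses future prices, it does not share an input with Layer C, which is what makes it usable as the non-circular benchmark.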

Full backtest results for each of these chapters are documented in the backtest report.

The four layers

We split market risk into four independent layers. Each layer is computed separately, scored from 0 to 100, and then weighted into a single risk verdict. The weights are fixed and public.

Layer A
35 %
Macro

Structural macro stress: central bank balance sheet, interest rates, real rates, USD system role, sovereign debt actor risk.

Layer B
25 %
Politics

Geopolitical and policy events from RSS feeds, classified by severity and compound patterns. Decays over 48 hours.

Layer C
25 %
Markets

Live market mechanics: equity, gold, FX, technical regime detection. Includes crisis overlay for sharp moves.

Layer D
15 %
Energy

Energy market structure: oil, gas, electricity transmission, intraday dynamics, OVX volatility.

Layer A — Macro (35 %)

Layer A captures structural macro pressure. Seven KPIs with seventeen sub-scores in total, all sourced from public data (FRED, Treasury, TIC, H.4.1, ECB).

KPIs and weights inside Layer A
Debt Stress
0.15

Fed balance sheet (WALCL) and Treasury General Account (TGA). Measures liquidity drain.

Maturity Wall
0.20

10-year yield level, 2s10s inversion, yield momentum, real-rate estimate. Refinancing pressure on US Treasuries.

Real Rate Regime
0.15

Real interest rate estimate (10Y minus expected inflation).

USD System Role
0.09

DXY strength and USD momentum (Trade-Weighted Broad). RRP moved to Funding Stress in Phase 9 to avoid double-counting.

Actor Risk
0.16

Treasury auction bid-to-cover, indirect bidder share, TIC foreign holdings, custody (H.4.1). Foreign demand for US debt.

FX Stress
0.13

CNH 10-day return (capital-flight indicator) and TED-spread level (interbank stress). Phase 1B.3.

Funding Stress
0.12

Liquidity pressure in money markets — four sub-scores: DCPF3M−SOFR (USD corporate term funding), SOFR−EFFR (overnight bank stress), RRP volume momentum, Euribor 3M−€STR (EUR equivalent). Phase 9.
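As a rough sketch, the seven KPI weights above can be combined as follows. It assumes the layer score is a plain weighted sum of KPI scores on the 0–100 scale, which the page implies but does not state outright; the dictionary keys and the function name `layer_a_score` are hypothetical.

```python
from typing import Dict

# Weights copied from the KPI list above; key names are illustrative.
LAYER_A_WEIGHTS: Dict[str, float] = {
    "debt_stress": 0.15,
    "maturity_wall": 0.20,
    "real_rate_regime": 0.15,
    "usd_system_role": 0.09,
    "actor_risk": 0.16,
    "fx_stress": 0.13,
    "funding_stress": 0.12,
}


def layer_a_score(kpi_scores: Dict[str, float]) -> float:
    """Weighted sum of KPI scores (each 0-100). Assumes a simple linear blend."""
    if abs(sum(LAYER_A_WEIGHTS.values()) - 1.0) > 1e-9:
        raise ValueError("Layer A weights must sum to 1.0")
    return sum(weight * kpi_scores[name] for name, weight in LAYER_A_WEIGHTS.items())
```

The published weights sum to 1.00, so a layer in which every KPI reads 50 also scores 50.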

Sub-scores exposed in the dashboard (17, grouped)

Debt Stress: WALCL · TGA
Maturity/Rates: 10Y · 2s10s · yield momentum · real rate
USD System Role: DXY · USD momentum
Actor Risk: auction BTC · indirect % · TIC · custody H.4.1
Funding Stress: CP−SOFR · SOFR−EFFR · RRP · Euribor−€STR

Layer B — Politics (25 %)

Layer B watches political and geopolitical events from 22 RSS feeds, classifies them by severity, and decays the impact over 48 hours.

Severity scale
LOW = 3 · MEDIUM = 8 · HIGH = 20 · CRITICAL = 50

Only events with a base score of 15 or higher (HIGH/CRITICAL) enter the layer score.
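The severity filter can be illustrated with a few lines of Python. The event structure (a dict with a `severity` key) and the function name `scoring_events` are assumptions made for this sketch only.

```python
from typing import Dict, List

# Base scores from the severity scale above.
SEVERITY_SCORES: Dict[str, int] = {"LOW": 3, "MEDIUM": 8, "HIGH": 20, "CRITICAL": 50}
MIN_BASE_SCORE = 15  # events below this never reach the layer score


def scoring_events(events: List[dict]) -> List[dict]:
    """Keep only events whose base score is 15 or higher (HIGH and CRITICAL)."""
    return [e for e in events if SEVERITY_SCORES[e["severity"]] >= MIN_BASE_SCORE]
```

With the current scale, the threshold of 15 sits between MEDIUM (8) and HIGH (20), so the filter is equivalent to dropping LOW and MEDIUM events.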

CRITICAL compound patterns (these trigger the geo overlay)
israel_iran_strike

Israel/IDF + Iran/Tehran + airstrike/bomb/strike/attack

iran_military

Iran/Hormuz + carrier/fleet/strike/deploy/naval/troops

russia_nato_attack

Russia/Kremlin + NATO/Poland/Baltics/Finland + invade/attack/strike

nuclear_use

Nuclear/atom + use/deploy/launch/detonate/fired/explode

taiwan_threat

Taiwan + invasion/blockade/military/strike/attack/war

territorial_threat

Greenland + annex/acquire/military/invade
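One way to read the patterns above is "at least one keyword from every group must appear in the headline". The sketch below implements that reading for three of the six patterns. It assumes case-insensitive substring matching, which is a simplification (the page does not describe the matcher), and the function name `matching_patterns` is hypothetical.

```python
from typing import Dict, List, Tuple

# Each pattern is a list of keyword groups; all groups must be hit.
COMPOUND_PATTERNS: Dict[str, List[Tuple[str, ...]]] = {
    "israel_iran_strike": [
        ("israel", "idf"),
        ("iran", "tehran"),
        ("airstrike", "bomb", "strike", "attack"),
    ],
    "taiwan_threat": [
        ("taiwan",),
        ("invasion", "blockade", "military", "strike", "attack", "war"),
    ],
    "territorial_threat": [
        ("greenland",),
        ("annex", "acquire", "military", "invade"),
    ],
}


def matching_patterns(headline: str) -> List[str]:
    """Return the names of all compound patterns that fire on a headline."""
    text = headline.lower()
    return [
        name
        for name, groups in COMPOUND_PATTERNS.items()
        if all(any(keyword in text for keyword in group) for group in groups)
    ]
```

Plain substring matching is naive: a short keyword such as "war" also matches inside "warns". A real matcher would likely need word boundaries or tokenization.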

Decay function

Day 0 = +10 per event · Day 1 = +5 per event · Day 2+ = 0. Cap at +25 total.
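The step decay and cap can be written out directly. This sketch assumes event age is counted in whole days since publication; the function name `politics_decay_points` is hypothetical.

```python
from typing import Iterable

POINTS_BY_AGE = {0: 10, 1: 5}  # Day 0 and Day 1; anything older contributes 0
TOTAL_CAP = 25


def politics_decay_points(event_ages_in_days: Iterable[int]) -> int:
    """Sum the per-event contribution by age and cap the total at +25."""
    total = sum(POINTS_BY_AGE.get(age, 0) for age in event_ages_in_days)
    return min(TOTAL_CAP, total)
```

For example, two events from today and one from yesterday add up to 10 + 10 + 5 = 25, which is exactly the cap; a fourth event would change nothing.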

Layer C — Markets (25 %)

Layer C reads live market mechanics from price data: equity, gold, FX, intraday volatility, and — since Phase 9 — the risk premium on corporate bonds (HY+IG OAS). The crisis overlay activates on sharp moves.

KPIs exposed in the dashboard
Equity Stress
0.2625

Drawdowns in S&P 500, Nasdaq, DAX, Euro STOXX 50 (20-day rolling-max distance).

Cross-Region Correlation
0.15

60-day correlation between regions — high correlation = risk-off synchronization.

Volatility Regime
0.15

VIX level + 5-day spike. Phase 9: weight reduced from 0.225 to 0.15 (anti-redundancy with Credit Stress).

Gold Deleveraging
0.1125

Gold selling during equity stress = liquidity crisis. 5-day net + 5-day peak drawdown.

Credit Stress
0.075

Premium on corporate bonds over Treasuries — 0.7 × HY OAS + 0.3 × IG OAS, all capped at 90. Phase 9.
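The Credit Stress blend can be sketched as below. It assumes both inputs are already sub-scores on the 0–100 scale (the mapping from OAS levels to sub-scores is not described here) and reads "all capped at 90" as a cap on each sub-score and on the blend. The function name `credit_stress` is hypothetical.

```python
def credit_stress(hy_oas_score: float, ig_oas_score: float, cap: float = 90.0) -> float:
    """0.7 x HY OAS + 0.3 x IG OAS, with sub-scores and result capped at 90."""
    hy = min(hy_oas_score, cap)
    ig = min(ig_oas_score, cap)
    # With both inputs capped, the blend cannot exceed the cap; min() is a safeguard.
    return min(cap, 0.7 * hy + 0.3 * ig)
```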

Additional internal (not exposed in the drawer):

Regional Equity Stress (0.15) — Nikkei/EM/CSI/FTSE — and Commodity Stress (0.10) — Oil/Gas/Copper/Wheat — feed into the layer score without being shown as separate KPI cards.

Crisis overlay thresholds
Equity overlay

Triggers when any of the four main indices (S&P 500, Nasdaq 100, DAX, Euro STOXX 50) drops more than 7 % over 10 days. The worst crash determines the overlay magnitude (worst-of logic).

Gold overlay

Triggers when gold drops more than 5 % over 5 days.

Deleveraging signal

Both triggers active simultaneously: gold 5-day return ≤ −5 % AND at least one equity index down ≥ 7 % over 10 days.
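The three price-based triggers can be sketched together. Returns are expressed as fractions (−0.07 for −7 %). The function name `crisis_triggers`, the input layout and the inclusive comparisons are assumptions of this sketch; the overlay magnitudes are not modeled.

```python
from typing import Dict, Optional


def crisis_triggers(equity_returns_10d: Dict[str, float], gold_return_5d: float) -> dict:
    """Evaluate the equity, gold and deleveraging triggers.

    equity_returns_10d maps index name -> 10-day return; the worst index decides
    whether the equity trigger fires (worst-of logic).
    """
    worst_index, worst_return = min(equity_returns_10d.items(), key=lambda kv: kv[1])
    equity = worst_return <= -0.07
    gold = gold_return_5d <= -0.05
    fired_index: Optional[str] = worst_index if equity else None
    return {
        "equity": equity,
        "gold": gold,
        "deleveraging": equity and gold,  # both triggers at the same time
        "worst_index": fired_index,
    }
```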

VIX acceleration (Phase 11)

1-day VIX jump > +12 points → overlay +15. Jump > +20 points → overlay +25. Very rare events (~5× in 20 years) signaling acute volatility eruptions — Volmageddon 2018-02-05 (+20pt → dark red), COVID 2020-03-12 (+18pt).
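The two-step VIX rule maps a one-day change in index points to an overlay value. A minimal sketch, with a hypothetical function name:

```python
def vix_acceleration_overlay(vix_today: float, vix_previous: float) -> int:
    """Overlay points for a 1-day VIX jump: > +20 pts -> 25, > +12 pts -> 15, else 0."""
    jump = vix_today - vix_previous
    if jump > 20:
        return 25
    if jump > 12:
        return 15
    return 0
```

The larger threshold is checked first so that a jump above 20 points is not caught by the 12-point rule.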

HY-OAS velocity (Phase 11)

ICE BofA HY-OAS series widens by +50 bps in 5 trading days → overlay +12. Widens by +100 bps → overlay +25. Gated on equity_stress > 30 (prevents false positives from isolated junk refinancing worries). HY-OAS led the VIX in 2008/2020/2023 — Lehman, COVID, SVB bank run are correctly captured in the backtest with this trigger.
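The gate and the two widening thresholds combine as follows. The sketch assumes the 5-day widening is supplied in basis points and treats the thresholds as inclusive, which the text does not specify; the function name is hypothetical.

```python
def hy_oas_velocity_overlay(widening_bps_5d: float, equity_stress: float) -> int:
    """Overlay points for HY-OAS widening over 5 trading days, gated on equity stress."""
    if equity_stress <= 30:
        return 0  # gate: ignore credit widening without accompanying equity stress
    if widening_bps_5d >= 100:
        return 25
    if widening_bps_5d >= 50:
        return 12
    return 0
```

A 120 bps widening with an equity stress score of 20 therefore yields no overlay, while the same widening at an equity stress score of 45 yields +25.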

Layer D — Energy (15 %)

Layer D models energy market stress. Six sub-layers feed into the layer score.

Structure

Oil/gas market structure (contango/backwardation).

Market Structure

OVX (oil VIX), implied volatility regime.

Price Dynamics

Brent/WTI returns, momentum, regime breaks.

Event

Energy-related shocks (production cuts, pipeline events).

Intraday

Intraday Brent moves outside normal ranges.

Power Transmission

ENTSO-E electricity grid stress (continental Europe).

Layer D is active from 2026-04-18. Historical backfills before this date use the legacy three-layer formula (Macro 40 % · Politics 30 % · Markets 30 %).

Overlays

Three overlays modulate the aggregated score. The Geo Overlay and the Channel Overlay are additive on top of the weighted base score, while the Dominant Floor sets a lower bound. All three are transparent: every active overlay is shown in the dashboard with its value and reason.

Geo Overlay

Up to +25 points when CRITICAL compound patterns from Layer B fire within the last 48 hours. Diplomatic statements do not trigger — only military escalation.

Channel Overlay

Boost when any single risk channel exceeds 70: channel_boost = min(15, max(0, max_channel − 70) × 0.5). Up to +15 points.
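The channel boost formula translates directly into code. The input layout (a mapping from channel name to score) is an assumption of this sketch.

```python
from typing import Dict


def channel_boost(channel_scores: Dict[str, float]) -> float:
    """min(15, max(0, max_channel - 70) * 0.5), as given in the formula above."""
    max_channel = max(channel_scores.values())
    return min(15.0, max(0.0, max_channel - 70.0) * 0.5)
```

Each point above 70 adds half a point, so a channel at 80 contributes +5 and the +15 ceiling is reached only when a channel hits 100.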

Dominant Floor

Layer B ≥ 50: factor 0.90 — Layer C ≥ 50: factor 0.90 — Layer A ≥ 80 (only structural extreme): factor 0.80. Prevents undershooting when one layer is in clear distress.

Aggregation formula

The complete daily risk score is computed in this order:

  1. base = 0.35 × LayerA + 0.25 × LayerB + 0.25 × LayerC + 0.15 × LayerD
  2. score = max(base, dominant_floor × max(LayerA, LayerB, LayerC, LayerD))
  3. score += geo_overlay (0–25)
  4. score += channel_overlay (0–15)
  5. score = clamp(score, 0, 100)

Before 2026-04-18, the legacy formula 0.40 × A + 0.30 × B + 0.30 × C is used (no Layer D, no channel overlay).
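The five steps of the current formula can be sketched as a single function. The dominant-floor factor is taken as an input, because the rule for selecting it when several layers qualify is not spelled out here; passing 0.0 is this sketch's assumption for "no floor active". The function name `daily_score` is hypothetical.

```python
def daily_score(
    layer_a: float,
    layer_b: float,
    layer_c: float,
    layer_d: float,
    dominant_floor: float = 0.0,
    geo_overlay: float = 0.0,
    channel_overlay: float = 0.0,
) -> float:
    """Apply the aggregation steps in the published order."""
    # 1. Weighted base score
    base = 0.35 * layer_a + 0.25 * layer_b + 0.25 * layer_c + 0.15 * layer_d
    # 2. Dominant floor: never fall below factor x highest layer
    score = max(base, dominant_floor * max(layer_a, layer_b, layer_c, layer_d))
    # 3. and 4. Additive overlays
    score += geo_overlay
    score += channel_overlay
    # 5. Clamp to the 0-100 scale
    return max(0.0, min(100.0, score))
```

With layers at 40 / 60 / 30 / 20, the base is 39.5. A floor factor of 0.90 lifts the score to 54 (0.90 × 60), and a geo overlay of +10 then brings it to 64.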

Drawdown forecast

The dashboard translates the risk score into an expected drawdown range over the next 4–6 weeks. The bands are calibrated against 15+ years of market history and validated by a multinomial logistic regression on 3,872 trading days.

Score → status → expected drawdown
Score · Status · Range · Interpretation
0–30 · Green · −3 % to 0 % · Negligible, markets stable
31–41 · Yellow · −12 % to −3 % · Moderate, watch closely
42–58 · Red · −30 % to −12 % · Elevated, consider response
59–100 · Dark Red · −55 % to −30 % · Extreme, systemic stress

Calibration anchors

Score 33 ≈ 12 % drawdown (2018 correction) · Score 66 ≈ 30 % drawdown (COVID crash) · Score 85+ ≈ 40–50 % drawdown (2008 crisis). Linear interpolation between anchors.

The drawdown range is a market-wide expectation derived from the aggregated score, not a portfolio-specific forecast. It does not constitute investment advice.
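The score bands can be expressed as a small lookup. The sketch treats each band's upper limit as inclusive, so a fractional score such as 30.5 falls into the next band; that handling is an assumption, as is the function name `status_for`.

```python
from typing import List, Tuple

# (inclusive upper score limit, status, expected drawdown range in percent)
BANDS: List[Tuple[float, str, Tuple[int, int]]] = [
    (30, "Green", (-3, 0)),
    (41, "Yellow", (-12, -3)),
    (58, "Red", (-30, -12)),
    (100, "Dark Red", (-55, -30)),
]


def status_for(score: float) -> Tuple[str, Tuple[int, int]]:
    """Map a 0-100 risk score to its status and expected drawdown range."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for upper, status, drawdown_range in BANDS:
        if score <= upper:
            return status, drawdown_range
    raise AssertionError("unreachable: the last band ends at 100")
```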

Dominant risk channel

The score is decomposed into five risk channels. Each channel aggregates specific sub-KPIs across layers. The dominant channel is the highest-scoring one and answers: "Where does today's risk come from?"

Growth

Equity stress (0.40) · Cross-region correlation (0.30) · Real-rate regime (0.30). Active when the world economy is decelerating or recession risk is rising.

Rates

Maturity wall (0.50) · Real-rate regime (0.50). Active when refinancing pressure or yield-curve regime breaks dominate.

Politics

Layer B score directly. Composed of 5 categories: geopolitics (0.30) · corporate events (0.25) · tariffs (0.20) · alliance shifts (0.15) · sanctions (0.10).

Liquidity

Debt stress (0.35) · Gold deleveraging (0.35) · Actor risk (0.15) · Volatility regime (0.15). Active when cash and balance-sheet pressure dominate.

System

USD system role (0.30) · Actor risk (0.20) · Cross-layer systemic factor (0.70 if average of A/B/C all exceed thresholds). Active when trust in the system itself is at risk.

Confidence

confidence = min(1.0, 0.5 + margin)

Where margin = (highest_score − second_highest_score) / highest_score. Range: [0.5, 1.0]. Lower bound 0.5 means: even with two equal channels we report at least 50 % confidence — full confidence only when one channel is clearly leading.
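A short sketch of the confidence calculation follows. The guard for a highest score of zero is an assumption added to avoid a division by zero; the function name is hypothetical.

```python
from typing import Dict


def channel_confidence(channel_scores: Dict[str, float]) -> float:
    """confidence = min(1.0, 0.5 + (highest - second_highest) / highest)."""
    ranked = sorted(channel_scores.values(), reverse=True)
    highest, second = ranked[0], ranked[1]
    if highest <= 0:
        return 0.5  # assumption: no channel is active, report the lower bound
    margin = (highest - second) / highest
    return min(1.0, 0.5 + margin)
```

With channels at 80 and 60 the margin is 0.25 and the confidence 0.75. Full confidence is reached once the runner-up scores at most half of the leader.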

Trend analysis

For each tracked asset (Gold, S&P 500, Nasdaq 100, DAX, Euro STOXX 50) we run a per-asset technical analysis. The dashboard shows the most-stressed asset with: recent return, technical signals (EMA / MACD / RSI), confidence and whether the crisis surcharge is active.

Status labels (trend classification)
Trend reversal

bearish_signals ≥ 3 OR (bearish_signals ≥ 2 AND RSI is not an oversold-bounce). Strong technical regime break.

Correction

RSI shows oversold-bounce AND bearish_signals < 3, OR bearish_signals ≥ 1. Pull-back within an intact trend.

Uptrend

bearish_signals = 0. No technical pressure.

Unclear

Fewer than 50 data points available — confidence too low to classify.
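The four labels overlap as written (one bearish signal satisfies the Correction rule, and an oversold bounce with zero bearish signals satisfies both Correction and Uptrend), so an evaluation order is needed. The sketch below assumes the data check comes first and the rest are tested in the order listed; the function name and that order are assumptions.

```python
def trend_status(n_points: int, bearish_signals: int, rsi_oversold_bounce: bool) -> str:
    """Classify the trend using the rules above, evaluated top to bottom."""
    if n_points < 50:
        return "Unclear"
    if bearish_signals >= 3 or (bearish_signals >= 2 and not rsi_oversold_bounce):
        return "Trend reversal"
    # At this point bearish_signals < 3 always holds.
    if rsi_oversold_bounce or bearish_signals >= 1:
        return "Correction"
    return "Uptrend"
```

Under this ordering, two bearish signals count as a Trend reversal unless the RSI shows an oversold bounce, in which case the asset is downgraded to Correction.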

Asset regime classification (per asset, in the dashboard)
BULL

5-day return > +1.5 %. Clear uptrend.

BEAR

5-day return < −1.5 % AND 1-day negative. Clear downtrend.

RECOVERY

5-day return < −1.5 % BUT 1-day positive. Rebound within an intact downtrend.

NEUTRAL

5-day return between ±1.5 %. Sideways movement.

BREAKOUT ↑

Donchian breakout (Phase 9.1): today's high > max(prev 5d high) AND close in the upper 5 % of the day's range. Only fires past the asset-specific anchor hour (US assets from 15:30 Berlin / EU assets from 10:00 Berlin) — opening volatility filtered out.

BREAKOUT ↓

Symmetric to the upward breakout: today's low < min(prev 5d low) AND close in the lower 5 % of the day's range.
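The regime rules can be sketched as two small functions. Returns are fractions (0.015 for 1.5 %). Several details are assumptions of this sketch: a flat 1-day return counts as RECOVERY, the anchor-hour filter is left out, and the function names and return labels are illustrative.

```python
from typing import Optional, Sequence


def return_regime(ret_5d: float, ret_1d: float) -> str:
    """BULL / BEAR / RECOVERY / NEUTRAL from the 5-day and 1-day returns."""
    if ret_5d > 0.015:
        return "BULL"
    if ret_5d < -0.015:
        return "BEAR" if ret_1d < 0 else "RECOVERY"
    return "NEUTRAL"


def donchian_breakout(
    high: float,
    low: float,
    close: float,
    prev_highs: Sequence[float],
    prev_lows: Sequence[float],
) -> Optional[str]:
    """Detect a breakout against the previous five days' highs and lows."""
    day_range = high - low
    if day_range <= 0:
        return None  # flat day: position within the range is undefined
    position = (close - low) / day_range  # 0.0 = at the low, 1.0 = at the high
    if high > max(prev_highs) and position >= 0.95:
        return "BREAKOUT_UP"
    if low < min(prev_lows) and position <= 0.05:
        return "BREAKOUT_DOWN"
    return None
```

How a breakout interacts with the return-based regimes (for example, whether it overrides NEUTRAL) is not modeled here.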

Indicator thresholds
EMA

Bullish: Price > EMA50 > EMA200. Bearish: Price < EMA50 < EMA200.

MACD

Bullish: histogram > 0. Bearish: histogram < 0 for ≥ 4 of the last 5 candles.

RSI

Overbought: > 70. Oversold: < 30. Recovery: bounce above 40 after an oversold reading. Otherwise neutral.
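The EMA and MACD rules can be sketched from precomputed indicator values. The MACD rules can both hold at once (four negative candles followed by a positive one), so this sketch assumes the bearish test takes precedence. RSI is omitted because the lookback for "after an oversold reading" is not specified. Function names are hypothetical.

```python
from typing import Sequence


def ema_signal(price: float, ema50: float, ema200: float) -> str:
    """Bullish when Price > EMA50 > EMA200, bearish when Price < EMA50 < EMA200."""
    if price > ema50 > ema200:
        return "bullish"
    if price < ema50 < ema200:
        return "bearish"
    return "neutral"


def macd_signal(histogram: Sequence[float]) -> str:
    """Bearish when at least 4 of the last 5 histogram values are negative."""
    last_five = histogram[-5:]
    negatives = sum(1 for value in last_five if value < 0)
    if len(last_five) == 5 and negatives >= 4:
        return "bearish"
    if histogram and histogram[-1] > 0:
        return "bullish"
    return "neutral"
```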

Confidence

confidence = |bearish_signals − bullish_signals| / (total_signals + 2)

EMA bearish/bullish counts as +2 points · EMA weakening as +1 · MACD as +1 · RSI oversold-bounce as +1. Saturated at 4 signals.

Crisis surcharge

The technical analysis only triggers a surcharge on the GHI when the trend status is Trend reversal AND a price-based crash threshold is met (Equity 10-day drawdown ≥ 7 %, Gold 5-day return ≤ −5 %, or both for deleveraging). Surcharge values range from +8 to +22 points depending on severity. See the Overlays section for the full mechanic.

Data sources

Every input is verifiable against a public source.

FRED

US Federal Reserve economic data — WALCL, TGA, yields, DXY, RRP, SOFR, EFFR, DCPF3M (Phase 9), BAML HY/IG OAS, VIX.

US Treasury

Auction results — bid-to-cover, indirect bidder share.

TIC / H.4.1

Foreign holdings of US debt and Fed custody data.

yfinance + Stooq

Equity, gold, EUR/USD, oil/Brent prices (intraday + EOD), OVX (oil VIX).

ECB Data Portal

Eurozone money-market data — €STR (daily, EST/B.EU000A2X2A25.WT) and Euribor 3M (monthly, FM/M.U2.EUR.RT.MM.EURIBOR3MD_.HSTA). Phase 9.

EIA

US energy data (crude, Cushing, distillate).

ENTSO-E

Day-ahead electricity prices DE/FR/IT/Nord (Layer D power transmission).

RSS Feeds

22 news sources for political event ingestion.

Changelog

Every methodology change — new weights, new thresholds, new compound patterns — is logged with date and reasoning. Recent entries:

2026-05-01

Phase 9.1: Donchian breakout detection as new asset-regime classes (BREAKOUT_LONG / BREAKDOWN_SHORT). Asset-specific anchor hours (US 15:30 Berlin, EU 10:00 Berlin) filter opening volatility. V-shaped days with a new 5-day high are now correctly classified instead of NEUTRAL.

2026-05-01

Phase 9: Credit Stress (HY+IG OAS) added as 5th exposed Layer-C KPI (weight 0.075, with Vola weight reduced 0.225 → 0.15 as anti-redundancy). Funding Stress (money-market spreads US+EUR) added as 7th Layer-A KPI (weight 0.12, Layer-A weights redistributed). New ECB ingestor for €STR + Euribor. Backtest 122 trading days: mean drift +0.63 points, ρ(credit, vola)=0.04, ρ(funding, rrp)=−0.23 — anti-redundancy holds.

2026-04-18

Layer D (Energy) activated. Aggregation moved from 40/30/30 to 35/25/25/15. Channel Overlay introduced.

2026-03-16

Geo Overlay calibrated to fire only on CRITICAL compound patterns of military escalation. Diplomatic statements removed from trigger logic.

2026-02-28

Layer B severity scale recalibrated. Filter raised to score ≥ 15. Dominant Floor logic refined.

2026-02-17

Crisis Overlay thresholds calibrated: Equity 7 % over 10 days, Gold −5 % over 5 days.

Validation & Backtest

We tested all of the above against 3,872 trading days of historical data — including out-of-sample episodes like Lehman 2008, COVID-19, the Ukraine war and the 2025 Liberation Day tariffs. Read the public report including the limitations.

Open methodology is not a feature. It's the foundation.