
Signal Methodology v1.0

Multi-Model Ensemble + HRRR Weather Signal System
for Temperature Prediction Markets

Version: 1.0
Published: May 2026
Track Record: exchange-verified at /track
Ensemble: GFS + ECMWF + ICON + GEM
Cities Scanned: 40+

Important: This document describes a signal methodology, not a guaranteed trading strategy. The full per-trade record is published and exchange-verified at /track, reconstructed line-by-line from Kalshi's own settlement and fill records (every stop-loss exit counted at its realized price). Past performance is not indicative of future results. Seasonality, model uncertainty, and market liquidity all affect outcomes. See Section 7 for failure modes.
Table of Contents
  1. Executive Overview
  2. Data Sources: Multi-Model Ensemble + HRRR
  3. Signal Generation Pipeline
  4. Win-Probability & Calibration
  5. Execution Framework
  6. Risk Management & Bankroll
  7. Known Failure Modes
  8. Live Track Record Summary
  9. Legal Disclosures
§ 1

Executive Overview

3rd Eyes publishes meteorological signals for US weather prediction markets - Kalshi, and the identical contracts carried by Coinbase and Robinhood - focused on daily high- and low-temperature contracts. The core insight is that weather models, particularly a multi-model forecast ensemble and HRRR short-range radar assimilation, can identify temperature outcomes with sufficient confidence to generate positive expected value (EV) on binary contracts.

The strategy targets a specific market inefficiency: market makers price temperature thresholds using basic climatological priors. When numerical weather prediction (NWP) models diverge strongly from those priors, due to an identifiable synoptic pattern, we publish a NO signal. The edge is meteorological, not statistical arbitrage.

Core Thesis in One Sentence

When an independent multi-model ensemble and the HRRR hourly update agree that a temperature threshold will NOT be reached, and the market is still pricing that event at 3-18¢ (implying 3-18% probability), the true probability is materially lower, creating positive EV on the NO side.

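We operate exclusively on NO contracts, betting that a temperature threshold will not be exceeded (for high-temp markets) or not be reached (for low-temp markets). The payout is asymmetric: a NO bought at 82-97¢ risks more than it can win (a 95¢ fill wins 5¢ before fees), so the edge comes from a high win rate rather than a large payout. The position has positive EV only if the market misprices the probability, here by ≥5 percentage points.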
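
To make the EV claim concrete, here is a minimal sketch of the per-contract expected value of a NO position held to settlement. The function name and the example probabilities are illustrative assumptions, not part of the published system, and the sketch ignores stop-loss exits.

```python
def no_side_ev(no_price: float, p_no: float, fee: float = 0.0) -> float:
    """Expected profit in dollars per NO contract held to settlement.

    no_price: price paid for the NO contract, in dollars (0 < no_price < 1).
    p_no:     assumed true probability that the contract resolves NO.
    fee:      per-contract fee in dollars (illustrative parameter).
    """
    if not 0.0 < no_price < 1.0:
        raise ValueError("no_price must be between 0 and 1 dollars")
    if not 0.0 <= p_no <= 1.0:
        raise ValueError("p_no must be a probability")
    win = 1.0 - no_price - fee      # NO pays $1.00 when the event does not happen
    loss = -(no_price + fee)        # the stake and the fee are lost otherwise
    return p_no * win + (1.0 - p_no) * loss


# Hypothetical case: market prices the event at 10¢ (NO at 90¢),
# while the models imply a 4% chance (p_no = 0.96).
print(round(no_side_ev(0.90, 0.96), 4))
```

The expression simplifies to p_no - no_price - fee, so the position breaks even when the true NO probability equals the price paid plus the fee.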

§ 2

Data Sources: Multi-Model Ensemble + HRRR

🌐 Multi-Model Forecast Ensemble

Source: Four independent global numerical weather models - GFS (NOAA), ECMWF, ICON (DWD) and GEM (Environment Canada).

Resolution: ~9-25 km horizontal; each model re-runs every 6-12 hours.

Lead time used: 12-96 hours (Day 0 through Day 4)

Role in signal: Primary temperature forecast. Pooling four independent models cancels out any single model's bias; the spread across them quantifies uncertainty. Narrow spread (low σ) = strong agreement = high confidence.

Key metric: Ensemble mean 2m temperature vs. the market threshold. A calibrated minimum separation (the "buffer") is required before a signal is scored - exact thresholds are proprietary.
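
As a rough illustration of the key metric, the sketch below pools one forecast per model into a mean and a spread. The forecast values are made up, and the choice of the sample standard deviation for σ is an assumption; the document does not say how the production system computes spread.

```python
from statistics import mean, stdev


def ensemble_summary(forecasts_f: dict[str, float]) -> tuple[float, float]:
    """Return (ensemble mean, spread) in °F from one forecast per model."""
    values = list(forecasts_f.values())
    if len(values) < 2:
        raise ValueError("need at least two models to measure spread")
    # Sample standard deviation is an assumed definition of "spread".
    return mean(values), stdev(values)


# Hypothetical 2m high-temperature forecasts for one city and date.
forecasts = {"GFS": 70.0, "ECMWF": 72.0, "ICON": 71.0, "GEM": 71.0}
ens_mean, ens_sigma = ensemble_summary(forecasts)
print(f"mean={ens_mean:.1f}°F sigma={ens_sigma:.2f}°F")
```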

🛰 HRRR (High-Resolution Rapid Refresh)

Source: NOAA/ESRL HRRR - single deterministic model

Resolution: 3 km horizontal, hourly runs

Lead time used: 0-36 hours (same-day and next-day markets only)

Role in signal: Confirmation for near-term signals. HRRR assimilates real-time radar, surface observations, and satellite data - it has materially higher skill at ≤24 hours than the global models.

Key metric: HRRR maximum 2m temperature vs. the market threshold. Agreement with the ensemble required for signal publication.

⚠️ Model Agreement Requirement: A signal is only published when every model in the ensemble and (for ≤36h markets) HRRR independently forecast the threshold will not be reached. Any disagreement = no signal, regardless of individual model confidence.

| Attribute | Multi-Model Ensemble | HRRR |
|---|---|---|
| Best for | Day 2-4 signals | Same-day / Day 1 signals |
| Resolution | ~9-25 km | 3 km |
| Update frequency | Every 6-12 h | Hourly |
| Composition | 4 global models (GFS, ECMWF, ICON, GEM) | 1 (deterministic) |
| Radar assimilation | No | Yes (real-time) |
| Weakness | Coarser resolution can miss urban heat islands | No ensemble spread - single point of failure |

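
The agreement requirement can be sketched as a simple all-or-nothing check for a high-temperature market. This is an illustration under stated assumptions: the function name and inputs are hypothetical, and the proprietary buffer is left out, so "agree" here only means every forecast sits below the threshold.

```python
from typing import Optional


def all_models_agree(
    threshold_f: float,
    model_highs_f: dict[str, float],
    hours_to_resolution: float,
    hrrr_high_f: Optional[float] = None,
) -> bool:
    """True only if every model forecasts a high below the threshold."""
    if not model_highs_f:
        return False
    # Any single global model at or above the threshold vetoes the signal.
    if any(high >= threshold_f for high in model_highs_f.values()):
        return False
    # For markets resolving within 36 hours, HRRR must also be present and agree.
    if hours_to_resolution <= 36:
        return hrrr_high_f is not None and hrrr_high_f < threshold_f
    return True


highs = {"GFS": 70.0, "ECMWF": 72.0, "ICON": 71.0, "GEM": 71.0}
print(all_models_agree(75.0, highs, hours_to_resolution=20, hrrr_high_f=72.5))
```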
§ 3

Signal Generation Pipeline

The signal pipeline runs automatically every 5 minutes, scanning open Kalshi weather markets and evaluating each against current NWP model output.

Step 1: Market Scan → Step 2: NWP Pull → Step 3: Buffer Check → Step 4: Multi-Model OK → Step 5: Score + Publish

Step 1: Market Scan

Fetch open temperature markets from Kalshi (KXHIGH*/KXLOW* series via the Kalshi API). The same contracts are mirrored on Coinbase and Robinhood, so a Kalshi signal is tradeable on any of the three. Filter for markets resolving within the next 96 hours with adequate liquidity.

Step 2: NWP Temperature Pull

For each market, extract the forecast high temperature (or the forecast low, for low-temp markets) for the relevant city, date, and hour from the multi-model ensemble mean (GFS, ECMWF, ICON, GEM - all markets) and HRRR (markets resolving ≤36h).

Step 3: Buffer Calculation

Compute the temperature buffer: the signed gap between the Kalshi threshold and the forecast, oriented so that the NO side is positive:

buffer = market threshold - forecast high   (high-temp markets)
buffer = forecast low - market threshold    (low-temp markets)
A signal is published only when the buffer clears a calibrated
minimum. Exact thresholds are proprietary.

A positive buffer means the model forecasts the threshold will NOT be reached. Larger buffer = higher confidence = higher win-probability.
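
A minimal sketch of the buffer check, written so that a positive value always means the forecast sits on the NO side. The treatment of low-temp markets (NO wins when the low stays above the threshold) is an assumption drawn from the overview, and `min_buffer_f` is a placeholder because the real minimum is proprietary.

```python
def signed_buffer(forecast_f: float, threshold_f: float, market_type: str) -> float:
    """Buffer in °F; positive means the forecast favors the NO side."""
    if market_type == "high":
        return threshold_f - forecast_f   # forecast high stays below the threshold
    if market_type == "low":
        return forecast_f - threshold_f   # forecast low stays above the threshold (assumed)
    raise ValueError("market_type must be 'high' or 'low'")


def clears_buffer(buffer_f: float, min_buffer_f: float) -> bool:
    """min_buffer_f is a placeholder; the calibrated minimum is not published."""
    return buffer_f >= min_buffer_f


buf = signed_buffer(forecast_f=71.0, threshold_f=75.0, market_type="high")
print(buf, clears_buffer(buf, min_buffer_f=3.0))
```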

Step 4: Multi-Model Confirmation

For ≤36h markets, HRRR must independently agree with the ensemble. For Day 2-4 markets, the model ensemble spread must fall within a calibrated tolerance; high spread = uncertain = no signal.

Step 5: Win-Probability & Publish

Compute a calibrated win-probability, check orderbook depth for minimum fill ≥$8, and publish signal via Telegram with full reasoning: city, threshold, model mean, HRRR (if applicable), buffer, win-probability, and suggested position sizing.

§ 4

Win-Probability & Calibration

Each signal carries a calibrated win-probability - the model's estimate of the chance the NO side resolves correctly. It is derived from a Normal distribution fitted to the multi-model ensemble, with the spread deliberately widened for forecast lead time so the number stays honest rather than overconfident. This win-probability directly informs suggested position sizing across our three risk tiers.
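
One way to picture this calculation is a Normal CDF evaluated at the threshold, with σ inflated by lead time. The sketch below is illustrative only: the linear widening rule, the `inflation_per_day_f` and `min_sigma_f` values, and the function names are assumptions, since the actual calibration is not published.

```python
import math


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X < x) for X ~ Normal(mu, sigma)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def win_probability_high_market(
    threshold_f: float,
    ensemble_mean_f: float,
    ensemble_sigma_f: float,
    lead_hours: float,
    inflation_per_day_f: float = 0.5,   # assumed widening per day of lead time
    min_sigma_f: float = 1.0,           # assumed floor so tight ensembles are not overtrusted
) -> float:
    """Estimated chance the daily high stays below the threshold (NO wins)."""
    sigma = max(ensemble_sigma_f, min_sigma_f) + inflation_per_day_f * (lead_hours / 24.0)
    return normal_cdf(threshold_f, ensemble_mean_f, sigma)


# Same forecast, two lead times: the longer lead gets a wider sigma
# and therefore a lower win-probability.
print(round(win_probability_high_market(75.0, 71.0, 0.8, lead_hours=24), 3))
print(round(win_probability_high_market(75.0, 71.0, 0.8, lead_hours=96), 3))
```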

| Win-Probability | Buffer (°F) | Model Spread | Interpretation | Suggested Tier |
|---|---|---|---|---|
| ≥ 93% | ≥ 3°F | < 2°F σ | High conviction: every model clears, low uncertainty | Safe, Medium, Aggressive |
| 88 - 93% | 1.5 - 3°F | < 3°F σ | Moderate conviction: good buffer, manageable spread | Medium, Aggressive |
| 84 - 88% | 0.5 - 1.5°F | < 4°F σ | Lower conviction: edge exists but thinner; smallest size only | Aggressive only |

⚠️ Calibration Status: The full per-trade record is live and reconstructed line-by-line from Kalshi's own settlement and fill records - including every stop-loss exit counted at its realized price. Win-probability calibration curves will be expanded as the sample grows. See the exchange-verified ledger at /track.

Calibration is ongoing. We will update this document at: 100 trades (v1.1), summer 2026 season end (v1.2), 500 trades (v2.0). All updates will be published at 3rdeyes.io/methodology.

§ 5

Execution Framework

Signals are published via Telegram. Subscribers execute trades independently on Kalshi (US residents) or equivalent platforms (international). 3rd Eyes never touches subscriber funds.

Position Sizing - Risk Tiers

| Tier | Stake per trade | Signals Taken |
|---|---|---|
| Safe | 1 - 2% of bankroll | ≥ 93% win-prob |
| Medium | 2 - 3% of bankroll | ≥ 88% win-prob |
| Aggressive | 3 - 5% of bankroll | ≥ 85% win-prob |
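
The tier cut-offs in this table can be read as a simple lookup. The sketch below uses the win-probability floors listed above; the names `TIER_MIN_WIN_PROB` and `eligible_tiers` are illustrative.

```python
# Minimum win-probability each tier accepts, taken from the table above.
TIER_MIN_WIN_PROB = {"Safe": 0.93, "Medium": 0.88, "Aggressive": 0.85}


def eligible_tiers(win_prob: float) -> list[str]:
    """Return the tiers that would take a signal with this win-probability."""
    return [tier for tier, floor in TIER_MIN_WIN_PROB.items() if win_prob >= floor]


print(eligible_tiers(0.90))
```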

All tiers manage a single bankroll - no slots. We hold no more than 2 positions in any one weather region, so correlated weather can't take out several at once.

Position Sizing - Fixed-Fractional

Subscribers manage a single bankroll and stake a fixed fraction of it on each signal - no slots, no progressive ladders.

position = bankroll × risk_fraction ← e.g. 2% of $2,000 = $40
contracts = floor(position / NO_price)
max_loss ≈ contracts × $0.20 (stop) + fees ≈ 21% of position

When the bankroll grows, the same fraction naturally stakes a little more; when it shrinks, you stake less. A loss never increases the next bet - which removes the temptation to "chase" and caps the damage of any losing streak.
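
The sizing formula can be sketched in a few lines. Working in whole cents is a design choice of this example (it avoids floating-point noise in the floor division), and the function name is an assumption.

```python
def size_position(bankroll: float, risk_fraction: float, no_price_cents: int) -> tuple[float, int]:
    """Return (position in dollars, whole contracts) for a fixed-fractional stake."""
    if not 0 < no_price_cents < 100:
        raise ValueError("no_price_cents must be between 1 and 99")
    position = bankroll * risk_fraction
    # Round the stake to the nearest cent, then floor-divide by the price in cents.
    position_cents = round(position * 100)
    contracts = position_cents // no_price_cents
    return position, contracts


# 2% of a $2,000 bankroll at a 95¢ NO price.
print(size_position(2000.0, 0.02, 95))
```

Because the stake is recomputed from the current bankroll each time, a loss automatically shrinks the next position rather than enlarging it.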

Stop-Loss Protocol

The stop sits a fixed 20¢ below entry. When an active position's NO price falls 20¢ under your fill (e.g., 95¢ → 75¢), the signal system flags an immediate exit. This caps the loss at roughly 20¢ per contract - about 21% of a 95¢ position - regardless of how much time is left.

Max loss per position ≈ contracts × ($0.20 stop + $0.01 fee)
Example: $80 @ 95¢ NO → 84 contracts → max loss ≈ 84 × $0.21 = ~$17.64
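
A small sketch of the stop rule and the max-loss arithmetic above, in integer cents. The function names are hypothetical, and `fee_cents=1` mirrors the formula above, which counts a single $0.01 fee per contract.

```python
def stop_price_cents(entry_cents: int, stop_distance_cents: int = 20) -> int:
    """Price at which the stop triggers: a fixed distance below the fill."""
    return entry_cents - stop_distance_cents


def should_exit(entry_cents: int, current_no_cents: int) -> bool:
    """True once the NO price has fallen to or through the stop."""
    return current_no_cents <= stop_price_cents(entry_cents)


def max_loss_dollars(contracts: int, stop_distance_cents: int = 20, fee_cents: int = 1) -> float:
    """Approximate worst case if the stop fills exactly at the stop price."""
    return contracts * (stop_distance_cents + fee_cents) / 100


print(stop_price_cents(95), should_exit(95, 74), max_loss_dollars(84))
```

A real exit can fill below the stop price in a fast or thin market, so this figure is a planning estimate rather than a hard cap.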
Minimum Fill Threshold

Signals where the Kalshi orderbook cannot fill ≥$8 (thin markets, usually <10 contracts at the target NO price) are flagged as SKIP. These are published for informational purposes but are excluded from the W/L track record.
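
The fill filter amounts to comparing the dollar value available at the target price with the $8 minimum. The sketch below assumes the orderbook has already been reduced to a count of contracts available at the target NO price; that input and the function name are hypothetical, not Kalshi API fields.

```python
def fill_status(contracts_available: int, no_price_cents: int, min_fill_dollars: float = 8.0) -> str:
    """Return "OK" if the book can absorb the minimum fill, else "SKIP"."""
    fillable_dollars = contracts_available * no_price_cents / 100
    return "OK" if fillable_dollars >= min_fill_dollars else "SKIP"


print(fill_status(8, 95))    # 8 contracts at 95¢ is $7.60 of depth
print(fill_status(40, 95))   # 40 contracts at 95¢ is $38.00 of depth
```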

Kalshi Fee Structure

Kalshi charges a flat $0.01 per contract fee on fills. This fee applies to both entry and exit (stop-loss) fills.

Net profit per win = payout - entry_cost - (contracts × $0.01)
Example: 84 contracts @ 95¢ NO → win payout $84 - cost $79.80 - fee $0.84 = net +$3.36
Example: 84 contracts @ 97¢ NO → win payout $84 - cost $81.48 - fee $0.84 = net +$1.68

At high NO prices (95¢+), the fee is material: it takes 20% of gross profit at 95¢, 33% at 97¢, and 50% at 98¢. Always check net-of-fees EV before sizing up.
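
The two worked examples above can be reproduced with a short sketch, assuming the flat $0.01 per-contract fee described in this section. The function names are illustrative.

```python
def net_profit_on_win(contracts: int, no_price_cents: int, fee_cents: int = 1) -> float:
    """Net dollars earned when the NO position settles at $1.00 per contract."""
    gross_cents = contracts * (100 - no_price_cents)   # payout minus entry cost
    fees_cents = contracts * fee_cents                 # one fee on the entry fill
    return (gross_cents - fees_cents) / 100


def fee_share_of_gross(no_price_cents: int, fee_cents: int = 1) -> float:
    """Fraction of gross profit per contract consumed by the fee."""
    return fee_cents / (100 - no_price_cents)


print(net_profit_on_win(84, 95), net_profit_on_win(84, 97))
print(round(fee_share_of_gross(97), 3), fee_share_of_gross(98))
```

The fee share grows quickly as the NO price approaches $1.00, because the fee is fixed while the gross profit per contract shrinks.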

§ 6

Risk Management & Bankroll Discipline

The 3rd Eyes framework prioritizes capital preservation over maximizing individual trade returns. This manifests in four hard rules:

Rule 1: Never size into thin orderbooks

Minimum fill $8. If the orderbook cannot fill $8 at the target price, skip the trade entirely. Thin markets mean: poor price execution, high slippage on exit, and outsized fee impact.

Rule 2: Respect the stop-loss unconditionally

The stop - 20¢ below entry - is not a suggestion. When the NO price falls that far (e.g., 95¢ → 75¢), the market is signaling that our model edge has deteriorated, usually because of an unexpected weather development. Holding through a stop risks losing the full position.

Rule 3: No pyramiding into losing positions

If a position moves against us, we do not add contracts. The stop-loss exits the position. We wait for the next independent signal.

Rule 4: Fixed-fractional sizing

Every position is a small, fixed fraction of one bankroll (commonly 1-5%). A loss never increases the next stake, and no single trade can lose more than ~21% of its own position. We also cap concurrent positions in any one weather region at 2, so a single correlated weather event can't hit several at once.
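
The regional cap can be enforced with a simple count before opening a position. This is a sketch: the region labels and the function name are hypothetical, and how cities map to weather regions is not specified in this document.

```python
from collections import Counter


def can_open_in_region(region: str, open_position_regions: list[str], max_per_region: int = 2) -> bool:
    """True if opening one more position keeps the region at or under the cap."""
    return Counter(open_position_regions)[region] < max_per_region


# Hypothetical open book: two positions in the Northeast, one in the Gulf.
open_regions = ["Northeast", "Northeast", "Gulf"]
print(can_open_in_region("Northeast", open_regions))
print(can_open_in_region("Gulf", open_regions))
```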

§ 7

Known Failure Modes

We are transparent about the conditions under which this methodology underperforms or fails. Every subscriber should understand these risks before placing any position.

| Failure Mode | Cause | Mitigation | Frequency (estimated) |
|---|---|---|---|
| Sudden synoptic shift | A frontal system moves faster or slower than models predict, pushing temps across the threshold after signal entry | Stop-loss at 75¢ (20¢ below entry) | ~1-2% of trades |
| Urban heat island miss | The global models' ~25 km resolution can miss localized urban heating; HRRR 3 km typically catches this for ≤24h, but not always | Use HRRR confirmation for city markets | ~1% of trades |
| Convective outlier | An unexpected thunderstorm (especially summer) suppresses daytime high below model forecast - but for NO bets, this is usually a win | N/A (benefits NO positions) | N/A |
| Model initialization error | A bad radiosonde observation poisons the GFS analysis, creating a systematically wrong forecast for 1-2 days | HRRR cross-check (HRRR uses independent radar data) | <0.5% of trades |
| Kalshi thin market | Low liquidity → poor fill price → fee drag erases edge | $8 minimum fill filter | Excluded from W/L |
| Summer convective season | June-August afternoon convection randomizes peak temperatures; model uncertainty 3× higher than spring | Lower-band signals (< 90% win-prob) suspended; position sizing halved on the rest | Seasonal - 3 months/year |
| API / system downtime | Kalshi API outage, VPS failure, or cron job failure can cause missed signals or missed stop-loss exits | 5-min scan + 1-min position monitor; Telegram alerts on error | Rare, monitored |

⚠️ Seasonality Warning: Our published win rate is a spring 2026 figure. Spring frontal systems produce the most predictable temperature deviations in the continental US. Summer 2026 will be the first real stress-test of this methodology. We expect win rates to normalize downward in summer conditions. We will update this document with real data after August 2026.
§ 8

Track Record

The full per-trade record is published and updated daily at /track, reconstructed line-by-line from Kalshi's own settlement and fill records. Every win, every loss, every stop-loss exit at its realized price.

How we got here, in the open. An earlier version of this section displayed self-reported win/loss numbers that were under-recording stop-loss exits. Rather than patch a figure we couldn't fully stand behind, we took the table offline, rebuilt the entire ledger directly from Kalshi's own settlement and fill records (the exchange's data, not ours), and republished it as the live table at /track.

Every row there is reconciled to a Kalshi fill. Every stop-loss exit is counted at the actual price we got out, not the model's intended exit. If you ever see a discrepancy with your own broker view, the exchange data is the source we trust.