In-Context Learning Under Regime Change

Carson Dudley
Yutong Bi
Xiaofeng Liu
Samet Oymak
Published on 4/18/2026
Cross-asset
Machine learning
Risk management
Market timing

This paper studies how transformers handle non-stationary sequences with abrupt regime changes, formalizing the problem as in-context change-point detection. The authors provide constructive theory showing that transformers can approximate the Bayesian model-averaged (BMA) predictor for piecewise-linear tasks, with model complexity (depth, width, attention heads) depending on how much side information about the change point is available. Specifically, knowing the exact change point shrinks the candidate set and thus the number of attention heads required, while partial information (e.g., only the support of the change-point distribution) leaves a larger candidate set and requires more heads. Synthetic experiments on linear regression and linear dynamical systems confirm that trained transformers match the optimal baselines: oracle least-squares when the change point is given, and BMA when it is not. Real-world experiments on infectious disease forecasting (with policy changes) and financial volatility forecasting (around FOMC announcements) show that encoding change-point information via positional encoding improves pretrained foundation model performance without retraining, with up to 25% MAE reduction in disease forecasting. The work bridges classical change-point detection and modern in-context learning, offering practical methods for deploying transformers in non-stationary environments.

Highlights

  • Formalizes in-context change-point detection for transformers under non-stationary sequences.
  • Provides constructive theory showing transformer complexity depends on change-point information level.
  • Validates theory with synthetic experiments where trained transformers match optimal baselines.
  • Demonstrates real-world improvement by encoding change-point information into pretrained models.

Methods

  • Bayesian model averaging (BMA) for change-point adaptation.
  • Transformer construction with positional encoding to communicate change-point side information.
  • Synthetic experiments on piecewise-linear regression and dynamical systems.
  • Real-world experiments on infectious disease and financial volatility forecasting.
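To make the BMA baseline concrete, the following is a minimal sketch of Bayesian model averaging over candidate change points for a piecewise-linear regression task with a single change. It is an illustration under stated assumptions, not the paper's transformer construction: it assumes a Gaussian prior on the regression weights, a known noise variance, and a uniform prior over the candidate set. The names `log_evidence`, `ridge_predict`, and `bma_predict` are hypothetical, and the toy data is invented for the demonstration.

```python
import numpy as np

def log_evidence(X, y, noise_var, prior_var):
    """Log marginal likelihood of y = Xw + eps with w ~ N(0, prior_var * I)."""
    n = len(y)
    if n == 0:
        return 0.0  # an empty segment contributes nothing
    cov = noise_var * np.eye(n) + prior_var * (X @ X.T)
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    return -0.5 * (quad + logdet + n * np.log(2 * np.pi))

def ridge_predict(X, y, x_query, noise_var, prior_var):
    """Posterior-mean prediction at x_query from one segment (ridge regression)."""
    d = X.shape[1]
    A = X.T @ X + (noise_var / prior_var) * np.eye(d)
    w = np.linalg.solve(A, X.T @ y)
    return float(x_query @ w)

def bma_predict(X, y, x_query, candidates, noise_var=0.01, prior_var=1.0):
    """Average post-change predictions, weighted by each candidate's posterior.

    A candidate tau means rows [0, tau) follow the old regime and rows
    [tau, n) follow the new one. Assumed: uniform prior over candidates.
    """
    log_post, preds = [], []
    for tau in candidates:
        log_post.append(
            log_evidence(X[:tau], y[:tau], noise_var, prior_var)
            + log_evidence(X[tau:], y[tau:], noise_var, prior_var)
        )
        preds.append(ridge_predict(X[tau:], y[tau:], x_query, noise_var, prior_var))
    log_post = np.array(log_post)
    weights = np.exp(log_post - log_post.max())  # subtract max for stability
    weights /= weights.sum()
    return float(weights @ np.array(preds)), weights

# Toy data (invented for illustration): the regime changes at index 25.
n, true_tau = 40, 25
idx = np.arange(n)
X = np.column_stack([np.ones(n), np.sin(idx), np.cos(2 * idx)])
w_old = np.array([2.0, -1.0, 0.5])
w_new = np.array([-2.0, 0.5, 1.0])
rng = np.random.default_rng(0)
y = np.where(idx < true_tau, X @ w_old, X @ w_new) + rng.normal(scale=0.1, size=n)

x_query = np.array([1.0, 0.3, -0.2])
candidates = list(range(5, 36))

# Uninformed case: average over every candidate change point.
pred, weights = bma_predict(X, y, x_query, candidates)
# Informed case: a single candidate reduces BMA to a fit on the post-change segment.
oracle_pred, _ = bma_predict(X, y, x_query, [true_tau])
print(pred, oracle_pred, candidates[int(np.argmax(weights))])
```

Passing a one-element candidate list recovers the informed (oracle) case, which mirrors the summary's point that more side information means a smaller candidate set to average over.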

Results

  • Transformers can approximate the BMA predictor for piecewise-linear change-point problems.
  • Model complexity (number of attention heads) scales with the size of the candidate change-point set.
  • Trained transformers match oracle least-squares and BMA baselines in synthetic tasks.
  • Positional encoding of change-point information reduces MAE by up to 25% in disease forecasting.
  • Linear positional encoding improves financial volatility forecasting around FOMC events.