Lecture 1: Foundations

Course outline · Backtesting fundamentals

Prof. Dr. Andre Guettler, Director of the Institute
Helmholtzstraße 22, Room 205
andre.guettler@uni-ulm.de
+49 731 50 31 030
Oliver Padmaperuma, Doctoral Candidate
Helmholtzstraße 22, Room 203
oliver.padmaperuma@uni-ulm.de
+49 731 50 31 036

1.1 Course objectives

  • 1.1 Course objectives
  • 1.2 Aim & organisation
  • 1.3 Backtesting fundamentals
  • 1.4 Conclusion of Lecture 1
  • Welcome to Finance Project — Asset Management
  • Course Objective
  • Course at a glance (1/2)
  • Course at a glance (2/2)
  • Assignments / Exams

Welcome to Finance Project — Asset Management

  • This is a project course: there is no central exam to register for. Sign up on the course Moodle page by 15 April 2026 so you receive announcements and the data link.
  • Submit the project by 30 June 2026 as a single zip — name pattern: Asset2026_surname1_surname2_surname3. Email it to oliver.padmaperuma@uni-ulm.de, CC andre.guettler@uni-ulm.de and your team-mates.
  • Ask questions during or right after each session — that is the preferred channel.
  • Admin / studies / exam-eligibility questions go to the registrar’s office (Studiensekretariat) at studiensekretariat@uni-ulm.de.
  • Course-content questions outside class: email oliver.padmaperuma@uni-ulm.de, CC andre.guettler@uni-ulm.de.
  • We also recommend the student advisory service.

Course Objective

Scope

We will:

  • Build an end-to-end empirical pipeline in R: load, explore, model, back-test
  • Cover the core ML toolbox for asset-management research: linear models, Ridge, Lasso, Elastic Net, cross-validation
  • Apply it to a non-traditional asset class: prediction markets
  • Develop your own indicator library and trading strategy in groups of three

We will NOT:

  • Drift into deep-learning or reinforcement-learning methods
  • Cover prediction markets in depth
  • Provide a “ready-to-fork” backtest — the demo code is intentionally basic

Approach

Part I — Foundations

  • L1: Motivation, organisation, backtesting fundamentals
  • L2: Hands-on R intro — RStudio, live coding, etc.
  • L3 + L4: Statistical learning — model accuracy, regularisation, resampling

Part II — Application

  • L5: Prediction-markets primer + the Polymarket dataset + assignment briefing
  • Project work in groups of three (≈ 7 weeks of self-organised work)
  • Final session (1 July): 20-minute presentations per team

Course at a glance (1/2)

Foundations

Week 1

15.04.2026

Course outline · Backtesting fundamentals

  • Course aim & organisation
  • Backtesting overview & case study
  • In-sample tests (Welch & Goyal 2008)
  • Out-of-sample (walk-forward, R²_OS)
  • Useful predictors & p-hacking

Introduction to R

Week 2

22.04.2026

RStudio · variables · vectors · data frames · live coding

  • Why R for empirical asset-management research
  • RStudio and the script editor
  • Variables, vectors, matrices, data frames, lists
  • Functions and loops
  • Data import and export

Assessing model accuracy & Ridge regression

Week 3

29.04.2026

Statistical learning · MSE · bias-variance · linear model selection · Ridge

  • Statistical learning: Y = f(X) + ε
  • Quality of fit and the train/test MSE distinction
  • Bias-variance trade-off and overfitting
  • OLS limits: prediction accuracy & interpretability
  • Ridge regression and the L2 penalty

Lasso, cross-validation & Elastic Net

Week 4

06.05.2026

Sparse regularisation · resampling for honest test error · choosing λ

  • Lasso: L1 penalty and exact-zero coefficients
  • Cross-validation: validation set, LOOCV, K-fold
  • Choosing the optimal λ for Lasso
  • OLS post-Lasso for cleaner coefficient inference
  • Elastic Net — combining Ridge and Lasso

Prediction markets, the Polymarket Quant Bench & your project

Week 5

13.05.2026

From Welch-Goyal to event-resolved binary contracts

  • Prediction markets — definition and Polymarket as the canonical venue
  • How prices form: liquidity, resolution, mechanics
  • The Polymarket Quant Bench dataset (HuggingFace): access and schema
  • First look at the data in R
  • Your project: indicator design, back-test, deliverables, R toolbox

Course at a glance (2/2)

Final presentations

Week 13

01.07.2026

Group presentations · Q&A · wrap-up

  • Presentation order and time budget
  • Q&A rules
  • Closing thoughts and feedback

Assignments / Exams

Project (Code + Report) 50% of your grade

Rmd code + knitr-rendered PDF report. Build a library of indicators over the Polymarket Quant Bench dataset (curated OHLCV bars on HuggingFace, derived from Jon Becker’s polymarket-data dump), derive trade signals, back-test, and write a critical reflection.

Group of up to 3.

Submit by emailing oliver.padmaperuma@uni-ulm.de, CC andre.guettler@uni-ulm.de. Subject pattern: Finance Project — Asset Management_assignment-1-project-report_surname1_surname2_…

30 June 2026

Final Presentation 50% of your grade

20-minute group presentation in class on 1 July 2026; submit slides as PDF together with the project zip.

Group of up to 3.

Submit by emailing oliver.padmaperuma@uni-ulm.de, CC andre.guettler@uni-ulm.de. Subject pattern: Finance Project — Asset Management_assignment-2-final-presentation_surname1_surname2_…

1 July 2026

1.2 Aim & organisation

  • Aim of the Finance Project
  • Course outline
  • General course information
  • Reading & grading
  • Slides, materials & contact

Aim of the Finance Project

  • Students develop their own indicator-based trading strategies for prediction markets using machine-learning approaches in R.
  • Combination of lectures (key concepts in Machine Learning, backtesting, basic R) plus hands-on empirical implementations of your own strategy.
  • Ideally, you can run your strategy on a live dashboard.

Course outline

  1. Motivation & Organisation (today)
  2. Backtesting Fundamentals (today)
  3. Statistical Learning with Applications in R
    1. Introduction to R — Lecture 2 (Oliver)
    2. Assessing Model Accuracy — Lecture 3 (Andre)
    3. Linear Model Selection & Regularisation — Lectures 3–4 (Andre)
    4. Resampling Methods — Lecture 4 (Andre)
  4. Practical Implementation — Lecture 5 (Oliver)
    1. Prediction-markets primer & the Polymarket Quant Bench dataset
    2. Your project: build indicators & back-test on prediction-market data

General course information

  • All lectures: Wednesdays at 12:15 in Helmholtzstraße 18, room E60, Ulm.
  • See dates in the course overview or on Moodle (first lecture on 15 April 2026).
  • After Lecture 5, the practical phase starts and we offer regular consultation hours.
  • Important deadlines:
    • Written project (Rmd + PDF + slides as zip): 30 June 2026, 18:00.
    • Final presentations: 1 July 2026.

Reading & grading

An Introduction to Statistical Learning with Applications in R — James, Witten, Hastie, Tibshirani (free eBook).

  • Project report (10–15 pages, R Markdown → PDF): 50%
  • Final presentation (~20 minutes, PDF slides): 50%
  • Group-based: groups of 3 (we allocate if you don’t form one).

Slides, materials & contact

  • Slides + teaching material: posted to Moodle and the course homepage.
  • Dataset link: shared via Moodle ahead of Lecture 5 (do not start early — we may update the dataset).
  • Lecture content — Andre Guettler (andre.guettler@uni-ulm.de)
  • Practical implementations — Oliver Padmaperuma (oliver.padmaperuma@uni-ulm.de)

1.3 Backtesting fundamentals

  • Backtesting — overview
  • Case study — SRA Credit Spread Trading
  • In-sample tests — general setup
  • In-sample tests — properties
  • Welch & Goyal (2008) — bootstrap inference
  • In-sample results — Welch and Goyal (2008)
  • Out-of-sample tests — overview
  • Out-of-sample R²
  • Out-of-sample results — Welch and Goyal (2008)
  • Plots — IS vs OOS R² over time
  • Example — Book-to-market
  • Useful predictors — requirements
  • P-hacking / data mining

Backtesting — overview

  1. Case study — what does a polished backtest look like in practice?
  2. In-sample tests — how do we know a predictor matters on the historical sample?
  3. Out-of-sample forecasts — does the predictor still work outside the estimation window?

Case study — SRA Credit Spread Trading

  • Forecasting horizon: 1 day
  • Trading instruments: $HYG ETF
  • Risk-free: US Treasury (1y)
  • Benchmark: $HYG buy-and-hold
  • Rebalancing: daily
  • Trading costs: bid/ask + 2 bp ETF + 0.5 bp TSY
  • Sharpe ratio: 1.38 (strategy) vs 0.40 (benchmark)
  • Std. dev. (annualised): 8.3% vs 21.0%
  • Max draw-down: −7.5% vs −33.5%
  • Annualised return: 16.73% vs 7.99%
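Statistics like these can be computed from a daily return series in a few lines of base R. A minimal sketch (the return series is simulated, not SRA's; the function name `perf_stats` is ours):

```r
# Annualised performance statistics from a vector of daily simple returns
perf_stats <- function(r, rf = 0, periods = 252) {
  excess  <- r - rf
  ann_ret <- mean(r) * periods                       # annualised mean return
  ann_sd  <- sd(r) * sqrt(periods)                   # annualised volatility
  sharpe  <- mean(excess) / sd(excess) * sqrt(periods)
  wealth  <- cumprod(1 + r)                          # cumulative wealth index
  max_dd  <- min(wealth / cummax(wealth) - 1)        # maximum draw-down
  c(ann_return = ann_ret, ann_sd = ann_sd,
    sharpe = sharpe, max_drawdown = max_dd)
}

set.seed(1)
r_strategy <- rnorm(252 * 5, mean = 0.0007, sd = 0.005)  # 5 years of fake daily returns
perf_stats(r_strategy)
```

With real data you would pass the strategy's and the benchmark's return series through the same function and compare row by row.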

In-sample tests — general setup

\[ r_{t,t+h} \;=\; a + b\,X_t + u_t \]

  • \(r\) — log excess return (\(=\) asset return − risk-free rate)
  • \(X\) — vector of predictor(s)
  • \(t\) — end of period; \(h\) — forecast horizon (e.g. 6 months)

Example.

  • \(X_t\) = Dividend yield as of 31.12.2020
  • \(r_{t,t+6}\) = excess return for 31.12.2020 → 30.06.2021
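In R, this regression is a single `lm()` call. A sketch on simulated data (coefficients and series are made up; with real data you must align \(X_t\) with the return measured over \(t\) to \(t+h\)):

```r
# In-sample predictive regression: r_{t,t+h} = a + b * X_t + u_t
set.seed(42)
n <- 200
X <- as.numeric(arima.sim(model = list(ar = 0.9), n = n))  # persistent predictor
r <- 0.02 + 0.10 * X + rnorm(n, sd = 0.5)                  # excess returns

fit <- lm(r ~ X)
summary(fit)$coefficients      # a, b with std. errors and t-stats
summary(fit)$adj.r.squared     # in-sample adjusted R^2
```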

In-sample tests — properties

  • In-sample OLS estimators have high power (when the model specification is valid).
  • Results often depend heavily on sample period (start and end dates).
  • Inference is tricky, especially with multi-period excess returns and persistent predictors.

Welch & Goyal (2008) — bootstrap inference

  • Bootstrap procedure in Welch and Goyal (2008) follows Mark (1995) and Kilian (1999).
  • Construct 10 000 bootstrapped time series by drawing residuals with replacement.
  • Initial observation: pick one date from the actual data at random.
  • The procedure preserves the autocorrelation structure of the predictor.
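A stripped-down version of such a residual bootstrap, simplified relative to WG08 (they draw return and predictor residuals jointly and use 10 000 replications; here the draws are independent and fewer, and all data are simulated):

```r
# Simplified residual bootstrap under the null of no predictability
set.seed(1)
n <- 200
X <- as.numeric(arima.sim(model = list(ar = 0.9), n = n))  # persistent predictor
r <- rnorm(n)                                              # returns (no signal)
t_obs <- coef(summary(lm(r[-1] ~ X[-n])))[2, "t value"]    # observed t-stat

ar1 <- lm(X[-1] ~ X[-n])                  # AR(1) model for the predictor
e_x <- resid(ar1)                         # predictor residuals
e_r <- r - mean(r)                        # return residuals under the null

t_boot <- replicate(1000, {
  xb <- numeric(n)
  xb[1] <- X[sample(n, 1)]                # random starting observation
  shocks <- sample(e_x, n - 1, replace = TRUE)
  for (t in 2:n)
    xb[t] <- coef(ar1)[1] + coef(ar1)[2] * xb[t - 1] + shocks[t - 1]
  rb <- mean(r) + sample(e_r, n, replace = TRUE)   # returns with no predictability
  coef(summary(lm(rb[-1] ~ xb[-n])))[2, "t value"]
})
mean(abs(t_boot) >= abs(t_obs))           # bootstrap p-value
```

Rebuilding the predictor from its fitted AR(1) is what preserves its autocorrelation structure in each bootstrap draw.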

In-sample results — Welch and Goyal (2008)

In-sample adjusted \(R^2\) (in %) over the full sample — significant predictors (* p<0.10, ** p<0.05, *** p<0.01):

Abbrev.   Predictor               Sample       Adj. R² (%)
b/m       Book-to-market          1921–2005     3.20*
i/k       Investment / capital    1947–2005     6.63**
ntis      Net equity expansion    1927–2005     8.15***
eqis      Pct equity issuing      1927–2005     9.15***
all       Kitchen sink            1927–2005    13.81**

Many other predictors have negative adjusted R² — see Welch and Goyal (2008).

Out-of-sample tests — overview

  • OOS = robustness check for in-sample estimation.
  • Predictive performance is evaluated on data outside the training window.
  • Walk-forward: estimate parameters \(\hat a_{t-1}, \hat b_{t-1}\) using data through \(t-1\); forecast \(\hat r_t = \hat a_{t-1} + \hat b_{t-1} X_{t-1}\); advance one step; repeat.
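The walk-forward loop above can be sketched in a few lines of R (data simulated; `burn_in` is our choice of initial estimation window):

```r
# Walk-forward (expanding window) out-of-sample forecasts
set.seed(7)
n <- 200; burn_in <- 60
X <- as.numeric(arima.sim(model = list(ar = 0.9), n = n))
r <- c(rnorm(1, sd = 0.5),
       0.02 + 0.10 * X[1:(n - 1)] + rnorm(n - 1, sd = 0.5))  # r_t driven by X_{t-1}

r_hat <- r_bar <- rep(NA_real_, n)
for (t in (burn_in + 1):n) {
  fit <- lm(r[2:(t - 1)] ~ X[1:(t - 2)])             # estimate a, b through t-1
  r_hat[t] <- coef(fit)[1] + coef(fit)[2] * X[t - 1] # forecast for period t
  r_bar[t] <- mean(r[1:(t - 1)])                     # prevailing-mean benchmark
}
head(cbind(r, r_hat, r_bar)[(burn_in + 1):n, ])
```

Only data available at \(t-1\) enter each estimate, so no forecast peeks into the future.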

Out-of-sample R²

Compare model forecast \(\hat r_t\) vs realisation \(r_t\), and historical-mean forecast \(\bar r_t\) vs realisation:

\[ R^2_{OS} \;=\; 1 \;-\; \dfrac{\sum_{t=1}^T (r_t - \hat r_t)^2}{\sum_{t=1}^T (r_t - \bar r_t)^2} \]

A positive \(R^2_{OS}\) means the model has lower prediction error than the historical mean.
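The formula translates directly into R. A sketch with toy numbers (in a real walk-forward exercise, \(\bar r_t\) would be the expanding prevailing mean, not the full-sample mean used here for brevity):

```r
# Out-of-sample R^2: model forecasts vs the historical-mean benchmark
r2_os <- function(r, r_hat, r_bar) {
  ok <- !is.na(r_hat)                        # drop the burn-in period
  1 - sum((r[ok] - r_hat[ok])^2) / sum((r[ok] - r_bar[ok])^2)
}

r     <- c(0.02, -0.01, 0.03, 0.00)          # realisations
r_hat <- c(0.015, -0.005, 0.025, 0.005)      # model forecasts
r_bar <- rep(mean(r), 4)                     # mean forecast (illustration only)
r2_os(r, r_hat, r_bar)                       # positive: model beats the mean
```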

Out-of-sample results — Welch and Goyal (2008)

In WG08, only one predictor (eqis) remains significant out-of-sample:

Abbrev.   Predictor               Adj. R²_OS (%)
eqis      Pct equity issuing       2.04**

All other predictors deliver negative OOS R² — including the kitchen-sink combination at −139.03.

Plots — IS vs OOS R² over time

  • IS line: cumulative squared demeaned equity premium − cumulative squared regression residual.
  • OOS line: cumulative squared error of the prevailing mean − cumulative squared error of the predictor’s regression.
  • Line rising ⇒ predictor gaining forecasting ability.
  • Line falling ⇒ historical mean predicts better.

Example — Book-to-market

Positive IS, but negative OOS — fails the robustness check.

  • IS prediction (dotted line) trends positive — the predictor looks useful in-sample.
  • OOS prediction (solid black line) collapses post-1974 — the predictor loses forecasting power.
  • Lesson: in-sample alpha alone is not enough. The same logic applies to your prediction-market indicators.

Useful predictors — requirements

A predictor is useful if it shows:

  1. Both significant IS and reasonably good OOS performance over the entire sample.
  2. A generally upward drift (irregular is fine).
  3. Drift in more than one short or unusual sample period (not just two years around the Oil Shock).
  4. Drift that remains positive over recent decades — predictors often lose forecasting power after publication.

P-hacking / data mining

  • Multiple-testing fallacy — run enough backtests on a single dataset and some strategy will clear any pre-specified significance threshold by pure chance.
  • Solutions:
    • Paper performance — track strategy returns publicly online; this makes it costly to run many strategies in parallel and report only the winner.
    • Real-money performance — gold standard; serious investors require ≥ 3 years of live, public performance (managed account or fund).

When you back-test five indicators and one “wins”, ask yourself how many silent losers that result conceals.
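The fallacy is easy to demonstrate by simulation: regress one pure-noise return series on many pure-noise "indicators" and count the spurious hits (all names and numbers below are illustrative):

```r
# Multiple testing in miniature: 50 useless indicators, one return series
set.seed(123)
n_obs <- 250
r <- rnorm(n_obs)                            # returns with nothing to find
p_values <- replicate(50, {
  x <- rnorm(n_obs)                          # a random, useless indicator
  coef(summary(lm(r ~ x)))[2, "Pr(>|t|)"]
})
sum(p_values < 0.05)                         # a few spurious "significant" hits
min(p_values)
```

At a 5% threshold you should expect roughly 1 in 20 useless indicators to look significant, which is exactly why a single surviving backtest proves little.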

1.4 Conclusion of Lecture 1

  • Course at a glance (1/2)
  • Course at a glance (2/2)
  • Further reading
  • Prepare before next lecture
  • See you next time
  • References

Course at a glance (recap)

The week-by-week overview is unchanged from Section 1.1 (Course at a glance 1/2 and 2/2): foundations in Weeks 1–4, the prediction-markets application and project briefing in Week 5, and final presentations on 1 July 2026.

Further reading

  • James et al. (2021) — free PDF + exercises at https://www.statlearning.com.
  • Welch and Goyal (2008) — the canonical IS-vs-OOS empirical benchmark we’ll keep coming back to.
  • Mark (1995) and Kilian (1999) — long-horizon regressions and the bootstrap procedure WG08 build on.

Prepare before next lecture

  1. Install R and RStudio from https://posit.co/download/rstudio-desktop.
  2. Make sure your installation works — open RStudio and run 1 + 1 in the Console.
  3. Bring a laptop to Lecture 2; we’ll do live coding.

See you next time

Reminder

  • Sign up on the course Moodle page so you receive announcements and the dataset link before Lecture 5.
  • Lecture 2 (22 April 2026): Introduction to R — RStudio, variables, vectors, data frames, live coding.

References

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2021. An Introduction to Statistical Learning with Applications in R. 2nd ed. New York, NY: Springer. https://www.statlearning.com/.
Kilian, Lutz. 1999. “Exchange Rates and Monetary Fundamentals: What Do We Learn from Long-Horizon Regressions?” Journal of Applied Econometrics 14 (5): 491–510. https://doi.org/10.1002/(SICI)1099-1255(199909/10)14:5<491::AID-JAE527>3.0.CO;2-D.
Mark, Nelson C. 1995. “Exchange Rates and Fundamentals: Evidence on Long-Horizon Predictability.” American Economic Review 85 (1): 201–18.
Welch, Ivo, and Amit Goyal. 2008. “A Comprehensive Look at the Empirical Performance of Equity Premium Prediction.” Review of Financial Studies 21 (4): 1455–1508. https://doi.org/10.1093/rfs/hhm014.