About this online course

This online course introduces statistical and machine learning techniques and their applications to empirical asset pricing. The materials cover classical problems, such as estimating risk premia and the stochastic discount factor, constructing mean-variance or factor-mimicking portfolios, testing alphas, and predicting returns, but cast them in a modern big-data setting in which variable selection and dimension reduction techniques are necessary. The course also goes beyond structured datasets, importing natural language processing and image recognition techniques from computer science to extract new insights from unstructured text and image data.

Numerical analysis can be conducted in any programming language. Python, R, and Matlab are encouraged.

Additional details

There is no required text for this course. Four recommended references are:

  • Asset Pricing: Revised Edition by John H. Cochrane,
  • The Econometrics of Financial Markets by John Y. Campbell, Andrew W. Lo, and A. Craig MacKinlay,
  • The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani, and Jerome Friedman, and
  • Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.

Additionally, the following papers will be assigned for reading.

  1. “Inferential Theory for Factor Models of Large Dimensions,” by Jushan Bai, Econometrica, Volume 71, Issue 1, 2003, 135-171.
  2. “Asset Pricing with Omitted Factors,” by Stefano Giglio and Dacheng Xiu, 2019.
  3. “Thousands of Alpha Tests,” by Stefano Giglio, Yuan Liao, and Dacheng Xiu, 2020.
  4. “Characteristics are Covariances: A Unified Model of Risk and Return,” by Bryan Kelly, Seth Pruitt, and Yinan Su, Journal of Financial Economics, Volume 134, Issue 3, 2019, 501-524.
  5. “Market Expectations in the Cross-section of Present Values,” by Bryan Kelly and Seth Pruitt, Journal of Finance, Volume 68, Issue 5, 2013, 1721-1756.
  6. “Shrinking the Cross-section,” by Serhiy Kozak, Stefan Nagel, and Shrihari Santosh, Journal of Financial Economics, Volume 135, Issue 2, 2020, 272-292.
  7. “Double/Debiased Machine Learning for Treatment and Structural Parameters,” by V. Chernozhukov, D. Chetverikov, M. Demirer, E. Duflo, C. Hansen, W. Newey, and J. Robins, Econometrics Journal, Volume 21, Issue 1, 2018, 1-68.
  8. “Taming the Factor Zoo: A Test of New Factors,” by Gavin Feng, Stefano Giglio, and Dacheng Xiu, Journal of Finance, Volume 75, Issue 3, 2020, 1327-1370.
  9. “Empirical Asset Pricing via Machine Learning,” by Shihao Gu, Bryan Kelly, and Dacheng Xiu, Review of Financial Studies, Volume 33, Issue 5, 2020, 2223-2273.
  10. “Autoencoder Asset Pricing Models,” by Shihao Gu, Bryan Kelly, and Dacheng Xiu, Journal of Econometrics, forthcoming, 2020.
  11. “Textual Analysis in Accounting and Finance: A Survey,” by Tim Loughran and Bill McDonald, Journal of Accounting Research, Volume 54, Issue 4, 2016, 1187-1230.
  12. “Predicting Returns with Text Data,” by Tracy Ke, Bryan Kelly, and Dacheng Xiu, 2020.
  13. “(Re-)Imag(ine)ing Pricing Trends,” by Jingwen Jiang, Bryan Kelly, and Dacheng Xiu, 2020.

Fees & eligibility criteria

There is no fee for participating in the course; however, SFI PhD students have priority. Participation for all other students is contingent on availability. The registration deadline is 11.10.2020 at 23:59 CEST. Confirmation of participation will be sent to admitted participants on 13.10.2020.


  • Session 1 - 19.10.2020 from 15:00 to 18:00 CEST
  • Session 2 - 20.10.2020 from 15:00 to 18:00 CEST
  • Session 3 - 26.10.2020 from 15:00 to 18:00 CET
  • Session 4 - 27.10.2020 from 15:00 to 18:00 CET
  • Session 5 - 02.11.2020 from 15:00 to 18:00 CET


Sandy Haldimann
University of Geneva

Bd du Pont d'Arve 40
1211 Geneva 4