Juri Marcucci
Bank of Italy
Via Nazionale 91, 00184 Rome, Italy


  • Wolfgang Härdle (University of Berlin)
  • Cathy Yi-Hsuan Chen (University of Berlin)

Application Deadline is MAY 31

Deadline for FEE PAYMENT is June 23rd

Course outline

TASA Learning Objectives

Since most information exists as language data, the TASA course presents tools and concepts for text data, with a strong focus on modeling the econometric effects of language or, more specifically, sentiment. It presents decision analytics in a way that is understandable for non-mathematicians and practitioners who are confronted with day-to-day number crunching in statistical textual analysis. The course details the development of textual analysis and sentiment projection, and compares their pros and cons. The TASA course equips the practitioner with ready-to-use practical tools for these purposes and applications. All practical examples may be recalculated and modified: software and Quantlets are in


Schedule of the course:

Data are everywhere, and the ubiquitous availability of huge amounts of data makes it necessary to develop smart data analytics. Out of the plethora of tools available across many scientific disciplines, this course offers the common data analyst easy access to all levels of analysis without requiring deep computer programming knowledge. Python is becoming the lingua franca and can easily be applied to the analysis of textual data. TASA provides a wide variety of exercises, with step-by-step demonstrations in Python or R for web scraping and for Natural Language Processing combined with statistical learning methods.

  • Basic concepts
  • Data Management
  • Structuring Data elements
  • Natural Language Processing
  • Stemming, lemmatizing
  • DTM Dynamic Topic Modeling
  • Python tools for text mining
  • Text mining in Quantitative Finance
  • Applications & Empirics
  • Cluster Analysis and Classification
  • Support Vector Machine
  • CRIX a CRypto currency IndeX
  • Unsupervised projection: lexicon-based
  • Supervised projection: sentence-based
  • News sentiment extraction
  • Cryptocurrency-specific lexicon and sentiment projection
  • Financial Risk Meter
  • DDI Networks Topology
  • Q3 D3 LSA
  • Fraud and scam detection
  • Options on cryptos
  • Adaptive weight clustering
  • Machine learning in Economics
  • Deep Learning of Forecasts
  • Complexity in Banking, Scores and Networks



  • Franke J, Härdle WK, Hafner C (2015) Statistics of Financial Markets: an Introduction. 4th ed., Springer Verlag, Heidelberg. ISBN: 978-3-642-54538-2

  • Chen C YH, Härdle WK, Overbeck L (2017) Applied Quantitative Finance. 3rd extended ed., Springer Verlag, Heidelberg.

  • Härdle WK, Simar L (2015) Applied Multivariate Statistical Analysis. 4th ed., Springer Verlag, Heidelberg. ISBN: 978-3-662-45170-0

  • Härdle WK, Okhrin O, Okhrin Y (2017) Basics of Computational Statistics, Springer Verlag, Heidelberg.


All examples are presented in R or Python.


The Module will be held in the Bank of Italy's Scuola di Automazione per Dirigenti Bancari (SADiBa), via San Marco n.54, Perugia. Participants will be accommodated at SADiBa.

Fees and Enrollment

  • Students: 850€
  • University staff: 1000€
  • Others: 2500€

The fee includes full-board accommodation starting from Sunday.