NLP Economics: Fundamentals to Frontier

Perugia, 30 June - 4 July 2025

 

Director

Juri Marcucci
Bank of Italy
Via Nazionale 91, 00184 Rome, Italy
Email: juri.marcucci@bancaditalia.it, juri.marcucci@gmail.com

 

Lecturers

 

Requirements

Prior knowledge of statistics and econometrics, as well as some programming skills, is required.

 

Course description

NLP Economics: Fundamentals to Frontier provides a comprehensive exploration of advanced natural language processing (NLP) techniques that are increasingly central to economic research, policy analysis, and business applications. Students will learn how to handle the complexities of high-dimensional, unstructured data, moving from dictionary-based methods and bag-of-words approaches to cutting-edge large language models. Along the way, they will explore the frontier of NLP Economics, studying how state-of-the-art text analysis can inform both econometric modeling and real-world decision-making. Through lectures, interactive discussions, and hands-on exercises, students will engage deeply with existing literature and practical techniques, building the skills to use text-based data as a powerful new lens on economic questions.

Over the span of five intensive sessions, participants will begin by examining fundamental concepts of text measurement and representation—framing NLP as a dimensionality reduction challenge. They will learn various methods for encoding text and measuring similarities, comparing approaches that range from word count vectors to LLM-based embeddings. Students will then develop classification models for sentiment, topic models, and more, while also delving into best practices for model selection and external validation. In the final module, the course connects NLP techniques to broader economic theory and research design, highlighting how machine learning methods, causal inference, and structural models can be integrated with AI-driven text analysis. By the end of the program, students will not only master the technical fundamentals but also be equipped to incorporate NLP methods and tools effectively into their own economics research and professional projects.

Disclaimer: All views presented in this course are those of the instructors and not of the Federal Reserve Bank of Philadelphia or the Federal Reserve System.

Lecture notes, slides, code, and data will be provided.

 

Schedule of the course (preliminary):

Day 1: Introduction

● Philosophy of NLP: How to measure language?

● NLP as a dimension reduction challenge: from words to numbers

● Fundamentals to frontier: methods from dictionary approaches to bag-of-words models to large language models

● NLP Economics as an application of NLP in economics research

● NLP Economics in the meta sense: prudent LLM integration into the economics research process

 

Day 2: Encoding Text and Measuring Similarity

● Text vectors: word counts and LLM embeddings

● Distances in text-space: cosine similarity

● Model selection: narrowing down the list of hundreds of embedding models
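
As a rough illustration of the Day 2 topics, the sketch below encodes two short texts as word count vectors and compares them with cosine similarity. It is not taken from the course materials: the function names `word_counts` and `cosine_similarity`, the simple letters-only tokenizer, and the example sentences are all assumptions made for illustration. LLM embeddings would replace the count vectors with dense vectors, but the similarity measure can stay the same.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text, keep runs of letters a-z as tokens, and count them.

    Hypothetical helper: a deliberately simple tokenizer for illustration only.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (Counters)."""
    # Only words present in both documents contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty documents
    return dot / (norm_a * norm_b)


# Made-up example sentences, not course data.
doc_a = word_counts("Inflation rose")
doc_b = word_counts("Inflation fell")
doc_c = word_counts("Unemployment declined")

sim_ab = cosine_similarity(doc_a, doc_b)  # one shared word out of two
sim_ac = cosine_similarity(doc_a, doc_c)  # no shared words
print(round(sim_ab, 3), round(sim_ac, 3))
```

Because count vectors have no negative entries, the similarity here lies between 0 (no shared vocabulary) and 1 (identical word proportions). Note that "rose" and "fell" count only as a mismatch, which is one reason to compare count-based vectors with embedding-based ones.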

 

Day 3: Text Classification

● Data-driven assignment of labels to documents and words

● Types of categories: sentiment, topics and more

● Model selection: horse races and ensembles

 

Day 4: Validation and Model Selection

● External validation as the central benchmark for NLP Economics

● Recommendations for human validation, theory validation, and robustness checks

 

Day 5: NLP Economics Integration

● Economic theory and modeling with text

    ○ Text and causal inference

    ○ Text and structural models

    ○ Theory-informed text analysis

● Economics research process with text and AI

    ○ AI and tasks of the researcher

    ○ Tools for augmenting and streamlining economics research

 

Venue and timetables

The Module will be held at the Bank of Italy's Scuola di Automazione per Dirigenti Bancari (S.A.Di.Ba.), via San Marco n. 54, Perugia. Participants will be accommodated at S.A.Di.Ba. (if rooms at the Centre are limited, they will be accommodated in local hotels).
Lectures and tutorials will be in English, with the following schedule:

  • Monday to Friday: lectures 9:00-12:30, 14:30-18:00

 

Contacts