Using Large Language Models for empirical research in Social Science

General information

10-hour course by Gaël Le Mens (UPF-BSM and BSE)

Schedule: 6 & 7 June (10.00am - 1.00pm / 2.00pm - 4.00pm)

The emergence of ChatGPT and other large language models (LLMs) has transformed the field of natural language processing (NLP) and made these tools increasingly valuable for research in the social and political sciences. This intensive summer course gives participants a thorough understanding of how LLMs can be used effectively as research tools in these disciplines. It combines theoretical and practical learning, starting with the foundational concepts behind LLMs, such as their architecture and training processes. Practical sessions then demonstrate how to apply LLMs to the analysis of text data and compare their performance with traditional methods such as human coding and conventional bag-of-words NLP techniques. The course also addresses the challenges of using LLMs in empirical research, in particular their opacity and the consistency of their outputs. Basic familiarity with Python is recommended for the hands-on work of running models and using LLM APIs (such as the OpenAI API or the Mistral AI API). By the end of the course, attendees will have acquired hands-on skills in applying LLMs to social and political science research and developed a nuanced understanding of both the potential benefits and the limitations of these tools. The course is intended for researchers, academics, graduate students, and professionals interested in using LLMs to advance their own research agendas.

Gaël Le Mens is Professor at Universitat Pompeu Fabra and Affiliated Professor of the BSE. His research focuses on how the social environment and learning processes affect inference, judgment and valuation. His current theoretical work develops models of the influence of categories on inference and valuation, models of how people learn from feedback on Twitter, and models of the dynamics of collective valuation (such as online review scores) and popularity. Professor Le Mens tests the predictions of his models using a variety of methods, including the analysis of text data with deep learning (BERT) and a combination of online and laboratory experiments.