Topic modelling and RAG chatbot for the parliamentary library
As part of a research project, we are developing an application that makes parliamentary business easier to find with the help of a retrieval-augmented generation (RAG) chatbot. We are also investigating whether parliamentary business can be classified automatically.
Factsheet
- Lead school: Business School
- Institute(s): Institute for Public Sector Transformation
- Research unit(s): Digital Sustainability Lab
- Strategic thematic field: Thematic field "Humane Digital Transformation"
- Duration (planned): 01.04.2024 - 31.10.2024
- Project management: Prof. Dr. Marcel Gygli
- Head of project: Prof. Dr. Marcel Gygli
- Project staff: Prof. Dr. Marcel Gygli, Siddhartha Singh, Veton Matoshi
- Partner: Parlamentsbibliothek
- Keywords: RAG, Topic Model, Chatbot, Parliamentary Library
Situation
The Parliamentary Library is responsible for managing parliamentary business on the official business platform Curia Vista. For each new item of business, several subject areas must be entered, a step that is currently done entirely by hand. Based on this data, parliamentary librarians must also answer queries from parliamentarians (e.g. “How many items of business on the topic of tax evasion have there been in the last 5 years?”). This is done with search systems that are purely keyword-based and can therefore take relevant context into account only to a limited extent.
Course of action
In our work, we examine two aspects separately. First, we create models (so-called topic models) that automatically suggest one or more topic areas for each new item of business. We build these models on open-source software. The first results of this work are already publicly available at: https://huggingface.co/spaces/rcds/SwissParlTopicModelling
Second, we are developing a RAG chatbot-based application that parliamentary librarians can use to search parliamentary business. They can ask a question in natural language and also define search criteria based on the recorded metadata. The application then searches for relevant items of business, summarizes them with a language model if desired, and returns the results to the librarians.
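To make the retrieval step more concrete, the following is a minimal sketch of how a natural-language question could be combined with metadata criteria. It is not the project's implementation. The `Business` record, the function names, and the sample data are all invented for illustration, and a simple bag-of-words similarity stands in for the embedding model that a real RAG system would typically use. The call to a language model is left out; `build_prompt` only assembles the text that would be handed to it.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Business:
    """One item of parliamentary business (hypothetical, simplified record)."""
    title: str
    text: str
    year: int
    topics: tuple


def vectorize(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, items, min_year=None, topic=None, k=3):
    """Filter by metadata first, then rank the remaining items by similarity."""
    candidates = [
        item for item in items
        if (min_year is None or item.year >= min_year)
        and (topic is None or topic in item.topics)
    ]
    query = vectorize(question)
    scored = [
        (cosine(query, vectorize(item.title + " " + item.text)), item)
        for item in candidates
    ]
    # Drop items that share no words with the question, then sort best first.
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:k]]


def build_prompt(question, hits):
    """Assemble the text that would be passed to a language model."""
    context = "\n".join(f"- {hit.title} ({hit.year})" for hit in hits)
    return f"Question: {question}\nRelevant business:\n{context}"


# Invented sample data, for illustration only.
ITEMS = [
    Business("Measures against tax evasion",
             "Motion on stricter penalties for tax evasion", 2022, ("taxation",)),
    Business("Tax evasion reporting",
             "Postulate requesting a report on tax evasion", 2015, ("taxation",)),
    Business("Railway expansion",
             "Interpellation on regional railway expansion", 2023, ("transport",)),
]

hits = retrieve("tax evasion", ITEMS, min_year=2019, topic="taxation")
print(build_prompt("tax evasion", hits))
```

Applying the metadata filter before ranking keeps the similarity search restricted to items that already satisfy the librarian's explicit criteria, such as a time window or a subject area. A question like the tax evasion example above could then be answered by counting or summarizing the filtered hits.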