Topic modelling and RAG chatbot for the parliamentary library

As part of a research project, we are developing an application that makes parliamentary business easier to find with the help of a retrieval-augmented generation (RAG) chatbot. We are also investigating whether parliamentary business can be classified automatically.

Factsheet

  • Schools involved: Business School
  • Institute(s): Institute for Public Sector Transformation
  • Research unit(s): Digital Sustainability Lab
  • Strategic thematic field: "Humane Digital Transformation"
  • Duration (planned): 01.04.2024 - 31.10.2024
  • Head of project: Prof. Dr. Marcel Gygli
  • Project staff: Prof. Dr. Marcel Gygli, Siddhartha Singh, Veton Matoshi
  • Partner: Parlamentsbibliothek (Parliamentary Library)
  • Keywords: RAG, Topic Model, Chatbot, Parliamentary Library

Situation

The Parliamentary Library is responsible for managing parliamentary business on Curia Vista, the official business platform. Several subject areas must be entered for each new transaction, a step that is currently done entirely by hand. Parliamentary librarians also use this data to answer queries from parliamentarians (e.g. “How many transactions on the topic of tax evasion have there been in the last five years?”). The search systems available for this are purely keyword-based and can therefore take relevant context into account only to a limited extent.

Course of action

In our work, we examine two aspects separately. First, we build models (so-called topic models) that automatically suggest one or more topic areas for new transactions. These models are based on open-source software. The first results of this work are already publicly available (https://huggingface.co/spaces/rcds/SwissParlTopicModelling).

Second, we are developing a RAG chatbot-based application that parliamentary librarians can use to search parliamentary business. Librarians ask a question in natural language and can also set search criteria based on the recorded metadata. The application then retrieves the relevant transactions, summarizes them with a language model if desired, and returns them to the librarians.
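The search flow of the chatbot application (metadata criteria, then relevance ranking against the question, then an optional summary) can be illustrated with a minimal sketch. This is not the project's implementation. The `BusinessItem` record, the `search` function, and the sample entries are hypothetical, and a simple bag-of-words cosine similarity stands in for whatever retrieval model the application actually uses. The `summarize` callback is a placeholder for a language model call.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class BusinessItem:
    """Hypothetical record for one parliamentary transaction."""
    title: str
    text: str
    year: int
    topics: Tuple[str, ...]


def bag_of_words(text: str) -> Counter:
    """Lower-case word counts; a toy stand-in for a learned text embedding."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(
    question: str,
    items: List[BusinessItem],
    min_year: Optional[int] = None,
    topic: Optional[str] = None,
    top_k: int = 3,
    summarize: Optional[Callable[[List[BusinessItem]], str]] = None,
) -> Tuple[List[BusinessItem], Optional[str]]:
    """Filter by metadata, rank by similarity to the question, optionally summarize."""
    # Step 1: apply the search criteria based on recorded metadata.
    candidates = [
        it for it in items
        if (min_year is None or it.year >= min_year)
        and (topic is None or topic in it.topics)
    ]
    # Step 2: score the remaining items against the natural-language question.
    query = bag_of_words(question)
    scored = [(cosine(query, bag_of_words(it.title + " " + it.text)), it)
              for it in candidates]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    hits = [it for score, it in ranked if score > 0][:top_k]
    # Step 3: summarize only if a summarizer was supplied (optional step).
    summary = summarize(hits) if summarize is not None and hits else None
    return hits, summary


# Invented sample records, for illustration only.
ITEMS = [
    BusinessItem("Motion on tax evasion penalties",
                 "Stricter penalties for tax evasion by companies", 2022, ("taxation",)),
    BusinessItem("Postulate on rail infrastructure",
                 "Report on funding of rail infrastructure", 2023, ("transport",)),
    BusinessItem("Interpellation on tax evasion",
                 "Questions on tax evasion and offshore accounts", 2015, ("taxation",)),
]

hits, summary = search("tax evasion", ITEMS, min_year=2020)
for item in hits:
    print(item.year, item.title)
```

Applying the metadata filter before ranking keeps the year and topic constraints exact, while the similarity score only orders what remains. In a real system the word-count similarity would typically be replaced by embedding-based retrieval, which is what allows context beyond literal keyword matches to be considered.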

This project contributes to the following SDGs

  • 9: Industry, innovation and infrastructure
  • 16: Peace, justice and strong institutions