Sorbobot

Overview

Sorbobot helps researchers, managers, and partners discover expertise within Sorbonne University. Given a research topic or question, the system surfaces relevant researchers, labs, and publications across the university’s broad disciplinary landscape.

Partners

Institution   Role
SUMMIT        Engineering team, infrastructure, data owner
SCAI          ML/NLP expertise, LLM pipeline, internship supervision

My Role

Contributed NLP and LLM architecture expertise (RAG pipeline design), mentored junior engineers, supervised one Master’s intern, and coordinated with a freelance senior data engineer.

Technical Approach

  • Data ingestion: researcher profiles, lab pages, publication databases (HAL)
  • Indexing: embedding-based vector index over researcher and publication content
  • RAG pipeline: user queries matched against the index; retrieved context passed to an LLM for synthesis
  • Entity linking: connecting researcher, lab, and topic mentions across heterogeneous sources
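
The indexing and retrieval steps above can be illustrated with a minimal, dependency-free sketch. It is not the project's actual code: the `Document` type, the sample records, and the function names are hypothetical, and a toy bag-of-words vector stands in for a real embedding model. The real system would presumably use an embedding model and a vector store through the stack listed below, and would send the assembled prompt to an LLM API, a step left out here.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical indexed record: a researcher profile or a publication."""
    doc_id: str
    kind: str  # "researcher" or "publication"
    text: str


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(docs):
    """Pair each document with its vector (an in-memory 'vector index')."""
    return [(doc, embed(doc.text)) for doc in docs]


def retrieve(index, query, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(query, context):
    """Assemble the retrieved context and the question into an LLM prompt."""
    lines = [f"[{d.kind}:{d.doc_id}] {d.text}" for d in context]
    return (
        "Context:\n" + "\n".join(lines)
        + f"\n\nQuestion: {query}\nAnswer citing the context IDs."
    )


# Invented sample data, for illustration only.
DOCS = [
    Document("r1", "researcher", "Works on protein folding and structural biology"),
    Document("p1", "publication", "Neural machine translation for low-resource languages"),
    Document("r2", "researcher", "Studies ocean circulation and climate models"),
]

index = build_index(DOCS)
top = retrieve(index, "climate models of ocean circulation", k=1)
prompt = build_prompt("Who works on ocean climate models?", top)
```

Tagging each context line with its kind and ID lets the synthesized answer point back to specific researchers or publications, which matters for an expertise-discovery tool where users need to verify who is being recommended.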

Stack: Python · LangChain · LLM APIs · PostgreSQL

Status

🟢 SCAI phase completed — prototype built and validated. Project continues under SUMMIT’s ownership.