The ReQAP Method

Demo coming soon!
Question answering over mixed sources, such as text and tables, has been advanced by verbalizing all contents and encoding them with a language model. A prominent case of such heterogeneous data is personal information: user devices log vast amounts of data every day, such as calendar entries, workout statistics, shopping records, streaming history, and more. Information needs range from simple look-ups to queries of an analytical nature. The challenge is to provide humans with convenient access at a small footprint, so that all personal data stays on the user's devices. We present ReQAP, a novel method that creates an executable operator tree for a given question via recursive decomposition. The operators are designed to enable seamless integration of structured and unstructured sources, and executing the operator tree yields a traceable answer.

Code

GitHub link to ReQAP code
Directly download ReQAP code

Example

ReQAP operates in two stages: (i) the question understanding and decomposition (QUD) stage, which constructs an executable operator tree, and (ii) the operator tree execution (OTX) stage, which derives the answer from the relevant user events. Further details can be found in our paper.
The figure below visualizes the answering process of ReQAP for the question "How often did I eat Italian food after playing football?":
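To make the two stages concrete, here is a minimal sketch of an operator tree for the example question above. The operator names (Retrieve, JoinAfter, Count), the toy event schema, and the keyword matching are all hypothetical simplifications for illustration; they are not the actual ReQAP operators or implementation.

```python
from dataclasses import dataclass
from datetime import datetime

# Toy event log standing in for heterogeneous personal data
# (hypothetical schema, chosen for this sketch only).
EVENTS = [
    {"text": "Italian food", "time": datetime(2024, 5, 1, 20)},
    {"text": "playing football", "time": datetime(2024, 5, 1, 18)},
    {"text": "Italian food", "time": datetime(2024, 5, 3, 13)},
    {"text": "playing football", "time": datetime(2024, 5, 4, 17)},
]

@dataclass
class Retrieve:
    """Leaf operator: keyword match over verbalized events."""
    query: str
    def execute(self, events):
        return [e for e in events if self.query in e["text"]]

@dataclass
class JoinAfter:
    """Keep left-side events that occur after some right-side event."""
    left: object
    right: object
    def execute(self, events):
        rights = self.right.execute(events)
        return [l for l in self.left.execute(events)
                if any(l["time"] > r["time"] for r in rights)]

@dataclass
class Count:
    """Root operator: aggregate the matching events into a number."""
    child: object
    def execute(self, events):
        return len(self.child.execute(events))

# QUD would decompose the question into a tree like this;
# OTX then executes it bottom-up over the user's events.
tree = Count(JoinAfter(Retrieve("Italian food"),
                       Retrieve("playing football")))
print(tree.execute(EVENTS))  # prints 2
```

Executing the tree bottom-up also makes the answer traceable: each intermediate operator result identifies the concrete user events that contributed to the final count.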


The PerQA Benchmark

We constructed the PerQA benchmark for training and evaluating methods for QA over heterogeneous personal data. PerQA synthesizes realistic user data and questions based on handcrafted personas, and comprises 3,500 complex questions and more than 40,000 events per persona. For constructing the benchmark, we used LLMs to verbalize user events into realistic unstructured texts, such as e-mails, social media posts, and calendar entries.

Download PerQA

Train Set (12 personas / ~1,200 questions each)
Dev Set (2 personas / ~170 questions each)
Test Set (6 personas / ~600 questions each)
The PerQA benchmark is licensed under a Creative Commons Attribution 4.0 International License.

The PerQA Leaderboard

Method                                        Hit@1   Relaxed Hit@1
ReQAP (GPT4o), Christmann and Weikum '25      0.386   0.52
ReQAP (SFT), Christmann and Weikum '25        0.380   0.53
CodeGen (GPT4o)                               0.319   0.44
CodeGen (SFT)                                 0.313   0.47
RAG (GPT4o)                                   0.149   0.20
RAG (SFT)                                     0.029   0.06
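For readers unfamiliar with the metrics, the sketch below shows how a Hit@1-style evaluation is typically computed. The strict variant here uses exact string match and the relaxed variant uses substring containment; these matching rules are illustrative assumptions, not the paper's exact definitions.

```python
def hit_at_1(pred: str, gold: str) -> bool:
    """Strict Hit@1: the top-ranked answer equals the gold answer."""
    return pred.strip().lower() == gold.strip().lower()

def relaxed_hit_at_1(pred: str, gold: str) -> bool:
    """Relaxed variant (illustrative assumption): credit the
    prediction if it contains the gold answer as a substring."""
    return gold.strip().lower() in pred.strip().lower()

# Hypothetical (prediction, gold) pairs, for illustration only.
pairs = [("3 times", "3"), ("Rome", "Rome"), ("5", "4")]
hit = sum(hit_at_1(p, g) for p, g in pairs) / len(pairs)
relaxed = sum(relaxed_hit_at_1(p, g) for p, g in pairs) / len(pairs)
```

By construction, relaxed Hit@1 is always at least as high as strict Hit@1, which matches the pattern in the leaderboard above.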

Real User Questions

In our user study with local students, we collected 2,005 real information needs from humans. You can download these questions here: User Questions (2,005 questions)

Paper

"Recursive Question Understanding for Complex Question Answering over Heterogeneous Personal Data",
Philipp Christmann and Gerhard Weikum. In ACL 2025 Findings.
[Preprint coming soon] [Code]

Contact

For feedback and clarifications, please contact:
To learn more about our group, please visit our website: https://qa.mpi-inf.mpg.de/.