Evaluating Decision-Making Generalization in RAG Agent Architectures
Mehar Shienh, Evan Dennison, Jordan Leis, Devon Kisob, Jennifer Yu, Yalda Nikookar, Madhav Malhotra
CUCAI 2025 Proceedings • 2025
Published 2025/03/26
Abstract
This paper explores large language models (LLMs) as generalized decision-making assistants. We propose an assessment framework in which retrieval-augmented generation (RAG) architectures are compared in simulated environments. By comparing objective win rates in games such as Monopoly and Werewolf, we assess the efficacy of architectural options such as reflection and multi-agent roles. We then apply the best-performing architectures to the real-life context of political analysis. With this method, we find that the RAG architectures explored do not generalize across decision-making contexts.