CUCAI 2025 Archive

Evaluating Decision-Making Generalization in RAG Agent Architectures

Mehar Shienh, Evan Dennison, Jordan Leis, Devon Kisob, Jennifer Yu, Yalda Nikookar, Madhav Malhotra

CUCAI 2025 Proceedings, 2025

Published 2025/03/26

Abstract

This paper explores large language models (LLMs) as generalized decision-making assistants. We propose an assessment framework in which retrieval-augmented generation (RAG) architectures are compared in simulated environments. By comparing objective win rates in games such as Monopoly and Werewolf, we assess the efficacy of architectural options such as reflection or multi-agent roles. We then apply the best-performing architectures to the real-life context of political analysis. With this method, we find that the RAG architectures explored do not generalize across decision-making contexts.
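
The win-rate comparison described in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the record layout `(architecture, environment, won)`, the function names `win_rates` and `best_per_environment`, and the toy records are all hypothetical and are not results from the paper.

```python
from collections import defaultdict


def win_rates(records):
    """Return {(architecture, environment): win rate}.

    `records` is an iterable of (architecture, environment, won) tuples.
    This record layout is an assumption made for illustration.
    """
    wins = defaultdict(int)
    games = defaultdict(int)
    for architecture, environment, won in records:
        key = (architecture, environment)
        games[key] += 1
        wins[key] += int(won)
    return {key: wins[key] / games[key] for key in games}


def best_per_environment(rates):
    """Return {environment: architecture with the highest win rate}.

    Ties keep whichever architecture was seen first.
    """
    best = {}
    for (architecture, environment), rate in rates.items():
        current = best.get(environment)
        if current is None or rate > rates[(current, environment)]:
            best[environment] = architecture
    return best


# Hypothetical toy records, NOT data from the paper.
toy_records = [
    ("reflection", "monopoly", True),
    ("reflection", "monopoly", False),
    ("multi_agent", "monopoly", False),
    ("multi_agent", "monopoly", False),
    ("reflection", "werewolf", False),
    ("reflection", "werewolf", False),
    ("multi_agent", "werewolf", True),
    ("multi_agent", "werewolf", True),
]

rates = win_rates(toy_records)
best = best_per_environment(rates)

# If the best architecture differs between environments, the ranking
# does not carry over from one decision-making context to another.
ranking_transfers = len(set(best.values())) == 1
```

In this toy setup a different architecture wins in each environment, so `ranking_transfers` is `False`. That is one simple way to operationalize a lack of generalization across contexts; the paper's own criterion may differ.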