CUCAI 2026

Reinforcement Learning Ensemble for Dynamic Portfolio Allocation

Arya Farivar

CUCAI 2026 Proceedings - 2026

Published 2026/03/07

Abstract

Reinforcement learning offers a promising framework for dynamic portfolio allocation, yet individual agents are often brittle across changing market regimes. This work presents an adaptive ensemble strategy that trains three deep reinforcement learning agents (Proximal Policy Optimization, Advantage Actor-Critic, and Twin Delayed Deep Deterministic Policy Gradient) on a diversified universe of nine exchange-traded funds spanning U.S. and international equities, fixed income, commodities, and real estate. At each monthly rebalance point, a selector computes every agent's rolling 20-day Sharpe ratio and assigns portfolio control to the best-performing agent. Over an out-of-sample test period from October 2022 to December 2025, the ensemble achieved an annualized Sharpe ratio of 1.14 with a maximum drawdown of −10.9%, roughly half that of a passive S&P 500 benchmark. Bootstrap confidence intervals indicate that the ensemble's risk-adjusted performance is marginally statistically significant.
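
The selection rule described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `rolling_sharpe` and `select_agent`, the zero risk-free rate, the 252-day annualization factor, and the toy return series are all assumptions made here for illustration.

```python
import math
from statistics import mean, stdev

TRADING_DAYS_PER_YEAR = 252  # assumed annualization factor, not stated in the abstract


def rolling_sharpe(daily_returns, window=20):
    """Annualized Sharpe ratio of the last `window` daily returns.

    Assumes a zero risk-free rate. Returns -inf when the ratio is
    undefined (fewer than two observations or zero volatility), so
    such an agent is never preferred by the selector.
    """
    recent = daily_returns[-window:]
    if len(recent) < 2:
        return float("-inf")
    sigma = stdev(recent)
    if sigma == 0.0:
        return float("-inf")
    return mean(recent) / sigma * math.sqrt(TRADING_DAYS_PER_YEAR)


def select_agent(returns_by_agent, window=20):
    """Name of the agent with the highest trailing Sharpe ratio."""
    return max(
        returns_by_agent,
        key=lambda name: rolling_sharpe(returns_by_agent[name], window),
    )


# Toy daily return histories (hypothetical numbers, not results from the paper).
history = {
    "ppo": [0.001, -0.002] * 10,  # negative mean return
    "a2c": [0.002, -0.001] * 10,  # positive mean, low volatility
    "td3": [0.004, -0.003] * 10,  # same mean as a2c, higher volatility
}

# At a monthly rebalance point, hand control to the current leader.
active_agent = select_agent(history)
print(active_agent)
```

In this toy data, `a2c` and `td3` have the same mean return, so the lower-volatility agent wins the comparison. In practice the selector would be called only at rebalance dates, with each agent's returns taken from the preceding 20 trading days.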