Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Optimizing Attention and Cognitive Control Costs Using Temporally-Layered Architectures

Devdhar Patel, Terrence Sejnowski, Hava Siegelmann

2023-05-30 · Reinforcement Learning · Continuous Control · OpenAI Gym · reinforcement-learning

Paper · PDF · Code (official)

Abstract

The current reinforcement learning framework focuses exclusively on performance, often at the expense of efficiency. In contrast, biological control achieves remarkable performance while also optimizing computational energy expenditure and decision frequency. We propose the Decision-Bounded Markov Decision Process (DB-MDP), which constrains the number of decisions and the computational energy available to agents in reinforcement learning environments. Our experiments demonstrate that existing reinforcement learning algorithms struggle in this framework, leading to failure or suboptimal performance. To address this, we introduce a biologically inspired Temporally Layered Architecture (TLA) that enables agents to manage computational costs through two layers with distinct time scales and energy requirements. TLA achieves optimal performance in decision-bounded environments; in continuous control environments, it matches state-of-the-art performance at a fraction of the compute cost. Compared to current reinforcement learning algorithms, which prioritize performance alone, our approach significantly lowers computational energy expenditure while maintaining performance. These findings establish a benchmark and pave the way for future research on energy- and time-aware control.
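The abstract's core idea, an environment that charges the agent per decision rather than per timestep, can be illustrated with a minimal sketch. This is an assumption-laden toy, not the authors' implementation: the `DecisionBoundedEnv` wrapper, its `decision_budget` parameter, the `repeat` argument, and the `ToyEnv` stand-in are all hypothetical names introduced here for illustration.

```python
# Hedged sketch of a decision-bounded environment: the episode ends once the
# agent has spent its decision budget, and repeating an action for several
# timesteps still costs only one decision (the temporal abstraction that TLA
# exploits). All names here are illustrative assumptions.

class DecisionBoundedEnv:
    """Wraps an env-like object and ends the episode when the decision
    budget is exhausted."""

    def __init__(self, env, decision_budget):
        self.env = env
        self.decision_budget = decision_budget
        self.decisions_used = 0

    def reset(self):
        self.decisions_used = 0
        return self.env.reset()

    def step(self, action, repeat=1):
        # One call = one decision, even if the action is held for
        # `repeat` timesteps.
        self.decisions_used += 1
        total_reward, done = 0.0, False
        for _ in range(repeat):
            obs, reward, done = self.env.step(action)
            total_reward += reward
            if done:
                break
        if self.decisions_used >= self.decision_budget:
            done = True  # budget exhausted
        return obs, total_reward, done


class ToyEnv:
    """Minimal stand-in environment: +1 reward per timestep, never ends."""
    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, False


env = DecisionBoundedEnv(ToyEnv(), decision_budget=3)
obs = env.reset()
done, total = False, 0.0
while not done:
    obs, r, done = env.step(0, repeat=2)  # hold each action for 2 timesteps
    total += r
print(env.decisions_used, total)  # 3 decisions buy 6 timesteps of reward
```

Holding actions lets the agent collect 6 timesteps of reward from only 3 decisions, which is why policies with high action repetition do well under a decision bound.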

Results

Task | Dataset | Metric | Value | Model
OpenAI Gym | HalfCheetah-v2 | Action Repetition | 0.1805 | TLA
OpenAI Gym | HalfCheetah-v2 | Average Decisions | 831.42 | TLA
OpenAI Gym | HalfCheetah-v2 | Mean Reward | 9571.99 | TLA
OpenAI Gym | InvertedDoublePendulum-v2 | Action Repetition | 0.7522 | TLA
OpenAI Gym | InvertedDoublePendulum-v2 | Average Decisions | 247.76 | TLA
OpenAI Gym | InvertedDoublePendulum-v2 | Mean Reward | 9356.67 | TLA
OpenAI Gym | Pendulum-v1 | Action Repetition | 0.7032 | TLA
OpenAI Gym | Pendulum-v1 | Average Decisions | 62.31 | TLA
OpenAI Gym | Pendulum-v1 | Mean Reward | -154.92 | TLA
OpenAI Gym | Ant-v2 | Action Repetition | 0.1268 | TLA
OpenAI Gym | Ant-v2 | Average Decisions | 860.21 | TLA
OpenAI Gym | Ant-v2 | Mean Reward | 5163.54 | TLA
OpenAI Gym | Walker2d-v2 | Action Repetition | 0.4745 | TLA
OpenAI Gym | Walker2d-v2 | Average Decisions | 513.12 | TLA
OpenAI Gym | Walker2d-v2 | Mean Reward | 3878.41 | TLA
OpenAI Gym | Hopper-v2 | Action Repetition | 0.5722 | TLA
OpenAI Gym | Hopper-v2 | Average Decisions | 423.91 | TLA
OpenAI Gym | Hopper-v2 | Mean Reward | 3458.22 | TLA
OpenAI Gym | MountainCarContinuous-v0 | Action Repetition | 0.914 | TLA
OpenAI Gym | MountainCarContinuous-v0 | Average Decisions | 10.6 | TLA
OpenAI Gym | MountainCarContinuous-v0 | Mean Reward | 93.88 | TLA
OpenAI Gym | InvertedPendulum-v2 | Action Repetition | 0.8882 | TLA
OpenAI Gym | InvertedPendulum-v2 | Average Decisions | 111.79 | TLA
OpenAI Gym | InvertedPendulum-v2 | Mean Reward | 1000 | TLA
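One plausible reading of the table's "Action Repetition" metric is the fraction of timesteps on which the action simply repeats the previous timestep's action; the paper's exact definition may differ, so the sketch below is an illustrative assumption, and `action_repetition` is a hypothetical helper introduced here.

```python
# Hedged sketch of an action-repetition metric: the fraction of consecutive
# timestep pairs whose actions are identical. Illustrative only; the paper's
# exact definition may differ.

def action_repetition(actions):
    """Fraction of consecutive pairs (a_t, a_{t+1}) with a_t == a_{t+1}."""
    if len(actions) < 2:
        return 0.0
    repeats = sum(a == b for a, b in zip(actions, actions[1:]))
    return repeats / (len(actions) - 1)

print(action_repetition([1, 1, 1, 2, 2]))  # 3 repeats over 4 pairs -> 0.75
```

Under this reading, high values such as 0.914 on MountainCarContinuous-v0 mean the policy holds each action for long stretches, which is consistent with that environment's very low Average Decisions (10.6).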

Related Papers

CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning (2025-07-18)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback (2025-07-17)
VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks (2025-07-17)
QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation (2025-07-17)
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)
Autonomous Resource Management in Microservice Systems via Reinforcement Learning (2025-07-17)