Creating Hierarchical Dispositions of Needs in an Agent

Tofara Moyo

2024-11-23 | Reinforcement Learning | OpenAI Gym

Abstract

We present a novel method for learning hierarchical abstractions that prioritize competing objectives, leading to improved global expected rewards. Our approach employs a secondary rewarding agent with multiple scalar outputs, each associated with a distinct level of abstraction. The traditional agent then learns to maximize these outputs hierarchically, with each level conditioned on the maximization of the preceding level. We derive an equation that orders these scalar values and the global reward by priority, inducing a hierarchy of needs that informs goal formation. Experimental results on the Pendulum-v1 environment demonstrate superior performance compared to a baseline implementation and achieve state-of-the-art results on this task.
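The abstract does not reproduce the ordering equation, so the following is only an illustrative sketch of one way a priority ordering over scalar signals could be expressed. It is not the paper's equation. The function name `prioritized_return`, the gating rule (each level is weighted by the product of all higher-priority levels), and the requirement that every signal, including the global reward, is first normalized to [0, 1] are all assumptions made for this example.

```python
from typing import Sequence


def prioritized_return(signals: Sequence[float]) -> float:
    """Combine scalar signals ordered from highest to lowest priority.

    Hypothetical rule (an assumption, not taken from the paper): the
    contribution of level k is scaled by the product of all higher-priority
    signals, so a lower level only counts to the extent that every level
    above it is already close to its maximum of 1.0.
    """
    total = 0.0
    gate = 1.0  # product of all higher-priority signals seen so far
    for value in signals:
        if not 0.0 <= value <= 1.0:
            raise ValueError("signals are assumed to be normalized to [0, 1]")
        total += gate * value
        gate *= value
    return total


if __name__ == "__main__":
    # Top-priority need fully met: the lower levels contribute in full.
    print(prioritized_return([1.0, 1.0, 0.5]))
    # Top-priority need unmet: the lower levels are suppressed entirely.
    print(prioritized_return([0.0, 1.0, 1.0]))
```

Under this assumed rule, an agent maximizing the combined value is pushed to satisfy the highest-priority signal first, since any shortfall there discounts everything below it. A hard threshold or a learned gate would be equally plausible choices.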

Results

Task	Dataset	Metric	Value	Model
OpenAI Gym	Pendulum-v1	Action Repetition	0.8073	TLA with Hierarchical Reward Functions
OpenAI Gym	Pendulum-v1	Average Decisions	38.6	TLA with Hierarchical Reward Functions
OpenAI Gym	Pendulum-v1	Mean Reward	-125.02	TLA with Hierarchical Reward Functions
