Unity ML-Agents: WallJump and SoccerTwos Environments Using Reinforcement Learning Techniques
Implementation and comparative study of reinforcement learning algorithms in Unity ML-Agents environments, with case studies on WallJump and SoccerTwos.
This study trains reinforcement learning (RL) agents in environments from the Unity ML-Agents Toolkit.
Algorithms implemented include imitation learning, Proximal Policy Optimization (PPO), Soft Actor-Critic (SAC), and MA-POCA (Multi-Agent POsthumous Credit Assignment), each evaluated by monitoring training runs in TensorBoard.
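As a point of reference for the PPO runs, the core of PPO is a clipped surrogate objective that limits how far one update can move the policy. The sketch below computes that objective for a batch in plain Python. It is illustrative only and is not the ML-Agents implementation; the function name and the `epsilon=0.2` default are assumptions made for this example.

```python
def ppo_clipped_objective(ratios, advantages, epsilon=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    ratios:     new_policy_prob / old_policy_prob for each sampled action
    advantages: advantage estimate for each sampled action
    epsilon:    clip range (0.2 is an assumed, commonly used value)
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("batch must not be empty")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - epsilon, 1 + epsilon].
        clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)
```

With a positive advantage, a ratio above `1 + epsilon` earns no extra credit, so the policy has no incentive to move further in a single update.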
Enhancements explored:
- Curriculum learning in WallJump, training the agent on progressively harder versions of the task
- Self-play in SoccerTwos for competitive multi-agent strategies
- Hyperparameter tuning for improved stability and performance
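The curriculum idea listed above can be sketched as a small lesson scheduler: the agent stays on an easier lesson until its recent mean reward clears a threshold, then moves to a harder one. This is a minimal sketch under stated assumptions, not the configuration used in this study. The `Lesson` and `Curriculum` classes, the `wall_height` parameter, and all threshold values are hypothetical names and numbers chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Lesson:
    name: str
    wall_height: float       # hypothetical environment parameter
    reward_threshold: float  # mean reward required to leave this lesson


class Curriculum:
    """Advance through lessons once recent mean reward clears a threshold."""

    def __init__(self, lessons, min_episodes=100):
        if not lessons:
            raise ValueError("at least one lesson is required")
        self.lessons = list(lessons)
        self.min_episodes = min_episodes
        self.index = 0
        # Only the most recent `min_episodes` rewards are kept.
        self._recent = deque(maxlen=min_episodes)

    @property
    def current(self):
        return self.lessons[self.index]

    def record_episode(self, reward):
        """Record one episode reward; return True if the lesson advanced."""
        self._recent.append(reward)
        on_last_lesson = self.index == len(self.lessons) - 1
        if on_last_lesson or len(self._recent) < self.min_episodes:
            return False
        mean_reward = sum(self._recent) / len(self._recent)
        if mean_reward >= self.current.reward_threshold:
            self.index += 1
            self._recent.clear()  # start fresh statistics for the new lesson
            return True
        return False
```

A training loop would call `record_episode` after each episode and, when it returns `True`, reconfigure the environment with `curriculum.current.wall_height`. Requiring a minimum number of episodes before advancing guards against moving on after a few lucky runs.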
Key results:
- SAC outperformed PPO in the continuous-control 3DBall environment
- Curriculum learning accelerated convergence in WallJump
- Self-play produced stronger agents in adversarial settings
- Hyperparameter tuning critically influenced model robustness
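One common way to quantify "stronger" in self-play is an Elo rating, which rises when an agent beats opponents it was not expected to beat. The sketch below shows the standard Elo update in plain Python. It is offered as an illustration of the idea, not as the measurement used for the results above; the function names and the `k=16.0` step size are assumptions, and the exact update used by any particular training toolkit may differ.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=16.0):
    """Return updated (rating_a, rating_b) after one match.

    score_a: 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    k:       step size (16.0 is an assumed value for this sketch).
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta
```

For two equally rated agents the expected score is 0.5, so a win moves the winner up by `k / 2` and the loser down by the same amount. A steadily rising rating against past snapshots is a sign that self-play is producing a stronger policy rather than one that only exploits its latest opponent.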

Unity ML-Agents environments explored: WallJump and SoccerTwos