Algorithms

Currently, EvoRL supports various training pipelines (workflows):

  1. Reinforcement Learning (RL) Algorithms

  2. Evolutionary Computation (EC) Algorithms, specifically for policy search

  3. Evolutionary Reinforcement Learning (EvoRL):

    • Evolution-guided Reinforcement Learning (ERL)

    • Population-based AutoRL

This document introduces each of these algorithm types as implemented in EvoRL. All algorithms are defined in the evorl.algorithms package.

RL Algorithms

Supported RL Algorithms:

Algorithm  Workflow             Policy Type    Supported Action Space
---------  -------------------  -------------  ----------------------
Random     RandomAgentWorkflow  -              Discrete & Continuous
A2C        A2CWorkflow          Stochastic     Discrete & Continuous
PPO        PPOWorkflow          Stochastic     Discrete & Continuous
IMPALA     IMPALAWorkflow       Stochastic     Discrete & Continuous
DQN        DQNWorkflow          Value-based    Discrete
DDPG       DDPGWorkflow         Deterministic  Continuous
TD3        TD3Workflow          Deterministic  Continuous
SAC        SACWorkflow          Stochastic     Discrete & Continuous
TD7        TD7Workflow          Deterministic  Continuous
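
When choosing an algorithm for a given environment, the table above can be treated as a lookup keyed by action space. The snippet below is a minimal sketch of that idea: it transcribes the table into a plain Python dict and filters it. The names AlgoInfo, RL_ALGORITHMS and algorithms_for are hypothetical helpers written for this illustration; they are not part of the EvoRL API.

```python
from typing import NamedTuple


class AlgoInfo(NamedTuple):
    workflow: str
    policy_type: str
    action_spaces: frozenset


DISCRETE, CONTINUOUS = "discrete", "continuous"
BOTH = frozenset({DISCRETE, CONTINUOUS})

# Transcribed from the table above; illustrative only, not an EvoRL object.
RL_ALGORITHMS = {
    "Random": AlgoInfo("RandomAgentWorkflow", "-", BOTH),
    "A2C": AlgoInfo("A2CWorkflow", "Stochastic", BOTH),
    "PPO": AlgoInfo("PPOWorkflow", "Stochastic", BOTH),
    "IMPALA": AlgoInfo("IMPALAWorkflow", "Stochastic", BOTH),
    "DQN": AlgoInfo("DQNWorkflow", "Value-based", frozenset({DISCRETE})),
    "DDPG": AlgoInfo("DDPGWorkflow", "Deterministic", frozenset({CONTINUOUS})),
    "TD3": AlgoInfo("TD3Workflow", "Deterministic", frozenset({CONTINUOUS})),
    "SAC": AlgoInfo("SACWorkflow", "Stochastic", BOTH),
    "TD7": AlgoInfo("TD7Workflow", "Deterministic", frozenset({CONTINUOUS})),
}


def algorithms_for(action_space: str) -> list[str]:
    """Return the algorithm names whose table row lists `action_space`."""
    return [
        name
        for name, info in RL_ALGORITHMS.items()
        if action_space in info.action_spaces
    ]


print(algorithms_for(DISCRETE))
print(algorithms_for(CONTINUOUS))
```

The results keep the row order of the table, so an environment with a discrete action space maps to the rows from Random through SAC that list "Discrete".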

EC Algorithms

EC algorithms are defined in the subpackage evorl.algorithms.ec.

Workflows for single-objective EC are derived from ECWorkflowTemplate.

Algorithm  Workflow           Policy Type    Supported Action Space
---------  -----------------  -------------  ----------------------
OpenES     OpenESWorkflow     Deterministic  Continuous
VanillaES  VanillaESWorkflow  Deterministic  Continuous
ARS        ARSWorkflow        Deterministic  Continuous
CMA-ES     CMAESWorkflow      Deterministic  Continuous
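
To make "EC for policy search" concrete, the following is a minimal sketch of an evolution-strategy update in the spirit of OpenES, written in plain Python. It is not EvoRL's implementation and does not use ECWorkflowTemplate. The toy fitness function stands in for an episode return, and every name and hyperparameter here (fitness, es_step, pop_size, sigma, lr) is an assumption made for illustration.

```python
import random


def fitness(params):
    """Toy objective standing in for an episode return (higher is better)."""
    target = [1.0, -2.0, 0.5]
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def es_step(params, rng, pop_size=20, sigma=0.1, lr=0.05):
    """One ES update using antithetic (mirrored) Gaussian perturbations."""
    grad = [0.0] * len(params)
    for _ in range(pop_size // 2):
        noise = [rng.gauss(0.0, 1.0) for _ in params]
        # Evaluate the pair of candidates params + sigma*noise and params - sigma*noise.
        plus = fitness([p + sigma * n for p, n in zip(params, noise)])
        minus = fitness([p - sigma * n for p, n in zip(params, noise)])
        for i, n in enumerate(noise):
            grad[i] += (plus - minus) * n
    # Averaging (plus - minus) / (2 * sigma) * noise over pop_size / 2 pairs
    # gives the 1 / (pop_size * sigma) factor; lr scales the ascent step.
    scale = lr / (pop_size * sigma)
    return [p + scale * g for p, g in zip(params, grad)]


rng = random.Random(0)
params = [0.0, 0.0, 0.0]
for _ in range(200):
    params = es_step(params, rng)
print(fitness(params))
```

Only fitness values are used to build the update, so no backpropagation through the policy is needed. In a policy-search setting, params would be the flattened policy parameters and fitness the return of a rollout.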

Workflows for multi-objective EC are derived from MultiObjectiveECWorkflowTemplate. Currently, we provide NSGA-II, via NSGA2Workflow, for Brax environments.

ERL Algorithms

The ERL algorithms are defined in the subpackage evorl.algorithms.erl. We provide ERL and CEM-RL and their variants.

Population-based AutoRL Algorithms

The Population-based AutoRL algorithms are defined in the subpackage evorl.algorithms.meta. We provide some general population-based training pipelines for RL hyperparameter tuning.
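
As a rough illustration of population-based hyperparameter tuning, the sketch below shows one exploit/explore step in the style of Population Based Training. It is a generic, simplified sketch and not the code in evorl.algorithms.meta: the function pbt_step, the member layout, and the frac and perturb parameters are all hypothetical, and the copying of network weights that normally accompanies the exploit step is omitted.

```python
import random


def pbt_step(population, rng, frac=0.25, perturb=(0.8, 1.2)):
    """Apply one exploit/explore step in place and return the population.

    Each member is a dict {"hparams": {...}, "score": float}. The lowest-scoring
    `frac` of members copy the hyperparameters of a randomly chosen member from
    the highest-scoring `frac` (exploit), then multiply each copied value by a
    randomly chosen perturbation factor (explore).
    """
    if len(population) < 2:
        return population
    ranked = sorted(population, key=lambda m: m["score"], reverse=True)
    k = max(1, int(len(ranked) * frac))
    top, bottom = ranked[:k], ranked[-k:]
    for member in bottom:
        donor = rng.choice(top)
        member["hparams"] = {
            name: value * rng.choice(perturb)
            for name, value in donor["hparams"].items()
        }
    return population


population = [
    {"hparams": {"lr": 1e-3}, "score": 4.0},
    {"hparams": {"lr": 3e-4}, "score": 3.0},
    {"hparams": {"lr": 1e-4}, "score": 2.0},
    {"hparams": {"lr": 1e-2}, "score": 1.0},
]
pbt_step(population, random.Random(0))
print(population[-1]["hparams"])
```

In a full pipeline this step would run periodically between training intervals, with each member's score refreshed from recent evaluation returns before the next step.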