evorl.algorithms.contrib.pop_episodic_td3

Module Contents

Classes

PopEpisodicTD3Workflow

A batched TD3 workflow similar to CEMRL.

Functions

build_rl_update_fn

Builds the RL update function for K actors and one shared critic.

API

class evorl.algorithms.contrib.pop_episodic_td3.PopEpisodicTD3Workflow(**kwargs)

Bases: evorl.algorithms.erl.cemrl_td3.cemrl_td3_workflow.CEMRLTD3WorkflowTemplate

A batched TD3 workflow similar to CEMRL.

The differences from CEMRL are:

  • Each individual has an actor and a critic.

  • All individuals are updated by RL.
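
As a rough illustration of the second point, the sketch below applies one gradient step to every member of a population at once with `jax.vmap`. It is not taken from evorl: the quadratic loss, the parameter layout, and the names `loss_fn`, `sgd_step`, and `pop_params` are hypothetical stand-ins for a real TD3 update.

```python
import jax
import jax.numpy as jnp

K = 4  # hypothetical population size


def loss_fn(params, target):
    # Toy stand-in for an RL loss: squared distance to a target vector.
    return jnp.sum((params["w"] - target) ** 2)


def sgd_step(params, target, lr=0.1):
    # One plain gradient-descent step for a single individual.
    grads = jax.grad(loss_fn)(params, target)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)


# Stacked parameters: the leading axis of size K indexes individuals.
pop_params = {"w": jnp.ones((K, 3))}
target = jnp.zeros(3)

# Map the step over the population axis; the target is shared (in_axes=None).
# Every individual receives a gradient update, none is left to evolution alone.
batched_step = jax.vmap(sgd_step, in_axes=(0, None))
new_pop_params = batched_step(pop_params, target)
```

Stacking per-individual parameters along a leading axis keeps the whole population update inside a single vectorized call instead of a Python loop over individuals.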

evaluate(state: evorl.types.State) -> tuple[evorl.metrics.MetricBase, evorl.types.State]

learn(state: evorl.types.State) -> evorl.types.State

classmethod name()

step(state: evorl.types.State) -> tuple[evorl.metrics.MetricBase, evorl.types.State]

evorl.algorithms.contrib.pop_episodic_td3.build_rl_update_fn(agent: evorl.agent.Agent, optimizer: optax.GradientTransformation, config: omegaconf.DictConfig, agent_state_vmap_axes: evorl.agent.AgentState)

Builds the RL update function for K actors and one shared critic.
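
The "K actors, one shared critic" layout can be sketched with `jax.vmap` by mapping over the actor parameters while broadcasting the critic. This is a minimal sketch under stated assumptions, not the body of `build_rl_update_fn`: the linear networks and the names `actor_apply`, `critic_apply`, and `actor_loss` are hypothetical. The `agent_state_vmap_axes` parameter presumably plays a role similar to `in_axes` below, but how evorl uses it internally is not shown here.

```python
import jax
import jax.numpy as jnp

K, obs_dim, act_dim = 3, 4, 2  # hypothetical sizes


def actor_apply(actor_params, obs):
    # Toy deterministic policy: linear layer followed by tanh.
    return jnp.tanh(obs @ actor_params["w"])


def critic_apply(critic_params, obs, action):
    # Toy Q-function: linear in the concatenated (obs, action) vector.
    return jnp.concatenate([obs, action], axis=-1) @ critic_params["w"]


def actor_loss(actor_params, critic_params, obs):
    # Deterministic policy-gradient style loss: maximize Q(s, pi(s)).
    action = actor_apply(actor_params, obs)
    return -jnp.mean(critic_apply(critic_params, obs, action))


key_a, key_c, key_o = jax.random.split(jax.random.PRNGKey(0), 3)
# K actors stacked on axis 0; a single critic with no population axis.
actor_params = {"w": jax.random.normal(key_a, (K, obs_dim, act_dim))}
critic_params = {"w": jax.random.normal(key_c, (obs_dim + act_dim,))}
obs = jax.random.normal(key_o, (8, obs_dim))

# in_axes=(0, None, None): map over actors, share the critic and the batch.
pop_actor_grads = jax.vmap(jax.grad(actor_loss), in_axes=(0, None, None))(
    actor_params, critic_params, obs
)
```

Each actor receives its own gradient, all computed against the same critic parameters, so the result has one gradient slice per actor along the leading axis.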