`evorl.algorithms.contrib.pop_episodic_td3`

Module Contents

Classes

PopEpisodicTD3Workflow

A batched TD3 workflow similar to CEMRL.

Functions

build_rl_update_fn

K actors + 1 shared critic.

API

class evorl.algorithms.contrib.pop_episodic_td3.PopEpisodicTD3Workflow(**kwargs)

Bases: evorl.algorithms.erl.cemrl_td3.cemrl_td3_workflow.CEMRLTD3WorkflowTemplate

A batched TD3 workflow similar to CEMRL.

The differences from CEMRL are:

- Each individual has an actor and a critic.
- All individuals are updated by RL.
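The two differences above can be illustrated with a minimal JAX sketch. It is not the EvoRL implementation: the linear actor and critic, the `init_individual` and `update_individual` helpers, the plain SGD step, and all shapes are hypothetical, and TD3 details such as twin critics, target networks, and delayed policy updates are omitted. It only shows how a population in which every individual owns an actor and a critic can be updated by gradient steps in one batched call using `jax.vmap`.

```python
import jax
import jax.numpy as jnp

POP_SIZE, OBS_DIM, ACT_DIM = 4, 3, 2  # hypothetical sizes


def init_individual(key):
    """Hypothetical linear actor and critic owned by one individual."""
    k_actor, k_critic = jax.random.split(key)
    return {
        "actor": 0.1 * jax.random.normal(k_actor, (OBS_DIM, ACT_DIM)),
        "critic": 0.1 * jax.random.normal(k_critic, (OBS_DIM + ACT_DIM,)),
    }


def critic_loss(critic, obs, act, target_q):
    # Mean squared error between Q(obs, act) and a given target.
    q = jnp.concatenate([obs, act], axis=-1) @ critic
    return jnp.mean((q - target_q) ** 2)


def actor_loss(actor, critic, obs):
    # Deterministic policy gradient objective: maximize Q(obs, actor(obs)).
    act = jnp.tanh(obs @ actor)
    q = jnp.concatenate([obs, act], axis=-1) @ critic
    return -jnp.mean(q)


def update_individual(params, obs, act, target_q, lr=1e-2):
    """One simplified gradient step for a single individual."""
    critic_grad = jax.grad(critic_loss)(params["critic"], obs, act, target_q)
    new_critic = params["critic"] - lr * critic_grad
    actor_grad = jax.grad(actor_loss)(params["actor"], new_critic, obs)
    new_actor = params["actor"] - lr * actor_grad
    return {"actor": new_actor, "critic": new_critic}


# Stack the parameters of all individuals along a leading population axis.
keys = jax.random.split(jax.random.PRNGKey(0), POP_SIZE)
pop_params = jax.vmap(init_individual)(keys)

# Dummy batch shared by every individual in this sketch.
obs = jnp.ones((8, OBS_DIM))
act = jnp.zeros((8, ACT_DIM))
target_q = jnp.ones((8,))

# Map the update over the population axis; the batch is broadcast.
update_population = jax.jit(
    jax.vmap(update_individual, in_axes=(0, None, None, None))
)
new_pop_params = update_population(pop_params, obs, act, target_q)
```

Every leaf of `new_pop_params` keeps the leading population axis, so each individual's actor and critic are updated independently within a single compiled call.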

evaluate(state: evorl.types.State) → tuple[evorl.metrics.MetricBase, evorl.types.State]

learn(state: evorl.types.State) → evorl.types.State

classmethod name()

step(state: evorl.types.State) → tuple[evorl.metrics.MetricBase, evorl.types.State]

evorl.algorithms.contrib.pop_episodic_td3.build_rl_update_fn(agent: evorl.agent.Agent, optimizer: optax.GradientTransformation, config: omegaconf.DictConfig, agent_state_vmap_axes: evorl.agent.AgentState)

K actors + 1 shared critic.
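The name `agent_state_vmap_axes` suggests a pytree of `jax.vmap` axis specifications shaped like the agent state; this page does not confirm that, so treat it as an assumption. Under that assumption, the sketch below shows how such an axes pytree can express "K actors + 1 shared critic": the actor leaf is mapped over axis 0 and the critic leaf is marked `None` so it is broadcast. The dictionary layout and the `q_value` function are hypothetical and are not EvoRL's `AgentState`.

```python
import jax
import jax.numpy as jnp

K, OBS_DIM, ACT_DIM = 3, 4, 2  # hypothetical sizes

# Hypothetical agent state: K stacked actors and a single critic.
agent_state = {
    "actor": jnp.ones((K, OBS_DIM, ACT_DIM)),  # leading axis indexes the K actors
    "critic": jnp.ones((OBS_DIM + ACT_DIM,)),  # one critic, no population axis
}

# Axes pytree with the same structure as the state:
# 0 means "map over the leading axis", None means "share across the map".
vmap_axes = {"actor": 0, "critic": None}


def q_value(state, obs):
    """Q-value of one actor's action, scored by the shared critic."""
    act = jnp.tanh(obs @ state["actor"])
    return jnp.concatenate([obs, act]) @ state["critic"]


obs = jnp.ones((OBS_DIM,))

# Inside q_value, state["actor"] has shape (OBS_DIM, ACT_DIM) and
# state["critic"] is the same array for every mapped call.
q_per_actor = jax.vmap(q_value, in_axes=(vmap_axes, None))(agent_state, obs)
```

The result `q_per_actor` has one entry per actor, while the critic parameters are never duplicated along the population axis.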