evorl.algorithms.contrib.pop_episodic_td3
Module Contents
Classes
PopEpisodicTD3Workflow | A batched TD3 workflow like CEMRL.
Functions
build_rl_update_fn | K actors + 1 shared critic.
API
- class evorl.algorithms.contrib.pop_episodic_td3.PopEpisodicTD3Workflow(**kwargs)
Bases: evorl.algorithms.erl.cemrl_td3.cemrl_td3_workflow.CEMRLTD3WorkflowTemplate
A batched TD3 workflow like CEMRL.
The differences from CEMRL are:
Each individual has an actor and a critic.
All individuals are updated by RL.
- evaluate(state: evorl.types.State) → tuple[evorl.metrics.MetricBase, evorl.types.State]
- learn(state: evorl.types.State) → evorl.types.State
- step(state: evorl.types.State) → tuple[evorl.metrics.MetricBase, evorl.types.State]
- evorl.algorithms.contrib.pop_episodic_td3.build_rl_update_fn(agent: evorl.agent.Agent, optimizer: optax.GradientTransformation, config: omegaconf.DictConfig, agent_state_vmap_axes: evorl.agent.AgentState)
K actors + 1 shared critic.
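To illustrate the "K actors + 1 shared critic" layout, here is a minimal JAX sketch of how a population of actors can be updated against a single critic with `jax.vmap`. It is not the EvoRL implementation: the linear networks, the sizes, and every name in it (`init_actor`, `actor_loss`, `pop_actor_params`, and so on) are hypothetical. The assumption is only that `agent_state_vmap_axes` plays a role similar to `in_axes` here, marking which parts of the agent state carry a population axis and which are shared.

```python
import jax
import jax.numpy as jnp

OBS_DIM, ACT_DIM, K = 3, 2, 4  # hypothetical sizes; K is the population size


def init_actor(key):
    # A single linear actor, standing in for a real policy network.
    return {"w": jax.random.normal(key, (OBS_DIM, ACT_DIM)) * 0.1}


def init_critic(key):
    # A single linear critic over the concatenated (obs, action) vector.
    return {"w": jax.random.normal(key, (OBS_DIM + ACT_DIM,)) * 0.1}


def actor_apply(actor_params, obs):
    return jnp.tanh(obs @ actor_params["w"])  # (batch, ACT_DIM)


def critic_apply(critic_params, obs, action):
    return jnp.concatenate([obs, action], axis=-1) @ critic_params["w"]  # (batch,)


def actor_loss(actor_params, critic_params, obs):
    # Deterministic policy gradient objective: maximize Q(s, pi(s)).
    q = critic_apply(critic_params, obs, actor_apply(actor_params, obs))
    return -q.mean()


keys = jax.random.split(jax.random.PRNGKey(0), K + 1)
# Stack K actors along a leading axis: every leaf has shape (K, ...).
pop_actor_params = jax.vmap(init_actor)(keys[:K])
# One critic, with no population axis.
critic_params = init_critic(keys[K])

obs = jnp.ones((8, OBS_DIM))

# Map over the actor axis (0); broadcast the critic and the batch (None).
pop_loss_and_grad = jax.vmap(
    jax.value_and_grad(actor_loss), in_axes=(0, None, None)
)
losses, grads = pop_loss_and_grad(pop_actor_params, critic_params, obs)

# Plain SGD step applied to all K actors at once (a real setup would
# typically use an optax optimizer instead).
new_pop_actor_params = jax.tree_util.tree_map(
    lambda p, g: p - 1e-3 * g, pop_actor_params, grads
)
```

In this sketch `losses` holds one value per actor, and `grads` has the same stacked structure as `pop_actor_params`, so a single update call advances the whole population while the critic parameters stay shared.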