evorl.algorithms.offpolicy_utils

Module Contents

Classes

OffPolicyWorkflowTemplate

A template that wraps logic common to off-policy RL algorithms based on TD learning.

Functions

clean_trajectory

Clean the trajectory to make it suitable for the replay buffer.

skip_replay_buffer_state

Utility function to remove replay_buffer_state from state.

API

class evorl.algorithms.offpolicy_utils.OffPolicyWorkflowTemplate(env: evorl.envs.Env, agent: evorl.agent.Agent, optimizer: optax.GradientTransformation, evaluator: evorl.evaluators.Evaluator, replay_buffer: evorl.replay_buffers.AbstractReplayBuffer, config: omegaconf.DictConfig)

Bases: evorl.workflows.OffPolicyWorkflow

A template that wraps logic common to off-policy RL algorithms based on TD learning.

classmethod enable_jit() -> None

learn(state: evorl.types.State) -> evorl.types.State

evorl.algorithms.offpolicy_utils.clean_trajectory(trajectory: evorl.sample_batch.SampleBatch) -> evorl.sample_batch.SampleBatch

Clean the trajectory to make it suitable for the replay buffer.

evorl.algorithms.offpolicy_utils.skip_replay_buffer_state(state: evorl.types.State) -> evorl.types.State

Utility function to remove replay_buffer_state from state.

Usually used when saving the off-policy workflow state to disk.
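
The idea behind dropping the replay buffer before saving can be illustrated with a minimal, self-contained sketch. It does not use the evorl API: `ToyState` and `toy_skip_replay_buffer_state` are hypothetical stand-ins for `evorl.types.State` and `skip_replay_buffer_state`, and the assumption that the field is simply set to `None` is ours, not something this page confirms.

```python
import dataclasses
import pickle
from typing import Any


@dataclasses.dataclass(frozen=True)
class ToyState:
    """Hypothetical stand-in for evorl.types.State (not the real class)."""

    params: Any
    replay_buffer_state: Any = None


def toy_skip_replay_buffer_state(state: ToyState) -> ToyState:
    """Return a copy of `state` with the replay buffer contents dropped.

    Assumption: dropping means replacing the field with None.
    """
    return dataclasses.replace(state, replay_buffer_state=None)


# A small set of "parameters" next to a comparatively large replay buffer.
state = ToyState(
    params={"w": [0.0] * 16},
    replay_buffer_state=bytes(1_000_000),
)

slim = toy_skip_replay_buffer_state(state)

# Serialize both versions to compare the size of what would be written to disk.
full_bytes = len(pickle.dumps(state))
slim_bytes = len(pickle.dumps(slim))
```

In this toy setup the original `state` is left untouched and only the copy loses its buffer, so training could continue from `state` while `slim` is what gets serialized. Whether the real function behaves the same way should be checked against the evorl source.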