evorl.envs.wrappers.training_wrapper

Module Contents

Classes

AutoresetMode

Autoreset mode.

EpisodeWrapper

Maintains episode step count and sets done at episode end.

FastVmapAutoResetWrapper

Brax-style AutoReset: no randomness in reset.

OneEpisodeWrapper

Maintains episode step count and sets done at episode end.

VmapAutoResetWrapper

Vectorize env and Autoreset.

VmapEnvPoolAutoResetWrapper

EnvPool style AutoReset.

VmapWrapper

Vectorize env.

API

class evorl.envs.wrappers.training_wrapper.AutoresetMode

Bases: enum.Enum

Autoreset mode.

DISABLED

‘disabled’

ENVPOOL

‘envpool’

FAST

‘fast’

NORMAL

‘normal’
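
Because the member values are plain strings, a mode can be looked up from a configuration string using ordinary enum.Enum behaviour. The sketch below declares a local stand-in enum with the same members instead of importing evorl, so it runs without the library installed. The helper mode_from_config is hypothetical and not part of the documented API.

```python
from enum import Enum


class AutoresetMode(Enum):
    """Local stand-in mirroring the documented members (not imported from evorl)."""

    DISABLED = "disabled"
    ENVPOOL = "envpool"
    FAST = "fast"
    NORMAL = "normal"


def mode_from_config(name: str) -> AutoresetMode:
    """Hypothetical helper: map a config string to a mode, ignoring case and spaces."""
    # Enum lookup by value raises ValueError for unknown strings.
    return AutoresetMode(name.strip().lower())


mode = mode_from_config("Fast")
print(mode.name, mode.value)
```

An unknown string raises ValueError, which surfaces configuration typos early.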

class evorl.envs.wrappers.training_wrapper.EpisodeWrapper(env: evorl.envs.env.Env, episode_length: int, record_ori_obs: bool = True, discount: float | None = None)

Bases: evorl.envs.wrappers.wrapper.Wrapper

Maintains episode step count and sets done at episode end.

This behaves like brax’s EpisodeWrapper and adds the following fields to transition.info:

  • steps: the current step count of the episode

  • truncation: whether the episode is truncated

  • termination: whether the episode is terminated

  • ori_obs: the next observation without autoreset

  • episode_return: the running sum of discounted rewards in the episode
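
The bookkeeping behind these fields can be illustrated with a minimal pure-Python sketch. It is not the evorl implementation, which operates on JAX arrays. The names Info and episode_step are hypothetical, and two points are assumptions: that truncation means the time limit was reached without the env terminating on its own (as in brax), and that the reward at 0-based step t is weighted by discount**t.

```python
from dataclasses import dataclass


@dataclass
class Info:
    """Toy stand-in for the per-episode fields listed above."""

    steps: int = 0
    termination: bool = False
    truncation: bool = False
    episode_return: float = 0.0


def episode_step(info: Info, reward: float, env_done: bool,
                 episode_length: int, discount: float = 1.0):
    """Advance the bookkeeping by one step and return (new_info, done)."""
    steps = info.steps + 1
    termination = env_done
    # Assumption: truncated only if the time limit is hit and the env did not end itself.
    truncation = steps >= episode_length and not env_done
    # Assumption: the reward at 0-based step t is weighted by discount**t.
    episode_return = info.episode_return + (discount ** info.steps) * reward
    done = termination or truncation
    return Info(steps, termination, truncation, episode_return), done


info, done = Info(), False
for _ in range(3):  # the toy env never terminates on its own
    info, done = episode_step(info, reward=1.0, env_done=False,
                              episode_length=3, discount=0.5)
print(info, done)
```

Keeping termination and truncation separate lets a learner bootstrap from the value of the next observation on truncation while treating termination as a true end of the episode.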

reset(key: chex.PRNGKey) → evorl.envs.env.EnvState

step(state: evorl.envs.env.EnvState, action: jax.Array) → evorl.envs.env.EnvState

class evorl.envs.wrappers.training_wrapper.FastVmapAutoResetWrapper(env: evorl.envs.env.Env, num_envs: int = 1)

Bases: evorl.envs.wrappers.wrapper.Wrapper

Brax-style AutoReset: no randomness in reset.

This wrapper reuses the state returned by env.reset(). When episodes are short or env.reset() is expensive, it is more efficient than VmapAutoResetWrapper.
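
The idea of reusing the reset state can be sketched in plain Python with a toy one-dimensional env. This is an illustration only, not the evorl code: State, reset and step are hypothetical names, and the sketch assumes the brax convention that the returned state keeps done=True while its observation is swapped for the cached initial one.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class State:
    obs: float
    done: bool
    first_obs: float  # cached at reset() time and reused on every autoreset


def reset(seed: int) -> State:
    """Toy reset: the only place where randomness is consumed."""
    obs = random.Random(seed).uniform(-1.0, 1.0)
    return State(obs=obs, done=False, first_obs=obs)


def step(state: State, action: float) -> State:
    """Toy dynamics followed by a fast autoreset."""
    obs = state.obs + action
    done = abs(obs) > 2.0
    if done:
        # Fast autoreset: restore the cached initial observation, no new sampling.
        obs = state.first_obs
    return replace(state, obs=obs, done=done)


s0 = reset(seed=0)
s1 = step(s0, action=5.0)  # leaves the valid range, so the episode ends
s2 = step(s1, action=0.5)  # continues from the cached start state
print(s1.done, s1.obs == s0.obs, s2.done)
```

The trade-off is that every episode of a given env instance starts from the same initial state, which is what "no randomness in reset" refers to.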

reset(key: chex.PRNGKey) → evorl.envs.env.EnvState

Reset the vmapped env.

Parameters:

key – either a batch of keys with shape [B,2] or a single key with shape [2]

step(state: evorl.envs.env.EnvState, action: jax.Array) → evorl.envs.env.EnvState

class evorl.envs.wrappers.training_wrapper.OneEpisodeWrapper(env: evorl.envs.env.Env, episode_length: int, record_ori_obs: bool = True, discount: float | None = None)

Bases: evorl.envs.wrappers.training_wrapper.EpisodeWrapper

Maintains episode step count and sets done at episode end.

If step() is called after the env is done, the simulation stops and the previous state is returned unchanged.
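
The freeze-after-done behaviour can be sketched in plain Python as follows. The names State, base_step and one_episode_step are hypothetical, and the toy env simply counts steps. A JAX implementation would need a traceable select instead of the Python if used here, since done is a traced value under jit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    position: int
    steps: int
    done: bool


def base_step(state: State, action: int, episode_length: int) -> State:
    """Toy dynamics: move by `action` and end after `episode_length` steps."""
    steps = state.steps + 1
    return State(state.position + action, steps, steps >= episode_length)


def one_episode_step(state: State, action: int, episode_length: int) -> State:
    """Stop simulating once the episode is done and return the state as-is."""
    if state.done:
        return state
    return base_step(state, action, episode_length)


state = State(position=0, steps=0, done=False)
for _ in range(5):  # two calls more than the episode length
    state = one_episode_step(state, action=1, episode_length=3)
print(state)
```

This makes it safe to run a fixed-length rollout loop over envs whose episodes end at different times, because extra calls leave finished episodes untouched.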

step(state: evorl.envs.env.EnvState, action: jax.Array) → evorl.envs.env.EnvState

class evorl.envs.wrappers.training_wrapper.VmapAutoResetWrapper(env: evorl.envs.env.Env, num_envs: int = 1)

Bases: evorl.envs.wrappers.wrapper.Wrapper

Vectorize env and Autoreset.

reset(key: chex.PRNGKey) → evorl.envs.env.EnvState

Reset the vmapped env.

Parameters:

key – either a batch of keys with shape [B,2] or a single key with shape [2]

step(state: evorl.envs.env.EnvState, action: jax.Array) → evorl.envs.env.EnvState

class evorl.envs.wrappers.training_wrapper.VmapEnvPoolAutoResetWrapper(env: evorl.envs.env.Env, num_envs: int = 1)

Bases: evorl.envs.wrappers.wrapper.Wrapper

EnvPool style AutoReset.

When an episode ends, the next call to step() performs an additional reset step instead of a normal transition. See the EnvPool auto-reset documentation: https://envpool.readthedocs.io/en/latest/content/python_interface.html#auto-reset, and the Next-Step Mode in gymnasium: https://farama.org/Vector-Autoreset-Mode.

This is helpful for algorithms that use n-step TD or GAE with partial episode bootstrapping (PEB) on time-limited environments. When using this wrapper, remember to skip the invalid transitions produced by the reset step, using the autoreset mask.
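
A minimal pure-Python sketch of this next-step reset and of filtering by the mask is shown below. It is a toy, not the evorl code: State, reset and step are hypothetical names, and storing the mask as a State.autoreset field is an assumption made for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    obs: int
    reward: float
    done: bool
    autoreset: bool  # True when this state was produced by the extra reset step


def reset() -> State:
    return State(obs=0, reward=0.0, done=False, autoreset=False)


def step(state: State, action: int, episode_length: int = 3) -> State:
    if state.done:
        # Extra reset step: the action is ignored and no reward is produced.
        return State(obs=0, reward=0.0, done=False, autoreset=True)
    obs = state.obs + action
    return State(obs=obs, reward=1.0, done=obs >= episode_length, autoreset=False)


state = reset()
transitions = []  # (obs, next_obs, reward, done, autoreset)
for _ in range(8):
    nxt = step(state, action=1)
    transitions.append((state.obs, nxt.obs, nxt.reward, nxt.done, nxt.autoreset))
    state = nxt

# Drop the transitions that only bridge a finished episode to the next one.
valid = [t for t in transitions if not t[4]]
print(len(transitions), len(valid))
```

In this scheme the final observation of an episode stays visible as the next observation of the last valid transition, which is what bootstrapping at a time limit needs.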

reset(key: chex.PRNGKey) → evorl.envs.env.EnvState

Reset the vmapped env.

Parameters:

key – either a batch of keys with shape [B,2] or a single key with shape [2]

step(state: evorl.envs.env.EnvState, action: jax.Array) → evorl.envs.env.EnvState

class evorl.envs.wrappers.training_wrapper.VmapWrapper(env: evorl.envs.env.Env, num_envs: int = 1, vmap_step: bool = False)

Bases: evorl.envs.wrappers.wrapper.Wrapper

Vectorize env.

reset(key: chex.PRNGKey) → evorl.envs.env.EnvState

Reset the vmapped env.

Parameters:

key – either a batch of keys with shape [B,2] or a single key with shape [2]

step(state: evorl.envs.env.EnvState, action: jax.Array) → evorl.envs.env.EnvState