evorl.envs.wrappers.training_wrapper¶
Module Contents¶
Classes¶
AutoresetMode | Autoreset mode.
EpisodeWrapper | Maintains episode step count and sets done at episode end.
FastVmapAutoResetWrapper | Brax-style AutoReset: no randomness in reset.
OneEpisodeWrapper | Maintains episode step count and sets done at episode end.
VmapAutoResetWrapper | Vectorize env and Autoreset.
VmapEnvPoolAutoResetWrapper | EnvPool style AutoReset.
VmapWrapper | Vectorize env.
API¶
- class evorl.envs.wrappers.training_wrapper.AutoresetMode[source]¶
Bases: enum.Enum
Autoreset mode.
- DISABLED¶
‘disabled’
- ENVPOOL¶
‘envpool’
- FAST¶
‘fast’
- NORMAL¶
‘normal’
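As an illustration of how the string values above relate to the enum members, here is a minimal sketch using a stand-in enum that mirrors the documented members (it is not imported from evorl). The helper `parse_autoreset_mode` is hypothetical and only shows the standard lookup-by-value behaviour of `enum.Enum`.

```python
from enum import Enum


class AutoresetMode(Enum):
    """Stand-in mirroring the documented members; not the evorl class itself."""

    DISABLED = "disabled"
    ENVPOOL = "envpool"
    FAST = "fast"
    NORMAL = "normal"


def parse_autoreset_mode(name: str) -> AutoresetMode:
    # Hypothetical helper: look a member up by its string value.
    return AutoresetMode(name.lower())


mode = parse_autoreset_mode("FAST")
```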
- class evorl.envs.wrappers.training_wrapper.EpisodeWrapper(env: evorl.envs.env.Env, episode_length: int, record_ori_obs: bool = True, discount: float | None = None)[source]¶
Bases: evorl.envs.wrappers.wrapper.Wrapper
Maintains episode step count and sets done at episode end.
This is the same as brax’s EpisodeWrapper, but adds some new fields to transition.info:
- steps: the current step count of the episode
- truncation: whether the episode is truncated
- termination: whether the episode is terminated
- ori_obs: the next observation without autoreset
- episode_return: the current sum of discounted rewards of the episode
- reset(key: chex.PRNGKey) evorl.envs.env.EnvState[source]¶
- step(state: evorl.envs.env.EnvState, action: jax.Array) evorl.envs.env.EnvState[source]¶
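The bookkeeping described above can be sketched roughly as follows. This is a self-contained illustration rather than the evorl implementation: the function `episode_bookkeeping` is hypothetical, and the discounting convention (the reward is weighted by `discount ** steps` before the counter is incremented) is an assumption.

```python
import jax.numpy as jnp


def episode_bookkeeping(steps, env_done, reward, episode_return,
                        episode_length, discount=1.0):
    """Hypothetical helper: one step of episode-level bookkeeping."""
    # Weight the reward by the number of steps taken so far (assumed convention).
    episode_return = episode_return + (discount ** steps) * reward
    steps = steps + 1
    # The env itself signalled the end of the episode: termination.
    termination = env_done
    # The time limit was reached without the env terminating: truncation.
    truncation = jnp.where(steps >= episode_length, 1.0 - env_done, 0.0)
    done = jnp.maximum(termination, truncation)
    return dict(steps=steps, termination=termination, truncation=truncation,
                done=done, episode_return=episode_return)


info = episode_bookkeeping(
    steps=jnp.array(2), env_done=jnp.array(0.0), reward=jnp.array(1.0),
    episode_return=jnp.array(2.0), episode_length=3, discount=0.5,
)
```

In this toy call the third step reaches `episode_length`, so the episode is marked as truncated rather than terminated.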
- class evorl.envs.wrappers.training_wrapper.FastVmapAutoResetWrapper(env: evorl.envs.env.Env, num_envs: int = 1)[source]¶
Bases: evorl.envs.wrappers.wrapper.Wrapper
Brax-style AutoReset: no randomness in reset.
This wrapper reuses the state returned by env.reset(). When episodes are short or env.reset() is expensive, this wrapper is more efficient than VmapAutoResetWrapper.
- reset(key: chex.PRNGKey) evorl.envs.env.EnvState[source]¶
Reset the vmapped env.
- Parameters:
key – support batched keys [B,2] or single key [2]
- step(state: evorl.envs.env.EnvState, action: jax.Array) evorl.envs.env.EnvState[source]¶
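A minimal sketch of the Brax-style idea, assuming the first state from reset is cached and swapped back in wherever an episode has ended. The helper `fast_autoreset` and the dictionary-shaped states are hypothetical and stand in for the real `EnvState`.

```python
import jax
import jax.numpy as jnp


def fast_autoreset(first_state, next_state, done):
    """Swap in the cached first state wherever `done` is set.

    Hypothetical helper: both states are pytrees whose leaves have a
    leading batch axis, and `done` has shape [B].
    """

    def select(first, new):
        # Broadcast the done mask over any trailing dimensions of the leaf.
        mask = done.reshape(done.shape + (1,) * (new.ndim - 1))
        return jnp.where(mask, first, new)

    return jax.tree_util.tree_map(select, first_state, next_state)


first = {"obs": jnp.zeros((3, 2)), "t": jnp.zeros(3)}
nxt = {"obs": jnp.ones((3, 2)), "t": jnp.array([5.0, 6.0, 7.0])}
done = jnp.array([False, True, False])
out = fast_autoreset(first, nxt, done)
```

No new random key is consumed here, which is why every reset restarts from the same cached state.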
- class evorl.envs.wrappers.training_wrapper.OneEpisodeWrapper(env: evorl.envs.env.Env, episode_length: int, record_ori_obs: bool = True, discount: float | None = None)[source]¶
Bases: evorl.envs.wrappers.training_wrapper.EpisodeWrapper
Maintains episode step count and sets done at episode end.
When step() is called after the env is done, the simulation stops and the previous state is returned directly.
- step(state: evorl.envs.env.EnvState, action: jax.Array) evorl.envs.env.EnvState[source]¶
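The "return the previous state once done" behaviour can be illustrated with `jax.lax.cond`. This is a sketch under assumptions: `toy_step` is a hypothetical environment, and the real wrapper may implement the branch differently.

```python
import jax
import jax.numpy as jnp


def toy_step(state):
    # Hypothetical toy dynamics: count up and finish once the counter reaches 2.
    x = state["x"] + 1
    return {"x": x, "done": (x >= 2).astype(jnp.float32)}


def one_episode_step(state):
    # Once done, skip the simulation and hand the state back unchanged.
    return jax.lax.cond(state["done"] > 0, lambda s: s, toy_step, state)


state = {
    "x": jnp.array(0, dtype=jnp.int32),
    "done": jnp.array(0.0, dtype=jnp.float32),
}
for _ in range(5):
    state = one_episode_step(state)
```

Although `one_episode_step` is called five times, the counter stops advancing after the second call, when `done` is first set.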
- class evorl.envs.wrappers.training_wrapper.VmapAutoResetWrapper(env: evorl.envs.env.Env, num_envs: int = 1)[source]¶
Bases: evorl.envs.wrappers.wrapper.Wrapper
Vectorize env and Autoreset.
- reset(key: chex.PRNGKey) evorl.envs.env.EnvState[source]¶
Reset the vmapped env.
- Parameters:
key – support batched keys [B,2] or single key [2]
- step(state: evorl.envs.env.EnvState, action: jax.Array) evorl.envs.env.EnvState[source]¶
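For contrast with the Brax-style variant, the sketch below resets finished environments with fresh randomness. It assumes each environment carries its own legacy PRNG key of shape [2], which is split on every step. `toy_reset` and `autoreset_with_fresh_keys` are hypothetical names, and the real wrapper may manage its keys differently.

```python
import jax
import jax.numpy as jnp


def toy_reset(key):
    # Hypothetical env reset: a random starting observation in [0, 1).
    return jax.random.uniform(key, (2,))


def autoreset_with_fresh_keys(keys, next_obs, done):
    """keys: [B, 2] legacy PRNG keys, next_obs: [B, 2], done: [B]."""
    # Split each env's key: keep one for later steps, consume the other now.
    split = jax.vmap(jax.random.split)(keys)  # shape [B, 2, 2]
    new_keys, reset_keys = split[:, 0], split[:, 1]
    reset_obs = jax.vmap(toy_reset)(reset_keys)
    # Only finished envs take the freshly sampled observation.
    obs = jnp.where(done[:, None], reset_obs, next_obs)
    return new_keys, obs


keys = jax.random.split(jax.random.PRNGKey(0), 3)
next_obs = jnp.full((3, 2), 9.0)
done = jnp.array([True, False, True])
new_keys, obs = autoreset_with_fresh_keys(keys, next_obs, done)
```

Note that this sketch computes the reset for every environment on every step and then selects, which is the cost the fast variant avoids.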
- class evorl.envs.wrappers.training_wrapper.VmapEnvPoolAutoResetWrapper(env: evorl.envs.env.Env, num_envs: int = 1)[source]¶
Bases: evorl.envs.wrappers.wrapper.Wrapper
EnvPool style AutoReset.
When the episode ends, an additional reset step is performed. See EnvPool: https://envpool.readthedocs.io/en/latest/content/python_interface.html#auto-reset, and the Next-Step Mode in gymnasium: https://farama.org/Vector-Autoreset-Mode. This is helpful for algorithms that require n-step TD or GAE with partial episode bootstrapping (PEB) support on time-limited environments. When using this wrapper, remember to skip the invalid transitions via the mask autoreset.
- reset(key: chex.PRNGKey) evorl.envs.env.EnvState[source]¶
Reset the vmapped env.
- Parameters:
key – support batched keys [B,2] or single key [2]
- step(state: evorl.envs.env.EnvState, action: jax.Array) evorl.envs.env.EnvState[source]¶
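Skipping the invalid transitions typically amounts to masking them out of any average. The sketch below assumes the autoreset mask is available as an array holding 1 for the extra reset step and 0 otherwise. The function name `masked_mean` and the sample values are hypothetical.

```python
import jax.numpy as jnp


def masked_mean(values, autoreset):
    """Average `values` over valid transitions only.

    `autoreset` is assumed to be 1 where the step was the extra reset
    step, so those entries are excluded from the average.
    """
    valid = 1.0 - autoreset
    # Guard against division by zero when every transition is masked.
    return jnp.sum(values * valid) / jnp.maximum(jnp.sum(valid), 1.0)


td_errors = jnp.array([0.5, 1.5, 100.0, 1.0])
autoreset = jnp.array([0.0, 0.0, 1.0, 0.0])
loss = masked_mean(td_errors, autoreset)
```

The third entry belongs to the reset step, so its large value does not affect the result.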
- class evorl.envs.wrappers.training_wrapper.VmapWrapper(env: evorl.envs.env.Env, num_envs: int = 1, vmap_step: bool = False)[source]¶
Bases: evorl.envs.wrappers.wrapper.Wrapper
Vectorize env.
- reset(key: chex.PRNGKey) evorl.envs.env.EnvState[source]¶
Reset the vmapped env.
- Parameters:
key – support batched keys [B,2] or single key [2]
- step(state: evorl.envs.env.EnvState, action: jax.Array) evorl.envs.env.EnvState[source]¶
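One plausible way to accept either a single key of shape [2] or batched keys of shape [B,2] is to split the single key into one key per environment before vmapping the reset. The helper `batch_keys` and the environment `toy_reset` are hypothetical, and the sketch assumes legacy `uint32` keys as produced by `jax.random.PRNGKey`.

```python
import jax
import jax.numpy as jnp


def batch_keys(key, num_envs):
    """Hypothetical helper: return keys of shape [num_envs, 2]."""
    if key.ndim == 1:
        # Single legacy key of shape [2]: derive one key per env.
        return jax.random.split(key, num_envs)
    # Already batched: the leading axis must match the number of envs.
    assert key.shape[0] == num_envs
    return key


def toy_reset(key):
    # Hypothetical single-env reset returning a random observation.
    return jax.random.normal(key, (3,))


single = jax.random.PRNGKey(0)
obs_a = jax.vmap(toy_reset)(batch_keys(single, 4))
obs_b = jax.vmap(toy_reset)(batch_keys(jax.random.split(single, 4), 4))
```

Both calls yield a batch of four observations; they agree here because the batched keys were produced by the same split.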