# {py:mod}`evorl.utils.rl_toolkits`

```{py:module} evorl.utils.rl_toolkits
```

```{autodoc2-docstring} evorl.utils.rl_toolkits
:parser: autodoc2_docstrings_parser
:allowtitles:
```

## Module Contents

### Functions

````{list-table}
:class: autosummary longtable
:align: left

* - {py:obj}`approximate_kl`
  - ```{autodoc2-docstring} evorl.utils.rl_toolkits.approximate_kl
    :parser: autodoc2_docstrings_parser
    :summary:
    ```
* - {py:obj}`average_episode_discount_return`
  - ```{autodoc2-docstring} evorl.utils.rl_toolkits.average_episode_discount_return
    :parser: autodoc2_docstrings_parser
    :summary:
    ```
* - {py:obj}`compute_discount_return`
  - ```{autodoc2-docstring} evorl.utils.rl_toolkits.compute_discount_return
    :parser: autodoc2_docstrings_parser
    :summary:
    ```
* - {py:obj}`compute_episode_length`
  - ```{autodoc2-docstring} evorl.utils.rl_toolkits.compute_episode_length
    :parser: autodoc2_docstrings_parser
    :summary:
    ```
* - {py:obj}`compute_gae`
  - ```{autodoc2-docstring} evorl.utils.rl_toolkits.compute_gae
    :parser: autodoc2_docstrings_parser
    :summary:
    ```
* - {py:obj}`compute_gae_with_horizon`
  - ```{autodoc2-docstring} evorl.utils.rl_toolkits.compute_gae_with_horizon
    :parser: autodoc2_docstrings_parser
    :summary:
    ```
* - {py:obj}`flatten_pop_rollout_episode`
  - ```{autodoc2-docstring} evorl.utils.rl_toolkits.flatten_pop_rollout_episode
    :parser: autodoc2_docstrings_parser
    :summary:
    ```
* - {py:obj}`flatten_rollout_trajectory`
  - ```{autodoc2-docstring} evorl.utils.rl_toolkits.flatten_rollout_trajectory
    :parser: autodoc2_docstrings_parser
    :summary:
    ```
* - {py:obj}`fold_multi_steps`
  - ```{autodoc2-docstring} evorl.utils.rl_toolkits.fold_multi_steps
    :parser: autodoc2_docstrings_parser
    :summary:
    ```
* - {py:obj}`shuffle_sample_batch`
  - ```{autodoc2-docstring} evorl.utils.rl_toolkits.shuffle_sample_batch
    :parser: autodoc2_docstrings_parser
    :summary:
    ```
* - {py:obj}`soft_target_update`
  - ```{autodoc2-docstring} evorl.utils.rl_toolkits.soft_target_update
    :parser: autodoc2_docstrings_parser
    :summary:
    ```
````

### API

````{py:function} approximate_kl(logratio: jax.Array, mode='k3', axis=-1) -> jax.Array
:canonical: evorl.utils.rl_toolkits.approximate_kl

```{autodoc2-docstring} evorl.utils.rl_toolkits.approximate_kl
:parser: autodoc2_docstrings_parser
```
````

````{py:function} average_episode_discount_return(episode_discount_return: jax.Array, dones: jax.Array, dp_axis_name: str | None = None) -> jax.Array
:canonical: evorl.utils.rl_toolkits.average_episode_discount_return

```{autodoc2-docstring} evorl.utils.rl_toolkits.average_episode_discount_return
:parser: autodoc2_docstrings_parser
```
````

````{py:function} compute_discount_return(rewards: chex.Array, dones: chex.Array, discount: float = 1.0) -> chex.Array
:canonical: evorl.utils.rl_toolkits.compute_discount_return

```{autodoc2-docstring} evorl.utils.rl_toolkits.compute_discount_return
:parser: autodoc2_docstrings_parser
```
````

````{py:function} compute_episode_length(dones: chex.Array) -> chex.Array
:canonical: evorl.utils.rl_toolkits.compute_episode_length

```{autodoc2-docstring} evorl.utils.rl_toolkits.compute_episode_length
:parser: autodoc2_docstrings_parser
```
````

````{py:function} compute_gae(rewards: jax.Array, values: jax.Array, dones: jax.Array, terminations: jax.Array, gae_lambda: float = 1.0, discount: float = 0.99) -> tuple[jax.Array, jax.Array]
:canonical: evorl.utils.rl_toolkits.compute_gae

```{autodoc2-docstring} evorl.utils.rl_toolkits.compute_gae
:parser: autodoc2_docstrings_parser
```
````

````{py:function} compute_gae_with_horizon(rewards: jax.Array, values: jax.Array, dones: jax.Array, terminations: jax.Array, gae_horizon: int = 0, gae_lambda: float = 1.0, discount: float = 0.99) -> tuple[jax.Array, jax.Array]
:canonical: evorl.utils.rl_toolkits.compute_gae_with_horizon

```{autodoc2-docstring} evorl.utils.rl_toolkits.compute_gae_with_horizon
:parser: autodoc2_docstrings_parser
```
````

````{py:function} flatten_pop_rollout_episode(trajectory: evorl.sample_batch.SampleBatch)
:canonical: evorl.utils.rl_toolkits.flatten_pop_rollout_episode

```{autodoc2-docstring} evorl.utils.rl_toolkits.flatten_pop_rollout_episode
:parser: autodoc2_docstrings_parser
```
````

````{py:function} flatten_rollout_trajectory(trajectory: evorl.sample_batch.SampleBatch) -> evorl.sample_batch.SampleBatch
:canonical: evorl.utils.rl_toolkits.flatten_rollout_trajectory

```{autodoc2-docstring} evorl.utils.rl_toolkits.flatten_rollout_trajectory
:parser: autodoc2_docstrings_parser
```
````

````{py:function} fold_multi_steps(step_fn, num_steps)
:canonical: evorl.utils.rl_toolkits.fold_multi_steps

```{autodoc2-docstring} evorl.utils.rl_toolkits.fold_multi_steps
:parser: autodoc2_docstrings_parser
```
````

````{py:function} shuffle_sample_batch(sample_batch: evorl.sample_batch.SampleBatch, key: chex.PRNGKey)
:canonical: evorl.utils.rl_toolkits.shuffle_sample_batch

```{autodoc2-docstring} evorl.utils.rl_toolkits.shuffle_sample_batch
:parser: autodoc2_docstrings_parser
```
````

````{py:function} soft_target_update(target_params, source_params, tau: float)
:canonical: evorl.utils.rl_toolkits.soft_target_update

```{autodoc2-docstring} evorl.utils.rl_toolkits.soft_target_update
:parser: autodoc2_docstrings_parser
```
````
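
The `mode='k3'` default in the `approximate_kl` signature suggests the commonly used k1/k2/k3 family of sample-based KL estimators. The sketch below illustrates that family in JAX. It is not the `evorl` implementation: the helper name `approximate_kl_sketch`, the convention that `logratio = log p(x) - log q(x)` with `x` sampled from `q`, and the mean reduction over `axis` are all assumptions made for illustration.

```python
import jax.numpy as jnp


def approximate_kl_sketch(logratio, mode="k3", axis=-1):
    """Estimate KL(q || p) from per-sample log-ratios (hypothetical helper).

    Assumes logratio = log p(x) - log q(x) with x drawn from q.
    """
    if mode == "k1":
        # Unbiased but high variance; can be negative on a finite sample.
        per_sample = -logratio
    elif mode == "k2":
        # Biased, low variance, always non-negative.
        per_sample = 0.5 * jnp.square(logratio)
    elif mode == "k3":
        # Unbiased and non-negative: (r - 1) - log r, with r = exp(logratio).
        per_sample = jnp.expm1(logratio) - logratio
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # Assumed reduction: average the per-sample terms along `axis`.
    return jnp.mean(per_sample, axis=axis)


# Usage: two batches of two samples each; the result has shape (2,).
logratio = jnp.array([[0.0, 0.0], [0.1, -0.1]])
kl_k3 = approximate_kl_sketch(logratio, mode="k3")
```

With identical distributions (`logratio` all zeros) every estimator returns zero. The k3 form is often preferred in PPO-style diagnostics because it stays non-negative while remaining unbiased under the sampling assumption above.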