# {py:mod}`evorl.algorithms.sac`

```{py:module} evorl.algorithms.sac
```

```{autodoc2-docstring} evorl.algorithms.sac
:parser: autodoc2_docstrings_parser
:allowtitles:
```

## Module Contents

### Classes

````{list-table}
:class: autosummary longtable
:align: left

* - {py:obj}`SACAgent <evorl.algorithms.sac.SACAgent>`
  -
* - {py:obj}`SACDiscreteAgent <evorl.algorithms.sac.SACDiscreteAgent>`
  -
* - {py:obj}`SACNetworkParams <evorl.algorithms.sac.SACNetworkParams>`
  -
* - {py:obj}`SACTrainMetric <evorl.algorithms.sac.SACTrainMetric>`
  -
* - {py:obj}`SACWorkflow <evorl.algorithms.sac.SACWorkflow>`
  -
````

### Functions

````{list-table}
:class: autosummary longtable
:align: left

* - {py:obj}`make_mlp_sac_agent <evorl.algorithms.sac.make_mlp_sac_agent>`
  - ```{autodoc2-docstring} evorl.algorithms.sac.make_mlp_sac_agent
    :parser: autodoc2_docstrings_parser
    :summary:
    ```
````

### API

`````{py:class} SACAgent
:canonical: evorl.algorithms.sac.SACAgent

Bases: {py:obj}`evorl.agent.Agent`

````{py:method} actor_loss(agent_state: evorl.agent.AgentState, sample_batch: evorl.sample_batch.SampleBatch, key: chex.PRNGKey) -> evorl.types.LossDict
:canonical: evorl.algorithms.sac.SACAgent.actor_loss

```{autodoc2-docstring} evorl.algorithms.sac.SACAgent.actor_loss
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} actor_network
:canonical: evorl.algorithms.sac.SACAgent.actor_network
:type: flax.linen.Module
:value: >
   None

```{autodoc2-docstring} evorl.algorithms.sac.SACAgent.actor_network
:parser: autodoc2_docstrings_parser
```

````

````{py:method} alpha_loss(agent_state: evorl.agent.AgentState, sample_batch: evorl.sample_batch.SampleBatch, key: chex.PRNGKey) -> evorl.types.LossDict
:canonical: evorl.algorithms.sac.SACAgent.alpha_loss

```{autodoc2-docstring} evorl.algorithms.sac.SACAgent.alpha_loss
:parser: autodoc2_docstrings_parser
```

````

````{py:method} compute_actions(agent_state: evorl.agent.AgentState, sample_batch: evorl.sample_batch.SampleBatch, key: chex.PRNGKey) -> tuple[evorl.types.Action, evorl.types.PolicyExtraInfo]
:canonical: evorl.algorithms.sac.SACAgent.compute_actions

````

````{py:method} critic_loss(agent_state: evorl.agent.AgentState, sample_batch: evorl.sample_batch.SampleBatch, key: chex.PRNGKey) -> evorl.types.LossDict
:canonical: evorl.algorithms.sac.SACAgent.critic_loss

```{autodoc2-docstring} evorl.algorithms.sac.SACAgent.critic_loss
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} critic_network
:canonical: evorl.algorithms.sac.SACAgent.critic_network
:type: flax.linen.Module
:value: >
   None

```{autodoc2-docstring} evorl.algorithms.sac.SACAgent.critic_network
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} discount
:canonical: evorl.algorithms.sac.SACAgent.discount
:type: float
:value: >
   0.99

```{autodoc2-docstring} evorl.algorithms.sac.SACAgent.discount
:parser: autodoc2_docstrings_parser
```

````

````{py:method} evaluate_actions(agent_state: evorl.agent.AgentState, sample_batch: evorl.sample_batch.SampleBatch, key: chex.PRNGKey) -> tuple[evorl.types.Action, evorl.types.PolicyExtraInfo]
:canonical: evorl.algorithms.sac.SACAgent.evaluate_actions

````

````{py:method} init(obs_space: evorl.envs.Space, action_space: evorl.envs.Space, key: chex.PRNGKey) -> evorl.agent.AgentState
:canonical: evorl.algorithms.sac.SACAgent.init

```{autodoc2-docstring} evorl.algorithms.sac.SACAgent.init
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} init_alpha
:canonical: evorl.algorithms.sac.SACAgent.init_alpha
:type: float
:value: >
   1.0

```{autodoc2-docstring} evorl.algorithms.sac.SACAgent.init_alpha
:parser: autodoc2_docstrings_parser
```

````

````{py:property} normalize_obs
:canonical: evorl.algorithms.sac.SACAgent.normalize_obs

```{autodoc2-docstring} evorl.algorithms.sac.SACAgent.normalize_obs
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} obs_preprocessor
:canonical: evorl.algorithms.sac.SACAgent.obs_preprocessor
:type: typing.Any
:value: >
   'pytree_field(...)'

```{autodoc2-docstring} evorl.algorithms.sac.SACAgent.obs_preprocessor
:parser: autodoc2_docstrings_parser
```

````

`````

`````{py:class} SACDiscreteAgent
:canonical: evorl.algorithms.sac.SACDiscreteAgent

Bases: {py:obj}`evorl.agent.Agent`

````{py:method} actor_loss(agent_state: evorl.agent.AgentState, sample_batch: evorl.sample_batch.SampleBatch, key: chex.PRNGKey) -> evorl.types.LossDict
:canonical: evorl.algorithms.sac.SACDiscreteAgent.actor_loss

```{autodoc2-docstring} evorl.algorithms.sac.SACDiscreteAgent.actor_loss
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} actor_network
:canonical: evorl.algorithms.sac.SACDiscreteAgent.actor_network
:type: flax.linen.Module
:value: >
   None

```{autodoc2-docstring} evorl.algorithms.sac.SACDiscreteAgent.actor_network
:parser: autodoc2_docstrings_parser
```

````

````{py:method} alpha_loss(agent_state: evorl.agent.AgentState, sample_batch: evorl.sample_batch.SampleBatch, key: chex.PRNGKey) -> evorl.types.LossDict
:canonical: evorl.algorithms.sac.SACDiscreteAgent.alpha_loss

```{autodoc2-docstring} evorl.algorithms.sac.SACDiscreteAgent.alpha_loss
:parser: autodoc2_docstrings_parser
```

````

````{py:method} compute_actions(agent_state: evorl.agent.AgentState, sample_batch: evorl.sample_batch.SampleBatch, key: chex.PRNGKey) -> tuple[evorl.types.Action, evorl.types.PolicyExtraInfo]
:canonical: evorl.algorithms.sac.SACDiscreteAgent.compute_actions

````

````{py:method} critic_loss(agent_state: evorl.agent.AgentState, sample_batch: evorl.sample_batch.SampleBatch, key: chex.PRNGKey) -> evorl.types.LossDict
:canonical: evorl.algorithms.sac.SACDiscreteAgent.critic_loss

```{autodoc2-docstring} evorl.algorithms.sac.SACDiscreteAgent.critic_loss
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} critic_network
:canonical: evorl.algorithms.sac.SACDiscreteAgent.critic_network
:type: flax.linen.Module
:value: >
   None

```{autodoc2-docstring} evorl.algorithms.sac.SACDiscreteAgent.critic_network
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} discount
:canonical: evorl.algorithms.sac.SACDiscreteAgent.discount
:type: float
:value: >
   0.99

```{autodoc2-docstring} evorl.algorithms.sac.SACDiscreteAgent.discount
:parser: autodoc2_docstrings_parser
```

````

````{py:method} evaluate_actions(agent_state: evorl.agent.AgentState, sample_batch: evorl.sample_batch.SampleBatch, key: chex.PRNGKey) -> tuple[evorl.types.Action, evorl.types.PolicyExtraInfo]
:canonical: evorl.algorithms.sac.SACDiscreteAgent.evaluate_actions

````

````{py:method} init(obs_space: evorl.envs.Space, action_space: evorl.envs.Space, key: chex.PRNGKey) -> evorl.agent.AgentState
:canonical: evorl.algorithms.sac.SACDiscreteAgent.init

```{autodoc2-docstring} evorl.algorithms.sac.SACDiscreteAgent.init
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} init_alpha
:canonical: evorl.algorithms.sac.SACDiscreteAgent.init_alpha
:type: float
:value: >
   1.0

```{autodoc2-docstring} evorl.algorithms.sac.SACDiscreteAgent.init_alpha
:parser: autodoc2_docstrings_parser
```

````

````{py:property} normalize_obs
:canonical: evorl.algorithms.sac.SACDiscreteAgent.normalize_obs

```{autodoc2-docstring} evorl.algorithms.sac.SACDiscreteAgent.normalize_obs
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} obs_preprocessor
:canonical: evorl.algorithms.sac.SACDiscreteAgent.obs_preprocessor
:type: typing.Any
:value: >
   'pytree_field(...)'

```{autodoc2-docstring} evorl.algorithms.sac.SACDiscreteAgent.obs_preprocessor
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} target_entropy_ratio
:canonical: evorl.algorithms.sac.SACDiscreteAgent.target_entropy_ratio
:type: float
:value: >
   0.98

```{autodoc2-docstring} evorl.algorithms.sac.SACDiscreteAgent.target_entropy_ratio
:parser: autodoc2_docstrings_parser
```

````

`````

`````{py:class} SACNetworkParams
:canonical: evorl.algorithms.sac.SACNetworkParams

Bases: {py:obj}`evorl.types.PyTreeData`

````{py:attribute} actor_params
:canonical: evorl.algorithms.sac.SACNetworkParams.actor_params
:type: evorl.types.Params
:value: >
   None

```{autodoc2-docstring} evorl.algorithms.sac.SACNetworkParams.actor_params
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} critic_params
:canonical: evorl.algorithms.sac.SACNetworkParams.critic_params
:type: evorl.types.Params
:value: >
   None

```{autodoc2-docstring} evorl.algorithms.sac.SACNetworkParams.critic_params
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} log_alpha
:canonical: evorl.algorithms.sac.SACNetworkParams.log_alpha
:type: evorl.types.Params
:value: >
   None

```{autodoc2-docstring} evorl.algorithms.sac.SACNetworkParams.log_alpha
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} target_critic_params
:canonical: evorl.algorithms.sac.SACNetworkParams.target_critic_params
:type: evorl.types.Params
:value: >
   None

```{autodoc2-docstring} evorl.algorithms.sac.SACNetworkParams.target_critic_params
:parser: autodoc2_docstrings_parser
```

````

`````

`````{py:class} SACTrainMetric
:canonical: evorl.algorithms.sac.SACTrainMetric

Bases: {py:obj}`evorl.metrics.MetricBase`

````{py:attribute} actor_loss
:canonical: evorl.algorithms.sac.SACTrainMetric.actor_loss
:type: chex.Array
:value: >
   None

```{autodoc2-docstring} evorl.algorithms.sac.SACTrainMetric.actor_loss
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} alpha_loss
:canonical: evorl.algorithms.sac.SACTrainMetric.alpha_loss
:type: chex.Array | None
:value: >
   None

```{autodoc2-docstring} evorl.algorithms.sac.SACTrainMetric.alpha_loss
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} critic_loss
:canonical: evorl.algorithms.sac.SACTrainMetric.critic_loss
:type: chex.Array
:value: >
   None

```{autodoc2-docstring} evorl.algorithms.sac.SACTrainMetric.critic_loss
:parser: autodoc2_docstrings_parser
```

````

````{py:attribute} raw_loss_dict
:canonical: evorl.algorithms.sac.SACTrainMetric.raw_loss_dict
:type: evorl.types.LossDict
:value: >
   'metric_field(...)'

```{autodoc2-docstring} evorl.algorithms.sac.SACTrainMetric.raw_loss_dict
:parser: autodoc2_docstrings_parser
```

````

`````

`````{py:class} SACWorkflow(env: evorl.envs.Env, agent: evorl.agent.Agent, optimizer: optax.GradientTransformation, evaluator: evorl.evaluators.Evaluator, replay_buffer: evorl.replay_buffers.AbstractReplayBuffer, config: omegaconf.DictConfig)
:canonical: evorl.algorithms.sac.SACWorkflow

Bases: {py:obj}`evorl.algorithms.offpolicy_utils.OffPolicyWorkflowTemplate`

````{py:method} name()
:canonical: evorl.algorithms.sac.SACWorkflow.name
:classmethod:

````

````{py:method} step(state: evorl.types.State) -> tuple[evorl.metrics.MetricBase, evorl.types.State]
:canonical: evorl.algorithms.sac.SACWorkflow.step

````

`````

````{py:function} make_mlp_sac_agent(action_space: evorl.envs.Space, num_critics: int = 2, critic_hidden_layer_sizes: tuple[int] = (256, 256), actor_hidden_layer_sizes: tuple[int] = (256, 256), init_alpha: float = 1.0, discount: float = 0.99, target_entropy_ratio: float = 0.98, normalize_obs: bool = False, policy_obs_key: str = '', value_obs_key: str = '')
:canonical: evorl.algorithms.sac.make_mlp_sac_agent

```{autodoc2-docstring} evorl.algorithms.sac.make_mlp_sac_agent
:parser: autodoc2_docstrings_parser
```
````
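
The defaults in the `make_mlp_sac_agent` signature (`num_critics=2`, `discount=0.99`, `init_alpha=1.0`, `target_entropy_ratio=0.98`) correspond to quantities in the usual soft actor-critic formulation. The sketch below shows how such hyperparameters are conventionally combined in discrete SAC. It is an illustration under stated assumptions, not evorl's implementation: the helper names `discrete_target_entropy`, `soft_td_target`, and `temperature_loss` are hypothetical, and the assumption that the target entropy is `ratio * log(num_actions)` comes from the common discrete-SAC convention rather than from this page.

```python
import math

# Defaults copied from the make_mlp_sac_agent signature documented above.
DISCOUNT = 0.99
INIT_ALPHA = 1.0
TARGET_ENTROPY_RATIO = 0.98


def discrete_target_entropy(num_actions, ratio=TARGET_ENTROPY_RATIO):
    """Target entropy as a fraction of the maximum entropy log(num_actions).

    Assumed convention; evorl may compute this differently.
    """
    return ratio * math.log(num_actions)


def soft_td_target(reward, done, next_q_values, next_log_prob, alpha,
                   discount=DISCOUNT):
    """Soft Bellman target using the minimum over the critic estimates.

    next_q_values holds one Q estimate per critic for the sampled next action.
    """
    min_q = min(next_q_values)
    soft_value = min_q - alpha * next_log_prob
    return reward + discount * (1.0 - done) * soft_value


def temperature_loss(log_alpha, entropy, target_entropy):
    """Temperature objective; negative when the policy entropy is below target,
    so gradient descent on log_alpha raises alpha in that case."""
    return log_alpha * (entropy - target_entropy)


# Usage with two critics and four discrete actions.
log_alpha = math.log(INIT_ALPHA)  # log(1.0) == 0.0
target_entropy = discrete_target_entropy(4)
td_target = soft_td_target(
    reward=1.0, done=0.0, next_q_values=[2.0, 1.5],
    next_log_prob=-1.0, alpha=INIT_ALPHA,
)
```

Storing `log_alpha` rather than `alpha`, as the `SACNetworkParams.log_alpha` field suggests, keeps the temperature positive without a constrained optimizer.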