evorl.algorithms.ec.ec_agent
Module Contents
Classes
DeterministicECAgent | Deterministic Agent for continuous action space in [-1, 1]. |
ECNetworkParams | Contains training state for the learner. |
StochasticECAgent | Stochastic Agent. |
API
- class evorl.algorithms.ec.ec_agent.DeterministicECAgent
Bases: evorl.agent.Agent
Deterministic Agent for continuous action space in [-1, 1].
- compute_actions(agent_state: evorl.agent.AgentState, sample_batch: evorl.sample_batch.SampleBatch, key: chex.PRNGKey) -> tuple[evorl.types.Action, evorl.types.PolicyExtraInfo]
- evaluate_actions(agent_state: evorl.agent.AgentState, sample_batch: evorl.sample_batch.SampleBatch, key: chex.PRNGKey) -> tuple[evorl.types.Action, evorl.types.PolicyExtraInfo]
- init(obs_space: evorl.envs.Space, action_space: evorl.envs.Space, key: chex.PRNGKey) -> evorl.agent.AgentState
- property normalize_obs
- obs_preprocessor: Any
‘pytree_field(…)’
- policy_network: flax.linen.Module
None
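To illustrate what a deterministic policy bounded to [-1, 1] can look like, here is a minimal, framework-free sketch. It is not the evorl implementation: the helper names `init_linear_policy` and `deterministic_action` are hypothetical, and squashing with `tanh` is an assumption (a common way to bound actions), not something this page states about `DeterministicECAgent`.

```python
import math
import random


def init_linear_policy(obs_dim, action_dim, seed=0):
    """Create small random weights and zero biases for a one-layer policy (hypothetical helper)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.1) for _ in range(obs_dim)] for _ in range(action_dim)]
    biases = [0.0] * action_dim
    return {"weights": weights, "biases": biases}


def deterministic_action(params, obs):
    """Map an observation to an action; tanh keeps every component in [-1, 1]."""
    action = []
    for row, bias in zip(params["weights"], params["biases"]):
        pre_activation = sum(w * o for w, o in zip(row, obs)) + bias
        action.append(math.tanh(pre_activation))
    return action


params = init_linear_policy(obs_dim=3, action_dim=2, seed=0)
obs = [0.5, -1.0, 2.0]

# No randomness is involved after initialization, so repeated calls agree.
first = deterministic_action(params, obs)
second = deterministic_action(params, obs)
```

Because the mapping uses no random key, the same parameters and observation always yield the same action, which is the property the word "deterministic" refers to in this sketch.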
- class evorl.algorithms.ec.ec_agent.ECNetworkParams
Bases: evorl.types.PyTreeData
Contains training state for the learner.
- policy_params: evorl.types.Params
None
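As a rough picture of a parameter container with a single `policy_params` field, the sketch below uses a frozen standard-library dataclass. This is an assumption for illustration only: `NetworkParamsSketch` is a hypothetical stand-in, and the actual behavior of `evorl.types.PyTreeData` is not described on this page.

```python
from dataclasses import dataclass, replace
from typing import Any


@dataclass(frozen=True)
class NetworkParamsSketch:
    """Hypothetical stand-in for ECNetworkParams: holds only the policy parameters."""

    policy_params: Any


old = NetworkParamsSketch(
    policy_params={"dense": {"kernel": [0.1, -0.2], "bias": [0.0]}}
)

# A frozen record is never mutated in place; an update builds a new record.
new_policy = {"dense": {"kernel": [0.3, 0.4], "bias": [0.1]}}
new = replace(old, policy_params=new_policy)
```

Treating the container as immutable means an update produces a new object while the old one stays valid, a pattern that fits functional-style training loops.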
- class evorl.algorithms.ec.ec_agent.StochasticECAgent
Bases: evorl.agent.Agent
Stochastic Agent.
Supports a continuous action space in [-1, 1] via a TanhNormal distribution, or a discrete action space via a Softmax distribution.
- compute_actions(agent_state: evorl.agent.AgentState, sample_batch: evorl.sample_batch.SampleBatch, key: chex.PRNGKey) -> tuple[evorl.types.Action, evorl.types.PolicyExtraInfo]
- continuous_action: bool
None
- evaluate_actions(agent_state: evorl.agent.AgentState, sample_batch: evorl.sample_batch.SampleBatch, key: chex.PRNGKey) -> tuple[evorl.types.Action, evorl.types.PolicyExtraInfo]
- init(obs_space: evorl.envs.Space, action_space: evorl.envs.Space, key: chex.PRNGKey) -> evorl.agent.AgentState
- property normalize_obs
- obs_preprocessor: Any
‘pytree_field(…)’
- policy_network: flax.linen.Module
None
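The two distribution families named for the stochastic agent can be sketched without any framework. The code below is an illustrative, scalar, standard-library version under stated assumptions: the function names are hypothetical, and it does not reproduce how evorl builds or samples its distributions. A TanhNormal sample is a Gaussian sample passed through `tanh`, with the log-probability corrected by the change-of-variables term; a Softmax (categorical) sample draws an index with probability proportional to `exp(logit)`.

```python
import math
import random


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_det_tanh(u):
    """log|d tanh(u)/du| = log(1 - tanh(u)^2), in a stable closed form."""
    return 2.0 * (math.log(2.0) - u - softplus(-2.0 * u))


def sample_tanh_normal(mean, log_std, rng):
    """Sample a = tanh(u) with u ~ Normal(mean, std); return (a, log_prob(a))."""
    std = math.exp(log_std)
    u = rng.gauss(mean, std)
    # Gaussian log-density of the pre-squash sample u.
    log_prob_u = -0.5 * ((u - mean) / std) ** 2 - log_std - 0.5 * math.log(2.0 * math.pi)
    # Change of variables: subtract the log-Jacobian of the tanh squash.
    return math.tanh(u), log_prob_u - log_det_tanh(u)


def softmax(logits):
    """Convert logits to probabilities, shifting by the max for stability."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_categorical(logits, rng):
    """Sample an action index from softmax(logits); return (index, log_prob)."""
    probs = softmax(logits)
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, math.log(probs[index])


rng = random.Random(0)
cont_action, cont_log_prob = sample_tanh_normal(mean=0.3, log_std=-0.5, rng=rng)
logits = [1.0, 0.0, -1.0]
disc_action, disc_log_prob = sample_categorical(logits, rng)
```

The continuous branch always lands in [-1, 1] because of the `tanh` squash, while the discrete branch returns an integer index; a flag such as `continuous_action` would plausibly select between the two, though this page does not spell out that logic.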