gfn.gym.line
Classes
Line | Mixture of Gaussians Line environment.
Module Contents
- class gfn.gym.line.Line(mus, sigmas, init_value, n_sd=4.5, n_steps_per_trajectory=5, device='cpu', debug=False)
Bases:
gfn.env.Env
Mixture of Gaussians Line environment.
- Parameters:
mus (list)
sigmas (list)
init_value (float)
n_sd (float)
n_steps_per_trajectory (int)
device (Literal['cpu', 'cuda'] | torch.device)
debug (bool)
- mus
The means of the Gaussians.
- sigmas
The standard deviations of the Gaussians.
- n_sd
The number of standard deviations to consider for the bounds.
- n_steps_per_trajectory
The number of steps per trajectory.
- mixture
The mixture of Gaussians.
- init_value
The initial value of the state.
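The role of n_sd in the bounds can be sketched as follows. This is a minimal illustration, assuming the bounds extend n_sd times the largest standard deviation beyond the outermost means, and that init_value must lie strictly inside them. The helper line_bounds is hypothetical and is not part of gfn.

```python
def line_bounds(mus, sigmas, n_sd=4.5):
    """Return (lower, upper) bounds around the modes.

    Assumption: the bounds reach n_sd * max(sigmas) past the smallest
    and largest mean. The library may compute them differently.
    """
    if len(mus) != len(sigmas):
        raise ValueError("mus and sigmas must have the same length")
    widest = max(sigmas)
    return min(mus) - n_sd * widest, max(mus) + n_sd * widest


lower, upper = line_bounds(mus=[2.0, 5.0], sigmas=[0.5, 0.5])
init_value = 0.5

# Under the stated assumption, a usable starting point lies strictly inside.
starts_inside = lower < init_value < upper
```

With these example values the bounds come out at -0.25 and 7.25, so an init_value of 0.5 falls inside them.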
- backward_step(states, actions)
Performs a backward step in the environment.
- Parameters:
states (gfn.states.States) – The current states.
actions (gfn.actions.Actions) – The actions to take.
- Returns:
The previous states.
- Return type:
- init_value
- is_action_valid(states, actions, backward=False)
Checks if the actions are valid.
- Parameters:
states (gfn.states.States) – The current states.
actions (gfn.actions.Actions) – The actions to check.
backward (bool) – Whether to check for backward actions.
- Returns:
True if the actions are valid, False otherwise.
- Return type:
bool
- log_partition(condition=None)
Returns the log partition of the reward function.
- Return type:
torch.Tensor
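As a rough check on what a log partition means here, the sketch below assumes the reward is the unweighted sum of the normalized Gaussian densities. Under that assumption each component integrates to 1, so the partition function equals the number of modes and its log is log(len(mus)). The function names are hypothetical, and the sketch uses plain Python rather than the library's tensors.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def numeric_log_partition(mus, sigmas, n_sd=8.0, n_points=20001):
    """Trapezoidal estimate of log(integral of the summed densities)."""
    lo = min(mus) - n_sd * max(sigmas)
    hi = max(mus) + n_sd * max(sigmas)
    dx = (hi - lo) / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        x = lo + i * dx
        # End points get half weight in the trapezoidal rule.
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        total += weight * sum(gaussian_pdf(x, m, s) for m, s in zip(mus, sigmas))
    return math.log(total * dx)


mus, sigmas = [2.0, 5.0], [0.5, 0.5]
closed_form = math.log(len(mus))  # assumed closed form: log(number of modes)
numeric = numeric_log_partition(mus, sigmas)
```

The numerical estimate and the assumed closed form agree closely for this example. If the library truncates the reward at the bounds or weights the components, the true value would differ.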
- log_reward(final_states)
Computes the log reward of the environment.
- Parameters:
final_states (gfn.states.States) – The final states of the environment.
- Returns:
The log reward.
- Return type:
torch.Tensor
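One plausible way to compute such a log reward is a log-sum-exp over the per-component Gaussian log-densities. The sketch below is an illustration under that assumption, written in plain Python for scalar positions. The names gaussian_log_pdf and mixture_log_reward are hypothetical and do not come from gfn.

```python
import math


def gaussian_log_pdf(x, mu, sigma):
    """Log-density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def mixture_log_reward(x, mus, sigmas):
    """Log of the summed component densities, computed stably.

    Assumption: components are unweighted. Subtracting the largest
    log-density before exponentiating avoids underflow far from the modes.
    """
    log_probs = [gaussian_log_pdf(x, m, s) for m, s in zip(mus, sigmas)]
    peak = max(log_probs)
    return peak + math.log(sum(math.exp(lp - peak) for lp in log_probs))


mus, sigmas = [2.0, 5.0], [0.5, 0.5]
near_mode = mixture_log_reward(2.0, mus, sigmas)
far_away = mixture_log_reward(100.0, mus, sigmas)  # stays finite
```

Working in log space matters for the second call: exponentiating the densities directly at x = 100 would underflow to zero and make the logarithm undefined.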
- mixture
- mus
- n_sd = 4.5
- n_steps_per_trajectory = 5
- sigmas
- step(states, actions)
Performs a step in the environment.
- Parameters:
states (gfn.states.States) – The current states.
actions (gfn.actions.Actions) – The actions to take.
- Returns:
The next states.
- Return type:
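The relationship between step and backward_step can be pictured with a small stand-in. The sketch assumes a state is a pair (position, step count), that a forward action is added to the position, and that a backward step undoes exactly that. The real environment operates on batched gfn.states.States and gfn.actions.Actions objects, so the functions below are hypothetical simplifications, not the library's code.

```python
def step(state, action):
    """Move forward: shift the position by action and advance the counter."""
    position, n_steps = state
    return (position + action, n_steps + 1)


def backward_step(state, action):
    """Move backward: undo a forward step taken with the same action."""
    position, n_steps = state
    if n_steps == 0:
        # Assumption: no backward move exists from the initial state.
        raise ValueError("cannot step backward from the initial state")
    return (position - action, n_steps - 1)


s0 = (0.5, 0)                   # init_value = 0.5, no steps taken yet
s1 = step(s0, 1.25)             # position 1.75 after one step
back = backward_step(s1, 1.25)  # returns to the starting state
```

Under these assumptions, a forward step followed by a backward step with the same action returns the original state.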