tutorials.examples.train_line¶
Attributes¶
parser |
Classes¶
GaussianStepMLP | A deep neural network for the forward and backward policy. |
ScaledGaussianWithOptionalExit | Extends the Gaussian distribution by considering the step counter. |
StepEstimator | Estimator for PF and PB of the Line environment. |
Functions¶
main(args) | |
render(env[, validation_samples]) | Renders the reward distribution over the 1D env. |
train(gflownet, env[, seed, n_trajectories, ...]) | Trains a GFlowNet on the Line Environment. |
Module Contents¶
- class tutorials.examples.train_line.GaussianStepMLP(hidden_dim, n_hidden_layers, policy_std_min=0.1, policy_std_max=1)¶
Bases: gfn.utils.modules.MLP
A deep neural network for the forward and backward policy.
- Parameters:
hidden_dim (int)
n_hidden_layers (int)
policy_std_min (float)
policy_std_max (float)
- forward(preprocessed_states)¶
Calculates the Gaussian parameters, applying the bounds [policy_std_min, policy_std_max] to sigma.
- Parameters:
preprocessed_states (torch.Tensor) – a tensor of shape (*batch_shape, 2) containing the states.
- Return type:
torch.Tensor
Returns a tensor of shape (*batch_shape, 2) containing the mean and the bounded standard deviation (sigma) of the Gaussian distribution.
- input_dim = 2¶
- output_dim = 2¶
- policy_std_max = 1¶
- policy_std_min = 0.1¶
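The bound on sigma described in forward() can be illustrated with a minimal scalar sketch. It assumes the raw network output is squashed with a sigmoid and rescaled into [policy_std_min, policy_std_max]; the page does not state which squashing function the module uses, and bound_sigma is a hypothetical helper, not part of the module.

```python
import math


def bound_sigma(raw: float, std_min: float = 0.1, std_max: float = 1.0) -> float:
    """Map an unbounded value into [std_min, std_max] (assumed sigmoid rescaling)."""
    # Numerically stable sigmoid: never exponentiates a large positive number.
    if raw >= 0:
        squashed = 1.0 / (1.0 + math.exp(-raw))
    else:
        e = math.exp(raw)
        squashed = e / (1.0 + e)
    # Rescale from (0, 1) to the allowed standard-deviation range.
    return std_min + (std_max - std_min) * squashed


if __name__ == "__main__":
    for raw in (-10.0, 0.0, 10.0):
        print(raw, round(bound_sigma(raw), 4))
```

With the default bounds, a raw output of 0 maps to the midpoint 0.55, and extreme outputs saturate at the bounds instead of producing a degenerate or exploding sigma.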
- class tutorials.examples.train_line.ScaledGaussianWithOptionalExit(states, mus, scales, backward, n_steps=5)¶
Bases: torch.distributions.Distribution
Extends the Gaussian distribution by considering the step counter. When sampling, the step counter can be used to ensure the exit_action [inf, inf] is sampled.
- Parameters:
states (gfn.states.States)
mus (torch.Tensor)
scales (torch.Tensor)
backward (bool)
n_steps (int)
- backward¶
- dist¶
- exit_action¶
- idx_at_final_backward_step¶
- idx_at_final_forward_step¶
- log_prob(sampled_actions)¶
Computes log-probabilities, returning 0 for deterministic exit/BTS transitions.
- sample(sample_shape=())¶
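The interplay between the step counter, sample(), and log_prob() can be sketched in plain Python for a single scalar action. This is an illustration under stated assumptions, not the class's implementation: EXIT_ACTION, sample_action, and log_prob_action are hypothetical names, the exit is represented by a single infinite value, and only the forward direction is modelled.

```python
import math
import random

EXIT_ACTION = float("inf")  # hypothetical stand-in for the exit action


def sample_action(mu, sigma, step, n_steps, rng):
    """Sample a Gaussian step, or force the exit action at the final step."""
    if step == n_steps:
        return EXIT_ACTION  # deterministic: no randomness is involved
    return rng.gauss(mu, sigma)


def log_prob_action(action, mu, sigma, step, n_steps):
    """Log-density of an action; deterministic exits have log-probability 0."""
    if step == n_steps:
        return 0.0  # log(1): the exit is the only possible action here
    z = (action - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


if __name__ == "__main__":
    rng = random.Random(0)
    a = sample_action(0.0, 1.0, step=2, n_steps=5, rng=rng)
    print(a, log_prob_action(a, 0.0, 1.0, step=2, n_steps=5))
    print(sample_action(0.0, 1.0, step=5, n_steps=5, rng=rng))
```

Returning 0 for the forced transition keeps the trajectory log-probability finite, since evaluating a Gaussian density at an infinite action would otherwise yield negative infinity.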
- class tutorials.examples.train_line.StepEstimator(env, module, backward)¶
Bases: gfn.estimators.Estimator, gfn.estimators.PolicyMixin
Estimator for the forward policy (PF) and backward policy (PB) of the Line environment.
- Parameters:
env (gfn.gym.line.Line)
module (torch.nn.Module)
backward (bool)
- backward¶
- property expected_output_dim: int¶
Expected output dimension of the module.
- Returns:
The expected output dimension of the module, or None if the output dimension is not well-defined (e.g., when the output is a TensorDict for GraphActions).
- Return type:
int
- n_steps_per_trajectory¶
- to_probability_distribution(states, module_output, scale_factor=0)¶
Converts the output of the neural network to a probability distribution.
- Parameters:
states (gfn.states.States) – The states to use for the distribution.
module_output (torch.Tensor) – The output of the neural network as a tensor of shape (*batch_shape, output_dim).
scale_factor – The scale factor to use for the distribution.
- Return type:
torch.distributions.Distribution
Returns a distribution object.
- tutorials.examples.train_line.main(args)¶
- tutorials.examples.train_line.parser¶
- tutorials.examples.train_line.render(env, validation_samples=None)¶
Renders the reward distribution over the 1D env.
- tutorials.examples.train_line.train(gflownet, env, seed=4444, n_trajectories=3000000.0, batch_size=128, lr_base=0.001, gradient_clip_value=5, exploration_var_starting_val=2)¶
Trains a GFlowNet on the Line Environment.
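The default n_trajectories is written as the float 3000000.0 (that is, 3e6) and is combined with batch_size to determine the length of training. A minimal sketch of that arithmetic, assuming the number of optimizer steps is the trajectory budget floor-divided by the batch size (n_iterations is a hypothetical helper, and the page does not confirm how train() derives its loop length):

```python
def n_iterations(n_trajectories: float, batch_size: int) -> int:
    """Number of full batches that fit into the trajectory budget (assumed rule)."""
    return int(n_trajectories // batch_size)


if __name__ == "__main__":
    # Defaults taken from the train() signature above.
    print(n_iterations(3000000.0, 128))  # 23437
```

Under this assumption, the defaults correspond to 23437 batches of 128 trajectories each.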