tutorials.examples.train_line¶

Attributes¶

parser

Classes¶

`GaussianStepMLP`	A deep neural network for the forward and backward policy.
`ScaledGaussianWithOptionalExit`	Extends a Gaussian distribution by taking the step counter into account.
`StepEstimator`	Estimator for PF and PB of the Line environment.

Functions¶

`main`(args)
`render`(env[, validation_samples])	Renders the reward distribution over the 1D env.
`train`(gflownet, env[, seed, n_trajectories, ...])	Trains a GFlowNet on the Line Environment.

Module Contents¶

class tutorials.examples.train_line.GaussianStepMLP(hidden_dim, n_hidden_layers, policy_std_min=0.1, policy_std_max=1)¶

Bases: gfn.utils.modules.MLP

A deep neural network for the forward and backward policy.

Parameters:

hidden_dim (int)
n_hidden_layers (int)
policy_std_min (float)
policy_std_max (float)

forward(preprocessed_states)¶

Calculates the Gaussian parameters, applying the bounds to sigma.

Parameters: preprocessed_states (torch.Tensor) – a tensor of shape (*batch_shape, 2) containing the states.
Return type: torch.Tensor

Returns a tensor of shape (*batch_shape, 2) containing the mean and standard deviation (sigma) of the Gaussian distribution.
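The bound on sigma can be pictured as a squashing step applied to the second output channel. The sketch below is plain Python, not the module's actual implementation: the sigmoid mapping and the helper names `bound_std` and `bounded_forward` are assumptions made for illustration, and the default bounds are the `policy_std_min` and `policy_std_max` values listed on this page.

```python
import math


def _sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def bound_std(raw: float, policy_std_min: float = 0.1, policy_std_max: float = 1.0) -> float:
    """Map an unbounded value into [policy_std_min, policy_std_max].

    Hypothetical helper: assumes sigmoid squashing, which this page does not confirm.
    """
    span = policy_std_max - policy_std_min
    return _sigmoid(raw) * span + policy_std_min


def bounded_forward(raw_outputs):
    """Leave each mean untouched and bound the second entry of each pair."""
    return [(mean, bound_std(raw_std)) for mean, raw_std in raw_outputs]


# Three (mean, raw_std) pairs: very negative, zero, and very positive raw values.
params = bounded_forward([(0.3, -50.0), (-1.2, 0.0), (2.0, 50.0)])
```

Under this assumption, a raw value of 0 lands at the midpoint of the interval, and extreme raw values saturate at the two bounds instead of producing a degenerate or exploding sigma.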

input_dim = 2¶

output_dim = 2¶

policy_std_max = 1¶

policy_std_min = 0.1¶

class tutorials.examples.train_line.ScaledGaussianWithOptionalExit(states, mus, scales, backward, n_steps=5)¶

Bases: torch.distributions.Distribution

Extends a Gaussian distribution by taking the step counter into account. When sampling, the step counter can be used to ensure that the exit_action [inf, inf] is sampled.

Parameters:

states (gfn.states.States)
mus (torch.Tensor)
scales (torch.Tensor)
backward (bool)
n_steps (int)

backward¶

dist¶

exit_action¶

idx_at_final_backward_step¶

idx_at_final_forward_step¶

log_prob(sampled_actions)¶

Computes log-probabilities, returning 0 for deterministic exit/BTS transitions.
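A deterministic transition has probability 1, so its log-probability is log(1) = 0. The following plain-Python sketch illustrates that rule for a single scalar action. The helper names, the `forced` flag, and the use of `inf` as the exit sentinel are assumptions for illustration, not the class's real interface.

```python
import math

EXIT_ACTION = float("inf")  # assumed sentinel for the exit action


def gaussian_log_prob(x: float, mu: float, sigma: float) -> float:
    """Log-density of Normal(mu, sigma) evaluated at x."""
    return (
        -0.5 * ((x - mu) / sigma) ** 2
        - math.log(sigma)
        - 0.5 * math.log(2.0 * math.pi)
    )


def log_prob_with_exit(action: float, mu: float, sigma: float, forced: bool = False) -> float:
    """Return 0 for a deterministic transition, else the Gaussian log-density.

    Hypothetical helper: `forced` marks a transition the sampler had no choice about.
    """
    if forced or math.isinf(action):
        return 0.0  # probability 1, hence log-probability 0
    return gaussian_log_prob(action, mu, sigma)


lp_exit = log_prob_with_exit(EXIT_ACTION, mu=0.0, sigma=1.0)
lp_regular = log_prob_with_exit(0.0, mu=0.0, sigma=1.0)
```

Returning 0 instead of evaluating the density at `inf` avoids `-inf` or `nan` terms when log-probabilities are summed along a trajectory.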

sample(sample_shape=())¶
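As a rough illustration of how a step counter can force the exit action, the sketch below samples scalar actions from a Gaussian until the counter reaches `n_steps`, then returns the exit sentinel. It is a stand-alone approximation: the function name `sample_action`, the scalar `inf` sentinel, and the exact condition `step >= n_steps` are assumptions, not the behavior documented for `sample`.

```python
import random

EXIT_ACTION = float("inf")  # assumed sentinel for the exit action


def sample_action(mu, sigma, step, n_steps=5, rng=None):
    """Sample a Gaussian action, or the exit action once the step budget is used up.

    Hypothetical helper mirroring the idea of a step-aware distribution.
    """
    if rng is None:
        rng = random.Random()
    if step >= n_steps:
        return EXIT_ACTION  # no randomness left: the trajectory must terminate
    return rng.gauss(mu, sigma)


# Roll out one trajectory of 5 regular steps followed by the forced exit.
rng = random.Random(0)
trajectory = [sample_action(0.5, 0.2, step, n_steps=5, rng=rng) for step in range(6)]
```

With this scheme every trajectory has the same length, which is consistent with the `n_steps` default of 5 in the constructor signature above.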

class tutorials.examples.train_line.StepEstimator(env, module, backward)¶

Bases: gfn.estimators.Estimator, gfn.estimators.PolicyMixin

Estimator for PF and PB of the Line environment.

Parameters:

env (gfn.gym.line.Line)
module (torch.nn.Module)
backward (bool)

backward¶

property expected_output_dim: int¶

Expected output dimension of the module.

Returns: The expected output dimension of the module, or None if the output dimension is not well-defined (e.g., when the output is a TensorDict for GraphActions).
Return type: int

n_steps_per_trajectory¶

to_probability_distribution(states, module_output, scale_factor=0)¶

Converts the output of the neural network to a probability distribution.

Parameters:

states (gfn.states.States) – The states to use for the distribution.
module_output (torch.Tensor) – The output of the neural network as a tensor of shape (*batch_shape, output_dim).
scale_factor – The scale factor to use for the distribution.

Return type:

torch.distributions.Distribution

Returns a distribution object.
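One plausible reading of this method is that the last dimension of `module_output` is split into a mean and a standard deviation, with `scale_factor` widening the distribution for exploration. The sketch below uses `statistics.NormalDist` from the standard library in place of a torch distribution. The additive use of `scale_factor` and the helper name `to_normal_dists` are assumptions that this page does not confirm.

```python
from statistics import NormalDist


def to_normal_dists(module_output, scale_factor=0.0):
    """Turn (mean, std) pairs into NormalDist objects.

    Hypothetical helper: assumes scale_factor is added to the standard deviation.
    """
    dists = []
    for mean, std in module_output:
        dists.append(NormalDist(mu=mean, sigma=std + scale_factor))
    return dists


# Two states, each with a (mean, std) pair produced by the policy network.
dists = to_normal_dists([(0.0, 0.1), (1.5, 0.4)], scale_factor=0.5)
```

With the default `scale_factor=0`, the distribution is determined by the network output alone; a positive value flattens it without changing the mean.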

tutorials.examples.train_line.main(args)¶

tutorials.examples.train_line.parser¶

tutorials.examples.train_line.render(env, validation_samples=None)¶

Renders the reward distribution over the 1D env.

tutorials.examples.train_line.train(gflownet, env, seed=4444, n_trajectories=3000000.0, batch_size=128, lr_base=0.001, gradient_clip_value=5, exploration_var_starting_val=2)¶

Trains a GFlowNet on the Line Environment.
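To get a feel for the default arguments, the sketch below works out how many optimizer steps the defaults would imply and what a decaying exploration schedule might look like. Both helpers are hypothetical: the page does not state how `n_trajectories` and `batch_size` are combined, nor how `exploration_var_starting_val` is annealed, so the integer division and the linear decay to zero are assumptions.

```python
def n_iterations(n_trajectories: float, batch_size: int) -> int:
    """Number of full batches, assuming one batch of trajectories per iteration."""
    return int(n_trajectories // batch_size)


def linear_decay(start: float, n_steps: int):
    """Linearly interpolate from `start` down to 0 over `n_steps` values (assumed schedule)."""
    if n_steps <= 1:
        return [start]
    return [start * (1 - i / (n_steps - 1)) for i in range(n_steps)]


iterations = n_iterations(3_000_000.0, 128)  # defaults from the signature above
schedule = linear_decay(2.0, 5)              # short schedule for readability
```

Under these assumptions, the defaults correspond to 23437 iterations, and the five-point schedule reads `[2.0, 1.5, 1.0, 0.5, 0.0]`.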
