tutorials.examples.train_line
=============================

.. py:module:: tutorials.examples.train_line


Attributes
----------

.. autoapisummary::

   tutorials.examples.train_line.parser


Classes
-------

.. autoapisummary::

   tutorials.examples.train_line.GaussianStepMLP
   tutorials.examples.train_line.ScaledGaussianWithOptionalExit
   tutorials.examples.train_line.StepEstimator


Functions
---------

.. autoapisummary::

   tutorials.examples.train_line.main
   tutorials.examples.train_line.render
   tutorials.examples.train_line.train


Module Contents
---------------

.. py:class:: GaussianStepMLP(hidden_dim, n_hidden_layers, policy_std_min = 0.1, policy_std_max = 1)

   Bases: :py:obj:`gfn.utils.modules.MLP`


   A deep neural network for the forward and backward policy.


   .. py:method:: forward(preprocessed_states)

      Calculate the Gaussian parameters, applying the bound to sigma.

      :param preprocessed_states: a tensor of shape (*batch_shape, 2) containing the states.

      Returns a tensor of shape (*batch_shape, 2) containing the mean and standard deviation (sigma) of the Gaussian distribution.


   .. py:attribute:: input_dim
      :value: 2


   .. py:attribute:: output_dim
      :value: 2


   .. py:attribute:: policy_std_max
      :value: 1


   .. py:attribute:: policy_std_min
      :value: 0.1


.. py:class:: ScaledGaussianWithOptionalExit(states, mus, scales, backward, n_steps = 5)

   Bases: :py:obj:`torch.distributions.Distribution`


   Extends the Gaussian distribution by considering the step counter. When sampling,
   the step counter can be used to ensure the `exit_action` [inf, inf] is sampled.


   .. py:attribute:: backward


   .. py:attribute:: dist


   .. py:attribute:: exit_action


   .. py:attribute:: idx_at_final_backward_step


   .. py:attribute:: idx_at_final_forward_step


   .. py:method:: log_prob(sampled_actions)

      Computes log-probabilities, returning 0 for deterministic exit/BTS transitions.


   .. py:method:: sample(sample_shape=())


.. py:class:: StepEstimator(env, module, backward)

   Bases: :py:obj:`gfn.estimators.Estimator`, :py:obj:`gfn.estimators.PolicyMixin`


   Estimator for PF and PB of the Line environment.


   .. py:attribute:: backward


   .. py:property:: expected_output_dim
      :type: int

      Expected output dimension of the module.

      :returns: The expected output dimension of the module, or None if the output dimension
                is not well-defined (e.g., when the output is a TensorDict for GraphActions).


   .. py:attribute:: n_steps_per_trajectory


   .. py:method:: to_probability_distribution(states, module_output, scale_factor=0)

      Converts the output of the neural network to a probability distribution.

      :param states: The states to use for the distribution.
      :param module_output: The output of the neural network as a tensor of shape (*batch_shape, output_dim).
      :param scale_factor: The scale factor to use for the distribution.

      Returns a distribution object.


.. py:function:: main(args)

.. py:data:: parser

.. py:function:: render(env, validation_samples=None)

   Renders the reward distribution over the 1D env.


.. py:function:: train(gflownet, env, seed=4444, n_trajectories=3000000.0, batch_size=128, lr_base=0.001, gradient_clip_value=5, exploration_var_starting_val=2)

   Trains a GFlowNet on the Line Environment.
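
The sigma bound mentioned for ``GaussianStepMLP.forward`` can be illustrated with a small sketch. It assumes the raw network output is squashed with a sigmoid and rescaled into ``[policy_std_min, policy_std_max]``; this page does not state the exact squashing function, so treat that choice as an assumption. The helper names ``bound_sigma`` and ``gaussian_params`` are hypothetical, and plain Python floats are used instead of tensors so the example runs without PyTorch.

```python
import math


def bound_sigma(raw, std_min=0.1, std_max=1.0):
    """Squash an unbounded value into [std_min, std_max].

    Assumption: a sigmoid is used for the squashing; it is chosen here
    for illustration only.
    """
    # Numerically stable sigmoid: never exponentiate a large positive number.
    if raw >= 0:
        s = 1.0 / (1.0 + math.exp(-raw))
    else:
        e = math.exp(raw)
        s = e / (1.0 + e)
    # Rescale from (0, 1) to (std_min, std_max).
    return std_min + (std_max - std_min) * s


def gaussian_params(raw_output, std_min=0.1, std_max=1.0):
    """Map a raw (mean, sigma) pair to (mean, bounded sigma).

    The mean is passed through unchanged; only sigma is bounded.
    """
    raw_mean, raw_sigma = raw_output
    return raw_mean, bound_sigma(raw_sigma, std_min, std_max)
```

With the default bounds of 0.1 and 1, a raw sigma of 0 maps to the midpoint 0.55, and very large or very small raw values saturate at the upper and lower bound respectively, which keeps the policy from collapsing to a near-deterministic or overly diffuse Gaussian.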