tutorials.examples.train_box

The goal of this script is to train a GFlowNet on the Box environment using Cartesian per-dimension increments.

Example usage:

python train_box.py --delta 0.25 --tied --loss TB
python train_box.py --delta 0.1 --loss DB --n_components 5
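As a rough illustration of how the flags in the usage lines could be declared, here is a minimal argparse sketch. It is not the script's actual `parser`: the function name `build_parser_sketch` is hypothetical, the argument types are inferred from the example values, and defaults are deliberately left unset because this page does not document them.

```python
import argparse


def build_parser_sketch() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the flags shown in the usage lines."""
    parser = argparse.ArgumentParser(description="Sketch of the train_box.py CLI")
    # Types are inferred from the example values (assumption); no defaults are set.
    parser.add_argument("--delta", type=float, help="increment size (assumed float)")
    parser.add_argument("--tied", action="store_true", help="boolean switch (assumed)")
    parser.add_argument("--loss", type=str, help="loss name, e.g. TB or DB")
    parser.add_argument("--n_components", type=int, help="assumed integer count")
    return parser


# Parse the first usage line from a list instead of sys.argv.
args = build_parser_sketch().parse_args(["--delta", "0.25", "--tied", "--loss", "TB"])
print(args.delta, args.tied, args.loss, args.n_components)
```

Passing an explicit list to `parse_args` keeps the sketch runnable in a notebook or test without touching the real command line.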

Based on results from: [A theory of continuous generative flow networks](https://arxiv.org/abs/2301.12594)

Attributes

DEFAULT_SEED

parser

Functions

estimate_jsd(kde1, kde2)

Estimate the Jensen-Shannon divergence between two distributions defined by KDEs.

get_test_states([n, maxi])

Create a list of states from [0, 1]^2 by discretizing it into an n x n grid.

main(args)

plot_trajectories(env, sampler[, n_trajectories, ...])

Plot sampled trajectories on the Box environment.

sample_from_reward(env, n_samples)

Sample states from the true reward distribution.

Module Contents

tutorials.examples.train_box.DEFAULT_SEED: int = 4444
tutorials.examples.train_box.estimate_jsd(kde1, kde2)

Estimate the Jensen-Shannon divergence between two distributions defined by KDEs.

Returns:

A float value of the estimated JSD

Parameters:
  • kde1 (sklearn.neighbors.KernelDensity)

  • kde2 (sklearn.neighbors.KernelDensity)

Return type:

float
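To make the quantity concrete, the sketch below computes a discrete Jensen-Shannon divergence from two arrays of log-densities evaluated at the same points. It is an illustration under assumptions, not the function's actual implementation: the name `jsd_from_log_densities` is hypothetical, and the log-densities are taken as plain arrays. With scikit-learn they could be obtained from `KernelDensity.score_samples` on a shared set of test points.

```python
import numpy as np


def jsd_from_log_densities(log_p: np.ndarray, log_q: np.ndarray) -> float:
    """Discrete JSD between two densities evaluated on the same points."""
    # Turn each set of log-densities into a probability vector (stable normalization).
    p = np.exp(log_p - log_p.max())
    p /= p.sum()
    q = np.exp(log_q - log_q.max())
    q /= q.sum()
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a: np.ndarray, b: np.ndarray) -> float:
        mask = a > 0  # terms with a == 0 contribute 0 by convention
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    # JSD is the average KL divergence of each distribution to the mixture.
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy unnormalized log-densities on a 1D grid (stand-ins for KDE outputs).
xs = np.linspace(0.0, 1.0, 200)
log_p = -0.5 * ((xs - 0.3) / 0.1) ** 2
log_q = -0.5 * ((xs - 0.7) / 0.1) ** 2
jsd_same = jsd_from_log_densities(log_p, log_p)
jsd_diff = jsd_from_log_densities(log_p, log_q)
```

With natural logarithms the result lies between 0 (identical distributions) and log 2 (disjoint supports).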

tutorials.examples.train_box.get_test_states(n=100, maxi=1.0)

Create a list of states from [0, 1]^2 by discretizing it into an n x n grid.

Returns:

A numpy array of shape (n^2, 2) containing the test states.

Parameters:
  • n (int)

  • maxi (float)

Return type:

numpy.typing.NDArray[numpy.float64]
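One way to build such a grid with NumPy is sketched below. The name `grid_states_sketch` is hypothetical, and the choice of evenly spaced coordinates from 0 to `maxi` with both endpoints included is an assumption; the actual function may space or offset the points differently.

```python
import numpy as np


def grid_states_sketch(n: int = 100, maxi: float = 1.0) -> np.ndarray:
    """Return an (n * n, 2) array of grid points covering [0, maxi]^2."""
    # Assumption: evenly spaced coordinates, endpoints included.
    axis = np.linspace(0.0, maxi, n)
    xx, yy = np.meshgrid(axis, axis)
    # Flatten both coordinate grids and pair them up as (x, y) rows.
    return np.stack([xx.ravel(), yy.ravel()], axis=1)


states = grid_states_sketch(n=4)
print(states.shape)
```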

tutorials.examples.train_box.main(args)
Parameters:

args (argparse.Namespace)

Return type:

float

tutorials.examples.train_box.parser
tutorials.examples.train_box.plot_trajectories(env, sampler, n_trajectories=100, output_path=None, alpha=0.1)

Plot sampled trajectories on the Box environment.

Each trajectory is plotted as a line from s0 to the terminal state, with transparency to visualize overlapping paths.

Parameters:
  • env (gfn.gym.Box) – The Box environment.

  • sampler (gfn.samplers.Sampler) – The sampler to use for generating trajectories.

  • n_trajectories (int) – Number of trajectories to sample and plot.

  • output_path (Optional[str]) – Path to save the output plot. If None, defaults to EXAMPLES_OUTPUTS / 'train_box_trajectories.png'.

  • alpha (float) – Transparency for each trajectory line.

Return type:

None
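The overlapping-lines idea can be illustrated without the library. The sketch below draws synthetic paths with a low `alpha`, so regions visited by many paths appear darker. Everything in it is an assumption for illustration: the paths are random arrays rather than trajectories from a `Sampler`, s0 is taken to be (0, 0), and each path is drawn as a polyline through its intermediate states.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
n_trajectories, n_steps = 20, 5

# Synthetic stand-in: start at s0 = (0, 0) and add non-negative random
# increments per dimension, clipped to the unit square.
increments = rng.uniform(0.0, 0.25, size=(n_trajectories, n_steps, 2))
s0 = np.zeros((n_trajectories, 1, 2))
paths = np.concatenate([s0, np.cumsum(increments, axis=1)], axis=1)
paths = np.clip(paths, 0.0, 1.0)

fig, ax = plt.subplots(figsize=(5, 5))
for path in paths:
    # A low alpha lets overlapping paths accumulate into darker regions.
    ax.plot(path[:, 0], path[:, 1], color="tab:blue", alpha=0.1)
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
n_lines = len(ax.lines)
plt.close(fig)
```

To write the figure to disk, call `fig.savefig(...)` with a path of your choice before closing it.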

tutorials.examples.train_box.sample_from_reward(env, n_samples)

Sample states from the true reward distribution.

Implements rejection sampling with a uniform proposal distribution on [0, 1]^2.

Returns:

A numpy array of shape (n_samples, 2) containing the sampled states.

Parameters:
  • env (gfn.gym.Box)

  • n_samples (int)

Return type:

numpy.typing.NDArray[numpy.float64]
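Rejection sampling with a uniform proposal can be sketched in a few lines of NumPy. This is an illustration under assumptions rather than the function's implementation: `toy_reward` is a hypothetical reward standing in for the environment's, `rejection_sample_sketch` is a hypothetical name, and `max_reward` must be an upper bound on the reward for the accepted samples to follow the normalized reward.

```python
import numpy as np


def toy_reward(states: np.ndarray) -> np.ndarray:
    """Hypothetical reward: 1.0 near the corners of the unit square, 0.1 elsewhere."""
    near_corner = (np.abs(states - 0.5) > 0.25).all(axis=1)
    return 0.1 + 0.9 * near_corner


def rejection_sample_sketch(reward_fn, n_samples, max_reward, rng, batch=1000):
    """Draw n_samples points from [0, 1]^2 with density proportional to reward_fn."""
    accepted, count = [], 0
    while count < n_samples:
        # Proposal: uniform on the unit square.
        proposals = rng.uniform(0.0, 1.0, size=(batch, 2))
        # Accept each proposal x with probability R(x) / max_reward.
        u = rng.uniform(0.0, 1.0, size=batch)
        keep = proposals[u * max_reward <= reward_fn(proposals)]
        accepted.append(keep)
        count += len(keep)
    return np.concatenate(accepted)[:n_samples]


rng = np.random.default_rng(0)
samples = rejection_sample_sketch(toy_reward, 500, max_reward=1.0, rng=rng)
print(samples.shape)
```

Proposing in batches keeps the loop vectorized; the acceptance rate equals the mean reward divided by `max_reward`, so a loose bound costs extra proposals but does not bias the samples.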