tutorials.examples.train_box
The goal of this script is to train a GFlowNet on the Box environment using Cartesian per-dimension increments.
- Example usage:
python train_box.py --delta 0.25 --tied --loss TB
python train_box.py --delta 0.1 --loss DB --n_components 5
Based on results from: [A theory of continuous generative flow networks](https://arxiv.org/abs/2301.12594)
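Attributes
- DEFAULT_SEED
- parser
Functions
- estimate_jsd(kde1, kde2): Estimate Jensen-Shannon divergence between two distributions defined by KDEs.
- get_test_states(n=100, maxi=1.0): Create a list of states from [0, 1]^2 by discretizing it into an n x n grid.
- main(args)
- plot_trajectories(env, sampler, n_trajectories=100, output_path=None, alpha=0.1): Plot sampled trajectories on the Box environment.
- sample_from_reward(env, n_samples): Samples states from the true reward distribution.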
Module Contents
- tutorials.examples.train_box.DEFAULT_SEED: int = 4444
- tutorials.examples.train_box.estimate_jsd(kde1, kde2)
Estimate Jensen-Shannon divergence between two distributions defined by KDEs.
- Returns:
A float value of the estimated JSD
- Parameters:
kde1 (sklearn.neighbors.KernelDensity)
kde2 (sklearn.neighbors.KernelDensity)
- Return type:
float
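The following is a minimal sketch of how a Jensen-Shannon divergence can be estimated from two log-density functions evaluated on a shared set of points. It is not the script's implementation. The names `jsd_from_log_density_fns` and `make_gaussian_log_density` are hypothetical, and analytic Gaussian log-densities stand in for fitted KDEs so the block runs with NumPy alone. With scikit-learn, `KernelDensity.score_samples` returns log-densities and could be passed as either callable.

```python
import numpy as np


def jsd_from_log_density_fns(log_p_fn, log_q_fn, points):
    """Estimate the JSD of two densities on a fixed set of evaluation points.

    Each callable maps an (m, d) array to m log-density values
    (the same contract as KernelDensity.score_samples).
    """
    log_p = log_p_fn(points)
    log_q = log_q_fn(points)
    # Normalize over the evaluation points to get discrete distributions.
    log_p = log_p - np.logaddexp.reduce(log_p)
    log_q = log_q - np.logaddexp.reduce(log_q)
    # Log of the mixture M = (P + Q) / 2.
    log_m = np.logaddexp(log_p, log_q) - np.log(2.0)
    kl_pm = np.sum(np.exp(log_p) * (log_p - log_m))
    kl_qm = np.sum(np.exp(log_q) * (log_q - log_m))
    return float(0.5 * (kl_pm + kl_qm))


def make_gaussian_log_density(mean, std):
    """Hypothetical stand-in for a fitted KDE: isotropic Gaussian log-density."""
    mean = np.asarray(mean, dtype=np.float64)

    def log_density(x):
        z = (x - mean) / std
        norm = x.shape[1] * np.log(std * np.sqrt(2.0 * np.pi))
        return -0.5 * np.sum(z**2, axis=1) - norm

    return log_density


# Evaluation points: a 50 x 50 grid on the unit square (an assumption).
axis = np.linspace(0.0, 1.0, 50)
points = np.stack(np.meshgrid(axis, axis), axis=-1).reshape(-1, 2)

log_p = make_gaussian_log_density([0.25, 0.25], 0.1)
log_q = make_gaussian_log_density([0.75, 0.75], 0.1)

jsd_same = jsd_from_log_density_fns(log_p, log_p, points)
jsd_diff = jsd_from_log_density_fns(log_p, log_q, points)
```

With natural logarithms the estimate lies between 0 (identical distributions) and log 2 (disjoint support), so `jsd_same` is close to 0 and `jsd_diff` approaches log 2 for the two well-separated Gaussians.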
- tutorials.examples.train_box.get_test_states(n=100, maxi=1.0)
Create a list of states from [0, 1]^2 by discretizing it into an n x n grid.
- Returns:
A numpy array of shape (n^2, 2) containing the test states.
- Parameters:
n (int)
maxi (float)
- Return type:
numpy.typing.NDArray[numpy.float64]
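As an illustration of the grid construction described above, here is a small sketch that builds an n x n grid and flattens it to shape (n^2, 2). The name `grid_states` is hypothetical, and the choice of including both endpoints 0 and `maxi` is an assumption; the script itself may place the grid points differently.

```python
import numpy as np


def grid_states(n=100, maxi=1.0):
    """Return the n * n points of a regular grid on [0, maxi]^2, one per row."""
    # Assumption: endpoints 0 and maxi are both included.
    axis = np.linspace(0.0, maxi, n)
    xx, yy = np.meshgrid(axis, axis)
    # Stack the two coordinate planes and flatten to (n * n, 2).
    return np.stack((xx, yy), axis=-1).reshape(-1, 2)


states = grid_states(n=3)
```

Each row of `states` is one (x, y) point, so the array can be passed directly to anything that scores a batch of 2D states.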
- tutorials.examples.train_box.main(args)
- Parameters:
args (argparse.Namespace)
- Return type:
float
- tutorials.examples.train_box.parser
- tutorials.examples.train_box.plot_trajectories(env, sampler, n_trajectories=100, output_path=None, alpha=0.1)
Plot sampled trajectories on the Box environment.
Each trajectory is plotted as a line from s0 to the terminal state, with transparency to visualize overlapping paths.
- Parameters:
env (gfn.gym.Box) – The Box environment.
sampler (gfn.samplers.Sampler) – The sampler to use for generating trajectories.
n_trajectories (int) – Number of trajectories to sample and plot.
output_path (Optional[str]) – Path to save the output plot. If None, defaults to EXAMPLES_OUTPUTS / 'train_box_trajectories.png'.
alpha (float) – Transparency for each trajectory line.
- Return type:
None
- tutorials.examples.train_box.sample_from_reward(env, n_samples)
Samples states from the true reward distribution.
Implements rejection sampling, with a uniform proposal distribution on [0, 1]^2.
- Returns:
A numpy array of shape (n_samples, 2) containing the sampled states.
- Parameters:
env (gfn.gym.Box)
n_samples (int)
- Return type:
numpy.typing.NDArray[numpy.float64]
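Rejection sampling with a uniform proposal can be sketched as follows. This is an illustrative version under stated assumptions, not the script's code: `toy_reward` is a hypothetical reward standing in for the Box environment's reward, `rejection_sample` and its `batch_size` parameter are invented for the example, and `max_reward` must be a known upper bound on the reward.

```python
import numpy as np


def toy_reward(x):
    """Hypothetical reward: 2.0 inside the central square of side 0.5, else 0.5."""
    inside = np.all(np.abs(x - 0.5) < 0.25, axis=1)
    return np.where(inside, 2.0, 0.5)


def rejection_sample(reward_fn, max_reward, n_samples, rng, batch_size=1024):
    """Draw n_samples points from the density proportional to reward_fn on [0, 1]^2."""
    accepted = []
    n_accepted = 0
    while n_accepted < n_samples:
        # Proposal: uniform on the unit square.
        proposals = rng.uniform(0.0, 1.0, size=(batch_size, 2))
        # Accept a proposal x with probability reward(x) / max_reward.
        thresholds = rng.uniform(0.0, max_reward, size=batch_size)
        keep = proposals[thresholds < reward_fn(proposals)]
        accepted.append(keep)
        n_accepted += len(keep)
    # Trim the surplus from the last batch.
    return np.concatenate(accepted, axis=0)[:n_samples]


rng = np.random.default_rng(4444)
samples = rejection_sample(toy_reward, max_reward=2.0, n_samples=2000, rng=rng)
frac_inside = np.mean(np.all(np.abs(samples - 0.5) < 0.25, axis=1))
```

Under this toy reward the central square carries 0.5 / 0.875 (about 57%) of the total reward mass, so `frac_inside` should land near that value. The tighter `max_reward` is to the true maximum, the fewer proposals are discarded.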