tutorials.examples.train_box_legacy
This script reproduces some of the published results on the Box environment. Run one of the following commands to reproduce results from [A theory of continuous generative flow networks](https://arxiv.org/abs/2301.12594):
python train_box.py --delta {0.1, 0.25} --tied {--uniform_pb} --loss {TB, DB}
Attributes
- DEFAULT_SEED
- parser
Functions
estimate_jsd(kde1, kde2) | Estimate Jensen-Shannon divergence between two distributions defined by KDEs
get_test_states(n, maxi) | Create a list of states from [0, 1]^2 by discretizing it into an n x n grid.
main(args) |
sample_from_reward(env, n_samples) | Samples states from the true reward distribution
Module Contents
- tutorials.examples.train_box_legacy.DEFAULT_SEED: int = 4444
- tutorials.examples.train_box_legacy.estimate_jsd(kde1, kde2)
Estimate Jensen-Shannon divergence between two distributions defined by KDEs
- Returns:
A float value of the estimated JSD
- Parameters:
kde1 (sklearn.neighbors.KernelDensity)
kde2 (sklearn.neighbors.KernelDensity)
- Return type:
float
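As a rough illustration of the estimate described above, the following sketch evaluates both density models on a shared set of points, normalizes the resulting log-densities into discrete distributions, and averages the two KL divergences to the mixture. The name `estimate_jsd_sketch`, the helper `_log_normalize`, and the explicit `test_states` argument are assumptions made for this example, not the script's actual code. The only method relied on is `score_samples`, which `sklearn.neighbors.KernelDensity` provides.

```python
import numpy as np


def _log_normalize(log_dens):
    """Shift log-densities so that their exponentials sum to 1."""
    m = np.max(log_dens)
    return log_dens - (m + np.log(np.sum(np.exp(log_dens - m))))


def estimate_jsd_sketch(kde1, kde2, test_states):
    """Estimate the JSD of two fitted density models on shared points.

    kde1 and kde2 only need a score_samples(X) method that returns
    log-densities (as sklearn.neighbors.KernelDensity does).
    test_states is a hypothetical argument: an array of evaluation points.
    """
    log_p = _log_normalize(kde1.score_samples(test_states))
    log_q = _log_normalize(kde2.score_samples(test_states))
    # Log of the mixture M = (P + Q) / 2, computed without leaving log space.
    log_m = np.logaddexp(log_p, log_q) - np.log(2.0)
    kl_pm = np.sum(np.exp(log_p) * (log_p - log_m))
    kl_qm = np.sum(np.exp(log_q) * (log_q - log_m))
    return float(0.5 * (kl_pm + kl_qm))
```

With natural logarithms the result lies between 0 (identical distributions) and log 2.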
- tutorials.examples.train_box_legacy.get_test_states(n=100, maxi=1.0)
Create a list of states from [0, 1]^2 by discretizing it into an n x n grid.
- Returns:
A numpy array of shape (n^2, 2) containing the test states
- Parameters:
n (int)
maxi (float)
- Return type:
numpy.typing.NDArray[numpy.float64]
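One way to build such a grid is sketched below, assuming the points are evenly spaced and include both endpoints of [0, maxi]; the script's own endpoints may differ. The name `get_test_states_sketch` is hypothetical.

```python
import numpy as np


def get_test_states_sketch(n=100, maxi=1.0):
    """Return an (n * n, 2) array of grid points covering [0, maxi]^2."""
    # Assumption: evenly spaced coordinates that include 0 and maxi.
    coords = np.linspace(0.0, maxi, n)
    xx, yy = np.meshgrid(coords, coords)
    # Pair up the coordinates and flatten the grid into a list of 2D states.
    return np.stack([xx, yy], axis=-1).reshape(-1, 2)
```

For example, `get_test_states_sketch(n=3)` yields 9 states, one for each combination of the coordinates 0.0, 0.5, and 1.0.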
- tutorials.examples.train_box_legacy.main(args)
- Parameters:
args (argparse.Namespace)
- Return type:
float
- tutorials.examples.train_box_legacy.parser
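The page does not document the parser's options, so the following is only a sketch of how the flags named in the command line above could be declared with `argparse`. The flag names come from that command; the types, defaults, help strings, and the function name `build_parser_sketch` are assumptions.

```python
import argparse


def build_parser_sketch():
    """Build a parser for the flags shown in the example command."""
    parser = argparse.ArgumentParser(description="Box environment training (sketch)")
    # Assumption: delta is a float; 0.25 is an illustrative default.
    parser.add_argument("--delta", type=float, default=0.25)
    # Assumption: --tied and --uniform_pb are boolean switches.
    parser.add_argument("--tied", action="store_true")
    parser.add_argument("--uniform_pb", action="store_true")
    # Only the two losses named in the command are listed here.
    parser.add_argument("--loss", choices=["TB", "DB"], default="TB")
    return parser
```

Under these assumptions, `build_parser_sketch().parse_args(["--delta", "0.1", "--tied", "--loss", "DB"])` produces a namespace that could be passed to a function with the signature of `main(args)`.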
- tutorials.examples.train_box_legacy.sample_from_reward(env, n_samples)
Samples states from the true reward distribution
Implements rejection sampling, using the uniform distribution over [0, 1]^2 as the proposal.
- Returns:
A numpy array of shape (n_samples, 2) containing the sampled states
- Parameters:
env (gfn.gym.BoxPolar)
n_samples (int)
- Return type:
numpy.typing.NDArray[numpy.float64]
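Rejection sampling with a uniform proposal can be sketched as follows. This example does not use `gfn.gym.BoxPolar`: `reward_fn` and `max_reward` are hypothetical stand-ins for the environment's reward and an upper bound on it, and `rejection_sample_sketch`, `rng`, and `batch_size` are likewise names introduced for illustration.

```python
import numpy as np


def rejection_sample_sketch(reward_fn, max_reward, n_samples, rng=None, batch_size=1024):
    """Draw n_samples points in [0, 1]^2 with density proportional to reward_fn.

    reward_fn maps an (m, 2) array to m non-negative rewards, and
    max_reward must be an upper bound on reward_fn over the unit square.
    The loop does not terminate if the reward is zero everywhere.
    """
    rng = np.random.default_rng() if rng is None else rng
    accepted, n_accepted = [], 0
    while n_accepted < n_samples:
        # Propose uniformly, then accept each point with probability R(x) / max_reward.
        proposals = rng.uniform(0.0, 1.0, size=(batch_size, 2))
        u = rng.uniform(0.0, 1.0, size=batch_size)
        keep = proposals[u * max_reward < reward_fn(proposals)]
        accepted.append(keep)
        n_accepted += len(keep)
    return np.concatenate(accepted, axis=0)[:n_samples]
```

A tighter `max_reward` raises the acceptance rate; an underestimate would bias the samples, so the bound should be chosen conservatively.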