tutorials.examples.train_box_legacy

This script reproduces some of the published results on the Box environment. Run one of the following commands to reproduce results from [A theory of continuous generative flow networks](https://arxiv.org/abs/2301.12594):

python train_box.py --delta {0.1, 0.25} --tied {--uniform_pb} --loss {TB, DB}
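To make the flags in the command above concrete, here is a minimal sketch of an `argparse` parser that would accept them. It is not the script's actual `parser` attribute: the helper name `build_parser`, the argument types, and the defaults are all assumptions chosen for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the flags shown in the command above."""
    parser = argparse.ArgumentParser(
        description="Train a GFlowNet on the Box environment (illustrative sketch)"
    )
    # The default values below are assumptions, not the script's real defaults.
    parser.add_argument("--delta", type=float, default=0.25)
    parser.add_argument("--tied", action="store_true")
    parser.add_argument("--uniform_pb", action="store_true")
    parser.add_argument("--loss", type=str, default="TB", choices=["TB", "DB"])
    return parser


# Equivalent to: python train_box.py --delta 0.1 --tied --loss TB
args = build_parser().parse_args(["--delta", "0.1", "--tied", "--loss", "TB"])
```

Here the braces in the command template are read as "pick one value" for `--delta` and `--loss`, and as "optional flag" for `--uniform_pb`.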

Attributes

DEFAULT_SEED

parser

Functions

estimate_jsd(kde1, kde2)

Estimate Jensen-Shannon divergence between two distributions defined by KDEs

get_test_states([n, maxi])

Create a list of states from [0, 1]^2 by discretizing it into n x n grid.

main(args)

sample_from_reward(env, n_samples)

Samples states from the true reward distribution

Module Contents

tutorials.examples.train_box_legacy.DEFAULT_SEED: int = 4444

tutorials.examples.train_box_legacy.estimate_jsd(kde1, kde2)

Estimate Jensen-Shannon divergence between two distributions defined by KDEs

Returns:

A float value of the estimated JSD

Parameters:
  • kde1 (sklearn.neighbors.KernelDensity)

  • kde2 (sklearn.neighbors.KernelDensity)

Return type:

float
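As an illustration of the quantity being estimated, the following dependency-free sketch computes the Jensen-Shannon divergence from two lists of log densities evaluated at the same grid points. With scikit-learn, such values could come from `KernelDensity.score_samples`, which returns log densities. The function name `jsd_from_log_densities` and the normalization over grid points are assumptions; the module's own `estimate_jsd` may differ in detail.

```python
import math


def jsd_from_log_densities(log_p, log_q):
    """Jensen-Shannon divergence (in nats) between two densities
    evaluated at the same grid points. Hypothetical helper."""

    def normalize(log_d):
        # Log-sum-exp normalization so the grid values sum to 1.
        m = max(log_d)
        log_z = m + math.log(sum(math.exp(v - m) for v in log_d))
        return [math.exp(v - log_z) for v in log_d]

    p, q = normalize(log_p), normalize(log_q)
    jsd = 0.0
    for pi, qi in zip(p, q):
        mi = 0.5 * (pi + qi)  # mixture distribution M = (P + Q) / 2
        if pi > 0.0:
            jsd += 0.5 * pi * math.log(pi / mi)  # contribution to KL(P || M) / 2
        if qi > 0.0:
            jsd += 0.5 * qi * math.log(qi / mi)  # contribution to KL(Q || M) / 2
    return jsd


neg_inf = float("-inf")
same = jsd_from_log_densities([0.0, -1.0, -2.0], [0.0, -1.0, -2.0])
disjoint = jsd_from_log_densities([0.0, neg_inf], [neg_inf, 0.0])
```

Identical inputs give a divergence of 0, and distributions with disjoint support give the maximum value, ln 2.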

tutorials.examples.train_box_legacy.get_test_states(n=100, maxi=1.0)

Create a list of states from [0, 1]^2 by discretizing it into n x n grid.

Returns:

A numpy array of shape (n^2, 2) containing the test states.

Parameters:
  • n (int)

  • maxi (float)

Return type:

numpy.typing.NDArray[numpy.float64]
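The discretization can be pictured with a small pure-Python sketch. It returns a list of (x, y) tuples rather than a numpy array, and it places the grid points evenly from 0 to `maxi` inclusive; both the helper name `grid_states` and the endpoint choice are assumptions, so the real function may space its points differently.

```python
def grid_states(n=100, maxi=1.0):
    """Hypothetical helper: n x n grid over [0, maxi]^2 as a flat list of (x, y)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Evenly spaced coordinates along one axis, endpoints included (assumption).
    step = maxi / (n - 1) if n > 1 else 0.0
    axis = [i * step for i in range(n)]
    # Cartesian product of the axis with itself, row by row.
    return [(x, y) for y in axis for x in axis]


states = grid_states(n=3)
```

With `n=3` this yields 9 states, matching the (n^2, 2) shape described above.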

tutorials.examples.train_box_legacy.main(args)

Parameters:

args (argparse.Namespace)

Return type:

float

tutorials.examples.train_box_legacy.parser

tutorials.examples.train_box_legacy.sample_from_reward(env, n_samples)

Samples states from the true reward distribution

Implements rejection sampling with a uniform proposal distribution over [0, 1]^2.

Returns:

A numpy array of shape (n_samples, 2) containing the sampled states

Parameters:
  • env

  • n_samples

Return type:

numpy.typing.NDArray[numpy.float64]
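Rejection sampling with a uniform proposal can be sketched as follows. This is a standalone illustration, not the module's code: `rejection_sample`, `toy_reward`, and the `max_reward` bound are hypothetical, and the toy reward stands in for whatever reward the Box environment actually defines.

```python
import random


def toy_reward(x, y):
    """Hypothetical reward: a small baseline plus a bonus in the upper-right quadrant."""
    return 0.1 + (1.0 if x > 0.5 and y > 0.5 else 0.0)


def rejection_sample(reward_fn, max_reward, n_samples, seed=4444):
    """Draw n_samples points from [0, 1]^2 with density proportional to reward_fn.

    max_reward must be an upper bound on reward_fn over the unit square.
    """
    rng = random.Random(seed)
    samples = []
    while len(samples) < n_samples:
        # Propose uniformly in the unit square.
        x, y = rng.random(), rng.random()
        # Accept with probability reward / max_reward.
        if rng.random() * max_reward < reward_fn(x, y):
            samples.append((x, y))
    return samples


samples = rejection_sample(toy_reward, max_reward=1.1, n_samples=50)
```

The tighter the bound `max_reward`, the fewer proposals are rejected; a loose bound remains correct but wastes samples.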