tutorials.examples.train_box_legacy
===================================

.. py:module:: tutorials.examples.train_box_legacy

.. autoapi-nested-parse::

   The goal of this script is to reproduce some of the published results on the Box
   environment. Run one of the following commands to reproduce some of the results in
   `A theory of continuous generative flow networks <https://arxiv.org/abs/2301.12594>`_::

      python train_box.py --delta {0.1, 0.25} --tied {--uniform_pb} --loss {TB, DB}


Attributes
----------

.. autoapisummary::

   tutorials.examples.train_box_legacy.DEFAULT_SEED
   tutorials.examples.train_box_legacy.parser


Functions
---------

.. autoapisummary::

   tutorials.examples.train_box_legacy.estimate_jsd
   tutorials.examples.train_box_legacy.get_test_states
   tutorials.examples.train_box_legacy.main
   tutorials.examples.train_box_legacy.sample_from_reward


Module Contents
---------------

.. py:data:: DEFAULT_SEED
   :type: int
   :value: 4444


.. py:function:: estimate_jsd(kde1, kde2)

   Estimate the Jensen-Shannon divergence between two distributions defined by KDEs.

   :returns: A float value of the estimated JSD.


.. py:function:: get_test_states(n = 100, maxi = 1.0)

   Create a list of states from [0, 1]^2 by discretizing it into an n x n grid.

   :returns: A numpy array of shape (n^2, 2) containing the test states.


.. py:function:: main(args)

.. py:data:: parser

.. py:function:: sample_from_reward(env, n_samples)

   Sample states from the true reward distribution.

   Implements rejection sampling, with a uniform proposal distribution over [0, 1]^2.

   :returns: A numpy array of shape (n_samples, 2) containing the sampled states.
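
The helpers documented above (grid construction, rejection sampling from the reward, and the JSD estimate) can be illustrated with a minimal NumPy sketch. It is not the module's implementation. The names ``grid_states``, ``toy_reward``, ``rejection_sample`` and ``jsd_on_grid`` are hypothetical, the reward is a made-up stand-in for the Box environment's reward, and the JSD is computed from log-density values on a shared grid rather than from KDE objects, since the source does not say which KDE API is used.

.. code-block:: python

   import numpy as np


   def grid_states(n=100, maxi=1.0):
       """Return an (n^2, 2) array of grid points covering [0, maxi]^2."""
       axis = np.linspace(0.0, maxi, n)
       xx, yy = np.meshgrid(axis, axis)
       return np.stack([xx.ravel(), yy.ravel()], axis=1)


   def toy_reward(states):
       """Hypothetical reward (an assumption, not the Box reward).

       Equals 3.0 when both coordinates are more than 0.25 away from 0.5,
       and 1.0 elsewhere.
       """
       far_from_center = np.all(np.abs(states - 0.5) > 0.25, axis=1)
       return 1.0 + 2.0 * far_from_center


   def rejection_sample(reward_fn, r_max, n_samples, rng):
       """Draw samples with density proportional to reward_fn on [0, 1]^2.

       Uses a uniform proposal; r_max must upper-bound reward_fn.
       """
       batches = []
       count = 0
       while count < n_samples:
           proposal = rng.uniform(0.0, 1.0, size=(n_samples, 2))
           threshold = rng.uniform(0.0, r_max, size=n_samples)
           # Accept a proposal with probability reward / r_max.
           accepted = proposal[threshold < reward_fn(proposal)]
           batches.append(accepted)
           count += len(accepted)
       return np.concatenate(batches)[:n_samples]


   def jsd_on_grid(log_p, log_q):
       """Jensen-Shannon divergence (in nats) from log-densities on one grid."""
       p = np.exp(log_p)
       q = np.exp(log_q)
       p = p / p.sum()
       q = q / q.sum()
       m = 0.5 * (p + q)

       def kl(a, b):
           # Terms with a == 0 contribute nothing to the KL sum.
           mask = a > 0
           return np.sum(a[mask] * np.log(a[mask] / b[mask]))

       return 0.5 * kl(p, m) + 0.5 * kl(q, m)

In a script like this one, the two log-density vectors would typically come from evaluating a KDE fitted on reward samples and a KDE fitted on model samples at the same test states. That step is left out here because it depends on the KDE library in use.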