tutorials.examples.train_box_legacy
===================================

.. py:module:: tutorials.examples.train_box_legacy

.. autoapi-nested-parse::

   This script reproduces some of the published results on the Box environment from
   `A theory of continuous generative flow networks <https://arxiv.org/abs/2301.12594>`_.
   To do so, run one of the following command variants::

      python train_box.py --delta {0.1, 0.25} --tied {--uniform_pb} --loss {TB, DB}


Attributes
----------

.. autoapisummary::

   tutorials.examples.train_box_legacy.DEFAULT_SEED
   tutorials.examples.train_box_legacy.parser


Functions
---------

.. autoapisummary::

   tutorials.examples.train_box_legacy.estimate_jsd
   tutorials.examples.train_box_legacy.get_test_states
   tutorials.examples.train_box_legacy.main
   tutorials.examples.train_box_legacy.sample_from_reward


Module Contents
---------------

.. py:data:: DEFAULT_SEED
   :type:  int
   :value: 4444


.. py:function:: estimate_jsd(kde1, kde2)

   Estimate the Jensen-Shannon divergence (JSD) between two distributions defined by KDEs.

   :returns: A float value of the estimated JSD
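
   To illustrate the quantity being estimated, here is a minimal sketch, not this
   module's implementation. It assumes both densities have already been evaluated as
   log-densities on a shared set of points (for a KDE object this might come from a
   ``score_samples``-style method, which is an assumption here).
   ``jsd_from_log_densities`` is a hypothetical helper name.

   .. code-block:: python

      import numpy as np


      def jsd_from_log_densities(log_p, log_q):
          """Jensen-Shannon divergence of two densities given on the same points."""
          # Normalize each set of log-densities into a discrete distribution.
          log_p = log_p - np.logaddexp.reduce(log_p)
          log_q = log_q - np.logaddexp.reduce(log_q)
          # Log of the mixture m = (p + q) / 2, computed stably in log space.
          log_m = np.logaddexp(log_p, log_q) - np.log(2.0)
          # JSD = (KL(p || m) + KL(q || m)) / 2
          kl_pm = np.sum(np.exp(log_p) * (log_p - log_m))
          kl_qm = np.sum(np.exp(log_q) * (log_q - log_m))
          return 0.5 * (kl_pm + kl_qm)


      # Toy inputs: unnormalized Gaussian log-densities on a 1D grid.
      x = np.linspace(-5.0, 5.0, 201)
      log_p = -0.5 * (x - 1.0) ** 2
      log_q = -0.5 * (x + 1.0) ** 2
      print(jsd_from_log_densities(log_p, log_p))  # zero up to floating-point error
      print(jsd_from_log_densities(log_p, log_q))  # strictly between 0 and log(2)

   With natural logarithms the JSD is symmetric and bounded above by ``log(2)``,
   which makes it convenient for comparing a learned sampler against the target.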


.. py:function:: get_test_states(n = 100, maxi = 1.0)

   Create an array of test states by discretizing [0, 1]^2 into an n x n grid.

   :returns: A numpy array of shape (n^2, 2) containing the test states.
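
   The discretization described above can be sketched as follows. This is an
   illustration under assumptions, not the function's actual body:
   ``make_grid_states`` is a hypothetical name, and both the inclusion of the
   endpoints and the use of ``maxi`` as the upper bound of each axis are guesses.

   .. code-block:: python

      import numpy as np


      def make_grid_states(n=100, maxi=1.0):
          """Return the n * n points of a regular grid on [0, maxi]^2."""
          # n evenly spaced coordinates per axis (endpoints included by assumption).
          axis = np.linspace(0.0, maxi, n)
          xx, yy = np.meshgrid(axis, axis)
          # Flatten the two (n, n) coordinate arrays into one (n * n, 2) array.
          return np.stack([xx.ravel(), yy.ravel()], axis=1)


      states = make_grid_states(n=4)
      print(states.shape)  # (16, 2)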


.. py:function:: main(args)

.. py:data:: parser

.. py:function:: sample_from_reward(env, n_samples)

   Sample states from the true reward distribution.

   Implements rejection sampling, with the proposal being the uniform distribution
   on [0, 1]^2.

   :returns: A numpy array of shape (n_samples, 2) containing the sampled states
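
   Rejection sampling with a uniform proposal can be sketched as below. This is a
   minimal illustration, not this function's implementation: it replaces ``env``
   with a hypothetical ``reward_fn`` callable and an assumed known upper bound
   ``reward_max`` on the reward, and ``toy_reward`` is invented for the example.

   .. code-block:: python

      import numpy as np


      def rejection_sample(reward_fn, reward_max, n_samples, rng=None):
          """Draw n_samples points from [0, 1]^2 with density proportional to reward_fn."""
          rng = np.random.default_rng() if rng is None else rng
          accepted = []
          n_accepted = 0
          while n_accepted < n_samples:
              # Propose points uniformly on the unit square.
              proposals = rng.uniform(0.0, 1.0, size=(n_samples, 2))
              # Accept each proposal with probability reward / reward_max.
              thresholds = rng.uniform(0.0, reward_max, size=n_samples)
              keep = proposals[thresholds < reward_fn(proposals)]
              accepted.append(keep)
              n_accepted += len(keep)
          # Trim the surplus from the last batch.
          return np.concatenate(accepted)[:n_samples]


      def toy_reward(states):
          # Hypothetical reward: 0.1 everywhere, plus 1.0 where both coordinates
          # are more than 0.25 away from the center of the square.
          return 0.1 + 1.0 * np.all(np.abs(states - 0.5) > 0.25, axis=1)


      samples = rejection_sample(toy_reward, reward_max=1.1, n_samples=500)
      print(samples.shape)  # (500, 2)

   The acceptance test is only valid if ``reward_max`` is a true upper bound on the
   reward. A looser bound stays correct but rejects more proposals.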