gfn.containers.trajectories
===========================

.. py:module:: gfn.containers.trajectories


Classes
-------

.. autoapisummary::

   gfn.containers.trajectories.Trajectories


Functions
---------

.. autoapisummary::

   gfn.containers.trajectories.pad_dim0_if_needed


Module Contents
---------------

.. py:class:: Trajectories(env, states = None, actions = None, terminating_idx = None, is_backward = False, log_rewards = None, log_probs = None, estimator_outputs = None)

   Bases: :py:obj:`gfn.containers.base.Container`


   Container for complete trajectories (starting in :math:`s_0` and ending in :math:`s_f`).

   Trajectories are represented as a ``States`` object with a two-dimensional batch
   shape, and their actions as an ``Actions`` object with a two-dimensional batch
   shape. The first dimension indexes the time step and the second indexes the
   trajectory. Because trajectories may differ in length, shorter trajectories are
   padded with the tensor representation of the terminal state (:math:`s_f` or
   :math:`s_0`, depending on the direction of the trajectory), and their actions
   are padded with dummy actions. The ``terminating_idx`` tensor records the time
   step at which each trajectory ends.

   .. attribute:: env

      The environment where the states and actions are defined.

   .. attribute:: states

      States with batch_shape (max_length+1, batch_size).

   .. attribute:: actions

      Actions with batch_shape (max_length, batch_size).

   .. attribute:: terminating_idx

      Tensor of shape (batch_size,) indicating the time step
      at which each trajectory ends.

   .. attribute:: is_backward

      Whether the trajectories are backward or forward. When ``is_backward``
      is False, ``states`` are ordered from initial to terminal states.
      When ``is_backward`` is True, ``states`` are ordered from terminal to initial states.

   .. attribute:: _log_rewards

      (Optional) Tensor of shape (batch_size,) containing the
      log rewards of the trajectories.

   .. attribute:: log_probs

      (Optional) Tensor of shape (max_length, batch_size) indicating
      the log probabilities of the trajectories' actions.

   .. attribute:: estimator_outputs

      (Optional) Tensor of shape (max_length, batch_size, ...)
      containing outputs of a function approximator for each step.
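
   The padded layout described by these attributes can be illustrated without the
   library. The following is a minimal sketch, not the ``gfn`` implementation: plain
   Python lists stand in for the ``States`` and ``Actions`` tensors, and the integer
   state encoding, the ``PAD_STATE`` and ``DUMMY_ACTION`` sentinels, and the
   ``pad_batch`` helper are all hypothetical. It assumes that ``terminating_idx``
   equals the number of actions taken in each trajectory.

   .. code-block:: python

      PAD_STATE = -1      # hypothetical stand-in for the padding state
      DUMMY_ACTION = -1   # hypothetical stand-in for a dummy action


      def pad_batch(state_seqs, action_seqs):
          """Stack variable-length trajectories as [time][trajectory] grids."""
          terminating_idx = [len(a) for a in action_seqs]
          max_length = max(terminating_idx)
          # One more row of states than of actions: (max_length + 1, batch_size).
          states = [
              [s[t] if t < len(s) else PAD_STATE for s in state_seqs]
              for t in range(max_length + 1)
          ]
          # Actions have shape (max_length, batch_size).
          actions = [
              [a[t] if t < len(a) else DUMMY_ACTION for a in action_seqs]
              for t in range(max_length)
          ]
          return states, actions, terminating_idx


      # Two trajectories: one with 3 actions (4 states), one with 1 action (2 states).
      state_seqs = [[0, 1, 2, 3], [0, 5]]
      action_seqs = [[10, 11, 12], [13]]
      states, actions, terminating_idx = pad_batch(state_seqs, action_seqs)

   In this sketch the second trajectory is padded from time step 2 onward in
   ``states`` and from time step 1 onward in ``actions``.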


   .. py:method:: __getitem__(index)

      Returns a subset of the trajectories along the batch dimension.

      :param index: Indices to select trajectories.

      :returns: A new Trajectories object with the selected trajectories and associated data.


   .. py:method:: __len__()

      Returns the number of trajectories in the container.

      :returns: The number of trajectories.


   .. py:method:: __repr__()

      Returns a string representation of the Trajectories container.

      :returns: A string summary of the trajectories.


   .. py:attribute:: _log_rewards
      :value: None


   .. py:attribute:: actions


   .. py:property:: batch_size
      :type: int


      The number of trajectories in the container.

      :returns: The number of trajectories.


   .. py:property:: device
      :type: torch.device


      The device on which the trajectories are stored.

      :returns: The device of ``self.states``.


   .. py:attribute:: env


   .. py:attribute:: estimator_outputs
      :value: None


   .. py:method:: extend(other)

      Extends this Trajectories object with another Trajectories object.

      Each attribute is extended in turn (``actions``, ``states``, ``terminating_idx``,
      ``log_probs``, ``log_rewards``).

      :param other: Another Trajectories to append.
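
      When two batches have different maximum lengths, their time-major grids
      must be brought to a common number of time steps before they can be joined
      along the trajectory dimension. The following is a minimal sketch of that
      idea, assuming that extension amounts to padding along time and then
      concatenating along the batch dimension. It uses plain Python lists rather
      than tensors, and ``extend_time_major`` is a hypothetical helper, not the
      library's code.

      .. code-block:: python

         def extend_time_major(rows_a, rows_b, pad_value):
             """Join two [time][trajectory] grids along the trajectory axis."""
             width_a = len(rows_a[0]) if rows_a else 0
             width_b = len(rows_b[0]) if rows_b else 0
             n_steps = max(len(rows_a), len(rows_b))
             merged = []
             for t in range(n_steps):
                 # Time steps missing from the shorter grid are filled with padding.
                 left = rows_a[t] if t < len(rows_a) else [pad_value] * width_a
                 right = rows_b[t] if t < len(rows_b) else [pad_value] * width_b
                 merged.append(left + right)
             return merged


         actions_a = [[10, 13], [11, -1], [12, -1]]  # max_length 3, batch_size 2
         actions_b = [[20]]                          # max_length 1, batch_size 1
         merged = extend_time_major(actions_a, actions_b, pad_value=-1)

      Here ``merged`` has 3 time steps and 3 trajectories, with the trajectory
      taken from ``actions_b`` padded after its first step.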


   .. py:method:: from_tensordict(env, td)
      :classmethod:


      Reconstruct Trajectories from a TensorDict.


   .. py:attribute:: is_backward
      :value: False


   .. py:attribute:: log_probs
      :value: None


   .. py:property:: log_rewards
      :type: torch.Tensor | None


      The log rewards for the trajectories.

      :returns: Log rewards tensor of shape (batch_size,).

      .. note::

         If not provided at initialization, log rewards are computed on demand for
         terminating states.
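
      The on-demand behaviour in the note can be pictured as a lazily evaluated
      property. The following is a minimal sketch under stated assumptions:
      ``LazyLogRewards`` and ``reward_fn`` are hypothetical names, plain Python
      values stand in for states and tensors, and caching the result after the
      first access is an assumption rather than something this page confirms.

      .. code-block:: python

         import math


         class LazyLogRewards:
             """Hypothetical container that computes log rewards when first read."""

             def __init__(self, terminating_states, reward_fn, log_rewards=None):
                 self.terminating_states = terminating_states
                 self.reward_fn = reward_fn        # hypothetical: state -> reward > 0
                 self._log_rewards = log_rewards   # None means "not computed yet"

             @property
             def log_rewards(self):
                 if self._log_rewards is None:
                     # Computed from the terminating states on first access.
                     self._log_rewards = [
                         math.log(self.reward_fn(s)) for s in self.terminating_states
                     ]
                 return self._log_rewards


         container = LazyLogRewards([1, 2], reward_fn=lambda s: float(s) ** 2)
         values = container.log_rewards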


   .. py:property:: max_length
      :type: int


      The maximum length of the trajectories in the container.

      :returns: The maximum trajectory length.


   .. py:property:: n_trajectories
      :type: int


      Deprecated alias for :attr:`batch_size`.


   .. py:method:: reverse_backward_trajectories()

      Returns a reversed version of the backward trajectories.


   .. py:attribute:: states


   .. py:attribute:: terminating_idx


   .. py:property:: terminating_states
      :type: gfn.states.States


      The terminating states of the trajectories.

      :returns: The terminating states.


   .. py:method:: to_states_container()

      Returns a StatesContainer object from the current Trajectories.

      :returns: A StatesContainer object with the same states, actions, and log_rewards as the
                current Trajectories.


   .. py:method:: to_tensordict()

      Serialize trajectories into a TensorDict.


   .. py:method:: to_transitions()

      Returns a Transitions object from the current Trajectories.

      :returns: A Transitions object with the same states, actions, and log_rewards as the
                current Trajectories.
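
      Conceptually, converting padded trajectories into transitions means keeping
      only the steps that occur before each trajectory's ``terminating_idx`` and
      discarding the padding. The following is a minimal sketch of that
      flattening with plain Python lists. ``to_transition_tuples`` is a
      hypothetical helper, and both the tuple layout and the trajectory-by-trajectory
      ordering are assumptions, not the structure of the library's ``Transitions``
      object.

      .. code-block:: python

         def to_transition_tuples(states, actions, terminating_idx):
             """Flatten [time][trajectory] grids into (state, action, next_state)."""
             transitions = []
             for j, n_steps in enumerate(terminating_idx):
                 # Steps at or beyond n_steps are padding and are skipped.
                 for t in range(n_steps):
                     transitions.append(
                         (states[t][j], actions[t][j], states[t + 1][j])
                     )
             return transitions


         states = [[0, 0], [1, 5], [2, -1], [3, -1]]   # (max_length + 1, batch_size)
         actions = [[10, 13], [11, -1], [12, -1]]      # (max_length, batch_size)
         transitions = to_transition_tuples(states, actions, terminating_idx=[3, 1])

      With these inputs the first trajectory contributes three transitions and the
      second contributes one, so no padded entry appears in ``transitions``.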


.. py:function:: pad_dim0_if_needed(a, b, value = -float('inf'))

   Pads ``a`` or ``b`` along dimension 0, whichever is shorter, so that both
   tensors have the same first dimension.

   :param a: First tensor.
   :param b: Second tensor.
   :param value: Value used to fill the padding. Defaults to negative infinity.

   :returns: Tuple of tensors with the same first dimension.
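
   The padding behaviour can be sketched without PyTorch. The following is a
   minimal illustration, not the library's implementation: ``pad_first_dim`` is a
   hypothetical helper that works on lists of rows instead of tensors and fills
   the missing rows of the shorter input with ``value``.

   .. code-block:: python

      def pad_first_dim(a, b, value=-float("inf")):
          """Pad the shorter of two row lists so both have the same length."""

          def padded(rows, target_len, width):
              # Append rows filled with the padding value until target_len is reached.
              return rows + [[value] * width for _ in range(target_len - len(rows))]

          target_len = max(len(a), len(b))
          width_a = len(a[0]) if a else 0
          width_b = len(b[0]) if b else 0
          return padded(a, target_len, width_a), padded(b, target_len, width_b)


      log_probs_a = [[-0.1, -0.2], [-0.3, -0.4], [-0.5, -0.6]]  # 3 steps, 2 trajectories
      log_probs_b = [[-0.7]]                                    # 1 step, 1 trajectory
      padded_a, padded_b = pad_first_dim(log_probs_a, log_probs_b)

   In this sketch only ``log_probs_b`` is padded, since it is the shorter input.
   Each input keeps its own width, and only the first dimension is aligned.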