gfn.containers.trajectories
===========================

.. py:module:: gfn.containers.trajectories


Classes
-------

.. autoapisummary::

   gfn.containers.trajectories.Trajectories


Functions
---------

.. autoapisummary::

   gfn.containers.trajectories.pad_dim0_if_needed


Module Contents
---------------

.. py:class:: Trajectories(env, states = None, actions = None, terminating_idx = None, is_backward = False, log_rewards = None, log_probs = None, estimator_outputs = None)

   Bases: :py:obj:`gfn.containers.base.Container`


   Container for complete trajectories (starting in $s_0$ and ending in $s_f$).

   Trajectories are represented as a States object with bi-dimensional batch shape.
   Actions are represented as an Actions object with bi-dimensional batch shape.
   The first dimension represents the time step, the second dimension represents
   the trajectory index. Because different trajectories may have different lengths,
   shorter trajectories are padded with the tensor representation of the terminal
   state ($s_f$ or $s_0$ depending on the direction of the trajectory), and the
   actions are padded with dummy actions. The `terminating_idx` tensor represents
   the time step at which each trajectory ends.

   .. attribute:: env

      The environment where the states and actions are defined.

   .. attribute:: states

      States with batch_shape (max_length+1, batch_size).

   .. attribute:: actions

      Actions with batch_shape (max_length, batch_size).

   .. attribute:: terminating_idx

      Tensor of shape (batch_size,) indicating the time step at which each
      trajectory ends.

   .. attribute:: is_backward

      Whether the trajectories are backward or forward. When `is_backward` is
      False, the `states` are ordered from initial to terminal states. When
      `is_backward` is True, the `states` are ordered from terminal to initial
      states.

   .. attribute:: _log_rewards

      (Optional) Tensor of shape (batch_size,) containing the log rewards of
      the trajectories.

   .. attribute:: log_probs

      (Optional) Tensor of shape (max_length, batch_size) indicating the log
      probabilities of the trajectories' actions.

   .. attribute:: estimator_outputs

      (Optional) Tensor of shape (max_length, batch_size, ...) containing
      outputs of a function approximator for each step.


   .. py:method:: __getitem__(index)

      Returns a subset of the trajectories along the batch dimension.

      :param index: Indices to select trajectories.

      :returns: A new Trajectories object with the selected trajectories and associated data.


   .. py:method:: __len__()

      Returns the number of trajectories in the container.

      :returns: The number of trajectories.


   .. py:method:: __repr__()

      Returns a string representation of the Trajectories container.

      :returns: A string summary of the trajectories.


   .. py:attribute:: _log_rewards
      :value: None


   .. py:attribute:: actions


   .. py:property:: batch_size
      :type: int

      The number of trajectories in the container.

      :returns: The number of trajectories.


   .. py:property:: device
      :type: torch.device

      The device on which the trajectories are stored.

      :returns: The device of `self.states`.


   .. py:attribute:: env


   .. py:attribute:: estimator_outputs
      :value: None


   .. py:method:: extend(other)

      Extends this Trajectories object with another Trajectories object.

      Extends along all attributes in turn (actions, states, terminating_idx,
      log_probs, log_rewards).

      :param other: Another Trajectories to append.


   .. py:method:: from_tensordict(env, td)
      :classmethod:

      Reconstruct Trajectories from a TensorDict.


   .. py:attribute:: is_backward
      :value: False


   .. py:attribute:: log_probs
      :value: None


   .. py:property:: log_rewards
      :type: torch.Tensor | None

      The log rewards for the trajectories.

      :returns: Log rewards tensor of shape (batch_size,).

      .. note::

         If not provided at initialization, log rewards are computed on demand
         for terminating states.


   .. py:property:: max_length
      :type: int

      The maximum length of the trajectories in the container.

      :returns: The maximum trajectory length.


   .. py:property:: n_trajectories
      :type: int

      Deprecated alias for :attr:`batch_size`.


   .. py:method:: reverse_backward_trajectories()

      Returns a reversed version of the backward trajectories.


   .. py:attribute:: states


   .. py:attribute:: terminating_idx


   .. py:property:: terminating_states
      :type: gfn.states.States

      The terminating states of the trajectories.

      :returns: The terminating states.


   .. py:method:: to_states_container()

      Returns a StatesContainer object from the current Trajectories.

      :returns: A StatesContainer object with the same states, actions, and log_rewards as the current Trajectories.


   .. py:method:: to_tensordict()

      Serialize trajectories into a TensorDict.


   .. py:method:: to_transitions()

      Returns a Transitions object from the current Trajectories.

      :returns: A Transitions object with the same states, actions, and log_rewards as the current Trajectories.


.. py:function:: pad_dim0_if_needed(a, b, value = -float('inf'))

   Pads tensor a or b so that both have the same first dimension.

   :param a: First tensor.
   :param b: Second tensor.
   :param value: Value to use for padding.

   :returns: Tuple of tensors with the same first dimension.
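
The behaviour documented for ``pad_dim0_if_needed`` can be illustrated with a minimal sketch. This is not the library's implementation: it works on plain nested lists rather than tensors so that it runs without dependencies, and the helper name ``pad_rows_to_match`` is hypothetical. It assumes both inputs are lists of equal-width rows and that the shorter one is padded at the end with rows filled with ``value``.

```python
def pad_rows_to_match(a, b, value=-float("inf")):
    """Pad the shorter of `a` and `b` along the first dimension.

    Hypothetical stand-in for a tensor-based helper: `a` and `b` are
    lists of rows, and all rows are assumed to have the same width.
    """
    target = max(len(a), len(b))
    # Take the row width from whichever input is non-empty.
    width = len(a[0]) if a else (len(b[0]) if b else 0)

    def pad(rows):
        padded = [list(row) for row in rows]  # copy, do not mutate the input
        padded.extend([value] * width for _ in range(target - len(rows)))
        return padded

    return pad(a), pad(b)


# Three time steps versus one time step, two trajectories each.
a = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
b = [[0.7, 0.8]]
a_pad, b_pad = pad_rows_to_match(a, b)
```

After the call, both results have three rows. ``a_pad`` is an unchanged copy of ``a``, and ``b_pad`` has two extra rows filled with negative infinity.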
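
The padded, time-major layout described for ``Trajectories`` can be sketched with nested lists. Everything here is an illustrative assumption rather than the library's code: the function ``pad_trajectories``, the string state labels, and the convention that the recorded index is the position of each trajectory's last real state. The exact indexing convention used by ``terminating_idx`` in the library may differ.

```python
def pad_trajectories(trajectories, pad_state):
    """Stack variable-length state sequences into a time-major grid.

    Returns a grid with (max_length + 1) rows and batch_size columns,
    plus the time step at which each trajectory ends (toy convention).
    """
    # Time step of the last real state in each trajectory.
    ending_idx = [len(traj) - 1 for traj in trajectories]
    max_length = max(ending_idx)
    # Row = time step, column = trajectory index; short trajectories
    # are filled with the padding state.
    states = [
        [traj[step] if step < len(traj) else pad_state for traj in trajectories]
        for step in range(max_length + 1)
    ]
    return states, ending_idx


# Two forward trajectories of different lengths, padded with "sf".
batch = [["s0", "a", "b"], ["s0", "c"]]
states, ending_idx = pad_trajectories(batch, pad_state="sf")
```

Here ``states`` has three rows (``max_length + 1``) and two columns (``batch_size``), and the second trajectory's missing final step is filled with the padding state.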