env
Custom mRNA codon design environment, built on torchgfn, that generates mRNA sequences encoding a given protein. It supports multi-objective optimization over biological properties of mRNA sequences and is implemented using the DiscreteEnv class. Each timestep corresponds to choosing a synonymous codon for the next amino acid in the sequence.
- Action space: N_CODONS + 1 possible actions (all codons plus one exit action).
- State representation: a vector whose length equals the protein length, initialized to -1. Codons are filled in step by step.
- Masking (action constraints): at each position t, only codons that encode the t-th amino acid are allowed, which ensures biological correctness.
- Reward function: a combination of multiple biological properties used to evaluate the mRNA sequence. The weights of these objectives can be updated dynamically to reflect different reward configurations.
Rewards and constraints are modular and can be extended to incorporate new objectives. The environment can be customized for different organisms by using species-specific codon tables and preferences, and could serve as a benchmark environment in computational biology. It enables exploration of codon space, which is large for any given protein sequence, to optimize mRNA design. Applications include mRNA vaccines, protein therapeutics, and gene expression optimization.
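To give a feel for the action space, the state vector, and the size of the codon search space described above, here is a minimal pure-Python sketch. It is not the environment's implementation: it assumes the standard genetic code with all 64 codons (stop codons included) as the global codon set, and the names `CODONS`, `synonymous_indices`, and `search_space_size` are hypothetical helpers introduced only for illustration.

```python
from math import prod

# Standard genetic code, codons enumerated in U, C, A, G order (assumption:
# the environment's own codon ordering and table may differ).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TO_AA = dict(zip(CODONS, AMINO_ACIDS))


def synonymous_indices(amino_acid):
    """Indices into CODONS of every codon that encodes `amino_acid`."""
    return [i for i, codon in enumerate(CODONS) if CODON_TO_AA[codon] == amino_acid]


def search_space_size(protein_seq):
    """Number of distinct mRNA sequences that encode `protein_seq`."""
    return prod(len(synonymous_indices(aa)) for aa in protein_seq)


protein_seq = "MLS"
n_actions = len(CODONS) + 1                # all codons plus one exit action
initial_state = [-1] * len(protein_seq)    # one slot per amino acid, all empty
print(n_actions, initial_state, search_space_size(protein_seq))
```

Because the count is a product over positions, it grows exponentially with protein length, which is why the search space is large even for short proteins.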
Classes
CodonDesignEnv | Environment for designing mRNA codon sequences for a given protein.
Module Contents
- class env.CodonDesignEnv(protein_seq, device, sf=None)
Bases: gfn.env.DiscreteEnv
Environment for designing mRNA codon sequences for a given protein. The action space is the global codon set (size N_CODONS) plus an exit action. Dynamic masks restrict actions at each step:
- At step t < seq_length: only synonymous codons for protein_seq[t] are allowed.
- At step t == seq_length: only the exit action is allowed.
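The two masking rules above can be sketched for a single, unbatched state as follows. This is an illustration under stated assumptions, not the code behind `update_masks`: it uses plain Python lists instead of batched tensors, assumes the standard 64-codon table, and `forward_mask` and `EXIT_ACTION` are hypothetical names.

```python
# Standard genetic code in U, C, A, G order (assumed codon set and ordering).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
EXIT_ACTION = len(CODONS)  # the exit action takes the last index


def forward_mask(state, protein_seq):
    """Boolean list of length N_CODONS + 1; True marks an allowed action."""
    mask = [False] * (len(CODONS) + 1)
    t = sum(1 for idx in state if idx != -1)  # positions already filled
    if t == len(protein_seq):
        # Sequence complete: only the exit action is allowed.
        mask[EXIT_ACTION] = True
    else:
        # Otherwise allow exactly the codons that encode protein_seq[t].
        for i, amino_acid in enumerate(AMINO_ACIDS):
            if amino_acid == protein_seq[t]:
                mask[i] = True
    return mask


mask = forward_mask([-1, -1], "MK")
print([CODONS[i] for i, allowed in enumerate(mask[:-1]) if allowed])
```

A sampler that only draws from actions where the mask is True can never emit a codon that changes the encoded protein.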
- Parameters:
protein_seq (str)
device (torch.device)
- States: type[gfn.states.DiscreteStates]
- _device
- backward_step(states, actions)
Backward transition function of the environment.
This method takes a batch of states and actions and returns a batch of previous states. It does not need to check whether the actions are valid or the states are sink states, because the _backward_step method wraps it and checks for validity.
- Parameters:
states (gfn.states.DiscreteStates) – A batch of states.
actions (gfn.actions.Actions) – A batch of actions.
- Returns:
A batch of previous states.
- Return type:
torch.Tensor
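As a rough picture of how the forward and backward transitions relate, the sketch below works on one unbatched state held in a Python list. It assumes that a forward step writes the chosen codon index into the first empty slot and that a backward step clears the most recently filled slot; the actual methods operate on batched tensors and may handle the exit action and sink states differently. The function names mirror the methods only for readability.

```python
def step(state, action):
    """Write codon index `action` into the first empty (-1) position."""
    t = state.index(-1)  # raises ValueError if the sequence is already full
    return state[:t] + [action] + state[t + 1:]


def backward_step(state, action):
    """Undo `step`: clear the last filled position, which must hold `action`."""
    filled = [i for i, idx in enumerate(state) if idx != -1]
    if not filled or state[filled[-1]] != action:
        raise ValueError("action does not match the last filled position")
    t = filled[-1]
    return state[:t] + [-1] + state[t + 1:]


s0 = [-1, -1, -1]
s1 = step(s0, 35)
s2 = step(s1, 42)
print(s2, backward_step(s2, 42) == s1)
```

Because codons are filled strictly left to right, each state has a single parent under this assumption, so the backward step is deterministic.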
- codon_gc_counts
- exit_action_index
- idx_to_codon
- is_terminal(states)
- Parameters:
states (gfn.states.DiscreteStates)
- Return type:
torch.BoolTensor
- static make_sink_states_tensor(shape, device=None)
- n_actions
- protein_seq
- reward(final_states)
Returns the environment’s rewards for a batch of states.
This or log_reward must be implemented by the environment.
- Parameters:
final_states (gfn.states.DiscreteStates) – A batch of states with a batch_shape.
- Returns:
Tensor of shape (*batch_shape) containing the rewards.
- Return type:
torch.Tensor
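The module description says the reward combines several biological properties using adjustable weights. One way to picture that is a weighted sum of per-sequence objectives, sketched below for a single completed state. The two objectives here (overall GC content and GC content at the third codon position) are assumptions chosen for illustration, as are the names `gc_content`, `gc3_content`, and `CODON_GC_COUNTS`; the environment's actual objectives and the way it combines them may differ.

```python
# All 64 codons in U, C, A, G order (assumed codon set and ordering).
BASES = "UCAG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
# Number of G or C nucleotides in each codon, precomputed once.
CODON_GC_COUNTS = [sum(base in "GC" for base in codon) for codon in CODONS]


def gc_content(state):
    """Fraction of G/C nucleotides over the whole sequence."""
    return sum(CODON_GC_COUNTS[i] for i in state) / (3 * len(state))


def gc3_content(state):
    """Fraction of codons whose third nucleotide is G or C."""
    return sum(CODONS[i][2] in "GC" for i in state) / len(state)


def reward(state, weights):
    """Weighted sum of the objectives; `weights` has one entry per objective."""
    objectives = [gc_content(state), gc3_content(state)]
    return sum(w * o for w, o in zip(weights, objectives))


state = [CODONS.index(codon) for codon in ("AUG", "AAA", "UGG")]
print(reward(state, [1.0, 0.0]), reward(state, [0.5, 0.5]))
```

Changing the weight vector changes which sequences score highest without touching the objective functions, which is the role a method like `set_weights` plays.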
- seq_length
- set_weights(w)
Store the current preference weights (w) for conditional reward.
- Parameters:
w (Union[list[float], torch.Tensor])
- step(states, actions)
Forward transition function of the environment.
This method takes a batch of states and actions and returns a batch of next states. It does not need to check whether the actions are valid or the states are sink states, because the _step method wraps it and checks for validity.
- Parameters:
states (gfn.states.DiscreteStates) – A batch of states.
actions (gfn.actions.Actions) – A batch of actions.
- Returns:
A batch of next states.
- Return type:
- syn_indices
- update_masks(states)
- Parameters:
states (gfn.states.DiscreteStates)
- Return type:
None
- weights