BestMutationsFromProbabilitiesMetric

Autogenerated Tag Syntax Documentation:


A CompositeRealMetric for calculating the mutations with the highest delta_probability to the current residues from a PerResidueProbabilitiesMetric. Outputs in the format Mutation-Position-Current, delta_value (e.g. D10A, 0.8)

References and author information for the BestMutationsFromProbabilitiesMetric simple metric:

BestMutationsFromProbabilitiesMetric SimpleMetric's author(s): Moritz Ertelt, University of Leipzig [moritz.ertelt@gmail.com]

<BestMutationsFromProbabilitiesMetric name="(&string;)" custom_type="(&string;)"
        metric="(&string;)" max_mutations="(10 &positive_integer;)"
        delta_cutoff="(0.0 &real;)" use_cached_data="(false &bool;)"
        cache_prefix="(&string;)" cache_suffix="(&string;)"
        fail_on_missing_cache="(true &bool;)" />
  • custom_type: Allows multiple configured SimpleMetrics of a single type to be called in a single RunSimpleMetrics and SimpleMetricFeatures. The custom_type name will be added to the data tag in the scorefile or features database.
  • metric: (REQUIRED) A PerResidueProbabilitiesMetric to calculate the probability delta between the current amino acids and the most likely ones.
  • max_mutations: The maximum number of mutations that will be returned (Default=10).
  • delta_cutoff: The minimum delta in probability for a mutation to be reported. The default of 0.0 means the mutation is at least as likely as the current residue.
  • use_cached_data: Use any data stored in the datacache that matches the set metric's name (and any prefix/suffix). Data is stored during a SimpleMetric's apply function, which is called during RunSimpleMetrics.
  • cache_prefix: Any prefix used during apply (RunSimpleMetrics), that we will match on if use_cached_data is true.
  • cache_suffix: Any suffix used during apply (RunSimpleMetrics), that we will match on if use_cached_data is true.
  • fail_on_missing_cache: If use_cached_data is True and cache is not found, should we fail?

General description

A metric for calculating the mutations with the highest delta probability relative to the current residues, based on a PerResidueProbabilitiesMetric. Returns the most likely mutations and their delta probabilities, named as CurrentAA-Position-MutationAA (e.g. A89T) in pose numbering. This metric alone does not require compilation with extras=tensorflow,torch, but the model predictions that are typically used as input do. See Building Rosetta with TensorFlow and Torch for the compilation setup.
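The following Python sketch illustrates how max_mutations and delta_cutoff might interact. It is not the Rosetta implementation: the function name best_mutations, the toy input, and the choice to consider every alternative amino acid at each position are assumptions made for illustration, as is treating the cutoff as inclusive (delta >= delta_cutoff).

```python
# Illustrative sketch only; not the Rosetta implementation.
# Assumption: delta = P(mutant amino acid) - P(current amino acid).
from typing import Dict, List, Tuple


def best_mutations(
    sequence: str,
    probabilities: List[Dict[str, float]],
    max_mutations: int = 10,
    delta_cutoff: float = 0.0,
) -> List[Tuple[str, float]]:
    """Rank candidate mutations by their probability delta (hypothetical helper)."""
    candidates = []
    for index, (current_aa, probs) in enumerate(zip(sequence, probabilities)):
        position = index + 1  # 1-based, mirroring pose numbering
        current_prob = probs.get(current_aa, 0.0)
        for mutant_aa, mutant_prob in probs.items():
            if mutant_aa == current_aa:
                continue
            delta = mutant_prob - current_prob
            # Assumed inclusive cutoff: 0.0 keeps mutations that are
            # at least as likely as the current residue.
            if delta >= delta_cutoff:
                name = f"{current_aa}{position}{mutant_aa}"  # e.g. A89T
                candidates.append((name, delta))
    # Highest delta first, then truncate to max_mutations entries.
    candidates.sort(key=lambda item: item[1], reverse=True)
    return candidates[:max_mutations]


# Toy two-residue example with made-up probabilities.
toy_sequence = "AD"
toy_probabilities = [
    {"A": 0.1, "T": 0.7, "G": 0.2},
    {"D": 0.6, "E": 0.3, "A": 0.1},
]
result = best_mutations(toy_sequence, toy_probabilities)
for name, delta in result:
    print(f"{name}, {delta:.2f}")
```

In this toy input, only the first position has alternatives that are more likely than the current residue, so no mutation is reported for the second position at the default cutoff.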

Example

The example uses the ESM language model to predict amino acid probabilities, and then reports up to ten of the most likely mutations that are at least as likely as the currently present amino acid.

<ROSETTASCRIPTS>
    <RESIDUE_SELECTORS>
        <Chain name="res" chains="A" />
    </RESIDUE_SELECTORS>
    <SIMPLE_METRICS>
        <!-- Define models to use -->
        <PerResidueEsmProbabilitiesMetric name="esm" residue_selector="res" model="esm2_t33_650M_UR50D" write_pssm="esm.pssm"/>
        <!-- Analyze predictions without re-calculation -->
        <BestMutationsFromProbabilitiesMetric name="esm_mutations" metric="esm" use_cached_data="true" max_mutations="10" delta_cutoff="0.0" />
    </SIMPLE_METRICS>
    <FILTERS>
    </FILTERS>
    <MOVERS>
        <RunSimpleMetrics name="inference" metrics="esm"/>
        <RunSimpleMetrics name="analysis" metrics="esm_mutations"/>
    </MOVERS>
    <PROTOCOLS>
        <Add mover_name="inference"/>
        <Add mover_name="analysis"/>
    </PROTOCOLS>
</ROSETTASCRIPTS>

Reference

The implementation in Rosetta is currently unpublished.

See Also