AverageProbabilitiesMetric

Autogenerated Tag Syntax Documentation:


A metric for calculating the weighted average of multiple PerResidueProbabilitiesMetrics.

References and author information for the AverageProbabilitiesMetric simple metric:

AverageProbabilitiesMetric SimpleMetric's author(s): Moritz Ertelt, University of Leipzig [moritz.ertelt@gmail.com]

<AverageProbabilitiesMetric name="(&string;)" custom_type="(&string;)"
        metrics="(&string;)" weights="(&string;)"
        use_cached_data="(false &bool;)" cache_prefix="(&string;)"
        cache_suffix="(&string;)" fail_on_missing_cache="(true &bool;)" />
  • custom_type: Allows multiple configured SimpleMetrics of a single type to be called in a single RunSimpleMetrics or SimpleMetricFeatures. The custom_type name will be added to the data tag in the scorefile or features database.
  • metrics: (REQUIRED) A comma-separated list of PerResidueProbabilitiesMetrics to calculate the average of.
  • weights: A comma-separated list of values to weight each metric with (defaults to 1.0 for all). You need to provide as many weights as you provide metrics.
  • use_cached_data: Use any data stored in the datacache that matches the names of the set metrics (and any prefix/suffix). Data is stored during a SimpleMetric's apply function, which is called during RunSimpleMetrics.
  • cache_prefix: Any prefix used during apply (RunSimpleMetrics) that we will match on if use_cached_data is true. Requires that all PerResidueProbabilitiesMetrics have the same prefix/suffix.
  • cache_suffix: Any suffix used during apply (RunSimpleMetrics) that we will match on if use_cached_data is true. Requires that all PerResidueProbabilitiesMetrics have the same prefix/suffix.
  • fail_on_missing_cache: If use_cached_data is true and the cached data is not found, should the metric fail?
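
As a rough sketch of how these options might combine, the fragment below weights a ProteinMPNN prediction more heavily than an ESM prediction and reads both inputs from the datacache instead of recomputing them. The weight values, the `run1_` prefix, and the metric names are illustrative. The fragment assumes that a residue selector named `res` is defined elsewhere, that weights are matched to metrics in list order, and that RunSimpleMetrics accepts a `prefix` attribute as described on its own page.

```xml
<SIMPLE_METRICS>
    <ProteinMPNNProbabilitiesMetric name="mpnn" residue_selector="res"/>
    <PerResidueEsmProbabilitiesMetric name="esm" residue_selector="res" model="esm2_t33_650M_UR50D"/>
    <!-- One weight per metric; assumed to follow the order of the metrics list -->
    <AverageProbabilitiesMetric name="weighted_avg" metrics="mpnn,esm" weights="0.7,0.3"
            use_cached_data="true" cache_prefix="run1_" fail_on_missing_cache="true"/>
</SIMPLE_METRICS>
<MOVERS>
    <!-- Both input metrics are cached under the same (illustrative) prefix -->
    <RunSimpleMetrics name="predict" metrics="mpnn,esm" prefix="run1_"/>
    <RunSimpleMetrics name="average" metrics="weighted_avg"/>
</MOVERS>
```

Because cache_prefix and cache_suffix apply to every input metric at once, running the inputs through a single RunSimpleMetrics, as sketched here, is one way to keep their prefixes consistent.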

General description

A metric for averaging multiple PerResidueProbabilitiesMetrics.
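
Read literally, a weighted average over K input metrics could be written as below. This is a sketch under an assumption: this page does not say whether the result is normalized by the sum of the weights, so the denominator is assumed, not confirmed. The symbols are introduced only for illustration: p^{(k)}_i(a) is the probability of amino acid a at residue i from metric k, and w_k is that metric's weight.

```latex
\bar{p}_i(a) = \frac{\sum_{k=1}^{K} w_k \, p^{(k)}_i(a)}{\sum_{k=1}^{K} w_k}
```

With the default weights of 1.0, this reduces to the arithmetic mean of the input probabilities. Under the same assumption, the averaged values at each residue still sum to 1 whenever every input distribution does.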

Details

Like other PerResidueProbabilitiesMetrics, the probabilities can be output as logits in a PSI-BLAST-style PSSM using the SaveProbabilitiesMetricMover and then used as input for the FavorSequenceProfileMover. This metric alone does not require compilation with extras=tensorflow,torch, but the model predictions that are typically used as input do. See Building Rosetta with TensorFlow and Torch for the compilation setup.
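
A minimal sketch of saving the averaged probabilities might look like the fragment below. It assumes an AverageProbabilitiesMetric named `avg` is defined in SIMPLE_METRICS. The attribute names on SaveProbabilitiesMetricMover (`metric`, `filename`, `use_cached_data`) and the output file name are assumptions, so check them against that mover's own documentation before use.

```xml
<MOVERS>
    <!-- Run the averaged metric first so its result is stored in the datacache -->
    <RunSimpleMetrics name="predictions" metrics="avg"/>
    <!-- Assumed attributes; avg_probs.pssm is an illustrative file name -->
    <SaveProbabilitiesMetricMover name="save" metric="avg" filename="avg_probs.pssm" use_cached_data="true"/>
</MOVERS>
<PROTOCOLS>
    <Add mover_name="predictions"/>
    <Add mover_name="save"/>
</PROTOCOLS>
```

The saved file could then be passed to the FavorSequenceProfileMover in a later design run.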

Example

In this example we predict the amino acid probabilities for chain A of our protein using ProteinMPNN and ESM, average both predictions, and use the averaged probabilities to calculate a single score (the pseudo-perplexity) for our protein.

<ROSETTASCRIPTS>
    <RESIDUE_SELECTORS>
        <Chain name="res" chains="A" />
    </RESIDUE_SELECTORS>
    <SIMPLE_METRICS>
        <!-- Define models to use -->
        <ProteinMPNNProbabilitiesMetric name="mpnn" residue_selector="res"/>
        <PerResidueEsmProbabilitiesMetric name="esm" residue_selector="res" model="esm2_t33_650M_UR50D"/>
        <!-- Average the probabilities -->
        <AverageProbabilitiesMetric name="avg" metrics="mpnn,esm"/>
        <!-- Analyze predictions without re-calculation -->
        <PseudoPerplexityMetric name="perplex" metric="avg" use_cached_data="true"/>
    </SIMPLE_METRICS>
    <FILTERS>
    </FILTERS>
    <MOVERS>
        <RunSimpleMetrics name="predictions" metrics="avg"/>
        <RunSimpleMetrics name="analysis" metrics="perplex"/>
    </MOVERS>
    <PROTOCOLS>
        <Add mover_name="predictions"/>
        <Add mover_name="analysis"/>
    </PROTOCOLS>
</ROSETTASCRIPTS>

Reference

The implementation in Rosetta is currently unpublished.

See Also