I think your question is, “how does one see the effect of a single flag on a single trajectory?” The answer is that nobody treats a single trajectory as meaningful, except after the fact.
Typically we’ll run Rosetta for many thousands of trajectories, then take the top handful of models for further analysis. Those models are the outcome of specific trajectories, but the trajectory itself wasn’t significant, only its result. Because Rosetta uses Monte Carlo sampling, an individual trajectory is not a physically plausible path; the physics (statistical mechanics) only holds when you look at the modeling as an ensemble.
So, we would include the new flag and rerun the whole experiment, then compare the top fraction of models from the two cases.
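A rough sketch of that comparison, in case it helps. It assumes you have already pulled the total scores from each run's score file into plain Python lists, and that lower scores are better. The function names (`top_fraction`, `compare_runs`) and the 5% cutoff are made up for illustration; they are not part of Rosetta.

```python
import math
import statistics


def top_fraction(scores, fraction=0.05):
    """Return the best (lowest) `fraction` of scores, keeping at least one."""
    if not scores:
        raise ValueError("no scores given")
    n_keep = max(1, math.ceil(len(scores) * fraction))
    return sorted(scores)[:n_keep]


def compare_runs(baseline, with_flag, fraction=0.05):
    """Compare the mean score of the top fraction of two runs.

    `baseline` and `with_flag` are lists of total scores, one per model
    (hypothetical inputs; how you extract them is up to you).
    """
    mean_a = statistics.mean(top_fraction(baseline, fraction))
    mean_b = statistics.mean(top_fraction(with_flag, fraction))
    return {
        "baseline_mean": mean_a,
        "with_flag_mean": mean_b,
        # Negative means the run with the flag scored lower (better).
        "difference": mean_b - mean_a,
    }
```

Comparing only the means is the crudest version of this; you would usually also look at the top models themselves (structures, RMSD, and so on), not just their scores.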
Why do you want to preserve a particular RNG trajectory?