There are some options that may help you (in both the MPI case and the single-processor case).
-in:file:lazy_silent and -jd2:lazy_silent_file_reader: these should defer reading in structures until you need them, which should cut down on memory usage.
-jd2:delete_old_poses: this adds a check that deletes old poses once they’re no longer needed, reducing memory usage.
Splitting MPI runs across multiple nodes may also help with memory usage. That’s not so much because the MPI processes share memory, but because each node only loads the structures it’s using (especially with the flags above), so each one reads just a portion of the large silent file instead of trying to read the whole thing.
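As a rough sketch, the flags above could be collected in a flags file like the one below. The file name my_structures.out is just a placeholder for your own silent file, and -in:file:silent is assumed to be how you're already passing it in; adjust to match your actual command line.

```
# Placeholder input; substitute your own silent file
-in:file:silent my_structures.out

# Defer reading structures until they are needed
-in:file:lazy_silent
-jd2:lazy_silent_file_reader

# Delete old poses once they are no longer needed
-jd2:delete_old_poses
```

You would then pass the file to whichever application you're running with @ followed by the flags file name, in addition to your usual options.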