Viewing 6 reply threads
  • Author
    Posts
    • #562
      Sally Ride
      Participant

        Hi,

        First, sorry for repeating this question, but I really need to get randomness in Rosetta straight, and it would be great to have a good explanation of it somewhere on the web, for example on this forum.

        I want to run multiple Rosetta instances for the same protein on several computer clusters. From the manual and previous posts I can see two options:
        1) -constant_seed -jran
        2) -seed_offset

        Which one is better? I think option 1) is the more commonly used, but how does it differ from 2)? Do they do the same thing?

        Is there any constraint on which integers from 1 000 000 to 4 000 000 I pick for -jran? If I submit 100 jobs on cluster 1 and 100 jobs on cluster 2, can I simply use 1 000 000 to 1 000 099 on cluster 1 and 1 000 100 to 1 000 199 on cluster 2? Or is it more sophisticated than that?

        Thanks for your help,

        Janek Kosinski

      • #3951
        Ora Furman
        Participant

          If you run jobs on a cluster of parallel CPUs, you will sometimes get exactly the same results. This is a general problem for Rosetta. The reason is that some CPUs start from exactly the same random seed. One solution is to add -seed_offset followed by an integer to your command line; this way you can force each CPU to start from a different seed, shifted by that offset.
          You can also run with -constant_seed -jran on multiple clusters. Yes, the integers for -jran should be between 1 million and 4 million. The default value is 1111111. You can certainly choose different values if you want to submit jobs to different clusters.
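For what it's worth, the numbering scheme from the question (1 000 000 to 1 000 099 on cluster 1, 1 000 100 to 1 000 199 on cluster 2) can be sketched in Python as below. This is only an illustration: the helper names `jran_seeds` and `command_line` and the executable name `rosetta_executable` are placeholders I made up, and the 1 million to 4 million bounds are simply the range mentioned in this thread. The only flags used are the `-constant_seed` and `-jran` options discussed above.

```python
def jran_seeds(cluster_index, jobs_per_cluster, base=1_000_000, upper=4_000_000):
    """Return one distinct -jran value per job for the given cluster.

    Cluster 0 gets base .. base + jobs_per_cluster - 1, cluster 1 gets the
    next block, and so on, so no two jobs share a seed.
    """
    if cluster_index < 0 or jobs_per_cluster < 1:
        raise ValueError("cluster_index must be >= 0 and jobs_per_cluster >= 1")
    start = base + cluster_index * jobs_per_cluster
    stop = start + jobs_per_cluster  # exclusive end of this cluster's block
    if stop - 1 > upper:
        raise ValueError("seed block runs past the upper bound")
    return list(range(start, stop))


def command_line(executable, seed):
    """Build the argument list for one job (executable name is a placeholder)."""
    return [executable, "-constant_seed", "-jran", str(seed)]


if __name__ == "__main__":
    # Two clusters, 100 jobs each: print the first command of each block.
    for cluster in (0, 1):
        seeds = jran_seeds(cluster, 100)
        print(" ".join(command_line("rosetta_executable", seeds[0])))
```

The point of deriving each block from the cluster index is that the seeds cannot overlap between clusters as long as every cluster uses the same `jobs_per_cluster`.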

        • #3955
          Sally Ride
          Participant

            So it sounds like -seed_offset and -constant_seed -jran are just two alternative solutions to the same problem. I used -seed_offset and it apparently worked, but I think the community prefers -constant_seed -jran (although, as I understand it, it does not matter). -constant_seed -jran looks better because it gives greater control over your seed values…

            Thanks for the explanations.

          • #8075
            Anonymous

              I am only 4 years late, but if you don’t use the MPI version and instead, say, submit 100 jobs each producing 500 decoys, there should be no need to use -constant_seed -jran, or am I wrong?
              D.

            • #12407
              Anonymous

                I am only 4 years late, too. I run thousands of jobs on a cluster with Rosetta 3.8 without using either option. Usually hundreds of jobs start simultaneously, but I have never observed any identical decoys. Are these options still needed?

                  • #12426
                    Anonymous

                      This thread is in the Rosetta++ (Rosetta2) portion of the forums, so keep in mind that Rosetta++ has different behavior from Rosetta3.

                      I haven’t double-checked the Rosetta++ behavior (I don’t have the code handy), but for Rosetta3, if you don’t use the -constant_seed option, Rosetta will automatically pull from the system random number source to get the seed for the pseudorandom number generator. It should print a message to the core.init tracer listing the details.

                      If this is working correctly (again, for Rosetta3), this should avoid any issues with restarting jobs and having a large number of jobs start simultaneously. Each time Rosetta restarts it will pull a new seed, so it shouldn’t reproduce the same decoys, as it might with a constant seed. It’s also pulling the number from the system random number source rather than a clock, so even two jobs starting at the identical instant should get different random number seeds.
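To illustrate the clock versus system-random-source point, here is a small Python sketch. It is not Rosetta code and makes no claim about how Rosetta implements seeding; `clock_seed`, `entropy_seed` and `first_draws` are made-up names, and Python's `random.Random` just stands in for a generic pseudorandom number generator.

```python
import os
import random


def clock_seed(timestamp_seconds):
    """Seed taken only from the clock, truncated to whole seconds."""
    return int(timestamp_seconds)


def entropy_seed():
    """Seed taken from the operating system's random source (8 bytes)."""
    return int.from_bytes(os.urandom(8), "big")


def first_draws(seed, n=3):
    """First n numbers a generator produces when started from this seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


if __name__ == "__main__":
    # Two jobs launched within the same second share a clock-derived seed,
    # so their random streams (and hence their results) are identical.
    job_a = first_draws(clock_seed(1700000000.2))
    job_b = first_draws(clock_seed(1700000000.9))
    print("clock seeds give identical streams:", job_a == job_b)

    # Seeds drawn from the OS source are independent of the start time.
    seeds = [entropy_seed() for _ in range(100)]
    print("distinct entropy seeds out of 100:", len(set(seeds)))
```

With 64-bit seeds from the OS source, a collision among a few hundred simultaneous jobs is extremely unlikely, which fits the observation above that identical decoys do not show up in practice.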
