What distinguishes the different inputs in your data set? Are they different protein complexes? If so, I'd guess something is subtly wrong (or at least subtly Rosetta-inappropriate) with some fraction of the inputs: they get through PDB reading without crashing, but then fail at the docking line you list.
Another possibility is that the failing inputs are more-than-two-body problems, and docking does not know how to group the chains into the two partners to be docked.
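
One quick way to test the more-than-two-body idea is to count the chains in each input and see whether the failing ones stand out. Below is a minimal sketch in plain Python (no Rosetta involved). The function names are made up for illustration, and it assumes your inputs are standard fixed-column PDB files with the chain ID in column 22 of each ATOM record.

```python
from pathlib import Path


def chain_ids(pdb_lines):
    """Return distinct chain IDs from ATOM records, in order of first appearance."""
    seen = []
    for line in pdb_lines:
        # Assumption: fixed-column PDB format, chain ID in column 22 (index 21).
        # A blank chain ID is counted as its own "chain".
        if line.startswith("ATOM") and len(line) > 21:
            chain = line[21]
            if chain not in seen:
                seen.append(chain)
    return seen


def is_multibody(pdb_lines):
    """True if the structure contains more than two chains."""
    return len(chain_ids(pdb_lines)) > 2


def summarize(path):
    """Return a one-line summary of the chains found in one PDB file."""
    chains = chain_ids(Path(path).read_text().splitlines())
    label = "more than two chains" if len(chains) > 2 else "two or fewer chains"
    return f"{path}: chains={''.join(chains)} ({label})"
```

Calling `summarize()` on each file in the data set and comparing the output for the failing and succeeding inputs should show whether chain count is the distinguishing factor. If it is, the docking run will probably need to be told explicitly which chains belong to which partner.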