When migrating about 50 VMs at once, we have seen some of the migrations fail. I think this is because the guests are dirtying pages faster than those pages can be transmitted.
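As a rough illustration of that suspicion, here is a toy model of iterative pre-copy. It is only a sketch, not the actual migration code: the function name, the 0.3 s downtime budget, the 30-round cap, the 4 GB guest size and the 50 MB/s dirty rate are all made-up assumptions, and "1GB/s" is read as 1e9 bytes per second shared evenly between the 50 VMs.

```python
def precopy_converges(ram_bytes, dirty_rate, bandwidth,
                      max_downtime=0.3, max_rounds=30):
    """Toy pre-copy model (hypothetical, not any hypervisor's real logic).

    Returns the round at which the remaining dirty data could be sent
    within max_downtime seconds, or None if that never happens within
    max_rounds. Rates are in bytes per second.
    """
    remaining = ram_bytes  # the first round has to send all of guest RAM
    for round_no in range(1, max_rounds + 1):
        send_time = remaining / bandwidth
        if send_time <= max_downtime:
            return round_no  # small enough to stop the guest and finish
        # Pages dirtied while this round was in flight must be resent,
        # but the guest cannot dirty more than all of its RAM.
        remaining = min(ram_bytes, dirty_rate * send_time)
    return None


GB = 1e9
MB = 1e6

# One VM with the whole link to itself (assumed figures).
alone = precopy_converges(4 * GB, 50 * MB, 1 * GB)

# The same VM when 50 migrations share the link evenly.
shared = precopy_converges(4 * GB, 50 * MB, 1 * GB / 50)

print(alone, shared)
```

Under these assumed numbers the lone migration converges quickly, while the shared one never does, because each VM's share of the link (20 MB/s) is below its dirty rate (50 MB/s). Whether the real failure follows this pattern is exactly what the question below is asking.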
Which algorithm decides when a migration fails in this way, and is it tunable? I am fully aware that one answer to this question is "do not attempt to migrate 50 busy VMs through a single 1GB/s NIC". -- Alex Bligh