When copying a large number of small files, the bottleneck is usually disk I/O, 
specifically seeking. Check the I/O wait (iowait) figure in top or a similar tool 
while such a transfer is running. If the disk is already the bottleneck, running 
multiple threads will only cause it to thrash even more.
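
If you would rather measure I/O wait from a script than watch top, here is a minimal sketch. It assumes Linux, where the aggregate "cpu" line of /proc/stat lists counters in the order user, nice, system, idle, iowait, irq, softirq, steal. The function names (parse_cpu_line, iowait_percent, sample_iowait) are made up for this example.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two counter samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Sum only the first eight fields; guest time is already counted in user.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    # Index 4 is the iowait counter.
    return 100.0 * deltas[4] / total


def sample_iowait(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open(path) as handle:
        before = parse_cpu_line(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        after = parse_cpu_line(handle.read())
    return iowait_percent(before, after)
```

Call sample_iowait() in a loop while the copy is running. A value that stays high suggests the disk, not the network, is the limit.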

Multiple threads make sense on high-latency links, where the time goes to 
waiting on the network rather than on the disk.
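
A rough back-of-the-envelope model shows why. This is only a sketch under a simplifying assumption: each file costs one network round trip before the next file on the same stream can start, and the streams share the link bandwidth. The function name and the example numbers are hypothetical, not measurements.

```python
def estimate_transfer_seconds(num_files, total_bytes, rtt_seconds,
                              bandwidth_bytes_per_s, streams=1):
    """Crude estimate: per-file round trips split across streams, plus raw transfer time."""
    # Assumption: one round trip of latency per file, paid in parallel across streams.
    latency_cost = num_files * rtt_seconds / streams
    # Assumption: total bandwidth is shared, so the payload time does not shrink.
    payload_cost = total_bytes / bandwidth_bytes_per_s
    return latency_cost + payload_cost


# Hypothetical case: 10,000 files, 100 MB total, 100 ms round trip, 12.5 MB/s link.
single = estimate_transfer_seconds(10_000, 100e6, 0.1, 12.5e6, streams=1)
parallel = estimate_transfer_seconds(10_000, 100e6, 0.1, 12.5e6, streams=8)
print(f"1 stream: {single:.0f} s, 8 streams: {parallel:.0f} s")
```

In this model the round trips dominate, so splitting them across streams helps. On a low-latency link the latency term is small to begin with and extra threads mostly add disk contention.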

