2 percent? Have you logged into a compute node and run a simple top when the job is running? Are all the processes distributed across the CPU cores? Are the processes being pinned properly to a core? Or are they hopping from core to core?
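If you would rather script that check than eyeball top, here is a minimal sketch. It assumes a Linux compute node and Python 3. The helper names (`last_cpu_from_stat`, `sample_cpus`) are made up for illustration. It prints the set of CPUs each process is allowed to run on and the CPU it was last seen on over a few samples.

```python
import os
import sys
import time


def last_cpu_from_stat(stat_line):
    """Return the CPU a process last ran on, from a /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces, so split on the final ')' before counting fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); field 39 (processor) is therefore rest[36].
    return int(rest[36])


def sample_cpus(pid, samples=5, interval=0.2):
    """Sample the last-run CPU of `pid` a few times (Linux only)."""
    seen = []
    for _ in range(samples):
        with open(f"/proc/{pid}/stat") as f:
            seen.append(last_cpu_from_stat(f.read()))
        time.sleep(interval)
    return seen


if __name__ == "__main__":
    # Pass the PIDs of the job's processes; defaults to this script itself.
    pids = [int(a) for a in sys.argv[1:]] or [os.getpid()]
    for pid in pids:
        allowed = sorted(os.sched_getaffinity(pid))
        seen = sample_cpus(pid)
        print(f"pid {pid}: allowed CPUs {allowed}, last ran on {seen}")
```

A properly pinned process should show a single allowed CPU (or a small set) and the same CPU in every sample. A wide allowed set with the last-run CPU changing between samples suggests the processes are hopping.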
Also make SURE all nodes have booted with all cores online and that they all report the same amount of RAM.
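A rough way to check this, again assuming Linux nodes with Python 3: run something like the sketch below on every node with whatever parallel shell or scheduler you already use, then compare the output lines. The function names are hypothetical. It reads the kernel's online and present CPU lists from sysfs and MemTotal from /proc/meminfo.

```python
import socket


def count_cpus(cpu_list):
    """Count CPUs in a kernel list string such as '0-3,8-11'."""
    total = 0
    for part in cpu_list.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        # A bare number like '5' has no upper bound, so it counts as one CPU.
        total += int(hi or lo) - int(lo) + 1
    return total


def mem_total_kb(meminfo_text):
    """Extract MemTotal (in kB) from the contents of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def read(path):
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    online = count_cpus(read("/sys/devices/system/cpu/online"))
    present = count_cpus(read("/sys/devices/system/cpu/present"))
    mem = mem_total_kb(read("/proc/meminfo"))
    flag = "" if online == present else "  <-- cores offline"
    print(f"{socket.gethostname()}: {online}/{present} cores online, "
          f"MemTotal {mem} kB{flag}")
```

Small differences in MemTotal between otherwise identical nodes can be normal, since the kernel reserves slightly varying amounts. A node with a failed or missing DIMM should stand out with a gap on the order of gigabytes.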