Could someone confirm whether this is a bug or me misunderstanding the documentation (in which case it's not just me, and it needs clarifying!)? I haven't looked at the current code, in the hope of a quick authoritative answer.
This is with 1.5.5rc3, originally on Interlagos, but also checked on Magny Cours. It's also seen on two Interlagos systems with different physical numbering of the logical processors.

On a 48-core Magny Cours with

    mpirun --bysocket --bind-to-core --report-bindings -np 48

what I get is two processes per core, e.g.:

    [node247:09521] [[58099,0],0] odls:default:fork binding child [[58099,1],14] to socket 2 cpus 4000
    ...
    [node247:09521] [[58099,0],0] odls:default:fork binding child [[58099,1],38] to socket 2 cpus 4000

and hwloc-ps confirms the situation. However, I (and my boss, who did it originally) expected one process per core. With --bycore we do see one per core, of course.

Is the --bysocket behaviour actually expected?
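
In case it helps anyone reading the log: I'm assuming the "cpus" field in the --report-bindings output is a hexadecimal bitmask of logical processor indices (that's my reading, not something I've checked in the code). Under that assumption, a small Python sketch decodes it; the helper name decode_cpu_mask is mine and not part of Open MPI or hwloc.

```python
def decode_cpu_mask(mask_hex: str) -> list[int]:
    """Return the indices of the bits set in a hexadecimal CPU mask.

    Assumption: the mask is a plain hex bitmask where bit N corresponds
    to logical processor N, as the "cpus" field appears to be.
    """
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Both ranks 14 and 38 in the log above report "cpus 4000".
print(decode_cpu_mask("4000"))  # [14]
```

If that reading is right, both ranks are bound to the same single logical processor, which matches what hwloc-ps shows.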