One key difference is that coroutines won't make your programs run faster. They are a modelling mechanism that can simplify your programs where you would otherwise have to implement a state machine by hand.
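To make that concrete, here is a minimal sketch of how I understand it. It uses Python generators as a stand-in for coroutines, not D's fibers, and the names `verse_singer` and `run_alternating` are made up for illustration. Each generator keeps its own loop state across pauses, so no explicit state machine is needed, and everything runs on a single thread.

```python
import threading

def verse_singer(name, log):
    """A generator used as a coroutine: it pauses at every yield."""
    for bottles in (3, 2, 1):
        # Record who ran, the current count, and the thread it ran on.
        log.append((name, bottles, threading.get_ident()))
        yield  # hand control back to whoever resumed us

def run_alternating():
    """Resume two coroutines in turn until both are finished."""
    log = []
    singers = [verse_singer("a", log), verse_singer("b", log)]
    while singers:
        for singer in list(singers):  # iterate over a copy while removing
            try:
                next(singer)  # resume until the next yield
            except StopIteration:
                singers.remove(singer)
    return log

log = run_alternating()
for name, bottles, _ in log:
    print(name, bottles)
```

In this sketch the two generators strictly take turns, and every log entry carries the same thread id, so nothing here runs in parallel.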
This is also my impression when I look at this code (see http://www.99-bottles-of-beer.net/language-d-2547.html), which implements 99 Bottles of Beer in D with fibers. What seems to be happening is an alternating handover of the CPU between the fibers.
But when I run that D code, all 4 cores of my machine are under load, and it looks as if the runtime were able to run things in parallel somehow. Now I'm really confused ...