Hi Elad,

On 02.05.2016 12:21, Elad Kanfi wrote:

Since this driver handles one frame at a time, I couldn't find a case that 
requires the paired barrier.

The order of reads is not critical once it is assured that the value is valid.


Please see sections "SMP BARRIER PAIRING" and "EXAMPLES OF MEMORY BARRIER 
SEQUENCES" in
 memory-barriers.txt for a description why smp barriers have to be paired and a 
smp write
barrier on CPU A without a read barrier on CPU B is _not_ sufficient.

Furthermore after having a closer look into the problem you want to fix, it 
seems that
you also have to ensure the correct order of the assignment of "tx_packet_sent" 
and the
corresponding skb.

On CPU A you do:

1. tx_skb = skb;
2. tx_packet_sent = true;

On CPU B (the irq handler) you do:

1. Check/read tx_packet_sent.
2.If set then handle tx_skb.

So there is a logical dependency between tx_skb and tx_packet_sent (tx_skb 
should only
be handled if tx_packet_sent is set). However both compiler and CPU are free to 
reorder
steps 1. and 2. on both CPU A and CPU B and you have to ensure that tx_skb 
actually contains
a valid pointer for CPU B as soon at it sees tx_packet_sent == true. So what 
you need here is:

CPU A:
1. tx_skb = skb
2. smp_wmb()
3. tx_packet_sent = true;

and CPU B:

1. read tx_packet_sent
2. smp_rmb()
3. If set then handle tx_skb

So this is to ensure that tx_skb is in sync with tx_packet_sent on both writer 
and reader side.


Then there is the second problem (the one you tried to address with your patch) 
that you have to make
sure that by the time you inform the hardware about the new frame and the thus 
the tx-done irq may occur
both tx_packet_sent and tx_skb are actually set in the memory. Otherwise the 
irq may occur but CPU B may
not see tx_packet_sent == true and the irq is lost. To achieve this you would 
have to use a (non-smp)
write barrier before you inform the HW. So for CPU A the sequence would extend 
to:

1. tx_skb = skb
2. smp_wmb() /* ensure correct order between both assignments */
3. tx_packet_sent = true;
4. wmb() /* make sure both assignments reach RAM before the HW is informed and 
the IRQ is fired */
5. nps_enet_reg_set(priv, NPS_ENET_REG_TX_CTL, tx_ctrl.value);

The latter barrier does not require pairing and has to be there even on UP 
systems because you dont want to
sync between CPU A and CPU B in this case but rather between system RAM and 
IO-MEMORY.

Please note that this is only how I understood things and it may be totally 
wrong. But I am sure that the initial
solution with only one smp_wmb() is not correct either.

Regards,
Lino

Reply via email to