On Mon, 2013-11-18 at 23:47 +0200, Michael S. Tsirkin wrote: > On Mon, Nov 18, 2013 at 02:33:16PM -0700, Alex Williamson wrote: > > On Mon, 2013-11-18 at 15:57 -0500, Vlad Yasevich wrote: > > > On 11/18/2013 03:33 PM, Alex Williamson wrote: > > > > On Mon, 2013-11-18 at 15:09 -0500, Vlad Yasevich wrote: > > > >> On 11/18/2013 02:58 PM, Alex Williamson wrote: > > > >>> On Mon, 2013-11-18 at 21:47 +0200, Michael S. Tsirkin wrote: > > > >>>> This reverts commit cd5be5829c1ce87aa6b3a7806524fac07ac9a757. > > > >>>> Digging into hardware specs shows this does not > > > >>>> actually make QEMU behave more like hardware. > > > >>>> Let's stick to the tried heuristic for 1.7 and > > > >>>> possibly revisit for 1.8. > > > >>> > > > >>> If this is broken, then so are these: > > > >>> > > > >>> 23c37c37f0280761072c23bf67d3a4f3c0ff25aa > > > >>> 7c36507c2b8776266f50c5e2739bd18279953b93 > > > >> > > > >> These aren't really broken. They just assume that the high order > > > >> writes will happen after the low order writes. > > > >> > > > >> In the case of e1000, this is a little more then an assumption > > > >> (particularly in the case of nic initilization). > > > > > > > > But AIUI there's also a valid bit in that high order byte on e1000, so > > > > reverting cd5be582 means we stuff a new mac into qemu less often, but > > > > it's still only accurate some of the time. > > > > > > Yes, there is a slight issue with validity of mac at the time of > > > processing packets. I have an outstanding question on the Intel > > > list about this behavior with real HW. But, with e1000, the validity > > > bit provides a much higher guarantee that a guest that will be > > > setting the mac address will write the high register second to > > > guarantee that when the valid bit is written, the mac is fully > > > valid. As a result we don't really need the e1000 part of the > > > cd5be5829. > > > > But doesn't that valid bit mean that a mac update will start and end > > with a write to the high order register? So we're assuming: > > > > a) write RA + 1 (invalidate) > > b) write RA (write low) > > c) write RA + 1 (write high + valid) > > > > Without cd5be5829 the only change is that we don't store a new mac into > > the monitor at b). The mac stored in the monitor is still wrong from a) > > until c). So it's ever so slightly less broken without cd5be5829. > > Yes. > > > > > > > > >> In the case of RTL nic, it is just an assumption, but it hasn't > > > >> been shown faulty yet. We do plan to address this a bit more > > > >> thoroughly. > > > > > > > > So how is RTL less broken without cd5be582? AIUI the valid bit is off > > > > in a separate register on RTL, so we have no guarantee about order of > > > > updating the mac. Without cd5be582 the info in the monitor may be > > > > permanently broken if the guest uses a write order other than what we > > > > assume. > > > > > > > > > > This one is actually not as bad either. RTL spec requires that > > > receive register writes happen as 32 bit word writes. This is > > > what linux and bsd drivers do, so from driver perspective, the > > > issue is the same. What our emulation layer does is turn these > > > 32 bit writes into 4 8-bit writes. This is likely due to some > > > very broken and very old drivers, but I am not sure. > > > > > > So, the information in the monitor will be broken if the guest > > > does: write_hi(); write_lo(); A part of me would really like > > > to see a guest that does this :) > > > > So the argument for reverting here is that it seems unlikely that the > > dwords get written in the hi->lo order and we'd rather the monitor get > > stuck with the wrong mac forever than it show the wrong mac address for > > a tiny window for everybody else? I think you say something about > > sub-optimal here... > > The argument is that I don't think it makes sense to maintain two > differently broken implementations in 1.6 and 1.7.
Maintain 1.6 support after 1.7 is released... is this some new policy? > It's better to have them broken in the same way and > then maybe have a non broken one in 1.8. So 1.6 is broken because rtl/e1000 never informs the monitor about mac updates. QEMU 1.7 adds the following (maybe others, these are just the ones I know of): 655d3b63 - Update rtl/e1000 on reset 7c36507c - Update e1000 on high dword mac write 23c37c37 - Update rtl on high byte mac write cd5be582 - Update rtl/e1000 on any mac write And you're suggesting that reverting just that last commit makes a significant enough difference to make it worthwhile to do so? I'm going to go back to Vlad's replacing sub-optimal with sub-optimal comment. > Or just leave well alone - no one complained so far, > this change is of the "I read the spec and it seems > to say we should do X" variety. I think it's great that Vlad did the research and we now know how to fix these correctly. I suggest we leave an incremental improvement alone in 1.7 and think about implementing it correctly in 1.8. Reading the spec and figuring out how it's supposed to work seems like a better idea than guessing which data element gets written last. > > > The current code isn't perfect either. It still has a potential > > > to show the wrong mac address in the monitor. I doesn't make > > > a lot of sense to me to replace one sub-optimal solution > > > with another sub-optimal solution, especially since no-one > > > complained. > > > > Exactly, the code isn't perfect either way and this revert is just > > replacing one sub-optimal solution for another. So why do it? > > Because we had the old buggy behaviour for years and no one > complained. Maybe the new buggy behaviour will also be fine. > But I'm not inclined to take the chance. Except somebody did complain and filed a bug that Amos was trying to fix. This one revert does not bring us back to 1.6 behavior and does not make a significant improvement in the current behavior, so I just don't see the point. Thanks, Alex > > > >> The patch that was applied was controversial and more then 1 person > > > >> expressed reservations. > > > > > > > > Understood, but it also raised and addressed a shortcoming in the > > > > previous patches. If cd5be582 was controversial because the monitor was > > > > getting updated with incorrect mac addresses then this simple revert > > > > doesn't solve that problem. Thanks, > > > > > > > > Alex > > > > > > > >>> > > > >>> None of these change the behavior of hardware, they only change when > > > >>> the > > > >>> monitor gets told about mac address changes. I'd suggest either add > > > >>> the > > > >>> emulation described in each spec or revert all of them. A partial > > > >>> revert is just noise. Thanks, > > > >>> > > > >>> Alex > > > >>> > > > >>>> > > > >>>> Reported-by: Vlad Yasevich <vyase...@redhat.com> > > > >>>> Cc: Amos Kong <ak...@redhat.com> > > > >>>> Cc: Alex Williamson <alex.william...@redhat.com> > > > >>>> --- > > > >>>> hw/net/e1000.c | 2 +- > > > >>>> hw/net/rtl8139.c | 5 ++++- > > > >>>> 2 files changed, 5 insertions(+), 2 deletions(-) > > > >>>> > > > >>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c > > > >>>> index ae63591..8387443 100644 > > > >>>> --- a/hw/net/e1000.c > > > >>>> +++ b/hw/net/e1000.c > > > >>>> @@ -1106,7 +1106,7 @@ mac_writereg(E1000State *s, int index, > > > >>>> uint32_t val) > > > >>>> > > > >>>> s->mac_reg[index] = val; > > > >>>> > > > >>>> - if (index == RA || index == RA + 1) { > > > >>>> + if (index == RA + 1) { > > > >>>> macaddr[0] = cpu_to_le32(s->mac_reg[RA]); > > > >>>> macaddr[1] = cpu_to_le32(s->mac_reg[RA + 1]); > > > >>>> qemu_format_nic_info_str(qemu_get_queue(s->nic), (uint8_t > > > >>>> *)macaddr); > > > >>>> diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c > > > >>>> index 7f2b4db..5329f44 100644 > > > >>>> --- a/hw/net/rtl8139.c > > > >>>> +++ b/hw/net/rtl8139.c > > > >>>> @@ -2741,7 +2741,10 @@ static void rtl8139_io_writeb(void *opaque, > > > >>>> uint8_t addr, uint32_t val) > > > >>>> > > > >>>> switch (addr) > > > >>>> { > > > >>>> - case MAC0 ... MAC0+5: > > > >>>> + case MAC0 ... MAC0+4: > > > >>>> + s->phys[addr - MAC0] = val; > > > >>>> + break; > > > >>>> + case MAC0+5: > > > >>>> s->phys[addr - MAC0] = val; > > > >>>> qemu_format_nic_info_str(qemu_get_queue(s->nic), > > > >>>> s->phys); > > > >>>> break; > > > >>> > > > >>> > > > >>> > > > >> > > > > > > > > > > > > > > > > > > >