[Qemu-devel] [PATCH 0/2] A few race conditions in E1000 device fixed
The following patches fix a few race conditions in the E1000 code: The 1st patch fixes a race condition between driver shutdown and device shutdown (see the patch comment). It also works around a race condition in the e1000 Linux driver between RX enable and RX ring init (a separate patch for the second problem was sent to e1000-devel/linux/kernel and accepted by the maintainers, see http://sourceforge.net/mailarchive/forum.php?thread_name=1350280341.2152.12.camel%40jtkirshe-mobl&forum_name=e1000-devel). The 2nd patch is pretty trivial and adds a forgotten field to the live migration list, thus fixing another race condition. Dmitry Fleytman (2): Fix a race condition in E1000 device implementation: Fix a race condition in E1000 device live migration. One of data-transfer related flags not in migrated fields list. hw/e1000.c | 25 +++-- 1 file changed, 23 insertions(+), 2 deletions(-) -- 1.7.11.4
[Qemu-devel] [PATCH 2/2] Fix a race condition in E1000 device live migration. One of data-transfer related flags not in migrated fields list.
Signed-off-by: Dmitry Fleytman --- hw/e1000.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/hw/e1000.c b/hw/e1000.c index 1e66ecf..efbe0c9 100644 --- a/hw/e1000.c +++ b/hw/e1000.c @@ -92,7 +92,7 @@ typedef struct E1000State_st { uint32_t rxbuf_size; uint32_t rxbuf_min_shift; -int check_rxov; +uint32_t check_rxov; uint32_t rx_init_done; struct e1000_tx { unsigned char header[256]; @@ -1120,6 +1120,7 @@ static const VMStateDescription vmstate_e1000 = { VMSTATE_UNUSED(4), /* Was mmio_base. */ VMSTATE_UINT32(rxbuf_size, E1000State), VMSTATE_UINT32(rxbuf_min_shift, E1000State), +VMSTATE_UINT32(check_rxov, E1000State), VMSTATE_UINT32(rx_init_done, E1000State), VMSTATE_UINT32(eecd_state.val_in, E1000State), VMSTATE_UINT16(eecd_state.bitnum_in, E1000State), -- 1.7.11.4
[Qemu-devel] [PATCH 1/2] Fix a race condition in E1000 device implementation:
The device driver enables RX on shutdown after HW reset if the device is configured for wake on LAN. Although RX is enabled, the corresponding rings are not initialized and the descriptor addresses of the RX ring are invalid. If a packet arrives at the wrong moment, QEMU crashes due to an invalid guest memory access attempt, or guest memory gets corrupted. Reported-by: Chris Webb Reported-by: Richard Davies Signed-off-by: Dmitry Fleytman --- hw/e1000.c | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/hw/e1000.c b/hw/e1000.c index 63fee10..1e66ecf 100644 --- a/hw/e1000.c +++ b/hw/e1000.c @@ -93,6 +93,7 @@ typedef struct E1000State_st { uint32_t rxbuf_size; uint32_t rxbuf_min_shift; int check_rxov; +uint32_t rx_init_done; struct e1000_tx { unsigned char header[256]; unsigned char vlan_header[4]; @@ -267,6 +268,7 @@ static void e1000_reset(void *opaque) { E1000State *d = opaque; +d->check_rxov = 1; qemu_del_timer(d->autoneg_timer); memset(d->phy_reg, 0, sizeof d->phy_reg); memmove(d->phy_reg, phy_reg_init, sizeof phy_reg_init); @@ -291,6 +293,12 @@ static void set_rx_control(E1000State *s, int index, uint32_t val) { s->mac_reg[RCTL] = val; + +if (!(s->mac_reg[RCTL] & E1000_RCTL_EN)) { +s->rx_init_done = 0; +s->check_rxov = 1; +} + s->rxbuf_size = rxbufsize(val); s->rxbuf_min_shift = ((val / E1000_RCTL_RDMTS_QUAT) & 3) + 1; DBGOUT(RX, "RCTL: %d, mac_reg[RCTL] = 0x%x\n", s->mac_reg[RDT], @@ -925,8 +933,18 @@ mac_writereg(E1000State *s, int index, uint32_t val) static void set_rdt(E1000State *s, int index, uint32_t val) { -s->check_rxov = 0; s->mac_reg[index] = val & 0xffff; + +if (s->mac_reg[index] || s->rx_init_done) { +s->check_rxov = 0; +/* This is a fix for RX initialization race +* present in e1000 driver on some kernels. +* We consider RX enabled only after we've seen +* at least one RX descriptor filled by guest.
+*/ +s->rx_init_done = 1; +} + if (e1000_has_rxbufs(s, 1)) { qemu_flush_queued_packets(&s->nic->nc); } @@ -1102,6 +1120,7 @@ static const VMStateDescription vmstate_e1000 = { VMSTATE_UNUSED(4), /* Was mmio_base. */ VMSTATE_UINT32(rxbuf_size, E1000State), VMSTATE_UINT32(rxbuf_min_shift, E1000State), +VMSTATE_UINT32(rx_init_done, E1000State), VMSTATE_UINT32(eecd_state.val_in, E1000State), VMSTATE_UINT16(eecd_state.bitnum_in, E1000State), VMSTATE_UINT16(eecd_state.bitnum_out, E1000State), @@ -1269,6 +1288,7 @@ static int pci_e1000_init(PCIDevice *pci_dev) add_boot_device_path(d->conf.bootindex, &pci_dev->qdev, "/ethernet-phy@0"); d->autoneg_timer = qemu_new_timer_ms(vm_clock, e1000_autoneg_timer, d); +d->check_rxov = 1; return 0; } -- 1.7.11.4
Re: [Qemu-devel] [PATCH 0/2] A few race conditions in E1000 device fixed
Hello, Please ignore the 1st patch for now. Although it fixes the observed problem, it looks like there is a better and easier solution (many thanks to the Intel guys who explained e1000 operation in detail: http://sourceforge.net/mailarchive/forum.php?thread_name=CAGHCxhcad%3Dzx7ihX5zoDB%3DZOLGGuZty%3DBck6zSoMQ-9S3ZJo7w%40mail.gmail.com&forum_name=e1000-devel). We'll submit the final patch soon. Dmitry. On Mon, Oct 15, 2012 at 6:48 PM, Dmitry Fleytman wrote: > The following patches fix a few race conditions in the E1000 code: > > The 1st patch fixes a race condition between driver shutdown and device shutdown > (see the patch comment). > It also works around a race condition in the e1000 Linux driver between RX enable > and RX ring init > (a separate patch for the second problem was sent to e1000-devel/linux/kernel and > accepted by the maintainers, see > > http://sourceforge.net/mailarchive/forum.php?thread_name=1350280341.2152.12.camel%40jtkirshe-mobl&forum_name=e1000-devel). > > The 2nd patch is pretty trivial and adds a forgotten field to the live migration > list, thus fixing another race condition. > > Dmitry Fleytman (2): > Fix a race condition in E1000 device implementation: > Fix a race condition in E1000 device live migration. One of > data-transfer related flags not in migrated fields list. > > hw/e1000.c | 25 +++-- > 1 file changed, 23 insertions(+), 2 deletions(-) > > -- > 1.7.11.4 > -- Dmitry Fleytman Technology Expert and Consultant, Daynix Computing Ltd. Cell: +972-54-2819481 Skype: dmitry.fleytman
[Qemu-devel] [PATCH 0/2] E1000 RX/Live migration bugs fixed
The following patches fix 2 problems in the E1000 code: The 1st patch fixes a race condition between RX enable and RX buffer allocation (see the commit description for a detailed explanation). The 2nd patch adds a forgotten field to the live migration list. I'm not really sure this is a proper solution with no regressions, so please take a look. Dmitry Fleytman (2): Ignore RX tail kicks when RX disabled. Add check_rxov into VMState. hw/e1000.c | 39 +-- 1 file changed, 33 insertions(+), 6 deletions(-) -- 1.7.11.4
[Qemu-devel] [PATCH 1/2] Ignore RX tail kicks when RX disabled.
Device RX initialization from the driver's side consists of the following steps: 1. Initialize head and tail of RX ring to 0 2. Enable Rx (set bit in RCTL register) 3. Allocate buffers, fill descriptors 4. Write ring tail The fourth operation signals to the hardware that RX buffers are available and that it may start indicating packets. The current implementation treats the first operation (writing 0 to the ring tail) as a signal that buffers are available and starts data transfers as soon as the RX enable indication arrives. This is not correct because there is a chance that the ring is still empty (third step not performed yet), and then memory corruption occurs. The device has to ignore RX tail kicks unless RX is enabled. Reported-by: Chris Webb Reported-by: Richard Davies Signed-off-by: Dmitry Fleytman --- hw/e1000.c | 29 + 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/hw/e1000.c b/hw/e1000.c index 63fee10..606bf3a 100644 --- a/hw/e1000.c +++ b/hw/e1000.c @@ -267,6 +267,7 @@ static void e1000_reset(void *opaque) { E1000State *d = opaque; +d->check_rxov = 1; qemu_del_timer(d->autoneg_timer); memset(d->phy_reg, 0, sizeof d->phy_reg); memmove(d->phy_reg, phy_reg_init, sizeof phy_reg_init); @@ -285,6 +286,10 @@ set_ctrl(E1000State *s, int index, uint32_t val) { /* RST is self clearing */ s->mac_reg[CTRL] = val & ~E1000_CTRL_RST; + +if (val & E1000_CTRL_RST) { +s->check_rxov = 1; +} } static void @@ -754,12 +759,18 @@ static bool e1000_has_rxbufs(E1000State *s, size_t total_size) return total_size <= bufs * s->rxbuf_size; } +static inline bool +is_receive_enabled(E1000State *s) +{ +return s->mac_reg[RCTL] & E1000_RCTL_EN; +} + static int e1000_can_receive(NetClientState *nc) { E1000State *s = DO_UPCAST(NICState, nc, nc)->opaque; -return (s->mac_reg[RCTL] & E1000_RCTL_EN) && e1000_has_rxbufs(s, 1); +return is_receive_enabled(s) && e1000_has_rxbufs(s, 1); } static uint64_t rx_desc_base(E1000State *s) @@ -785,8 +796,9 @@ e1000_receive(NetClientState *nc, const uint8_t *buf, size_t size) size_t desc_size; size_t
total_size; -if (!(s->mac_reg[RCTL] & E1000_RCTL_EN)) +if (!is_receive_enabled(s)) { return -1; +} /* Pad to minimum Ethernet frame length */ if (size < sizeof(min_buf)) { @@ -925,8 +937,12 @@ mac_writereg(E1000State *s, int index, uint32_t val) static void set_rdt(E1000State *s, int index, uint32_t val) { -s->check_rxov = 0; s->mac_reg[index] = val & 0xffff; + +if (is_receive_enabled(s)) { +s->check_rxov = 0; +} + if (e1000_has_rxbufs(s, 1)) { qemu_flush_queued_packets(&s->nic->nc); } @@ -1065,7 +1081,12 @@ static void e1000_io_write(void *opaque, target_phys_addr_t addr, { E1000State *s = opaque; -(void)s; +switch (addr) { +case E1000_CTRL_DUP: +if (val & E1000_CTRL_RST) { +s->check_rxov = 1; +} +} } static const MemoryRegionOps e1000_io_ops = { -- 1.7.11.4
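The window between steps 2 and 3 of the initialization sequence in this commit message can be modelled in a few lines of C. This is only a sketch under stated assumptions: struct rx_ring_model and all function names are hypothetical and are not part of hw/e1000.c. has_rxbufs_old() follows the short-packet test of e1000_has_rxbufs() as it stood before this series, and the model takes check_rxov to be 0 after the driver writes 0 to the ring tail, which is what the unpatched set_rdt() did.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of the RX ring state.
 * This is not QEMU's E1000State. */
struct rx_ring_model {
    uint32_t rdh;        /* ring head, advanced by the device */
    uint32_t rdt;        /* ring tail, written by the driver */
    bool     rx_enabled; /* stands in for RCTL & E1000_RCTL_EN */
    bool     check_rxov; /* flag consulted by the unpatched code */
};

/* Short-packet availability test as in the unpatched code:
 * equal head and tail count as "buffers available" when
 * check_rxov is clear. */
static bool has_rxbufs_old(const struct rx_ring_model *s)
{
    return s->rdh != s->rdt || !s->check_rxov;
}

/* Alternative rule: RDH == RDT always means "no buffers". */
static bool has_rxbufs_empty_when_equal(const struct rx_ring_model *s)
{
    return s->rdh != s->rdt;
}

/* State after driver steps 1 and 2: head and tail were zeroed
 * (the tail write cleared check_rxov), then RX was enabled.
 * Steps 3 and 4 (fill descriptors, write tail) have not run. */
static struct rx_ring_model state_after_step_2(void)
{
    struct rx_ring_model s = { 0 };
    s.rdh = 0;
    s.rdt = 0;
    s.check_rxov = false;
    s.rx_enabled = true;
    return s;
}
```

In the state returned by state_after_step_2(), RX is enabled and has_rxbufs_old() reports buffers available although the driver has posted none, whereas has_rxbufs_empty_when_equal() reports none.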
[Qemu-devel] [PATCH 2/2] Add check_rxov into VMState.
E1000State::check_rxov field must be preserved on live migration. Signed-off-by: Dmitry Fleytman --- hw/e1000.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/hw/e1000.c b/hw/e1000.c index 606bf3a..26ad03c 100644 --- a/hw/e1000.c +++ b/hw/e1000.c @@ -92,7 +92,7 @@ typedef struct E1000State_st { uint32_t rxbuf_size; uint32_t rxbuf_min_shift; -int check_rxov; +uint32_t check_rxov; struct e1000_tx { unsigned char header[256]; unsigned char vlan_header[4]; @@ -1100,6 +1100,11 @@ static bool is_version_1(void *opaque, int version_id) return version_id == 1; } +static bool is_version_3(void *opaque, int version_id) +{ +return version_id == 1; +} + static int e1000_post_load(void *opaque, int version_id) { E1000State *s = opaque; @@ -1113,7 +1118,7 @@ static int e1000_post_load(void *opaque, int version_id) static const VMStateDescription vmstate_e1000 = { .name = "e1000", -.version_id = 2, +.version_id = 3, .minimum_version_id = 1, .minimum_version_id_old = 1, .post_load = e1000_post_load, @@ -1123,6 +1128,7 @@ static const VMStateDescription vmstate_e1000 = { VMSTATE_UNUSED(4), /* Was mmio_base. */ VMSTATE_UINT32(rxbuf_size, E1000State), VMSTATE_UINT32(rxbuf_min_shift, E1000State), +VMSTATE_UINT32_TEST(check_rxov, E1000State, is_version_3), VMSTATE_UINT32(eecd_state.val_in, E1000State), VMSTATE_UINT16(eecd_state.bitnum_in, E1000State), VMSTATE_UINT16(eecd_state.bitnum_out, E1000State), -- 1.7.11.4
Re: [Qemu-devel] [PATCH 2/2] Add check_rxov into VMState.
Oops, you are right :) On Thu, Oct 18, 2012 at 9:24 AM, Stefan Hajnoczi wrote: > On Wed, Oct 17, 2012 at 08:31:47PM +0200, Dmitry Fleytman wrote: >> @@ -1100,6 +1100,11 @@ static bool is_version_1(void *opaque, int version_id) >> return version_id == 1; >> } >> >> +static bool is_version_3(void *opaque, int version_id) >> +{ >> +return version_id == 1; >> +} > > version_id == 3? -- Dmitry Fleytman Technology Expert and Consultant, Daynix Computing Ltd. Cell: +972-54-2819481 Skype: dmitry.fleytman
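For reference, the predicate as posted and as corrected in the review above can be put side by side. A minimal sketch, assuming nothing beyond the callback signature shown in the patch; the _as_posted suffix is a hypothetical name used here only so that both variants fit in one file.

```c
#include <stdbool.h>
#include <stddef.h>

/* As posted: named for version 3 but compares against 1,
 * so the field test never matches a version 3 stream. */
static bool is_version_3_as_posted(void *opaque, int version_id)
{
    (void)opaque; /* unused in this sketch */
    return version_id == 1;
}

/* As suggested in review: compare against 3. */
static bool is_version_3(void *opaque, int version_id)
{
    (void)opaque; /* unused in this sketch */
    return version_id == 3;
}
```

With the posted variant, a field guarded by the test would be looked for only in version 1 streams, which is the opposite of the intent.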
Re: [Qemu-devel] [PATCH 1/2] Ignore RX tail kicks when RX disabled.
Stefan, The real purpose of check_rxov is a bit confusing indeed, mainly because of its unclear name (rename?); however, it works as follows: There are 2 possible cases when RDT == RDH for the RX ring: 1. The device has used all the buffers in the ring, no empty buffers available 2. The driver has fully refilled the ring and all buffers are empty and ready to use check_rxov is used to distinguish these 2 cases: 1. It must be 1 initially (init, reset, etc.) 2. It must be set to 1 when the device uses a buffer 3. It must be set to 0 when the driver adds a buffer to the ring check_rxov == 1 - ring is empty check_rxov == 0 - ring is full Indeed, the RX init sequence doesn't look logical; however, this is the way all Intel drivers behave, from e1000 up to ixgbe. Also see some explanation here: http://permalink.gmane.org/gmane.linux.kernel/1375917 If we drop check_rxov and always treat RDH == RDT as an empty ring, we'll probably get correct behavior for the current Linux driver's code (needs testing, of course); however, we have no idea how Windows drivers work. Also, drivers tend to change... Dmitry. On Thu, Oct 18, 2012 at 10:09 AM, Stefan Hajnoczi wrote: > On Wed, Oct 17, 2012 at 08:31:46PM +0200, Dmitry Fleytman wrote: >> Device RX initialization from the driver's side consists of the following steps: >> 1. Initialize head and tail of RX ring to 0 >> 2. Enable Rx (set bit in RCTL register) >> 3. Allocate buffers, fill descriptors >> 4. Write ring tail >> >> The fourth operation signals to the hardware that RX buffers are available >> and that it may start indicating packets. >> >> The current implementation treats the first operation (writing 0 to the ring tail) >> as a signal that buffers are available and starts data transfers as soon >> as the RX enable indication arrives. >> >> This is not correct because there is a chance that the ring is still >> empty (third step not performed yet), and then memory corruption >> occurs. > > Any idea what the point of hw/e1000.c check_rxov is? I see nothing in > the datasheet that requires these semantics.
> > The Linux e1000 driver never enables the RXO (rx fifo overflow) > interrupt, only RXDMT0 (receive descriptor minimum threshold). This > means hw/e1000.c will not upset the Linux e1000 driver when > e1000_receive() gets called with check_rxov == 1 and RDH == RDT == 0. > > BTW the Linux e1000 driver does not follow the sequence recommended in > the datasheet 14.4 Receive Initialization, which would avoid the weird > window of time where RDH == RDT == 0. > > If we get rid of check_rxov and always check rxbuf space then we have > the correct behavior. I'm a little nervous of simply dropping it > because its purpose is unclear to me :(. > > Stefan -- Dmitry Fleytman Technology Expert and Consultant, Daynix Computing Ltd. Cell: +972-54-2819481 Skype: dmitry.fleytman
Re: [Qemu-devel] [PATCH 1/2] Ignore RX tail kicks when RX disabled.
Hello, Stefan The problem occurs between steps 2 and 3. Let's say a packet arrives after step 2 is done by the driver. Head and tail are 0 because of step 1. check_rxov is 0 for two reasons: 1. On startup it is 0 by default 2. It is zeroed by setting the ring tail to 0 in the first step Then the first check ( __if (!(s->mac_reg[RCTL] & E1000_RCTL_EN))__ ) passes because RX was enabled in step 2. e1000_has_rxbufs() returns true because it treats equal head and tail as a fully filled ring when check_rxov is 0: static bool e1000_has_rxbufs(E1000State *s, size_t total_size) { [...] if (total_size <= s->rxbuf_size) { return s->mac_reg[RDH] != s->mac_reg[RDT] || !s->check_rxov; [...] } else if (s->mac_reg[RDH] > s->mac_reg[RDT] || !s->check_rxov) { [...] } So QEMU reads an uninitialized descriptor and tries to perform "DMA" to an arbitrary address taken from the descriptor. Depending on the address value, it corrupts guest memory or abort()s here: void *qemu_get_ram_ptr(ram_addr_t addr) { [...] fprintf(stderr, "Bad ram offset %" PRIx64 "\n", (uint64_t)addr); abort(); [...] } Thanks for the review, Dmitry. On Thu, Oct 18, 2012 at 9:31 AM, Stefan Hajnoczi wrote: > > On Wed, Oct 17, 2012 at 08:31:46PM +0200, Dmitry Fleytman wrote: > > Device RX initialization from the driver's side consists of the following steps: > > 1. Initialize head and tail of RX ring to 0 > > 2. Enable Rx (set bit in RCTL register) > > 3. Allocate buffers, fill descriptors > > 4. Write ring tail > > > > The fourth operation signals to the hardware that RX buffers are available > > and that it may start indicating packets. > > > > The current implementation treats the first operation (writing 0 to the ring tail) > > as a signal that buffers are available and starts data transfers as soon > > as the RX enable indication arrives. > > > > This is not correct because there is a chance that the ring is still > > empty (third step not performed yet), and then memory corruption > > occurs.
> > The existing code tries to prevent this: > > e1000_receive(NetClientState *nc, const uint8_t *buf, size_t size) > { > [...] > > if (!(s->mac_reg[RCTL] & E1000_RCTL_EN)) > return -1; > > [...] > total_size = size + fcs_len(s); > if (!e1000_has_rxbufs(s, total_size)) { > set_ics(s, 0, E1000_ICS_RXO); > return -1; > } > > Why are these checks not enough? > > Which memory gets corrupted? > > Stefan -- Dmitry Fleytman Technology Expert and Consultant, Daynix Computing Ltd. Cell: +972-54-2819481 Skype: dmitry.fleytman
Re: [Qemu-devel] [PATCH 1/2] Ignore RX tail kicks when RX disabled.
Great! Thanks, Alex. I'll prepare a new changeset that drops check_rxov completely. Also migration-related patch becomes unneeded with this solution. On Thu, Oct 18, 2012 at 6:06 PM, Alexander Duyck wrote: > On 10/18/2012 07:31 AM, Stefan Hajnoczi wrote: >> On Thu, Oct 18, 2012 at 10:34 AM, Dmitry Fleytman wrote: >>> The real purpose of check_rxov it a bit confusing indeed, mainly >>> because of unclear name (rename?), >>> however it works as following: >>> >>> There are 2 possible when RDT == RDH for RX ring: >>> 1. Device used all the buffers from ring, no empty buffers available >>> 2. Driver fully refilled the ring and all buffers are empty and ready >>> to use > > The 2nd case is not true. We should only have RDT == RDH when the ring > is empty. If RDT == RDH and the ring is full then we have a bug in the > driver. The driver should only ever allow RDT to be one less than head, > or ring size - 1 if head is 0. > >>> check_rxov is used to distinguish these 2 cases: >>> 1. It must be 1 initially (init, reset, etc.) >>> 2. It must be set to one when device uses buffer >>> 3. It must be set to 0 when driver adds buffer to the ring >>> check_rxov == 1 - ring is empty >>> check_rxov == 0 - ring is full >>> >>> Indeed, RX init sequence doesn't look logical, however this is the way >>> all Intel driver behave from e1000 and up to ixgbe. >>> Also see some explanation here: >>> http://permalink.gmane.org/gmane.linux.kernel/1375917 >>> >>> If we drop check_rxov and always treat RDH == RDT as empty ring we'll >>> probably get correct behavior for current Linux driver's code (needs >>> testing of course), >>> however we have no idea how Windows drivers work. > > The windows driver should work the same way. If RDH == RDT the hardware > will treat that as a empty ring and will hang. If there is a driver > that is setting RDH == RDT to indicate the ring is full please let us > know as that is likely a buggy driver. > >> Thanks, for the great explanation, Dmitry. 
>> >> Alexander: I CCed you because I hope you might be able to explain what >> the 82540EM card does when a driver sets RDT to the value of RDH. The >> QEMU NIC emulation code treats this as a full ring (i.e. the >> descriptors are valid and will be filled in by the hardware). Does >> the real hardware act like this or will it treat this condition as >> ring empty (i.e. if the driver sets RDT to the value of RDH then the >> hardware stops receive because there are no descriptors available)? >> >> I can't find a statement in the Intel datasheet about what happens >> when the driver sets RDT = RDH. The QEMU check_rxov variable is >> trying to distinguish between ring empty (RDH has moved to RDT) and >> ring full (driver has set RDH = RDT because the full descriptor ring >> is available). > > If RDT == RDH then we should stop receiving traffic. As far as I know > all of our e1000 hardware treat RDT == RDH as an empty ring state. All > of the drivers should have code in place to stop it. For example the > E1000_DESC_UNUSED macro should be returning ring size - 1 in the case of > RDT == RDH which will result in the head being 0 and the tail being ring > size - 2. > >> Dmitry: At this point we'd need to test what happens on real hardware >> when RDH = RDT in order to be able to remove check_rxov. 
As you >> mentioned, with the Linux e1000 driver we don't see ring full RDH = >> RDT: >> >> /* call E1000_DESC_UNUSED which always leaves >> * at least 1 descriptor unused to make sure >> * next_to_use != next_to_clean */ >> for (i = 0; i < adapter->num_rx_queues; i++) { >> struct e1000_rx_ring *ring = &adapter->rx_ring[i]; >> adapter->alloc_rx_buf(adapter, ring, >> E1000_DESC_UNUSED(ring)); >> } >> >> Here some sample output from a QEMU printf, notice how RDH is never >> the same as RDT once rx begins: >> >> set_rdt rdh=0 rdt_old=0 rdt_new=0 >> set_rdt rdh=0 rdt_old=0 rdt_new=254 >> set_rdt rdh=1 rdt_old=254 rdt_new=255 >> set_rdt rdh=2 rdt_old=255 rdt_new=0 >> set_rdt rdh=3 rdt_old=0 rdt_new=1 >> set_rdt rdh=4 rdt_old=1 rdt_new=2 >> set_rdt rdh=5 rdt_old=2 rdt_new=3 >> set_rdt rdh=6 rdt_old=3 rdt_new=4 >> set_rdt rdh=7 rdt_old=4 rdt_new=5 >> set_rdt rdh=9 rdt_old=5 rdt_new=7 >> set_rdt rdh=10 rdt_old=7 rdt_new=8 >> set_rdt rdh=11 rdt_
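The "always leave one descriptor unused" rule described above can be sketched as follows. The function is modelled on the Linux driver's E1000_DESC_UNUSED macro but written independently of the driver source, so the exact arithmetic should be treated as an assumption. struct ring_model and desc_unused() are hypothetical names; next_to_clean and next_to_use are the index names used in the driver comment quoted above.

```c
#include <stdint.h>

/* Hypothetical model of a driver-side ring: `count` descriptors,
 * `next_to_use` is where the driver posts the next buffer,
 * `next_to_clean` is the oldest descriptor not yet reclaimed. */
struct ring_model {
    uint32_t count;
    uint32_t next_to_clean;
    uint32_t next_to_use;
};

/* Number of descriptors the driver may still post. One slot is
 * always held back so that next_to_use never catches up with
 * next_to_clean; head == tail can then only mean "empty". */
static uint32_t desc_unused(const struct ring_model *r)
{
    uint32_t base = (r->next_to_clean > r->next_to_use) ? 0 : r->count;
    return base + r->next_to_clean - r->next_to_use - 1;
}
```

For a 256-entry ring with both indices at 0 this yields 255, so the driver posts 255 buffers and the tail ends up at 254, which matches the "rdt_new=254" line in the log above.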
[Qemu-devel] [PATCH V2] E1000 RX ring management fix
The following patch fixes improper RX ring management in the E1000 code. Changes from version 1: The 1st patch was changed so that it drops the check_rxov field, because it is redundant and leads to race conditions (see the commit description for details). The 2nd patch (live migration) was dropped because the corresponding field got deleted. Also, I've made a short experiment with an Intel adapter controlled by the e1000e driver. Indeed, I saw no RX indication attempt when the RX ring's RDH and RDT are equal. Dmitry Fleytman (1): Drop check_rxov, always treat RX ring with RHD == RDT as empty hw/e1000.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) -- 1.7.11.4
[Qemu-devel] [PATCH V2] Drop check_rxov, always treat RX ring with RHD == RDT as empty
Real HW always treats RX ring with RDH == RDT as empty. Emulation is supposed to behave the same. Reported-by: Chris Webb Reported-by: Richard Davies Signed-off-by: Dmitry Fleytman --- hw/e1000.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/hw/e1000.c b/hw/e1000.c index 63fee10..ab39d47 100644 --- a/hw/e1000.c +++ b/hw/e1000.c @@ -92,7 +92,6 @@ typedef struct E1000State_st { uint32_t rxbuf_size; uint32_t rxbuf_min_shift; -int check_rxov; struct e1000_tx { unsigned char header[256]; unsigned char vlan_header[4]; @@ -741,11 +740,11 @@ static bool e1000_has_rxbufs(E1000State *s, size_t total_size) int bufs; /* Fast-path short packets */ if (total_size <= s->rxbuf_size) { -return s->mac_reg[RDH] != s->mac_reg[RDT] || !s->check_rxov; +return s->mac_reg[RDH] != s->mac_reg[RDT]; } if (s->mac_reg[RDH] < s->mac_reg[RDT]) { bufs = s->mac_reg[RDT] - s->mac_reg[RDH]; -} else if (s->mac_reg[RDH] > s->mac_reg[RDT] || !s->check_rxov) { +} else if (s->mac_reg[RDH] > s->mac_reg[RDT]) { bufs = s->mac_reg[RDLEN] / sizeof(struct e1000_rx_desc) + s->mac_reg[RDT] - s->mac_reg[RDH]; } else { @@ -848,7 +847,6 @@ e1000_receive(NetClientState *nc, const uint8_t *buf, size_t size) if (++s->mac_reg[RDH] * sizeof(desc) >= s->mac_reg[RDLEN]) s->mac_reg[RDH] = 0; -s->check_rxov = 1; /* see comment in start_xmit; same here */ if (s->mac_reg[RDH] == rdh_start) { DBGOUT(RXERR, "RDH wraparound @%x, RDT %x, RDLEN %x\n", @@ -925,7 +923,6 @@ mac_writereg(E1000State *s, int index, uint32_t val) static void set_rdt(E1000State *s, int index, uint32_t val) { -s->check_rxov = 0; s->mac_reg[index] = val & 0xffff; if (e1000_has_rxbufs(s, 1)) { qemu_flush_queued_packets(&s->nic->nc); -- 1.7.11.4
Re: [Qemu-devel] [PATCH V2] Drop check_rxov, always treat RX ring with RHD == RDT as empty
Thanks, Peter I'll resend the patch. Sent from my iPad On Oct 18, 2012, at 9:20 PM, Peter Maydell wrote: > On 18 October 2012 19:59, Dmitry Fleytman wrote: >> Real HW always treats RX ring with RDH == RDT as empty. >> Emulation is supposed to behave the same. > > If you need to do a v3 of this patch for some reason, it would > be nice to amend the summary line so it started "e1000: " so > people scanning git logs know which device is affected. > > Also your commit message body says "RDH" but the subject > says "RHD"... > > thanks > -- PMM
[Qemu-devel] [PATCH V3] E1000 RX ring management fix
The following patch fixes improper RX ring management in the E1000 code. Changes from version 2: Commit message beautification. Changes from version 1: The 1st patch was changed so that it drops the check_rxov field, because it is redundant and leads to race conditions (see the commit description for details). The 2nd patch (live migration) was dropped because the corresponding field got deleted. Also, I've made a short experiment with an Intel adapter controlled by the e1000e driver. Indeed, I saw no RX indication attempt when the RX ring's RDH and RDT are equal. Dmitry Fleytman (1): e1000: drop check_rxov, always treat RX ring with RDH == RDT as empty hw/e1000.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) -- 1.7.11.4
[Qemu-devel] [PATCH V3] e1000: drop check_rxov, always treat RX ring with RDH == RDT as empty
Real HW always treats RX ring with RDH == RDT as empty. Emulation is supposed to behave the same. Reported-by: Chris Webb Reported-by: Richard Davies Signed-off-by: Dmitry Fleytman --- hw/e1000.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/hw/e1000.c b/hw/e1000.c index 63fee10..ab39d47 100644 --- a/hw/e1000.c +++ b/hw/e1000.c @@ -92,7 +92,6 @@ typedef struct E1000State_st { uint32_t rxbuf_size; uint32_t rxbuf_min_shift; -int check_rxov; struct e1000_tx { unsigned char header[256]; unsigned char vlan_header[4]; @@ -741,11 +740,11 @@ static bool e1000_has_rxbufs(E1000State *s, size_t total_size) int bufs; /* Fast-path short packets */ if (total_size <= s->rxbuf_size) { -return s->mac_reg[RDH] != s->mac_reg[RDT] || !s->check_rxov; +return s->mac_reg[RDH] != s->mac_reg[RDT]; } if (s->mac_reg[RDH] < s->mac_reg[RDT]) { bufs = s->mac_reg[RDT] - s->mac_reg[RDH]; -} else if (s->mac_reg[RDH] > s->mac_reg[RDT] || !s->check_rxov) { +} else if (s->mac_reg[RDH] > s->mac_reg[RDT]) { bufs = s->mac_reg[RDLEN] / sizeof(struct e1000_rx_desc) + s->mac_reg[RDT] - s->mac_reg[RDH]; } else { @@ -848,7 +847,6 @@ e1000_receive(NetClientState *nc, const uint8_t *buf, size_t size) if (++s->mac_reg[RDH] * sizeof(desc) >= s->mac_reg[RDLEN]) s->mac_reg[RDH] = 0; -s->check_rxov = 1; /* see comment in start_xmit; same here */ if (s->mac_reg[RDH] == rdh_start) { DBGOUT(RXERR, "RDH wraparound @%x, RDT %x, RDLEN %x\n", @@ -925,7 +923,6 @@ mac_writereg(E1000State *s, int index, uint32_t val) static void set_rdt(E1000State *s, int index, uint32_t val) { -s->check_rxov = 0; s->mac_reg[index] = val & 0xffff; if (e1000_has_rxbufs(s, 1)) { qemu_flush_queued_packets(&s->nic->nc); -- 1.7.11.4
Re: [Qemu-devel] [PATCH V3] e1000: drop check_rxov, always treat RX ring with RDH == RDT as empty
Thanks Stefan, Dropping check_rxov was my very first idea for solving the problem; however, for some reason I was sure that it was required to emulate real HW behavior. I'm glad we clarified this. Regards, Dmitry Fleytman On Fri, Oct 19, 2012 at 9:52 AM, Stefan Hajnoczi wrote: > On Fri, Oct 19, 2012 at 07:56:55AM +0200, Dmitry Fleytman wrote: >> Real HW always treats RX ring with RDH == RDT as empty. >> Emulation is supposed to behave the same. >> >> Reported-by: Chris Webb >> Reported-by: Richard Davies >> Signed-off-by: Dmitry Fleytman >> --- >> hw/e1000.c | 7 ++- >> 1 file changed, 2 insertions(+), 5 deletions(-) > > Applied to the net tree: > http://github.com/stefanha/qemu/commits/net > > Thanks for your efforts in squashing this bug! Glad it was possible to > drop check_rxov. > > Stefan -- Dmitry Fleytman Technology Expert and Consultant, Daynix Computing Ltd. Cell: +972-54-2819481 Skype: dmitry.fleytman
[Qemu-devel] [PATCH V8 1/5] Adding utility function net_checksum_add_cont() that allows checksum calculation of scattered data with odd chunk sizes
Adding utility function net_raw_checksum() that calculates checksum of buffer given Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- net/checksum.c | 13 +++-- net/checksum.h | 14 +- 2 files changed, 20 insertions(+), 7 deletions(-) diff --git a/net/checksum.c b/net/checksum.c index 9919b2e..4fa5563 100644 --- a/net/checksum.c +++ b/net/checksum.c @@ -20,16 +20,17 @@ #define PROTO_TCP 6 #define PROTO_UDP 17 -uint32_t net_checksum_add(int len, uint8_t *buf) +uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq) { uint32_t sum = 0; int i; -for (i = 0; i < len; i++) { - if (i & 1) - sum += (uint32_t)buf[i]; - else - sum += (uint32_t)buf[i] << 8; +for (i = seq; i < seq + len; i++) { +if (i & 1) { +sum += (uint32_t)buf[i - seq]; +} else { +sum += (uint32_t)buf[i - seq] << 8; +} } return sum; } diff --git a/net/checksum.h b/net/checksum.h index 1f05298..171924c 100644 --- a/net/checksum.h +++ b/net/checksum.h @@ -20,10 +20,22 @@ #include -uint32_t net_checksum_add(int len, uint8_t *buf); +uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq); uint16_t net_checksum_finish(uint32_t sum); uint16_t net_checksum_tcpudp(uint16_t length, uint16_t proto, uint8_t *addrs, uint8_t *buf); void net_checksum_calculate(uint8_t *data, int length); +static inline uint32_t +net_checksum_add(int len, uint8_t *buf) +{ +return net_checksum_add_cont(len, buf, 0); +} + +static inline uint16_t +net_raw_checksum(uint8_t *data, int length) +{ + return net_checksum_finish(net_checksum_add(length, data)); +} + #endif /* QEMU_NET_CHECKSUM_H */ -- 1.7.11.7
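To see why the seq argument matters, consider a buffer that is checksummed in two pieces where the first piece has an odd length: the second piece then starts on an odd byte offset, and its bytes must be weighted accordingly. The sketch below is an illustration, not the patch itself. checksum_add_cont() mirrors the loop in the patch above; checksum_finish() is an assumed stand-in for net_checksum_finish() using the conventional carry fold and one's complement; checksum_split() is a hypothetical helper.

```c
#include <stdint.h>

/* Mirrors the loop of net_checksum_add_cont() in the patch:
 * `seq` is the byte offset of this chunk within the whole data,
 * so even/odd weighting follows the global position. */
static uint32_t checksum_add_cont(int len, const uint8_t *buf, int seq)
{
    uint32_t sum = 0;
    for (int i = seq; i < seq + len; i++) {
        if (i & 1) {
            sum += (uint32_t)buf[i - seq];      /* low byte of a 16-bit word */
        } else {
            sum += (uint32_t)buf[i - seq] << 8; /* high byte of a 16-bit word */
        }
    }
    return sum;
}

/* Assumed stand-in for net_checksum_finish(): fold the carries
 * into 16 bits and return the one's complement. */
static uint16_t checksum_finish(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t)~sum;
}

/* Hypothetical helper: checksum `data` as two chunks split at
 * byte `split`, passing the running offset as `seq`. */
static uint16_t checksum_split(const uint8_t *data, int len, int split)
{
    uint32_t sum = checksum_add_cont(split, data, 0);
    sum += checksum_add_cont(len - split, data + split, split);
    return checksum_finish(sum);
}
```

Splitting a buffer at an odd offset and at an even offset gives the same result as checksumming it in one pass, provided each chunk is given its starting offset as seq.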
[Qemu-devel] [PATCH V8 0/5] VMXNET3 paravirtual NIC device implementation
uest to ensure [DF] * it gets consistent memory state [DF] */ ... > + */ > +smp_wmb(); > +} Don't use wrappers like this. They just hide bugs. For example it's not helpful before an interrupt in the function below. [DF] I guess you are talking about vmxnet3_complete_packet() [DF] Strictly speaking barrier is a must because we change shared memory in [DF] vmxnet3_complete_packet() [DF] And the wrapper is a good thing because its name explains its effect [DF] in a formal way as opposed to comments ... > +switch (status) { > +case VMXNET3_PKT_STATUS_OK: { don't put {} around cases: they align incorrectly if it's too big move to a function. [DF] Fixed ... > +static bool > +vmxnet3_send_packet(VMXNET3State *s, uint32_t qidx) > +{ > +size_t bytes_sent = 0; > +bool res = true; why = true? don't initialize just because. [DF] Fixed ... > +/* > + * VMWARE headers we got from Linux kernel do not fully comply QEMU coding > + * standards in sense of types and defines used. > + * Since we didn't want to change VMWARE code, following set of typedefs > + * and defines needed to compile these headers with QEMU introduced. > + */ No need for this now. You can export headers and put them under linux-headers. [DF] Not sure it is possible because the header as-is is not stand-alone and won't compile [DF] without changes. We extracted definitions we use from their header and dropped unused [DF] and kernel-specific stuff. [DF] Please, advise. ... > +if (VMXNET3_OM_TSO == s->offload_mode) { Don't do Yoda style like this [DF] "Yoda" style removed everywhere Changes in V6: Fixed most of the problems pointed out by Michael S. Tsirkin The only issue still open is the creation of a shared place for generic network structures and functions. Currently all generic network code introduced by VMXNET3 resides in vmxnet_utils.c/h files.
It could be moved to some shared location; however, we believe it is a matter for a separate refactoring, as there are a lot of copy-pasted definitions in almost every device and code cleanup efforts are required in order to create a truly shared codebase. Reported-by: Michael S. Tsirkin Implemented suggestions by Anthony Liguori Reported-by: Anthony Liguori Fixed incorrect checksum calculation for some packets in SW offloads mode Reported-by: Gerhard Wiesinger Changes in V5: MSI-X save/load implemented in the device instead of pci bus as suggested by Michael S. Tsirkin Reported-by: Michael S. Tsirkin Patches regrouped as suggested by Paolo Bonzini Reported-by: Paolo Bonzini Changes in V4: Fixed a few problems uncovered by NETIO test suite Assertion on failure to initialize MSI/MSI-X replaced with warning message and fallback to Legacy/MSI respectively Reported-by: Gerhard Wiesinger Various coding style adjustments and patch split-up as suggested by Anthony Liguori Reported-by: Anthony Liguori Live migration support added Changes in V3: Fixed crash when net device that is used as network frontend has no virtio HDR support. Task offloads emulation for cases when net device that is used as network frontend has no virtio HDR support.
    Reported-by: Gerhard Wiesinger

Changes in V2:
    License text changed according to community suggestions
    Standard license header from GPLv2+ licensed QEMU files used

Dmitry Fleytman (5):
  Adding utility function net_checksum_add_cont() that allows checksum
    calculation of scattered data with odd chunk sizes
  Adding utility function net_checksum_add_iov() for iovec checksum
    calculation
  Adding common definitions for VMWARE devices
  Adding packet abstraction for VMWARE network devices
  Adding VMXNET3 device implementation

 default-configs/pci.mak |    1 +
 hw/Makefile.objs        |    1 +
 hw/pci.h                |    1 +
 hw/vmware_utils.h       |  143 +++
 hw/vmxnet3.c            | 2437 +++
 hw/vmxnet3.h            |  762 +++
 hw/vmxnet_debug.h       |  121 +++
 hw/vmxnet_pkt.c         |  758 +++
 hw/vmxnet_pkt.h         |  309 ++
 hw/vmxnet_utils.c       |  219 +
 hw/vmxnet_utils.h       |  340 +++
 iov.h                   |    5 +
 net/checksum.c          |   41 +-
 net/checksum.h          |   22 +-
 14 files changed, 5153 insertions(+), 7 deletions(-)
 create mode 100644 hw/vmware_utils.h
 create mode 100644 hw/vmxnet3.c
 create mode 100644 hw/vmxnet3.h
 create mode 100644 hw/vmxnet_debug.h
 create mode 100644 hw/vmxnet_pkt.c
 create mode 100644 hw/vmxnet_pkt.h
 create mode 100644 hw/vmxnet_utils.c
 create mode 100644 hw/vmxnet_utils.h

-- 
1.7.11.7
[Qemu-devel] [PATCH V8 4/5] Adding packet abstraction for VMWARE network devices
Signed-off-by: Dmitry Fleytman --- hw/vmxnet_pkt.c | 758 hw/vmxnet_pkt.h | 309 +++ 2 files changed, 1067 insertions(+) create mode 100644 hw/vmxnet_pkt.c create mode 100644 hw/vmxnet_pkt.h diff --git a/hw/vmxnet_pkt.c b/hw/vmxnet_pkt.c new file mode 100644 index 000..9b501e3 --- /dev/null +++ b/hw/vmxnet_pkt.c @@ -0,0 +1,758 @@ +/* + * QEMU VMWARE VMXNET* paravirtual NICs - packets abstractions + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Tamir Shomer + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include "vmxnet_pkt.h" +#include "vmxnet_utils.h" +#include "iov.h" + +#include "net/checksum.h" + +/*= + *= + * + *TX CODE + * + *= + *===*/ + +enum { +VMXNET_TX_PKT_VHDR_FRAG = 0, +VMXNET_TX_PKT_L2HDR_FRAG, +VMXNET_TX_PKT_L3HDR_FRAG, +VMXNET_TX_PKT_PL_START_FRAG +}; + +/* TX packet private context */ +struct VmxnetTxPkt{ +struct virtio_net_hdr virt_hdr; +bool has_virt_hdr; + +struct iovec *raw; +uint32_t raw_frags; +uint32_t max_raw_frags; + +struct iovec *vec; + +uint8_t l2_hdr[ETH_MAX_L2_HDR_LEN]; + +uint32_t payload_len; + +uint32_t payload_frags; +uint32_t max_payload_frags; + +uint16_t hdr_len; +eth_pkt_types_e packet_type; +uint8_t l4proto; +}; + +void vmxnet_tx_pkt_init(struct VmxnetTxPkt **pkt, uint32_t max_frags, +bool has_virt_hdr) +{ +struct VmxnetTxPkt *p = g_malloc0(sizeof *p); + +p->vec = g_malloc((sizeof *p->vec) * +(max_frags + VMXNET_TX_PKT_PL_START_FRAG)); + +p->raw = g_malloc((sizeof *p->raw) * max_frags); + +p->max_payload_frags = max_frags; +p->max_raw_frags = max_frags; +p->has_virt_hdr = has_virt_hdr; +p->vec[VMXNET_TX_PKT_VHDR_FRAG].iov_base = &p->virt_hdr; +p->vec[VMXNET_TX_PKT_VHDR_FRAG].iov_len = +p->has_virt_hdr ? 
sizeof p->virt_hdr : 0; +p->vec[VMXNET_TX_PKT_L2HDR_FRAG].iov_base = &p->l2_hdr; +p->vec[VMXNET_TX_PKT_L3HDR_FRAG].iov_base = NULL; +p->vec[VMXNET_TX_PKT_L3HDR_FRAG].iov_len = 0; + +*pkt = p; +} + +void vmxnet_tx_pkt_uninit(struct VmxnetTxPkt *pkt) +{ +if (pkt) { +if (pkt->vec) { +g_free(pkt->vec); +} + +if (pkt->raw) { +g_free(pkt->raw); +} + +g_free(pkt); +} +} + +void vmxnet_tx_pkt_update_ip_checksums(struct VmxnetTxPkt *pkt) +{ +uint16_t csum; +uint32_t ph_raw_csum; +assert(pkt); +uint8_t gso_type = pkt->virt_hdr.gso_type & ~VIRTIO_NET_HDR_GSO_ECN; +struct ip_header *ip_hdr; + +if (VIRTIO_NET_HDR_GSO_TCPV4 != gso_type && +VIRTIO_NET_HDR_GSO_UDP != gso_type) { +return; +} + +ip_hdr = pkt->vec[VMXNET_TX_PKT_L3HDR_FRAG].iov_base; + +if (pkt->payload_len + pkt->vec[VMXNET_TX_PKT_L3HDR_FRAG].iov_len > +ETH_MAX_IP_DGRAM_LEN) { +return; +} + +ip_hdr->ip_len = cpu_to_be16(pkt->payload_len + +pkt->vec[VMXNET_TX_PKT_L3HDR_FRAG].iov_len); + +/* Calculate IP header checksum*/ +ip_hdr->ip_sum = 0; +csum = net_raw_checksum((uint8_t *)ip_hdr, +pkt->vec[VMXNET_TX_PKT_L3HDR_FRAG].iov_len); +ip_hdr->ip_sum = cpu_to_be16(csum); + +/* Calculate IP pseudo header checksum */ +ph_raw_csum = eth_calc_pseudo_hdr_csum(ip_hdr, pkt->payload_len); +csum = cpu_to_be16(net_checksum_finish(ph_raw_csum)); +iov_from_buf(&pkt->vec[VMXNET_TX_PKT_PL_START_FRAG], pkt->payload_frags, + pkt->virt_hdr.csum_offset, &csum, sizeof(csum)); +} + +static void vmxnet_tx_pkt_calculate_hdr_len(struct VmxnetTxPkt *pkt) +{ +pkt->hdr_len = pkt->vec[VMXNET_TX_PKT_L2HDR_FRAG].iov_len + +pkt->vec[VMXNET_TX_PKT_L3HDR_FRAG].iov_len; +} + +static bool vmxnet_tx_pkt_parse_headers(struct VmxnetTxPkt *pkt) +{ +struct iovec *l2_hdr, *l3_hdr; +size_t bytes_read; +size_t full_ip6hdr_len; +uint16_t l3_proto; + +assert(pkt); + +l2_hdr = &pkt->vec[VMXNET_TX_PKT_L2HDR_FRAG]; +l3_hdr = &pkt->vec[VMXNET_TX_PKT_L3HDR_FRAG]; + +bytes_read = iov_to_buf(pkt->raw, pkt->raw_frags, 0, l2_hdr->iov_base, +ETH_MAX_L2_HDR_LE
[Qemu-devel] [PATCH V8 2/5] Adding utility function net_checksum_add_iov() for iovec checksum calculation
Signed-off-by: Dmitry Fleytman
Signed-off-by: Yan Vugenfirer
---
 iov.h          |  5 +++++
 net/checksum.c | 28 ++++++++++++++++++++++++++++
 net/checksum.h |  8 ++++++++
 3 files changed, 41 insertions(+)

diff --git a/iov.h b/iov.h
index 34c8ec9..c184a80 100644
--- a/iov.h
+++ b/iov.h
@@ -11,6 +11,9 @@
  * the COPYING file in the top-level directory.
  */
 
+#ifndef QEMU_IOV_H
+#define QEMU_IOV_H
+
 #include "qemu-common.h"
 
 /**
@@ -95,3 +98,5 @@ void iov_hexdump(const struct iovec *iov, const unsigned int iov_cnt,
 unsigned iov_copy(struct iovec *dst_iov, unsigned int dst_iov_cnt,
                  const struct iovec *iov, unsigned int iov_cnt,
                  size_t offset, size_t bytes);
+
+#endif /* QEMU_IOV_H */
diff --git a/net/checksum.c b/net/checksum.c
index 4fa5563..9c813ff 100644
--- a/net/checksum.c
+++ b/net/checksum.c
@@ -84,3 +84,31 @@ void net_checksum_calculate(uint8_t *data, int length)
     data[14+hlen+csum_offset] = csum >> 8;
     data[14+hlen+csum_offset+1] = csum & 0xff;
 }
+
+uint32_t
+net_checksum_add_iov(const struct iovec *iov, const unsigned int iov_cnt,
+                     uint32_t iov_off, uint32_t size)
+{
+    size_t iovec_off, buf_off;
+    unsigned int i;
+    uint32_t res = 0;
+    uint32_t seq = 0;
+
+    iovec_off = 0;
+    buf_off = 0;
+    for (i = 0; i < iov_cnt && size; i++) {
+        if (iov_off < (iovec_off + iov[i].iov_len)) {
+            size_t len = MIN((iovec_off + iov[i].iov_len) - iov_off , size);
+            void *chunk_buf = iov[i].iov_base + (iov_off - iovec_off);
+
+            res += net_checksum_add_cont(len, chunk_buf, seq);
+            seq += len;
+
+            buf_off += len;
+            iov_off += len;
+            size -= len;
+        }
+        iovec_off += iov[i].iov_len;
+    }
+    return res;
+}
diff --git a/net/checksum.h b/net/checksum.h
index 171924c..e63c482 100644
--- a/net/checksum.h
+++ b/net/checksum.h
@@ -19,6 +19,7 @@
 #define QEMU_NET_CHECKSUM_H
 
 #include <stdint.h>
+#include "iov.h"
 
 uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq);
 uint16_t net_checksum_finish(uint32_t sum);
@@ -38,4 +39,11 @@ net_raw_checksum(uint8_t *data, int length)
     return net_checksum_finish(net_checksum_add(length, data));
 }
 
+/**
+ * Checksum calculation for scatter-gather vector
+ */
+uint32_t net_checksum_add_iov(const struct iovec *iov,
+                              const unsigned int iov_cnt,
+                              uint32_t iov_off, uint32_t size);
+
 #endif /* QEMU_NET_CHECKSUM_H */
-- 
1.7.11.7
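Editorial note: the scatter-gather walk in net_checksum_add_iov() above can be tried outside QEMU with a small stand-alone sketch. This is not the patch code: the names sum_cont() and checksum_add_iov() are made up for illustration, size_t is used instead of uint32_t for offsets, and the chunk pointer is cast to uint8_t * so the arithmetic does not rely on the GNU void-pointer extension. It only aims to show how a byte range that starts in the middle of one iovec element and ends in the middle of another is summed with the correct byte parity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>   /* struct iovec */

/* Hypothetical helper: 16-bit word sum of one chunk. 'seq' is the number of
 * bytes already summed, so even/odd byte position carries across chunks. */
static uint32_t sum_cont(size_t len, const uint8_t *buf, size_t seq)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        if ((seq + i) & 1) {
            sum += (uint32_t)buf[i];        /* low byte of the 16-bit word */
        } else {
            sum += (uint32_t)buf[i] << 8;   /* high byte of the 16-bit word */
        }
    }
    return sum;
}

/* Hypothetical sketch: sum 'size' bytes of the vector starting at byte
 * offset 'iov_off', crossing element boundaries as needed. */
static uint32_t checksum_add_iov(const struct iovec *iov, unsigned int iov_cnt,
                                 size_t iov_off, size_t size)
{
    size_t iovec_off = 0;   /* offset at which iov[i] starts in the vector */
    size_t seq = 0;         /* bytes summed so far */
    uint32_t res = 0;
    unsigned int i;

    for (i = 0; i < iov_cnt && size; i++) {
        size_t end = iovec_off + iov[i].iov_len;

        if (iov_off < end) {
            size_t avail = end - iov_off;
            size_t len = avail < size ? avail : size;
            const uint8_t *chunk =
                (const uint8_t *)iov[i].iov_base + (iov_off - iovec_off);

            res += sum_cont(len, chunk, seq);
            seq += len;
            iov_off += len;
            size -= len;
        }
        iovec_off = end;
    }
    return res;
}
```

The result is the unfolded 32-bit accumulator; a caller would still fold it to 16 bits and complement it, which is what net_checksum_finish() does in QEMU.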
[Qemu-devel] [PATCH V8 3/5] Adding common definitions for VMWARE devices
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/vmware_utils.h | 143 +++ hw/vmxnet_debug.h | 121 +++ hw/vmxnet_utils.c | 219 +++ hw/vmxnet_utils.h | 340 ++ 4 files changed, 823 insertions(+) create mode 100644 hw/vmware_utils.h create mode 100644 hw/vmxnet_debug.h create mode 100644 hw/vmxnet_utils.c create mode 100644 hw/vmxnet_utils.h diff --git a/hw/vmware_utils.h b/hw/vmware_utils.h new file mode 100644 index 000..7f449eb --- /dev/null +++ b/hw/vmware_utils.h @@ -0,0 +1,143 @@ +/* + * QEMU VMWARE paravirtual devices - auxiliary code + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef VMWARE_UTILS_H +#define VMWARE_UTILS_H + +#include "range.h" + +#ifndef VMW_SHPRN +#define VMW_SHPRN(fmt, ...) 
do {} while (0) +#endif + +/* + * Shared memory access functions with byte swap support + * Each function contains printout for reverse-engineering needs + * + */ +static inline void +vmw_shmem_read(hwaddr addr, void *buf, int len) +{ +VMW_SHPRN("SHMEM r: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_read(addr, buf, len); +} + +static inline void +vmw_shmem_write(hwaddr addr, void *buf, int len) +{ +VMW_SHPRN("SHMEM w: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_write(addr, buf, len); +} + +static inline void +vmw_shmem_rw(hwaddr addr, void *buf, int len, int is_write) +{ +VMW_SHPRN("SHMEM r/w: %" PRIx64 ", len: %d (to %p), is write: %d", + addr, len, buf, is_write); + +cpu_physical_memory_rw(addr, buf, len, is_write); +} + +static inline void +vmw_shmem_set(hwaddr addr, uint8 val, int len) +{ +int i; +VMW_SHPRN("SHMEM set: %" PRIx64 ", len: %d (value 0x%X)", addr, len, val); + +for (i = 0; i < len; i++) { +cpu_physical_memory_write(addr + i, &val, 1); +} +} + +static inline uint32_t +vmw_shmem_ld8(hwaddr addr) +{ +uint8_t res = ldub_phys(addr); +VMW_SHPRN("SHMEM load8: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st8(hwaddr addr, uint8_t value) +{ +VMW_SHPRN("SHMEM store8: %" PRIx64 " (value 0x%X)", addr, value); +stb_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld16(hwaddr addr) +{ +uint16_t res = lduw_le_phys(addr); +VMW_SHPRN("SHMEM load16: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st16(hwaddr addr, uint16_t value) +{ +VMW_SHPRN("SHMEM store16: %" PRIx64 " (value 0x%X)", addr, value); +stw_le_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld32(hwaddr addr) +{ +uint32_t res = ldl_le_phys(addr); +VMW_SHPRN("SHMEM load32: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st32(hwaddr addr, uint32_t value) +{ +VMW_SHPRN("SHMEM store32: %" PRIx64 " (value 0x%X)", addr, 
value); +stl_le_phys(addr, value); +} + +static inline uint64_t +vmw_shmem_ld64(hwaddr addr) +{ +uint64_t res = ldq_le_phys(addr); +VMW_SHPRN("SHMEM load64: %" PRIx64 " (value %" PRIx64 ")", addr, res); +return res; +} + +static inline void +vmw_shmem_st64(hwaddr addr, uint64_t value) +{ +VMW_SHPRN("SHMEM store64: %" PRIx64 " (value %" PRIx64 ")", addr, value); +stq_le_phys(addr, value); +} + +/* Macros for simplification of operations on array-style registers */ + +/* + * Whether <addr> lies inside of array-style register defined by <base>, + * number of elements (<cnt>) and element size (<regsize>) + * +*/ +#define VMW_IS_MULTIREG_ADDR(addr, base, cnt, regsize) \ +range_covers_byte(base, cnt * regsize, addr) + +/* + * Returns index of given register (<addr>) in array-style register defined by + * <base> and element size (<regsize>) + * +*/ +#define VMW_MULTIREG_IDX_BY_ADDR(addr, base, regsize) \ +(((addr) - (base)) / (regsize)) + +#endif diff --git a/hw/vmxnet_debug.h b/hw/vmxnet_debug.h new file mode 100644 index 000..faa1431 --- /dev/null +++ b/hw/vmxnet_debug.h @@ -0,0 +1,121 @@ +/* + * QEMU VMWARE VMXNET* paravirtual NICs - debugging facilities + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Tamir Shomer + * Yan Vugenfirer + * + * This work is license
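Editorial note: the two array-style register macros in hw/vmware_utils.h above (VMW_IS_MULTIREG_ADDR and VMW_MULTIREG_IDX_BY_ADDR) amount to a range test followed by an integer division. The sketch below restates them as plain functions so the arithmetic can be checked in isolation. The function names and range_covers() are hypothetical stand-ins (the patch uses QEMU's range_covers_byte()), and the register base and sizes used in any call are made-up numbers, not values from the VMXNET3 register map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's range_covers_byte(offset, len, byte):
 * true when 'byte' lies in [offset, offset + len). */
static bool range_covers(uint64_t offset, uint64_t len, uint64_t byte)
{
    return offset <= byte && byte < offset + len;
}

/* Does 'addr' fall inside an array of 'cnt' registers, each 'regsize'
 * bytes wide, that starts at 'base'? */
static bool is_multireg_addr(uint64_t addr, uint64_t base,
                             uint64_t cnt, uint64_t regsize)
{
    return range_covers(base, cnt * regsize, addr);
}

/* Index of the array element that 'addr' belongs to. Only meaningful
 * after is_multireg_addr() has returned true for the same arguments. */
static uint64_t multireg_idx_by_addr(uint64_t addr, uint64_t base,
                                     uint64_t regsize)
{
    assert(regsize != 0 && addr >= base);
    return (addr - base) / regsize;
}
```

A device's MMIO write handler would typically call the range test first and, on a hit, use the index to select the queue or slot the guest is addressing.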
Re: [Qemu-devel] [PATCH V8 0/5] VMXNET3 paravirtual NIC device implementation
Hi Everyone

Is there any progress with these patches?
Is there a chance they will be committed any time soon?

Thanks,
Dmitry

On Fri, Dec 7, 2012 at 1:15 PM, Dmitry Fleytman wrote:

> This set of patches implements VMWare VMXNET3 paravirtual NIC device.
> The device supports all of the device features including offload
> capabilities, VLANs, etc.
> The device is tested on different OSes:
>     Fedora 15
>     Ubuntu 10.4
>     Centos 6.2
>     Windows 2008R2
>     Windows 2008 64bit
>     Windows 2008 32bit
>     Windows 2003 64bit
>     Windows 2003 32bit
>
> Changes in V8:
>     Reported-by: Stefan Hajnoczi
>     Issues reported by Stefan Hajnoczi reviewed and mostly fixed:
>
> > +}
> > +curr_src_off += src[i].iov_len;
> > +}
> > +return j;
> > +}
>
> The existing iov_copy() function provides equivalent functionality. I
> don't think iov_rebuild() is needed.
>
> [DF] Done. Thanks, missed it.
>
> > +size -= len;
> > +}
> > +iovec_off += iov[i].iov_len;
> > +}
> > +return res;
> > +}
>
> Rename this net_checksum_add_iov() and place it in net/checksum.c,
> then the new dependency on net from block can be dropped.
>
> [DF] Done.
>
> > +vmw_shmem_read(hwaddr addr, void *buf, int len)
> > {
> > VMW_SHPRN("SHMEM r: %" PRIx64 ", len: %d to %p", addr, len, buf);
> > cpu_physical_memory_read(addr, buf, len);
> > }
>
> All changes to this file should be squashed with the previous patch.
>
> [DF] Done
>
> > +#ifdef VMXNET_DEBUG_SHMEM_ACCESS
> > +#define VMW_SHPRN(fmt, ...)                                              \
> > +do {                                                                     \
> > +printf("[%s][SH][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__,         \
> > +## __VA_ARGS__);                                                        \
> > +} while (0)
> > +#else
> > +#define VMW_SHPRN(fmt, ...) do {} while (0)
> > +#endif
>
> Please use QEMU tracing. It eliminates all this boilerplate and
> conditional compilation. Tracing can be enabled/disabled at runtime
> and works with SystemTap/DTrace. See docs/tracing.txt.
>
> [DF] We'd like to stick with compile-time logic in this case because of 2 reasons:
> [DF] 1. These printouts are intended for reverse engineering/development only and there is
> [DF]    no need to enable them at run time
> [DF] 2. There is a big number of printouts, all driver-device communication is traced,
> [DF]    they hit performance even on the strongest x86 in case of run-time logic
>
> > +struct eth_header {
> > +uint8_t h_dest[ETH_ALEN]; /* destination eth addr */
> > +uint8_t h_source[ETH_ALEN]; /* source ether addr*/
> > +uint16_t h_proto;/* packet type ID field */
> > +};
>
> Looks like it's copy-pasted stuff from /usr/include/linux/if_*.h,
> /usr/include/netinet/*.h, and friends. If the system-wide headers are
> included names will collide for some of the macros at least.
>
> Did you check if the slirp/ definitions can be reused?
>
> [DF] Yes, you are right. This is copy-pasted from different places.
> [DF] Slirp's definitions do not fully cover our needs.
>
> I'd rather we import network header definitions once in a generic
> place into the source tree. That way vmxnet and other components
> don't need to redefine these structs.
>
> [DF] Exactly! Our intention is to create a generic header with network definitions and make everyone use it.
> [DF] We can move our header to some shared place if you want, however I'd do it in parallel with cleanup
> [DF] of similar definitions in existing code and this is a big change that is out of scope of these patches.
>
> > +
> > *===*/
>
> Is this huge comment box a sign that the code should be split into a
> foo_tx.c and a foo_rx.c file?
>
> [DF] As for me this file is not big enough to be split (<800 lines), however I'll do this if you insist :)
>
> > +size_t vmxnet_tx_pkt_send(VmxnetTxPktH pkt, NetClientState *vc)
>
> 'vc' is an old name that was used for VLANClientState. The struct has
> since been renamed to NetClientState and the rest of QEMU uses 'nc'
> instead of 'vc'.
>
> [DF] Fixed. Thanks.
>
> > +/* tx module context handle */
> > +typedef void *VmxnetTxPktH;
>
> Forward-declaring the struct is nicer:
>
> typedef struct VmxnetTxPkt VmxnetTxPkt;
>
> The definition of VmxnetTxPkt is still hidden from the caller but yo
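Editorial note: the last review comment quoted above prefers a forward-declared struct over a `void *` handle. A minimal sketch of that pattern follows, assuming nothing about the real VmxnetTxPkt layout; TxPkt, tx_pkt_new(), tx_pkt_max_frags() and tx_pkt_free() are hypothetical names chosen for illustration. The point is that callers only ever see an incomplete type, so the fields stay private while the compiler still rejects a pointer of the wrong type, which a `void *` handle would silently accept.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* What a header would expose: an incomplete type plus accessor functions. */
typedef struct TxPkt TxPkt;

TxPkt *tx_pkt_new(uint32_t max_frags);
uint32_t tx_pkt_max_frags(const TxPkt *pkt);
void tx_pkt_free(TxPkt *pkt);

/* What the .c file would keep private: the actual layout. */
struct TxPkt {
    uint32_t max_frags;
};

TxPkt *tx_pkt_new(uint32_t max_frags)
{
    TxPkt *pkt = calloc(1, sizeof *pkt);

    if (pkt) {
        pkt->max_frags = max_frags;
    }
    return pkt;
}

uint32_t tx_pkt_max_frags(const TxPkt *pkt)
{
    assert(pkt);
    return pkt->max_frags;
}

void tx_pkt_free(TxPkt *pkt)
{
    free(pkt);   /* free(NULL) is a no-op, so no NULL check is needed */
}
```

In a real split, the typedef and prototypes would live in the header and the struct definition in the source file; they are shown in one block here only to keep the sketch self-contained.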
Re: [Qemu-devel] [PATCH V10 0/5] VMXNET3 paravirtual NIC device implementation
Stefan,

Is there anything else that needs to be done in order to get these patches committed?

Thanks,
Dmitry.

On Wed, Jan 23, 2013 at 7:30 AM, Dmitry Fleytman wrote:

> This set of patches implements VMWare VMXNET3 paravirtual NIC device.
> The device supports all of the device features including offload
> capabilities, VLANs, etc.
> The device is tested on different OSes:
>     Fedora 15
>     Ubuntu 10.4
>     Centos 6.2
>     Windows 2008R2
>     Windows 2008 64bit
>     Windows 2008 32bit
>     Windows 2003 64bit
>     Windows 2003 32bit
>
> Changes in V10:
>     Reported-by: Stefan Hajnoczi
>     Issues reported by Stefan Hajnoczi fixed.
>
> Changes in V9:
>     Reported-by: Stefan Hajnoczi
>     Issues reported by Stefan Hajnoczi fixed.
>
> Changes in V8:
>     Reported-by: Stefan Hajnoczi
>     Issues reported by Stefan Hajnoczi reviewed and mostly fixed:
>
> > +}
> > +curr_src_off += src[i].iov_len;
> > +}
> > +return j;
> > +}
>
> The existing iov_copy() function provides equivalent functionality. I
> don't think iov_rebuild() is needed.
>
> [DF] Done. Thanks, missed it.
>
> > +size -= len;
> > +}
> > +iovec_off += iov[i].iov_len;
> > +}
> > +return res;
> > +}
>
> Rename this net_checksum_add_iov() and place it in net/checksum.c,
> then the new dependency on net from block can be dropped.
>
> [DF] Done.
>
> > +vmw_shmem_read(hwaddr addr, void *buf, int len)
> > {
> > VMW_SHPRN("SHMEM r: %" PRIx64 ", len: %d to %p", addr, len, buf);
> > cpu_physical_memory_read(addr, buf, len);
> > }
>
> All changes to this file should be squashed with the previous patch.
>
> [DF] Done
>
> > +#ifdef VMXNET_DEBUG_SHMEM_ACCESS
> > +#define VMW_SHPRN(fmt, ...)                                              \
> > +do {                                                                     \
> > +printf("[%s][SH][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__,         \
> > +## __VA_ARGS__);                                                        \
> > +} while (0)
> > +#else
> > +#define VMW_SHPRN(fmt, ...) do {} while (0)
> > +#endif
>
> Please use QEMU tracing. It eliminates all this boilerplate and
> conditional compilation. Tracing can be enabled/disabled at runtime
> and works with SystemTap/DTrace. See docs/tracing.txt.
>
> [DF] We'd like to stick with compile-time logic in this case because of 2 reasons:
> [DF] 1. These printouts are intended for reverse engineering/development only and there is
> [DF]    no need to enable them at run time
> [DF] 2. There is a big number of printouts, all driver-device communication is traced,
> [DF]    they hit performance even on the strongest x86 in case of run-time logic
>
> > +struct eth_header {
> > +uint8_t h_dest[ETH_ALEN]; /* destination eth addr */
> > +uint8_t h_source[ETH_ALEN]; /* source ether addr*/
> > +uint16_t h_proto;/* packet type ID field */
> > +};
>
> Looks like it's copy-pasted stuff from /usr/include/linux/if_*.h,
> /usr/include/netinet/*.h, and friends. If the system-wide headers are
> included names will collide for some of the macros at least.
>
> Did you check if the slirp/ definitions can be reused?
>
> [DF] Yes, you are right. This is copy-pasted from different places.
> [DF] Slirp's definitions do not fully cover our needs.
>
> I'd rather we import network header definitions once in a generic
> place into the source tree. That way vmxnet and other components
> don't need to redefine these structs.
>
> [DF] Exactly! Our intention is to create a generic header with network definitions and make everyone use it.
> [DF] We can move our header to some shared place if you want, however I'd do it in parallel with cleanup
> [DF] of similar definitions in existing code and this is a big change that is out of scope of these patches.
>
> > +
> > *===*/
>
> Is this huge comment box a sign that the code should be split into a
> foo_tx.c and a foo_rx.c file?
>
> [DF] As for me this file is not big enough to be split (<800 lines), however I'll do this if you insist :)
>
> > +size_t vmxnet_tx_pkt_send(VmxnetTxPktH pkt, NetClientState *vc)
>
> 'vc' is an old name that was used for VLANClientState. The struct has
> since been renamed to NetClientState and the rest of QEMU uses 'nc'
> instead of 'vc'.
>
> [DF] Fixed. Thanks.
>
> > +/* tx module context handle */
> > +typedef vo
[Qemu-devel] [PATCH v11 3/5] Common definitions for VMWARE devices
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/vmware_utils.h | 143 ++ hw/vmxnet_debug.h | 115 ++ include/net/eth.h | 347 ++ net/Makefile.objs | 1 + net/eth.c | 217 ++ 5 files changed, 823 insertions(+) create mode 100644 hw/vmware_utils.h create mode 100644 hw/vmxnet_debug.h create mode 100644 include/net/eth.h create mode 100644 net/eth.c diff --git a/hw/vmware_utils.h b/hw/vmware_utils.h new file mode 100644 index 000..5307e2c --- /dev/null +++ b/hw/vmware_utils.h @@ -0,0 +1,143 @@ +/* + * QEMU VMWARE paravirtual devices - auxiliary code + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef VMWARE_UTILS_H +#define VMWARE_UTILS_H + +#include "qemu/range.h" + +#ifndef VMW_SHPRN +#define VMW_SHPRN(fmt, ...) 
do {} while (0) +#endif + +/* + * Shared memory access functions with byte swap support + * Each function contains printout for reverse-engineering needs + * + */ +static inline void +vmw_shmem_read(hwaddr addr, void *buf, int len) +{ +VMW_SHPRN("SHMEM r: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_read(addr, buf, len); +} + +static inline void +vmw_shmem_write(hwaddr addr, void *buf, int len) +{ +VMW_SHPRN("SHMEM w: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_write(addr, buf, len); +} + +static inline void +vmw_shmem_rw(hwaddr addr, void *buf, int len, int is_write) +{ +VMW_SHPRN("SHMEM r/w: %" PRIx64 ", len: %d (to %p), is write: %d", + addr, len, buf, is_write); + +cpu_physical_memory_rw(addr, buf, len, is_write); +} + +static inline void +vmw_shmem_set(hwaddr addr, uint8 val, int len) +{ +int i; +VMW_SHPRN("SHMEM set: %" PRIx64 ", len: %d (value 0x%X)", addr, len, val); + +for (i = 0; i < len; i++) { +cpu_physical_memory_write(addr + i, &val, 1); +} +} + +static inline uint32_t +vmw_shmem_ld8(hwaddr addr) +{ +uint8_t res = ldub_phys(addr); +VMW_SHPRN("SHMEM load8: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st8(hwaddr addr, uint8_t value) +{ +VMW_SHPRN("SHMEM store8: %" PRIx64 " (value 0x%X)", addr, value); +stb_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld16(hwaddr addr) +{ +uint16_t res = lduw_le_phys(addr); +VMW_SHPRN("SHMEM load16: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st16(hwaddr addr, uint16_t value) +{ +VMW_SHPRN("SHMEM store16: %" PRIx64 " (value 0x%X)", addr, value); +stw_le_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld32(hwaddr addr) +{ +uint32_t res = ldl_le_phys(addr); +VMW_SHPRN("SHMEM load32: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st32(hwaddr addr, uint32_t value) +{ +VMW_SHPRN("SHMEM store32: %" PRIx64 " (value 0x%X)", addr, 
value); +stl_le_phys(addr, value); +} + +static inline uint64_t +vmw_shmem_ld64(hwaddr addr) +{ +uint64_t res = ldq_le_phys(addr); +VMW_SHPRN("SHMEM load64: %" PRIx64 " (value %" PRIx64 ")", addr, res); +return res; +} + +static inline void +vmw_shmem_st64(hwaddr addr, uint64_t value) +{ +VMW_SHPRN("SHMEM store64: %" PRIx64 " (value %" PRIx64 ")", addr, value); +stq_le_phys(addr, value); +} + +/* Macros for simplification of operations on array-style registers */ + +/* + * Whether <addr> lies inside of array-style register defined by <base>, + * number of elements (<cnt>) and element size (<regsize>) + * +*/ +#define VMW_IS_MULTIREG_ADDR(addr, base, cnt, regsize) \ +range_covers_byte(base, cnt * regsize, addr) + +/* + * Returns index of given register (<addr>) in array-style register defined by + * <base> and element size (<regsize>) + * +*/ +#define VMW_MULTIREG_IDX_BY_ADDR(addr, base, regsize) \ +(((addr) - (base)) / (regsize)) + +#endif diff --git a/hw/vmxnet_debug.h b/hw/vmxnet_debug.h new file mode 100644 index 000..96dae0f --- /dev/null +++ b/hw/vmxnet_debug.h @@ -0,0 +1,115 @@ +/* + * QEMU VMWARE VMXNET* paravirtual NICs - debugging facilities + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Tamir Shomer + * Yan Vugenfirer + * + * Th
[Qemu-devel] [PATCH v11 1/5] Checksum-related utility functions
net_checksum_add_cont() checksum calculation for scattered data with odd chunk sizes
net_raw_checksum() checksum calculation for a buffer

Signed-off-by: Dmitry Fleytman
Signed-off-by: Yan Vugenfirer
---
 include/net/checksum.h | 14 +++++++++++++-
 net/checksum.c         | 13 +++++++------
 2 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/include/net/checksum.h b/include/net/checksum.h
index 1f05298..3e7b93d 100644
--- a/include/net/checksum.h
+++ b/include/net/checksum.h
@@ -20,10 +20,22 @@
 
 #include <stdint.h>
 
-uint32_t net_checksum_add(int len, uint8_t *buf);
+uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq);
 uint16_t net_checksum_finish(uint32_t sum);
 uint16_t net_checksum_tcpudp(uint16_t length, uint16_t proto,
                              uint8_t *addrs, uint8_t *buf);
 void net_checksum_calculate(uint8_t *data, int length);
 
+static inline uint32_t
+net_checksum_add(int len, uint8_t *buf)
+{
+    return net_checksum_add_cont(len, buf, 0);
+}
+
+static inline uint16_t
+net_raw_checksum(uint8_t *data, int length)
+{
+    return net_checksum_finish(net_checksum_add(length, data));
+}
+
 #endif /* QEMU_NET_CHECKSUM_H */
diff --git a/net/checksum.c b/net/checksum.c
index 9919b2e..4fa5563 100644
--- a/net/checksum.c
+++ b/net/checksum.c
@@ -20,16 +20,17 @@
 #define PROTO_TCP 6
 #define PROTO_UDP 17
 
-uint32_t net_checksum_add(int len, uint8_t *buf)
+uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq)
 {
     uint32_t sum = 0;
     int i;
 
-    for (i = 0; i < len; i++) {
-        if (i & 1)
-            sum += (uint32_t)buf[i];
-        else
-            sum += (uint32_t)buf[i] << 8;
+    for (i = seq; i < seq + len; i++) {
+        if (i & 1) {
+            sum += (uint32_t)buf[i - seq];
+        } else {
+            sum += (uint32_t)buf[i - seq] << 8;
+        }
     }
     return sum;
 }
-- 
1.8.1.2
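Editorial note: the reason for the extra seq argument is easiest to see by checksumming one buffer twice, once in a single pass and once as two chunks split at an odd offset. The sketch below copies the loop from the patch into a stand-alone function; checksum_finish() is written from memory in the spirit of QEMU's net_checksum_finish() and should be treated as an assumption, and checksum_whole() and checksum_split() are hypothetical helpers that exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same loop as net_checksum_add_cont() in the patch: 'seq' is the position
 * of buf[0] in the logical stream, so a chunk that starts at an odd offset
 * still puts each byte into the correct half of its 16-bit word. */
static uint32_t checksum_add_cont(int len, const uint8_t *buf, int seq)
{
    uint32_t sum = 0;
    int i;

    for (i = seq; i < seq + len; i++) {
        if (i & 1) {
            sum += (uint32_t)buf[i - seq];
        } else {
            sum += (uint32_t)buf[i - seq] << 8;
        }
    }
    return sum;
}

/* Assumed folding step: add the carries back in and complement the result. */
static uint16_t checksum_finish(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    return (uint16_t)~sum;
}

/* Checksum the buffer in one pass. */
static uint16_t checksum_whole(const uint8_t *buf, int len)
{
    return checksum_finish(checksum_add_cont(len, buf, 0));
}

/* Checksum the same buffer as two chunks split at 'split' bytes. */
static uint16_t checksum_split(const uint8_t *buf, int len, int split)
{
    uint32_t sum;

    assert(split >= 0 && split <= len);
    sum = checksum_add_cont(split, buf, 0);
    sum += checksum_add_cont(len - split, buf + split, split);
    return checksum_finish(sum);
}
```

Passing 0 instead of split as the seq of the second chunk would swap the high and low bytes whenever split is odd, which is the failure mode the new parameter is meant to prevent.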
[Qemu-devel] [PATCH v11 4/5] Packet abstraction for VMWARE network devices
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/Makefile.objs | 1 + hw/vmxnet_rx_pkt.c | 187 ++ hw/vmxnet_rx_pkt.h | 174 hw/vmxnet_tx_pkt.c | 567 + hw/vmxnet_tx_pkt.h | 148 ++ 5 files changed, 1077 insertions(+) create mode 100644 hw/vmxnet_rx_pkt.c create mode 100644 hw/vmxnet_rx_pkt.h create mode 100644 hw/vmxnet_tx_pkt.c create mode 100644 hw/vmxnet_tx_pkt.h diff --git a/hw/Makefile.objs b/hw/Makefile.objs index a1f3a80..4128f45 100644 --- a/hw/Makefile.objs +++ b/hw/Makefile.objs @@ -118,6 +118,7 @@ common-obj-$(CONFIG_PCNET_PCI) += pcnet-pci.o common-obj-$(CONFIG_PCNET_COMMON) += pcnet.o common-obj-$(CONFIG_E1000_PCI) += e1000.o common-obj-$(CONFIG_RTL8139_PCI) += rtl8139.o +common-obj-$(CONFIG_VMXNET3_PCI) += vmxnet_tx_pkt.o vmxnet_rx_pkt.o common-obj-$(CONFIG_SMC91C111) += smc91c111.o common-obj-$(CONFIG_LAN9118) += lan9118.o diff --git a/hw/vmxnet_rx_pkt.c b/hw/vmxnet_rx_pkt.c new file mode 100644 index 000..a40e346 --- /dev/null +++ b/hw/vmxnet_rx_pkt.c @@ -0,0 +1,187 @@ +/* + * QEMU VMWARE VMXNET* paravirtual NICs - RX packets abstractions + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Tamir Shomer + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#include "vmxnet_rx_pkt.h" +#include "net/eth.h" +#include "qemu-common.h" +#include "qemu/iov.h" +#include "net/checksum.h" +#include "net/tap.h" + +/* + * RX packet may contain up to 2 fragments - rebuilt eth header + * in case of VLAN tag stripping + * and payload received from QEMU - in any case + */ +#define VMXNET_MAX_RX_PACKET_FRAGMENTS (2) + +struct VmxnetRxPkt { +struct virtio_net_hdr virt_hdr; +uint8_t ehdr_buf[ETH_MAX_L2_HDR_LEN]; +struct iovec vec[VMXNET_MAX_RX_PACKET_FRAGMENTS]; +uint16_t vec_len; +uint32_t tot_len; +uint16_t tci; +bool vlan_stripped; +bool has_virt_hdr; +eth_pkt_types_e packet_type; + +/* Analysis results */ +bool isip4; +bool isip6; +bool isudp; +bool istcp; +}; + +void vmxnet_rx_pkt_init(struct VmxnetRxPkt **pkt, bool has_virt_hdr) +{ +struct VmxnetRxPkt *p = g_malloc0(sizeof *p); +p->has_virt_hdr = has_virt_hdr; +*pkt = p; +} + +void vmxnet_rx_pkt_uninit(struct VmxnetRxPkt *pkt) +{ +g_free(pkt); +} + +struct virtio_net_hdr *vmxnet_rx_pkt_get_vhdr(struct VmxnetRxPkt *pkt) +{ +assert(pkt); +return &pkt->virt_hdr; +} + +void vmxnet_rx_pkt_attach_data(struct VmxnetRxPkt *pkt, const void *data, + size_t len, bool strip_vlan) +{ +uint16_t tci = 0; +uint16_t ploff; +assert(pkt); +pkt->vlan_stripped = false; + +if (strip_vlan) { +pkt->vlan_stripped = eth_strip_vlan(data, pkt->ehdr_buf, &ploff, &tci); +} + +if (pkt->vlan_stripped) { +pkt->vec[0].iov_base = pkt->ehdr_buf; +pkt->vec[0].iov_len = ploff - sizeof(struct vlan_header); +pkt->vec[1].iov_base = (uint8_t *) data + ploff; +pkt->vec[1].iov_len = len - ploff; +pkt->vec_len = 2; +pkt->tot_len = len - ploff + sizeof(struct eth_header); +} else { +pkt->vec[0].iov_base = (void *)data; +pkt->vec[0].iov_len = len; +pkt->vec_len = 1; +pkt->tot_len = len; +} + +pkt->tci = tci; + +eth_get_protocols(data, len, &pkt->isip4, &pkt->isip6, +&pkt->isudp, &pkt->istcp); +} + +void vmxnet_rx_pkt_dump(struct VmxnetRxPkt *pkt) +{ +#ifdef VMXNET_RX_PKT_DEBUG +VmxnetRxPkt *pkt = (VmxnetRxPkt 
*)pkt; +assert(pkt); + +printf("RX PKT: tot_len: %d, vlan_stripped: %d, vlan_tag: %d\n", + pkt->tot_len, pkt->vlan_stripped, pkt->tci); +#endif +} + +void vmxnet_rx_pkt_set_packet_type(struct VmxnetRxPkt *pkt, +eth_pkt_types_e packet_type) +{ +assert(pkt); + +pkt->packet_type = packet_type; + +} + +eth_pkt_types_e vmxnet_rx_pkt_get_packet_type(struct VmxnetRxPkt *pkt) +{ +assert(pkt); + +return pkt->packet_type; +} + +size_t vmxnet_rx_pkt_get_total_len(struct VmxnetRxPkt *pkt) +{ +assert(pkt); + +return pkt->tot_len; +} + +void vmxnet_rx_pkt_get_protocols(struct VmxnetRxPkt *pkt, + bool *isip4, bool *isip6, + bool *isudp, bool *istcp) +{ +assert(pkt); + +*isip4 = pkt->isip4; +*isip6 = pkt->isip6; +*isudp = pkt->isudp; +*istcp = pkt->istcp; +} + +struct iovec *vmxnet_rx_pkt_get_iovec(struct VmxnetRxPkt *pkt) +{ +assert(pkt); + +return
[Qemu-devel] [PATCH V11 0/5] VMXNET3 paravirtual NIC device implementation
> + * Flush shared memory changes
> + * Needed before transferring control to guest

what does 'transferring control to guest' mean?

[DF] Changed to
[DF] /*
[DF]  * Flush shared memory changes
[DF]  * Needed before sending interrupt to guest to ensure
[DF]  * it gets consistent memory state
[DF]  */

...

> + */
> +smp_wmb();
> +}

Don't use wrappers like this. They just hide bugs. For example it's not helpful before an interrupt in the function below.

[DF] I guess you are talking about vmxnet3_complete_packet()
[DF] Strictly speaking barrier is a must because we change shared memory in
[DF] vmxnet3_complete_packet()
[DF] And the wrapper is a good thing because its name explains its effect
[DF] in a formal way as opposed to comments

...

> +switch (status) {
> +case VMXNET3_PKT_STATUS_OK: {

don't put {} around cases: they align incorrectly. If it's too big, move to a function.

[DF] Fixed

...

> +static bool
> +vmxnet3_send_packet(VMXNET3State *s, uint32_t qidx)
> +{
> +size_t bytes_sent = 0;
> +bool res = true;

why = true? don't initialize just because.

[DF] Fixed

...

> +/*
> + * VMWARE headers we got from Linux kernel do not fully comply QEMU coding
> + * standards in sense of types and defines used.
> + * Since we didn't want to change VMWARE code, following set of typedefs
> + * and defines needed to compile these headers with QEMU introduced.
> + */

No need for this now. You can export headers and put them under linux-headers.

[DF] Not sure it is possible because the header as-is is not stand-alone and won't compile
[DF] without changes. We extracted definitions we use from their header and dropped unused
[DF] and kernel-specific stuff.
[DF] Please, advise.

...

> +if (VMXNET3_OM_TSO == s->offload_mode) {

Don't do Yoda style like this

[DF] "Yoda" style removed everywhere

Changes in V6:

Fixed most of the problems pointed out by Michael S. Tsirkin.
The only issue still open is the creation of a shared place for generic network structures and functions. Currently all generic network code introduced by VMXNET3 resides in the vmxnet_utils.c/h files. It could be moved to some shared location; however, we believe this is a matter for a separate refactoring, as there are a lot of copy-pasted definitions in almost every device and code cleanup efforts are required in order to create a truly shared codebase.
Reported-by: Michael S. Tsirkin

Implemented suggestions by Anthony Liguori
Reported-by: Anthony Liguori

Fixed incorrect checksum calculation for some packets in SW offloads mode
Reported-by: Gerhard Wiesinger

Changes in V5:

MSI-X save/load implemented in the device instead of pci bus as suggested by Michael S. Tsirkin
Reported-by: Michael S. Tsirkin

Patches regrouped as suggested by Paolo Bonzini
Reported-by: Paolo Bonzini

Changes in V4:

Fixed a few problems uncovered by NETIO test suite
Assertion on failure to initialize MSI/MSI-X replaced with warning message and fallback to Legacy/MSI respectively
Reported-by: Gerhard Wiesinger

Various coding style adjustments and patch split-up as suggested by Anthony Liguori
Reported-by: Anthony Liguori

Live migration support added

Changes in V3:

Fixed crash when net device that is used as network frontend has no virtio HDR support.
Task offloads emulation for cases when net device that is used as network frontend has no virtio HDR support.
Reported-by: Gerhard Wiesinger

Changes in V2:

License text changed according to community suggestions
Standard license header from GPLv2+ - licensed QEMU files used

Dmitry Fleytman (5):
  Adding utility function net_checksum_add_cont() that allows checksum calculation of scattered data with odd chunk sizes
  Adding utility function net_checksum_add_iov() for iovec checksum calculation
  Adding common definitions for VMWARE devices
  Adding packet abstraction for VMWARE network devices
  Adding VMXNET3 device implementation

 default-configs/pci.mak |    1 +
 hw/Makefile.objs        |    1 +
 hw/pci.h                |    1 +
 hw/vmware_utils.h       |  143 +++
 hw/vmxnet3.c            | 2437 +++
 hw/vmxnet3.h            |  762 +++
 hw/vmxnet_debug.h       |  121 +++
 hw/vmxnet_pkt.c         |  758 +++
 hw/vmxnet_pkt.h         |  309 ++
 hw/vmxnet_utils.c       |  219 +
 hw/vmxnet_utils.h       |  340 +++
 iov.h                   |    5 +
 net/checksum.c          |   41 +-
 net/checksum.h          |   22 +-
 14 files changed, 5153 insertions(+), 7 deletions(-)
 create mode 100644 hw/vmware_utils.h
 create mode 100644 hw/vmxnet3.c
 create mode 100644 hw/vmxnet3.h
 create mode 100644 hw/vmxnet_debug.h
 create mode 100644 hw/vmxnet_pkt.c
 create mode 100644 hw/vmxnet_pkt.h
 create mode 100644 hw/vmxnet_utils.c
 create mode 100644 hw/vmxnet_utils.h

--
1.7.11.7
[Qemu-devel] [PATCH v11 2/5] iovec checksum calculation fuction
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- include/net/checksum.h | 8 net/checksum.c | 28 2 files changed, 36 insertions(+) diff --git a/include/net/checksum.h b/include/net/checksum.h index 3e7b93d..b1cf18a 100644 --- a/include/net/checksum.h +++ b/include/net/checksum.h @@ -19,6 +19,7 @@ #define QEMU_NET_CHECKSUM_H #include +#include "qemu-common.h" uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq); uint16_t net_checksum_finish(uint32_t sum); @@ -38,4 +39,11 @@ net_raw_checksum(uint8_t *data, int length) return net_checksum_finish(net_checksum_add(length, data)); } +/** + * Checksum calculation for scatter-gather vector + */ +uint32_t net_checksum_add_iov(const struct iovec *iov, + const unsigned int iov_cnt, + uint32_t iov_off, uint32_t size); + #endif /* QEMU_NET_CHECKSUM_H */ diff --git a/net/checksum.c b/net/checksum.c index 4fa5563..9c813ff 100644 --- a/net/checksum.c +++ b/net/checksum.c @@ -84,3 +84,31 @@ void net_checksum_calculate(uint8_t *data, int length) data[14+hlen+csum_offset] = csum >> 8; data[14+hlen+csum_offset+1] = csum & 0xff; } + +uint32_t +net_checksum_add_iov(const struct iovec *iov, const unsigned int iov_cnt, + uint32_t iov_off, uint32_t size) +{ +size_t iovec_off, buf_off; +unsigned int i; +uint32_t res = 0; +uint32_t seq = 0; + +iovec_off = 0; +buf_off = 0; +for (i = 0; i < iov_cnt && size; i++) { +if (iov_off < (iovec_off + iov[i].iov_len)) { +size_t len = MIN((iovec_off + iov[i].iov_len) - iov_off , size); +void *chunk_buf = iov[i].iov_base + (iov_off - iovec_off); + +res += net_checksum_add_cont(len, chunk_buf, seq); +seq += len; + +buf_off += len; +iov_off += len; +size -= len; +} +iovec_off += iov[i].iov_len; +} +return res; +} -- 1.8.1.2
Re: [Qemu-devel] [PATCH V10 0/5] VMXNET3 paravirtual NIC device implementation
Done.

On Thu, Feb 21, 2013 at 12:47 PM, Stefan Hajnoczi wrote:
> Hi Dmitry,
> The net multiqueue feature went into QEMU 1.4 and conflicts with vmxnet3.c.
>
> Please post a rebased version onto qemu.git/master so vmxnet3 can be
> merged. I'm currently getting the following compiler errors with
> these patches:
>
> hw/vmxnet3.c: In function ‘vmxnet3_set_variable_mac’:
> hw/vmxnet3.c:438:37: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c: In function ‘vmxnet3_send_packet’:
> hw/vmxnet3.c:680:47: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c: In function ‘vmxnet3_update_features’:
> hw/vmxnet3.c:1285:31: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c: In function ‘vmxnet3_can_receive’:
> hw/vmxnet3.c:1715:94: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c:1715:162: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c:1715:178: error: initialization from incompatible pointer type [-Werror]
> hw/vmxnet3.c:1715:178: error: (near initialization for ‘s’) [-Werror]
> hw/vmxnet3.c:1715:216: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c:1715:72: error: unused variable ‘offset_must_be_zero’ [-Werror=unused-variable]
> hw/vmxnet3.c: In function ‘vmxnet3_receive’:
> hw/vmxnet3.c:1795:94: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c:1795:162: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c:1795:178: error: initialization from incompatible pointer type [-Werror]
> hw/vmxnet3.c:1795:178: error: (near initialization for ‘s’) [-Werror]
> hw/vmxnet3.c:1795:216: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c:1795:72: error: unused variable ‘offset_must_be_zero’ [-Werror=unused-variable]
> hw/vmxnet3.c:1798:37: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c: In function ‘vmxnet3_cleanup’:
> hw/vmxnet3.c:1830:94: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c:1830:162: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c:1830:178: error: initialization from incompatible pointer type [-Werror]
> hw/vmxnet3.c:1830:178: error: (near initialization for ‘s’) [-Werror]
> hw/vmxnet3.c:1830:216: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c:1830:72: error: unused variable ‘offset_must_be_zero’ [-Werror=unused-variable]
> hw/vmxnet3.c: In function ‘vmxnet3_set_link_status’:
> hw/vmxnet3.c:1836:94: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c:1836:162: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c:1836:178: error: initialization from incompatible pointer type [-Werror]
> hw/vmxnet3.c:1836:178: error: (near initialization for ‘s’) [-Werror]
> hw/vmxnet3.c:1836:216: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c:1836:72: error: unused variable ‘offset_must_be_zero’ [-Werror=unused-variable]
> hw/vmxnet3.c: In function ‘vmxnet3_peer_has_vnet_hdr’:
> hw/vmxnet3.c:1859:34: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c: In function ‘vmxnet3_net_uninit’:
> hw/vmxnet3.c:1877:32: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c: In function ‘vmxnet3_net_init’:
> hw/vmxnet3.c:1909:36: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c:1910:34: error: ‘NICState’ has no member named ‘nc’
> hw/vmxnet3.c:1913:37: error: ‘NICState’ has no member named ‘nc’
>
Re: [Qemu-devel] [PATCH v11 2/5] iovec checksum calculation fuction
Eduardo, Andreas,

Thanks for pointing this out; we didn't know about this limitation of qemu-common.h usage. The include directive will be moved to the corresponding .c file.

Dmitry.

On Tue, Feb 26, 2013 at 3:38 PM, Eduardo Habkost wrote:
> On Mon, Feb 25, 2013 at 09:37:52PM +0100, Andreas Färber wrote:
> > On 25.02.2013 21:11, Dmitry Fleytman wrote:
> [...]
> > > diff --git a/include/net/checksum.h b/include/net/checksum.h
> > > index 3e7b93d..b1cf18a 100644
> > > --- a/include/net/checksum.h
> > > +++ b/include/net/checksum.h
> > > @@ -19,6 +19,7 @@
> > >  #define QEMU_NET_CHECKSUM_H
> > >
> > >  #include <stdint.h>
> > > +#include "qemu-common.h"
> >
> > Eduardo has worked hard to resolve circular qemu-common.h dependencies!
> > Are you sure you are not reintroducing one here?
>
> Even if there's no circular dependency yet, this makes it very easy to
> introduce circular dependencies silently if one day a header included by
> qemu-common.h ends up including checksum.h. That's why qemu-common.h
> shouldn't be included by any header file.
>
> > What do you actually
> > need out of it? You already have stdint.h for uint32_t, and struct iovec
> > is used as pointer so you shouldn't need its internals from
> > qemu-common.h here and can include it from checksum.c instead.
>
> [...]
>
> --
> Eduardo
>
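The review point can be illustrated outside QEMU with a minimal single-file sketch. It assumes a POSIX system (for <sys/uio.h>), and the function iov_total_len() and the file names in the comments are hypothetical, not part of this series. The "header" half only forward-declares struct iovec, which is enough because the prototype uses the type through a pointer; the full definition is pulled in only by the "source" half.

```c
#include <stddef.h>

/* ---- what would live in the header (hypothetical example.h) ---- */

/*
 * Forward declaration at file scope: the prototype below only needs to
 * know that the type exists, not how it is laid out, so no heavyweight
 * header has to be included here.
 */
struct iovec;

size_t iov_total_len(const struct iovec *iov, unsigned int iov_cnt);

/* ---- what would live in the source file (hypothetical example.c) ---- */

#include <sys/uio.h>   /* full definition of struct iovec (POSIX) */

size_t iov_total_len(const struct iovec *iov, unsigned int iov_cnt)
{
    size_t total = 0;
    unsigned int i;

    /* Member access is only needed here, in the implementation. */
    for (i = 0; i < iov_cnt; i++) {
        total += iov[i].iov_len;
    }
    return total;
}
```

With this split, a header that is included from many places never drags in the definition, so it cannot take part in an include cycle through it.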
Re: [Qemu-devel] [PATCH v11 2/5] iovec checksum calculation fuction
Thanks Andreas, fixed.

On Mon, Feb 25, 2013 at 10:37 PM, Andreas Färber wrote:
> On 25.02.2013 21:11, Dmitry Fleytman wrote:
> > Signed-off-by: Dmitry Fleytman
> > Signed-off-by: Yan Vugenfirer
> > ---
> >  include/net/checksum.h | 8
> >  net/checksum.c | 28
> >  2 files changed, 36 insertions(+)
> >
> > diff --git a/include/net/checksum.h b/include/net/checksum.h
> > index 3e7b93d..b1cf18a 100644
> > --- a/include/net/checksum.h
> > +++ b/include/net/checksum.h
> > @@ -19,6 +19,7 @@
> >  #define QEMU_NET_CHECKSUM_H
> >
> >  #include <stdint.h>
> > +#include "qemu-common.h"
>
> Eduardo has worked hard to resolve circular qemu-common.h dependencies!
> Are you sure you are not reintroducing one here? What do you actually
> need out of it? You already have stdint.h for uint32_t, and struct iovec
> is used as pointer so you shouldn't need its internals from
> qemu-common.h here and can include it from checksum.c instead.
>
> >
> >  uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq);
> >  uint16_t net_checksum_finish(uint32_t sum);
> > @@ -38,4 +39,11 @@ net_raw_checksum(uint8_t *data, int length)
> >      return net_checksum_finish(net_checksum_add(length, data));
> >  }
> >
> > +/**
>  * net_checksum_add_iov:
>  * @iov: ...
>  * @iov_cnt: ...
>  * @iov_off: ...
>  * @size: ...
>  *
> > + * Checksum calculation for scatter-gather vector
> > + */
> > +uint32_t net_checksum_add_iov(const struct iovec *iov,
> > +                              const unsigned int iov_cnt,
> > +                              uint32_t iov_off, uint32_t size);
> > +
> >  #endif /* QEMU_NET_CHECKSUM_H */
> [snip]
>
> The subject is also improvable: "net: ", an appropriate verb and a typo.
>
> Regards,
> Andreas
>
> --
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
>
Re: [Qemu-devel] [PATCH v11 5/5] VMXNET3 device implementation
Andreas, thanks for your review.

The issue is that we were developing this device using other devices' code (mainly e1000 and virtio-net) as a reference, and unfortunately they don't follow the current QOM rules. Also, these patches have been around for a long time (about a year already), and I'm not sure what the proper convention was in those days.

Anyway, I've fixed almost everything you've pointed out. See my comments on specific issues below.

On Tue, Feb 26, 2013 at 12:05 AM, Andreas Färber wrote:
> On 25.02.2013 21:11, Dmitry Fleytman wrote:
> [snip]
> > +static int
> > +vmxnet3_msix_load(QEMUFile *f, void *opaque, int version_id)
> > +{
> > +    msix_load(&((VMXNET3State *)opaque)->dev, f);
>
> Apart from doing too much in one line, you should not access the parent
> field ->dev directly but use PCI_DEVICE(opaque) cast macro, which makes
> the code much simpler.
>
> http://wiki.qemu.org/QOMConventions
>
> Please check the above rest of the file for more occurrences.
>
> Suggest to rename dev field to parent_obj as recommended above, then you
> will easily see where it is (mis)used outside of VMState code.
>
>
[DF] Done.

> > +    return 0;
> > +}
> > +
> > +static int vmxnet3_pci_init(PCIDevice *dev)
> > +{
> > +    static const MemoryRegionOps b0_ops = {
> > +        .read = vmxnet3_io_bar0_read,
> > +        .write = vmxnet3_io_bar0_write,
> > +        .endianness = DEVICE_LITTLE_ENDIAN,
> > +        .impl = {
> > +            .min_access_size = 4,
> > +            .max_access_size = 4,
> > +        },
> > +    };
> > +
> > +    static const MemoryRegionOps b1_ops = {
> > +        .read = vmxnet3_io_bar1_read,
> > +        .write = vmxnet3_io_bar1_write,
> > +        .endianness = DEVICE_LITTLE_ENDIAN,
> > +        .impl = {
> > +            .min_access_size = 4,
> > +            .max_access_size = 4,
> > +        },
> > +    };
>
> Any reason to place these inside the function?
> Makes it harder to see what the function is actually doing and may need
> to be relocated to instance_init.

[DF] Fixed.
> > + > > +VMXNET3State *s = DO_UPCAST(VMXNET3State, dev, dev); > > Please don't use DO_UPCAST() for QOM types, introduce your own > VMXNET3(obj) cast macro and use that instead. > > [DF] Done. > > + > > +VMW_CBPRN("Starting init..."); > > + > > +memory_region_init_io(&s->bar0, &b0_ops, s, > > + "vmxnet3-b0", VMXNET3_PT_REG_SIZE); > > +pci_register_bar(&s->dev, VMXNET3_BAR0_IDX, > > Please don't access the parent field ->dev directly. Here you already > have a PCIDevice *dev, which you would be advised to rename to PCIDevice > *d or so, see below. [DF] Fixed everywhere. > > + PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar0); > > + > > +memory_region_init_io(&s->bar1, &b1_ops, s, > > + "vmxnet3-b1", VMXNET3_VD_REG_SIZE); > > +pci_register_bar(&s->dev, VMXNET3_BAR1_IDX, > > + PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar1); > > + > > +memory_region_init(&s->msix_bar, "vmxnet3-msix-bar", > > + VMXNET3_MSIX_BAR_SIZE); > > +pci_register_bar(&s->dev, VMXNET3_MSIX_BAR_IDX, > > + PCI_BASE_ADDRESS_SPACE_MEMORY, &s->msix_bar); > > + > > +vmxnet3_reset_interrupt_states(s); > > + > > +/* Interrupt pin A */ > > +s->dev.config[PCI_INTERRUPT_PIN] = 0x01; > > dev->config[... > > > + > > +if (!vmxnet3_init_msix(s)) { > > +VMW_WRPRN("Failed to initialize MSI-X, configuration is > inconsistent."); > > +} > > + > > +if (!vmxnet3_init_msi(s)) { > > +VMW_WRPRN("Failed to initialize MSI, configuration is > inconsistent."); > > +} > > When might these functions fail and why do they not return non-0 then? [DF] These functions may fail in case corresponding QEMU function fails. [DF] They are bool, not int, return value follows boolean convention. > > + > > +vmxnet3_net_init(s); > > + > > +register_savevm(&dev->qdev, "vmxnet3-msix", -1, 1, > > +vmxnet3_msix_save, vmxnet3_msix_load, s); > > Why is this needed and not in the main VMStateDescription as a subsection? [DF] The only two places in QEMU doing the same use this technique, [DF] we wrote this by example. > > + > > +add_boot_device_pa
[Qemu-devel] [PATCH V12 1/5] Checksum-related utility functions
net_checksum_add_cont() checksum calculation for scattered data with odd chunk sizes net_raw_checksum() checksum calculation for a buffer Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- include/net/checksum.h | 14 +- net/checksum.c | 13 +++-- 2 files changed, 20 insertions(+), 7 deletions(-) diff --git a/include/net/checksum.h b/include/net/checksum.h index 1f05298..3e7b93d 100644 --- a/include/net/checksum.h +++ b/include/net/checksum.h @@ -20,10 +20,22 @@ #include -uint32_t net_checksum_add(int len, uint8_t *buf); +uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq); uint16_t net_checksum_finish(uint32_t sum); uint16_t net_checksum_tcpudp(uint16_t length, uint16_t proto, uint8_t *addrs, uint8_t *buf); void net_checksum_calculate(uint8_t *data, int length); +static inline uint32_t +net_checksum_add(int len, uint8_t *buf) +{ +return net_checksum_add_cont(len, buf, 0); +} + +static inline uint16_t +net_raw_checksum(uint8_t *data, int length) +{ +return net_checksum_finish(net_checksum_add(length, data)); +} + #endif /* QEMU_NET_CHECKSUM_H */ diff --git a/net/checksum.c b/net/checksum.c index 9919b2e..4fa5563 100644 --- a/net/checksum.c +++ b/net/checksum.c @@ -20,16 +20,17 @@ #define PROTO_TCP 6 #define PROTO_UDP 17 -uint32_t net_checksum_add(int len, uint8_t *buf) +uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq) { uint32_t sum = 0; int i; -for (i = 0; i < len; i++) { - if (i & 1) - sum += (uint32_t)buf[i]; - else - sum += (uint32_t)buf[i] << 8; +for (i = seq; i < seq + len; i++) { +if (i & 1) { +sum += (uint32_t)buf[i - seq]; +} else { +sum += (uint32_t)buf[i - seq] << 8; +} } return sum; } -- 1.8.1.2
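Why the seq argument matters can be seen in a small standalone sketch. The net_checksum_add_cont() loop below is copied from the patch above, with only a const qualifier added; sum_in_two_chunks() is a hypothetical helper written for this illustration and is not part of the series.

```c
#include <stdint.h>

/*
 * Same loop as in the patch: 'seq' is the offset of buf[0] within the
 * whole data stream, so byte parity is judged against the stream and
 * not against the start of this particular chunk.
 */
static uint32_t net_checksum_add_cont(int len, const uint8_t *buf, int seq)
{
    uint32_t sum = 0;
    int i;

    for (i = seq; i < seq + len; i++) {
        if (i & 1) {
            sum += (uint32_t)buf[i - seq];        /* odd offset: low byte */
        } else {
            sum += (uint32_t)buf[i - seq] << 8;   /* even offset: high byte */
        }
    }
    return sum;
}

/* Hypothetical helper: sum 'buf' as two chunks split at offset 'split'. */
static uint32_t sum_in_two_chunks(const uint8_t *buf, int len, int split)
{
    uint32_t sum = net_checksum_add_cont(split, buf, 0);

    /* Continue where the first chunk stopped, keeping stream parity. */
    sum += net_checksum_add_cont(len - split, buf + split, split);
    return sum;
}
```

For the bytes 01 02 03 04 05 split after the third byte, the continued sum is 0x0906, the same as a single pass over the buffer, whereas restarting the parity at zero for the second chunk would give 0x0807.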
[Qemu-devel] [PATCH V12 2/5] net: iovec checksum calculator
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- include/net/checksum.h | 12 net/checksum.c | 29 + 2 files changed, 41 insertions(+) diff --git a/include/net/checksum.h b/include/net/checksum.h index 3e7b93d..80203fb 100644 --- a/include/net/checksum.h +++ b/include/net/checksum.h @@ -38,4 +38,16 @@ net_raw_checksum(uint8_t *data, int length) return net_checksum_finish(net_checksum_add(length, data)); } +/** + * net_checksum_add_iov: scatter-gather vector checksumming + * + * @iov: input scatter-gather array + * @iov_cnt: number of array elements + * @iov_off: starting iov offset for checksumming + * @size: length of data to be checksummed + */ +uint32_t net_checksum_add_iov(const struct iovec *iov, + const unsigned int iov_cnt, + uint32_t iov_off, uint32_t size); + #endif /* QEMU_NET_CHECKSUM_H */ diff --git a/net/checksum.c b/net/checksum.c index 4fa5563..14c0855 100644 --- a/net/checksum.c +++ b/net/checksum.c @@ -15,6 +15,7 @@ * along with this program; if not, see <http://www.gnu.org/licenses/>. */ +#include "qemu-common.h" #include "net/checksum.h" #define PROTO_TCP 6 @@ -84,3 +85,31 @@ void net_checksum_calculate(uint8_t *data, int length) data[14+hlen+csum_offset] = csum >> 8; data[14+hlen+csum_offset+1] = csum & 0xff; } + +uint32_t +net_checksum_add_iov(const struct iovec *iov, const unsigned int iov_cnt, + uint32_t iov_off, uint32_t size) +{ +size_t iovec_off, buf_off; +unsigned int i; +uint32_t res = 0; +uint32_t seq = 0; + +iovec_off = 0; +buf_off = 0; +for (i = 0; i < iov_cnt && size; i++) { +if (iov_off < (iovec_off + iov[i].iov_len)) { +size_t len = MIN((iovec_off + iov[i].iov_len) - iov_off , size); +void *chunk_buf = iov[i].iov_base + (iov_off - iovec_off); + +res += net_checksum_add_cont(len, chunk_buf, seq); +seq += len; + +buf_off += len; +iov_off += len; +size -= len; +} +iovec_off += iov[i].iov_len; +} +return res; +} -- 1.8.1.2
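The behaviour of net_checksum_add_iov() on odd-sized fragments can be checked with a standalone sketch. It assumes a POSIX system for struct iovec. The logic follows the patch above, with three adjustments made only for this illustration: the unused buf_off variable is dropped, the MIN() macro is replaced by an explicit comparison, and the chunk pointer is cast to uint8_t * so that no arithmetic is done on a void pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>   /* struct iovec (POSIX) */

/* Loop from patch 1/5; 'seq' is the stream offset of buf[0]. */
static uint32_t net_checksum_add_cont(int len, const uint8_t *buf, int seq)
{
    uint32_t sum = 0;
    int i;

    for (i = seq; i < seq + len; i++) {
        if (i & 1) {
            sum += (uint32_t)buf[i - seq];
        } else {
            sum += (uint32_t)buf[i - seq] << 8;
        }
    }
    return sum;
}

static uint32_t
net_checksum_add_iov(const struct iovec *iov, unsigned int iov_cnt,
                     uint32_t iov_off, uint32_t size)
{
    size_t iovec_off = 0;   /* offset of iov[i] within the whole vector */
    unsigned int i;
    uint32_t res = 0;
    uint32_t seq = 0;       /* bytes summed so far, preserves byte parity */

    for (i = 0; i < iov_cnt && size; i++) {
        /* Skip elements that end before the requested start offset. */
        if (iov_off < iovec_off + iov[i].iov_len) {
            size_t avail = (iovec_off + iov[i].iov_len) - iov_off;
            size_t len = avail < size ? avail : size;
            const uint8_t *chunk =
                (const uint8_t *)iov[i].iov_base + (iov_off - iovec_off);

            res += net_checksum_add_cont((int)len, chunk, (int)seq);
            seq += (uint32_t)len;
            iov_off += (uint32_t)len;
            size -= (uint32_t)len;
        }
        iovec_off += iov[i].iov_len;
    }
    return res;
}
```

Note that seq starts at zero at iov_off, so parity is counted from the first checksummed byte rather than from the start of the vector. Six bytes 01..06 split into fragments of 1, 3 and 2 bytes therefore give the same sum as one contiguous buffer.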
[Qemu-devel] [PATCH V12 0/5] VMXNET3 paravirtual NIC device implementation
inline void vmxnet3_flush_shmem_changes(void)
> +{
> +/*
> + * Flush shared memory changes
> + * Needed before transferring control to guest

what does 'transferring control to guest' mean?

[DF] Changed to
[DF] /*
[DF]  * Flush shared memory changes
[DF]  * Needed before sending interrupt to guest to ensure
[DF]  * it gets consistent memory state
[DF]  */

...

> + */
> +smp_wmb();
> +}

Don't use wrappers like this. They just hide bugs. For example it's not helpful before an interrupt in the function below.

[DF] I guess you are talking about vmxnet3_complete_packet()
[DF] Strictly speaking barrier is a must because we change shared memory in
[DF] vmxnet3_complete_packet()
[DF] And the wrapper is a good thing because its name explains its effect
[DF] in a formal way as opposed to comments

...

> +switch (status) {
> +case VMXNET3_PKT_STATUS_OK: {

don't put {} around cases: they align incorrectly. If it's too big, move to a function.

[DF] Fixed

...

> +static bool
> +vmxnet3_send_packet(VMXNET3State *s, uint32_t qidx)
> +{
> +size_t bytes_sent = 0;
> +bool res = true;

why = true? don't initialize just because.

[DF] Fixed

...

> +/*
> + * VMWARE headers we got from Linux kernel do not fully comply QEMU coding
> + * standards in sense of types and defines used.
> + * Since we didn't want to change VMWARE code, following set of typedefs
> + * and defines needed to compile these headers with QEMU introduced.
> + */

No need for this now. You can export headers and put them under linux-headers.

[DF] Not sure it is possible because the header as-is is not stand-alone and won't compile
[DF] without changes. We extracted definitions we use from their header and dropped unused
[DF] and kernel-specific stuff.
[DF] Please, advise.

...

> +if (VMXNET3_OM_TSO == s->offload_mode) {

Don't do Yoda style like this

[DF] "Yoda" style removed everywhere

Changes in V6:

Fixed most of the problems pointed out by Michael S. Tsirkin.
The only issue still open is the creation of a shared place for generic network structures and functions. Currently all generic network code introduced by VMXNET3 resides in the vmxnet_utils.c/h files. It could be moved to some shared location; however, we believe this is a matter for a separate refactoring, as there are a lot of copy-pasted definitions in almost every device and code cleanup efforts are required in order to create a truly shared codebase.
Reported-by: Michael S. Tsirkin

Implemented suggestions by Anthony Liguori
Reported-by: Anthony Liguori

Fixed incorrect checksum calculation for some packets in SW offloads mode
Reported-by: Gerhard Wiesinger

Changes in V5:

MSI-X save/load implemented in the device instead of pci bus as suggested by Michael S. Tsirkin
Reported-by: Michael S. Tsirkin

Patches regrouped as suggested by Paolo Bonzini
Reported-by: Paolo Bonzini

Changes in V4:

Fixed a few problems uncovered by NETIO test suite
Assertion on failure to initialize MSI/MSI-X replaced with warning message and fallback to Legacy/MSI respectively
Reported-by: Gerhard Wiesinger

Various coding style adjustments and patch split-up as suggested by Anthony Liguori
Reported-by: Anthony Liguori

Live migration support added

Changes in V3:

Fixed crash when net device that is used as network frontend has no virtio HDR support.
Task offloads emulation for cases when net device that is used as network frontend has no virtio HDR support.
Reported-by: Gerhard Wiesinger

Changes in V2:

License text changed according to community suggestions
Standard license header from GPLv2+ - licensed QEMU files used

Dmitry Fleytman (5):
  Checksum-related utility functions
  net: iovec checksum calculator
  Common definitions for VMWARE devices
  Packet abstraction for VMWARE network devices
  VMXNET3 device implementation

 default-configs/pci.mak |    1 +
 hw/Makefile.objs        |    2 +
 hw/pci/pci.h            |    1 +
 hw/vmware_utils.h       |  143 +++
 hw/vmxnet3.c            | 2460 +++
 hw/vmxnet3.h            |  760 +++
 hw/vmxnet_debug.h       |  115 +++
 hw/vmxnet_rx_pkt.c      |  187
 hw/vmxnet_rx_pkt.h      |  174
 hw/vmxnet_tx_pkt.c      |  567 +++
 hw/vmxnet_tx_pkt.h      |  148 +++
 include/net/checksum.h  |   26 +-
 include/net/eth.h       |  347 +++
 net/Makefile.objs       |    1 +
 net/checksum.c          |   42 +-
 net/eth.c               |  217 +
 16 files changed, 5184 insertions(+), 7 deletions(-)
 create mode 100644 hw/vmware_utils.h
 create mode 100644 hw/vmxnet3.c
 create mode 100644 hw/vmxnet3.h
 create mode 100644 hw/vmxnet_debug.h
 create mode 100644 hw/vmxnet_rx_pkt.c
 create mode 100644 hw/vmxnet_rx_pkt.h
 create mode 100644 hw/vmxnet_tx_pkt.c
 create mode 100644 hw/vmxnet_tx_pkt.h
 create mode 100644 include/net/eth.h
 create mode 100644 net/eth.c

--
1.7.11.7
[Qemu-devel] [PATCH V12 4/5] Packet abstraction for VMWARE network devices
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/Makefile.objs | 1 + hw/vmxnet_rx_pkt.c | 187 ++ hw/vmxnet_rx_pkt.h | 174 hw/vmxnet_tx_pkt.c | 567 + hw/vmxnet_tx_pkt.h | 148 ++ 5 files changed, 1077 insertions(+) create mode 100644 hw/vmxnet_rx_pkt.c create mode 100644 hw/vmxnet_rx_pkt.h create mode 100644 hw/vmxnet_tx_pkt.c create mode 100644 hw/vmxnet_tx_pkt.h diff --git a/hw/Makefile.objs b/hw/Makefile.objs index 40ebe46..14922cb 100644 --- a/hw/Makefile.objs +++ b/hw/Makefile.objs @@ -119,6 +119,7 @@ common-obj-$(CONFIG_PCNET_PCI) += pcnet-pci.o common-obj-$(CONFIG_PCNET_COMMON) += pcnet.o common-obj-$(CONFIG_E1000_PCI) += e1000.o common-obj-$(CONFIG_RTL8139_PCI) += rtl8139.o +common-obj-$(CONFIG_VMXNET3_PCI) += vmxnet_tx_pkt.o vmxnet_rx_pkt.o common-obj-$(CONFIG_SMC91C111) += smc91c111.o common-obj-$(CONFIG_LAN9118) += lan9118.o diff --git a/hw/vmxnet_rx_pkt.c b/hw/vmxnet_rx_pkt.c new file mode 100644 index 000..a40e346 --- /dev/null +++ b/hw/vmxnet_rx_pkt.c @@ -0,0 +1,187 @@ +/* + * QEMU VMWARE VMXNET* paravirtual NICs - RX packets abstractions + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Tamir Shomer + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#include "vmxnet_rx_pkt.h" +#include "net/eth.h" +#include "qemu-common.h" +#include "qemu/iov.h" +#include "net/checksum.h" +#include "net/tap.h" + +/* + * RX packet may contain up to 2 fragments - rebuilt eth header + * in case of VLAN tag stripping + * and payload received from QEMU - in any case + */ +#define VMXNET_MAX_RX_PACKET_FRAGMENTS (2) + +struct VmxnetRxPkt { +struct virtio_net_hdr virt_hdr; +uint8_t ehdr_buf[ETH_MAX_L2_HDR_LEN]; +struct iovec vec[VMXNET_MAX_RX_PACKET_FRAGMENTS]; +uint16_t vec_len; +uint32_t tot_len; +uint16_t tci; +bool vlan_stripped; +bool has_virt_hdr; +eth_pkt_types_e packet_type; + +/* Analysis results */ +bool isip4; +bool isip6; +bool isudp; +bool istcp; +}; + +void vmxnet_rx_pkt_init(struct VmxnetRxPkt **pkt, bool has_virt_hdr) +{ +struct VmxnetRxPkt *p = g_malloc0(sizeof *p); +p->has_virt_hdr = has_virt_hdr; +*pkt = p; +} + +void vmxnet_rx_pkt_uninit(struct VmxnetRxPkt *pkt) +{ +g_free(pkt); +} + +struct virtio_net_hdr *vmxnet_rx_pkt_get_vhdr(struct VmxnetRxPkt *pkt) +{ +assert(pkt); +return &pkt->virt_hdr; +} + +void vmxnet_rx_pkt_attach_data(struct VmxnetRxPkt *pkt, const void *data, + size_t len, bool strip_vlan) +{ +uint16_t tci = 0; +uint16_t ploff; +assert(pkt); +pkt->vlan_stripped = false; + +if (strip_vlan) { +pkt->vlan_stripped = eth_strip_vlan(data, pkt->ehdr_buf, &ploff, &tci); +} + +if (pkt->vlan_stripped) { +pkt->vec[0].iov_base = pkt->ehdr_buf; +pkt->vec[0].iov_len = ploff - sizeof(struct vlan_header); +pkt->vec[1].iov_base = (uint8_t *) data + ploff; +pkt->vec[1].iov_len = len - ploff; +pkt->vec_len = 2; +pkt->tot_len = len - ploff + sizeof(struct eth_header); +} else { +pkt->vec[0].iov_base = (void *)data; +pkt->vec[0].iov_len = len; +pkt->vec_len = 1; +pkt->tot_len = len; +} + +pkt->tci = tci; + +eth_get_protocols(data, len, &pkt->isip4, &pkt->isip6, +&pkt->isudp, &pkt->istcp); +} + +void vmxnet_rx_pkt_dump(struct VmxnetRxPkt *pkt) +{ +#ifdef VMXNET_RX_PKT_DEBUG +VmxnetRxPkt *pkt = (VmxnetRxPkt 
*)pkt; +assert(pkt); + +printf("RX PKT: tot_len: %d, vlan_stripped: %d, vlan_tag: %d\n", + pkt->tot_len, pkt->vlan_stripped, pkt->tci); +#endif +} + +void vmxnet_rx_pkt_set_packet_type(struct VmxnetRxPkt *pkt, +eth_pkt_types_e packet_type) +{ +assert(pkt); + +pkt->packet_type = packet_type; + +} + +eth_pkt_types_e vmxnet_rx_pkt_get_packet_type(struct VmxnetRxPkt *pkt) +{ +assert(pkt); + +return pkt->packet_type; +} + +size_t vmxnet_rx_pkt_get_total_len(struct VmxnetRxPkt *pkt) +{ +assert(pkt); + +return pkt->tot_len; +} + +void vmxnet_rx_pkt_get_protocols(struct VmxnetRxPkt *pkt, + bool *isip4, bool *isip6, + bool *isudp, bool *istcp) +{ +assert(pkt); + +*isip4 = pkt->isip4; +*isip6 = pkt->isip6; +*isudp = pkt->isudp; +*istcp = pkt->istcp; +} + +struct iovec *vmxnet_rx_pkt_get_iovec(struct VmxnetRxPkt *pkt) +{ +assert(pkt); + +return
[Qemu-devel] [PATCH V12 3/5] Common definitions for VMWARE devices
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/vmware_utils.h | 143 ++ hw/vmxnet_debug.h | 115 ++ include/net/eth.h | 347 ++ net/Makefile.objs | 1 + net/eth.c | 217 ++ 5 files changed, 823 insertions(+) create mode 100644 hw/vmware_utils.h create mode 100644 hw/vmxnet_debug.h create mode 100644 include/net/eth.h create mode 100644 net/eth.c diff --git a/hw/vmware_utils.h b/hw/vmware_utils.h new file mode 100644 index 000..5307e2c --- /dev/null +++ b/hw/vmware_utils.h @@ -0,0 +1,143 @@ +/* + * QEMU VMWARE paravirtual devices - auxiliary code + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef VMWARE_UTILS_H +#define VMWARE_UTILS_H + +#include "qemu/range.h" + +#ifndef VMW_SHPRN +#define VMW_SHPRN(fmt, ...) 
do {} while (0) +#endif + +/* + * Shared memory access functions with byte swap support + * Each function contains printout for reverse-engineering needs + * + */ +static inline void +vmw_shmem_read(hwaddr addr, void *buf, int len) +{ +VMW_SHPRN("SHMEM r: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_read(addr, buf, len); +} + +static inline void +vmw_shmem_write(hwaddr addr, void *buf, int len) +{ +VMW_SHPRN("SHMEM w: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_write(addr, buf, len); +} + +static inline void +vmw_shmem_rw(hwaddr addr, void *buf, int len, int is_write) +{ +VMW_SHPRN("SHMEM r/w: %" PRIx64 ", len: %d (to %p), is write: %d", + addr, len, buf, is_write); + +cpu_physical_memory_rw(addr, buf, len, is_write); +} + +static inline void +vmw_shmem_set(hwaddr addr, uint8 val, int len) +{ +int i; +VMW_SHPRN("SHMEM set: %" PRIx64 ", len: %d (value 0x%X)", addr, len, val); + +for (i = 0; i < len; i++) { +cpu_physical_memory_write(addr + i, &val, 1); +} +} + +static inline uint32_t +vmw_shmem_ld8(hwaddr addr) +{ +uint8_t res = ldub_phys(addr); +VMW_SHPRN("SHMEM load8: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st8(hwaddr addr, uint8_t value) +{ +VMW_SHPRN("SHMEM store8: %" PRIx64 " (value 0x%X)", addr, value); +stb_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld16(hwaddr addr) +{ +uint16_t res = lduw_le_phys(addr); +VMW_SHPRN("SHMEM load16: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st16(hwaddr addr, uint16_t value) +{ +VMW_SHPRN("SHMEM store16: %" PRIx64 " (value 0x%X)", addr, value); +stw_le_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld32(hwaddr addr) +{ +uint32_t res = ldl_le_phys(addr); +VMW_SHPRN("SHMEM load32: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st32(hwaddr addr, uint32_t value) +{ +VMW_SHPRN("SHMEM store32: %" PRIx64 " (value 0x%X)", addr, 
value); +stl_le_phys(addr, value); +} + +static inline uint64_t +vmw_shmem_ld64(hwaddr addr) +{ +uint64_t res = ldq_le_phys(addr); +VMW_SHPRN("SHMEM load64: %" PRIx64 " (value %" PRIx64 ")", addr, res); +return res; +} + +static inline void +vmw_shmem_st64(hwaddr addr, uint64_t value) +{ +VMW_SHPRN("SHMEM store64: %" PRIx64 " (value %" PRIx64 ")", addr, value); +stq_le_phys(addr, value); +} + +/* Macros for simplification of operations on array-style registers */ + +/* + * Whether lies inside of array-style register defined by , + * number of elements () and element size () + * +*/ +#define VMW_IS_MULTIREG_ADDR(addr, base, cnt, regsize) \ +range_covers_byte(base, cnt * regsize, addr) + +/* + * Returns index of given register () in array-style register defined by + * and element size () + * +*/ +#define VMW_MULTIREG_IDX_BY_ADDR(addr, base, regsize) \ +(((addr) - (base)) / (regsize)) + +#endif diff --git a/hw/vmxnet_debug.h b/hw/vmxnet_debug.h new file mode 100644 index 000..96dae0f --- /dev/null +++ b/hw/vmxnet_debug.h @@ -0,0 +1,115 @@ +/* + * QEMU VMWARE VMXNET* paravirtual NICs - debugging facilities + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Tamir Shomer + * Yan Vugenfirer + * + * Th
Re: [Qemu-devel] [PATCH V12 5/5] VMXNET3 device implementation
Oops, forgot to address this part...

Is it enough to make the following change:

- * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * This work is licensed under the terms of the GNU GPL, version 2.

Do you want me to resend the series?

Dmitry.

On Mon, Mar 4, 2013 at 4:28 PM, Stefan Hajnoczi wrote:
> On Sat, Mar 02, 2013 at 02:44:39PM +0200, Dmitry Fleytman wrote:
> > diff --git a/hw/vmxnet3.h b/hw/vmxnet3.h
> > new file mode 100644
> > index 000..22787b1
> > --- /dev/null
> > +++ b/hw/vmxnet3.h
> > @@ -0,0 +1,760 @@
> > +/*
> > + * QEMU VMWARE VMXNET3 paravirtual NIC
> > + *
> > + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com)
> > + *
> > + * Developed by Daynix Computing LTD (http://www.daynix.com)
> > + *
> > + * Authors:
> > + * Dmitry Fleytman
> > + * Tamir Shomer
> > + * Yan Vugenfirer
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> [...]
> > +/*
> > + * Linux driver for VMware's vmxnet3 ethernet NIC.
> > + *
> > + * Copyright (C) 2008-2009, VMware, Inc. All Rights Reserved.
> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms of the GNU General Public License as published by the
> > + * Free Software Foundation; version 2 of the License and no later version.
>
> I think Andreas pointed out that you chose GPLv2+ but VMware chose
> GPLv2-only. The file should be GPLv2-only.
>
> Stefan
>
[Qemu-devel] [PATCH v13 1/5] Checksum-related utility functions
net_checksum_add_cont() checksum calculation for scattered data with odd chunk sizes net_raw_checksum() checksum calculation for a buffer Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- include/net/checksum.h | 14 +- net/checksum.c | 13 +++-- 2 files changed, 20 insertions(+), 7 deletions(-) diff --git a/include/net/checksum.h b/include/net/checksum.h index 1f05298..3e7b93d 100644 --- a/include/net/checksum.h +++ b/include/net/checksum.h @@ -20,10 +20,22 @@ #include -uint32_t net_checksum_add(int len, uint8_t *buf); +uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq); uint16_t net_checksum_finish(uint32_t sum); uint16_t net_checksum_tcpudp(uint16_t length, uint16_t proto, uint8_t *addrs, uint8_t *buf); void net_checksum_calculate(uint8_t *data, int length); +static inline uint32_t +net_checksum_add(int len, uint8_t *buf) +{ +return net_checksum_add_cont(len, buf, 0); +} + +static inline uint16_t +net_raw_checksum(uint8_t *data, int length) +{ +return net_checksum_finish(net_checksum_add(length, data)); +} + #endif /* QEMU_NET_CHECKSUM_H */ diff --git a/net/checksum.c b/net/checksum.c index 9919b2e..4fa5563 100644 --- a/net/checksum.c +++ b/net/checksum.c @@ -20,16 +20,17 @@ #define PROTO_TCP 6 #define PROTO_UDP 17 -uint32_t net_checksum_add(int len, uint8_t *buf) +uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq) { uint32_t sum = 0; int i; -for (i = 0; i < len; i++) { - if (i & 1) - sum += (uint32_t)buf[i]; - else - sum += (uint32_t)buf[i] << 8; +for (i = seq; i < seq + len; i++) { +if (i & 1) { +sum += (uint32_t)buf[i - seq]; +} else { +sum += (uint32_t)buf[i - seq] << 8; +} } return sum; } -- 1.8.1.2
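Editorial note on the patch above: the seq argument lets a caller checksum data that arrives in chunks of odd length, because the parity of the running stream offset decides whether a byte is the high or low half of a 16-bit word. Below is a minimal sketch of that property. The demo_* names are illustrative only and not part of the patch; the add loop mirrors the one in the diff, and the finish step is assumed to be the usual carry fold plus complement.

```c
#include <assert.h>
#include <stdint.h>

/* Same loop as in the patch: seq is the stream offset of buf[0]. */
static uint32_t demo_checksum_add_cont(int len, const uint8_t *buf, int seq)
{
    uint32_t sum = 0;
    int i;

    for (i = seq; i < seq + len; i++) {
        if (i & 1) {
            sum += (uint32_t)buf[i - seq];        /* odd offset: low byte */
        } else {
            sum += (uint32_t)buf[i - seq] << 8;   /* even offset: high byte */
        }
    }
    return sum;
}

/* Assumed finish step: fold carries into 16 bits, then complement. */
static uint16_t demo_checksum_finish(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    return (uint16_t)~sum;
}

/* Returns 1 if a 3+4 byte split gives the same sum as one 7 byte pass. */
static int demo_checksum_split_check(void)
{
    static const uint8_t data[7] = { 0x45, 0x00, 0x00, 0x1c, 0xab, 0xcd, 0x11 };
    uint32_t whole = demo_checksum_add_cont(7, data, 0);
    /* Second chunk starts at stream offset 3, so seq = 3. */
    uint32_t split = demo_checksum_add_cont(3, data, 0) +
                     demo_checksum_add_cont(4, data + 3, 3);
    /* Restarting seq at 0 for the second chunk swaps byte roles. */
    uint32_t naive = demo_checksum_add_cont(3, data, 0) +
                     demo_checksum_add_cont(4, data + 3, 0);

    assert(split == whole);
    assert(naive != whole);
    assert(demo_checksum_finish(split) == demo_checksum_finish(whole));
    return 1;
}
```

Calling demo_checksum_split_check() from any test harness exercises both the correct and the naive split.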
[Qemu-devel] [PATCH V13 0/5] VMXNET3 paravirtual NIC device implementation
->size, (r)->cell_size, (r)->gen, (r)->next) make macros upper case [DF] Fixed ... > +static inline void vmxnet3_flush_shmem_changes(void) > +{ > +/* > + * Flush shared memory changes > + * Needed before transferring control to guest what does 'transferring control to guest' mean? [DF] Changed to [DF]/* [DF] * Flush shared memory changes [DF] * Needed before sending interrupt to guest to ensure [DF] * it gets consistent memory state [DF] */ ... > + */ > +smp_wmb(); > +} Don't use wrappers like this. They just hide bugs. For example it's not helpful before an interrupt in the function below. [DF] I guess you are talking about vmxnet3_complete_packet() [DF] Strictly speaking barrier is a must because we change shared memory in [DF] vmxnet3_complete_packet() [DF] And the wrapper is a good thing because its name explains its effect [DF] in a formal way as opposed to comments ... > +switch (status) { > +case VMXNET3_PKT_STATUS_OK: { don't put {} around cases: they align incorrectly if it's too big move to a function. [DF] Fixed ... > +static bool > +vmxnet3_send_packet(VMXNET3State *s, uint32_t qidx) > +{ > +size_t bytes_sent = 0; > +bool res = true; why = true? don't initialize just because. [DF] Fixed ... > +/* > + * VMWARE headers we got from Linux kernel do not fully comply QEMU coding > + * standards in sense of types and defines used. > + * Since we didn't want to change VMWARE code, following set of typedefs > + * and defines needed to compile these headers with QEMU introduced. > + */ No need for this now. You can export headers and put them under linux-headers. [DF] Not sure it is possible because the header as-is is not stand-alone and won't compile [DF] without changes. We extracted definitions we use from their header and dropped unused [DF] and kernel-specific stuff. [DF} Please, advise. ... 
> +if (VMXNET3_OM_TSO == s->offload_mode) { Don't do Yoda style like this [DF] "Yoda" style removed everywhere Changes in V6: Fixed most of problems pointed out by Michael S. Tsirkin The only issue still open is creation of shared place with generic network structures and functions. Currently all generic network code introduced by VMXNET3 resides in vmxnet_utils.c/h files. It could be moved to some shared location however we believe it is a matter of separate refactoring as there are a lot of copy-pasted definitions in almost every device and code cleanup efforts are required in order to create truly shared codebase. Reported-by: Michael S. Tsirkin Implemented suggestions by Anthony Liguori Reported-by: Anthony Liguori Fixed incorrect checksum calculation for some packets in SW offloads mode Reported-by: Gerhard Wiesinger Changes in V5: MSI-X save/load implemented in the device instead of pci bus as suggested by Michael S. Tsirkin Reported-by: Michael S. Tsirkin Patches regrouped as suggested by Paolo Bonzini Reported-by: Paolo Bonzini Changes in V4: Fixed a few problems uncovered by NETIO test suite Assertion on failure to initialize MSI/MSI-X replaced with warning message and fallback to Legacy/MSI respectively Reported-by: Gerhard Wiesinger Various coding style adjustments and patch split-up as suggested by Anthony Liguori Reported-by: Anthony Liguori Live migration support added Changes in V3: Fixed crash when net device that is used as network frontend has no virtio HDR support. Task offloads emulation for cases when net device that is used as network frontend has no virtio HDR support. 
Reported-by: Gerhard Wiesinger Changes in V2: License text changed according to community suggestions Standard license header from GPLv2+ - licensed QEMU files used Dmitry Fleytman (5): Checksum-related utility functions net: iovec checksum calculator Common definitions for VMWARE devices Packet abstraction for VMWARE network devices VMXNET3 device implementation default-configs/pci.mak |1 + hw/Makefile.objs|2 + hw/pci/pci.h|1 + hw/vmware_utils.h | 143 +++ hw/vmxnet3.c| 2460 +++ hw/vmxnet3.h| 760 +++ hw/vmxnet_debug.h | 115 +++ hw/vmxnet_rx_pkt.c | 187 hw/vmxnet_rx_pkt.h | 174 hw/vmxnet_tx_pkt.c | 567 +++ hw/vmxnet_tx_pkt.h | 148 +++ include/net/checksum.h | 26 +- include/net/eth.h | 347 +++ net/Makefile.objs |1 + net/checksum.c | 42 +- net/eth.c | 217 + 16 files changed, 5184 insertions(+), 7 deletions(-) create mode 100644 hw/vmware_utils.h create mode 100644 hw/vmxnet3.c create mode 100644 hw/vmxnet3.h create mode 100644 hw/vmxnet_debug.h create mode 100644 hw/vmxnet_rx_pkt.c create mode 100644 hw/vmxnet_rx_pkt.h create mode 100644 hw/vmxnet_tx_pkt.c create mode 100644 hw/vmxnet_tx_pkt.h create mode 100644 include/net/eth.h create mode 100644 net/eth.c -- 1.7.11.7
[Qemu-devel] [PATCH v13 3/5] Common definitions for VMWARE devices
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/vmware_utils.h | 143 ++ hw/vmxnet_debug.h | 115 ++ include/net/eth.h | 347 ++ net/Makefile.objs | 1 + net/eth.c | 217 ++ 5 files changed, 823 insertions(+) create mode 100644 hw/vmware_utils.h create mode 100644 hw/vmxnet_debug.h create mode 100644 include/net/eth.h create mode 100644 net/eth.c diff --git a/hw/vmware_utils.h b/hw/vmware_utils.h new file mode 100644 index 000..5307e2c --- /dev/null +++ b/hw/vmware_utils.h @@ -0,0 +1,143 @@ +/* + * QEMU VMWARE paravirtual devices - auxiliary code + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef VMWARE_UTILS_H +#define VMWARE_UTILS_H + +#include "qemu/range.h" + +#ifndef VMW_SHPRN +#define VMW_SHPRN(fmt, ...) 
do {} while (0) +#endif + +/* + * Shared memory access functions with byte swap support + * Each function contains printout for reverse-engineering needs + * + */ +static inline void +vmw_shmem_read(hwaddr addr, void *buf, int len) +{ +VMW_SHPRN("SHMEM r: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_read(addr, buf, len); +} + +static inline void +vmw_shmem_write(hwaddr addr, void *buf, int len) +{ +VMW_SHPRN("SHMEM w: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_write(addr, buf, len); +} + +static inline void +vmw_shmem_rw(hwaddr addr, void *buf, int len, int is_write) +{ +VMW_SHPRN("SHMEM r/w: %" PRIx64 ", len: %d (to %p), is write: %d", + addr, len, buf, is_write); + +cpu_physical_memory_rw(addr, buf, len, is_write); +} + +static inline void +vmw_shmem_set(hwaddr addr, uint8 val, int len) +{ +int i; +VMW_SHPRN("SHMEM set: %" PRIx64 ", len: %d (value 0x%X)", addr, len, val); + +for (i = 0; i < len; i++) { +cpu_physical_memory_write(addr + i, &val, 1); +} +} + +static inline uint32_t +vmw_shmem_ld8(hwaddr addr) +{ +uint8_t res = ldub_phys(addr); +VMW_SHPRN("SHMEM load8: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st8(hwaddr addr, uint8_t value) +{ +VMW_SHPRN("SHMEM store8: %" PRIx64 " (value 0x%X)", addr, value); +stb_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld16(hwaddr addr) +{ +uint16_t res = lduw_le_phys(addr); +VMW_SHPRN("SHMEM load16: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st16(hwaddr addr, uint16_t value) +{ +VMW_SHPRN("SHMEM store16: %" PRIx64 " (value 0x%X)", addr, value); +stw_le_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld32(hwaddr addr) +{ +uint32_t res = ldl_le_phys(addr); +VMW_SHPRN("SHMEM load32: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st32(hwaddr addr, uint32_t value) +{ +VMW_SHPRN("SHMEM store32: %" PRIx64 " (value 0x%X)", addr, 
value); +stl_le_phys(addr, value); +} + +static inline uint64_t +vmw_shmem_ld64(hwaddr addr) +{ +uint64_t res = ldq_le_phys(addr); +VMW_SHPRN("SHMEM load64: %" PRIx64 " (value %" PRIx64 ")", addr, res); +return res; +} + +static inline void +vmw_shmem_st64(hwaddr addr, uint64_t value) +{ +VMW_SHPRN("SHMEM store64: %" PRIx64 " (value %" PRIx64 ")", addr, value); +stq_le_phys(addr, value); +} + +/* Macros for simplification of operations on array-style registers */ + +/* + * Whether lies inside of array-style register defined by , + * number of elements () and element size () + * +*/ +#define VMW_IS_MULTIREG_ADDR(addr, base, cnt, regsize) \ +range_covers_byte(base, cnt * regsize, addr) + +/* + * Returns index of given register () in array-style register defined by + * and element size () + * +*/ +#define VMW_MULTIREG_IDX_BY_ADDR(addr, base, regsize) \ +(((addr) - (base)) / (regsize)) + +#endif diff --git a/hw/vmxnet_debug.h b/hw/vmxnet_debug.h new file mode 100644 index 000..96dae0f --- /dev/null +++ b/hw/vmxnet_debug.h @@ -0,0 +1,115 @@ +/* + * QEMU VMWARE VMXNET* paravirtual NICs - debugging facilities + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Tamir Shomer + * Yan Vugenfirer + * + * Th
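Editorial note on the array-style register macros in the header above: the sketch below shows how the two macros would typically be combined to decode a register write. It assumes range_covers_byte(offset, len, byte) is true when byte lies in [offset, offset + len); the stand-in here only imitates that assumed behaviour and is not QEMU's implementation. The DEMO_* register bank is hypothetical. Unlike the patch, the macro arguments are parenthesized so that expressions such as base + 8 expand safely.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the helper from "qemu/range.h" (assumed semantics). */
static inline int range_covers_byte(uint64_t offset, uint64_t len,
                                    uint64_t byte)
{
    return offset <= byte && byte < offset + len;
}

/* Macros as in the patch, with arguments parenthesized. */
#define VMW_IS_MULTIREG_ADDR(addr, base, cnt, regsize) \
    range_covers_byte((base), (cnt) * (regsize), (addr))

#define VMW_MULTIREG_IDX_BY_ADDR(addr, base, regsize) \
    (((addr) - (base)) / (regsize))

/* Hypothetical bank: 8 registers, 8 bytes each, starting at 0x600. */
#define DEMO_REG_BASE  0x600u
#define DEMO_REG_COUNT 8u
#define DEMO_REG_SIZE  8u

/* Returns the register index for addr, or -1 if addr is outside the bank. */
static int demo_decode_reg(uint64_t addr)
{
    if (!VMW_IS_MULTIREG_ADDR(addr, DEMO_REG_BASE,
                              DEMO_REG_COUNT, DEMO_REG_SIZE)) {
        return -1;
    }
    return (int)VMW_MULTIREG_IDX_BY_ADDR(addr, DEMO_REG_BASE, DEMO_REG_SIZE);
}
```

With this layout, an access at 0x618 decodes to index 3, and 0x640 (one byte past the last register) is rejected.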
[Qemu-devel] [PATCH v13 4/5] Packet abstraction for VMWARE network devices
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/Makefile.objs | 1 + hw/vmxnet_rx_pkt.c | 187 ++ hw/vmxnet_rx_pkt.h | 174 hw/vmxnet_tx_pkt.c | 567 + hw/vmxnet_tx_pkt.h | 148 ++ 5 files changed, 1077 insertions(+) create mode 100644 hw/vmxnet_rx_pkt.c create mode 100644 hw/vmxnet_rx_pkt.h create mode 100644 hw/vmxnet_tx_pkt.c create mode 100644 hw/vmxnet_tx_pkt.h diff --git a/hw/Makefile.objs b/hw/Makefile.objs index 40ebe46..14922cb 100644 --- a/hw/Makefile.objs +++ b/hw/Makefile.objs @@ -119,6 +119,7 @@ common-obj-$(CONFIG_PCNET_PCI) += pcnet-pci.o common-obj-$(CONFIG_PCNET_COMMON) += pcnet.o common-obj-$(CONFIG_E1000_PCI) += e1000.o common-obj-$(CONFIG_RTL8139_PCI) += rtl8139.o +common-obj-$(CONFIG_VMXNET3_PCI) += vmxnet_tx_pkt.o vmxnet_rx_pkt.o common-obj-$(CONFIG_SMC91C111) += smc91c111.o common-obj-$(CONFIG_LAN9118) += lan9118.o diff --git a/hw/vmxnet_rx_pkt.c b/hw/vmxnet_rx_pkt.c new file mode 100644 index 000..a40e346 --- /dev/null +++ b/hw/vmxnet_rx_pkt.c @@ -0,0 +1,187 @@ +/* + * QEMU VMWARE VMXNET* paravirtual NICs - RX packets abstractions + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Tamir Shomer + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#include "vmxnet_rx_pkt.h" +#include "net/eth.h" +#include "qemu-common.h" +#include "qemu/iov.h" +#include "net/checksum.h" +#include "net/tap.h" + +/* + * RX packet may contain up to 2 fragments - rebuilt eth header + * in case of VLAN tag stripping + * and payload received from QEMU - in any case + */ +#define VMXNET_MAX_RX_PACKET_FRAGMENTS (2) + +struct VmxnetRxPkt { +struct virtio_net_hdr virt_hdr; +uint8_t ehdr_buf[ETH_MAX_L2_HDR_LEN]; +struct iovec vec[VMXNET_MAX_RX_PACKET_FRAGMENTS]; +uint16_t vec_len; +uint32_t tot_len; +uint16_t tci; +bool vlan_stripped; +bool has_virt_hdr; +eth_pkt_types_e packet_type; + +/* Analysis results */ +bool isip4; +bool isip6; +bool isudp; +bool istcp; +}; + +void vmxnet_rx_pkt_init(struct VmxnetRxPkt **pkt, bool has_virt_hdr) +{ +struct VmxnetRxPkt *p = g_malloc0(sizeof *p); +p->has_virt_hdr = has_virt_hdr; +*pkt = p; +} + +void vmxnet_rx_pkt_uninit(struct VmxnetRxPkt *pkt) +{ +g_free(pkt); +} + +struct virtio_net_hdr *vmxnet_rx_pkt_get_vhdr(struct VmxnetRxPkt *pkt) +{ +assert(pkt); +return &pkt->virt_hdr; +} + +void vmxnet_rx_pkt_attach_data(struct VmxnetRxPkt *pkt, const void *data, + size_t len, bool strip_vlan) +{ +uint16_t tci = 0; +uint16_t ploff; +assert(pkt); +pkt->vlan_stripped = false; + +if (strip_vlan) { +pkt->vlan_stripped = eth_strip_vlan(data, pkt->ehdr_buf, &ploff, &tci); +} + +if (pkt->vlan_stripped) { +pkt->vec[0].iov_base = pkt->ehdr_buf; +pkt->vec[0].iov_len = ploff - sizeof(struct vlan_header); +pkt->vec[1].iov_base = (uint8_t *) data + ploff; +pkt->vec[1].iov_len = len - ploff; +pkt->vec_len = 2; +pkt->tot_len = len - ploff + sizeof(struct eth_header); +} else { +pkt->vec[0].iov_base = (void *)data; +pkt->vec[0].iov_len = len; +pkt->vec_len = 1; +pkt->tot_len = len; +} + +pkt->tci = tci; + +eth_get_protocols(data, len, &pkt->isip4, &pkt->isip6, +&pkt->isudp, &pkt->istcp); +} + +void vmxnet_rx_pkt_dump(struct VmxnetRxPkt *pkt) +{ +#ifdef VMXNET_RX_PKT_DEBUG +VmxnetRxPkt *pkt = (VmxnetRxPkt 
*)pkt; +assert(pkt); + +printf("RX PKT: tot_len: %d, vlan_stripped: %d, vlan_tag: %d\n", + pkt->tot_len, pkt->vlan_stripped, pkt->tci); +#endif +} + +void vmxnet_rx_pkt_set_packet_type(struct VmxnetRxPkt *pkt, +eth_pkt_types_e packet_type) +{ +assert(pkt); + +pkt->packet_type = packet_type; + +} + +eth_pkt_types_e vmxnet_rx_pkt_get_packet_type(struct VmxnetRxPkt *pkt) +{ +assert(pkt); + +return pkt->packet_type; +} + +size_t vmxnet_rx_pkt_get_total_len(struct VmxnetRxPkt *pkt) +{ +assert(pkt); + +return pkt->tot_len; +} + +void vmxnet_rx_pkt_get_protocols(struct VmxnetRxPkt *pkt, + bool *isip4, bool *isip6, + bool *isudp, bool *istcp) +{ +assert(pkt); + +*isip4 = pkt->isip4; +*isip6 = pkt->isip6; +*isudp = pkt->isudp; +*istcp = pkt->istcp; +} + +struct iovec *vmxnet_rx_pkt_get_iovec(struct VmxnetRxPkt *pkt) +{ +assert(pkt); + +return
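Editorial note on vmxnet_rx_pkt_attach_data() above: when the VLAN tag is stripped, the packet is described by two fragments (a rebuilt Ethernet header plus the untouched payload), otherwise by one. The sketch below reproduces that arithmetic under stated assumptions: a 14-byte Ethernet header, a 4-byte VLAN header, and TPID 0x8100. demo_strip_vlan() is a hypothetical stand-in for eth_strip_vlan(), whose body is not shown in this thread, so its behaviour here is a guess at the intent rather than the real function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

#define DEMO_ETH_HLEN   14      /* dst MAC + src MAC + EtherType (assumed) */
#define DEMO_VLAN_HLEN  4       /* TCI + inner EtherType (assumed) */
#define DEMO_ETH_P_VLAN 0x8100  /* 802.1Q TPID */

/* Hypothetical stand-in for eth_strip_vlan(): rebuilds an untagged header. */
static bool demo_strip_vlan(const uint8_t *frame, size_t len, uint8_t *new_hdr,
                            uint16_t *payload_off, uint16_t *tci)
{
    if (len < DEMO_ETH_HLEN + DEMO_VLAN_HLEN) {
        return false;
    }
    if (((frame[12] << 8) | frame[13]) != DEMO_ETH_P_VLAN) {
        return false;
    }
    memcpy(new_hdr, frame, 12);                  /* both MAC addresses */
    new_hdr[12] = frame[16];                     /* inner EtherType */
    new_hdr[13] = frame[17];
    *tci = (uint16_t)((frame[14] << 8) | frame[15]);
    *payload_off = DEMO_ETH_HLEN + DEMO_VLAN_HLEN;
    return true;
}

/* Fills vec[] the way the patch does and returns the total packet length. */
static size_t demo_attach(uint8_t *frame, size_t len, uint8_t *hdr_buf,
                          struct iovec vec[2], int *vec_len, uint16_t *tci)
{
    uint16_t ploff = 0;

    *tci = 0;
    if (demo_strip_vlan(frame, len, hdr_buf, &ploff, tci)) {
        vec[0].iov_base = hdr_buf;
        vec[0].iov_len = ploff - DEMO_VLAN_HLEN;   /* rebuilt header: 14 */
        vec[1].iov_base = frame + ploff;
        vec[1].iov_len = len - ploff;              /* payload after the tag */
        *vec_len = 2;
        return len - ploff + DEMO_ETH_HLEN;
    }
    vec[0].iov_base = frame;
    vec[0].iov_len = len;
    *vec_len = 1;
    return len;
}

/* Returns 1 if a tagged 22-byte frame is split as expected. */
static int demo_vlan_check(void)
{
    uint8_t frame[22] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
                          0x81, 0x00, 0x00, 0x64, 0x08, 0x00,
                          0xde, 0xad, 0xbe, 0xef };
    uint8_t hdr[DEMO_ETH_HLEN];
    struct iovec vec[2];
    int vec_len;
    uint16_t tci;
    size_t total = demo_attach(frame, sizeof(frame), hdr, vec, &vec_len, &tci);

    assert(vec_len == 2);
    assert(vec[0].iov_len == 14 && vec[1].iov_len == 4);
    assert(total == 18);                /* original length minus the tag */
    assert(tci == 100);
    assert(hdr[12] == 0x08 && hdr[13] == 0x00);
    return 1;
}
```

The total length equals the sum of the two fragment lengths, which is the invariant the tot_len computation in the patch relies on.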
[Qemu-devel] [PATCH v13 2/5] net: iovec checksum calculator
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- include/net/checksum.h | 12 net/checksum.c | 29 + 2 files changed, 41 insertions(+) diff --git a/include/net/checksum.h b/include/net/checksum.h index 3e7b93d..80203fb 100644 --- a/include/net/checksum.h +++ b/include/net/checksum.h @@ -38,4 +38,16 @@ net_raw_checksum(uint8_t *data, int length) return net_checksum_finish(net_checksum_add(length, data)); } +/** + * net_checksum_add_iov: scatter-gather vector checksumming + * + * @iov: input scatter-gather array + * @iov_cnt: number of array elements + * @iov_off: starting iov offset for checksumming + * @size: length of data to be checksummed + */ +uint32_t net_checksum_add_iov(const struct iovec *iov, + const unsigned int iov_cnt, + uint32_t iov_off, uint32_t size); + #endif /* QEMU_NET_CHECKSUM_H */ diff --git a/net/checksum.c b/net/checksum.c index 4fa5563..14c0855 100644 --- a/net/checksum.c +++ b/net/checksum.c @@ -15,6 +15,7 @@ * along with this program; if not, see <http://www.gnu.org/licenses/>. */ +#include "qemu-common.h" #include "net/checksum.h" #define PROTO_TCP 6 @@ -84,3 +85,31 @@ void net_checksum_calculate(uint8_t *data, int length) data[14+hlen+csum_offset] = csum >> 8; data[14+hlen+csum_offset+1] = csum & 0xff; } + +uint32_t +net_checksum_add_iov(const struct iovec *iov, const unsigned int iov_cnt, + uint32_t iov_off, uint32_t size) +{ +size_t iovec_off, buf_off; +unsigned int i; +uint32_t res = 0; +uint32_t seq = 0; + +iovec_off = 0; +buf_off = 0; +for (i = 0; i < iov_cnt && size; i++) { +if (iov_off < (iovec_off + iov[i].iov_len)) { +size_t len = MIN((iovec_off + iov[i].iov_len) - iov_off , size); +void *chunk_buf = iov[i].iov_base + (iov_off - iovec_off); + +res += net_checksum_add_cont(len, chunk_buf, seq); +seq += len; + +buf_off += len; +iov_off += len; +size -= len; +} +iovec_off += iov[i].iov_len; +} +return res; +} -- 1.8.1.2
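Editorial note on net_checksum_add_iov() above: the function walks the scatter-gather array, skips iov_off bytes, and feeds each overlapping piece to the continuation-aware adder with the running sequence number. The sketch below restates that walk in standalone form. The demo_* names are illustrative; pointer arithmetic is done on uint8_t pointers rather than on void pointers as in the patch, and the buf_off variable, which the patch updates but never reads, is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Continuation-aware byte sum, as in patch 1/5 of this series. */
static uint32_t demo_sum_cont(size_t len, const uint8_t *buf, uint32_t seq)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        sum += ((seq + i) & 1) ? (uint32_t)buf[i] : (uint32_t)buf[i] << 8;
    }
    return sum;
}

/* Sum `size` bytes starting `iov_off` bytes into the vector. */
static uint32_t demo_sum_iov(const struct iovec *iov, unsigned int iov_cnt,
                             uint32_t iov_off, uint32_t size)
{
    size_t iovec_off = 0;       /* stream offset where iov[i] begins */
    uint32_t res = 0;
    uint32_t seq = 0;           /* bytes summed so far */
    unsigned int i;

    for (i = 0; i < iov_cnt && size; i++) {
        if (iov_off < iovec_off + iov[i].iov_len) {
            size_t len = iovec_off + iov[i].iov_len - iov_off;
            const uint8_t *chunk;

            if (len > size) {
                len = size;
            }
            chunk = (const uint8_t *)iov[i].iov_base + (iov_off - iovec_off);
            res += demo_sum_cont(len, chunk, seq);
            seq += (uint32_t)len;
            iov_off += (uint32_t)len;
            size -= (uint32_t)len;
        }
        iovec_off += iov[i].iov_len;
    }
    return res;
}

/* Returns 1 if a 3+4+3 vector matches the flat-buffer result. */
static int demo_iov_check(void)
{
    uint8_t data[10] = { 0x10, 0x21, 0x32, 0x43, 0x54,
                         0x65, 0x76, 0x87, 0x98, 0xa9 };
    struct iovec iov[3] = {
        { data, 3 }, { data + 3, 4 }, { data + 7, 3 },
    };

    /* 7 bytes starting at offset 2 cross both fragment boundaries. */
    assert(demo_sum_iov(iov, 3, 2, 7) == demo_sum_cont(7, data + 2, 0));
    /* The whole vector matches a single pass over the flat buffer. */
    assert(demo_sum_iov(iov, 3, 0, 10) == demo_sum_cont(10, data, 0));
    return 1;
}
```

The result is still an unfolded 32-bit sum; a caller would pass it through a finish step such as net_checksum_finish() to obtain the 16-bit checksum.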
Re: [Qemu-devel] [PATCH V12 5/5] VMXNET3 device implementation
Stefan, I've sent V13 of patches with license changed. Dmitry. On Mon, Mar 4, 2013 at 4:52 PM, Dmitry Fleytman wrote: > Oops, forgot to address this part... > > Is it enough to make following change: > > - * This work is licensed under the terms of the GNU GPL, version 2 or > later. > + * This work is licensed under the terms of the GNU GPL, version 2. > > > Do you want me to resend the series? > Dmitry. > > On Mon, Mar 4, 2013 at 4:28 PM, Stefan Hajnoczi wrote: > >> On Sat, Mar 02, 2013 at 02:44:39PM +0200, Dmitry Fleytman wrote: >> > diff --git a/hw/vmxnet3.h b/hw/vmxnet3.h >> > new file mode 100644 >> > index 000..22787b1 >> > --- /dev/null >> > +++ b/hw/vmxnet3.h >> > @@ -0,0 +1,760 @@ >> > +/* >> > + * QEMU VMWARE VMXNET3 paravirtual NIC >> > + * >> > + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) >> > + * >> > + * Developed by Daynix Computing LTD (http://www.daynix.com) >> > + * >> > + * Authors: >> > + * Dmitry Fleytman >> > + * Tamir Shomer >> > + * Yan Vugenfirer >> > + * >> > + * This work is licensed under the terms of the GNU GPL, version 2 or >> later. >> > + * See the COPYING file in the top-level directory. >> > + * >> > + */ >> [...] >> > +/* >> > + * Linux driver for VMware's vmxnet3 ethernet NIC. >> > + * >> > + * Copyright (C) 2008-2009, VMware, Inc. All Rights Reserved. >> > + * >> > + * This program is free software; you can redistribute it and/or >> modify it >> > + * under the terms of the GNU General Public License as published by >> the >> > + * Free Software Foundation; version 2 of the License and no later >> version. >> >> I think Andreas pointed out that you chose GPLv2+ but VMware chose >> GPLv2-only. The file should be GPLv2-only. >> >> Stefan >> > >
Re: [Qemu-devel] [PATCH V12 5/5] VMXNET3 device implementation
No problem. On Wed, Mar 6, 2013 at 11:26 AM, Stefan Hajnoczi wrote: > On Wed, Mar 06, 2013 at 09:24:35AM +0200, Dmitry Fleytman wrote: > > I've sent V13 of patches with license changed. > > Thanks for resending. I'm happy to do trivial fixups while merging but > I don't do that in the case of license changes. The reason is that > there must be a clear record of you choosing the license. > > Stefan >
Re: [Qemu-devel] [PATCH v13 5/5] VMXNET3 device implementation
Thanks, Andreas Now I see. We'll resubmit our patches soon. Dmitry. On Thu, Mar 7, 2013 at 2:59 PM, Andreas Färber wrote: > Am 06.03.2013 08:21, schrieb Dmitry Fleytman: > > Signed-off-by: Dmitry Fleytman > > Signed-off-by: Yan Vugenfirer > > --- > > default-configs/pci.mak |1 + > > hw/Makefile.objs|1 + > > hw/pci/pci.h|1 + > > hw/vmxnet3.c| 2460 > +++ > > hw/vmxnet3.h| 760 +++ > > 5 files changed, 3223 insertions(+) > > create mode 100644 hw/vmxnet3.c > > create mode 100644 hw/vmxnet3.h > > > > diff --git a/default-configs/pci.mak b/default-configs/pci.mak > > index ee2d18d..ce56d58 100644 > > --- a/default-configs/pci.mak > > +++ b/default-configs/pci.mak > > @@ -13,6 +13,7 @@ CONFIG_LSI_SCSI_PCI=y > > CONFIG_MEGASAS_SCSI_PCI=y > > CONFIG_RTL8139_PCI=y > > CONFIG_E1000_PCI=y > > +CONFIG_VMXNET3_PCI=y > > CONFIG_IDE_CORE=y > > CONFIG_IDE_QDEV=y > > CONFIG_IDE_PCI=y > > diff --git a/hw/Makefile.objs b/hw/Makefile.objs > > index 14922cb..026aff6 100644 > > --- a/hw/Makefile.objs > > +++ b/hw/Makefile.objs > > @@ -120,6 +120,7 @@ common-obj-$(CONFIG_PCNET_COMMON) += pcnet.o > > common-obj-$(CONFIG_E1000_PCI) += e1000.o > > common-obj-$(CONFIG_RTL8139_PCI) += rtl8139.o > > common-obj-$(CONFIG_VMXNET3_PCI) += vmxnet_tx_pkt.o vmxnet_rx_pkt.o > > +common-obj-$(CONFIG_VMXNET3_PCI) += vmxnet3.o > > > > common-obj-$(CONFIG_SMC91C111) += smc91c111.o > > common-obj-$(CONFIG_LAN9118) += lan9118.o > > diff --git a/hw/pci/pci.h b/hw/pci/pci.h > > index f340fe5..3beb70b 100644 > > --- a/hw/pci/pci.h > > +++ b/hw/pci/pci.h > > @@ -60,6 +60,7 @@ > > #define PCI_DEVICE_ID_VMWARE_NET 0x0720 > > #define PCI_DEVICE_ID_VMWARE_SCSI0x0730 > > #define PCI_DEVICE_ID_VMWARE_IDE 0x1729 > > +#define PCI_DEVICE_ID_VMWARE_VMXNET3 0x07B0 > > > > /* Intel (0x8086) */ > > #define PCI_DEVICE_ID_INTEL_82551IT 0x1209 > > diff --git a/hw/vmxnet3.c b/hw/vmxnet3.c > > new file mode 100644 > > index 000..75b7181 > > --- /dev/null > > +++ b/hw/vmxnet3.c > > @@ -0,0 +1,2460 @@ > > +/* > > + * QEMU VMWARE 
VMXNET3 paravirtual NIC > > + * > > + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) > > + * > > + * Developed by Daynix Computing LTD (http://www.daynix.com) > > + * > > + * Authors: > > + * Dmitry Fleytman > > + * Tamir Shomer > > + * Yan Vugenfirer > > + * > > + * This work is licensed under the terms of the GNU GPL, version 2 or > later. > > + * See the COPYING file in the top-level directory. > > + * > > + */ > > + > > +#include "hw.h" > > +#include "pci/pci.h" > > +#include "net/net.h" > > +#include "virtio-net.h" > > +#include "net/tap.h" > > +#include "net/checksum.h" > > +#include "sysemu/sysemu.h" > > +#include "qemu-common.h" > > +#include "qemu/bswap.h" > > +#include "pci/msix.h" > > +#include "pci/msi.h" > > + > > +#include "vmxnet3.h" > > +#include "vmxnet_debug.h" > > +#include "vmware_utils.h" > > +#include "vmxnet_tx_pkt.h" > > +#include "vmxnet_rx_pkt.h" > > + > > +#define PCI_DEVICE_ID_VMWARE_VMXNET3_REVISION 0x1 > > +#define VMXNET3_MSIX_BAR_SIZE 0x2000 > > + > > +#define VMXNET3_BAR0_IDX (0) > > +#define VMXNET3_BAR1_IDX (1) > > +#define VMXNET3_MSIX_BAR_IDX (2) > > + > > +#define VMXNET3_OFF_MSIX_TABLE (0x000) > > +#define VMXNET3_OFF_MSIX_PBA (0x800) > > + > > +/* Link speed in Mbps should be shifted by 16 */ > > +#define VMXNET3_LINK_SPEED (1000 << 16) > > + > > +/* Link status: 1 - up, 0 - down. */ > > +#define VMXNET3_LINK_STATUS_UP 0x1 > > + > > +/* Least significant bit should be set for revision and version */ > > +#define VMXNET3_DEVICE_VERSION0x1 > > +#define VMXNET3_DEVICE_REVISION 0x1 > > + > > +/* Macros for rings descriptors access */ > > +#define VMXNET3_READ_TX_QUEUE_DESCR8(dpa, field) \ > > +(vmw_shmem_ld8(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field))) > > + > > +#define VMXNET3_WRITE_TX_QUEUE_DESCR8(dpa, field, value) \ > > +(vmw_shmem_st8(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field, > value)
[Qemu-devel] [PATCH V14 0/5] VMXNET3 paravirtual NIC device implementation
> \ > + (r)->pa, (r)->size, (r)->cell_size, (r)->gen, (r)->next) make macros upper case [DF] Fixed ... > +static inline void vmxnet3_flush_shmem_changes(void) > +{ > +/* > + * Flush shared memory changes > + * Needed before transferring control to guest what does 'transferring control to guest' mean? [DF] Changed to [DF]/* [DF] * Flush shared memory changes [DF] * Needed before sending interrupt to guest to ensure [DF] * it gets consistent memory state [DF] */ ... > + */ > +smp_wmb(); > +} Don't use wrappers like this. They just hide bugs. For example it's not helpful before an interrupt in the function below. [DF] I guess you are talking about vmxnet3_complete_packet() [DF] Strictly speaking barrier is a must because we change shared memory in [DF] vmxnet3_complete_packet() [DF] And the wrapper is a good thing because its name explains its effect [DF] in a formal way as opposed to comments ... > +switch (status) { > +case VMXNET3_PKT_STATUS_OK: { don't put {} around cases: they align incorrectly if it's too big move to a function. [DF] Fixed ... > +static bool > +vmxnet3_send_packet(VMXNET3State *s, uint32_t qidx) > +{ > +size_t bytes_sent = 0; > +bool res = true; why = true? don't initialize just because. [DF] Fixed ... > +/* > + * VMWARE headers we got from Linux kernel do not fully comply QEMU coding > + * standards in sense of types and defines used. > + * Since we didn't want to change VMWARE code, following set of typedefs > + * and defines needed to compile these headers with QEMU introduced. > + */ No need for this now. You can export headers and put them under linux-headers. [DF] Not sure it is possible because the header as-is is not stand-alone and won't compile [DF] without changes. We extracted definitions we use from their header and dropped unused [DF] and kernel-specific stuff. [DF} Please, advise. ... 
> +if (VMXNET3_OM_TSO == s->offload_mode) { Don't do Yoda style like this [DF] "Yoda" style removed everywhere Changes in V6: Fixed most of problems pointed out by Michael S. Tsirkin The only issue still open is creation of shared place with generic network structures and functions. Currently all generic network code introduced by VMXNET3 resides in vmxnet_utils.c/h files. It could be moved to some shared location however we believe it is a matter of separate refactoring as there are a lot of copy-pasted definitions in almost every device and code cleanup efforts are required in order to create truly shared codebase. Reported-by: Michael S. Tsirkin Implemented suggestions by Anthony Liguori Reported-by: Anthony Liguori Fixed incorrect checksum calculation for some packets in SW offloads mode Reported-by: Gerhard Wiesinger Changes in V5: MSI-X save/load implemented in the device instead of pci bus as suggested by Michael S. Tsirkin Reported-by: Michael S. Tsirkin Patches regrouped as suggested by Paolo Bonzini Reported-by: Paolo Bonzini Changes in V4: Fixed a few problems uncovered by NETIO test suite Assertion on failure to initialize MSI/MSI-X replaced with warning message and fallback to Legacy/MSI respectively Reported-by: Gerhard Wiesinger Various coding style adjustments and patch split-up as suggested by Anthony Liguori Reported-by: Anthony Liguori Live migration support added Changes in V3: Fixed crash when net device that is used as network frontend has no virtio HDR support. Task offloads emulation for cases when net device that is used as network frontend has no virtio HDR support. 
Reported-by: Gerhard Wiesinger Changes in V2: License text changed according to community suggestions Standard license header from GPLv2+ - licensed QEMU files used Dmitry Fleytman (5): Checksum-related utility functions net: iovec checksum calculator Common definitions for VMWARE devices Packet abstraction for VMWARE network devices VMXNET3 device implementation default-configs/pci.mak |1 + hw/Makefile.objs|2 + hw/pci/pci.h|1 + hw/vmware_utils.h | 143 +++ hw/vmxnet3.c| 2461 +++ hw/vmxnet3.h| 760 +++ hw/vmxnet_debug.h | 115 +++ hw/vmxnet_rx_pkt.c | 187 hw/vmxnet_rx_pkt.h | 174 hw/vmxnet_tx_pkt.c | 567 +++ hw/vmxnet_tx_pkt.h | 148 +++ include/net/checksum.h | 26 +- include/net/eth.h | 347 +++ net/Makefile.objs |1 + net/checksum.c | 42 +- net/eth.c | 217 + 16 files changed, 5185 insertions(+), 7 deletions(-) create mode 100644 hw/vmware_utils.h create mode 100644 hw/vmxnet3.c create mode 100644 hw/vmxnet3.h create mode 100644 hw/vmxnet_debug.h create mode 100644 hw/vmxnet_rx_pkt.c create mode 100644 hw/vmxnet_rx_pkt.h create mode 100644 hw/vmxnet_tx_pkt.c create mode 100644 hw/vmxnet_tx_pkt.h create mode 100644 include/net/eth.h create mode 100644 net/eth.c -- 1.8.1.2
[Qemu-devel] [PATCH V14 1/5] Checksum-related utility functions
net_checksum_add_cont() checksum calculation for scattered data with odd chunk sizes net_raw_checksum() checksum calculation for a buffer Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- include/net/checksum.h | 14 +- net/checksum.c | 13 +++-- 2 files changed, 20 insertions(+), 7 deletions(-) diff --git a/include/net/checksum.h b/include/net/checksum.h index 1f05298..3e7b93d 100644 --- a/include/net/checksum.h +++ b/include/net/checksum.h @@ -20,10 +20,22 @@ #include -uint32_t net_checksum_add(int len, uint8_t *buf); +uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq); uint16_t net_checksum_finish(uint32_t sum); uint16_t net_checksum_tcpudp(uint16_t length, uint16_t proto, uint8_t *addrs, uint8_t *buf); void net_checksum_calculate(uint8_t *data, int length); +static inline uint32_t +net_checksum_add(int len, uint8_t *buf) +{ +return net_checksum_add_cont(len, buf, 0); +} + +static inline uint16_t +net_raw_checksum(uint8_t *data, int length) +{ +return net_checksum_finish(net_checksum_add(length, data)); +} + #endif /* QEMU_NET_CHECKSUM_H */ diff --git a/net/checksum.c b/net/checksum.c index 9919b2e..4fa5563 100644 --- a/net/checksum.c +++ b/net/checksum.c @@ -20,16 +20,17 @@ #define PROTO_TCP 6 #define PROTO_UDP 17 -uint32_t net_checksum_add(int len, uint8_t *buf) +uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq) { uint32_t sum = 0; int i; -for (i = 0; i < len; i++) { - if (i & 1) - sum += (uint32_t)buf[i]; - else - sum += (uint32_t)buf[i] << 8; +for (i = seq; i < seq + len; i++) { +if (i & 1) { +sum += (uint32_t)buf[i - seq]; +} else { +sum += (uint32_t)buf[i - seq] << 8; +} } return sum; } -- 1.8.1.2
[Qemu-devel] [PATCH V14 3/5] Common definitions for VMWARE devices
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/vmware_utils.h | 143 ++ hw/vmxnet_debug.h | 115 ++ include/net/eth.h | 347 ++ net/Makefile.objs | 1 + net/eth.c | 217 ++ 5 files changed, 823 insertions(+) create mode 100644 hw/vmware_utils.h create mode 100644 hw/vmxnet_debug.h create mode 100644 include/net/eth.h create mode 100644 net/eth.c diff --git a/hw/vmware_utils.h b/hw/vmware_utils.h new file mode 100644 index 000..5307e2c --- /dev/null +++ b/hw/vmware_utils.h @@ -0,0 +1,143 @@ +/* + * QEMU VMWARE paravirtual devices - auxiliary code + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef VMWARE_UTILS_H +#define VMWARE_UTILS_H + +#include "qemu/range.h" + +#ifndef VMW_SHPRN +#define VMW_SHPRN(fmt, ...) 
do {} while (0) +#endif + +/* + * Shared memory access functions with byte swap support + * Each function contains printout for reverse-engineering needs + * + */ +static inline void +vmw_shmem_read(hwaddr addr, void *buf, int len) +{ +VMW_SHPRN("SHMEM r: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_read(addr, buf, len); +} + +static inline void +vmw_shmem_write(hwaddr addr, void *buf, int len) +{ +VMW_SHPRN("SHMEM w: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_write(addr, buf, len); +} + +static inline void +vmw_shmem_rw(hwaddr addr, void *buf, int len, int is_write) +{ +VMW_SHPRN("SHMEM r/w: %" PRIx64 ", len: %d (to %p), is write: %d", + addr, len, buf, is_write); + +cpu_physical_memory_rw(addr, buf, len, is_write); +} + +static inline void +vmw_shmem_set(hwaddr addr, uint8 val, int len) +{ +int i; +VMW_SHPRN("SHMEM set: %" PRIx64 ", len: %d (value 0x%X)", addr, len, val); + +for (i = 0; i < len; i++) { +cpu_physical_memory_write(addr + i, &val, 1); +} +} + +static inline uint32_t +vmw_shmem_ld8(hwaddr addr) +{ +uint8_t res = ldub_phys(addr); +VMW_SHPRN("SHMEM load8: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st8(hwaddr addr, uint8_t value) +{ +VMW_SHPRN("SHMEM store8: %" PRIx64 " (value 0x%X)", addr, value); +stb_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld16(hwaddr addr) +{ +uint16_t res = lduw_le_phys(addr); +VMW_SHPRN("SHMEM load16: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st16(hwaddr addr, uint16_t value) +{ +VMW_SHPRN("SHMEM store16: %" PRIx64 " (value 0x%X)", addr, value); +stw_le_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld32(hwaddr addr) +{ +uint32_t res = ldl_le_phys(addr); +VMW_SHPRN("SHMEM load32: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st32(hwaddr addr, uint32_t value) +{ +VMW_SHPRN("SHMEM store32: %" PRIx64 " (value 0x%X)", addr, 
value); +stl_le_phys(addr, value); +} + +static inline uint64_t +vmw_shmem_ld64(hwaddr addr) +{ +uint64_t res = ldq_le_phys(addr); +VMW_SHPRN("SHMEM load64: %" PRIx64 " (value %" PRIx64 ")", addr, res); +return res; +} + +static inline void +vmw_shmem_st64(hwaddr addr, uint64_t value) +{ +VMW_SHPRN("SHMEM store64: %" PRIx64 " (value %" PRIx64 ")", addr, value); +stq_le_phys(addr, value); +} + +/* Macros for simplification of operations on array-style registers */ + +/* + * Whether lies inside of array-style register defined by , + * number of elements () and element size () + * +*/ +#define VMW_IS_MULTIREG_ADDR(addr, base, cnt, regsize) \ +range_covers_byte(base, cnt * regsize, addr) + +/* + * Returns index of given register () in array-style register defined by + * and element size () + * +*/ +#define VMW_MULTIREG_IDX_BY_ADDR(addr, base, regsize) \ +(((addr) - (base)) / (regsize)) + +#endif diff --git a/hw/vmxnet_debug.h b/hw/vmxnet_debug.h new file mode 100644 index 000..96dae0f --- /dev/null +++ b/hw/vmxnet_debug.h @@ -0,0 +1,115 @@ +/* + * QEMU VMWARE VMXNET* paravirtual NICs - debugging facilities + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Tamir Shomer + * Yan Vugenfirer + * + * Th
[Qemu-devel] [PATCH V14 4/5] Packet abstraction for VMWARE network devices
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/Makefile.objs | 1 + hw/vmxnet_rx_pkt.c | 187 ++ hw/vmxnet_rx_pkt.h | 174 hw/vmxnet_tx_pkt.c | 567 + hw/vmxnet_tx_pkt.h | 148 ++ 5 files changed, 1077 insertions(+) create mode 100644 hw/vmxnet_rx_pkt.c create mode 100644 hw/vmxnet_rx_pkt.h create mode 100644 hw/vmxnet_tx_pkt.c create mode 100644 hw/vmxnet_tx_pkt.h diff --git a/hw/Makefile.objs b/hw/Makefile.objs index 40ebe46..14922cb 100644 --- a/hw/Makefile.objs +++ b/hw/Makefile.objs @@ -119,6 +119,7 @@ common-obj-$(CONFIG_PCNET_PCI) += pcnet-pci.o common-obj-$(CONFIG_PCNET_COMMON) += pcnet.o common-obj-$(CONFIG_E1000_PCI) += e1000.o common-obj-$(CONFIG_RTL8139_PCI) += rtl8139.o +common-obj-$(CONFIG_VMXNET3_PCI) += vmxnet_tx_pkt.o vmxnet_rx_pkt.o common-obj-$(CONFIG_SMC91C111) += smc91c111.o common-obj-$(CONFIG_LAN9118) += lan9118.o diff --git a/hw/vmxnet_rx_pkt.c b/hw/vmxnet_rx_pkt.c new file mode 100644 index 000..a40e346 --- /dev/null +++ b/hw/vmxnet_rx_pkt.c @@ -0,0 +1,187 @@ +/* + * QEMU VMWARE VMXNET* paravirtual NICs - RX packets abstractions + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Tamir Shomer + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#include "vmxnet_rx_pkt.h" +#include "net/eth.h" +#include "qemu-common.h" +#include "qemu/iov.h" +#include "net/checksum.h" +#include "net/tap.h" + +/* + * RX packet may contain up to 2 fragments - rebuilt eth header + * in case of VLAN tag stripping + * and payload received from QEMU - in any case + */ +#define VMXNET_MAX_RX_PACKET_FRAGMENTS (2) + +struct VmxnetRxPkt { +struct virtio_net_hdr virt_hdr; +uint8_t ehdr_buf[ETH_MAX_L2_HDR_LEN]; +struct iovec vec[VMXNET_MAX_RX_PACKET_FRAGMENTS]; +uint16_t vec_len; +uint32_t tot_len; +uint16_t tci; +bool vlan_stripped; +bool has_virt_hdr; +eth_pkt_types_e packet_type; + +/* Analysis results */ +bool isip4; +bool isip6; +bool isudp; +bool istcp; +}; + +void vmxnet_rx_pkt_init(struct VmxnetRxPkt **pkt, bool has_virt_hdr) +{ +struct VmxnetRxPkt *p = g_malloc0(sizeof *p); +p->has_virt_hdr = has_virt_hdr; +*pkt = p; +} + +void vmxnet_rx_pkt_uninit(struct VmxnetRxPkt *pkt) +{ +g_free(pkt); +} + +struct virtio_net_hdr *vmxnet_rx_pkt_get_vhdr(struct VmxnetRxPkt *pkt) +{ +assert(pkt); +return &pkt->virt_hdr; +} + +void vmxnet_rx_pkt_attach_data(struct VmxnetRxPkt *pkt, const void *data, + size_t len, bool strip_vlan) +{ +uint16_t tci = 0; +uint16_t ploff; +assert(pkt); +pkt->vlan_stripped = false; + +if (strip_vlan) { +pkt->vlan_stripped = eth_strip_vlan(data, pkt->ehdr_buf, &ploff, &tci); +} + +if (pkt->vlan_stripped) { +pkt->vec[0].iov_base = pkt->ehdr_buf; +pkt->vec[0].iov_len = ploff - sizeof(struct vlan_header); +pkt->vec[1].iov_base = (uint8_t *) data + ploff; +pkt->vec[1].iov_len = len - ploff; +pkt->vec_len = 2; +pkt->tot_len = len - ploff + sizeof(struct eth_header); +} else { +pkt->vec[0].iov_base = (void *)data; +pkt->vec[0].iov_len = len; +pkt->vec_len = 1; +pkt->tot_len = len; +} + +pkt->tci = tci; + +eth_get_protocols(data, len, &pkt->isip4, &pkt->isip6, +&pkt->isudp, &pkt->istcp); +} + +void vmxnet_rx_pkt_dump(struct VmxnetRxPkt *pkt) +{ +#ifdef VMXNET_RX_PKT_DEBUG +VmxnetRxPkt *pkt = (VmxnetRxPkt 
*)pkt; +assert(pkt); + +printf("RX PKT: tot_len: %d, vlan_stripped: %d, vlan_tag: %d\n", + pkt->tot_len, pkt->vlan_stripped, pkt->tci); +#endif +} + +void vmxnet_rx_pkt_set_packet_type(struct VmxnetRxPkt *pkt, +eth_pkt_types_e packet_type) +{ +assert(pkt); + +pkt->packet_type = packet_type; + +} + +eth_pkt_types_e vmxnet_rx_pkt_get_packet_type(struct VmxnetRxPkt *pkt) +{ +assert(pkt); + +return pkt->packet_type; +} + +size_t vmxnet_rx_pkt_get_total_len(struct VmxnetRxPkt *pkt) +{ +assert(pkt); + +return pkt->tot_len; +} + +void vmxnet_rx_pkt_get_protocols(struct VmxnetRxPkt *pkt, + bool *isip4, bool *isip6, + bool *isudp, bool *istcp) +{ +assert(pkt); + +*isip4 = pkt->isip4; +*isip6 = pkt->isip6; +*isudp = pkt->isudp; +*istcp = pkt->istcp; +} + +struct iovec *vmxnet_rx_pkt_get_iovec(struct VmxnetRxPkt *pkt) +{ +assert(pkt); + +return
[Qemu-devel] [PATCH V14 2/5] net: iovec checksum calculator
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- include/net/checksum.h | 12 net/checksum.c | 29 + 2 files changed, 41 insertions(+) diff --git a/include/net/checksum.h b/include/net/checksum.h index 3e7b93d..80203fb 100644 --- a/include/net/checksum.h +++ b/include/net/checksum.h @@ -38,4 +38,16 @@ net_raw_checksum(uint8_t *data, int length) return net_checksum_finish(net_checksum_add(length, data)); } +/** + * net_checksum_add_iov: scatter-gather vector checksumming + * + * @iov: input scatter-gather array + * @iov_cnt: number of array elements + * @iov_off: starting iov offset for checksumming + * @size: length of data to be checksummed + */ +uint32_t net_checksum_add_iov(const struct iovec *iov, + const unsigned int iov_cnt, + uint32_t iov_off, uint32_t size); + #endif /* QEMU_NET_CHECKSUM_H */ diff --git a/net/checksum.c b/net/checksum.c index 4fa5563..14c0855 100644 --- a/net/checksum.c +++ b/net/checksum.c @@ -15,6 +15,7 @@ * along with this program; if not, see <http://www.gnu.org/licenses/>. */ +#include "qemu-common.h" #include "net/checksum.h" #define PROTO_TCP 6 @@ -84,3 +85,31 @@ void net_checksum_calculate(uint8_t *data, int length) data[14+hlen+csum_offset] = csum >> 8; data[14+hlen+csum_offset+1] = csum & 0xff; } + +uint32_t +net_checksum_add_iov(const struct iovec *iov, const unsigned int iov_cnt, + uint32_t iov_off, uint32_t size) +{ +size_t iovec_off, buf_off; +unsigned int i; +uint32_t res = 0; +uint32_t seq = 0; + +iovec_off = 0; +buf_off = 0; +for (i = 0; i < iov_cnt && size; i++) { +if (iov_off < (iovec_off + iov[i].iov_len)) { +size_t len = MIN((iovec_off + iov[i].iov_len) - iov_off , size); +void *chunk_buf = iov[i].iov_base + (iov_off - iovec_off); + +res += net_checksum_add_cont(len, chunk_buf, seq); +seq += len; + +buf_off += len; +iov_off += len; +size -= len; +} +iovec_off += iov[i].iov_len; +} +return res; +} -- 1.8.1.2
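For readers following the patch above, here is a minimal standalone sketch of the same scatter-gather walk. It is not the QEMU code: sketch_sum_cont() is an assumed stand-in for net_checksum_add_cont() (taken here to add each byte as the high or low half of a 16-bit word depending on its position in the overall stream), and every sketch_* name is hypothetical. It only illustrates how the starting offset and the remaining size are tracked across vector elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Assumed stand-in for net_checksum_add_cont(): a byte at an even position
 * of the overall stream is the high byte of a 16-bit word, a byte at an odd
 * position is the low byte. 'seq' is the stream position of buf[0]. */
static uint32_t sketch_sum_cont(size_t len, const uint8_t *buf, uint32_t seq)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        uint32_t pos = seq + (uint32_t)i;
        sum += (pos & 1) ? (uint32_t)buf[i] : (uint32_t)buf[i] << 8;
    }
    return sum;
}

/* Sum 'size' bytes starting 'iov_off' bytes into the vector. */
static uint32_t sketch_sum_iov(const struct iovec *iov, unsigned int iov_cnt,
                               uint32_t iov_off, uint32_t size)
{
    size_t elem_start = 0;   /* offset of iov[i] within the whole vector */
    uint32_t res = 0;
    uint32_t seq = 0;        /* bytes summed so far */
    unsigned int i;

    assert(iov != NULL || iov_cnt == 0);

    for (i = 0; i < iov_cnt && size; i++) {
        size_t elem_end = elem_start + iov[i].iov_len;

        if (iov_off < elem_end) {
            /* Part of this element lies inside the requested range. */
            size_t avail = elem_end - iov_off;
            size_t len = avail < size ? avail : size;
            const uint8_t *chunk =
                (const uint8_t *)iov[i].iov_base + (iov_off - elem_start);

            res += sketch_sum_cont(len, chunk, seq);
            seq += (uint32_t)len;
            iov_off += (uint32_t)len;
            size -= (uint32_t)len;
        }
        elem_start = elem_end;
    }
    return res;
}
```

Passing the running byte count as seq is what keeps the odd/even pairing correct when an element boundary falls in the middle of a 16-bit word.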
Re: [Qemu-devel] [PATCH 1/1 V6] VMWare PVSCSI paravirtual device implementation
On Thu, Apr 18, 2013 at 1:54 PM, Paolo Bonzini wrote: > Il 18/04/2013 11:38, Dmitry Fleytman ha scritto: > > > > > +static void > > > +pvscsi_free_queue(PVSCSIRequestList *req_list) > > > > This shouldn't be needed. > > > > > > > > Doesn't one need to clear completion queue on reset command arrival from > > driver? > > > > It should happen in qbus_reset_all. The scsi-disk device will cancel > pending requests, and these will be moved from the pending_queue to the > completion_queue. I think you can call pvscsi_process_completion_queue > instead of pvscsi_free_queue. > > Paolo > > Ok, sounds reasonable. I'll send the updated patch soon. > > > > > +{ > > > +PVSCSIRequest *pvscsi_req; > > > + > > > +while (!QTAILQ_EMPTY(req_list)) { > > > +pvscsi_req = QTAILQ_FIRST(req_list); > > > +QTAILQ_REMOVE(req_list, pvscsi_req, next); > > > +g_free(pvscsi_req); > > > +} > > > +} > > > + > >
[Qemu-devel] [PATCH 0/1 V7] VMWare PVSCSI paravirtual device implementation
Below is the implementation of the VMWare PVSCSI device. The PVSCSI implementation is based on Paolo Bonzini's code, submitted some time ago but never applied. See commit messages and file headers for details. This patch contains changes made by Deep Debroy, see here: http://lists.gnu.org/archive/html/qemu-devel/2012-07/msg03585.html Cc: Deep Debroy The implementation supports all of the device features. Code was tested on the following OSes: Fedora 15, Ubuntu 10.4, Centos 6.2, Windows 2008R2, Windows 2008 64bit, Windows 2008 32bit, Windows 2003 64bit, Windows 2003 32bit. Changes since V6: 1. Fixes as suggested by Paolo Bonzini Reported-by: Paolo Bonzini Changes since V5: 1. SCSI hotplug support added 2. Code rebased onto mainline Changes since V4: Array access checks and minor beautification as suggested by Blue Swirl. Reported-by: Blue Swirl Changes since V3: 1. Utility function strpadcpy() and structure changes in SCSI devices removed from v4 since they were already applied to scsi-next from v3 by Paolo. 2. Logging ported to use tracepoints. All ifdef-based custom macros for logging removed. 3. The vmware_utils.h header is no longer present; the necessary macros are inlined. 4. pvscsi.h replaced by vmw_pvscsi.h from the Linux kernel with some minor modifications to build in qemu. 5. Various fixes and beautification as suggested by Blue Swirl. Reported-by: Blue Swirl Changes since V1: Various fixes and beautification as suggested by Paolo Bonzini Reported-by: Paolo Bonzini Dmitry Fleytman (1): VMWare PVSCSI paravirtual device implementation default-configs/pci.mak|1 + docs/specs/pvscsi-spec.txt | 92 hw/Makefile.objs |1 + hw/pci/pci.h |1 + hw/pvscsi.c| 1194 hw/vmw_pvscsi.h| 434 trace-events | 36 ++ 7 files changed, 1759 insertions(+) create mode 100644 docs/specs/pvscsi-spec.txt create mode 100644 hw/pvscsi.c create mode 100644 hw/vmw_pvscsi.h -- 1.8.1.4
Re: [Qemu-devel] [PATCH 0/1 V4] virtio-net: dynamic network offloads configuration
Hello All, Any news regarding this patch? Thanks, Dmitry On Sun, Apr 7, 2013 at 9:34 AM, Dmitry Fleytman wrote: > From: Dmitry Fleytman > > This patch implements recently accepted by virtio-spec > dynamic offloads configuration feature. > See commit message for details. > > V4 changes: > 1. Feature definitions re-used for command bitmask > 2. Command data made uint64 > 3. Commit messages fixed > > Reported-by: Rusty Russell ru...@rustcorp.com.au > > V3 changes: > 1. Compat macro added > 2. Feature name beautification > > V2 changes: > 1. _GUEST_ added to command and feature names > 2. Live migration logic fixed > > Reported-by: Michael S. Tsirkin > > One of recently introduced Windows features (RSC) > requires network driver to be able to enable and disable > HW LRO offload on the fly without device reinitialization. > > Current Virtio specification doesn't support this requirement. > The solution proposed by following spec patch is to add > a new control command for this purpose. > > The same solution may be used in Linux driver for ethtool interface > implementation. > > -- > 1.8.1.4 > >
Re: [Qemu-devel] [PATCH 1/1 V7] VMWare PVSCSI paravirtual device implementation
Paolo, thanks for the review. Regarding the change - it's OK with me, but why does one need this? I think we always set the proper status before request cancellation. Can QEMU call the cancel callback on its own? Dmitry. On Fri, Apr 19, 2013 at 10:37 AM, Paolo Bonzini wrote: > Il 19/04/2013 09:05, Dmitry Fleytman ha scritto: > > +if (pvscsi_req->dev->resetting) { > > +pvscsi_req->cmp.hostStatus = BTSTAT_BUSRESET; > > +} > > I'm changing this to > > if (pvscsi_req->cmp.hostStatus == BTSTAT_SUCCESS) { > if (pvscsi_req->dev->resetting) { > pvscsi_req->cmp.hostStatus = BTSTAT_BUSRESET; > } else { > pvscsi_req->cmp.hostStatus = BTSTAT_ABORTQUEUE; > } > } > > Otherwise it's okay. > > Thanks! > > Paolo >
Re: [Qemu-devel] [PATCH 1/1 V7] VMWare PVSCSI paravirtual device implementation
I see. Thanks. On Fri, Apr 19, 2013 at 10:58 AM, Paolo Bonzini wrote: > Il 19/04/2013 09:53, Dmitry Fleytman ha scritto: > > Paolo, thanks for review. > > > > Regarding the change - it's ok with me, but why do one needs this? I > > think we always set proper status before request cancellation. > > May QEMU call cancel callback on its own? > > The cancel callback should not be run if the command is completed (what > happens is that scsi_req_cancel will call _either_ the complete callback > or the cancel callback). However, we had bugs in the past on this and > I'm not sure all of them have been stomped. > > The outer "if" statement is the equivalent of this in virtio-scsi.c > > if (!req) { > return; > } > > if (req->dev->resetting) { > ... > } > > So it may even be better if I follow the scheme in virtio-scsi.c and do > this: > > if (r->completed) { > return; > } > >if (pvscsi_req->dev->resetting) { >pvscsi_req->cmp.hostStatus = BTSTAT_BUSRESET; > } else { >pvscsi_req->cmp.hostStatus = BTSTAT_ABORTQUEUE; > } > > pvscsi_complete_request(s, pvscsi_req); > > Paolo >
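The scheme Paolo describes (skip requests that already completed, otherwise report a reset or an abort) can be sketched outside QEMU as below. This is only an illustration under stated assumptions: the struct, its fields and the SKETCH_* status names are hypothetical stand-ins, not the pvscsi types, and the enum values are placeholders rather than the real BTSTAT_* codes from hw/vmw_pvscsi.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder status codes; the real BTSTAT_* values are defined in
 * hw/vmw_pvscsi.h and are not reproduced here. */
enum sketch_host_status {
    SKETCH_BTSTAT_SUCCESS,
    SKETCH_BTSTAT_BUSRESET,
    SKETCH_BTSTAT_ABORTQUEUE
};

/* Hypothetical request state, reduced to what the cancel path looks at. */
struct sketch_request {
    bool completed;                      /* complete callback already ran */
    bool dev_resetting;                  /* owning device is being reset  */
    enum sketch_host_status host_status;
};

/* Cancel path: returns true when it reported a completion to the guest,
 * false when the request had already completed and nothing was done. */
static bool sketch_request_cancelled(struct sketch_request *req)
{
    assert(req);

    if (req->completed) {
        /* Either the complete or the cancel callback runs, never both. */
        return false;
    }

    req->host_status = req->dev_resetting ? SKETCH_BTSTAT_BUSRESET
                                          : SKETCH_BTSTAT_ABORTQUEUE;
    req->completed = true;
    return true;
}
```

The early return is the defensive part: it keeps a stray cancel from overwriting the status of a command that has already been reported to the guest.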
Re: [Qemu-devel] [PATCH 0/1 V4] virtio-net: dynamic network offloads configuration
Spec patch already inside. Sent from my iPad On Apr 20, 2013, at 8:04 PM, "Michael S. Tsirkin" wrote: > On Fri, Apr 19, 2013 at 10:10:01AM +0300, Dmitry Fleytman wrote: >> Hello All, >> >> Any news regarding this patch? >> >> Thanks, >> Dmitry > > Rusty could you comment on the spec change soon please? > If you pick it up I think we can include the feature in QEMU 1.5. > >> On Sun, Apr 7, 2013 at 9:34 AM, Dmitry Fleytman wrote: >> >> From: Dmitry Fleytman >> >> This patch implements recently accepted by virtio-spec >> dynamic offloads configuration feature. >> See commit message for details. >> >> V4 changes: >> 1. Feature definitions re-used for command bitmask >> 2. Command data made uint64 >> 3. Commit messages fixed >> >> Reported-by: Rusty Russell ru...@rustcorp.com.au >> >> V3 changes: >> 1. Compat macro added >> 2. Feature name beautification >> >> V2 changes: >> 1. _GUEST_ added to command and feature names >> 2. Live migration logic fixed >> >> Reported-by: Michael S. Tsirkin >> >> One of recently introduced Windows features (RSC) >> requires network driver to be able to enable and disable >> HW LRO offload on the fly without device reinitialization. >> >> Current Virtio specification doesn't support this requirement. >> The solution proposed by following spec patch is to add >> a new control command for this purpose. >> >> The same solution may be used in Linux driver for ethtool interface >> implementation. >> >> -- >> 1.8.1.4 >> >> >>
Re: [Qemu-devel] [PATCH] vmxnet3: Pad short frames to minimum size (60 bytes)
On Aug 24, 2014, at 15:06 PM, Michael Tokarev wrote: > 20.08.2014 16:27, Ben Draper wrote: >> When running VMware ESXi under qemu-kvm the guest discards frames >> that are too short. Short ARP Requests will be dropped, this prevents >> guests on the same bridge as VMware ESXi from communicating. This patch >> simply adds the padding on the network device itself. > > I'm not sure it is "trivial enough", so to say. Do we have a maintainer > for vmxnet? It's been written and updated several times by vmware (Daynix) > people, maybe they can comment on this somehow? I mean, if we don't have > a maintainer for this device, it is okay to go to -trivial, but maybe it's > a good idea to try to reach the author(s) first? (Adding Cc). > > Especially since this change is only required in certain cases, not > generally. Hi Michael, I’m the maintainer of vmxnet3/pvscsi devices in QEMU. Thanks for CC’ing me. I think this patch is correct and needed. As we saw a few times already on different operating systems, vmware drivers expect short packets to be padded as required by corresponding RFC. Therefore this patch fixes a real bug. 
Reviewed-by: Dmitry Fleytman > > Thanks, > > /mjt > >> Signed-off-by: Ben Draper >> --- >> hw/net/vmxnet3.c | 10 ++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c >> index 791321f..f246fa1 100644 >> --- a/hw/net/vmxnet3.c >> +++ b/hw/net/vmxnet3.c >> @@ -34,6 +34,7 @@ >> >> #define PCI_DEVICE_ID_VMWARE_VMXNET3_REVISION 0x1 >> #define VMXNET3_MSIX_BAR_SIZE 0x2000 >> +#define MIN_BUF_SIZE 60 >> >> #define VMXNET3_BAR0_IDX (0) >> #define VMXNET3_BAR1_IDX (1) >> @@ -1871,12 +1872,21 @@ vmxnet3_receive(NetClientState *nc, const uint8_t >> *buf, size_t size) >> { >> VMXNET3State *s = qemu_get_nic_opaque(nc); >> size_t bytes_indicated; >> +uint8_t min_buf[MIN_BUF_SIZE]; >> >> if (!vmxnet3_can_receive(nc)) { >> VMW_PKPRN("Cannot receive now"); >> return -1; >> } >> >> +/* Pad to minimum Ethernet frame length */ >> +if (size < sizeof(min_buf)) { >> +memcpy(min_buf, buf, size); >> +memset(&min_buf[size], 0, sizeof(min_buf) - size); >> +buf = min_buf; >> +size = sizeof(min_buf); >> +} >> + >> if (s->peer_has_vhdr) { >> vmxnet_rx_pkt_set_vhdr(s->rx_pkt, (struct virtio_net_hdr *)buf); >> buf += sizeof(struct virtio_net_hdr); >> >
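The padding step in the quoted patch can be tried in isolation with a small sketch like the one below. It assumes nothing about vmxnet3 itself: sketch_pad_frame() and SKETCH_MIN_FRAME are hypothetical names, and the caller supplies the scratch buffer that the patch keeps on the stack as min_buf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 60 bytes, matching MIN_BUF_SIZE in the patch: the minimum Ethernet frame
 * length without the 4-byte FCS. */
#define SKETCH_MIN_FRAME 60

/* Returns the buffer the receive path should continue with and stores its
 * length in *out_len. Short frames are copied into 'scratch' and zero-padded;
 * frames that are already long enough are passed through untouched. */
static const uint8_t *sketch_pad_frame(const uint8_t *buf, size_t size,
                                       uint8_t scratch[SKETCH_MIN_FRAME],
                                       size_t *out_len)
{
    assert(buf && scratch && out_len);

    if (size >= SKETCH_MIN_FRAME) {
        *out_len = size;
        return buf;
    }

    memcpy(scratch, buf, size);
    memset(scratch + size, 0, SKETCH_MIN_FRAME - size);
    *out_len = SKETCH_MIN_FRAME;
    return scratch;
}
```

A 42-byte ARP request, for example, comes back as a 60-byte frame whose last 18 bytes are zero, which is the case the commit message describes.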
Re: [Qemu-devel] [PATCH] vmxnet3: Pad short frames to minimum size (60 bytes)
On Aug 24, 2014, at 16:10 PM, Michael Tokarev wrote: > 24.08.2014 16:28, Dmitry Fleytman wrote: > >> Hi Michael, >> >> I’m the maintainer of vmxnet3/pvscsi devices in QEMU. Thanks for CC’ing me. > > Maybe you can add yourself to MAINTAINERS file as well? :) Yes, this should be done. How do we do this? Should I send a patch for MAINTAINERS? > I dunno if that's actually needed, but at least this should > stop "strain" patches like this to be sent to -trivial alone... ;) > >> I think this patch is correct and needed. >> >> As we saw a few times already on different operating systems, >> vmware drivers expect short packets to be padded as required >> by corresponding RFC. Therefore this patch fixes a real bug. > > Okay, since there's no entry for vmxnet in MAINTAINERS, and with > your blessing, and since this is a rather specific device which > is not in common use, I'll apply it to -trivial, for now, unless > you want to pick it up and send a pull request for it. -trivial is good enough for this patch. > > Given your description, I think it should be Cc: qemu-stable@. > >> Reviewed-by: Dmitry Fleytman > > Thanks, > > /mjt
[Qemu-devel] [PATCH] MAINTAINERS: Add VMWare devices maintainer
Signed-off-by: Dmitry Fleytman --- MAINTAINERS | 6 ++ 1 file changed, 6 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 59940f9..1b3e2be 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -678,6 +678,12 @@ S: Maintained F: hw/*/xilinx_* F: include/hw/xilinx.h +Vmware +M: Dmitry Fleytman +S: Maintained +F: hw/net/vmxnet* +F: hw/scsi/vmw_pvscsi* + Subsystems -- Audio -- 1.9.3
Re: [Qemu-devel] [patch qemu] vmxnet3: fix msix vectors unuse
ACK Thanks for fixing. On May 19, 2014, at 16:47 PM, Jiri Pirko wrote: > In vmxnet3_cleanup_msix(), there is called msix_vector_unuse() with > VMXNET3_MAX_INTRS. That is not correct since vector of > value VMXNET3_MAX_INTRS was never used. Also all the used vectors > are not un-used. So call vmxnet3_unuse_msix_vectors() instead which > does the correct job. > > Signed-off-by: Jiri Pirko > --- > hw/net/vmxnet3.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c > index 1bb9259..f3be494 100644 > --- a/hw/net/vmxnet3.c > +++ b/hw/net/vmxnet3.c > @@ -2050,7 +2050,7 @@ vmxnet3_cleanup_msix(VMXNET3State *s) > PCIDevice *d = PCI_DEVICE(s); > > if (s->msix_used) { > -msix_vector_unuse(d, VMXNET3_MAX_INTRS); > +vmxnet3_unuse_msix_vectors(s, VMXNET3_MAX_INTRS); > msix_uninit(d, &s->msix_bar, &s->msix_bar); > } > } > -- > 1.9.0 >
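To make the bug in the quoted commit message concrete, here is a toy model of per-vector use counts. It is an assumption-laden illustration, not the QEMU MSI-X API: all sketch_* names are hypothetical and SKETCH_MAX_INTRS is an arbitrary stand-in for VMXNET3_MAX_INTRS. Vectors are numbered 0..N-1, so un-using "vector N" touches nothing that was ever used, while looping over 0..N-1 releases them all.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_INTRS 25   /* hypothetical vector count */

struct sketch_msix {
    unsigned use_count[SKETCH_MAX_INTRS];
};

/* Mark one vector as used; out-of-range vectors are rejected. */
static bool sketch_vector_use(struct sketch_msix *m, unsigned vector)
{
    assert(m);
    if (vector >= SKETCH_MAX_INTRS) {
        return false;
    }
    m->use_count[vector]++;
    return true;
}

/* Release one vector; returns false if it was out of range or unused. */
static bool sketch_vector_unuse(struct sketch_msix *m, unsigned vector)
{
    assert(m);
    if (vector >= SKETCH_MAX_INTRS || m->use_count[vector] == 0) {
        return false;
    }
    m->use_count[vector]--;
    return true;
}

/* Release vectors 0..num_vectors-1, the job the fixed cleanup path does. */
static void sketch_unuse_vectors(struct sketch_msix *m, unsigned num_vectors)
{
    unsigned i;

    for (i = 0; i < num_vectors; i++) {
        sketch_vector_unuse(m, i);
    }
}

/* Count vectors that are still marked as used. */
static unsigned sketch_vectors_in_use(const struct sketch_msix *m)
{
    unsigned i, n = 0;

    for (i = 0; i < SKETCH_MAX_INTRS; i++) {
        n += m->use_count[i] ? 1u : 0u;
    }
    return n;
}
```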
Re: [Qemu-devel] [Qemu-trivial] [PATCH] MAINTAINERS: Add VMWare devices maintainer
> On 29 Aug 2014, at 19:24, Michael Tokarev wrote: > > Applied to -trivial, dunno if it'll be picked up by anyone else ;) This patch is pretty trivial indeed ;) > > Thank you! > > /mjt
[Qemu-devel] [PATCH 0/5] VMWare PVSCSI paravirtual device implementation
Below is the implementation of the VMWare PVSCSI device, together with command line parameters to configure the vendor name and product name for SCSI storage. The latter is needed to make PVSCSI storage devices look exactly as they do on VMWare hypervisors. With this and the VMWARE3 patches, the V2V migration problem for VMWare images should be relatively easy to solve. The PVSCSI implementation is based on Paolo Bonzini's code, submitted some time ago but never applied. See commit messages and file headers for details. The implementation supports all of the device features. Code was tested on the following OSes: Fedora 15, Ubuntu 10.4, Centos 6.2, Windows 2008R2, Windows 2008 64bit, Windows 2008 32bit, Windows 2003 64bit, Windows 2003 32bit. Dmitry Fleytman (5): Utility function strpadcpy() added Vendor name and product name parameters for SCSI devices Options "vendor_name" and "product_name" added for SCSI disks. Header with various utility functions shared by VMWARE SCSI and network devices PVCSI paravirtualized device implementation PVSCSI paravirtualized device integration Bus type "pvscsi" added. Makefile.objs |1 + blockdev.c | 12 +- blockdev.h | 16 +- cutils.c | 13 + default-configs/pci.mak|1 + docs/specs/pvscsi-spec.txt | 92 hw/pc.c|5 + hw/pci-hotplug.c |7 +- hw/pci.h |1 + hw/pvscsi.c| 1242 hw/pvscsi.h| 442 hw/scsi-bus.c | 14 +- hw/scsi-disk.c | 51 ++- hw/scsi.h |1 + hw/vmware_utils.h | 122 + qemu-common.h |1 + 16 files changed, 1997 insertions(+), 24 deletions(-) create mode 100644 docs/specs/pvscsi-spec.txt create mode 100644 hw/pvscsi.c create mode 100644 hw/pvscsi.h create mode 100644 hw/vmware_utils.h -- 1.7.7.6
[Qemu-devel] [PATCH 1/5] Utility function strpadcpy() added
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- cutils.c | 13 + qemu-common.h |1 + 2 files changed, 14 insertions(+), 0 deletions(-) diff --git a/cutils.c b/cutils.c index af308cd..0df7fdf 100644 --- a/cutils.c +++ b/cutils.c @@ -27,6 +27,19 @@ #include "qemu_socket.h" +void strpadcpy(char *buf, int buf_size, const char *str, char pad) +{ +int i; +int has_src_data = TRUE; + +for (i = 0; i < buf_size; i++) { +if ((has_src_data) && (0 == str[i])) { +has_src_data = FALSE; +} + buf[i] = has_src_data ? str[i] : pad; +} +} + void pstrcpy(char *buf, int buf_size, const char *str) { int c; diff --git a/qemu-common.h b/qemu-common.h index b0fdf5c..fdd3d17 100644 --- a/qemu-common.h +++ b/qemu-common.h @@ -134,6 +134,7 @@ int qemu_timedate_diff(struct tm *tm); /* cutils.c */ void pstrcpy(char *buf, int buf_size, const char *str); +void strpadcpy(char *buf, int buf_size, const char *str, char pad); char *pstrcat(char *buf, int buf_size, const char *s); int strstart(const char *str, const char *val, const char **ptr); int stristart(const char *str, const char *val, const char **ptr); -- 1.7.7.6
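A standalone restatement of the helper may make its contract easier to see. The sketch below follows the loop in the patch but is renamed sketch_strpadcpy() and uses C99 bool in place of TRUE/FALSE, so it builds without QEMU headers; treat it as an illustration rather than the committed code.

```c
#include <assert.h>
#include <stdbool.h>

/* Fill exactly buf_size bytes of buf: copy str up to its terminating NUL,
 * then fill the remainder with 'pad'. The result is NOT NUL-terminated,
 * and a source longer than buf_size is silently truncated. */
static void sketch_strpadcpy(char *buf, int buf_size, const char *str, char pad)
{
    bool has_src_data = true;
    int i;

    assert(buf && str);

    for (i = 0; i < buf_size; i++) {
        if (has_src_data && str[i] == '\0') {
            /* Source exhausted: stop reading str from here on. */
            has_src_data = false;
        }
        buf[i] = has_src_data ? str[i] : pad;
    }
}
```

This fixed-width, space-padded form is what the SCSI INQUIRY vendor (8 bytes) and product (16 bytes) fields need, which is how patch 2/5 of the series uses the function.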
[Qemu-devel] [PATCH 5/5] PVSCSI paravirtualized device integration Bus type "pvscsi" added.
Sample command line for pvscsi-based disk is: -drive file=image.raw,if=none,cache=off,id=pvscsi1 \ -device pvscsi,id=pvscsi -device scsi-disk,drive=pvscsi1,bus=pvscsi.0 \ Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- Makefile.objs |1 + blockdev.c | 12 blockdev.h | 10 +- default-configs/pci.mak |1 + hw/pc.c |5 + hw/pci-hotplug.c|7 ++- hw/pci.h|1 + hw/scsi-bus.c | 14 -- hw/scsi.h |1 + 9 files changed, 44 insertions(+), 8 deletions(-) diff --git a/Makefile.objs b/Makefile.objs index 226b01d..bf0a351 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -304,6 +304,7 @@ hw-obj-$(CONFIG_AHCI) += ide/ich.o # SCSI layer hw-obj-$(CONFIG_LSI_SCSI_PCI) += lsi53c895a.o +hw-obj-$(CONFIG_PVSCSI_SCSI_PCI) += pvscsi.o hw-obj-$(CONFIG_ESP) += esp.o hw-obj-y += dma-helpers.o sysbus.o isa-bus.o diff --git a/blockdev.c b/blockdev.c index 1a500b8..41e8efd 100644 --- a/blockdev.c +++ b/blockdev.c @@ -32,6 +32,7 @@ static const char *const if_name[IF_COUNT] = { [IF_SD] = "sd", [IF_VIRTIO] = "virtio", [IF_XEN] = "xen", +[IF_PVSCSI] = "pvscsi", }; static const int if_max_devs[IF_COUNT] = { @@ -433,7 +434,8 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi) on_write_error = BLOCK_ERR_STOP_ENOSPC; if ((buf = qemu_opt_get(opts, "werror")) != NULL) { -if (type != IF_IDE && type != IF_SCSI && type != IF_VIRTIO && type != IF_NONE) { +if (type != IF_IDE && type != IF_SCSI && type != IF_VIRTIO && +type != IF_PVSCSI && type != IF_NONE) { error_report("werror is not supported by this bus type"); return NULL; } @@ -446,7 +448,8 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi) on_read_error = BLOCK_ERR_REPORT; if ((buf = qemu_opt_get(opts, "rerror")) != NULL) { -if (type != IF_IDE && type != IF_VIRTIO && type != IF_SCSI && type != IF_NONE) { +if (type != IF_IDE && type != IF_VIRTIO && type != IF_SCSI && +type != IF_PVSCSI && type != IF_NONE) { error_report("rerror is not supported by this bus type"); return NULL; } @@ -516,7 +519,7 @@ DriveInfo 
*drive_init(QemuOpts *opts, int default_to_scsi) } else { /* no id supplied -> create one */ dinfo->id = g_malloc0(32); -if (type == IF_IDE || type == IF_SCSI) +if (type == IF_IDE || type == IF_SCSI || type == IF_PVSCSI) mediastr = (media == MEDIA_CDROM) ? "-cd" : "-hd"; if (max_devs) snprintf(dinfo->id, 32, "%s%i%s%i", @@ -545,6 +548,7 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi) case IF_IDE: case IF_SCSI: case IF_XEN: +case IF_PVSCSI: case IF_NONE: switch(media) { case MEDIA_DISK: @@ -596,7 +600,7 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi) ro = 1; } else if (ro == 1) { if (type != IF_SCSI && type != IF_VIRTIO && type != IF_FLOPPY && -type != IF_NONE && type != IF_PFLASH) { +type != IF_PVSCSI && type != IF_NONE && type != IF_PFLASH) { error_report("readonly not supported by this bus type"); goto err; } diff --git a/blockdev.h b/blockdev.h index 1813c53..7c531aa 100644 --- a/blockdev.h +++ b/blockdev.h @@ -24,7 +24,15 @@ void blockdev_auto_del(BlockDriverState *bs); typedef enum { IF_DEFAULT = -1,/* for use with drive_add() only */ IF_NONE, -IF_IDE, IF_SCSI, IF_FLOPPY, IF_PFLASH, IF_MTD, IF_SD, IF_VIRTIO, IF_XEN, +IF_IDE, +IF_SCSI, +IF_FLOPPY, +IF_PFLASH, +IF_MTD, +IF_SD, +IF_VIRTIO, +IF_XEN, +IF_PVSCSI, IF_COUNT } BlockInterfaceType; diff --git a/default-configs/pci.mak b/default-configs/pci.mak index 21e4ccf..c203bf8 100644 --- a/default-configs/pci.mak +++ b/default-configs/pci.mak @@ -11,6 +11,7 @@ CONFIG_EEPRO100_PCI=y CONFIG_PCNET_PCI=y CONFIG_PCNET_COMMON=y CONFIG_LSI_SCSI_PCI=y +CONFIG_PVSCSI_SCSI_PCI=y CONFIG_RTL8139_PCI=y CONFIG_E1000_PCI=y CONFIG_IDE_CORE=y diff --git a/hw/pc.c b/hw/pc.c index 83a1b5b..2140a25 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -1175,4 +1175,9 @@ void pc_pci_device_init(PCIBus *pci_bus) for (bus = 0; bus <= max_bus; bus++) { pci_create_simple(pci_bus, -1, "lsi53c895a"); } + +max_bus = drive_get_max_bus(IF_PVSCSI); +for (bus = 0; bus <= max_bus; bus++) { +pci_create_simple(pci_bus, -1, "pvscsi"); +} 
} diff --g
[Qemu-devel] [PATCH 3/5] Header with various utility functions shared by VMWARE SCSI and network devices
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/vmware_utils.h | 122 + 1 files changed, 122 insertions(+), 0 deletions(-) create mode 100644 hw/vmware_utils.h diff --git a/hw/vmware_utils.h b/hw/vmware_utils.h new file mode 100644 index 000..a86e691 --- /dev/null +++ b/hw/vmware_utils.h @@ -0,0 +1,122 @@ +/* + * QEMU VMWARE paravirtual devices - auxiliary code + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef VMWARE_UTILS_H +#define VMWARE_UTILS_H + +/* Shared memory access functions with byte swap support */ +static inline void +vmw_shmem_read(target_phys_addr_t addr, void *buf, int len) +{ +DSHPRINTF("SHMEM r: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_read(addr, buf, len); +} + +static inline void +vmw_shmem_write(target_phys_addr_t addr, void *buf, int len) +{ +DSHPRINTF("SHMEM w: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_write(addr, buf, len); +} + +static inline void +vmw_shmem_rw(target_phys_addr_t addr, void *buf, int len, int is_write) +{ +DSHPRINTF("SHMEM r/w: %" PRIx64 ", len: %d (to %p), is write: %d", + addr, len, buf, is_write); + +cpu_physical_memory_rw(addr, buf, len, is_write); +} + +static inline void +vmw_shmem_set(target_phys_addr_t addr, uint8 val, int len) +{ +int i; +DSHPRINTF("SHMEM set: %" PRIx64 ", len: %d (value 0x%X)", addr, len, val); + +for (i = 0; i < len; i++) { +cpu_physical_memory_write(addr + i, &val, 1); +} +} + +static inline uint32_t +vmw_shmem_ld8(target_phys_addr_t addr) +{ +uint8_t res = ldub_phys(addr); +DSHPRINTF("SHMEM load8: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st8(target_phys_addr_t addr, uint8_t value) +{ 
+DSHPRINTF("SHMEM store8: %" PRIx64 " (value 0x%X)", addr, value); +stb_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld16(target_phys_addr_t addr) +{ +uint16_t res = lduw_le_phys(addr); +DSHPRINTF("SHMEM load16: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st16(target_phys_addr_t addr, uint16_t value) +{ +DSHPRINTF("SHMEM store16: %" PRIx64 " (value 0x%X)", addr, value); +stw_le_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld32(target_phys_addr_t addr) +{ +uint32_t res = ldl_le_phys(addr); +DSHPRINTF("SHMEM load32: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st32(target_phys_addr_t addr, uint32_t value) +{ +DSHPRINTF("SHMEM store32: %" PRIx64 " (value 0x%X)", addr, value); +stl_le_phys(addr, value); +} + +static inline uint64_t +vmw_shmem_ld64(target_phys_addr_t addr) +{ +uint64_t res = ldq_le_phys(addr); +DSHPRINTF("SHMEM load64: %" PRIx64 " (value %" PRIx64 ")", addr, res); +return res; +} + +static inline void +vmw_shmem_st64(target_phys_addr_t addr, uint64_t value) +{ +DSHPRINTF("SHMEM store64: %" PRIx64 " (value %" PRIx64 ")", addr, value); +stq_le_phys(addr, value); +} + +/* MACROS for simplification of operations on array-style registers */ +#define IS_MULTIREG_ADDR(addr, base, cnt, regsize) \ +(((addr) >= (base)) && ((addr) < (base) + (cnt) * (regsize))) + +#define MULTIREG_IDX_BY_ADDR(addr, base, regsize) \ +(((addr) - (base)) / (regsize)) + +#endif -- 1.7.7.6
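As a hedged illustration of how the two array-register macros at the end of this header might be used, the sketch below copies their bodies under SKETCH_ names and decodes an address against a made-up register array. The base, count and element size are assumptions chosen for the example, not values from any VMWare device.

```c
#include <assert.h>
#include <stdint.h>

/* Same expressions as IS_MULTIREG_ADDR / MULTIREG_IDX_BY_ADDR in the patch,
 * renamed so the sketch does not depend on the header. */
#define SKETCH_IS_MULTIREG_ADDR(addr, base, cnt, regsize) \
    (((addr) >= (base)) && ((addr) < (base) + (cnt) * (regsize)))

#define SKETCH_MULTIREG_IDX_BY_ADDR(addr, base, regsize) \
    (((addr) - (base)) / (regsize))

/* Hypothetical register array: 25 registers of 8 bytes starting at 0x600. */
#define SKETCH_REG_BASE 0x600u
#define SKETCH_REG_CNT  25u
#define SKETCH_REG_SIZE 8u

/* Returns the array index addressed by 'addr', or -1 if 'addr' falls
 * outside the register array. */
static int sketch_decode_reg(uint64_t addr)
{
    uint64_t idx;

    if (!SKETCH_IS_MULTIREG_ADDR(addr, SKETCH_REG_BASE,
                                 SKETCH_REG_CNT, SKETCH_REG_SIZE)) {
        return -1;
    }

    idx = SKETCH_MULTIREG_IDX_BY_ADDR(addr, SKETCH_REG_BASE, SKETCH_REG_SIZE);
    assert(idx < SKETCH_REG_CNT);
    return (int)idx;
}
```

Any address inside an element maps to that element's index, so an MMIO handler can check the range once and then switch on the index.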
[Qemu-devel] [PATCH 2/5] Vendor name and product name parameters for SCSI devices Options "vendor_name" and "product_name" added for SCSI disks.
Sample command line is: -drive file=image.raw,if=none,cache=off,id=scsi1 \ -device lsi,id=scsi -device scsi-disk,drive=scsi1,bus=scsi.0,product_name="VENDOR SCSI DISK",vendor_name="[VENDOR]" \ Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- blockdev.h |6 +- hw/scsi-disk.c | 51 --- 2 files changed, 41 insertions(+), 16 deletions(-) diff --git a/blockdev.h b/blockdev.h index 260e16b..1813c53 100644 --- a/blockdev.h +++ b/blockdev.h @@ -17,7 +17,9 @@ void blockdev_mark_auto_del(BlockDriverState *bs); void blockdev_auto_del(BlockDriverState *bs); -#define BLOCK_SERIAL_STRLEN 20 +#define BLOCK_SERIAL_STRLEN 20 +#define BLOCK_VENDOR_STRLEN 8 +#define BLOCK_PRODUCT_STRLEN 16 typedef enum { IF_DEFAULT = -1,/* for use with drive_add() only */ @@ -37,6 +39,8 @@ struct DriveInfo { int media_cd; QemuOpts *opts; char serial[BLOCK_SERIAL_STRLEN + 1]; +char vname[BLOCK_VENDOR_STRLEN + 1]; +char pname[BLOCK_PRODUCT_STRLEN + 1]; QTAILQ_ENTRY(DriveInfo) next; int refcount; }; diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index add399e..0a12ea2 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -70,6 +70,8 @@ struct SCSIDiskState QEMUBH *bh; char *version; char *serial; +char *vname; +char *pname; bool tray_open; bool tray_locked; }; @@ -566,12 +568,23 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, uint8_t *outbuf) outbuf[0] = s->qdev.type & 0x1f; outbuf[1] = s->removable ? 
0x80 : 0; -if (s->qdev.type == TYPE_ROM) { -memcpy(&outbuf[16], "QEMU CD-ROM ", 16); + +if (NULL != s->pname) { +strpadcpy((char *) &outbuf[16], 16, s->pname, ' '); } else { -memcpy(&outbuf[16], "QEMU HARDDISK ", 16); +if (s->qdev.type == TYPE_ROM) { +memcpy(&outbuf[16], "QEMU CD-ROM ", 16); +} else { +memcpy(&outbuf[16], "QEMU HARDDISK ", 16); +} } -memcpy(&outbuf[8], "QEMU", 8); + +if (NULL != s->vname) { +strpadcpy((char *) &outbuf[8], 8, s->vname, ' '); +} else { +memcpy(&outbuf[8], "QEMU", 8); +} + memset(&outbuf[32], 0, 4); memcpy(&outbuf[32], s->version, MIN(4, strlen(s->version))); /* @@ -1585,14 +1598,19 @@ static int scsi_initfn(SCSIDevice *dev) return -1; } -if (!s->serial) { -/* try to fall back to value set with legacy -drive serial=... */ -dinfo = drive_get_by_blockdev(s->qdev.conf.bs); -if (*dinfo->serial) { -s->serial = g_strdup(dinfo->serial); -} -} +dinfo = drive_get_by_blockdev(s->qdev.conf.bs); +/* when no value given try to fall back to */ +/* value set with legacy -drive serial=... 
*/ +if ((!s->serial) && (*dinfo->serial)) { +s->serial = g_strdup(dinfo->serial); +} +if ((!s->vname) && (*dinfo->vname)) { +s->vname = g_strdup(dinfo->vname); +} +if ((!s->pname) && (*dinfo->pname)) { +s->pname = g_strdup(dinfo->pname); +} if (!s->version) { s->version = g_strdup(QEMU_VERSION); } @@ -1788,10 +1806,13 @@ static SCSIRequest *scsi_block_new_request(SCSIDevice *d, uint32_t tag, } #endif -#define DEFINE_SCSI_DISK_PROPERTIES() \ -DEFINE_BLOCK_PROPERTIES(SCSIDiskState, qdev.conf), \ -DEFINE_PROP_STRING("ver", SCSIDiskState, version), \ -DEFINE_PROP_STRING("serial", SCSIDiskState, serial) +#define DEFINE_SCSI_DISK_PROPERTIES() \ +DEFINE_BLOCK_PROPERTIES(SCSIDiskState, qdev.conf),\ +DEFINE_PROP_STRING("ver", SCSIDiskState, version), \ +DEFINE_PROP_STRING("serial", SCSIDiskState, serial), \ +DEFINE_PROP_STRING("vendor_name", SCSIDiskState, vname), \ +DEFINE_PROP_STRING("product_name", SCSIDiskState, pname) + static Property scsi_hd_properties[] = { DEFINE_SCSI_DISK_PROPERTIES(), -- 1.7.7.6
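The order in which the vendor and product strings are resolved in this patch can be summarised as: device property first, then the value stored in DriveInfo, then the built-in "QEMU" default. A minimal sketch of that precedence follows; `demo_pick_name()` is a hypothetical helper written for illustration and is not part of the patch.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the fallback order used in the patch:
 *   1. the value given as a device property (NULL when not set),
 *   2. the value stored with the drive (empty string when not set),
 *   3. the built-in default. */
static const char *demo_pick_name(const char *prop, const char *drive_value,
                                  const char *builtin_default)
{
    if (prop != NULL) {
        return prop;
    }
    if (drive_value != NULL && drive_value[0] != '\0') {
        return drive_value;
    }
    return builtin_default;
}
```

For example, `demo_pick_name(NULL, "", "QEMU")` yields the default, which corresponds to the case where neither the property nor the drive value is given.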
[Qemu-devel] [PATCH v4 0/9] VMXNET3 paravirtual NIC device implementation
This set of patches implements the VMware VMXNET3 paravirtual NIC device.
The device supports all of the device features, including offload capabilities and VLANs.
The device has been tested on the following guest OSes:
    Fedora 15
    Ubuntu 10.4
    CentOS 6.2
    Windows 2008R2
    Windows 2008 64bit
    Windows 2008 32bit
    Windows 2003 64bit
    Windows 2003 32bit

Changes in V4:
    Fixed a few problems uncovered by the NETIO test suite
    Assertion on failure to initialize MSI/MSI-X replaced with a warning
    message and fallback to Legacy/MSI respectively
    Reported-by: Gerhard Wiesinger

    Various coding style adjustments and patch split-up as suggested by Anthony Liguori
    Reported-by: Anthony Liguori

    Live migration support added

Changes in V3:
    Fixed a crash when the net device that is used as network frontend has no
    virtio HDR support.
    Task offloads emulation for cases when the net device that is used as
    network frontend has no virtio HDR support.
    Reported-by: Gerhard Wiesinger

Changes in V2:
    License text changed according to community suggestions
    Standard license header from GPLv2+-licensed QEMU files used

Dmitry Fleytman (9):
  Adding missing flag VIRTIO_NET_HDR_F_DATA_VALID from Linux kernel source tre
    Reformatting comments according to checkpatch.pl requirements
  Adding utility function net_checksum_add_cont() that allows checksum
    calculation of scattered data with odd chunk sizes
  Adding utility function iov_net_csum_add() for iovec checksum calculation
  MSI-X state save/load invocations moved to PCI Device save/load callbacks
    to avoid code duplication in MSI-X-enabled devices that support live
    migration
  Header with various utility functions shared by VMWARE SCSI and network devi
  Various utility functions used by VMWARE network devices
  Packet abstraction used by VMWARE network devices
  VMXNET3 paravirtual device implementation
  VMXNET3 paravirtualized device integration. Interface type "vmxnet3" added.

Makefile.objs |1 + default-configs/pci.mak |1 + hw/pci.c|7 + hw/pci.h|1 + hw/virtio-net.h | 13 +- hw/virtio-pci.c |2 - hw/vmware_utils.h | 122 +++ hw/vmxnet3.c| 2435 +++ hw/vmxnet3.h| 757 +++ hw/vmxnet_debug.h | 121 +++ hw/vmxnet_pkt.c | 1243 hw/vmxnet_pkt.h | 479 ++ hw/vmxnet_utils.c | 165 hw/vmxnet_utils.h | 320 +++ iov.c | 29 + iov.h |3 + net.c |2 +- net/checksum.c | 13 +- net/checksum.h | 14 +- 19 files changed, 5712 insertions(+), 16 deletions(-) create mode 100644 hw/vmware_utils.h create mode 100644 hw/vmxnet3.c create mode 100644 hw/vmxnet3.h create mode 100644 hw/vmxnet_debug.h create mode 100644 hw/vmxnet_pkt.c create mode 100644 hw/vmxnet_pkt.h create mode 100644 hw/vmxnet_utils.c create mode 100644 hw/vmxnet_utils.h -- 1.7.7.6
[Qemu-devel] [PATCH v4 3/9] Adding utility function iov_net_csum_add() for iovec checksum calculation
Signed-off-by: Dmitry Fleytman
Signed-off-by: Yan Vugenfirer
---
 iov.c | 29 +
 iov.h |  3 +++
 2 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/iov.c b/iov.c
index 0f96493..5d4f94c 100644
--- a/iov.c
+++ b/iov.c
@@ -16,6 +16,7 @@
  */

 #include "iov.h"
+#include "net/checksum.h"

 size_t iov_from_buf(struct iovec *iov, unsigned int iov_cnt,
                     const void *buf, size_t iov_off, size_t size)
@@ -130,3 +131,31 @@ void iov_hexdump(const struct iovec *iov, const unsigned int iov_cnt,
         fprintf(fp, "\n");
     }
 }
+
+uint32_t
+iov_net_csum_add(const struct iovec *iov, const unsigned int iov_cnt,
+                 size_t iov_off, size_t size)
+{
+    size_t iovec_off, buf_off;
+    unsigned int i;
+    uint32_t res = 0;
+    uint32_t seq = 0;
+
+    iovec_off = 0;
+    buf_off = 0;
+    for (i = 0; i < iov_cnt && size; i++) {
+        if (iov_off < (iovec_off + iov[i].iov_len)) {
+            size_t len = MIN((iovec_off + iov[i].iov_len) - iov_off , size);
+            void *chunk_buf = iov[i].iov_base + (iov_off - iovec_off);
+
+            res += net_checksum_add_cont(len, chunk_buf, seq);
+            seq += len;
+
+            buf_off += len;
+            iov_off += len;
+            size -= len;
+        }
+        iovec_off += iov[i].iov_len;
+    }
+    return res;
+}
diff --git a/iov.h b/iov.h
index 94d2f78..ba385f5 100644
--- a/iov.h
+++ b/iov.h
@@ -21,3 +21,6 @@ size_t iov_clear(const struct iovec *iov, const unsigned int iov_cnt,
                  size_t iov_off, size_t size);
 void iov_hexdump(const struct iovec *iov, const unsigned int iov_cnt,
                  FILE *fp, const char *prefix, size_t limit);
+uint32_t
+iov_net_csum_add(const struct iovec *iov, const unsigned int iov_cnt,
+                 size_t iov_off, size_t size);
--
1.7.7.6
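The walk over the scatter-gather list in this patch can be tried outside QEMU. The sketch below is a stand-alone approximation, not the patch itself: `demo_csum_add_cont()` and `demo_iov_csum_add()` are hypothetical names, `struct iovec` is taken from the POSIX header `<sys/uio.h>`, and the unused `buf_off` variable is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>   /* struct iovec (POSIX) */

/* Positional 16-bit sum: a byte at an even position in the overall
 * stream goes to the high half of the word, an odd one to the low half.
 * seq is the position of buf[0] within the overall stream. */
static uint32_t demo_csum_add_cont(size_t len, const uint8_t *buf, size_t seq)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        if ((seq + i) & 1) {
            sum += buf[i];
        } else {
            sum += (uint32_t)buf[i] << 8;
        }
    }
    return sum;
}

/* Sum `size` bytes starting `off` bytes into the vector. */
static uint32_t demo_iov_csum_add(const struct iovec *iov, unsigned int cnt,
                                  size_t off, size_t size)
{
    uint32_t res = 0;
    size_t seq = 0;        /* bytes summed so far */
    size_t seg_start = 0;  /* offset of iov[i] within the whole vector */
    unsigned int i;

    for (i = 0; i < cnt && size > 0; i++) {
        size_t seg_end = seg_start + iov[i].iov_len;

        if (off < seg_end) {
            size_t skip = off - seg_start;
            size_t len = iov[i].iov_len - skip;

            if (len > size) {
                len = size;
            }
            res += demo_csum_add_cont(
                len, (const uint8_t *)iov[i].iov_base + skip, seq);
            seq += len;
            off += len;
            size -= len;
        }
        seg_start = seg_end;
    }
    return res;
}
```

Because the running position `seq` is passed to every chunk, the result does not depend on where the chunk boundaries fall, even when a chunk has an odd length.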
[Qemu-devel] [PATCH v4 1/9] Adding missing flag VIRTIO_NET_HDR_F_DATA_VALID from Linux kernel source tre Reformatting comments according to checkpatch.pl requirements
Signed-off-by: Dmitry Fleytman
Signed-off-by: Yan Vugenfirer
---
 hw/virtio-net.h | 13 +++--
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/hw/virtio-net.h b/hw/virtio-net.h
index 4468741..fa3c17b 100644
--- a/hw/virtio-net.h
+++ b/hw/virtio-net.h
@@ -78,13 +78,14 @@ struct virtio_net_config
  * specify GSO or CSUM features, you can simply ignore the header. */
 struct virtio_net_hdr
 {
-#define VIRTIO_NET_HDR_F_NEEDS_CSUM     1       // Use csum_start, csum_offset
+#define VIRTIO_NET_HDR_F_NEEDS_CSUM     1       /* Use csum_start, csum_offset */
+#define VIRTIO_NET_HDR_F_DATA_VALID     2       /* Csum is valid */
     uint8_t flags;
-#define VIRTIO_NET_HDR_GSO_NONE         0       // Not a GSO frame
-#define VIRTIO_NET_HDR_GSO_TCPV4        1       // GSO frame, IPv4 TCP (TSO)
-#define VIRTIO_NET_HDR_GSO_UDP          3       // GSO frame, IPv4 UDP (UFO)
-#define VIRTIO_NET_HDR_GSO_TCPV6        4       // GSO frame, IPv6 TCP
-#define VIRTIO_NET_HDR_GSO_ECN          0x80    // TCP has ECN set
+#define VIRTIO_NET_HDR_GSO_NONE         0       /* Not a GSO frame */
+#define VIRTIO_NET_HDR_GSO_TCPV4        1       /* GSO frame, IPv4 TCP (TSO) */
+#define VIRTIO_NET_HDR_GSO_UDP          3       /* GSO frame, IPv4 UDP (UFO) */
+#define VIRTIO_NET_HDR_GSO_TCPV6        4       /* GSO frame, IPv6 TCP */
+#define VIRTIO_NET_HDR_GSO_ECN          0x80    /* TCP has ECN set */
     uint8_t gso_type;
     uint16_t hdr_len;
     uint16_t gso_size;
--
1.7.7.6
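As an illustration of how these constants are typically consumed, here is a minimal sketch that separates the ECN bit from the GSO kind and tests the new DATA_VALID flag. The constant values are the ones from the header in this patch; the `demo_*` helpers are hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the virtio-net header touched by the patch. */
#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1
#define VIRTIO_NET_HDR_F_DATA_VALID 2
#define VIRTIO_NET_HDR_GSO_NONE     0
#define VIRTIO_NET_HDR_GSO_TCPV4    1
#define VIRTIO_NET_HDR_GSO_UDP      3
#define VIRTIO_NET_HDR_GSO_TCPV6    4
#define VIRTIO_NET_HDR_GSO_ECN      0x80

/* GSO kind with the ECN modifier bit masked off. */
static uint8_t demo_gso_kind(uint8_t gso_type)
{
    return (uint8_t)(gso_type & ~VIRTIO_NET_HDR_GSO_ECN);
}

/* True when the ECN modifier bit is set. */
static bool demo_gso_has_ecn(uint8_t gso_type)
{
    return (gso_type & VIRTIO_NET_HDR_GSO_ECN) != 0;
}

/* True when the sender marked the checksum as already validated. */
static bool demo_csum_is_valid(uint8_t flags)
{
    return (flags & VIRTIO_NET_HDR_F_DATA_VALID) != 0;
}
```

The GSO kinds are plain values, not bit flags, which is why the ECN bit has to be masked off before comparing against them.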
[Qemu-devel] [PATCH v4 9/9] VMXNET3 paravirtualized device integration. Interface type "vmxnet3" added.
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- Makefile.objs |1 + default-configs/pci.mak |1 + hw/pci.c|2 ++ hw/pci.h|1 + net.c |2 +- 5 files changed, 6 insertions(+), 1 deletions(-) diff --git a/Makefile.objs b/Makefile.objs index 226b01d..1366e86 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -284,6 +284,7 @@ hw-obj-$(CONFIG_PCNET_PCI) += pcnet-pci.o hw-obj-$(CONFIG_PCNET_COMMON) += pcnet.o hw-obj-$(CONFIG_E1000_PCI) += e1000.o hw-obj-$(CONFIG_RTL8139_PCI) += rtl8139.o +hw-obj-$(CONFIG_VMXNET3_PCI) += vmxnet3.o vmxnet_utils.o vmxnet_pkt.o hw-obj-$(CONFIG_SMC91C111) += smc91c111.o hw-obj-$(CONFIG_LAN9118) += lan9118.o diff --git a/default-configs/pci.mak b/default-configs/pci.mak index 21e4ccf..f8e6ee1 100644 --- a/default-configs/pci.mak +++ b/default-configs/pci.mak @@ -13,6 +13,7 @@ CONFIG_PCNET_COMMON=y CONFIG_LSI_SCSI_PCI=y CONFIG_RTL8139_PCI=y CONFIG_E1000_PCI=y +CONFIG_VMXNET3_PCI=y CONFIG_IDE_CORE=y CONFIG_IDE_QDEV=y CONFIG_IDE_PCI=y diff --git a/hw/pci.c b/hw/pci.c index 9146d3f..e2b0045 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -1355,6 +1355,7 @@ static const char * const pci_nic_models[] = { "e1000", "pcnet", "virtio", +"vmxnet3", NULL }; @@ -1367,6 +1368,7 @@ static const char * const pci_nic_names[] = { "e1000", "pcnet", "virtio-net-pci", +"vmxnet3", NULL }; diff --git a/hw/pci.h b/hw/pci.h index 4f19fdb..fee8250 100644 --- a/hw/pci.h +++ b/hw/pci.h @@ -60,6 +60,7 @@ #define PCI_DEVICE_ID_VMWARE_NET 0x0720 #define PCI_DEVICE_ID_VMWARE_SCSI0x0730 #define PCI_DEVICE_ID_VMWARE_IDE 0x1729 +#define PCI_DEVICE_ID_VMWARE_VMXNET3 0x07B0 /* Intel (0x8086) */ #define PCI_DEVICE_ID_INTEL_82551IT 0x1209 diff --git a/net.c b/net.c index c34474f..e2f586c 100644 --- a/net.c +++ b/net.c @@ -857,7 +857,7 @@ static const struct { }, { .name = "model", .type = QEMU_OPT_STRING, -.help = "device model (e1000, rtl8139, virtio etc.)", +.help = "device model (e1000, rtl8139, virtio, vmxnet3 etc.)", }, { .name = "addr", .type = QEMU_OPT_STRING, -- 1.7.7.6
[Qemu-devel] [PATCH v4 6/9] Various utility functions used by VMWARE network devices
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/vmxnet_debug.h | 121 hw/vmxnet_utils.c | 165 +++ hw/vmxnet_utils.h | 320 + 3 files changed, 606 insertions(+), 0 deletions(-) create mode 100644 hw/vmxnet_debug.h create mode 100644 hw/vmxnet_utils.c create mode 100644 hw/vmxnet_utils.h diff --git a/hw/vmxnet_debug.h b/hw/vmxnet_debug.h new file mode 100644 index 000..cc3471f --- /dev/null +++ b/hw/vmxnet_debug.h @@ -0,0 +1,121 @@ +/* + * QEMU VMWARE VMXNET* paravirtual NICs - debugging facilities + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Tamir Shomer + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef _QEMU_VMXNET_DEBUG_H +#define _QEMU_VMXNET_DEBUG_H + +#ifdef VMXNET_VERSION_2 +#define VMXNET_DEVICE_NAME "vmxnet" +#elif defined VMXNET_VERSION_3 +#define VMXNET_DEVICE_NAME "vmxnet3" +#else +#error "VMXNET version is not defined" +#endif + +/* #define DEBUG_VMXNET_CB */ +#define DEBUG_VMXNET_WARNINGS +#define DEBUG_VMXNET_ERRORS +/* #define DEBUG_VMXNET_INTERRUPTS */ +/* #define DEBUG_VMXNET_CONFIG */ +/* #define DEBUG_VMXNET_RINGS */ +/* #define DEBUG_VMXNET_PACKETS */ +/* #define DEBUG_VMXNET_SHMEM_ACCESS */ + +#ifdef DEBUG_VMXNET_SHMEM_ACCESS +#define DSHPRINTF(fmt, ...) \ +do { \ +printf("[%s][SH][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__, \ +## __VA_ARGS__); \ +} while (0) +#else +#define DSHPRINTF(fmt, ...) do {} while (0) +#endif + +#ifdef DEBUG_VMXNET_CB +#define DCBPRINTF(fmt, ...) \ +do { \ +printf("[%s][CB][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__, \ +## __VA_ARGS__); \ +} while (0) +#else +#define DCBPRINTF(fmt, ...) do {} while (0) +#endif + +#ifdef DEBUG_VMXNET_PACKETS +#define DPKPRINTF(fmt, ...) 
\ +do { \ +printf("[%s][PK][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__, \ +## __VA_ARGS__); \ +} while (0) +#else +#define DPKPRINTF(fmt, ...) do {} while (0) +#endif + +#ifdef DEBUG_VMXNET_WARNINGS +#define DWRPRINTF(fmt, ...) \ +do { \ +printf("[%s][WR][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__, \ +## __VA_ARGS__); \ +} while (0) +#else +#define DWRPRINTF(fmt, ...) do {} while (0) +#endif + +#ifdef DEBUG_VMXNET_ERRORS +#define DERPRINTF(fmt, ...) \ +do { \ +printf("[%s][ER][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__, \ +## __VA_ARGS__); \ +} while (0) +#else +#define DERPRINTF(fmt, ...) do {} while (0) +#endif + +#ifdef DEBUG_VMXNET_INTERRUPTS +#define DIRPRINTF(fmt, ...) \ +do { \ +printf("[%s][IR][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__, \ +## __VA_ARGS__); \ +} while (0) +#else +#define DIRPRINTF(fmt, ...) do {} while (0) +#endif + +#ifdef DEBUG_VMXNET_CONFIG +#define DCFPRINTF(fmt, ...) \ +do { \ +printf("[%s][CF][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__, \ +## __VA_ARGS__); \ +} while (0) +#else +#define DCFPRINTF(fmt, ...) do {} while (0) +#endif + +#ifdef DEBUG_VMXNET_RINGS +#define DRIPRINTF(fmt, ...) \ +do { \ +printf("[%s][RI][%s]: " fmt "\n", VMXNE
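The per-channel debug macros in this header all share one body and differ only in the two-letter tag. Purely as an illustration, and not as what the patch does, the repetition could be factored through a single helper macro. In the sketch below every `DEMO_*` and `demo_*` name is hypothetical, output goes to a selectable stream instead of printf(), and `## __VA_ARGS__` is the same GNU extension (GCC, Clang) that the patch relies on.

```c
#include <stdio.h>

#define DEMO_DEVICE_NAME "vmxnet3"

/* Comment this out to compile the warnings channel away. */
#define DEMO_DEBUG_WARNINGS

/* Stream that receives the messages; stderr when left NULL. */
static FILE *demo_dbg_out;

/* Common body shared by all channels; tag must be a string literal. */
#define DEMO_DPRINTF(tag, fmt, ...)                                   \
    do {                                                              \
        fprintf(demo_dbg_out ? demo_dbg_out : stderr,                 \
                "[%s][" tag "][%s]: " fmt "\n",                       \
                DEMO_DEVICE_NAME, __func__, ## __VA_ARGS__);          \
    } while (0)

#ifdef DEMO_DEBUG_WARNINGS
#define DEMO_DWRPRINTF(fmt, ...) DEMO_DPRINTF("WR", fmt, ## __VA_ARGS__)
#else
#define DEMO_DWRPRINTF(fmt, ...) do {} while (0)
#endif

/* Example call site. */
static void demo_warn_ring_full(int ring)
{
    DEMO_DWRPRINTF("ring %d is full", ring);
}
```

With the channel enabled, the call site above produces a line of the form `[vmxnet3][WR][demo_warn_ring_full]: ring 3 is full`.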
Re: [Qemu-devel] [PATCH 0/5] VMWare PVSCSI paravirtual device implementation
Avi, We are considering this option as well... Dmitry Fleytman.
[Qemu-devel] [PATCH v4 7/9] Packet abstraction used by VMWARE network devices
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/vmxnet_pkt.c | 1243 +++ hw/vmxnet_pkt.h | 479 + 2 files changed, 1722 insertions(+), 0 deletions(-) create mode 100644 hw/vmxnet_pkt.c create mode 100644 hw/vmxnet_pkt.h diff --git a/hw/vmxnet_pkt.c b/hw/vmxnet_pkt.c new file mode 100644 index 000..5fe2672 --- /dev/null +++ b/hw/vmxnet_pkt.c @@ -0,0 +1,1243 @@ +/* + * QEMU VMWARE VMXNET* paravirtual NICs - packets abstractions + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Tamir Shomer + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include "vmxnet_pkt.h" +#include "vmxnet_utils.h" +#include "iov.h" + +#include "net/checksum.h" + +/*= + *= + * + *TX CODE + * + *= + *===*/ + +enum { +VMXNET_TX_PKT_VHDR_FRAG = 0, +VMXNET_TX_PKT_L2HDR_FRAG, +VMXNET_TX_PKT_L3HDR_FRAG, +VMXNET_TX_PKT_PL_START_FRAG +}; + +/* TX packet private context */ +typedef struct _Vmxnet_TxPkt { +struct virtio_net_hdr virt_hdr; +bool has_virt_hdr; + +struct iovec *vec; + +uint8_t l2_hdr[ETH_MAX_L2_HDR_LEN]; +uint8_t l3_hdr[ETH_MAX_L3_HDR_LEN]; + +uint32_t payload_len; + +uint32_t payload_frags; +uint32_t max_payload_frags; + +uint16_t hdr_len; +eth_pkt_types_e packet_type; +uint16_t l3_proto; +} Vmxnet_TxPkt; + +/** + * + * Function: vmxnet_tx_pkt_init + * + * Desc: Init function for tx packet functionality. + * + * Params: (OUT) pkt - private handle. + * (IN) max_frags - max tx ip fragments. + * (IN) has_virt_hdr - device uses virtio header. 
+ * + * Return: 0 on success, -1 on error + * + * Scope: Global + * + */ +int vmxnet_tx_pkt_init(Vmxnet_TxPkt_h *pkt, uint32_t max_frags, +bool has_virt_hdr) +{ +int rc = 0; + +Vmxnet_TxPkt *p = g_malloc(sizeof *p); +if (!p) { +rc = -1; +goto Exit; +} + +memset(p, 0, sizeof *p); + +p->vec = g_malloc((sizeof *p->vec) * +(max_frags + VMXNET_TX_PKT_PL_START_FRAG)); +if (!p->vec) { +rc = -1; +goto Exit; +} + +p->max_payload_frags = max_frags; +p->has_virt_hdr = has_virt_hdr; +p->vec[VMXNET_TX_PKT_VHDR_FRAG].iov_base = &p->virt_hdr; +p->vec[VMXNET_TX_PKT_VHDR_FRAG].iov_len = +p->has_virt_hdr ? sizeof p->virt_hdr : 0; +p->vec[VMXNET_TX_PKT_L2HDR_FRAG].iov_base = &p->l2_hdr; +p->vec[VMXNET_TX_PKT_L3HDR_FRAG].iov_base = &p->l3_hdr; + +*pkt = p; + +Exit: +if (rc) { +vmxnet_tx_pkt_uninit(p); +} +return rc; +} + +/** + * + * Function: vmxnet_tx_pkt_uninit + * + * Desc: Clean all tx packet resources. + * + * Params: (IN) pkt - private handle. + * + * Return: nothing + * + * Scope: Global + * + */ +void vmxnet_tx_pkt_uninit(Vmxnet_TxPkt_h pkt) +{ +Vmxnet_TxPkt *p = (Vmxnet_TxPkt *)pkt; + +if (p) { +if (p->vec) { +g_free(p->vec); +} + +g_free(p); +} +} + +/** + * + * Function: vmxnet_tx_pkt_update_ip_checksums + * + * Desc: fix ip header fields and calculate checksums needed. + * + * Params: (IN) pkt - private handle. + * + * Return: Nothing. + * + * Scope: Global + * + */ +void vmxnet_tx_pkt_update_ip_checksums(Vmxnet_TxPkt_h pkt) +{ +uint16_t csum; +Vmxnet_TxPkt *p = (Vmxnet_TxPkt *)pkt; +assert(p); +uint8_t gso_type = p->virt_hdr.gso_type & ~VIRTIO_NET_HDR_GSO_ECN; +struct ip_header *ip_hdr; +target_phys_addr_t payload = (target_phys_addr_t) +(uint64_t) p->vec[VMXNET_TX_PKT_PL_START_FRAG].iov_base; + +if (VIRTIO_NET_HDR_GSO_TCPV4 != gso_type && +VIRTIO_NET_HDR_GSO_UDP != gso_type) { +return; +} + +ip_hdr = p->vec[VMXNET_TX_PKT_L3HDR_FRAG].iov_base; + +if (p->payload_len + p->vec[VMXNET_TX_PKT_L3HDR_FRAG].iov_len > +ETH_MAX_IP_
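The fragment layout set up by vmxnet_tx_pkt_init() (slot 0 for the virtio header, slot 1 for L2, slot 2 for L3, payload from slot 3 on) can be sketched without QEMU. The code below is an approximation under stated assumptions: all `Demo*`/`DEMO_*` names and the header sizes are hypothetical, and it uses calloc(), which can return NULL, instead of g_malloc(), which aborts on allocation failure.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/uio.h>   /* struct iovec (POSIX) */

enum {
    DEMO_VHDR_FRAG = 0,
    DEMO_L2HDR_FRAG,
    DEMO_L3HDR_FRAG,
    DEMO_PL_START_FRAG
};

/* Hypothetical sizes, for illustration only. */
#define DEMO_VHDR_LEN   10
#define DEMO_MAX_L2_LEN 18
#define DEMO_MAX_L3_LEN 60

typedef struct DemoTxPkt {
    uint8_t virt_hdr[DEMO_VHDR_LEN];
    uint8_t l2_hdr[DEMO_MAX_L2_LEN];
    uint8_t l3_hdr[DEMO_MAX_L3_LEN];
    struct iovec *vec;
    uint32_t max_payload_frags;
    bool has_virt_hdr;
} DemoTxPkt;

/* Release everything owned by the packet; NULL is accepted. */
static void demo_tx_pkt_uninit(DemoTxPkt *p)
{
    if (p) {
        free(p->vec);
        free(p);
    }
}

/* Allocate a packet with room for max_frags payload fragments.
 * Returns NULL on allocation failure. */
static DemoTxPkt *demo_tx_pkt_init(uint32_t max_frags, bool has_virt_hdr)
{
    DemoTxPkt *p = calloc(1, sizeof *p);

    if (!p) {
        return NULL;
    }
    p->vec = calloc((size_t)max_frags + DEMO_PL_START_FRAG, sizeof *p->vec);
    if (!p->vec) {
        demo_tx_pkt_uninit(p);
        return NULL;
    }
    p->max_payload_frags = max_frags;
    p->has_virt_hdr = has_virt_hdr;
    /* A zero-length first fragment hides the virtio header when unused. */
    p->vec[DEMO_VHDR_FRAG].iov_base = p->virt_hdr;
    p->vec[DEMO_VHDR_FRAG].iov_len = has_virt_hdr ? sizeof p->virt_hdr : 0;
    p->vec[DEMO_L2HDR_FRAG].iov_base = p->l2_hdr;
    p->vec[DEMO_L3HDR_FRAG].iov_base = p->l3_hdr;
    return p;
}
```

Returning the handle directly avoids the out-parameter of the original and keeps the error path to a single cleanup call.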
[Qemu-devel] [PATCH v4 4/9] MSI-X state save/load invocations moved to PCI Device save/load callbacks to avoid code duplication in MSI-X-enabled devices that support live migration
Signed-off-by: Dmitry Fleytman
Signed-off-by: Yan Vugenfirer
---
 hw/pci.c        | 5 +
 hw/virtio-pci.c | 2 --
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index bf046bf..9146d3f 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -31,6 +31,7 @@
 #include "loader.h"
 #include "range.h"
 #include "qmp-commands.h"
+#include "msix.h"

 //#define DEBUG_PCI
 #ifdef DEBUG_PCI
@@ -387,6 +388,8 @@ static int get_pci_irq_state(QEMUFile *f, void *pv, size_t size)
         pci_set_irq_state(s, i, irq_state[i]);
     }

+    msix_load(s, f);
+
     return 0;
 }

@@ -398,6 +401,8 @@ static void put_pci_irq_state(QEMUFile *f, void *pv, size_t size)
     for (i = 0; i < PCI_NUM_PINS; ++i) {
         qemu_put_be32(f, pci_irq_state(s, i));
     }
+
+    msix_save(s, f);
 }

 static VMStateInfo vmstate_info_pci_irq_state = {
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index a0fb7c1..2f3cb1f 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -110,7 +110,6 @@ static void virtio_pci_save_config(void * opaque, QEMUFile *f)
 {
     VirtIOPCIProxy *proxy = opaque;
     pci_device_save(&proxy->pci_dev, f);
-    msix_save(&proxy->pci_dev, f);
     if (msix_present(&proxy->pci_dev))
         qemu_put_be16(f, proxy->vdev->config_vector);
 }
@@ -130,7 +129,6 @@ static int virtio_pci_load_config(void * opaque, QEMUFile *f)
     if (ret) {
         return ret;
     }
-    msix_load(&proxy->pci_dev, f);
     if (msix_present(&proxy->pci_dev)) {
         qemu_get_be16s(f, &proxy->vdev->config_vector);
     } else {
--
1.7.7.6
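The reason the msix_save()/msix_load() calls have to move as a pair is that the load side must consume fields in exactly the order the save side wrote them. The sketch below illustrates that symmetry with a toy byte stream; it does not use the QEMU migration API, and every `Demo*`/`demo_*` name, as well as the single `msix_word` standing in for the MSI-X state, is a hypothetical simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_PINS 4

/* Toy replacement for a migration stream. */
typedef struct DemoStream {
    uint8_t buf[64];
    size_t wpos;
    size_t rpos;
} DemoStream;

typedef struct DemoDev {
    uint32_t irq_state[DEMO_NUM_PINS];
    uint32_t msix_word;   /* stand-in for the real MSI-X state */
} DemoDev;

/* Append a 32-bit value in big-endian byte order. */
static void demo_put_be32(DemoStream *s, uint32_t v)
{
    assert(s->wpos + 4 <= sizeof s->buf);
    s->buf[s->wpos++] = (uint8_t)(v >> 24);
    s->buf[s->wpos++] = (uint8_t)(v >> 16);
    s->buf[s->wpos++] = (uint8_t)(v >> 8);
    s->buf[s->wpos++] = (uint8_t)v;
}

/* Read back the next big-endian 32-bit value. */
static uint32_t demo_get_be32(DemoStream *s)
{
    uint32_t v;

    assert(s->rpos + 4 <= s->wpos);
    v = ((uint32_t)s->buf[s->rpos] << 24) |
        ((uint32_t)s->buf[s->rpos + 1] << 16) |
        ((uint32_t)s->buf[s->rpos + 2] << 8) |
        (uint32_t)s->buf[s->rpos + 3];
    s->rpos += 4;
    return v;
}

/* Save: pin states first, MSI-X stand-in last. */
static void demo_save(DemoStream *s, const DemoDev *d)
{
    int i;

    for (i = 0; i < DEMO_NUM_PINS; i++) {
        demo_put_be32(s, d->irq_state[i]);
    }
    demo_put_be32(s, d->msix_word);
}

/* Load: must mirror demo_save() field by field. */
static void demo_load(DemoStream *s, DemoDev *d)
{
    int i;

    for (i = 0; i < DEMO_NUM_PINS; i++) {
        d->irq_state[i] = demo_get_be32(s);
    }
    d->msix_word = demo_get_be32(s);
}
```

If one side wrote the MSI-X part and the other did not read it, or a device wrote it a second time on its own, every field after it would be decoded from the wrong offset.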
[Qemu-devel] [PATCH v4 2/9] Adding utility function net_checksum_add_cont() that allows checksum calculation of scattered data with odd chunk sizes
Also adds the utility function net_raw_checksum(), which calculates the checksum of a given buffer.

Signed-off-by: Dmitry Fleytman
Signed-off-by: Yan Vugenfirer
---
 net/checksum.c | 13 +++--
 net/checksum.h | 14 +-
 2 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/net/checksum.c b/net/checksum.c
index 9919b2e..4fa5563 100644
--- a/net/checksum.c
+++ b/net/checksum.c
@@ -20,16 +20,17 @@
 #define PROTO_TCP  6
 #define PROTO_UDP 17

-uint32_t net_checksum_add(int len, uint8_t *buf)
+uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq)
 {
     uint32_t sum = 0;
     int i;

-    for (i = 0; i < len; i++) {
-        if (i & 1)
-            sum += (uint32_t)buf[i];
-        else
-            sum += (uint32_t)buf[i] << 8;
+    for (i = seq; i < seq + len; i++) {
+        if (i & 1) {
+            sum += (uint32_t)buf[i - seq];
+        } else {
+            sum += (uint32_t)buf[i - seq] << 8;
+        }
     }
     return sum;
 }
diff --git a/net/checksum.h b/net/checksum.h
index 1f05298..171924c 100644
--- a/net/checksum.h
+++ b/net/checksum.h
@@ -20,10 +20,22 @@

 #include <stdint.h>

-uint32_t net_checksum_add(int len, uint8_t *buf);
+uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq);
 uint16_t net_checksum_finish(uint32_t sum);
 uint16_t net_checksum_tcpudp(uint16_t length, uint16_t proto,
                              uint8_t *addrs, uint8_t *buf);
 void net_checksum_calculate(uint8_t *data, int length);

+static inline uint32_t
+net_checksum_add(int len, uint8_t *buf)
+{
+    return net_checksum_add_cont(len, buf, 0);
+}
+
+static inline uint16_t
+net_raw_checksum(uint8_t *data, int length)
+{
+    return net_checksum_finish(net_checksum_add(length, data));
+}
+
 #endif /* QEMU_NET_CHECKSUM_H */
--
1.7.7.6
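The effect of the `seq` argument can be checked on a small IPv4 header: summing the header in one piece and summing it in two pieces split at an odd offset give the same result, provided the second call is told where its chunk starts. This is a stand-alone sketch; `demo_csum_add_cont()` follows the loop from the patch, while `demo_csum_finish()` is a hypothetical stand-in for net_checksum_finish() that folds the carries and complements the sum in the usual Internet-checksum manner.

```c
#include <stdint.h>

/* Same loop as net_checksum_add_cont() in the patch: seq is the position
 * of buf[0] in the overall data, so the parity test stays correct for a
 * chunk that starts at an odd offset. */
static uint32_t demo_csum_add_cont(int len, const uint8_t *buf, int seq)
{
    uint32_t sum = 0;
    int i;

    for (i = seq; i < seq + len; i++) {
        if (i & 1) {
            sum += (uint32_t)buf[i - seq];
        } else {
            sum += (uint32_t)buf[i - seq] << 8;
        }
    }
    return sum;
}

/* Fold the carries back into 16 bits and take the one's complement. */
static uint16_t demo_csum_finish(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t)~sum;
}

/* Checksum of a buffer delivered as two chunks split at `split` bytes. */
static uint16_t demo_csum_two_chunks(const uint8_t *buf, int len, int split)
{
    uint32_t sum = demo_csum_add_cont(split, buf, 0);

    sum += demo_csum_add_cont(len - split, buf + split, split);
    return demo_csum_finish(sum);
}
```

For a 20-byte header with a zeroed checksum field, `demo_csum_two_chunks(hdr, 20, 7)` gives the same value as a single pass over all 20 bytes.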
[Qemu-devel] [PATCH v4 5/9] Header with various utility functions shared by VMWARE SCSI and network devi
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/vmware_utils.h | 122 + 1 files changed, 122 insertions(+), 0 deletions(-) create mode 100644 hw/vmware_utils.h diff --git a/hw/vmware_utils.h b/hw/vmware_utils.h new file mode 100644 index 000..a86e691 --- /dev/null +++ b/hw/vmware_utils.h @@ -0,0 +1,122 @@ +/* + * QEMU VMWARE paravirtual devices - auxiliary code + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef VMWARE_UTILS_H +#define VMWARE_UTILS_H + +/* Shared memory access functions with byte swap support */ +static inline void +vmw_shmem_read(target_phys_addr_t addr, void *buf, int len) +{ +DSHPRINTF("SHMEM r: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_read(addr, buf, len); +} + +static inline void +vmw_shmem_write(target_phys_addr_t addr, void *buf, int len) +{ +DSHPRINTF("SHMEM w: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_write(addr, buf, len); +} + +static inline void +vmw_shmem_rw(target_phys_addr_t addr, void *buf, int len, int is_write) +{ +DSHPRINTF("SHMEM r/w: %" PRIx64 ", len: %d (to %p), is write: %d", + addr, len, buf, is_write); + +cpu_physical_memory_rw(addr, buf, len, is_write); +} + +static inline void +vmw_shmem_set(target_phys_addr_t addr, uint8 val, int len) +{ +int i; +DSHPRINTF("SHMEM set: %" PRIx64 ", len: %d (value 0x%X)", addr, len, val); + +for (i = 0; i < len; i++) { +cpu_physical_memory_write(addr + i, &val, 1); +} +} + +static inline uint32_t +vmw_shmem_ld8(target_phys_addr_t addr) +{ +uint8_t res = ldub_phys(addr); +DSHPRINTF("SHMEM load8: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st8(target_phys_addr_t addr, uint8_t value) +{ 
+DSHPRINTF("SHMEM store8: %" PRIx64 " (value 0x%X)", addr, value); +stb_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld16(target_phys_addr_t addr) +{ +uint16_t res = lduw_le_phys(addr); +DSHPRINTF("SHMEM load16: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st16(target_phys_addr_t addr, uint16_t value) +{ +DSHPRINTF("SHMEM store16: %" PRIx64 " (value 0x%X)", addr, value); +stw_le_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld32(target_phys_addr_t addr) +{ +uint32_t res = ldl_le_phys(addr); +DSHPRINTF("SHMEM load32: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st32(target_phys_addr_t addr, uint32_t value) +{ +DSHPRINTF("SHMEM store32: %" PRIx64 " (value 0x%X)", addr, value); +stl_le_phys(addr, value); +} + +static inline uint64_t +vmw_shmem_ld64(target_phys_addr_t addr) +{ +uint64_t res = ldq_le_phys(addr); +DSHPRINTF("SHMEM load64: %" PRIx64 " (value %" PRIx64 ")", addr, res); +return res; +} + +static inline void +vmw_shmem_st64(target_phys_addr_t addr, uint64_t value) +{ +DSHPRINTF("SHMEM store64: %" PRIx64 " (value %" PRIx64 ")", addr, value); +stq_le_phys(addr, value); +} + +/* MACROS for simplification of operations on array-style registers */ +#define IS_MULTIREG_ADDR(addr, base, cnt, regsize) \ +(((addr) >= (base)) && ((addr) < (base) + (cnt) * (regsize))) + +#define MULTIREG_IDX_BY_ADDR(addr, base, regsize) \ +(((addr) - (base)) / (regsize)) + +#endif -- 1.7.7.6
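The "byte swap support" mentioned in the header comment comes from the `*_le_phys` helpers, which access guest memory in little-endian order whatever the host byte order is. The sketch below shows the same idea on an ordinary byte buffer; `demo_ld32_le()` and `demo_st32_le()` are hypothetical stand-ins and do not touch guest memory.

```c
#include <stdint.h>

/* Assemble a 32-bit value from four bytes stored least significant
 * byte first; the result is the same on any host byte order. */
static uint32_t demo_ld32_le(const uint8_t *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* Store a 32-bit value least significant byte first. */
static void demo_st32_le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}
```

Building the value with shifts instead of casting the buffer pointer also avoids any alignment requirement on the buffer.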
Re: [Qemu-devel] [PATCH v4 4/9] MSI-X state save/load invocations moved to PCI Device save/load callbacks to avoid code duplication in MSI-X-enabled devices that support live migration
Michael,

Great. I believe a higher level API is what is really needed here.
I'll revert this patch and move the msix_load/msix_save invocations into the device code.

Thanks.

On Fri, Mar 16, 2012 at 1:00 AM, Michael S. Tsirkin wrote:
> On Thu, Mar 15, 2012 at 11:09:03PM +0200, Dmitry Fleytman wrote:
>> Signed-off-by: Dmitry Fleytman
>> Signed-off-by: Yan Vugenfirer
>
> I'm working on a higher level API that will
> handle all capabilities. For now, pls just put
> these calls in your device.
>
>
>> ---
>>  hw/pci.c        | 5 +
>>  hw/virtio-pci.c | 2 --
>>  2 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/pci.c b/hw/pci.c
>> index bf046bf..9146d3f 100644
>> --- a/hw/pci.c
>> +++ b/hw/pci.c
>> @@ -31,6 +31,7 @@
>>  #include "loader.h"
>>  #include "range.h"
>>  #include "qmp-commands.h"
>> +#include "msix.h"
>>
>>  //#define DEBUG_PCI
>>  #ifdef DEBUG_PCI
>> @@ -387,6 +388,8 @@ static int get_pci_irq_state(QEMUFile *f, void *pv,
>> size_t size)
>>          pci_set_irq_state(s, i, irq_state[i]);
>>      }
>>
>> +    msix_load(s, f);
>> +
>>      return 0;
>>  }
>>
>> @@ -398,6 +401,8 @@ static void put_pci_irq_state(QEMUFile *f, void *pv,
>> size_t size)
>>      for (i = 0; i < PCI_NUM_PINS; ++i) {
>>          qemu_put_be32(f, pci_irq_state(s, i));
>>      }
>> +
>> +    msix_save(s, f);
>>  }
>>
>>  static VMStateInfo vmstate_info_pci_irq_state = {
>> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
>> index a0fb7c1..2f3cb1f 100644
>> --- a/hw/virtio-pci.c
>> +++ b/hw/virtio-pci.c
>> @@ -110,7 +110,6 @@ static void virtio_pci_save_config(void * opaque,
>> QEMUFile *f)
>>  {
>>      VirtIOPCIProxy *proxy = opaque;
>>      pci_device_save(&proxy->pci_dev, f);
>> -    msix_save(&proxy->pci_dev, f);
>>      if (msix_present(&proxy->pci_dev))
>>          qemu_put_be16(f, proxy->vdev->config_vector);
>>  }
>> @@ -130,7 +129,6 @@ static int virtio_pci_load_config(void * opaque,
>> QEMUFile *f)
>>      if (ret) {
>>          return ret;
>>      }
>> -    msix_load(&proxy->pci_dev, f);
>>      if (msix_present(&proxy->pci_dev)) {
>>          qemu_get_be16s(f, &proxy->vdev->config_vector);
>>      } else {
>> --
>> 1.7.7.6
Re: [Qemu-devel] [PATCH 5/5] PVSCSI paravirtualized device integration Bus type "pvscsi" added.
Good point. Fixed. Thanks. On Thu, Mar 15, 2012 at 11:46 AM, Paolo Bonzini wrote: > Il 15/03/2012 10:02, Dmitry Fleytman ha scritto: >> Sample command line for pvscsi-based disk is: >> -drive file=image.raw,if=none,cache=off,id=pvscsi1 \ >> -device pvscsi,id=pvscsi -device scsi-disk,drive=pvscsi1,bus=pvscsi.0 \ >> >> Signed-off-by: Dmitry Fleytman >> Signed-off-by: Yan Vugenfirer >> --- >> Makefile.objs | 1 + >> blockdev.c | 12 >> blockdev.h | 10 +- >> default-configs/pci.mak | 1 + >> hw/pc.c | 5 + >> hw/pci-hotplug.c | 7 ++- >> hw/pci.h | 1 + >> hw/scsi-bus.c | 14 -- >> hw/scsi.h | 1 + >> 9 files changed, 44 insertions(+), 8 deletions(-) >> >> diff --git a/Makefile.objs b/Makefile.objs >> index 226b01d..bf0a351 100644 >> --- a/Makefile.objs >> +++ b/Makefile.objs >> @@ -304,6 +304,7 @@ hw-obj-$(CONFIG_AHCI) += ide/ich.o >> >> # SCSI layer >> hw-obj-$(CONFIG_LSI_SCSI_PCI) += lsi53c895a.o >> +hw-obj-$(CONFIG_PVSCSI_SCSI_PCI) += pvscsi.o >> hw-obj-$(CONFIG_ESP) += esp.o >> >> hw-obj-y += dma-helpers.o sysbus.o isa-bus.o >> diff --git a/blockdev.c b/blockdev.c >> index 1a500b8..41e8efd 100644 >> --- a/blockdev.c >> +++ b/blockdev.c >> @@ -32,6 +32,7 @@ static const char *const if_name[IF_COUNT] = { >> [IF_SD] = "sd", >> [IF_VIRTIO] = "virtio", >> [IF_XEN] = "xen", >> + [IF_PVSCSI] = "pvscsi", >> }; >> >> static const int if_max_devs[IF_COUNT] = { >> @@ -433,7 +434,8 @@ DriveInfo *drive_init(QemuOpts *opts, int >> default_to_scsi) >> >> on_write_error = BLOCK_ERR_STOP_ENOSPC; >> if ((buf = qemu_opt_get(opts, "werror")) != NULL) { >> - if (type != IF_IDE && type != IF_SCSI && type != IF_VIRTIO && type >> != IF_NONE) { >> + if (type != IF_IDE && type != IF_SCSI && type != IF_VIRTIO && >> + type != IF_PVSCSI && type != IF_NONE) { >> error_report("werror is not supported by this bus type"); >> return NULL; >> } >> @@ -446,7 +448,8 @@ DriveInfo *drive_init(QemuOpts *opts, int >> default_to_scsi) >> >> on_read_error = BLOCK_ERR_REPORT; >> if ((buf = qemu_opt_get(opts, 
"rerror")) != NULL) { >> - if (type != IF_IDE && type != IF_VIRTIO && type != IF_SCSI && type >> != IF_NONE) { >> + if (type != IF_IDE && type != IF_VIRTIO && type != IF_SCSI && >> + type != IF_PVSCSI && type != IF_NONE) { >> error_report("rerror is not supported by this bus type"); >> return NULL; >> } >> @@ -516,7 +519,7 @@ DriveInfo *drive_init(QemuOpts *opts, int >> default_to_scsi) >> } else { >> /* no id supplied -> create one */ >> dinfo->id = g_malloc0(32); >> - if (type == IF_IDE || type == IF_SCSI) >> + if (type == IF_IDE || type == IF_SCSI || type == IF_PVSCSI) >> mediastr = (media == MEDIA_CDROM) ? "-cd" : "-hd"; >> if (max_devs) >> snprintf(dinfo->id, 32, "%s%i%s%i", >> @@ -545,6 +548,7 @@ DriveInfo *drive_init(QemuOpts *opts, int >> default_to_scsi) >> case IF_IDE: >> case IF_SCSI: >> case IF_XEN: >> + case IF_PVSCSI: >> case IF_NONE: >> switch(media) { >> case MEDIA_DISK: >> @@ -596,7 +600,7 @@ DriveInfo *drive_init(QemuOpts *opts, int >> default_to_scsi) >> ro = 1; >> } else if (ro == 1) { >> if (type != IF_SCSI && type != IF_VIRTIO && type != IF_FLOPPY && >> - type != IF_NONE && type != IF_PFLASH) { >> + type != IF_PVSCSI && type != IF_NONE && type != IF_PFLASH) { >> error_report("readonly not supported by this bus type"); >> goto err; >> } >> diff --git a/blockdev.h b/blockdev.h >> index 1813c53..7c531aa 100644 >> --- a/blockdev.h >> +++ b/blockdev.h >> @@ -24,7 +24,15 @@ void blockdev_auto_del(BlockDriverState *bs); >> typedef enum { >> IF_DEFAULT = -1, /* for use with drive_
Re: [Qemu-devel] [PATCH 1/5] Utility function strpadcpy() added
Wow! Someone still remembers Pascal. It has been a long time since I last heard about it.
I think I still have some code I wrote for old DOS TurboPascal with TurboVision, maybe I'll publish it somewhere :)

Anyway, I believe the difference is rather minor, but let it be...
I have replaced my implementation with yours.

On Thu, Mar 15, 2012 at 11:53 AM, Paolo Bonzini wrote:
> On 15/03/2012 10:02, Dmitry Fleytman wrote:
>> Signed-off-by: Dmitry Fleytman
>> Signed-off-by: Yan Vugenfirer
>> ---
>>  cutils.c      | 13 +
>>  qemu-common.h | 1 +
>>  2 files changed, 14 insertions(+), 0 deletions(-)
>>
>> diff --git a/cutils.c b/cutils.c
>> index af308cd..0df7fdf 100644
>> --- a/cutils.c
>> +++ b/cutils.c
>> @@ -27,6 +27,19 @@
>>
>>  #include "qemu_socket.h"
>>
>> +void strpadcpy(char *buf, int buf_size, const char *str, char pad)
>> +{
>> +    int i;
>> +    int has_src_data = TRUE;
>> +
>> +    for (i = 0; i < buf_size; i++) {
>> +        if ((has_src_data) && (0 == str[i])) {
>> +            has_src_data = FALSE;
>> +        }
>> +        buf[i] = has_src_data ? str[i] : pad;
>> +    }
>
> No parentheses around simple if conditions, this is not Pascal. :) But
> since you're at it, why not the simpler:
>
>     int len = qemu_strnlen(str, buf_size);
>     memcpy(buf, str, len);
>     memset(buf + len, pad, buf_size - len);
>
>> +}
>> +
>>  void pstrcpy(char *buf, int buf_size, const char *str)
>>  {
>>      int c;
>> diff --git a/qemu-common.h b/qemu-common.h
>> index b0fdf5c..fdd3d17 100644
>> --- a/qemu-common.h
>> +++ b/qemu-common.h
>> @@ -134,6 +134,7 @@ int qemu_timedate_diff(struct tm *tm);
>>
>>  /* cutils.c */
>>  void pstrcpy(char *buf, int buf_size, const char *str);
>> +void strpadcpy(char *buf, int buf_size, const char *str, char pad);
>>  char *pstrcat(char *buf, int buf_size, const char *s);
>>  int strstart(const char *str, const char *val, const char **ptr);
>>  int stristart(const char *str, const char *val, const char **ptr);
>
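The shorter variant suggested in the review can be written against the standard C library alone. The following is a minimal sketch under that assumption: `demo_strpadcpy()` is a hypothetical name, and memchr() is used to find the bounded string length in place of qemu_strnlen().

```c
#include <string.h>

/* Copy at most buf_size bytes of str into buf and fill the remainder
 * with pad. The result is a fixed-width field and is NOT NUL-terminated. */
static void demo_strpadcpy(char *buf, size_t buf_size, const char *str,
                           char pad)
{
    /* Bounded length: stop at the terminator or at buf_size bytes. */
    const char *nul = memchr(str, '\0', buf_size);
    size_t len = nul ? (size_t)(nul - str) : buf_size;

    memcpy(buf, str, len);
    memset(buf + len, pad, buf_size - len);
}
```

A short source string is padded on the right and a long one is silently truncated, which matches the fixed-width, space-padded vendor and product fields this helper was introduced for.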
Re: [Qemu-devel] [PATCH 3/5] Header with various utility functions shared by VMWARE SCSI and network devices
On Thu, Mar 15, 2012 at 11:56 AM, Paolo Bonzini wrote: > Il 15/03/2012 10:02, Dmitry Fleytman ha scritto: >> Signed-off-by: Dmitry Fleytman >> Signed-off-by: Yan Vugenfirer >> --- >> hw/vmware_utils.h | 122 >> + >> 1 files changed, 122 insertions(+), 0 deletions(-) >> create mode 100644 hw/vmware_utils.h >> >> diff --git a/hw/vmware_utils.h b/hw/vmware_utils.h >> new file mode 100644 >> index 000..a86e691 >> --- /dev/null >> +++ b/hw/vmware_utils.h >> @@ -0,0 +1,122 @@ >> +/* >> + * QEMU VMWARE paravirtual devices - auxiliary code >> + * >> + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) >> + * >> + * Developed by Daynix Computing LTD (http://www.daynix.com) >> + * >> + * Authors: >> + * Dmitry Fleytman >> + * Yan Vugenfirer >> + * >> + * This work is licensed under the terms of the GNU GPL, version 2 or later. >> + * See the COPYING file in the top-level directory. >> + * >> + */ >> + >> +#ifndef VMWARE_UTILS_H >> +#define VMWARE_UTILS_H >> + >> +/* Shared memory access functions with byte swap support */ >> +static inline void >> +vmw_shmem_read(target_phys_addr_t addr, void *buf, int len) >> +{ >> + DSHPRINTF("SHMEM r: %" PRIx64 ", len: %d to %p", addr, len, buf); > > Please add an #ifndef DSHPRINTF that defines it to nothing. Done. 
> >> + cpu_physical_memory_read(addr, buf, len); >> +} >> + >> +static inline void >> +vmw_shmem_write(target_phys_addr_t addr, void *buf, int len) >> +{ >> + DSHPRINTF("SHMEM w: %" PRIx64 ", len: %d to %p", addr, len, buf); >> + cpu_physical_memory_write(addr, buf, len); >> +} >> + >> +static inline void >> +vmw_shmem_rw(target_phys_addr_t addr, void *buf, int len, int is_write) >> +{ >> + DSHPRINTF("SHMEM r/w: %" PRIx64 ", len: %d (to %p), is write: %d", >> + addr, len, buf, is_write); >> + >> + cpu_physical_memory_rw(addr, buf, len, is_write); >> +} >> + >> +static inline void >> +vmw_shmem_set(target_phys_addr_t addr, uint8 val, int len) >> +{ >> + int i; >> + DSHPRINTF("SHMEM set: %" PRIx64 ", len: %d (value 0x%X)", addr, len, >> val); >> + >> + for (i = 0; i < len; i++) { >> + cpu_physical_memory_write(addr + i, &val, 1); >> + } >> +} >> + >> +static inline uint32_t >> +vmw_shmem_ld8(target_phys_addr_t addr) >> +{ >> + uint8_t res = ldub_phys(addr); >> + DSHPRINTF("SHMEM load8: %" PRIx64 " (value 0x%X)", addr, res); >> + return res; >> +} >> + >> +static inline void >> +vmw_shmem_st8(target_phys_addr_t addr, uint8_t value) >> +{ >> + DSHPRINTF("SHMEM store8: %" PRIx64 " (value 0x%X)", addr, value); >> + stb_phys(addr, value); >> +} >> + >> +static inline uint32_t >> +vmw_shmem_ld16(target_phys_addr_t addr) >> +{ >> + uint16_t res = lduw_le_phys(addr); >> + DSHPRINTF("SHMEM load16: %" PRIx64 " (value 0x%X)", addr, res); >> + return res; >> +} >> + >> +static inline void >> +vmw_shmem_st16(target_phys_addr_t addr, uint16_t value) >> +{ >> + DSHPRINTF("SHMEM store16: %" PRIx64 " (value 0x%X)", addr, value); >> + stw_le_phys(addr, value); >> +} >> + >> +static inline uint32_t >> +vmw_shmem_ld32(target_phys_addr_t addr) >> +{ >> + uint32_t res = ldl_le_phys(addr); >> + DSHPRINTF("SHMEM load32: %" PRIx64 " (value 0x%X)", addr, res); >> + return res; >> +} >> + >> +static inline void >> +vmw_shmem_st32(target_phys_addr_t addr, uint32_t value) >> +{ >> + 
DSHPRINTF("SHMEM store32: %" PRIx64 " (value 0x%X)", addr, value); >> + stl_le_phys(addr, value); >> +} >> + >> +static inline uint64_t >> +vmw_shmem_ld64(target_phys_addr_t addr) >> +{ >> + uint64_t res = ldq_le_phys(addr); >> + DSHPRINTF("SHMEM load64: %" PRIx64 " (value %" PRIx64 ")", addr, res); >> + return res; >> +} >> + >> +static inline void >> +vmw_shmem_st64(target_phys_addr_t addr, uint64_t value) >> +{ >> + DSHPRINTF("SHMEM store64: %" PRIx64 " (value %" PRIx64 ")", addr, >> value); >> + stq_le_phys(addr, value); >> +} >> + >> +/* MACROS for simplification of operations on array-style registers */ >> +#define IS_MULTIREG_ADDR(addr, base, cnt, regsize) \ >> + (((addr) >= (base)) && ((addr) < (base) + (cnt) * (regsize))) >> + >> +#define MULTIREG_IDX_BY_ADDR(addr, base, regsize) \ >> + (((addr) - (base)) / (regsize)) >> + >> +#endif > > Otherwise looks good. > > Paolo
Re: [Qemu-devel] [PATCH 2/5] Vendor name and product name parameters for SCSI devices Options "vendor_name" and "product_name" added for SCSI disks.
Unused stuff cleaned out. On Thu, Mar 15, 2012 at 11:55 AM, Paolo Bonzini wrote: > Il 15/03/2012 10:02, Dmitry Fleytman ha scritto: >> Sample command line is: >> >> -drive file=image.raw,if=none,cache=off,id=scsi1 \ >> -device lsi,id=scsi -device >> scsi-disk,drive=scsi1,bus=scsi.0,product_name="VENDOR SCSI >> DISK",vendor_name="[VENDOR]" \ >> >> Signed-off-by: Dmitry Fleytman >> Signed-off-by: Yan Vugenfirer >> --- >> blockdev.h | 6 +- >> hw/scsi-disk.c | 51 --- >> 2 files changed, 41 insertions(+), 16 deletions(-) >> >> diff --git a/blockdev.h b/blockdev.h >> index 260e16b..1813c53 100644 >> --- a/blockdev.h >> +++ b/blockdev.h >> @@ -17,7 +17,9 @@ >> void blockdev_mark_auto_del(BlockDriverState *bs); >> void blockdev_auto_del(BlockDriverState *bs); >> >> -#define BLOCK_SERIAL_STRLEN 20 >> +#define BLOCK_SERIAL_STRLEN 20 >> +#define BLOCK_VENDOR_STRLEN 8 >> +#define BLOCK_PRODUCT_STRLEN 16 >> >> typedef enum { >> IF_DEFAULT = -1, /* for use with drive_add() only */ >> @@ -37,6 +39,8 @@ struct DriveInfo { >> int media_cd; >> QemuOpts *opts; >> char serial[BLOCK_SERIAL_STRLEN + 1]; >> + char vname[BLOCK_VENDOR_STRLEN + 1]; >> + char pname[BLOCK_PRODUCT_STRLEN + 1]; >> QTAILQ_ENTRY(DriveInfo) next; >> int refcount; >> }; > > Unused. > >> diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c >> index add399e..0a12ea2 100644 >> --- a/hw/scsi-disk.c >> +++ b/hw/scsi-disk.c >> @@ -70,6 +70,8 @@ struct SCSIDiskState >> QEMUBH *bh; >> char *version; >> char *serial; >> + char *vname; >> + char *pname; >> bool tray_open; >> bool tray_locked; >> }; >> @@ -566,12 +568,23 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, >> uint8_t *outbuf) >> >> outbuf[0] = s->qdev.type & 0x1f; >> outbuf[1] = s->removable ? 
0x80 : 0; >> - if (s->qdev.type == TYPE_ROM) { >> - memcpy(&outbuf[16], "QEMU CD-ROM ", 16); >> + >> + if (NULL != s->pname) { >> + strpadcpy((char *) &outbuf[16], 16, s->pname, ' '); >> } else { >> - memcpy(&outbuf[16], "QEMU HARDDISK ", 16); >> + if (s->qdev.type == TYPE_ROM) { >> + memcpy(&outbuf[16], "QEMU CD-ROM ", 16); >> + } else { >> + memcpy(&outbuf[16], "QEMU HARDDISK ", 16); >> + } >> } >> - memcpy(&outbuf[8], "QEMU ", 8); >> + >> + if (NULL != s->vname) { >> + strpadcpy((char *) &outbuf[8], 8, s->vname, ' '); >> + } else { >> + memcpy(&outbuf[8], "QEMU ", 8); >> + } >> + >> memset(&outbuf[32], 0, 4); >> memcpy(&outbuf[32], s->version, MIN(4, strlen(s->version))); >> /* >> @@ -1585,14 +1598,19 @@ static int scsi_initfn(SCSIDevice *dev) >> return -1; >> } >> >> - if (!s->serial) { >> - /* try to fall back to value set with legacy -drive serial=... */ >> - dinfo = drive_get_by_blockdev(s->qdev.conf.bs); >> - if (*dinfo->serial) { >> - s->serial = g_strdup(dinfo->serial); >> - } >> - } >> + dinfo = drive_get_by_blockdev(s->qdev.conf.bs); >> >> + /* when no value given try to fall back to */ >> + /* value set with legacy -drive serial=... */ >> + if ((!s->serial) && (*dinfo->serial)) { >> + s->serial = g_strdup(dinfo->serial); >> + } > > No need to change the way the serial is handled, because you don't need > dinfo for vname/pname. > >> + if ((!s->vname) && (*dinfo->vname)) { >> + s->vname = g_strdup(dinfo->vname); >> + } >> + if ((!s->pname) && (*dinfo->pname)) { >> + s->pname = g_strdup(dinfo->pname); >> + } > > (Also, no parentheses around simple conditions). 
> >> if (!s->version) { >> s->version = g_strdup(QEMU_VERSION); >> } >> @@ -1788,10 +1806,13 @@ static SCSIRequest >> *scsi_block_new_request(SCSIDevice *d, uint32_t tag, >> } >> #endif >> >> -#define DEFINE_SCSI_DISK_PROPERTIES() \ >> - DEFINE_BLOCK_PROPERTIES(SCSIDiskState, qdev.conf), \ >> - DEFINE_PROP_STRING("ver", SCSIDiskState, version), \ >> - DEFINE_PROP_STRING("serial", SCSIDiskState, serial) >> +#define DEFINE_SCSI_DISK_PROPERTIES() \ >> + DEFINE_BLOCK_PROPERTIES(SCSIDiskState, qdev.conf), \ >> + DEFINE_PROP_STRING("ver", SCSIDiskState, version), \ >> + DEFINE_PROP_STRING("serial", SCSIDiskState, serial), \ >> + DEFINE_PROP_STRING("vendor_name", SCSIDiskState, vname), \ >> + DEFINE_PROP_STRING("product_name", SCSIDiskState, pname) >> + >> >> static Property scsi_hd_properties[] = { >> DEFINE_SCSI_DISK_PROPERTIES(), > > Paolo
Re: [Qemu-devel] [PATCH v4 9/9] VMXNET3 paravirtualized device integration. Interface type "vmxnet3" added.
Fixed. On Fri, Mar 16, 2012 at 1:35 PM, Paolo Bonzini wrote: > Il 15/03/2012 22:09, Dmitry Fleytman ha scritto: >> Signed-off-by: Dmitry Fleytman >> Signed-off-by: Yan Vugenfirer >> --- >> Makefile.objs | 1 + >> default-configs/pci.mak | 1 + >> hw/pci.c | 2 ++ >> hw/pci.h | 1 + >> net.c | 2 +- >> 5 files changed, 6 insertions(+), 1 deletions(-) >> >> diff --git a/Makefile.objs b/Makefile.objs >> index 226b01d..1366e86 100644 >> --- a/Makefile.objs >> +++ b/Makefile.objs >> @@ -284,6 +284,7 @@ hw-obj-$(CONFIG_PCNET_PCI) += pcnet-pci.o >> hw-obj-$(CONFIG_PCNET_COMMON) += pcnet.o >> hw-obj-$(CONFIG_E1000_PCI) += e1000.o >> hw-obj-$(CONFIG_RTL8139_PCI) += rtl8139.o >> +hw-obj-$(CONFIG_VMXNET3_PCI) += vmxnet3.o vmxnet_utils.o vmxnet_pkt.o >> >> hw-obj-$(CONFIG_SMC91C111) += smc91c111.o >> hw-obj-$(CONFIG_LAN9118) += lan9118.o >> diff --git a/default-configs/pci.mak b/default-configs/pci.mak >> index 21e4ccf..f8e6ee1 100644 >> --- a/default-configs/pci.mak >> +++ b/default-configs/pci.mak >> @@ -13,6 +13,7 @@ CONFIG_PCNET_COMMON=y >> CONFIG_LSI_SCSI_PCI=y >> CONFIG_RTL8139_PCI=y >> CONFIG_E1000_PCI=y >> +CONFIG_VMXNET3_PCI=y >> CONFIG_IDE_CORE=y >> CONFIG_IDE_QDEV=y >> CONFIG_IDE_PCI=y > > These parts should be included in part 8. 
> > Paolo > >> diff --git a/hw/pci.c b/hw/pci.c >> index 9146d3f..e2b0045 100644 >> --- a/hw/pci.c >> +++ b/hw/pci.c >> @@ -1355,6 +1355,7 @@ static const char * const pci_nic_models[] = { >> "e1000", >> "pcnet", >> "virtio", >> + "vmxnet3", >> NULL >> }; >> >> @@ -1367,6 +1368,7 @@ static const char * const pci_nic_names[] = { >> "e1000", >> "pcnet", >> "virtio-net-pci", >> + "vmxnet3", >> NULL >> }; >> >> diff --git a/hw/pci.h b/hw/pci.h >> index 4f19fdb..fee8250 100644 >> --- a/hw/pci.h >> +++ b/hw/pci.h >> @@ -60,6 +60,7 @@ >> #define PCI_DEVICE_ID_VMWARE_NET 0x0720 >> #define PCI_DEVICE_ID_VMWARE_SCSI 0x0730 >> #define PCI_DEVICE_ID_VMWARE_IDE 0x1729 >> +#define PCI_DEVICE_ID_VMWARE_VMXNET3 0x07B0 >> >> /* Intel (0x8086) */ >> #define PCI_DEVICE_ID_INTEL_82551IT 0x1209 >> diff --git a/net.c b/net.c >> index c34474f..e2f586c 100644 >> --- a/net.c >> +++ b/net.c >> @@ -857,7 +857,7 @@ static const struct { >> }, { >> .name = "model", >> .type = QEMU_OPT_STRING, >> - .help = "device model (e1000, rtl8139, virtio etc.)", >> + .help = "device model (e1000, rtl8139, virtio, vmxnet3 >> etc.)", >> }, { >> .name = "addr", >> .type = QEMU_OPT_STRING, >
[Qemu-devel] [PATCH 0/5 V2] VMWare PVSCSI paravirtual device implementation
Below is an implementation of the VMWare PVSCSI device, together with command line parameters to configure the vendor name and product name for SCSI storage. The latter is needed to make PVSCSI storage devices look exactly as they do on VMWare hypervisors. With this and the VMWARE3 patches, the V2V migration problem for VMWare images should be relatively easy to solve. The PVSCSI implementation is based on Paolo Bonzini's code, submitted some time ago but never applied. See commit messages and file headers for details. The implementation supports all the device features. Code was tested on different OSes: Fedora 15 Ubuntu 10.4 Centos 6.2 Windows 2008R2 Windows 2008 64bit Windows 2008 32bit Windows 2003 64bit Windows 2003 32bit Changes in V2: Various fixes and beautification as suggested by Paolo Bonzini Reported-by: Paolo Bonzini Dmitry Fleytman (4): Utility function strpadcpy() added Vendor name and product name parameters for SCSI devices Options "vendor_name" and "product_name" added for SCSI disks. Header with various utility functions shared by VMWARE SCSI and network devices PVCSI paravirtualized device implementation Bus type "pvscsi" added. Makefile.objs |1 + cutils.c |7 + default-configs/pci.mak|1 + docs/specs/pvscsi-spec.txt | 92 hw/pci.h |1 + hw/pvscsi.c| 1239 hw/pvscsi.h| 442 hw/scsi-disk.c | 32 +- hw/vmware_utils.h | 126 + qemu-common.h |1 + 10 files changed, 1934 insertions(+), 8 deletions(-) create mode 100644 docs/specs/pvscsi-spec.txt create mode 100644 hw/pvscsi.c create mode 100644 hw/pvscsi.h create mode 100644 hw/vmware_utils.h -- 1.7.7.6
[Qemu-devel] [PATCH 2/4 V2] Vendor name and product name parameters for SCSI devices Options "vendor_name" and "product_name" added for SCSI disks.
Sample command line is: -drive file=image.raw,if=none,cache=off,id=scsi1 \ -device lsi,id=scsi -device scsi-disk,drive=scsi1,bus=scsi.0,product_name="VENDOR SCSI DISK",vendor_name="[VENDOR]" \ Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/scsi-disk.c | 32 1 files changed, 24 insertions(+), 8 deletions(-) diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index add399e..1a2997f 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -70,6 +70,8 @@ struct SCSIDiskState QEMUBH *bh; char *version; char *serial; +char *vname; +char *pname; bool tray_open; bool tray_locked; }; @@ -566,12 +568,23 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, uint8_t *outbuf) outbuf[0] = s->qdev.type & 0x1f; outbuf[1] = s->removable ? 0x80 : 0; -if (s->qdev.type == TYPE_ROM) { -memcpy(&outbuf[16], "QEMU CD-ROM ", 16); + +if (NULL != s->pname) { +strpadcpy((char *) &outbuf[16], 16, s->pname, ' '); +} else { +if (s->qdev.type == TYPE_ROM) { +memcpy(&outbuf[16], "QEMU CD-ROM ", 16); +} else { +memcpy(&outbuf[16], "QEMU HARDDISK ", 16); +} +} + +if (NULL != s->vname) { +strpadcpy((char *) &outbuf[8], 8, s->vname, ' '); } else { -memcpy(&outbuf[16], "QEMU HARDDISK ", 16); +memcpy(&outbuf[8], "QEMU", 8); } -memcpy(&outbuf[8], "QEMU", 8); + memset(&outbuf[32], 0, 4); memcpy(&outbuf[32], s->version, MIN(4, strlen(s->version))); /* @@ -1788,10 +1801,13 @@ static SCSIRequest *scsi_block_new_request(SCSIDevice *d, uint32_t tag, } #endif -#define DEFINE_SCSI_DISK_PROPERTIES() \ -DEFINE_BLOCK_PROPERTIES(SCSIDiskState, qdev.conf), \ -DEFINE_PROP_STRING("ver", SCSIDiskState, version), \ -DEFINE_PROP_STRING("serial", SCSIDiskState, serial) +#define DEFINE_SCSI_DISK_PROPERTIES() \ +DEFINE_BLOCK_PROPERTIES(SCSIDiskState, qdev.conf),\ +DEFINE_PROP_STRING("ver", SCSIDiskState, version), \ +DEFINE_PROP_STRING("serial", SCSIDiskState, serial), \ +DEFINE_PROP_STRING("vendor_name", SCSIDiskState, vname), \ +DEFINE_PROP_STRING("product_name", SCSIDiskState, pname) + static Property 
scsi_hd_properties[] = { DEFINE_SCSI_DISK_PROPERTIES(), -- 1.7.7.6
[Qemu-devel] [PATCH 3/4 V2] Header with various utility functions shared by VMWARE SCSI and network devices
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/vmware_utils.h | 126 + 1 files changed, 126 insertions(+), 0 deletions(-) create mode 100644 hw/vmware_utils.h diff --git a/hw/vmware_utils.h b/hw/vmware_utils.h new file mode 100644 index 000..cc845d7 --- /dev/null +++ b/hw/vmware_utils.h @@ -0,0 +1,126 @@ +/* + * QEMU VMWARE paravirtual devices - auxiliary code + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef VMWARE_UTILS_H +#define VMWARE_UTILS_H + +#ifndef DSHPRINTF +#define DSHPRINTF(fmt, ...) do {} while (0) +#endif + +/* Shared memory access functions with byte swap support */ +static inline void +vmw_shmem_read(target_phys_addr_t addr, void *buf, int len) +{ +DSHPRINTF("SHMEM r: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_read(addr, buf, len); +} + +static inline void +vmw_shmem_write(target_phys_addr_t addr, void *buf, int len) +{ +DSHPRINTF("SHMEM w: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_write(addr, buf, len); +} + +static inline void +vmw_shmem_rw(target_phys_addr_t addr, void *buf, int len, int is_write) +{ +DSHPRINTF("SHMEM r/w: %" PRIx64 ", len: %d (to %p), is write: %d", + addr, len, buf, is_write); + +cpu_physical_memory_rw(addr, buf, len, is_write); +} + +static inline void +vmw_shmem_set(target_phys_addr_t addr, uint8 val, int len) +{ +int i; +DSHPRINTF("SHMEM set: %" PRIx64 ", len: %d (value 0x%X)", addr, len, val); + +for (i = 0; i < len; i++) { +cpu_physical_memory_write(addr + i, &val, 1); +} +} + +static inline uint32_t +vmw_shmem_ld8(target_phys_addr_t addr) +{ +uint8_t res = ldub_phys(addr); +DSHPRINTF("SHMEM load8: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + 
+static inline void +vmw_shmem_st8(target_phys_addr_t addr, uint8_t value) +{ +DSHPRINTF("SHMEM store8: %" PRIx64 " (value 0x%X)", addr, value); +stb_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld16(target_phys_addr_t addr) +{ +uint16_t res = lduw_le_phys(addr); +DSHPRINTF("SHMEM load16: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st16(target_phys_addr_t addr, uint16_t value) +{ +DSHPRINTF("SHMEM store16: %" PRIx64 " (value 0x%X)", addr, value); +stw_le_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld32(target_phys_addr_t addr) +{ +uint32_t res = ldl_le_phys(addr); +DSHPRINTF("SHMEM load32: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st32(target_phys_addr_t addr, uint32_t value) +{ +DSHPRINTF("SHMEM store32: %" PRIx64 " (value 0x%X)", addr, value); +stl_le_phys(addr, value); +} + +static inline uint64_t +vmw_shmem_ld64(target_phys_addr_t addr) +{ +uint64_t res = ldq_le_phys(addr); +DSHPRINTF("SHMEM load64: %" PRIx64 " (value %" PRIx64 ")", addr, res); +return res; +} + +static inline void +vmw_shmem_st64(target_phys_addr_t addr, uint64_t value) +{ +DSHPRINTF("SHMEM store64: %" PRIx64 " (value %" PRIx64 ")", addr, value); +stq_le_phys(addr, value); +} + +/* MACROS for simplification of operations on array-style registers */ +#define IS_MULTIREG_ADDR(addr, base, cnt, regsize) \ +(((addr) >= (base)) && ((addr) < (base) + (cnt) * (regsize))) + +#define MULTIREG_IDX_BY_ADDR(addr, base, regsize) \ +(((addr) - (base)) / (regsize)) + +#endif -- 1.7.7.6
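For readers following the two array-register macros at the end of this header, here is a minimal sketch of how a device might use them to decode an MMIO address. The register bank (`DEMO_REG_BASE`, `DEMO_REG_CNT`, `DEMO_REG_SIZE`) and the helper `demo_reg_index()` are hypothetical names made up for illustration; only the two macros come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* The two macros as posted in hw/vmware_utils.h. */
#define IS_MULTIREG_ADDR(addr, base, cnt, regsize) \
    (((addr) >= (base)) && ((addr) < (base) + (cnt) * (regsize)))

#define MULTIREG_IDX_BY_ADDR(addr, base, regsize) \
    (((addr) - (base)) / (regsize))

/* Hypothetical register bank: 8 registers, 8 bytes each, at offset 0x600. */
enum {
    DEMO_REG_BASE = 0x600,
    DEMO_REG_CNT  = 8,
    DEMO_REG_SIZE = 8
};

/* Return the index of the register that contains addr,
 * or -1 if addr lies outside the bank. */
static int demo_reg_index(uint64_t addr)
{
    assert(DEMO_REG_SIZE > 0); /* the index macro divides by regsize */

    if (!IS_MULTIREG_ADDR(addr, DEMO_REG_BASE, DEMO_REG_CNT, DEMO_REG_SIZE)) {
        return -1;
    }
    return (int)MULTIREG_IDX_BY_ADDR(addr, DEMO_REG_BASE, DEMO_REG_SIZE);
}
```

Any address inside a register maps to that register's index, so an unaligned access such as 0x60F still resolves to register 1. Whether a real device should accept such accesses is a separate decision.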
[Qemu-devel] [PATCH 1/4 V2] Utility function strpadcpy() added
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- cutils.c |7 +++ qemu-common.h |1 + 2 files changed, 8 insertions(+), 0 deletions(-) diff --git a/cutils.c b/cutils.c index af308cd..3ccf45c 100644 --- a/cutils.c +++ b/cutils.c @@ -27,6 +27,13 @@ #include "qemu_socket.h" +void strpadcpy(char *buf, int buf_size, const char *str, char pad) +{ +int len = qemu_strnlen(str, buf_size); +memcpy(buf, str, len); +memset(buf + len, pad, buf_size - len); +} + void pstrcpy(char *buf, int buf_size, const char *str) { int c; diff --git a/qemu-common.h b/qemu-common.h index b0fdf5c..fdd3d17 100644 --- a/qemu-common.h +++ b/qemu-common.h @@ -134,6 +134,7 @@ int qemu_timedate_diff(struct tm *tm); /* cutils.c */ void pstrcpy(char *buf, int buf_size, const char *str); +void strpadcpy(char *buf, int buf_size, const char *str, char pad); char *pstrcat(char *buf, int buf_size, const char *s); int strstart(const char *str, const char *val, const char **ptr); int stristart(const char *str, const char *val, const char **ptr); -- 1.7.7.6
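To make the behaviour of the V2 strpadcpy() concrete, here is a standalone sketch that can be compiled outside the QEMU tree. It assumes a local helper, `bounded_strlen()`, as a stand-in for qemu_strnlen(); that helper is not part of the patch. Note that the output is a fixed-width field and is not NUL-terminated, which is what the SCSI INQUIRY vendor and product fields expect.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for qemu_strnlen() (assumption): length of str, capped at max. */
static size_t bounded_strlen(const char *str, size_t max)
{
    size_t n = 0;

    while (n < max && str[n] != '\0') {
        n++;
    }
    return n;
}

/* Copy str into buf and fill the rest of buf with pad.
 * All buf_size bytes are data; no terminating NUL is written. */
static void strpadcpy(char *buf, int buf_size, const char *str, char pad)
{
    size_t len;

    assert(buf_size >= 0);
    len = bounded_strlen(str, (size_t)buf_size);
    memcpy(buf, str, len);
    memset(buf + len, pad, (size_t)buf_size - len);
}
```

With an 8-byte field, "QEMU" becomes "QEMU" followed by four spaces, and a longer string such as "VENDORNAME" is truncated to its first 8 bytes without padding.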
[Qemu-devel] [PATCH 0/7 v5] VMXNET3 paravirtual NIC device implementation
This set of patches implements the VMWare VMXNET3 paravirtual NIC device. The device supports all the device features, including offload capabilities, VLANs, etc. The device is tested on different OSes: Fedora 15 Ubuntu 10.4 Centos 6.2 Windows 2008R2 Windows 2008 64bit Windows 2008 32bit Windows 2003 64bit Windows 2003 32bit Changes in V5: MSI-X save/load implemented in the device instead of pci bus as suggested by Michael S. Tsirkin Reported-by: Michael S. Tsirkin Patches regrouped as suggested by Paolo Bonzini Reported-by: Paolo Bonzini Changes in V4: Fixed a few problems uncovered by NETIO test suite Assertion on failure to initialize MSI/MSI-X replaced with warning message and fallback to Legacy/MSI respectively Reported-by: Gerhard Wiesinger Various coding style adjustments and patch split-up as suggested by Anthony Liguori Reported-by: Anthony Liguori Live migration support added Changes in V3: Fixed crash when net device that is used as network frontend has no virtio HDR support. Task offloads emulation for cases when net device that is used as network frontend has no virtio HDR support. Reported-by: Gerhard Wiesinger Changes in V2: License text changed according to community suggestions Standard license header from GPLv2+ - licensed QEMU files used Dmitry Fleytman (7): Adding missing flag VIRTIO_NET_HDR_F_DATA_VALID from Linux kernel source tre Reformatting comments according to checkpatch.pl requirements Adding utility function net_checksum_add_cont() that allows checksum calculation of scattered data with odd chunk sizes Adding utility function iov_net_csum_add() for iovec checksum calculation Header with various utility functions shared by VMWARE SCSI and network devices Various utility functions used by VMWARE network devices Packet abstraction used by VMWARE network devices VMXNET3 paravirtualized device implementation Interface type "vmxnet3" added. 
Makefile.objs |1 + default-configs/pci.mak |1 + hw/pci.c|2 + hw/pci.h|1 + hw/virtio-net.h | 13 +- hw/vmware_utils.h | 126 +++ hw/vmxnet3.c| 2454 +++ hw/vmxnet3.h| 757 +++ hw/vmxnet_debug.h | 121 +++ hw/vmxnet_pkt.c | 1243 hw/vmxnet_pkt.h | 479 + hw/vmxnet_utils.c | 165 hw/vmxnet_utils.h | 320 ++ iov.c | 29 + iov.h |3 + net.c |2 +- net/checksum.c | 13 +- net/checksum.h | 14 +- 18 files changed, 5730 insertions(+), 14 deletions(-) create mode 100644 hw/vmware_utils.h create mode 100644 hw/vmxnet3.c create mode 100644 hw/vmxnet3.h create mode 100644 hw/vmxnet_debug.h create mode 100644 hw/vmxnet_pkt.c create mode 100644 hw/vmxnet_pkt.h create mode 100644 hw/vmxnet_utils.c create mode 100644 hw/vmxnet_utils.h -- 1.7.7.6
[Qemu-devel] [PATCH 1/7 v5] Adding missing flag VIRTIO_NET_HDR_F_DATA_VALID from Linux kernel source tre Reformatting comments according to checkpatch.pl requirements
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/virtio-net.h | 13 +++-- 1 files changed, 7 insertions(+), 6 deletions(-) diff --git a/hw/virtio-net.h b/hw/virtio-net.h index 4468741..fa3c17b 100644 --- a/hw/virtio-net.h +++ b/hw/virtio-net.h @@ -78,13 +78,14 @@ struct virtio_net_config * specify GSO or CSUM features, you can simply ignore the header. */ struct virtio_net_hdr { -#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 // Use csum_start, csum_offset +#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 /* Use csum_start, csum_offset */ +#define VIRTIO_NET_HDR_F_DATA_VALID 2 /* Csum is valid */ uint8_t flags; -#define VIRTIO_NET_HDR_GSO_NONE 0 // Not a GSO frame -#define VIRTIO_NET_HDR_GSO_TCPV41 // GSO frame, IPv4 TCP (TSO) -#define VIRTIO_NET_HDR_GSO_UDP 3 // GSO frame, IPv4 UDP (UFO) -#define VIRTIO_NET_HDR_GSO_TCPV64 // GSO frame, IPv6 TCP -#define VIRTIO_NET_HDR_GSO_ECN 0x80// TCP has ECN set +#define VIRTIO_NET_HDR_GSO_NONE 0 /* Not a GSO frame */ +#define VIRTIO_NET_HDR_GSO_TCPV41 /* GSO frame, IPv4 TCP (TSO) */ +#define VIRTIO_NET_HDR_GSO_UDP 3 /* GSO frame, IPv4 UDP (UFO) */ +#define VIRTIO_NET_HDR_GSO_TCPV64 /* GSO frame, IPv6 TCP */ +#define VIRTIO_NET_HDR_GSO_ECN 0x80 /* TCP has ECN set */ uint8_t gso_type; uint16_t hdr_len; uint16_t gso_size; -- 1.7.7.6
[Qemu-devel] [PATCH 2/7 v5] Adding utility function net_checksum_add_cont() that allows checksum calculation of scattered data with odd chunk sizes
Adding utility function net_raw_checksum() that calculates checksum of buffer given Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- net/checksum.c | 13 +++-- net/checksum.h | 14 +- 2 files changed, 20 insertions(+), 7 deletions(-) diff --git a/net/checksum.c b/net/checksum.c index 9919b2e..4fa5563 100644 --- a/net/checksum.c +++ b/net/checksum.c @@ -20,16 +20,17 @@ #define PROTO_TCP 6 #define PROTO_UDP 17 -uint32_t net_checksum_add(int len, uint8_t *buf) +uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq) { uint32_t sum = 0; int i; -for (i = 0; i < len; i++) { - if (i & 1) - sum += (uint32_t)buf[i]; - else - sum += (uint32_t)buf[i] << 8; +for (i = seq; i < seq + len; i++) { +if (i & 1) { +sum += (uint32_t)buf[i - seq]; +} else { +sum += (uint32_t)buf[i - seq] << 8; +} } return sum; } diff --git a/net/checksum.h b/net/checksum.h index 1f05298..171924c 100644 --- a/net/checksum.h +++ b/net/checksum.h @@ -20,10 +20,22 @@ #include -uint32_t net_checksum_add(int len, uint8_t *buf); +uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq); uint16_t net_checksum_finish(uint32_t sum); uint16_t net_checksum_tcpudp(uint16_t length, uint16_t proto, uint8_t *addrs, uint8_t *buf); void net_checksum_calculate(uint8_t *data, int length); +static inline uint32_t +net_checksum_add(int len, uint8_t *buf) +{ +return net_checksum_add_cont(len, buf, 0); +} + +static inline uint16_t +net_raw_checksum(uint8_t *data, int length) +{ + return net_checksum_finish(net_checksum_add(length, data)); +} + #endif /* QEMU_NET_CHECKSUM_H */ -- 1.7.7.6
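The point of the extra `seq` argument may be easier to see in isolation. The sketch below copies the loop from net_checksum_add_cont() into a standalone function (renamed `csum_add_cont()` here) and adds a hypothetical helper, `csum_split()`, that sums one buffer in two chunks split at an arbitrary byte. Because byte parity is taken from the position in the whole packet rather than in the chunk, an odd split point gives the same partial sum as a single pass.

```c
#include <assert.h>
#include <stdint.h>

/* Same loop as net_checksum_add_cont() in the patch: seq is the offset
 * of buf[0] within the whole packet. Even positions go to the high byte
 * of a 16-bit word, odd positions to the low byte. */
static uint32_t csum_add_cont(int len, const uint8_t *buf, int seq)
{
    uint32_t sum = 0;
    int i;

    assert(len >= 0 && seq >= 0);
    for (i = seq; i < seq + len; i++) {
        if (i & 1) {
            sum += (uint32_t)buf[i - seq];
        } else {
            sum += (uint32_t)buf[i - seq] << 8;
        }
    }
    return sum;
}

/* Hypothetical helper: sum buf[0..len) as two chunks split at `split`. */
static uint32_t csum_split(const uint8_t *buf, int len, int split)
{
    uint32_t sum;

    assert(split >= 0 && split <= len);
    sum = csum_add_cont(split, buf, 0);
    sum += csum_add_cont(len - split, buf + split, split);
    return sum;
}
```

Passing `seq = 0` for every chunk, which is what calling the old net_checksum_add() per chunk amounts to, would place the first byte of an odd-offset chunk in the high byte and give a different sum.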
[Qemu-devel] [PATCH 6/7 v5] Packet abstraction used by VMWARE network devices
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/vmxnet_pkt.c | 1243 +++ hw/vmxnet_pkt.h | 479 + 2 files changed, 1722 insertions(+), 0 deletions(-) create mode 100644 hw/vmxnet_pkt.c create mode 100644 hw/vmxnet_pkt.h diff --git a/hw/vmxnet_pkt.c b/hw/vmxnet_pkt.c new file mode 100644 index 000..5fe2672 --- /dev/null +++ b/hw/vmxnet_pkt.c @@ -0,0 +1,1243 @@ +/* + * QEMU VMWARE VMXNET* paravirtual NICs - packets abstractions + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Tamir Shomer + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include "vmxnet_pkt.h" +#include "vmxnet_utils.h" +#include "iov.h" + +#include "net/checksum.h" + +/*= + *= + * + *TX CODE + * + *= + *===*/ + +enum { +VMXNET_TX_PKT_VHDR_FRAG = 0, +VMXNET_TX_PKT_L2HDR_FRAG, +VMXNET_TX_PKT_L3HDR_FRAG, +VMXNET_TX_PKT_PL_START_FRAG +}; + +/* TX packet private context */ +typedef struct _Vmxnet_TxPkt { +struct virtio_net_hdr virt_hdr; +bool has_virt_hdr; + +struct iovec *vec; + +uint8_t l2_hdr[ETH_MAX_L2_HDR_LEN]; +uint8_t l3_hdr[ETH_MAX_L3_HDR_LEN]; + +uint32_t payload_len; + +uint32_t payload_frags; +uint32_t max_payload_frags; + +uint16_t hdr_len; +eth_pkt_types_e packet_type; +uint16_t l3_proto; +} Vmxnet_TxPkt; + +/** + * + * Function: vmxnet_tx_pkt_init + * + * Desc: Init function for tx packet functionality. + * + * Params: (OUT) pkt - private handle. + * (IN) max_frags - max tx ip fragments. + * (IN) has_virt_hdr - device uses virtio header. 
+ * + * Return: 0 on success, -1 on error + * + * Scope: Global + * + */ +int vmxnet_tx_pkt_init(Vmxnet_TxPkt_h *pkt, uint32_t max_frags, +bool has_virt_hdr) +{ +int rc = 0; + +Vmxnet_TxPkt *p = g_malloc(sizeof *p); +if (!p) { +rc = -1; +goto Exit; +} + +memset(p, 0, sizeof *p); + +p->vec = g_malloc((sizeof *p->vec) * +(max_frags + VMXNET_TX_PKT_PL_START_FRAG)); +if (!p->vec) { +rc = -1; +goto Exit; +} + +p->max_payload_frags = max_frags; +p->has_virt_hdr = has_virt_hdr; +p->vec[VMXNET_TX_PKT_VHDR_FRAG].iov_base = &p->virt_hdr; +p->vec[VMXNET_TX_PKT_VHDR_FRAG].iov_len = +p->has_virt_hdr ? sizeof p->virt_hdr : 0; +p->vec[VMXNET_TX_PKT_L2HDR_FRAG].iov_base = &p->l2_hdr; +p->vec[VMXNET_TX_PKT_L3HDR_FRAG].iov_base = &p->l3_hdr; + +*pkt = p; + +Exit: +if (rc) { +vmxnet_tx_pkt_uninit(p); +} +return rc; +} + +/** + * + * Function: vmxnet_tx_pkt_uninit + * + * Desc: Clean all tx packet resources. + * + * Params: (IN) pkt - private handle. + * + * Return: nothing + * + * Scope: Global + * + */ +void vmxnet_tx_pkt_uninit(Vmxnet_TxPkt_h pkt) +{ +Vmxnet_TxPkt *p = (Vmxnet_TxPkt *)pkt; + +if (p) { +if (p->vec) { +g_free(p->vec); +} + +g_free(p); +} +} + +/** + * + * Function: vmxnet_tx_pkt_update_ip_checksums + * + * Desc: fix ip header fields and calculate checksums needed. + * + * Params: (IN) pkt - private handle. + * + * Return: Nothing. + * + * Scope: Global + * + */ +void vmxnet_tx_pkt_update_ip_checksums(Vmxnet_TxPkt_h pkt) +{ +uint16_t csum; +Vmxnet_TxPkt *p = (Vmxnet_TxPkt *)pkt; +assert(p); +uint8_t gso_type = p->virt_hdr.gso_type & ~VIRTIO_NET_HDR_GSO_ECN; +struct ip_header *ip_hdr; +target_phys_addr_t payload = (target_phys_addr_t) +(uint64_t) p->vec[VMXNET_TX_PKT_PL_START_FRAG].iov_base; + +if (VIRTIO_NET_HDR_GSO_TCPV4 != gso_type && +VIRTIO_NET_HDR_GSO_UDP != gso_type) { +return; +} + +ip_hdr = p->vec[VMXNET_TX_PKT_L3HDR_FRAG].iov_base; + +if (p->payload_len + p->vec[VMXNET_TX_PKT_L3HDR_FRAG].iov_len > +ETH_MAX_IP_
[Qemu-devel] [PATCH 4/7 v5] Header with various utility functions shared by VMWARE SCSI and network devices
Signed-off-by: Dmitry Fleytman Signed-off-by: Yan Vugenfirer --- hw/vmware_utils.h | 126 + 1 files changed, 126 insertions(+), 0 deletions(-) create mode 100644 hw/vmware_utils.h diff --git a/hw/vmware_utils.h b/hw/vmware_utils.h new file mode 100644 index 000..cc845d7 --- /dev/null +++ b/hw/vmware_utils.h @@ -0,0 +1,126 @@ +/* + * QEMU VMWARE paravirtual devices - auxiliary code + * + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com) + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Dmitry Fleytman + * Yan Vugenfirer + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef VMWARE_UTILS_H +#define VMWARE_UTILS_H + +#ifndef DSHPRINTF +#define DSHPRINTF(fmt, ...) do {} while (0) +#endif + +/* Shared memory access functions with byte swap support */ +static inline void +vmw_shmem_read(target_phys_addr_t addr, void *buf, int len) +{ +DSHPRINTF("SHMEM r: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_read(addr, buf, len); +} + +static inline void +vmw_shmem_write(target_phys_addr_t addr, void *buf, int len) +{ +DSHPRINTF("SHMEM w: %" PRIx64 ", len: %d to %p", addr, len, buf); +cpu_physical_memory_write(addr, buf, len); +} + +static inline void +vmw_shmem_rw(target_phys_addr_t addr, void *buf, int len, int is_write) +{ +DSHPRINTF("SHMEM r/w: %" PRIx64 ", len: %d (to %p), is write: %d", + addr, len, buf, is_write); + +cpu_physical_memory_rw(addr, buf, len, is_write); +} + +static inline void +vmw_shmem_set(target_phys_addr_t addr, uint8 val, int len) +{ +int i; +DSHPRINTF("SHMEM set: %" PRIx64 ", len: %d (value 0x%X)", addr, len, val); + +for (i = 0; i < len; i++) { +cpu_physical_memory_write(addr + i, &val, 1); +} +} + +static inline uint32_t +vmw_shmem_ld8(target_phys_addr_t addr) +{ +uint8_t res = ldub_phys(addr); +DSHPRINTF("SHMEM load8: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + 
+static inline void +vmw_shmem_st8(target_phys_addr_t addr, uint8_t value) +{ +DSHPRINTF("SHMEM store8: %" PRIx64 " (value 0x%X)", addr, value); +stb_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld16(target_phys_addr_t addr) +{ +uint16_t res = lduw_le_phys(addr); +DSHPRINTF("SHMEM load16: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st16(target_phys_addr_t addr, uint16_t value) +{ +DSHPRINTF("SHMEM store16: %" PRIx64 " (value 0x%X)", addr, value); +stw_le_phys(addr, value); +} + +static inline uint32_t +vmw_shmem_ld32(target_phys_addr_t addr) +{ +uint32_t res = ldl_le_phys(addr); +DSHPRINTF("SHMEM load32: %" PRIx64 " (value 0x%X)", addr, res); +return res; +} + +static inline void +vmw_shmem_st32(target_phys_addr_t addr, uint32_t value) +{ +DSHPRINTF("SHMEM store32: %" PRIx64 " (value 0x%X)", addr, value); +stl_le_phys(addr, value); +} + +static inline uint64_t +vmw_shmem_ld64(target_phys_addr_t addr) +{ +uint64_t res = ldq_le_phys(addr); +DSHPRINTF("SHMEM load64: %" PRIx64 " (value %" PRIx64 ")", addr, res); +return res; +} + +static inline void +vmw_shmem_st64(target_phys_addr_t addr, uint64_t value) +{ +DSHPRINTF("SHMEM store64: %" PRIx64 " (value %" PRIx64 ")", addr, value); +stq_le_phys(addr, value); +} + +/* MACROS for simplification of operations on array-style registers */ +#define IS_MULTIREG_ADDR(addr, base, cnt, regsize) \ +(((addr) >= (base)) && ((addr) < (base) + (cnt) * (regsize))) + +#define MULTIREG_IDX_BY_ADDR(addr, base, regsize) \ +(((addr) - (base)) / (regsize)) + +#endif -- 1.7.7.6
[Qemu-devel] [PATCH 5/7 v5] Various utility functions used by VMWARE network devices
Signed-off-by: Dmitry Fleytman
Signed-off-by: Yan Vugenfirer
---
 hw/vmxnet_debug.h |  121
 hw/vmxnet_utils.c |  165 +++
 hw/vmxnet_utils.h |  320 +
 3 files changed, 606 insertions(+), 0 deletions(-)
 create mode 100644 hw/vmxnet_debug.h
 create mode 100644 hw/vmxnet_utils.c
 create mode 100644 hw/vmxnet_utils.h

diff --git a/hw/vmxnet_debug.h b/hw/vmxnet_debug.h
new file mode 100644
index 000..cc3471f
--- /dev/null
+++ b/hw/vmxnet_debug.h
@@ -0,0 +1,121 @@
+/*
+ * QEMU VMWARE VMXNET* paravirtual NICs - debugging facilities
+ *
+ * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com)
+ *
+ * Developed by Daynix Computing LTD (http://www.daynix.com)
+ *
+ * Authors:
+ * Dmitry Fleytman
+ * Tamir Shomer
+ * Yan Vugenfirer
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef _QEMU_VMXNET_DEBUG_H
+#define _QEMU_VMXNET_DEBUG_H
+
+#ifdef VMXNET_VERSION_2
+#define VMXNET_DEVICE_NAME "vmxnet"
+#elif defined VMXNET_VERSION_3
+#define VMXNET_DEVICE_NAME "vmxnet3"
+#else
+#error "VMXNET version is not defined"
+#endif
+
+/* #define DEBUG_VMXNET_CB */
+#define DEBUG_VMXNET_WARNINGS
+#define DEBUG_VMXNET_ERRORS
+/* #define DEBUG_VMXNET_INTERRUPTS */
+/* #define DEBUG_VMXNET_CONFIG */
+/* #define DEBUG_VMXNET_RINGS */
+/* #define DEBUG_VMXNET_PACKETS */
+/* #define DEBUG_VMXNET_SHMEM_ACCESS */
+
+#ifdef DEBUG_VMXNET_SHMEM_ACCESS
+#define DSHPRINTF(fmt, ...) \
+    do { \
+        printf("[%s][SH][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__, \
+            ## __VA_ARGS__); \
+    } while (0)
+#else
+#define DSHPRINTF(fmt, ...) do {} while (0)
+#endif
+
+#ifdef DEBUG_VMXNET_CB
+#define DCBPRINTF(fmt, ...) \
+    do { \
+        printf("[%s][CB][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__, \
+            ## __VA_ARGS__); \
+    } while (0)
+#else
+#define DCBPRINTF(fmt, ...) do {} while (0)
+#endif
+
+#ifdef DEBUG_VMXNET_PACKETS
+#define DPKPRINTF(fmt, ...) \
+    do { \
+        printf("[%s][PK][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__, \
+            ## __VA_ARGS__); \
+    } while (0)
+#else
+#define DPKPRINTF(fmt, ...) do {} while (0)
+#endif
+
+#ifdef DEBUG_VMXNET_WARNINGS
+#define DWRPRINTF(fmt, ...) \
+    do { \
+        printf("[%s][WR][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__, \
+            ## __VA_ARGS__); \
+    } while (0)
+#else
+#define DWRPRINTF(fmt, ...) do {} while (0)
+#endif
+
+#ifdef DEBUG_VMXNET_ERRORS
+#define DERPRINTF(fmt, ...) \
+    do { \
+        printf("[%s][ER][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__, \
+            ## __VA_ARGS__); \
+    } while (0)
+#else
+#define DERPRINTF(fmt, ...) do {} while (0)
+#endif
+
+#ifdef DEBUG_VMXNET_INTERRUPTS
+#define DIRPRINTF(fmt, ...) \
+    do { \
+        printf("[%s][IR][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__, \
+            ## __VA_ARGS__); \
+    } while (0)
+#else
+#define DIRPRINTF(fmt, ...) do {} while (0)
+#endif
+
+#ifdef DEBUG_VMXNET_CONFIG
+#define DCFPRINTF(fmt, ...) \
+    do { \
+        printf("[%s][CF][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__, \
+            ## __VA_ARGS__); \
+    } while (0)
+#else
+#define DCFPRINTF(fmt, ...) do {} while (0)
+#endif
+
+#ifdef DEBUG_VMXNET_RINGS
+#define DRIPRINTF(fmt, ...) \
+    do { \
+        printf("[%s][RI][%s]: " fmt "\n", VMXNE
[Qemu-devel] [PATCH 3/7 v5] Adding utility function iov_net_csum_add() for iovec checksum calculation
Signed-off-by: Dmitry Fleytman
Signed-off-by: Yan Vugenfirer
---
 iov.c |   29 +
 iov.h |    3 +++
 2 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/iov.c b/iov.c
index 0f96493..5d4f94c 100644
--- a/iov.c
+++ b/iov.c
@@ -16,6 +16,7 @@
  */
 
 #include "iov.h"
+#include "net/checksum.h"
 
 size_t iov_from_buf(struct iovec *iov, unsigned int iov_cnt,
                     const void *buf, size_t iov_off, size_t size)
@@ -130,3 +131,31 @@ void iov_hexdump(const struct iovec *iov, const unsigned int iov_cnt,
         fprintf(fp, "\n");
     }
 }
+
+uint32_t
+iov_net_csum_add(const struct iovec *iov, const unsigned int iov_cnt,
+                 size_t iov_off, size_t size)
+{
+    size_t iovec_off, buf_off;
+    unsigned int i;
+    uint32_t res = 0;
+    uint32_t seq = 0;
+
+    iovec_off = 0;
+    buf_off = 0;
+    for (i = 0; i < iov_cnt && size; i++) {
+        if (iov_off < (iovec_off + iov[i].iov_len)) {
+            size_t len = MIN((iovec_off + iov[i].iov_len) - iov_off , size);
+            void *chunk_buf = iov[i].iov_base + (iov_off - iovec_off);
+
+            res += net_checksum_add_cont(len, chunk_buf, seq);
+            seq += len;
+
+            buf_off += len;
+            iov_off += len;
+            size -= len;
+        }
+        iovec_off += iov[i].iov_len;
+    }
+    return res;
+}
diff --git a/iov.h b/iov.h
index 94d2f78..ba385f5 100644
--- a/iov.h
+++ b/iov.h
@@ -21,3 +21,6 @@ size_t iov_clear(const struct iovec *iov, const unsigned int iov_cnt,
                  size_t iov_off, size_t size);
 void iov_hexdump(const struct iovec *iov, const unsigned int iov_cnt,
                  FILE *fp, const char *prefix, size_t limit);
+uint32_t
+iov_net_csum_add(const struct iovec *iov, const unsigned int iov_cnt,
+                 size_t iov_off, size_t size);
-- 
1.7.7.6
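The patch above walks an iovec and feeds each chunk to net_checksum_add_cont() together with a running byte offset, so that chunks of odd length do not break the 16-bit word pairing of the Internet checksum. A minimal standalone sketch of that idea follows. The names csum_add_cont(), iov_csum_add() and csum_finish() are hypothetical stand-ins written for this illustration, not the QEMU functions themselves, and the byte-parity logic is an assumption about how such a helper can be written.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical stand-in for net_checksum_add_cont(): sums `len` bytes as
 * big-endian 16-bit words. `seq` is the offset of buf[0] within the whole
 * logical stream, so a chunk that starts at an odd offset keeps the
 * correct high/low byte parity. */
uint32_t csum_add_cont(size_t len, const uint8_t *buf, size_t seq)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        if ((seq + i) & 1) {
            sum += buf[i];                  /* low byte of a word */
        } else {
            sum += (uint32_t)buf[i] << 8;   /* high byte of a word */
        }
    }
    return sum;
}

/* Sketch of the iovec walk: checksum `size` bytes starting at logical
 * offset `iov_off`, which may span several iovec elements. */
uint32_t iov_csum_add(const struct iovec *iov, unsigned int iov_cnt,
                      size_t iov_off, size_t size)
{
    size_t iovec_off = 0;   /* logical offset of iov[i] in the stream */
    size_t seq = 0;         /* bytes summed so far */
    uint32_t res = 0;

    for (unsigned int i = 0; i < iov_cnt && size; i++) {
        if (iov_off < iovec_off + iov[i].iov_len) {
            size_t len = iovec_off + iov[i].iov_len - iov_off;
            if (len > size) {
                len = size;
            }
            const uint8_t *chunk =
                (const uint8_t *)iov[i].iov_base + (iov_off - iovec_off);

            res += csum_add_cont(len, chunk, seq);
            seq += len;
            iov_off += len;
            size -= len;
        }
        iovec_off += iov[i].iov_len;
    }
    return res;
}

/* Fold the 32-bit accumulator to 16 bits and take the one's complement. */
uint16_t csum_finish(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    return (uint16_t)~sum;
}
```

With this sketch, a 5-byte buffer split into a 3-byte and a 2-byte iovec element produces the same accumulator as summing the 5 bytes contiguously, which is the property the offset argument is there to provide.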
Re: [Qemu-devel] [PATCH v4 0/9] VMXNET3 paravirtual NIC device implementation
Hello, Gerhard

I've rechecked SSH connections, both incoming and outgoing, with patch v5.
Everything works fine.
If you still see problems, please provide your exact configuration.

Thank you for your support,
Dmitry Fleytman.

On Sun, Mar 18, 2012 at 10:29 AM, Gerhard Wiesinger wrote:
> Hello,
>
> I'm still having problems with v4 patch: ping works well, even with large
> packet sizes but ssh doesn't work at all.
> Tested with Knoppix 6.7 and Fedora 16.
>
> Thnx.
>
> Ciao,
> Gerhard
>
>
> On 15.03.2012 22:08, Dmitry Fleytman wrote:
>>
>> This set of patches implements VMWare VMXNET3 paravirtual NIC device.
>> The device supports all of the device features, including offload
>> capabilities, VLANs, etc.
>> The device is tested on different OSes:
>> Fedora 15
>> Ubuntu 10.4
>> Centos 6.2
>> Windows 2008R2
>> Windows 2008 64bit
>> Windows 2008 32bit
>> Windows 2003 64bit
>> Windows 2003 32bit
>>
>> Changes in V4:
>> Fixed a few problems uncovered by NETIO test suite
>> Assertion on failure to initialize MSI/MSI-X replaced with warning
>> message and fallback to Legacy/MSI respectively
>>
>> Reported-by: Gerhard Wiesinger
>>
>> Various coding style adjustments and patch split-up as suggested by
>> Anthony Liguori
>>
>> Reported-by: Anthony Liguori
>>
>> Live migration support added
>>
>> Changes in V3:
>> Fixed crash when net device that is used as network frontend has no
>> virtio HDR support.
>> Task offloads emulation for cases when net device that is used as
>> network frontend has no virtio HDR support.
>>
>> Reported-by: Gerhard Wiesinger
>>
>> Changes in V2:
>> License text changed according to community suggestions
>> Standard license header from GPLv2+ - licensed QEMU files used
>>
>> Dmitry Fleytman (9):
>>   Adding missing flag VIRTIO_NET_HDR_F_DATA_VALID from Linux kernel
>>     source tre Reformatting comments according to checkpatch.pl
>>     requirements
>>   Adding utility function net_checksum_add_cont() that allows checksum
>>     calculation of scattered data with odd chunk sizes
>>   Adding utility function iov_net_csum_add() for iovec checksum
>>     calculation
>>   MSI-X state save/load invocations moved to PCI Device save/load
>>     callbacks to avoid code duplication in MSI-X-enabled devices
>>     that support live migration
>>   Header with various utility functions shared by VMWARE SCSI and
>>     network devi
>>   Various utility functions used by VMWARE network devices
>>   Packet abstraction used by VMWARE network devices
>>   VMXNET3 paravirtual device implementation
>>   VMXNET3 paravirtualized device integration. Interface type
>>     "vmxnet3" added.
>>
>> Makefile.objs | 1 +
>> default-configs/pci.mak | 1 +
>> hw/pci.c | 7 +
>> hw/pci.h | 1 +
>> hw/virtio-net.h | 13 +-
>> hw/virtio-pci.c | 2 -
>> hw/vmware_utils.h | 122 +++
>> hw/vmxnet3.c | 2435 +++
>> hw/vmxnet3.h | 757 +++
>> hw/vmxnet_debug.h | 121 +++
>> hw/vmxnet_pkt.c | 1243
>> hw/vmxnet_pkt.h | 479 ++
>> hw/vmxnet_utils.c | 165
>> hw/vmxnet_utils.h | 320 +++
>> iov.c | 29 +
>> iov.h | 3 +
>> net.c | 2 +-
>> net/checksum.c | 13 +-
>> net/checksum.h | 14 +-
>> 19 files changed, 5712 insertions(+), 16 deletions(-)
>> create mode 100644 hw/vmware_utils.h
>> create mode 100644 hw/vmxnet3.c
>> create mode 100644 hw/vmxnet3.h
>> create mode 100644 hw/vmxnet_debug.h
>> create mode 100644 hw/vmxnet_pkt.c
>> create mode 100644 hw/vmxnet_pkt.h
>> create mode 100644 hw/vmxnet_utils.c
>> create mode 100644 hw/vmxnet_utils.h
>>
>
Re: [Qemu-devel] [PATCH v4 0/9] VMXNET3 paravirtual NIC device implementation
Hello, Gerhard

I've tested telnet connections on Knoppix running on QEMU-KVM with patch V5.
Everything works fine on my setup.

What is your network setup? How do you connect the tap1 interface to the
outside world?

Also, since ping works for you, the failure to initialize MSI-X is not
related to the problem: the device just falls back to MSI interrupts.
Still, why does it fail? Could it be an incompatibility between QEMU and
KVM versions?

Best regards,
Dmitry Fleytman.

On Mon, Mar 19, 2012 at 9:24 PM, Gerhard Wiesinger wrote:
> Hello Dmitry,
>
> Tried also v5 patch without success:
> /root/download/qemu/git/qemu-kvm/x86_64-softmmu/qemu-system-x86_64
> -drive
> if=ide,index=3,media=cdrom,file=ISO/KNOPPIX_V6.7.1CD-2011-09-14-DE.iso
> -boot order=cad,menu=on
> -m 2048 -k de -vga vmware -vnc :0
> -bios /root/download/seabios/git/seabios/out/bios.bin
> -chardev stdio,id=seabios -device isa-debugcon,iobase=0x402,chardev=seabios
> -device vmxnet3,mac=1a:46:0b:ca:bc:7e,vlan=1,romfile=
> -net tap,ifname=tap1,script=no,downscript=no,vlan=1
>
> ping ok, but outside tcp communication fails:
> # timeout Knoppix => outside
> telnet 192.168.0.2 22
> # timeout outside => Knoppix fails
> telnet 192.168.0.30 22
>
> RTL8139 with same command line is ok.
>
> Maybe that helps directly at startup:
> kvm_msix_vector_add: kvm_add_msix failed: No space left on device
> [vmxnet3][WR][vmxnet3_use_msix_vectors]: Failed to use MSI-X vector 9, error
> -28
> [vmxnet3][WR][vmxnet3_init_msix]: Failed to use MSI-X vectors, error 0
> [vmxnet3][WR][vmxnet3_pci_init]: Failed to initialize MSI-X, configuration
> is inconsistent.
> [vmxnet3][WR][vmxnet3_peer_has_vnet_hdr]: Peer has no virtio extension. Task
> offloads will be emulated.
>
> I'm using git qemu-kvm and not git qemu.
>
> Thnx.
>
> Ciao,
> Gerhard
>
>
> On 18.03.2012 16:30, Dmitry Fleytman wrote:
>>
>> Hello, Gerhard
>>
>> I've rechecked SSH connection both incoming and outgoing with patch v5.
>> Everything works fine.
>> If you still see problems, please, provide your exact configuration.
>>
>> Thanking you for your support,
>> Dmitry Fleytman.
>>
>>
>> On Sun, Mar 18, 2012 at 10:29 AM, Gerhard Wiesinger
>> wrote:
>>>
>>> Hello,
>>>
>>> I'm still having problems with v4 patch: ping works well, even with large
>>> packet sizes but ssh doesn't work at all.
>>> Tested with Knoppix 6.7 and Fedora 16.
>>>
>>> Thnx.
>>>
>>> Ciao,
>>> Gerhard
>>>
>>>
>>> On 15.03.2012 22:08, Dmitry Fleytman wrote:
>>>>
>>>> This set of patches implements VMWare VMXNET3 paravirtual NIC device.
>>>> The device supports all of the device features, including offload
>>>> capabilities, VLANs, etc.
>>>> The device is tested on different OSes:
>>>> Fedora 15
>>>> Ubuntu 10.4
>>>> Centos 6.2
>>>> Windows 2008R2
>>>> Windows 2008 64bit
>>>> Windows 2008 32bit
>>>> Windows 2003 64bit
>>>> Windows 2003 32bit
>>>>
>>>> Changes in V4:
>>>> Fixed a few problems uncovered by NETIO test suite
>>>> Assertion on failure to initialize MSI/MSI-X replaced with warning
>>>> message and fallback to Legacy/MSI respectively
>>>>
>>>> Reported-by: Gerhard Wiesinger
>>>>
>>>> Various coding style adjustments and patch split-up as suggested by
>>>> Anthony Liguori
>>>>
>>>> Reported-by: Anthony Liguori
>>>>
>>>> Live migration support added
>>>>
>>>> Changes in V3:
>>>> Fixed crash when net device that is used as network frontend has no
>>>> virtio HDR support.
>>>> Task offloads emulation for cases when net device that is used as
>>>> network frontend has no virtio HDR support.
>>>>
>>>> Reported-by: Gerhard Wiesinger
>>>>
>>>> Changes in V2:
>>>> License text changed according to community suggestions
>>>> Standard license header from GPLv2+ - licensed QEMU files used
>>>>
>>>> Dmitry Fleytman (9):
>>>>   Adding missing flag VIRTIO_NET_HDR_F_DATA_VALID from Linux kernel
>>>>     source tre Reformatting comments according to checkpatch.pl
>>>>     requirements
>>>>   Adding utility function net_checksum_add_cont() that allows checksum
>>>>     calculation
Re: [Qemu-devel] [PATCH 6/7 v5] Packet abstraction used by VMWARE network devices
Hello, Anthony

Thanks for your comments, see inline.

On Mon, Apr 16, 2012 at 11:06 PM, Anthony Liguori wrote:
> On 03/18/2012 04:27 AM, Dmitry Fleytman wrote:
>>
>> Signed-off-by: Dmitry Fleytman
>> Signed-off-by: Yan Vugenfirer
>> ---
>> hw/vmxnet_pkt.c | 1243 +++
>> hw/vmxnet_pkt.h | 479 +
>> 2 files changed, 1722 insertions(+), 0 deletions(-)
>> create mode 100644 hw/vmxnet_pkt.c
>> create mode 100644 hw/vmxnet_pkt.h
>>
>> diff --git a/hw/vmxnet_pkt.c b/hw/vmxnet_pkt.c
>> new file mode 100644
>> index 000..5fe2672
>> --- /dev/null
>> +++ b/hw/vmxnet_pkt.c
>> @@ -0,0 +1,1243 @@
>> +/*
>> + * QEMU VMWARE VMXNET* paravirtual NICs - packets abstractions
>> + *
>> + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com)
>> + *
>> + * Developed by Daynix Computing LTD (http://www.daynix.com)
>> + *
>> + * Authors:
>> + * Dmitry Fleytman
>> + * Tamir Shomer
>> + * Yan Vugenfirer
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>> + * See the COPYING file in the top-level directory.
>> + *
>> + */
>> +
>> +#include "vmxnet_pkt.h"
>> +#include "vmxnet_utils.h"
>> +#include "iov.h"
>> +
>> +#include "net/checksum.h"
>> +
>> +/*=
>> + *=
>> + *
>> + * TX CODE
>> + *
>> + *=
>> + *===*/
>> +
>> +enum {
>> +    VMXNET_TX_PKT_VHDR_FRAG = 0,
>> +    VMXNET_TX_PKT_L2HDR_FRAG,
>> +    VMXNET_TX_PKT_L3HDR_FRAG,
>> +    VMXNET_TX_PKT_PL_START_FRAG
>> +};
>> +
>> +/* TX packet private context */
>> +typedef struct _Vmxnet_TxPkt {
>> +    struct virtio_net_hdr virt_hdr;
>> +    bool has_virt_hdr;
>> +
>> +    struct iovec *vec;
>> +
>> +    uint8_t l2_hdr[ETH_MAX_L2_HDR_LEN];
>> +    uint8_t l3_hdr[ETH_MAX_L3_HDR_LEN];
>> +
>> +    uint32_t payload_len;
>> +
>> +    uint32_t payload_frags;
>> +    uint32_t max_payload_frags;
>> +
>> +    uint16_t hdr_len;
>> +    eth_pkt_types_e packet_type;
>> +    uint16_t l3_proto;
>> +} Vmxnet_TxPkt;
>> +
>> +/**
>> + *
>> + * Function: vmxnet_tx_pkt_init
>> + *
>> + * Desc: Init function for tx packet functionality.
>> + *
>> + * Params: (OUT) pkt - private handle.
>> + * (IN) max_frags - max tx ip fragments.
>> + * (IN) has_virt_hdr - device uses virtio header.
>> + *
>> + * Return: 0 on success, -1 on error
>> + *
>> + * Scope: Global
>> + *
>> + */
>
>
> I applaud the use of comments but I don't think it's necessary to duplicate
> this in the .c and .h file. We also are using GtkDoc as our comment format
> these days.

Good point. Will be fixed in the next submission.

>
>
>> +int vmxnet_tx_pkt_init(Vmxnet_TxPkt_h *pkt, uint32_t max_frags,
>> +                       bool has_virt_hdr)
>> +{
>> +    int rc = 0;
>> +
>> +    Vmxnet_TxPkt *p = g_malloc(sizeof *p);
>> +    if (!p) {
>> +        rc = -1;
>> +        goto Exit;
>> +    }
>
>
>
> g_malloc cannot return NULL.

Thanks, fixed.

>
>
>> +
>> +    memset(p, 0, sizeof *p);
>
>
> g_malloc0 will memset for you.

Also fixed.

>
>> +
>> +    p->vec = g_malloc((sizeof *p->vec) *
>> +        (max_frags + VMXNET_TX_PKT_PL_START_FRAG));
>> +    if (!p->vec) {
>> +        rc = -1;
>> +        goto Exit;
>> +    }
>> +
>> +    p->max_payload_frags = max_frags;
>> +    p->has_virt_hdr = has_virt_hdr;
>> +    p->vec[VMXNET_TX_PKT_VHDR_FRAG].iov_base = &p->virt_hdr;
>>
>> +    p->vec[VMXNET_TX_PKT_VHDR_FRAG].iov_len =
>> +        p->has_virt_hdr ? sizeof p->virt_hdr : 0;
>> +    p->vec[VMXNET_TX_PKT_L2HDR_FRAG].iov_base = &p->l2_hdr;
>> +    p->vec[VMXNET_TX_PKT_L3HDR_FRAG].iov_base = &p->l3_hdr;
>>
>> +
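The two review points in this message (an allocator that never returns NULL makes the error branches dead code, and a zeroing allocator makes the explicit memset redundant) can be illustrated with a small standalone sketch. It deliberately does not use GLib: xmalloc0() is a hypothetical stand-in that aborts on failure and zero-fills, and TxPktSketch, tx_pkt_new() and tx_pkt_free() are names invented here, loosely modelled on the quoted init function. It is an illustration of the simplification, not the code from the next submission.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/uio.h>

/* Hypothetical stand-in for a g_malloc0()-style allocator: the memory is
 * zero-filled and the process aborts on failure, so callers never need a
 * NULL check. Unlike GLib, a zero-byte request still returns a valid
 * pointer here; that is a simplification of this sketch. */
void *xmalloc0(size_t size)
{
    void *p = calloc(1, size ? size : 1);
    if (!p) {
        abort();
    }
    return p;
}

/* Number of header fragments placed before the payload fragments,
 * mirroring VMXNET_TX_PKT_PL_START_FRAG in the quoted enum. */
enum { PL_START_FRAG = 3 };

/* Reduced, illustrative version of the TX packet context. */
typedef struct TxPktSketch {
    struct iovec *vec;
    uint32_t max_payload_frags;
    bool has_virt_hdr;
} TxPktSketch;

/* With an abort-on-failure, zeroing allocator the init path needs
 * neither error branches nor memset(). */
TxPktSketch *tx_pkt_new(uint32_t max_frags, bool has_virt_hdr)
{
    TxPktSketch *p = xmalloc0(sizeof *p);

    p->vec = xmalloc0(sizeof *p->vec * ((size_t)max_frags + PL_START_FRAG));
    p->max_payload_frags = max_frags;
    p->has_virt_hdr = has_virt_hdr;
    return p;
}

void tx_pkt_free(TxPktSketch *p)
{
    if (p) {
        free(p->vec);
        free(p);
    }
}
```

The trade-off is that allocation failure becomes fatal rather than recoverable, which is the convention the reviewer's comment relies on.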
[Qemu-devel] [PATCH 0/7 V6] VMXNET3 paravirtual NIC device implementation
From: Dmitry Fleytman

This set of patches implements the VMWare VMXNET3 paravirtual NIC device.
The device supports all of the device features, including offload
capabilities, VLANs, etc.

The device has been tested on the following OSes:
    Fedora 15
    Ubuntu 10.4
    Centos 6.2
    Windows 2008R2
    Windows 2008 64bit
    Windows 2008 32bit
    Windows 2003 64bit
    Windows 2003 32bit

Changes in V6:
    Fixed most of the problems pointed out by Michael S. Tsirkin.
    The only issue still open is the creation of a shared place for
    generic network structures and functions. Currently all generic
    network code introduced by VMXNET3 resides in the vmxnet_utils.c/h
    files. It could be moved to some shared location; however, we believe
    this is a matter for a separate refactoring, as there are a lot of
    copy-pasted definitions in almost every device, and code cleanup
    effort is required in order to create a truly shared codebase.

    Reported-by: Michael S. Tsirkin

    Implemented suggestions by Anthony Liguori

    Reported-by: Anthony Liguori

    Fixed incorrect checksum calculation for some packets in SW offloads mode

    Reported-by: Gerhard Wiesinger

Changes in V5:
    MSI-X save/load implemented in the device instead of pci bus as
    suggested by Michael S. Tsirkin

    Reported-by: Michael S. Tsirkin

    Patches regrouped as suggested by Paolo Bonzini

    Reported-by: Paolo Bonzini

Changes in V4:
    Fixed a few problems uncovered by NETIO test suite
    Assertion on failure to initialize MSI/MSI-X replaced with warning
    message and fallback to Legacy/MSI respectively

    Reported-by: Gerhard Wiesinger

    Various coding style adjustments and patch split-up as suggested by
    Anthony Liguori

    Reported-by: Anthony Liguori

    Live migration support added

Changes in V3:
    Fixed crash when net device that is used as network frontend has no
    virtio HDR support.
    Task offloads emulation for cases when net device that is used as
    network frontend has no virtio HDR support.

    Reported-by: Gerhard Wiesinger

Changes in V2:
    License text changed according to community suggestions
    Standard license header from GPLv2+ - licensed QEMU files used

Dmitry Fleytman (7):
  Adding missing flag VIRTIO_NET_HDR_F_DATA_VALID from Linux kernel
    source tree Reformatting comments according to checkpatch.pl
    requirements
  Adding utility function net_checksum_add_cont() that allows checksum
    calculation of scattered data with odd chunk sizes
  Adding utility function iov_net_csum_add() for iovec checksum
    calculation Adding utility function iov_rebuild() for smart iovec
    copy
  Header with various utility functions shared by VMWARE SCSI and
    network devices
  Various utility functions used by VMWARE network devices
  Packet abstraction used by VMWARE network devices
  VMXNET3 paravirtualized device implementation Device "vmxnet3" added.

 Makefile.objs           |    1 +
 default-configs/pci.mak |    1 +
 hw/pci.h                |    1 +
 hw/virtio-net.h         |   13 +-
 hw/vmware_utils.h       |  126 +++
 hw/vmxnet3.c            | 2435 +++
 hw/vmxnet3.h            |  762 +++
 hw/vmxnet_debug.h       |  121 +++
 hw/vmxnet_pkt.c         |  776 +++
 hw/vmxnet_pkt.h         |  311 ++
 hw/vmxnet_utils.c       |  219 +
 hw/vmxnet_utils.h       |  341 +++
 iov.c                   |   53 +
 iov.h                   |    6 +
 net/checksum.c          |   13 +-
 net/checksum.h          |   14 +-
 16 files changed, 5180 insertions(+), 13 deletions(-)
 create mode 100644 hw/vmware_utils.h
 create mode 100644 hw/vmxnet3.c
 create mode 100644 hw/vmxnet3.h
 create mode 100644 hw/vmxnet_debug.h
 create mode 100644 hw/vmxnet_pkt.c
 create mode 100644 hw/vmxnet_pkt.h
 create mode 100644 hw/vmxnet_utils.c
 create mode 100644 hw/vmxnet_utils.h

-- 
1.7.7.6
[Qemu-devel] [PATCH 1/7 V6] Adding missing flag VIRTIO_NET_HDR_F_DATA_VALID from Linux kernel source tree Reformatting comments according to checkpatch.pl requirements
From: Dmitry Fleytman

Signed-off-by: Dmitry Fleytman
Signed-off-by: Yan Vugenfirer
---
 hw/virtio-net.h |   13 +++--
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/hw/virtio-net.h b/hw/virtio-net.h
index 4468741..fa3c17b 100644
--- a/hw/virtio-net.h
+++ b/hw/virtio-net.h
@@ -78,13 +78,14 @@ struct virtio_net_config
  * specify GSO or CSUM features, you can simply ignore the header. */
 struct virtio_net_hdr
 {
-#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 // Use csum_start, csum_offset
+#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 /* Use csum_start, csum_offset */
+#define VIRTIO_NET_HDR_F_DATA_VALID 2 /* Csum is valid */
     uint8_t flags;
-#define VIRTIO_NET_HDR_GSO_NONE 0 // Not a GSO frame
-#define VIRTIO_NET_HDR_GSO_TCPV4 1 // GSO frame, IPv4 TCP (TSO)
-#define VIRTIO_NET_HDR_GSO_UDP 3 // GSO frame, IPv4 UDP (UFO)
-#define VIRTIO_NET_HDR_GSO_TCPV6 4 // GSO frame, IPv6 TCP
-#define VIRTIO_NET_HDR_GSO_ECN 0x80 // TCP has ECN set
+#define VIRTIO_NET_HDR_GSO_NONE 0 /* Not a GSO frame */
+#define VIRTIO_NET_HDR_GSO_TCPV4 1 /* GSO frame, IPv4 TCP (TSO) */
+#define VIRTIO_NET_HDR_GSO_UDP 3 /* GSO frame, IPv4 UDP (UFO) */
+#define VIRTIO_NET_HDR_GSO_TCPV6 4 /* GSO frame, IPv6 TCP */
+#define VIRTIO_NET_HDR_GSO_ECN 0x80 /* TCP has ECN set */
     uint8_t gso_type;
     uint16_t hdr_len;
     uint16_t gso_size;
-- 
1.7.7.6
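As a rough illustration of how a consumer might interpret the two flag bits touched by this patch, here is a minimal sketch. The flag values are the ones defined in the patch; the CsumState enum and csum_state_from_flags() are hypothetical names introduced only for this example, and the precedence given to NEEDS_CSUM when both bits are set is an assumption of the sketch.

```c
#include <stdint.h>

/* Flag values as defined in the patch above. */
#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 /* Use csum_start, csum_offset */
#define VIRTIO_NET_HDR_F_DATA_VALID 2 /* Csum is valid */

/* Hypothetical classification of a packet's checksum status. */
typedef enum {
    CSUM_UNKNOWN, /* no hint: the receiver has to verify the checksum */
    CSUM_PARTIAL, /* checksum still has to be completed */
    CSUM_VALID    /* checksum was already validated by the sender side */
} CsumState;

/* Map the `flags` byte of a virtio-net header to a checksum state.
 * NEEDS_CSUM is checked first, which is a choice made for this sketch. */
CsumState csum_state_from_flags(uint8_t flags)
{
    if (flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
        return CSUM_PARTIAL;
    }
    if (flags & VIRTIO_NET_HDR_F_DATA_VALID) {
        return CSUM_VALID;
    }
    return CSUM_UNKNOWN;
}
```

A device model could use such a helper to decide whether it may skip software checksum verification for a received packet.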