Hi,

I figured out why the chardev socket backend disconnects, and I would like to make this an RFC to see how I should fix it.

Current scenario after boot-up:
1. tcp_chr_read_poll keeps polling slirp_socket_can_recv, and slirp_socket_can_recv returns 0 because slirp_find_ctl_socket cannot find the guestfwd socket.
2. The 0 returned in step 1 is assigned to s->max_size (s is a SocketChardev *), and the socket chardev handler does not read because the readable size is 0.
3. When the first request is sent, the guestfwd socket is added to slirp's socket list. From then on, tcp_chr_read_poll returns the result of sopreprbuf, which is > 0, instead of 0.
4. tcp_chr_read reads the data.
5. tcp_chr_read_poll still returns a value > 0, which is the output of sopreprbuf.
6. tcp_chr_read reads again, but there is nothing in the buffer, so it closes the connection.
7. Any follow-up requests are not handled.

The tcp_chr_* functions are in file [1], and the slirp_* functions are in file [2].

My questions:

1. Since this does not work for the second and later requests, I want to know how it is supposed to work. To avoid asking vaguely, I describe my thoughts below; please correct me if I am wrong:
   a. The state machine in the chardev socket should stay in the connected state (s->state == TCP_CHARDEV_STATE_CONNECTED). This means no change in [1].
   b. slirp_socket_can_recv should return 0 once all data has been read, instead of the outcome of sopreprbuf. This means I need to remove the socket or change its state to "no file descriptor" [3], namely somehow reset it.
   c. When a new request comes in, the socket needs to be added back to this slirp instance's socket list, have its file descriptor populated, and have the connection established.
   b and c sound convoluted, so I want to check.
2. What is the outcome of the sopreprbuf function [3]?
   Since it is returned to the tcp_chr_read_poll function, I thought it was the number of readable bytes in the socket, but in my test I noticed the following:

   tcp_chr_read_poll_size : s->max_size: 132480
   tcp_chr_read : size: 2076
   tcp_chr_read_poll_size : s->max_size: 129600
   tcp_chr_read : size: 0

   Even when nothing remains in the buffer (read size 0), the value is still non-zero, so the read function keeps reading until it closes the connection. Also, 132480 - 129600 = 2880 vs. 2076: the difference does not match the number of bytes read.
   Either I need to go with steps b and c from question 1, or I don't need to delete the socket but sopreprbuf is not the right thing to use there and I need to correct it.

I also updated https://gitlab.com/qemu-project/qemu/-/issues/1835.
Any feedback will be appreciated, thanks!
Felix

[1]. https://gitlab.com/qemu-project/qemu/-/blob/master/chardev/char-socket.c#L141
[2]. https://gitlab.freedesktop.org/slirp/libslirp/-/blob/master/src/slirp.c#L1582
[3]. https://gitlab.freedesktop.org/slirp/libslirp/-/blob/master/src/socket.h#L221

On Wed, Aug 23, 2023 at 10:27 AM Felix Wu <f...@google.com> wrote:

> Update on debugging this thing (already updated
> https://gitlab.com/qemu-project/qemu/-/issues/1835):
> I saw that `tcp_chr_free_connection` was called after the first response
> being successfully sent:
> ```
>
> slirp_guestfwd_write guestfwd_write: size 80
> tcp_chr_write tcp_chr_write: s->state:2
> tcp_chr_write tcp_chr_write: len:80
> qemu_chr_write_parameter len: 80 // tracking qemu_chr_write
> qemu_chr_write_res len: 80 // same thing
> tcp_chr_free_connection tcp_chr_free_connection: state: 2, changing it to disconnect
> tcp_chr_change_state tcp_chr_change_state: state: 2, next state: 0 // state 2==connected, 0==disconnected.
>
> ```
> And after that, the state of `SocketChardev` remained disconnected, and
> when the 2nd request came in, the `tcp_chr_write` dropped it directly.
> Maybe this state machine should be reset after every connection? Not sure.
>
> On Thu, Aug 17, 2023 at 11:58 AM Felix Wu <f...@google.com> wrote:
>
>> Hi Samuel,
>>
>> Thanks for the clarification! I missed the email so didn't reply in time,
>> but was able to figure it out.
>>
>> Hi everyone,
>> IPv6 guestfwd works in my local test but it has a weird bug: if you send
>> two requests, the first one gets the correct response, but the second one
>> gets stuck.
>> I am using a simple http server for this test, and just noticed this bug
>> also exists in IPv4 guestfwd. I've documented it in
>> https://gitlab.com/qemu-project/qemu/-/issues/1835.
>>
>> Just want to check if anyone has seen the same issue before.
>>
>> Thanks! Felix
>>
>> On Thu, Jul 20, 2023 at 7:54 AM Samuel Thibault <samuel.thiba...@gnu.org>
>> wrote:
>>
>>> Hello,
>>>
>>> Felix Wu, on Tue. 18 Jul. 2023 18:12:16 -0700, wrote:
>>> > 02 == SYN so it looks good. But both tcpdump and wireshark (looking into packet
>>> > dump provided by QEMU invocation)
>>>
>>> Which packet dump?
>>>
>>> > I added multiple prints inside slirp and confirmed the ipv6 version of [1] was
>>> > reached.
>>> > in tcp_output function [2], I got following print:
>>> > qemu-system-aarch64: info: Slirp: AF_INET6 out dst ip =
>>> > fdb5:481:10ce:0:8c41:aaff:fea9:f674, port = 52190
>>> > qemu-system-aarch64: info: Slirp: AF_INET6 out src ip = fec0::105, port = 54322
>>> > It looks like there should be something being sent back to the guest,
>>>
>>> That's what it is.
>>>
>>> > unless my understanding of tcp_output is wrong.
>>>
>>> It looks so.
>>>
>>> > To understand the datapath of guestfwd better, I have the following questions:
>>> > 1. What's the meaning of tcp_input and tcp_output? My guess is the following
>>> > graph, but I would like to confirm.
>>>
>>> No, tcp_input is for packets that come from the guest, and tcp_output is
>>> for packets that are sent to the guest.
>>> So it's like that:
>>>
>>> >      tcp_input         write_cb             host send()
>>> > QEMU --------> slirp -----------> QEMU --------------------> host
>>> >      <--------       <---------        <-----------------
>>> >      tcp_output    slirp_socket_recv        host recv()
>>>
>>> > 2. I don't see port 6655 in the above process. How does slirp know 6655 is the
>>> > port that needs to be visited on the host side?
>>>
>>> Slirp itself *doesn't* know that port. The guestfwd piece just calls the
>>> SlirpWriteCb when it has data coming from the guest. See the
>>> documentation:
>>>
>>> /* Set up port forwarding between a port in the guest network and a
>>>  * callback that will receive the data coming from the port */
>>> SLIRP_EXPORT
>>> int slirp_add_guestfwd(Slirp *slirp, SlirpWriteCb write_cb, void *opaque,
>>>                        struct in_addr *guest_addr, int guest_port);
>>>
>>> and
>>>
>>> /* This is called by the application for a guestfwd, to provide the data to be
>>>  * sent on the forwarded port */
>>> SLIRP_EXPORT
>>> void slirp_socket_recv(Slirp *slirp, struct in_addr guest_addr, int guest_port,
>>>                        const uint8_t *buf, int size);
>>>
>>> Samuel
>>>
>>
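
P.S. To make the poll/read sequence from the top of this mail easier to discuss, here is a toy model in C. It is only a sketch under assumptions: the toy_* names, the struct, and the gate_on_pending switch are all made up for illustration and are not QEMU or libslirp code. It assumes, as described in steps 5 and 6, that the poll step reports a size that stays non-zero even when nothing is pending, and that a 0-byte read tears the connection down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the values mirror the 2 == connected,
 * 0 == disconnected numbering seen in the trace. */
enum toy_state { TOY_DISCONNECTED = 0, TOY_CONNECTED = 2 };

struct toy_chardev {
    enum toy_state state;
    size_t pending;   /* bytes waiting to be read (assumption) */
    size_t poll_size; /* what the poll step reports while connected */
};

/* Models the poll step: the reported size does not depend on `pending`. */
size_t toy_read_poll(const struct toy_chardev *c)
{
    assert(c != NULL);
    return c->state == TOY_CONNECTED ? c->poll_size : 0;
}

/* Models the read step: a read that yields 0 bytes closes the connection. */
size_t toy_read(struct toy_chardev *c, size_t max_size)
{
    size_t n = c->pending < max_size ? c->pending : max_size;

    if (n == 0) {
        c->state = TOY_DISCONNECTED;
        return 0;
    }
    c->pending -= n;
    return n;
}

/* Runs poll/read until the poll step reports 0 and returns the number of
 * read calls made. With gate_on_pending set, the poll result is forced to 0
 * once nothing is pending (roughly option b in question 1). */
int toy_run(struct toy_chardev *c, bool gate_on_pending)
{
    int reads = 0;

    for (;;) {
        size_t max_size = toy_read_poll(c);

        if (gate_on_pending && c->pending == 0) {
            max_size = 0;
        }
        if (max_size == 0) {
            break;
        }
        toy_read(c, max_size);
        reads++;
    }
    return reads;
}
```

With a connected toy_chardev that has 2076 pending bytes, toy_run(&c, false) performs a second, empty read and leaves the state disconnected, while toy_run(&c, true) stops after one read and stays connected.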