Hi John, Thanks for your analysis. If you still have the strace logs would you attach them to the bug report together with your analysis?
So I think the questions now are.. Why do we have the difference in read/write timing? What are the events that need to happen before the first bytes are written to the /tmp/wireshark file ? Does the initial SHB to the file get generated from within dumpcap itself at startup, or does dumpcap wait for the start of the TCP stream with its own SHB and just pass that data through? I think it's the latter as you've proven the temporary file is exactly the same as the source. Does the delay for TCP connection cause that initial 0 byte read? What other delays are happening in dumpcap startup for which tshark isn't allowing enough time? I see that there is a update_cb parameter to sync_pipe_start() but it is NULL in the tshark case. wireshark uses it for ui update. The callback if it exists is called just before the pipe_input_set_handler(). Perhaps adding a callback to introduce a delay at this point when connecting to a TCP socket may help. Are there any other things we might try to fix or avoid this race condition? James On Thu, Nov 19, 2020 at 6:13 AM John Sullivan <jseth...@kanargh.org.uk> wrote: > On Monday, November 16, 2020, 8:45:48 PM, Alastair Scott wrote: > > I've posted an in-depth description of the issue with logs and pcap's > > attached here: https://gitlab.com/wireshark/wireshark/-/issues/17013 > > Observations after running under strace with "-f -s 4096" with the > same capture file (5992 bytes, SHA1 cefbb7): > > * tshark execs dumpcap to do the actual "capture" and communicates > with it over pipes and a temporary file > > * Given we're reading from a pcapng file, despite the convoluted data > path, dumpcap's output to the temp file is actually identical to the > original pcap file (same size, same sha1sum). This is written out in what > looks like packet-sized chunks as it reads them from the socket. > > * The reading end does three read() calls. The first two between them > read the first 4096 bytes, the third reads the remaining 1896 bytes. > The first happens almost immediately after opening the file, the > second two not until the writing end has finished writing all data > (which it signals by writing the packet count over the pipe). This is > the same between failing and successful runs. > > * By extracting the data from strace output I can prove that the > reading end does in fact read all data, and exactly the same data as > in the original file, even in the failing case. There is no problem > there. > > * One key difference appears to be that in the failing case the first > read happens before the writing end has written anything at all, and > reads 0 bytes. The second read then reads 4096 bytes, and the third > 1896 bytes, before the reading end throws that all away and aborts. In > the successful case the first read happens late enough to return 28 > bytes, then the second reads asks for (and gets) only 4068 bytes to > make up the difference to 4096 bytes. > > So it looks like a race at the beginning, and even though the complete > data transfer is executed in all cases, if that initial read didn't > get the first valid file chunk so the "section header block" can be > validated, then it will ultimately cause the file to be rejected. > > John > -- > Dead stars still burn > > ___________________________________________________________________________ > Sent via: Wireshark-dev mailing list <wireshark-dev@wireshark.org> > Archives: https://www.wireshark.org/lists/wireshark-dev > Unsubscribe: https://www.wireshark.org/mailman/options/wireshark-dev > mailto:wireshark-dev-requ...@wireshark.org > ?subject=unsubscribe
___________________________________________________________________________ Sent via: Wireshark-dev mailing list <wireshark-dev@wireshark.org> Archives: https://www.wireshark.org/lists/wireshark-dev Unsubscribe: https://www.wireshark.org/mailman/options/wireshark-dev mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe