On Mon, May 05, 2025 at 06:51:58PM -0300, Fabiano Rosas wrote:
> Peter Xu <pet...@redhat.com> writes:
> 
> > On Wed, Apr 16, 2025 at 10:43:55AM -0300, Fabiano Rosas wrote:
> >> When qatomic_fetch_inc() started being used to count the number of
> >> packets sent, the printing of the number of packets received stopped
> >> matching the number of packets sent.
> >> 
> >> Fix by moving the increment of the number of packets on the recv side
> >> to multifd_recv_unfill_packet().
> >> 
> >> Also change the tracepoint text because "packet num" is ambiguous for
> >> the sync since the packet number of the actual sync packet will be one
> >> less than the total number of packets seen so far.
> >
> > Would this be a hint that the recv side may not really need a global
> > packet_num counter, at all?
> >
> > On source, it needs to be there if we want to have an unified unique ID for
> > each multifd packet, so that when allcating a packet we need them to be
> > assigned properly.
> >
> > On dest, it almost only receives packets, it's still unclear to me how the
> > recv packet_num global var could help us considering we have the per-packet
> > trace in trace_multifd_recv_unfill() dumping the unique ID for each..
> >
> > So.. would it be of any use?  Would it be better if we remove it instead?
> >
> 
> It's good for debugging. The p->packet_num on the recv side will at some
> point contain the total number of packets sent, but it's hard to answer
> how many packets have arrived at any given moment just by looking at
> trace_multifd_recv_unfill(). Packets could arrive out of order.

IMHO it can also be confusing at the same time..

E.g., before this patch the value doesn't mean all the packets smaller than
that have landed.. but only the max of whatever recv side got.

While after this patch, IIUC it will increase each time, but it can be
confusing in another way when e.g. the packets arrives in different order
and this value can imply something weird.  Consider below seq:

  Packets arrival: 1,2,3,5

Then this var will be 4 even after 5 received, but without getting 4..

But yeah maybe it's still helpful when we only want to sanity check the
total amount matches on src.  I'm OK we keep it like the new definition if
that helps in any way.

> 
> I'm inclined to say that p->packet_num is the one that's useless. After
> this patch, it's only there to hold an endianness swap. We can reach the
> BE value via p->packet->packet_num already.

Yep, I think the tracepoint is more important than the value cached,
especially when cached twice..

-- 
Peter Xu


Reply via email to