In my dreamworld, a packet timestamped at rx time on ethX, or on entry from local traffic, would have that timestamp taken from the right clock, consistently throughout the system, and the timestamp would reliably still be there when the packet gets to ethY.
This would save having to timestamp (again) inside the cb block in fq_codel, cake, etc. More importantly, it would let us measure total system delay from entrance to egress (e.g. tc filters, firewall rules, routing table lookups), not just queuing delay at the qdisc. That in turn would let us detect when a system is overloaded and reduce queue size and throughput accordingly to cope.
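
To make the difference concrete, here is a toy sketch in Python (not kernel code). The names `Packet`, `rx_ns`, `enqueue_ns`, `queue_delay_ns` and `system_delay_ns` are made up for illustration, both timestamps are assumed to come from the same monotonic clock, and the 5 ms target is just CoDel's usual default borrowed as a threshold.

```python
from dataclasses import dataclass

# CoDel's usual default target (5 ms), used here only as an example threshold.
TARGET_NS = 5_000_000


@dataclass
class Packet:
    rx_ns: int       # hypothetical: stamped once at ingress (rx on ethX or local send)
    enqueue_ns: int  # stamped again at qdisc enqueue, as is done today


def queue_delay_ns(pkt: Packet, now_ns: int) -> int:
    """Sojourn time inside the qdisc only."""
    return now_ns - pkt.enqueue_ns


def system_delay_ns(pkt: Packet, now_ns: int) -> int:
    """Delay from ingress to egress, including filters, firewall and routing."""
    return now_ns - pkt.rx_ns


def over_target(delay_ns: int, target_ns: int = TARGET_NS) -> bool:
    """True when the measured delay exceeds the target."""
    return delay_ns > target_ns


# A packet that spent 7 ms in filters/firewall/routing, then 1 ms in the queue.
pkt = Packet(rx_ns=0, enqueue_ns=7_000_000)
now = 8_000_000

print(over_target(queue_delay_ns(pkt, now)))   # False: the qdisc alone looks healthy
print(over_target(system_delay_ns(pkt, now)))  # True: the box as a whole is overloaded
```

With only the enqueue timestamp, the 7 ms spent before the qdisc is invisible. With a surviving rx timestamp, the same check sees the whole path.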