Hi Alex,

IMO, we should go with Option 2, with the value stored in this
column always being the Request ID (whether that value is the
`X-Request-Id`, the OTel Trace ID, etc.). This is mainly because I don't
currently see a use case where knowing whether the Request ID is an OTel
Trace ID or some other kind of value is useful to the user. I am open to
changing my preferred option if someone can share a use case where this
is not true.

IOW, I don't think OTel information is relevant to Events unless it is used
as the primary way to identify a request, so otherwise it should not be
persisted. However, the discussion at
https://lists.apache.org/thread/p9357rcy3d1j94w4yogtdwcf2kxzg3jr may
change this view: we may need to keep a "Span ID" (not necessarily
OTel-generated or OTel-specific) to identify corresponding
Before/After events. In that case, maybe we mix in Option 4 to store just
the "Span ID". I believe both email threads need more opinions before we
see the full set of requirements for what we should build.
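
To make Option 2 a bit more concrete, here is a rough sketch using SQLite
from Python, purely for illustration. The table name `events`, the column
name `request_id`, the event type strings, and the rule that a
client-supplied `X-Request-Id` wins over the OTel Trace ID are all my
assumptions, not something agreed in this thread.

```python
import sqlite3


def pick_request_id(headers, otel_trace_id=None):
    """Return the single opaque ID to persist for a request.

    Assumption: a client-supplied X-Request-Id takes precedence and the
    OTel trace ID is only a fallback. Returns None if neither is present.
    """
    return headers.get("X-Request-Id") or otel_trace_id


# Hypothetical events table with one nullable TEXT column (Option 2).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    " event_id INTEGER PRIMARY KEY,"
    " event_type TEXT NOT NULL,"
    " request_id TEXT)"
)
# The only query pattern is exact match, so a plain index is enough.
conn.execute("CREATE INDEX idx_events_request_id ON events (request_id)")


def record_event(event_type, headers, otel_trace_id=None):
    """Persist an event together with whichever ID was picked."""
    conn.execute(
        "INSERT INTO events (event_type, request_id) VALUES (?, ?)",
        (event_type, pick_request_id(headers, otel_trace_id)),
    )


def events_for(request_id):
    """Find all events for a request by exact match on the opaque ID."""
    rows = conn.execute(
        "SELECT event_type FROM events WHERE request_id = ? ORDER BY event_id",
        (request_id,),
    )
    return [row[0] for row in rows]


# Both IDs available: only the X-Request-Id is stored.
record_event("BEFORE_CREATE_TABLE", {"X-Request-Id": "req-123"}, "trace-abc")
record_event("AFTER_CREATE_TABLE", {"X-Request-Id": "req-123"}, "trace-abc")
# No X-Request-Id: the OTel trace ID is stored in the same column.
record_event("AFTER_LIST_TABLES", {}, "trace-def")
```

Note that the stored value stays opaque: nothing in the table says whether
`request_id` came from the header or from OTel, which is exactly the
trade-off listed under the cons of Option 2.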

Best,
Adnan Hemani

On Fri, Oct 31, 2025 at 5:22 AM Alexandre Dutra <[email protected]> wrote:

> Hi all,
>
> As a follow-up to [1], I'm starting this thread to discuss how we can
> persist client-generated request IDs and OTel context in our database.
>
> Quoting myself [2], the requirements I think we want to fulfil are:
>
> 1. Only one correlation ID is enough
> 2. The correlation ID is an opaque string
> 3. The main use case is to find events with matching correlation IDs
> 4. The only query pattern is by exact match
>
> A few options were suggested:
>
> 1) Two separate columns: request_id and otel_context, both of type
> TEXT (nullable).
>     - Pros: Easy to implement and offers good read performance,
> especially with indexes.
>     - Cons: Could be overkill, as often one context is sufficient for
> correlating records.
>
> 2) A single column: (e.g., correlation_id, final name TBD) of type TEXT.
>     - Pros: Same as option 1.
>     - Cons: If both a request ID and OTel context are available, we
> can only store one. There's no straightforward way to identify the
> type of context stored (unless we use a prefix).
>
> 3) Two columns: correlation_id and correlation_id_type.
>     - Pros: Same as option 1, and it addresses the issue of
> identifying the ID type.
>     - Cons: Might be over-engineered. Is the ID type truly essential?
> Isn't it opaque?
>
> 4) Leverage the existing additional_properties column: (JSONB in
> Postgres, TEXT in H2).
>     - Pros: Simple and flexible due to its schemaless nature, allowing
> us to add anything we need.
>     - Cons: Query performance might not be optimal, though indexes could
> help.
>
> What are your thoughts?
>
> Thanks,
> Alex
>
> [1] https://lists.apache.org/thread/bb1qyxjt827t3tomv2xp0s1kovxjsp94
> [2] https://lists.apache.org/thread/fqjjmxc6v8bbynwd5xfz83ngmp6gqqxj
>
