Re: two proposed spec changes

Jacob Marble Tue, 29 Aug 2023 10:50:19 -0700

Please define "derived field"?

We don't allow empty string as a tag value, so that sentinel value is
available. However, there are some second-order effects that need to be
considered.


Just thinking out loud: I haven't explored using default values for tags in
our Iceberg export code, but I certainly need to.
https://iceberg.apache.org/spec/#default-values
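
To make the key rule from my earlier message (quoted below) concrete, here is
a rough Python sketch. It is only an illustration, not InfluxDB or Iceberg
code: the function names, the row layout, and the choice to sort tags by name
are all assumptions made for the example. The second function shows what the
empty-string sentinel idea might look like, relying on the fact that empty
string is not a valid tag value.

```python
from typing import Optional

# Hypothetical row layout for this sketch: (timestamp, tags, field F).
ROWS = [
    ("09:25", {"A": None, "B": None}, 1),
    ("09:25", {"A": "foo", "B": None}, 1),
    ("09:25", {"A": "foo", "B": "bar"}, 1),
    ("10:25", {"A": "foo", "B": "bar"}, 1),
]


def primary_key(timestamp: str, tags: dict[str, Optional[str]]) -> tuple:
    """Timestamp plus the non-null tags only (sorted by tag name, an assumption)."""
    non_null = tuple(sorted((k, v) for k, v in tags.items() if v is not None))
    return (timestamp,) + non_null


def primary_key_with_sentinel(
    timestamp: str, tags: dict[str, Optional[str]], sentinel: str = ""
) -> tuple:
    """Same key, but every tag column is present and null becomes the sentinel.

    This stays unambiguous only because the sentinel (empty string) can never
    be a real tag value.
    """
    filled = tuple(
        sorted((k, sentinel if v is None else v) for k, v in tags.items())
    )
    return (timestamp,) + filled


keys = [primary_key(ts, tags) for ts, tags, _ in ROWS]
sentinel_keys = [primary_key_with_sentinel(ts, tags) for ts, tags, _ in ROWS]

for key in keys:
    print(key)
```

Both variants keep the four example rows distinct; the difference is that the
sentinel form has a fixed set of key columns, which is closer to what a
required identifier field would need.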

On Tue, Aug 29, 2023 at 9:52 AM Ryan Blue <[email protected]> wrote:

> Jacob, could you model this with a derived field? Or could you require the
> tags and use an "unknown" value?
>
> On Mon, Aug 28, 2023 at 11:18 AM Jacob Marble <[email protected]>
> wrote:
>
>> On Fri, Aug 25, 2023 at 3:23 PM Ryan Blue <[email protected]> wrote:
>>
>>> I don't think that we should introduce nanosecond precision types
>>> without at least supporting both timestamp and timestamptz. I'm not sure
>>> whether nanosecond time should be supported.
>>>
>>
>> SGTM; this seems to be the most agreeable part of the proposal.
>>
>>> For the primary keys, what is the use case you're trying to solve? Do
>>> your tables allow null values in primary keys? If so, what is the purpose
>>> of it?
>>>
>>
>> InfluxDB is a schema-on-write database; tables and columns are created by
>> writing to them. Constraints:
>> - Every table has exactly one timestamp[nanos] column, and it is required.
>> - "Field" columns are typed (int, uint, float, string, bool). These are
>> the time series data that vary with time. At least one field value is
>> required, per row.
>> - "Tag" columns are only strings. These are identifying data - used for
>> grouping, filtering. Tag values are not required, whether tag columns are
>> present or not.
>>
>> Primary keys are composed of **non-null tags**, plus timestamp. For
>> example, these rows:
>>
>> timestamp | tag A | tag B | field(int) F
>> 09:25 | null | null | 1
>> 09:25 | "foo" | null | 1
>> 09:25 | "foo" | "bar" | 1
>> 10:25 | "foo" | "bar" | 1
>>
>> have these primary keys:
>>
>> (09:25)
>> (09:25,A="foo")
>> (09:25,A="foo",B="bar")
>> (10:25,A="foo",B="bar")
>>
>> InfluxDB uses these primary keys in two contexts:
>> - deduplication in query pipelines
>> - compaction (mitigate performance impact of query-time deduplication)
>>
>> --
>> Jacob Marble
>> 🇺🇸 🇺🇦
>>
>
>
> --
> Ryan Blue
> Tabular
>


-- 
Jacob Marble
🇺🇸 🇺🇦
