walterddr edited a comment on issue #8045:
URL: https://github.com/apache/pinot/issues/8045#issuecomment-1022709088
Context
===
For more context, let's say a user wants to configure this during ingestion:
```
"dateTimeFieldSpecs": [{
  "name": "Date",
  "dataType": "STRING",
  "format": "1:SECONDS:SIMPLE_DATE_FORMAT:MM/dd/yyyy HH:mm:ss a",
  "granularity": "1:HOURS"
}]
```
They have to set `"dataType"` to `STRING` because they want the result of
```
Select Date From myTable
```
to be a string that conforms to the specified SDF (for example, the
string is fed directly into some downstream program).
Challenge
===
However,
1. In a SQL database, setting a column to STRING type means we need to support
`>=` and `<=` on the raw data format.
2. In Pinot, we can't support this SDF as a time column format because its
lexical order and time order are not consistent (e.g. `02/01/2021` comes after
`01/29/2022` in string ordering but before it in timestamp ordering). If we use
this field as the time field for partitioning real-time and offline tables, we
will get wrong results because the underlying ordering is STRING-based.
3. One can also configure the `dataType` as `TIMESTAMP` and have it converted
to a string at query time, but the result has to be in the ISO SQL standard
`yyyy-mm-ddTHH:MM:SS` format, which might not be what the user wants.
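To make point 2 concrete, here is a small Python sketch (not Pinot code) that sorts the two example dates both ways. The `FMT` string is my assumed `strptime` equivalent of the SDF `MM/dd/yyyy HH:mm:ss`; the AM/PM marker is left out to keep it short.

```python
from datetime import datetime

# Assumed strptime equivalent of the SDF "MM/dd/yyyy HH:mm:ss"
# (the trailing AM/PM marker from the config is omitted here).
FMT = "%m/%d/%Y %H:%M:%S"

values = ["01/29/2022 00:00:00", "02/01/2021 00:00:00"]

# STRING ordering: plain character-by-character comparison.
lexical = sorted(values)

# Timestamp ordering: parse each value first, then compare.
chronological = sorted(values, key=lambda s: datetime.strptime(s, FMT))

print("lexical:      ", lexical)
print("chronological:", chronological)
```

The two orderings disagree because the month is the most significant part of the string but not of the timestamp.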
Problem Statement
===
We want to create some kind of ingestion-configurable DataType (let's name
it `DateTime`) that (1) returns a string that conforms to the SDF configured
at ingestion, and (2) is ordered by epoch ordering,
so that
```
SELECT myDateTimeType FROM myTable ORDER BY myDateTimeType
```
returns
```
02/01/2021 00:00:00
01/01/2022 00:00:00
```
Proposal
===
We can store the actual data either as STRING or as LONG, but:
1. If we store it in raw string format and force it to order by the converted
epoch value, we have to convert it every time we make a comparison, which is
very costly.
2. If we store it as LONG, which is natively sorted in epoch order, and only do
the conversion at query time, we need to store the original SDF configured by
the user during ingestion somewhere, so we need to find a way to make it known
to Pinot at query time.
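As a rough illustration of option 2, the Python sketch below (again not Pinot code) stores epoch milliseconds and only formats back to the user's pattern when returning results. The helper names, the `STORED_FMT` constant standing in for "the SDF stored somewhere", and the use of UTC are all assumptions made for this example.

```python
from datetime import datetime, timezone

# Hypothetical stand-in for the ingestion-time SDF that would have to be
# persisted somewhere; strptime form of "MM/dd/yyyy HH:mm:ss".
STORED_FMT = "%m/%d/%Y %H:%M:%S"


def to_epoch_millis(raw: str, fmt: str = STORED_FMT) -> int:
    """Ingestion side: parse the raw string once and keep a LONG (UTC assumed)."""
    dt = datetime.strptime(raw, fmt).replace(tzinfo=timezone.utc)
    return int(dt.timestamp()) * 1000


def from_epoch_millis(millis: int, fmt: str = STORED_FMT) -> str:
    """Query side: format the LONG back into the user's pattern."""
    dt = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
    return dt.strftime(fmt)


ingested = ["01/01/2022 00:00:00", "02/01/2021 00:00:00"]
stored = [to_epoch_millis(v) for v in ingested]

# ORDER BY works on the LONG values; no per-comparison string parsing.
result = [from_epoch_millis(m) for m in sorted(stored)]
print(result)
```

Comparisons here run on integers, so the parsing cost from option 1 is paid once per value at ingestion and once per returned row at query time, rather than on every compare.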
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]