GitHub user mapleFU added a comment to the discussion: TimeSeries Proposal
For (4), you can compute it dynamically. In fact, for (1) I think the timestamp in the root metadata can be eliminated if we care about performance.

> We could allow users to specify chunk_size during time series creation to avoid oversized chunks? A key advantage of fixed time-window chunks is that given a timestamp, the corresponding chunk can be quickly located through simple calculations. Could you elaborate further on the "internal merging rules" and "dynamic chunking"?

The problem is that, assuming many tags and timelines on a running server, there will sometimes be burst writes, which make a chunk extremely large. At other times writes arrive slowly, which produces very small or even empty chunks. This is hard for users to tune in most cases. If `chunk_size` is expressed in bytes, it guarantees the system will not have "extremely large" or "extremely small" chunks. If we switch chunks on time boundaries first, we would then need to merge small chunks into larger ones; otherwise the system can get slower and slower.

> The Compressed Chunk Type might be optional, as the performance benefits of compression are still unclear

Generally compression requires something like "byte-level" handling and prevents the existing SIMD decoding, but I haven't tested that. Personally, I would regard a compressed chunk as "sealed" (meaning no or few writes), so it should be optimized for size and aggregation.

A two-stride layout is good for scans (RANGE) but not good for point gets (GET).

GitHub link: https://github.com/apache/kvrocks/discussions/3044#discussioncomment-13726774
