kenwenzel commented on issue #3006: URL: https://github.com/apache/parquet-java/issues/3006#issuecomment-2747521352
@gszadovszky My use case is repeated access to the same Parquet files to retrieve historical time-series data. Our SPARQL-based query engine allows joins between time-series points, e.g., "Find the current state of machine X when event Y occurred." If the machine states are close to each other (maybe within the same data page), then caching would help a lot. BTW, we store the same data in LevelDB, which is a lot faster than accessing the Parquet files. I think the main reason is that LevelDB caches decompressed blocks in memory.
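
To make the idea concrete, here is a minimal sketch of the kind of cache meant above: a size-bounded LRU map from (file, page offset) to the decompressed page bytes. Everything in it is hypothetical and for illustration only: `DecompressedPageCache`, `key`, and the loader callback are made-up names, not part of parquet-java, and the loader stands in for whatever code reads and decompresses a page.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical LRU cache of decompressed pages; NOT a parquet-java API. */
public class DecompressedPageCache {
    private final long maxBytes;
    private long currentBytes = 0;
    private long hits = 0;
    private long misses = 0;

    // accessOrder = true: iteration starts at the least recently used entry.
    private final LinkedHashMap<String, byte[]> pages =
            new LinkedHashMap<>(16, 0.75f, true);

    public DecompressedPageCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Hypothetical key format: file path plus the page's byte offset. */
    public static String key(String file, long pageOffset) {
        return file + "@" + pageOffset;
    }

    /**
     * Returns the cached page, or calls the loader (assumed to read and
     * decompress the page) and caches the result. The loader runs while
     * the lock is held, which keeps the sketch simple but serializes loads.
     */
    public synchronized byte[] get(String key, Supplier<byte[]> loader) {
        byte[] cached = pages.get(key);
        if (cached != null) {
            hits++;
            return cached;
        }
        misses++;
        byte[] page = loader.get();
        if (page.length <= maxBytes) { // pages larger than the budget are not cached
            pages.put(key, page);
            currentBytes += page.length;
            evictUntilWithinBudget();
        }
        return page;
    }

    // Drops least recently used pages until the byte budget is respected.
    private void evictUntilWithinBudget() {
        Iterator<Map.Entry<String, byte[]>> it = pages.entrySet().iterator();
        while (currentBytes > maxBytes && it.hasNext()) {
            Map.Entry<String, byte[]> eldest = it.next();
            currentBytes -= eldest.getValue().length;
            it.remove();
        }
    }

    public synchronized long hits() { return hits; }
    public synchronized long misses() { return misses; }
    public synchronized long sizeInBytes() { return currentBytes; }
}
```

With something like this in front of the page reader, two lookups that land in the same data page would only pay for decompression once, which is roughly what I assume LevelDB's block cache gives us today.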