zhuqi-lucas commented on PR #16395:
URL: https://github.com/apache/datafusion/pull/16395#issuecomment-2987414899

   Thank you @alamb,
   I am excited to share an update today: I resolved the page index conflicts
by adding a new API in arrow-rs that can write bytes to the buf. This keeps the
buf-written metrics consistent, and because the page index also uses
buf-written, it is safe now. I have enabled the page index for the example, and
the test results look good!
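
To illustrate why a consistent bytes-written count matters, here is a minimal std-only sketch. It is not the arrow-rs API from the linked PR: `TrackedWriter`, `write_with_index`, and `read_index` are hypothetical names, and the "footer" is just a stand-in for whatever the writer appends after the custom index. The point is only that when every byte goes through one counting writer, the offset recorded for the custom index stays valid for anything written later.

```rust
use std::io::{self, Write};

/// Hypothetical helper (not an arrow-rs type): wraps a writer and counts
/// every byte that passes through it.
struct TrackedWriter<W: Write> {
    inner: W,
    bytes_written: usize,
}

impl<W: Write> TrackedWriter<W> {
    fn new(inner: W) -> Self {
        Self { inner, bytes_written: 0 }
    }

    /// Total number of bytes accepted by the inner writer so far.
    fn bytes_written(&self) -> usize {
        self.bytes_written
    }

    fn into_inner(self) -> W {
        self.inner
    }
}

impl<W: Write> Write for TrackedWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        // Count only what the inner writer actually accepted.
        self.bytes_written += n;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Writes `data`, then the custom `index` blob, then a stand-in `footer`.
/// Returns the file bytes plus the (offset, length) of the index blob.
fn write_with_index(
    data: &[u8],
    index: &[u8],
    footer: &[u8],
) -> io::Result<(Vec<u8>, usize, usize)> {
    let mut writer = TrackedWriter::new(Vec::new());
    writer.write_all(data)?;
    // The index offset comes from the same counter that later writes use.
    let offset = writer.bytes_written();
    writer.write_all(index)?;
    writer.write_all(footer)?;
    Ok((writer.into_inner(), offset, index.len()))
}

/// Reads the index blob back; returns None if the range is out of bounds.
fn read_index(file: &[u8], offset: usize, len: usize) -> Option<&[u8]> {
    file.get(offset..offset.checked_add(len)?)
}
```

Calling `write_with_index(b"page-data", b"foo\nbar", b"footer")` and passing the returned offset and length to `read_index` gives back the same index bytes, regardless of what was appended after them.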
   
   Until that code is merged, I am using this arrow-rs branch:
   https://github.com/apache/arrow-rs/pull/7714
   
   
   Here are the logs printed by the example. They look good, thanks!
   
   
   ```text
   Writing values: [ByteArray { data: "foo" }, ByteArray { data: "bar" }, ByteArray { data: "foo" }]
   Writing custom index at offset: 68, length: 7
   Finished writing file to /var/folders/q7/zjtv8rvx2hz0_t_rjjq8p9k00000gp/T/.tmp9zCIJt/a.parquet
   Writing values: [ByteArray { data: "baz" }, ByteArray { data: "qux" }]
   Writing custom index at offset: 68, length: 7
   Finished writing file to /var/folders/q7/zjtv8rvx2hz0_t_rjjq8p9k00000gp/T/.tmp9zCIJt/b.parquet
   Writing values: [ByteArray { data: "foo" }, ByteArray { data: "quux" }, ByteArray { data: "quux" }]
   Writing custom index at offset: 70, length: 8
   Finished writing file to /var/folders/q7/zjtv8rvx2hz0_t_rjjq8p9k00000gp/T/.tmp9zCIJt/c.parquet
   Reading index from /var/folders/q7/zjtv8rvx2hz0_t_rjjq8p9k00000gp/T/.tmp9zCIJt/a.parquet (size: 363)
   Reading index at offset: 68, length: 7
   Read distinct index for a.parquet: "a.parquet"
   Reading index from /var/folders/q7/zjtv8rvx2hz0_t_rjjq8p9k00000gp/T/.tmp9zCIJt/b.parquet (size: 363)
   Reading index at offset: 68, length: 7
   Read distinct index for b.parquet: "b.parquet"
   Reading index from /var/folders/q7/zjtv8rvx2hz0_t_rjjq8p9k00000gp/T/.tmp9zCIJt/c.parquet (size: 368)
   Reading index at offset: 70, length: 8
   Read distinct index for c.parquet: "c.parquet"
   Filtering for category: foo
   Pruned files: ["c.parquet", "a.parquet"]
   +----------+
   | category |
   +----------+
   | foo      |
   | foo      |
   | foo      |
   +----------+
   
   ```
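
The pruning step visible in the log can be sketched roughly as follows. This is a std-only illustration, not the example's actual code: `sample_index` and `files_to_scan` are hypothetical names, and the per-file distinct values are copied from the "Writing values" lines above. A file is kept only if its distinct-value set contains the filter value, so `b.parquet` is skipped for `category = foo`.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical in-memory form of the per-file distinct-value index,
/// filled with the values shown in the log above.
fn sample_index() -> BTreeMap<String, BTreeSet<String>> {
    let files: [(&str, &[&str]); 3] = [
        ("a.parquet", &["foo", "bar"]),
        ("b.parquet", &["baz", "qux"]),
        ("c.parquet", &["foo", "quux"]),
    ];
    files
        .iter()
        .map(|(name, values)| {
            let set = values.iter().map(|v| v.to_string()).collect();
            (name.to_string(), set)
        })
        .collect()
}

/// Returns the files whose distinct-value set contains `value`.
/// All other files can be skipped without reading their data pages.
fn files_to_scan<'a>(
    index: &'a BTreeMap<String, BTreeSet<String>>,
    value: &str,
) -> Vec<&'a str> {
    index
        .iter()
        .filter(|(_, values)| values.contains(value))
        .map(|(file, _)| file.as_str())
        .collect()
}
```

With the sample index, `files_to_scan(&index, "foo")` yields `a.parquet` and `c.parquet`. The sketch uses a `BTreeMap`, so the result is sorted by file name, whereas the log above lists the files in a different order.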
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

