zhengruifeng commented on PR #53150: URL: https://github.com/apache/spark/pull/53150#issuecomment-3569616521
> Arrow isn't intended for long term storage it's intended as a wire protocol -- I don't love using it for persisting models. I'm -0.9 on this change for now. Parquet seems like a better choice most likely.

> does the arrow library provide APIs to write to local file?

@holdenk @cloud-fan Arrow supports Random Access Files, and it provides [APIs](https://arrow.apache.org/docs/python/ipc.html#writing-and-reading-random-access-files) to write to a local file. But our Arrow utils mainly work with serialized `ArrowRecordBatches` as `Array[Byte]`, so we would need to add new helper functions for `ArrowRecordBatches` if we want to use the Arrow file APIs.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
