Hi,

It just occurred to me that it may be a better idea to move the parquet
module into a separate sub-crate using cargo workspaces
<https://doc.rust-lang.org/book/ch14-03-cargo-workspaces.html>. The
advantage is that it makes the project more modular (in the future, we may
want to add more sub-crates such as arrow/parquet_derive, orc, gandiva,
etc.) and allows us to run CI jobs separately on each crate.
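
For illustration, here is a minimal sketch of what the workspace root
manifest could look like. The member directory names are assumptions on
my part, not a settled layout:

```toml
# Hypothetical root Cargo.toml; directory names are placeholders.
[workspace]
members = [
    "arrow",
    "parquet",
]
```

With a layout like this, a CI job could target a single crate with
something like "cargo test -p parquet".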

Some small caveats:
1. Cargo doesn't allow cyclic dependencies. So if the parquet sub-crate
depends on arrow, we can't reference parquet in arrow. This doesn't seem
like an issue though, since arrow itself should be independent of any
physical on-disk format. I also didn't see any reference to parquet in
cpp/src/arrow.
2. The path dependency used in the workspace has to be changed to a version
number when we do "cargo publish". This should be added to the release
instructions, and the committer who performs the release should do this
extra step.
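
To make the second caveat concrete, here is a rough sketch of the
dependency entry in a hypothetical parquet/Cargo.toml. The version numbers
and relative path are placeholders. As far as I know, Cargo also accepts
"path" and "version" on the same dependency, using the path for local
builds and the version for the published crate, which might reduce the
extra step to keeping the version in sync:

```toml
# Hypothetical parquet/Cargo.toml; versions and path are placeholders.
[package]
name = "parquet"
version = "0.1.0"

[dependencies]
# Local builds resolve arrow via the path; the version is what a
# published crate would depend on.
arrow = { path = "../arrow", version = "0.1.0" }
```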

Thoughts?

Chao
