Hi, all:

In yesterday’s community sync we talked about where the different language 
clients should live. I think we all agree that the clients should be handled 
consistently, but no decision has been made yet. I want to continue the 
discussion here on the pros and cons of the two options: a mono repo (all 
clients in one big repo) or multiple small repos (one for each language client).

To make things clear, currently we have four language libraries under 
development:


  1.  Java: in main repo (https://github.com/apache/iceberg)
  2.  Python: in main repo (https://github.com/apache/iceberg)
  3.  Go: in main repo (https://github.com/apache/iceberg)
  4.  Rust: in standalone repo (https://github.com/apache/iceberg-rust/)

I currently contribute mainly to the Rust client, so I can share why I voted 
for a standalone repo:


  1.  Easier project setup. Iceberg is a complex project with several 
components, and it is mainly written in Java. As someone not very familiar with 
its project structure, I find it easier to start a new project than to fit into 
an existing one.
  2.  Faster CI workflow. In the early days of the Rust client’s development, 
we only need to touch Rust-related code. If everything lived in one mono repo, 
our changes would trigger unnecessary CI runs for other components.

I admit that these reasons may not hold for long-term maintenance, but they 
help fast-paced development in the early days.

After reviewing some discussions on the web, I have summarized the pros and 
cons of the two sides:

Mono Repo

Pros

  *   Visibility and transparency. It would be easier to follow the progress of 
all clients, and PRs could get more reviews and attention.
  *   Easier sharing of resources. It would be easier to share resources for 
integration tests.
Cons

  *   Increased project complexity. The project structure becomes more complex 
when it couples different languages and their toolchain setups.
  *   Longer build/CI time. Unnecessary CI checks may be triggered by small 
PRs in other languages.

Multi Repo

Pros

  *   Simpler project structure. Different languages may have different 
toolchains and project setups; one repo per language makes the project 
structure easier to understand and follow.
  *   Independent versioning and releases. Different languages may have 
different versioning and release processes. This is also possible in a mono 
repo, but I guess it would be easier with standalone repos.
  *   Improved build/CI time. No unnecessary CI checks are triggered.
Cons

  *   Difficult to track overall progress. Multiple repos make it harder to 
track what’s happening in different teams.
  *   Difficult to share common resources. It may be more difficult to share 
resources and run integration tests across different languages.


Please share your ideas and thoughts in this discussion!

References


  1.  https://www.coforge.com/blog/mono-repo-vs.-multi-repo-in-git-unravelling-the-key-differences
