For our current use case, I’d like to minimize the Pax build to reduce dependencies and avoid unnecessary components. Specifically, I’ve disabled the following CMake options:

    option(USE_MANIFEST_API "Use manifest API" OFF)
    option(USE_PAX_CATALOG "Use manifest API, by pax impl" OFF)
    option(BUILD_GTEST "Build with Google Test" OFF)
    option(BUILD_GBENCH "Build with Google Benchmark" OFF)
    option(BUILD_TOOLS "Build with Pax tools" OFF)

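For reference, the same five switches could probably be preset without editing CMakeLists.txt, by loading a CMake initial-cache script. This is only a sketch under assumptions: the file name pax-minimal.cmake is hypothetical, and it assumes the PAX CMake project is configured directly with cmake in a fresh build directory (an existing CMakeCache.txt would keep its old values).

```cmake
# pax-minimal.cmake (hypothetical file name, not part of the repository).
# Preloads the CMake cache so that the options named above start out OFF.
# option() does not overwrite a value that is already in the cache, so
# these entries take priority over the defaults in CMakeLists.txt.
set(USE_MANIFEST_API OFF CACHE BOOL "Use manifest API")
set(USE_PAX_CATALOG  OFF CACHE BOOL "Use manifest API, by pax impl")
set(BUILD_GTEST      OFF CACHE BOOL "Build with Google Test")
set(BUILD_GBENCH     OFF CACHE BOOL "Build with Google Benchmark")
set(BUILD_TOOLS      OFF CACHE BOOL "Build with Pax tools")
```

A possible invocation from an empty build directory would be `cmake -C pax-minimal.cmake <path-to-pax-source>`, where the source path is a placeholder. Passing each switch as `-DNAME=OFF` on the command line should have the same effect.
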
This configuration allows us to build only the core modules without pulling in submodules or extra third-party libraries. Please let me know if any additional flags should be adjusted for this lightweight setup.

On Sat, May 10, 2025 at 07:25, Leonid Borchuk <le.borc...@gmail.com> wrote:

> Hi, all
>
> I really like the PostgreSQL approach - configure && make && make install.
> And usually there are no additional packages or builds required. PostgreSQL
> seems to be compiled everywhere - even on a coffee machine. It would be
> great to see the same for Cloudberry. Since PAX is quite a complex feature,
> it would be better to have a special option --enable-pax.
>
> I believe we will continue to create RPM and DEB packages with all features
> enabled, so users can take precompiled binaries and start using PAX from
> scratch.
>
> But for those who want to contribute, they can easily compile their first
> binary and then enable features step by step.
>
> On Fri, May 9, 2025 at 11:05 AM Dianjin Wang <wangdian...@gmail.com>
> wrote:
>
> > Hi Jiaqi and all,
> >
> > Thanks for your reply and for sharing your insights. I’d like to
> > respond with a few thoughts from a community and release readiness
> > perspective.
> >
> > 1. Lack of transparency in introducing PAX
> >
> > While I fully respect the engineering efforts behind PAX, we must also
> > reflect on how the feature was introduced. There was no public design
> > proposal or open discussion prior to its contribution. The PR was open
> > for a few weeks, but it was too large to review effectively. The goals
> > behind PAX might be well-understood within certain internal teams, but
> > not yet widely discussed in the Cloudberry community. At that time, we
> > rushed to merge it into the main branch to catch up with the release -
> > I also gave my +1 to the related PRs.
> >
> > This is a valuable lesson for all of us - especially for large,
> > impactful features, we need more transparency and community
> > discussion, just like the ongoing one around perfmon[1]. I don't mean
> > to blame anyone; it's one part of the story of incubation.
> >
> > The key point is that enabling PAX by default also goes against the
> > description in the doc[2]. This is why we have this discussion thread.
> > I have created one PR to set PAX to be disabled by default[3].
> > Compared to the thousands of file changes in PAX, mine is very small;
> > however, we are having such a deep conversation here, which is quite
> > interesting and makes me proud.
> >
> > 2. Having PAX enabled by default is beyond users' expectations for now
> >
> > The concern is not whether PAX is valuable - it’s about how users
> > perceive and experience it.
> >
> > Having PAX enabled by default breaks the general expectation of most
> > users familiar with Greenplum or PostgreSQL. We must admit that most
> > Cloudberry users will be ones who migrate from the open-source
> > Greenplum. This difference can lead to:
> >
> > * Confusing build failures
> > * Extra burden on users' Cloudberry installation
> > * Frustration during first-time installation, which may cause them to
> >   give up on Cloudberry entirely
> >
> > We have already seen these failures among the PPMC members, and I bet
> > it will be the same for general users.
> >
> > Our goal is to help PAX mature through community feedback. The best
> > way is to make it opt-in - let users choose to enable it, experiment
> > with it, and report issues, rather than forcing it upon them.
> >
> > 3. Let’s give users the power to choose
> >
> > Instead of enabling PAX by default, we can:
> >
> > * Provide clear instructions on how to enable it
> > * Share blog posts and documentation explaining its advantages and
> >   roadmap
> > * Present it at community events (meetups, talks, etc.)
> >
> > This gives users the opportunity to understand what PAX is, why it
> > matters, and decide whether it’s right for them. `--enable-pax` is
> > meaningfully different from `--disable-pax`; the former places less
> > burden on users' operations.
> >
> > Eventually, if the community widely adopts PAX and achieves consensus,
> > we can consider enabling it by default in a future release through a
> > public conversation like this. At least, we should disable PAX by
> > default for the coming 2.0.0 release, which will be critical for
> > Cloudberry as a newly incubating project. If we enable PAX by
> > default at this stage, we risk losing the opportunity to grow our user
> > base.
> >
> > 4. Being different from Greenplum is not a problem - we are building
> > an Apache community
> >
> > I understand your concern about Cloudberry becoming a “subset of
> > Greenplum.” But in fact, we are already quite different: our community
> > is open, the governance is collaborative, and we follow the Apache Way,
> > in addition to these existing great features, including PAX. That in
> > itself sets Cloudberry apart.
> >
> > To summarize:
> >
> > I believe PAX is promising, and we should keep promoting it actively.
> > But we should also respect users’ installation experience, at least in
> > our first Apache release. Love to hear more voices!
> >
> > [1] https://github.com/apache/cloudberry/discussions/1087
> > [2] https://github.com/apache/cloudberry/tree/main/contrib/pax_storage/doc
> > [3] https://github.com/apache/cloudberry/pull/1081
> >
> > Best,
> > Dianjin Wang
> >
> > On Fri, May 9, 2025 at 12:32 PM Zhang Mingli <avamin...@apache.org>
> > wrote:
> > >
> > > Hi, all
> > >
> > > To move forward,
> > > How about adding a --disable-pax configuration flag first?
> > > Similar to --disable-orca, by default, the build would include PAX,
> > > but developers should have the option to exclude it if needed.
> > > When --disable-pax is specified, the build system should skip
> > > compiling PAX entirely.
> > >
> > > On 2025/05/06 03:41:27 Max Yang wrote:
> > > > PAX, as a storage plugin, has been contributed to the community.
> > > > Should we turn on PAX by default, encourage users to use it more
> > > > and provide more feedback to help polish the product, but also
> > > > provide an option to turn it off for users who do not want it?
> > > > We can make submodule init automated to reduce the impact on other
> > > > users.
> > > >
> > > > Best regards, Max Yang
> > > >
> > > >
> > > > On Tue, May 6, 2025 at 11:10 AM jiaqi.zhou <jiaqi...@163.com> wrote:
> > > >
> > > > > Hi all, Wishing you a happy Labor Day.
> > > > >
> > > > > I DON'T AGREE that disabling PAX by default is a good idea.
> > > > >
> > > > > 1. About "Build Consistency with Other contrib Modules"
> > > > >
> > > > > You may notice that the AO table does not exist as an extension
> > > > > in CBDB/GP, which is of course partly for historical reasons.
> > > > > But in fact, when we were working on PAX, we had to clean up a
> > > > > lot of AO-specific logic in the CBDB kernel. Currently, the AO
> > > > > tables could be modularized and moved to contrib/gpcontrib, but
> > > > > we did not do so. PAX was designed to replace the AO table, and
> > > > > the current version of PAX has more features than AO and better
> > > > > scalability. It also performed better than the AO table in some
> > > > > tests.
> > > > >
> > > > > If you still think that all extensions in contrib should be
> > > > > disabled, then CBDB will completely serve as a subset of GP.
> > > > >
> > > > > 2. About "Reduce Build Failures"
> > > > >
> > > > > Ed has mentioned compilation improvement suggestions before, and
> > > > > PAX will make corresponding compilation changes later.
> > > > >
> > > > > 3. About "Optional Nature of the Feature"
> > > > >
> > > > > Which part do you think is the "core feature"? Although PAX is
> > > > > an optional AM function, it is superior to the AO table in all
> > > > > aspects. Is there enough evidence to show that users don’t need
> > > > > PAX? If we do disable PAX by default, that will make users less
> > > > > aware of what PAX does.
> > > > >
> > > > > Finally, I always think that we should "solve problems when they
> > > > > occur" instead of avoiding them. As a new feature, PAX will
> > > > > encounter many problems before it matures, and we should not
> > > > > avoid these problems. If we just avoid problems blindly, CBDB
> > > > > will become a subset of GP with many useless features.
> > > > >
> > > > > Thanks
> > > > > Jiaqi
> > > > >
> > > > > At 2025-04-25 14:23:37, "Dianjin Wang" <wangdian...@gmail.com>
> > > > > wrote:
> > > > > >Hi all,
> > > > > >
> > > > > >I’d like to bring up a point regarding the PAX recently merged
> > > > > >into the main branch.
> > > > > >
> > > > > >Currently, the PAX is enabled by default in configure, and users
> > > > > >need to explicitly disable it via the `--disable-pax` option.
> > > > > >However, this behavior is inconsistent with most of the other
> > > > > >extensions under the contrib/ or gpcontrib/ dir, which are
> > > > > >typically disabled by default unless explicitly enabled.
> > > > > >
> > > > > >I believe it would be more user-friendly and consistent to change
> > > > > >the default behavior of PAX to disabled, requiring users to
> > > > > >opt in via `--enable-pax`.
> > > > > >
> > > > > >Here’s why:
> > > > > >
> > > > > >1. Build Consistency with Other contrib Modules: Most contrib
> > > > > >plugins are not enabled by default. Changing PAX to follow this
> > > > > >pattern aligns with user expectations, especially for long-time
> > > > > >Greenplum users.
> > > > > >
> > > > > >2. Reduce Build Failures: PAX currently requires downloading
> > > > > >several submodules during the build. For users who install
> > > > > >Cloudberry from source without prior knowledge of this
> > > > > >requirement, this will lead to build failures, which can be
> > > > > >confusing and frustrating.
> > > > > >
> > > > > >3. Optional Nature of the Feature: PAX, while valuable, is not a
> > > > > >core feature that every user will need. Letting users opt in to
> > > > > >building it makes the installation process simpler for the
> > > > > >general case.
> > > > > >
> > > > > >We can also update the build instructions and documentation to
> > > > > >clearly indicate how to enable PAX if needed (--enable-pax) and
> > > > > >how to fetch its required submodules.
> > > > > >
> > > > > >I’d love to hear your thoughts on this. If there’s a consensus,
> > > > > >it would be better to make this change before our release 2.0.
> > > > > >
> > > > > >Best,
> > > > > >Dianjin Wang
> > > > > >
> > > > > >---------------------------------------------------------------------
> > > > > >To unsubscribe, e-mail: dev-unsubscr...@cloudberry.apache.org
> > > > > >For additional commands, e-mail: dev-h...@cloudberry.apache.org