Hi Everyone,

We've previously discussed [1] merging the grails-data-mapping repo into
grails-core.  Other than grails-geb, this is the last remaining repository
to merge into core to have a mono repo. This email attempts to summarize
some of the previously expressed concerns and advocates for merging
grails-data-mapping & grails-geb sooner rather than later.  Let's gather
people's thoughts so we can determine if a vote thread is feasible.

Some of the recent concerns that have been raised on merging data mapping
are:
* slower build times (both locally & in GitHub actions)
* the requirement for mongodb

To answer those concerns:
------------------------------------------------
On slower build times:
------------------------------------------------
The current grails-core should not be viewed as slow.  Across all of the
recent mergers (cache, views, gradle plugins, docs), I've spent a lot of
time optimizing the build.  These optimizations include:
A. Converting a substantial amount of our build files to lazy
initialization instead of eager initialization & updating the grails gradle
plugins to make use of lazy where possible.  This means we only spend a
total of 8.8 seconds in configuration in a project being built from
scratch.
B. Updating parts of the build & parts of the grails gradle plugins to be
cacheable by defining inputs/outputs.  This means there's a higher chance
that if a project dependency doesn't change, it won't rebuild now.
C. Decoupling the gradle plugin dependencies so that there are not circular
references & that its dependencies can be managed separately from
application dependencies (we produce a grails-bom for applications &
grails-gradle-bom for gradle usage now).
D. Eliminating unnecessary steps or processes in the build (i.e.
streamlining the docs workflow)
E. Parallelizing the build where possible (there are known issues that
prevent us from being fully parallel, but we're very close to almost all
projects being able to be run in parallel).
F. Fixing our dependency graphs so that we generate proper platform POMs
and proper gradle modules so dependencies can be calculated correctly (and
quickly).
G. Adding properties to configure which tests run, on both an opt-in and
an opt-out basis.  Setting a system property on your build selects the
tests to run, which allows more focused development when needed.
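
As a rough illustration of E. and G. together, here is a minimal sketch of
how such a build might be invoked.  `--parallel` and `--build-cache` are
standard Gradle flags and `-D` sets a system property; the property name
`skipSlowTests` is a hypothetical stand-in, since this email does not name
the actual opt-in / opt-out properties.

```shell
#!/bin/sh
# Sketch only: assembles a Gradle command line; it does not run the build.
# The -D property name below is hypothetical, not the project's real switch.

gradle_args() {
    # $1: "yes" to opt out of a (hypothetical) slow test group
    args="build --parallel --build-cache"
    if [ "$1" = "yes" ]; then
        args="$args -DskipSlowTests=true"   # hypothetical opt-out property
    fi
    printf '%s\n' "$args"
}

# Print the command that would be run
echo "./gradlew $(gradle_args yes)"
```

Printing the command instead of running it keeps the sketch safe to try
outside a checkout; the real property names would come from the build
itself.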

The build for the grails-core library is now approximately 3 minutes if
building from scratch on the most recent Mac hardware (assuming the
libraries are already present locally).  The build also peaks at 1.5 GB of
memory usage.  It is also highly cacheable - only 30% of the tasks have
remaining cache issues (namely ones related to gsp, gson, and asset
compilation).  I believe long term we can get the typical build time down
even lower by improving these processes to be cacheable by gradle, by
further decoupling our build, and by further parallelization.  After
merging grails-data-mapping, it should be possible to keep these build
times down.

Concerning the build times in GitHub, the main slowness is caused by how we
now matrix test on windows, mac, and linux across different versions of the
JVM.  We also get throttled more when we have to do this across every
repository.  Having one repository will mean there's less of a chance of
being throttled.  Moreover, we can pursue self hosted build agents to solve
this in the long term.  In the short term, if it really becomes an issue,
we can enable gradle caching, which will result in very little code having
to run due to the aforementioned improvements.

------------------------------------------------
On the requirement for mongodb
------------------------------------------------
The requirement for a running mongodb causes several issues:
A. It forces the build to be synchronous for testing mongo related projects
(plugin, mongo core, mongo ext, mongo bson, mongo templates, tck for mongo,
etc)
B. It requires the user to have a running mongo instance.

We know that B. is fixable by running a mongodb container.  Moreover, if
one could be run for each project, then that also solves the synchronous
execution.  Running a docker container for B is trivial and
grails development already requires a container runtime to run its
functional tests with geb.  The command for this is:
`docker run -d --name mongo-on-docker -p 27017:27017 mongo`

A. will then be fixed if we can spin up a container over the lifecycle of a
given project's tests.  For example, we could use Testcontainers prior to
the GrailsApp.run() call in Application.groovy to ensure one exists per
application.  There will need to be some configuration rework, but it
shouldn't be too hard to accomplish longer term.
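
To make the "one container per project" idea concrete, here is a minimal
sketch that uses plain docker with a distinct host port per project rather
than Testcontainers.  The project names, the `-db` container suffix, and
the base port are illustrative assumptions, not the actual configuration.

```shell
#!/bin/sh
# Sketch only: one MongoDB container per project, each on its own host port.
# Project names, the "-db" suffix, and BASE_PORT are illustrative assumptions.

BASE_PORT=27017

# Print (rather than run) the docker command for a project.
# $1: project name, $2: 0-based index used to pick a unique host port
mongo_run_cmd() {
    project="$1"
    index="$2"
    port=$((BASE_PORT + index))
    printf 'docker run -d --name %s-db -p %s:27017 mongo\n' "$project" "$port"
}

i=0
for project in mongo-core mongo-ext mongo-bson; do
    mongo_run_cmd "$project" "$i"
    i=$((i + 1))
done
```

Each project's tests would then need to connect to its own port, which is
the configuration rework mentioned above.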



------------------------------------------------
As for why we should merge these libraries:
------------------------------------------------
1. While working on the merges of the previous builds, I have discovered
numerous validations that gradle performs (circular dependencies, etc) that
were not being performed while these builds were separate.  Combined,
gradle warns us about circular dependencies and we benefit from this
feedback.

2. Somewhat related, we can institute code standards, code styles, and code
quality scans in a centralized manner.

3. Separate from Gradle's validations, we can implement our own gradle
plugins local to the grails-core repo that will enforce architecture
separation - this includes ensuring that gsp & gorm can be used separately
from Grails in the long run.

4. If a gradle project that is a dependency of grails-core is partially
published, it will break functional tests in all repositories.  This means
you have to comment the tests out across repositories until all artifacts
are published again. Inside of the same project, this issue does not exist.

5. The iteration time on development is vastly improved in a single
repository.  The optimizations I made to the gradle plugin took seconds to
test and I would not have been able to make them in separate projects in
less than a day.  The feedback loop is a significant time saver.  What
previously took 20-30 minutes due to build publishing, I was doing in
seconds.  The gradle plugin changes would likely have taken over a week (or
longer) if these plugins were still in a separate repository.

6. The known issues with grails-data-mapping are not major blockers.  While
they may initially slow the build times, we can address the majority of the
time by solving the mongo problem.  We have an initial approach that works;
we'll just need to adjust the configuration in the mongo projects to
connect to different containers.

7. Having a mono repo ensures that any change to Grails will be tested
fully.  Several of us have spent a significant amount of time chasing down
bugs that we later discovered were due to someone only running the tests
in the project they changed.  If someone changes code related to the core
of grails, they must run all of the associated tests.  In a mono project,
this happens locally during development. Outside of it, it happens by users
discovering the bug in a milestone.

8. Spending time on the build process - the GitHub Actions release
workflow, etc - is a significant problem.  We don't want to be working on a
build process.  We want to be developing code and fixing bugs for grails.

9. Spring Boot & Hibernate will have major upgrades on a more regular basis
going forward.  To make the changes necessary, we don't want to be working
on separate processes.  We need to adopt a more rapid release schedule and
react to library upgrades faster so that we don't end up in the situation
we have been in for Grails 7.  The reason Grails 7 development has taken so
incredibly long is that we're updating some libraries that are over 4 years
old.  The technical debt can be prevented by updating more often and
staying up to date with upstream libraries - which also ensures the
security of the framework.

10. Apache's release process requires a security review.  This security
review ensures that builds are not being tampered with and our current
release process across many repos requires build tampering.  We will
eventually stop modifying a build, but to be able to release a milestone
sooner with the Apache coordinates, we need to be in one repository.

11. We no longer have admin rights to the GitHub organization we are under.
One of the discoveries we made after moving was that we can't trigger
workflow actions from one repo to another.  Being in one repository means
we don't have to do that and infrastructure does not need to find
workarounds for our existing processes.

12. Contributing to grails will be made easier for newer contributors.
They won't have to learn to build projects in certain orders or how to work
around issues when something fails.


I'm sure I've forgotten several of the reasons, but I'd like to propose we
go to a mono repo for the core libraries that make up a grails release.
Data mapping & geb are the only ones that remain to have this.  I'd like to
fast track this and deal with the mongo / slowness after merge.  I believe
we can resolve these issues and by merging sooner we can get to releasing
the first milestone under Apache.

-James

[1] https://lists.apache.org/thread/sfzzzbb1zo6k4w8hz0ro13wx4n4jyhr6
