Hi Richard,
Thank you for this information. Any and all help that you can provide is
greatly appreciated.
>The use of Git and GitHub is well supported by the INFRA team.
-- True. I actually contacted them a year or two ago and they already had
mechanisms in place to easily migrate code and hook up CI. That doesn't really
worry me.
>Jenkins also supports GitHub very well [4]. For example, in UIMA, we just drop
>a `Jenkinsfile`
> I'm happy to help you setting that up for cTAKES as well.
-- Your assistance would be appreciated. A while ago, when Infra switched
Jenkins platforms, we lost our configurations (which were kept there) and I had
to create new setups on their current platform. The wizard GUI is helpful ... to
a point.
Anyway, an editable build configuration stored in our code repo would
definitely be an improvement.
>I fear that people may not have svn installed anymore
-- Also very true, and a great reason to get our code into GitHub.
>So requiring svn to download models and drop them into m2 might be an
>inconvenience.
-- I agree wholeheartedly. My writing may have been imprecise, but that was
definitely not my intention.
>If the models live in a Maven Repository and can be dragged in as a normal
>dependency, that would seem most convenient.
-- Yup. A new model creator could deal with svn and the svn model repo, but
the 99% of developers who don't contribute models to ctakes wouldn't need
to worry about this.
I hope that we don't let this slip. It will require some effort with setup
and testing, and I fear that it may require reorganization of the code and
resources such as I have proposed. It definitely should not be a
one-person-job ... I also think that we need to have a ctakes 5.0 release
before any of this is undertaken, which requires the usual planning, effort and
cooperation.
Sean
From: Richard Eckart de Castilho
Sent: Tuesday, June 28, 2022 6:54 AM
To: dev@ctakes.apache.org
Subject: Re: Apache cTAKES GitHub mirror is stuck in 2019 [EXTERNAL]
Hi all,
> On 6. Jun 2022, at 16:09, Finan, Sean wrote:
>
> Hi Kean,
>
> Thank you for the suggestion and the link. I am really glad that people are
> interested in this GitHub topic and taking it seriously. It would be great
> if we could make it happen.
>
> While definitely a possibility, the git LFS paradigm is something that I
> would like to avoid.
>
> Like keeping our models on SVN, it would also require separating models from
> code into two different repos, e.g. github and bitbucket. As opposed to
> bitbucket, the apache svn repos are long established, familiar to and
> supported by the apache infrastructure team. The same goes for the apache
> foundation use of github. I like being able to lean on the apache infra team
> for help.
So GitHub seems to have support for LFS [1]. What I do not know is whether the
ASF's GitHub plan allows us to use this and, if so, whether there is a volume
limit. We would have to ask INFRA about that.
The use of Git and GitHub is well supported by the INFRA team. For example,
there is self-service for creating and managing repos. [2]
There is also the `.asf.yaml` mechanism for configuring GitHub repos and
hooking them up with the ASF infrastructure including mailing lists, website
publishing, etc. etc. [3]
> The apache Jenkins servers are linked to the svn repos, making continuous
> integration easy - on the rare occasion when somebody does change something
> in a model repo. While I expect anybody savvy enough to work on models to
> also have the knowhow and wherewithal to work with a separate svn repo, I
> don't want them to need to get out to jenkins and manually kick off snapshot
> builds.
Jenkins also supports GitHub very well [4]. For example, in UIMA, we just drop
a `Jenkinsfile` [5,6] configuration file into each repo and Jenkins picks them
up and even gives us support for pull requests [7].
I'm happy to help you set that up for cTAKES as well.
> Probably most important is the requirement of the client user to have the LFS
> command line client. I think that there are enough hoops stuck in front of
> getting ctakes installed/checked out/cloned/etc. and it seems to me that one
> of the biggest reasons to use github is to make things easier for absolute
> newbies to just pull down code and experiment.
It is an additional hoop to jump through indeed, but it is a one-time action to
install LFS. Chances are that people may even already have it set up because
they use it in other repos.
> Keeping the models on a separate svn repo would mean that they aren't checked
> out as code, but would be put in the .m2 maven area when a user runs maven
> compile. While the total footprint of full ctakes would still be the same
> size, it would essentially make the code directory smaller and initial
> downloads/checkouts would be faster. Plus, if done properly maybe it coul