date:20220628

Re: Apache cTAKES GitHub mirror is stuck in 2019 [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS]

2022-06-28 Thread Richard Eckart de Castilho

Hi all,

> On 6. Jun 2022, at 16:09, Finan, Sean wrote:
> 
> Hi Kean,
> 
> Thank you for the suggestion and the link. I am really glad that people are 
> interested in this GitHub topic and taking it seriously. It would be great 
> if we could make it happen.
> 
> While definitely a possibility, the git LFS paradigm is something that I 
> would like to avoid. 
> 
> Like keeping our models on SVN, it would also require separating models from 
> code into two different repos, e.g. github and bitbucket. As opposed to 
> bitbucket, the apache svn repos are long established, familiar to and 
> supported by the apache infrastructure team. The same goes for the apache 
> foundation use of github. I like being able to lean on the apache infra team 
> for help.

So GitHub seems to have support for LFS [1]. What I do not know is whether the 
ASF's GitHub plan allows us to use it and, if so, whether there is a volume 
limit. We would have to ask INFRA about that.

The use of Git and GitHub is well supported by the INFRA team. For example, 
there is self-service for creating and managing repos. [2]

There is also the `.asf.yaml` mechanism for configuring GitHub repos and 
hooking them up with the ASF infrastructure, including mailing lists, website 
publishing, etc. [3]

> The apache Jenkins servers are linked to the svn repos, making continuous 
> integration easy - on the rare occasion when somebody does change something 
> in a model repo. While I expect anybody savvy enough to work on models to 
> also have the knowhow and wherewithal to work with a separate svn repo, I 
> don't want them to need to get out to jenkins and manually kick off snapshot 
> builds.

Jenkins also supports GitHub very well [4]. For example, in UIMA, we just drop 
a `Jenkinsfile` [5,6] configuration file into each repo; Jenkins picks it up 
and even gives us support for pull requests [7].
I'm happy to help you set that up for cTAKES as well.

> Probably most important is the requirement of the client user to have the LFS 
> command line client. I think that there are enough hoops stuck in front of 
> getting ctakes installed/checked out/cloned/etc. and it seems to me that one 
> of the biggest reasons to use github is to make things easier for absolute 
> newbies to just pull down code and experiment.

It is an additional hoop to jump through indeed, but it is a one-time action to 
install LFS. Chances are that people may even already have it set up because 
they use it in other repos.
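
As a minimal sketch of how cheaply that one-time hoop can be detected, a setup 
script could check whether the LFS client is already present before cloning. 
This is only an illustration and not part of cTAKES: the function names are 
made up, and it assumes the client is installed as the usual `git-lfs` 
executable on the PATH.

```python
import shutil
import subprocess


def git_lfs_available() -> bool:
    """Return True if a `git-lfs` executable can be found on the PATH."""
    return shutil.which("git-lfs") is not None


def git_lfs_version():
    """Return the text printed by `git lfs version`, or None if unavailable."""
    # Both git itself and the LFS extension must be present.
    if shutil.which("git") is None or not git_lfs_available():
        return None
    result = subprocess.run(
        ["git", "lfs", "version"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    version = git_lfs_version()
    if version is None:
        print("Git LFS not found; install it once, then run `git lfs install`.")
    else:
        print("Git LFS found:", version)
```

If the check fails, installing the client and running `git lfs install` once 
per user account is the extra step that newcomers would need.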

> Keeping the models on a separate svn repo would mean that they aren't checked 
> out as code, but would be put in the .m2 maven area when a user runs maven 
> compile. While the total footprint of full ctakes would still be the same 
> size, it would essentially make the code directory smaller and initial 
> downloads/checkouts would be faster. Plus, if done properly maybe it could 
> "clean up" all of those nearly identically named modules in my intellij 
> project window and I'd stop clicking on the wrong one when I've had too much 
> coffee.

Nowadays, I fear that people may not have svn installed anymore ;) So requiring 
svn to download models and drop them into m2 might be an inconvenience. If the 
models live in a Maven Repository and can be dragged in as a normal dependency, 
that would seem most convenient.
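
To illustrate why a plain Maven dependency is convenient: once resolved, an 
artifact lands at a predictable place in the local repository, with no svn 
client involved. The sketch below computes that location following the default 
Maven local repository layout. The coordinates used in the example are 
hypothetical and do not name a real cTAKES artifact.

```python
from pathlib import Path


def local_repo_path(group_id, artifact_id, version, extension="jar", repo_root=None):
    """Return where Maven stores an artifact in the local repository.

    Default layout: <root>/<groupId as directories>/<artifactId>/<version>/
    <artifactId>-<version>.<extension>
    """
    # Maven's default local repository is ~/.m2/repository.
    root = Path(repo_root) if repo_root is not None else Path.home() / ".m2" / "repository"
    # Dots in the groupId become directory separators.
    group_dirs = group_id.split(".")
    file_name = f"{artifact_id}-{version}.{extension}"
    return root.joinpath(*group_dirs, artifact_id, version, file_name)


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    print(local_repo_path("org.example", "example-models", "1.0.0"))
```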

Cheers,

-- Richard

[1] https://docs.github.com/en/repositories/working-with-files/managing-large-files/configuring-git-large-file-storage
[2] https://gitbox.apache.org
[3] https://s.apache.org/asfyaml
[4] https://builds.apache.org/job/UIMA/
[5] https://github.com/apache/uima-uimaj/blob/main/Jenkinsfile
[6] https://github.com/apache/uima-build-jenkins-shared-library 
[7] https://builds.apache.org/job/UIMA/job/uima-uimaj/view/change-requests/

Re: Apache cTAKES GitHub mirror is stuck in 2019 [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS]

2022-06-28 Thread Finan, Sean

Hi Richard,

Thank you for this information, any and all help that you can provide is 
greatly appreciated.

>The use of Git and GitHub is well supported by the INFRA team.
-- True. I actually contacted them a year or two ago and they already had 
mechanisms in place to easily migrate code and hook up CI.  That doesn't really 
worry me.

>Jenkins also supports GitHub very well [4]. For example, in UIMA, we just drop 
>a `Jenkinsfile` 
> I'm happy to help you setting that up for cTAKES as well.
-- Your assistance would be appreciated.  A bit ago, when Infra switched Jenkins 
platforms, we lost our configurations (which were kept there) and I had to 
create new setups on their current platform.  The wizard GUI is helpful ... to 
a point.  Anyway, an editable build configuration stored in our code repo would 
definitely be an improvement.

>I fear that people may not have svn installed anymore
-- Also very true, and a great reason to get our code into GitHub.

>So requiring svn to download models and drop them into m2 might be an 
>inconvenience.
-- I agree wholeheartedly, and my writing may have been imprecise but that was 
definitely not my intention.

>If the models live in a Maven Repository and can be dragged in as a normal 
>dependency, that would seem most convenient.
--  Yup.  A new model creator could deal with svn and the svn model repo, but 
the 99.% of developers who don't contribute models to ctakes wouldn't need 
to worry about this.

I hope that we don't let this slip.   It will require some effort with setup 
and test, and I fear that it may require reorganization of the code and 
resources such as I have proposed.  It definitely should not be a 
one-person-job ...  I also think that we need to have a ctakes 5.0 release 
before any of this is undertaken, which requires the usual planning, effort and 
cooperation.

Sean



