RE: Contribute to ctakes: it is in your best interests! RE: unknown dependencies [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS] [SUSPICIOUS] [SUSPICIOUS] [SUSPICIOUS]

Finan, Sean Tue, 21 Nov 2017 08:32:32 -0800

I just checked the files into trunk an hour ago, so you'll need to update.

ctakes-examples-res   /src/main/resources/  
org/apache/ctakes/examples/annotation/anafora_annotated


-----Original Message-----
From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] 
Sent: Tuesday, November 21, 2017 11:20 AM
To: dev@ctakes.apache.org
Subject: Re: Contribute to ctakes: it is in your best interests! RE: unknown 
dependencies [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS] [SUSPICIOUS] [SUSPICIOUS] 
[SUSPICIOUS]

Yeah, it's definitely hard to do it the most efficient way because of the 
sensitive nature of our source data. You can see roughly what the source data 
looks like in our ctakes-examples-res project
(/home/tmill/Projects/ctakes-git/ctakes-examples-res/src/main/resources/org/apache/ctakes/examples/annotation/anafora_annotated).
Each document has a directory with the plaintext document and an xml file 
indicating spans of entities and relations between entities. The xml files 
contain no identifying information, but the plaintext is required for feature 
extraction, and so we cannot rebuild models without it.
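
As a rough illustration of that layout, the sketch below walks a directory 
such as anafora_annotated and, for each document directory, separates the XML 
annotation files from the plaintext. It assumes (this is not stated above) 
that annotation files end in .xml and that anything else in the directory is 
the plaintext. pair_documents and missing_plaintext are made-up helper names, 
not part of cTAKES.

```python
from pathlib import Path
from typing import Dict, List, Tuple

# (plaintext files, xml annotation files) found in one document directory
DocFiles = Tuple[List[Path], List[Path]]


def pair_documents(root: Path) -> Dict[str, DocFiles]:
    """Map each document directory name to its plaintext and XML files.

    Assumption: annotation files have an .xml suffix; every other regular
    file in the directory is treated as plaintext.
    """
    pairs: Dict[str, DocFiles] = {}
    for doc_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        files = [f for f in doc_dir.iterdir() if f.is_file()]
        xml = sorted(f for f in files if f.suffix.lower() == ".xml")
        text = sorted(f for f in files if f.suffix.lower() != ".xml")
        pairs[doc_dir.name] = (text, xml)
    return pairs


def missing_plaintext(pairs: Dict[str, DocFiles]) -> List[str]:
    """Names of documents that have annotations but no plaintext.

    These are the documents for which features could not be extracted.
    """
    return sorted(name for name, (text, xml) in pairs.items() if xml and not text)
```

Calling missing_plaintext(pair_documents(Path("anafora_annotated"))) would 
list the documents whose annotations are present but whose text is not.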

However, another possibility, as Alex mentioned, is to have models be not in 
the git repo but be resources. We already intended something like that by 
having them in *-res modules, but if there are other ideas for structures that 
would keep models completely out of the repo (or in another repo that wouldn't 
be required), I would be happy to hear about them.

One final thing we (myself and others) need to be better about is that large 
models shouldn't be checked in until they are used for default modules, and 
shouldn't be used for default models unless they offer large performance 
benefits (in terms of accuracy). It might be worth a dev discussion when there 
is some indecision (for example, is a 1 GB model that offers a 2% improvement 
on relation extraction worth it?). Sometimes I've checked in things that run in 
experimental projects, where they may or may not make it into default models.
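
To make that discussion concrete, a quick size survey helps. The Python 
sketch below approximates the find/du commands Alex ran (quoted further 
down): it lists files at or above a size threshold while skipping .git and 
.svn directories. It reports apparent file size rather than disk usage, so 
the numbers will not match du exactly. large_files and SKIP_DIRS are 
illustrative names only.

```python
import os
from pathlib import Path
from typing import List, Tuple

# Version-control metadata directories that should not count as resources.
SKIP_DIRS = {".git", ".svn"}


def large_files(root: Path, min_bytes: int) -> List[Tuple[int, Path]]:
    """Return (size, path) for files of at least min_bytes, largest first."""
    found: List[Tuple[int, Path]] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                continue  # do not follow or count symlinks
            size = path.stat().st_size
            if size >= min_bytes:
                found.append((size, path))
    found.sort(key=lambda item: item[0], reverse=True)
    return found
```

For example, large_files(Path("."), 5 * 1024 * 1024) lists everything of 
5 MiB or more in a checkout, which is a reasonable starting point for 
deciding what belongs in the repo.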

Tim




On Tue, 2017-11-21 at 14:21 +0000, Finan, Sean wrote:
> Hi Alex,
> 
> > 
> > I know about the importance of these models.
> My apologies if I offended.
> 
> > 
> > I would like to know if there is a way also to generate them.
>  There is a little bit of documentation on models expertly written by 
> Tim.  Right now it is in a pamphlet that we distributed at a hackathon 
> a couple of years ago and the contents should definitely be copied 
> into the wiki.  I think that there is a jira for it, but I'm not 
> certain.
> On the main ctakes wiki page for 4.0
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0
> it is on the second line in the "Documentation" list.
> Again, it needs to be moved into the wiki - and updated if necessary.
> 
> > 
> > The same principle (I presume) applies.
> You need a bit of machine learning awareness and annotated data.
> 
> > 
> > If we are able to generate them, then we can version the source and 
> > the process to generate them and not the binaries themselves.
> Some of the models are created using 'proprietary' data that cannot be 
> distributed.
> Some of the models are created with data that is actually larger in 
> footprint than the models.
> 
> > 
> > What is the lifecycle of a model?
> It depends what you mean by lifecycle.  In terms of sdlc it is a very 
> long waterfall.  First, the aims are set.  This often (around us,
> anyway) involves brainstorming between a number of people on aims for 
> the model, like what types and attributes can and should be produced.  
> An appropriate source for data needs to be found, the data acquired 
> ... and getting a grant to cover the cost of doing it.  Then the data 
> needs to be annotated, then experts fiddle with the various features 
> and methods for a while, running a gazillion times to fine-tune.  For 
> example, I think that the temporal models have been under development 
> for over five years by several developers, and the training data was 
> annotated by another half dozen or so experts.  If new data is 
> acquired from another project the model is improved and updated.
> If you are asking about the lifetime of a model, that is highly 
> variable.  New data, new researchers, available time, interest and of 
> course the accuracy of an existing model all play a part.  A model may 
> go years without any changes, or it might be updated monthly or weekly 
> or even daily depending upon how a person is working and using vcs.
> 
> > 
> > Can it be integrated with other Deep Learning frameworks from ASF?
> Are you asking about other frameworks using ctakes models or ctakes 
> using other models?  I think that some of the models used by ctakes do 
> originally come from other sources.  Besides that, if those other 
> frameworks are willing to use libraries like cleartk then there 
> shouldn't be much of a problem.  There are currently some initiatives 
> trying to incorporate some deep learning frameworks.  If anybody out 
> there working on one is reading this then they can give you some 
> information.
> 
> > 
> > I also come from a background of Continuous Delivery,
> I appreciate that in every sense of the word!
> 
> I hope that this information helps.  The pamphlet section on models 
> that Tim wrote is the best starting point.  ML experts (which I am
> not) out there can contribute a lot more information, probably even a 
> correction or two.
> 
> Sean
> 
> -----Original Message-----
> From: Alexandru Zbarcea [mailto:zbarce...@gmail.com]
> Sent: Tuesday, November 21, 2017 8:35 AM
> To: Apache cTAKES Dev
> Subject: Re: Contribute to ctakes: it is in your best interests! RE:
> unknown dependencies [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS] [SUSPICIOUS]
> 
> Hi Sean,
> 
> I know about the importance of these models. Tim was also kind enough 
> to explain to me in a previous email on the mailing list about the 
> importance of them and about the fact that these models were created 
> by experts.
> 
> However, I'm not proposing to remove them, but to better document 
> their importance. Also, I would like to know if there is a way to 
> generate them. I appreciate the way pipeline aggregation was solved in 
> cTAKES, by creating a new DSL [1] (Piper) that is easy to read and 
> also brings a lot of automation and flexibility. The same principle (I 
> presume) applies here.
> If we are able to generate them, then we can version the source and 
> the process to generate them and not the binaries themselves.
> 
> If we can use the cTAKES CLIs to generate some of these models, and 
> simulate what the expert would do using the UI, we would have a 
> reproducible process that can also be perfected over time by other 
> experts.
> It is like the Lucene viewer vs. the Lucene Java API. I don't know how 
> feasible this is, though. Just my $0.02.
> 
> I'm looking to understand not only the cTAKES Java code, but how the 
> entire process works. One of the pieces missing for me is what 
> expertise you actually need and how context-dependent it is to 
> build these models. I also come from a background of Continuous 
> Delivery, so a few questions popped out: What is the lifecycle of a 
> model? Can it be integrated with other Deep Learning frameworks from 
> ASF?
> 
> What do you think?
> 
> Alex
> 
> [1] - https://en.wikipedia.org/wiki/Domain-specific_language
> 
> On Tue, Nov 21, 2017 at 7:30 AM, Finan, Sean <Sean.Finan@childrens.harvard.edu> wrote:
> 
> > 
> > Hi Alex,
> > 
> > The model.jar files are needed and cannot be removed.  You may have 
> > noticed that a lot of those hard-coded paths point to these 
> > model.jar files.
> > 
> > Sean
> > 
> > 
> > -----Original Message-----
> > From: Alexandru Zbarcea [mailto:al...@apache.org]
> > Sent: Monday, November 20, 2017 7:33 PM
> > To: Apache cTAKES Dev
> > Subject: Re: Contribute to ctakes: it is in your best interests!
> > RE:
> > unknown dependencies [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS] 
> > [SUSPICIOUS]
> > 
> > Thanks Tim,
> > 
> > I am in favor of moving to git too. If there is a desire from the 
> > community to move entirely over to git, I can work with Apache Infra 
> > to make the migration.
> > 
> > I wonder if we can reduce the repository size in this transition. 
> > Based on Apache rules, history is not allowed to be rewritten. 
> > Migrations like these are used, though, to clean up some of the big 
> > (space-consuming) resources (e.g. the "*.jar" models):
> > $ find . -name "*.jar" | xargs du -hsc
> > 2.3M    ./ctakes-temporal-res/src/main/resources/org/apache/ctakes/temporal/ae/eventevent/model.jar
> > 348K    ./ctakes-temporal-res/src/main/resources/org/apache/ctakes/temporal/ae/contextualmodality/model.jar
> > 4.0K    ./ctakes-temporal-res/src/main/resources/org/apache/ctakes/temporal/ae/salience/model.jar
> > 1.0M    ./ctakes-temporal-res/src/main/resources/org/apache/ctakes/temporal/ae/eventannotator/model.jar
> > 568K    ./ctakes-temporal-res/src/main/resources/org/apache/ctakes/temporal/ae/doctimerel/model.jar
> > 2.2M    ./ctakes-temporal-res/src/main/resources/org/apache/ctakes/temporal/ae/eventtime/model.jar
> > 1.3M    ./ctakes-temporal-res/src/main/resources/org/apache/ctakes/temporal/ae/timeannotator/model.jar
> > 7.8M    ./ctakes-pos-tagger-res/src/main/resources/org/apache/ctakes/postagger/models/clearnlp/mayo-en-pos-1.3.0.jar
> > 4.0K    ./ctakes-coreference-res/src/main/resources/org/apache/ctakes/coreference/models/mention-cluster/model.jar
> > 1.5M    ./ctakes-core-res/src/main/resources/org/apache/ctakes/core/sentdetect/model.jar
> > 504K    ./ctakes-assertion-res/src/main/resources/org/apache/ctakes/assertion/models/subject/model.jar
> > 588K    ./ctakes-assertion-res/src/main/resources/org/apache/ctakes/assertion/models/historyOf/model.jar
> > 332K    ./ctakes-assertion-res/src/main/resources/org/apache/ctakes/assertion/models/uncertainty/model.jar
> > 740K    ./ctakes-assertion-res/src/main/resources/org/apache/ctakes/assertion/models/conditional/model.jar
> > 592K    ./ctakes-assertion-res/src/main/resources/org/apache/ctakes/assertion/models/polarity/sharpi2b2mipacqnegex/model.jar
> > 572K    ./ctakes-assertion-res/src/main/resources/org/apache/ctakes/assertion/models/generic/model.jar
> > 1.5M    ./ctakes-assertion-res/resources/model/sharpi2b2mipacqnegex/polarity/model.jar
> > 312K    ./ctakes-dependency-parser-res/src/main/resources/org/apache/ctakes/dependency/parser/models/lemmatizer/dictionary-1.3.1.jar
> > 228M    ./ctakes-dependency-parser-res/src/main/resources/org/apache/ctakes/dependency/parser/models/clearparser_models.jar
> > 5.8M    ./ctakes-dependency-parser-res/src/main/resources/org/apache/ctakes/dependency/parser/models/srl/mayo-en-srl-1.3.0.jar
> > 452K    ./ctakes-dependency-parser-res/src/main/resources/org/apache/ctakes/dependency/parser/models/pred/mayo-en-pred-1.3.0.jar
> > 1.2M    ./ctakes-dependency-parser-res/src/main/resources/org/apache/ctakes/dependency/parser/models/role/mayo-en-role-1.3.0.jar
> > 25M     ./ctakes-dependency-parser-res/src/main/resources/org/apache/ctakes/dependency/parser/models/dependency/mayo-en-dep-1.3.0.jar
> > 688K    ./ctakes-relation-extractor-res/src/main/resources/org/apache/ctakes/relationextractor/models/location_of/model.jar
> > 488K    ./ctakes-relation-extractor-res/src/main/resources/org/apache/ctakes/relationextractor/models/degree_of/model.jar
> > 300K    ./ctakes-relation-extractor-res/src/main/resources/org/apache/ctakes/relationextractor/models/modifier_extractor/model.jar
> > 
> > 282M    total
> > 
> > or
> > 
> > $ find ./ -type f -size +5M | grep -v "\.jar" | grep -v "\.svn" | grep -v "\.git" | xargs du -hsc
> > 9.2M    ./ctakes-coreference-res/src/main/resources/org/apache/ctakes/coreference/models/index_med_5k/_3.prx
> > 20M     ./ctakes-coreference-res/src/main/resources/org/apache/ctakes/coreference/models/index_med_5k/_3.tvf
> > 6.9M    ./ctakes-coreference-res/src/main/resources/org/apache/ctakes/coreference/pref_probs.txt
> > 13M     ./ctakes-chunker-res/src/main/resources/org/apache/ctakes/chunker/models/chunker-model.zip
> > 6.4M    ./ctakes-constituency-parser-res/src/main/resources/org/apache/ctakes/constituency/parser/models/thyme.bin
> > 15M     ./ctakes-constituency-parser-res/src/main/resources/org/apache/ctakes/constituency/parser/models/sharpacq-3.1.bin
> > 12M     ./ctakes-constituency-parser-res/src/main/resources/org/apache/ctakes/constituency/parser/models/sharpacq-1.5.bin
> > 84M     ./resources/org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab/sno_rx_16ab.script
> > 11M     ./ctakes-assertion-res/src/main/resources/org/apache/ctakes/assertion/models/pos.model
> > 38M     ./ctakes-assertion-res/resources/model/sharpi2b2mipacqnegex/polarity/training-data.liblinear
> > 9.6M    ./ctakes-temporal/src/main/resources/org/apache/ctakes/temporal/thyme_word2vec_mapped_50.vec
> > 91M     ./ctakes-temporal/src/main/resources/org/apache/ctakes/temporal/gloveresult_3
> > 67M     ./ctakes-temporal/src/main/resources/org/apache/ctakes/temporal/mimic_vectors.txt
> > 
> > 378M    total
> > 
> > Are all these resources still relevant? Is there a way to generate 
> > them?
> > 
> > I do not wish to open Pandora's box, though.
> > 
> > Alex
> > 
> > 
> > On Mon, Nov 20, 2017 at 9:29 AM, Finan, Sean <Sean.Finan@childrens.harvard.edu> wrote:
> > 
> > > 
> > > Thanks Tim!
> > > 
> > > -----Original Message-----
> > > From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu]
> > > Sent: Monday, November 20, 2017 6:33 AM
> > > To: dev@ctakes.apache.org
> > > Subject: Re: Contribute to ctakes: it is in your best interests!
> > > RE:
> > > unknown dependencies [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS] 
> > > [SUSPICIOUS]
> > > 
> > > Git is available to apache projects, and many projects have moved 
> > > over (see here: https://git-wip-us.apache.org/repos/asf).
> > > Here is the general info on what that looks like:
> > > https://www.apache.org/dev/writable-git
> > > 
> > > A few points from that link:
> > > > 
> > > > Projects can request moving to Git as their main code 
> > > > repository, by creating an INFRA issue. See also the 
> > > > infra-contact page.
> > > > 
> > > > Projects can request new, blank repositories by using 
> > > > reporeq.apache.org.
> > > > 
> > > > The current system has basic git support only. We are working on 
> > > > extending this service in the near future.
> > > > 
> > > > Custom commit or other hooks will not be supported, all projects 
> > > > get the same hooks. Setting up gitpubsub should provide sufficient 
> > > > flexibility without impacting the core Git setup, volunteers are 
> > > > welcome to make that happen.
> > > 
> > > (Not sure what basic support only means.)
> > > 
> > > There are also read-only git repos available by default for every 
> > > project and updated in near-real-time:
> > > https://www.apache.org/dev/git.html
> > > 
> > > With those, I guess the suggested workflow is to work off of that 
> > > repo and then just submit patches to someone who commits with svn, 
> > > rather than committing directly.
> > > 
> > > I've been using the git-svn connector myself recently since I just 
> > > vastly prefer the git lightweight branching for focused 
> > > development, as it helps me keep a cleaner working directory. But 
> > > that adds some additional annoying steps.
> > > 
> > > Tim
> > > 
> > > ________________________________________
> > > From: Finan, Sean <sean.fi...@childrens.harvard.edu>
> > > Sent: Saturday, November 18, 2017 1:23 PM
> > > To: dev@ctakes.apache.org
> > > Subject: RE: Contribute to ctakes: it is in your best interests!
> > > RE:
> > > unknown dependencies [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS]
> > > 
> > > Hi Dave,
> > > 
> > > Those are some great thoughts.  Being an apache project I am not 
> > > sure how far we can move from svn, but there may be a way.  You 
> > > are not the first to voice this desire for an active github repo 
> > > and I'm sure that you won't be the last.
> > > 
> > > I completely agree with your discussion board preference.  Do you 
> > > have any recommendations?
> > > 
> > > You make a great point regarding documentation.  In reference to 
> > > things that anybody can quickly contribute ... that would be a big 
> > > one.
> > > Volunteers?!?
> > > 
> > > I am really happy to hear that you want to contribute - more than 
> > > you already have, which is actually quite a bit!
> > > 
> > > Cheers,
> > > Sean
> > > 
> > > -----Original Message-----
> > > From: David Kincaid [mailto:kincaid.d...@gmail.com]
> > > Sent: Saturday, November 18, 2017 1:10 PM
> > > To: dev@ctakes.apache.org
> > > Subject: Re: Contribute to ctakes: it is in your best interests!
> > > RE:
> > > unknown dependencies [EXTERNAL] [SUSPICIOUS]
> > > 
> > > Sean, I can share a couple things that have been an obstacle for 
> > > me.
> > > It may seem a minor point to some, but I left Subversion behind 
> > > years ago and really have no desire to go back. If the project 
> > > were moved over to Git/Github it would really smooth the way for 
> > > me at least. I would be happy to help out with this. One of the 
> > > other things I would really like to see is the mailing list moved 
> > > onto a discussion board platform. It seems to me that a discussion 
> > > board style of tool tends to create a more active community than a 
> > > mailing list does.
> > > 
> > > The other thing that might help get new people involved is making 
> > > it easier to find information about the development environment. 
> > > Things like branching strategies, coding conventions, etc. are really 
> > > hard to find from the main cTAKES web site. I saw some references to 
> > > Jenkins builds recently on the list. I had no idea there was a 
> > > Jenkins CI server for the project somewhere. It also takes some 
> > > digging to find a link to Jira. Maybe we could create a Wiki page 
> > > that describes where all these tools are and how they are used.
> > > 
> > > You guys have really done some great work over the last couple of 
> > > years cleaning up the code base and improving the documentation by 
> > > a ton. Things like the fast dictionary annotator and the dictionary 
> > > creator GUI are great additions and make it a lot easier for 
> > > other people to get up and running more quickly. As I'm ramping up 
> > > my research as well as some proof of concept stuff at work I'll be 
> > > working more and more with cTAKES and would love to contribute 
> > > more to the project.
> > > 
> > > Just my thoughts.
> > > 
> > > - Dave
> > > 
> > > 
> > > On Sat, Nov 18, 2017 at 11:10 AM, Finan, Sean < 
> > > sean.fi...@childrens.harvard.edu> wrote:
> > > 
> > > > 
> > > > Hi Tim, Alex,
> > > > 
> > > > Great ideas.  I like your (Tim) idea to
> > > > 1. start with commented code removal.
> > > > Then maybe move on to
> > > > 2. sanity-test type unit tests - Little two or three-line "does 
> > > > this method crack" tests.
> > > > And another that is simply
> > > > 3. "populate a test cas with type(s) X" and a factory with 
> > > > "getSectionTestCas" "getSentenceTestCas" "getPosTestCas" 
> > > > "getChunkTestCas" ...  just really simple reusables for tests.
> > > > Then
> > > > 4. refactor to extract and consolidate duplicate code - it is 
> > > > all over the place ...
> > > > 
> > > > These are just my initial thoughts and suggestions, but I think 
> > > > that those 4 tasks can be performed by anybody of any experience 
> > > > level.  They build upon each other and should help the 
> > > > implementers better understand ctakes.
> > > > 
> > > > After that the sky is the limit.
> > > > 
> > > > A couple of years ago I sat on a panel at a workshop for open 
> > > > source scientific software.  For the half dozen or so 
> > > > highlighted projects (ctakes was one!) the common thread was 
> > > > that getting people to contribute is extremely difficult.
> > > > I have a tendency to assume that people always act in their best 
> > > > interests.  Any student thinking of going towards industry 
> > > > should be jumping at the opportunity to contribute to a large, 
> > > > production-quality project.  They should also realize that 
> > > > contribution means potential recommendation (and possibly hiring 
> > > > interest) by established developers, physicians and researchers 
> > > > that use ctakes.  Even just answering questions on a user or dev 
> > > > list creates credibility and can build a network.
> > > > 
> > > > Active researchers could discover common thoughts and directions 
> > > > that could lead to collaboration outside ctakes.  Researchers 
> > > > and companies trying to build upon open source should realize 
> > > > that direct contribution is easier than custom substitution.  
> > > > Plus, it is in their best interests that code does what they 
> > > > need it to do in the fastest, lightest, most stable way 
> > > > possible.
> > > > With a project like ctakes there are a lot of things that can be 
> > > > done, and there are great opportunities to really shine.  "I wrote 
> > > > this tool for my thesis that performs some nlp task" sounds 
> > > > good.  Appending "in an Apache product and it has been taken up by 
> > > > thousands across the globe" makes it sound a lot better.
> > > > At my previous job in industry the company actively contributed 
> > > > to several open source projects.  We had a few people for whom 
> > > > that was 50% of their job.  Why?  Because we made a commitment 
> > > > to use that open source software.
> > > > 
> > > > It was a better use of our resources to contribute to it, 
> > > > improve it and keep its momentum going and prevent it from 
> > > > becoming stale (or abandoned) while our software continued to 
> > > > move forward.
> > > > 
> > > > Hmm, that was a touch more than I had planned to write.  A whole 
> > > > cup of coffee in that one.
> > > > 
> > > > Sean
> > > > 
> > > > 
> > > > 
> > > > 
> > > > -----Original Message-----
> > > > From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu]
> > > > Sent: Saturday, November 18, 2017 8:13 AM
> > > > To: dev@ctakes.apache.org
> > > > Subject: Re: unknown dependencies [EXTERNAL] [SUSPICIOUS]
> > > > 
> > > > Thanks Alex, looks like that was probably a fat-fingered 
> > > > auto-import on my part.
> > > > 
> > > > I like your idea, and I don't know the best way to start 
> > > > either, but maybe one suggestion is to start with one or two 
> > > > focused things to clean up, and then ask for volunteers to take 
> > > > on specific modules?
> > > > Then people can contribute an hour here and there to do cleanup 
> > > > on their task/module and try to fix that thing in a 1-2-month 
> > > > long sprint. I am happy to contribute to cleanup, I am 
> > > > responsible for my fair share of unclean code, but since I don't 
> > > > have strong software engineering chops it would be good to have 
> > > > people with that background propose the tasks and describe 
> > > > exactly what needs to be done. My idea of cleaning is just to 
> > > > delete commented out sections of evaluation code.
> > > > 
> > > > 
> > > > Tim
> > > > 
> > > > ________________________________________
> > > > From: Alexandru Zbarcea <al...@apache.org>
> > > > Sent: Friday, November 17, 2017 4:46 PM
> > > > To: Apache cTAKES Dev
> > > > Subject: unknown dependencies [EXTERNAL]
> > > > 
> > > > Hi,
> > > > 
> > > > I noticed that a mistaken dependency has slipped into the code:
> > > > jdk.internal.org.objectweb.asm.commons.AnalyzerAdapter;
> > > > 
> > > > Now that the Jenkins build is successful, I think it is easier 
> > > > to clean up the code. I would like it to be a common effort. I 
> > > > don't know the best way to approach this.
> > > > 
> > > > Looking forward to your advice,
> > > > Alex
> > > >
