Wow, I am really glad that these vulnerability updates have grabbed so much 
attention from the community!

Attempting to address things in order:

Ryan,
please share your report on vulnerable items!  We are using a couple of tools 
but they are definitely not uncovering so much information.  I want ctakes 
6.0.0 to be as hardened as possible, so having this in hand couldn't hurt - as 
daunting as it may be.  I like the fact that it has a separate column for 
ctakes as-a-dependency issues.

> If I shared this report, is there some concerted effort I or we together 
> could help to address these?
There are several ways to do this, but I have no idea which would be best for 
everybody.  I think that if you have a fork of ctakes you can share a pull 
request with the main ctakes repo.  I haven't done it myself, so I can't attest 
to the facility of such a workflow.  You could also create and share a gist 
containing proposed changes.  The easiest thing to do might be to simply open a 
topic on the ctakes 'issues' page and attach files or paste segments of code.  
These are just options - there might be better ways to do this.

In addition, people can use this dev@ thread to coordinate work on fixes.  If 
somebody would like to tackle something, let everybody else know!

Peter,
After yesterday's update, everything should now be using log4j2 except for 
ctakes-pos-tagger.  The ctakes code in there is using log4j2, but clearnlp 
still uses log4j.  Since I can't change that code I am using the log4j-1.2-api 
bridge.  I exclude log4j 1 from being pulled down by clearnlp, and the log4j 
bridge facade forwards the old calls to log4j 2.  It is very possible that I 
missed another introduction of log4j 1 further into the dependency tree.  
Anyway, I haven't yet removed the log4j.properties files, but I will.  I 
believe that log4j2 requires them to be renamed anyway.  Please, if you have a 
.properties file that works well then provide a copy and I'll see if I can get 
it working with the ctakes stuff.  I hadn't thought of log4j scanning the 
entire classpath for .properties and picking up something in a dependency jar 
(e.g. Artemis).  For that matter, we should stop putting ours in the core jar 
and move it into user/resources/ so that it ends up only in the resources/ 
directory, and can then be overwritten, erased or modified rather than having a 
user set the log4j.configuration parameter.  It also makes ctakes .properties 
more transparent.  I wonder why "the log4j team themselves say that bundling 
property files inside distributed jars is good practice"?  Did they just mean 
that it was better than not supplying any and having log4j rely upon defaults 
that may change with new log4j versions?  I digress.
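
For anyone who wants to check which jars contribute a logging configuration, 
here is a minimal sketch that uses only the JDK.  It lists every classpath 
location that supplies one of the usual log4j 1 / log4j 2 configuration file 
names.  The class name LogConfigScanner and the list of file names are my own 
assumptions for illustration; nothing like this exists in ctakes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of ctakes: reports where logging
// configuration files are found on the current classpath.
final class LogConfigScanner {

    // Returns every classpath location (jar or directory) that supplies
    // a resource with the given name.  An empty list means none was found.
    static List<URL> findAll(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            return Collections.list(loader.getResources(resourceName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed set of file names worth checking; extend as needed.
        String[] names = {
            "log4j.properties", "log4j.xml",
            "log4j2.properties", "log4j2.xml"
        };
        for (String name : names) {
            for (URL url : findAll(name)) {
                System.out.println(name + " -> " + url);
            }
        }
    }
}
```

Run it with the same classpath as the pipeline.  Each printed URL names the 
jar or directory that holds a configuration file, so more than one line for 
the same file name means several sources are competing.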


Sean

________________________________
From: Peter Abramowitsch <pabramowit...@gmail.com>
Sent: Friday, July 26, 2024 11:42 AM
To: dev@ctakes.apache.org <dev@ctakes.apache.org>
Subject: Re: SLF4J instead of Log4J at the API level? [EXTERNAL]


And again on this same topic - log4j, I noticed a new inability to control
the logging level of our high-volume ctakes installation and discovered
that there are several jars (two 3rd-party new in 5.1.0) that contain a
log4j.properties with root logging set to INFO and a console appender on
root.  This is also true of ctakes-core.  Log4j takes log4j.properties as
its highest priority source of configuration and funneling 1.2.17 through
either the reload jar or the log4j2 adaptor and slf4j does not alter this
for the log4j v1 calls throughout ctakes.

The net result for us is that millions of log lines are going through
syslogd ending up in /var/log/messages and slowing down the system.
Rebuilding ctakes-core without the built-in log4j.properties helps, but the
activemq and artemis jars used by PBJ also contain them.  It is simple
enough to not include those jars if not running PBJ.

The log4j team themselves say that bundling property files inside
distributed jars is good practice and it took a while to track this down.
Should we officially remove them from the ctakes-core build?

Peter

On Fri, Jul 26, 2024 at 8:20 AM Petersam, John Contractor
<john.peter...@ssa.gov.invalid> wrote:

> Hi Ryan,
> Could you please share the report?  My local build has most of the
> dependencies running at their latest versions, but my maven report doesn't
> include dependencies of dependencies so it's hard to get them all.
>
> Thanks,
> John
>
> -----Original Message-----
> From: Ryan Swenson <rswen...@nsightlabs.com>
> Sent: Friday, July 26, 2024 11:04 AM
> To: dev@ctakes.apache.org
> Subject: RE: SLF4J instead of Log4J at the API level? [EXTERNAL]
>
> Hi Sean & team,
>
> On this very topic, and related: Sean, thanks for your timely response
> to my previous inquiry about Java 17.  Everything built perfectly with
> 6.0.0, including our own module added in alongside the other 6.0.0
> modules.  The code built fine, but I inherited the entire project from a
> former developer and now (lol) have the task of figuring out how to run our
> pipeline with the correct classpath, given several filesystem-located
> dependencies, properties, and scattered local vs. maven-built libs.  After
> building successfully, and assuming it will run fine, Java 17 is a go, and
> is currently permitted by our InfoSec.
>
> My organization then did an OWASP Security Scan of the 6.0.0 branch, with
> our 1 added module, inside our own Git repo, and it reported 1417
> vulnerabilities.  I will be happy to share both the JSON files of this
> report and a Python script (which you can modify and/or use) to review them.
>
> Essentially, there were 41 unique packages which have a host of security
> vulnerabilities.  Module-wise, if I remove the Smoking Status, User
> Resources, GUI, Tiny REST, and FHIR modules, I end up with 1190
> vulnerabilities (227 fewer than 1417).
>
> There were 13 libraries ("packages") with 732 (135 Critical, 250 High, 0
> Low) vulnerabilities that I deemed a lower level of effort, because
> their higher library versions provide backward compatibility and they are
> able to run with Java 6 or later, or won't have any issues running with Java
> 17. For these, I will simply specify their later versions in the pom, and
> re-build. There were another 9 libraries which I labeled medium which have
> 77 (46 Critical, 20 High, 10 Medium, 1 Low), due to likely having some
> potential breaking changes, which will require code changes, testing, &
> regressions.  Finally, there were 19 libraries with 381 vulnerabilities (10
> Critical, 178 High, 157 Medium, 36 Low) where either there was no higher
> version, requiring an alternative library and requiring code changes, or
> there were higher versions which offer no backward compatibility with
> breaking changes.
>
> - Examples: guava 10 to 32.  If any @Beta APIs were used, and/or methods
>   that were used are overloaded in the later v32, we will have our work cut
>   out for us in refactoring.
> - dom4j 1.6.1 is EOL, thus JDOM, JAXB, or StAX should be considered, but
>   these now require refactoring.
> - log4j 1.2.17 is EOL.  Log4J2 or SLF4J should be considered, requiring
>   refactoring.
>
> However, it's important to point out that the security report does include
> a column reflecting which module/pom each package / vulnerability is being
> reported so that 1) I can assess if this is with our custom code, or 2)
> with cTakes distro, and 3) with my knowledge of our code, what of #2 our
> module has co-dependence on - this will likely lead to some discovery of
> where we rely on less than actually what we build with, to further reduce
> effort, but there will still be the fact that there are issues which were
> reported under #2.
>
> If I shared this report, is there some concerted effort I or we together
> could help to address these?  At present, we have a raised exception which
> we have extended to now, and likely will have some leniency due to where I
> can in the interim perform the Java 8 to Java 17 upgrade, address the 732
> vulnerabilities with low LOE - updating poms with higher versions with
> minimal risk of breaking changes, and possibly address some of the mediums,
> and now only have the subset of vulnerabilities left - 381.
>
> In the meantime, I am copying runnable jars & working scripts setting the
> classpaths from other envs, to our dev env to get runnable code, and triage
> the differences between our envs, and then I will be in a place where I can
> start committing changes to our git, and test with build updates.
>
> Thanks,
> Ryan
>
> -----Original Message-----
> From: Finan, Sean <sean.fi...@childrens.harvard.edu.INVALID>
> Sent: Friday, July 26, 2024 9:35 AM
> To: dev@ctakes.apache.org
> Subject: Re: SLF4J instead of Log4J at the API level? [EXTERNAL]
>
> Hi Richard,
>
> I have thought of that, and almost did as much.  I think that the main pom
> still has a bunch of slf4j dependencies commented out from my time in the
> sandbox.  There are only some very minor reasons that I didn't, one being
> the ease of the transition.  I have nothing against slf4j, and using an
> abstraction layer does make a lot of sense.  Would you be willing to do
> some refactoring?  I have to move over and work on some other items today -
> mainly ctakes on apache beam (plus spark, flink ...).   Something for the
> next 'new features' release.
>
> Sean
>
> ________________________________
> From: Richard Eckart de Castilho <r...@apache.org>
> Sent: Thursday, July 25, 2024 8:40 PM
> To: dev@ctakes.apache.org <dev@ctakes.apache.org>
> Subject: SLF4J instead of Log4J at the API level? [EXTERNAL]
>
>
> Hi Sean,
>
> > On 25. Jul 2024, at 16:15, seanfi...@apache.org wrote:
> >
> > The following commit(s) were added to refs/heads/main by this push:
> >     new 1556d13  Replaced all logging with log4j2 , including java and
> commons-logging Removed or pushed back a few dependencies.
>
> If I saw it correctly, you are making direct calls to the log4j2 API.
> Have you considered using SLF4J 2.x as your API and log4j2 as the logging
> backend instead? It would facilitate embedding cTAKES in contexts that use
> a different logging backend than log4j2.
>
> Cheers,
>
> -- Richard
>
>
