Roman,

Black Duck Software certainly have a useful platform, though it would be useful to know which of their products they are considering using for the POC.
Personally, I've used their Protex software, and I can state from experience that working through IP clearance with it is a time-consuming and thankless process. I have done this several times over the past couple of years with pieces of code developed at my employer and then open sourced. I would certainly recommend trying a POC, but I'm not sure it is necessarily something you'd want to impose on all incoming projects in the long term.

Some info on Protex:

My main concern is that Protex, while very useful, is somewhat dumb, primarily due to the quality of its knowledge base. For those who aren't aware, the tool essentially scans the code looking for files that have "signatures" matching other open source or proprietary code in the knowledge base. The open source code is scraped from all sorts of public sites such as SourceForge, GitHub, BitBucket etc. For each match that occurs, someone has to review it and then indicate whether to exclude the match (i.e. it was a false positive) or to accept the match and attribute it appropriately.

This is great in principle because it easily spots obvious plagiarism when it occurs. The problem from my point of view is that the false positive rate is very high, and you then have to go through all the matches and manually state whether they are valid or invalid. This ends up being very time consuming because for each match on your code you have to review all the possible matches to see if there actually is a genuine match and, if not, go through a process of telling the tool.

This is where the knowledge base starts to hurt you. There are lots of projects out there which check in everything, including things like auto-generated IDE project files, build tool reports, VCS ignore files etc., which tend to have very high similarity and get flagged as false positives constantly. Ideally Apache projects won't themselves be checking these things in, so the chances of these getting flagged should be low.
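As a rough illustration of why signature-style matching throws up false positives on boilerplate, here is a toy sketch in Python. It is emphatically not Protex's actual algorithm, which is not described here; the per-line hashing, the `line_signatures` and `containment` helpers and the two sample files are all made up for the example. It simply hashes each non-blank line and measures how much of one file's lines also appear in another.

```python
import hashlib


def line_signatures(source: str) -> set:
    """Hash every non-blank, whitespace-stripped line (a toy 'signature')."""
    sigs = set()
    for line in source.splitlines():
        stripped = line.strip()
        if stripped:
            sigs.add(hashlib.sha1(stripped.encode("utf-8")).hexdigest())
    return sigs


def containment(candidate: str, known: str) -> float:
    """Fraction of the candidate's line signatures also found in the known file."""
    cand = line_signatures(candidate)
    if not cand:
        return 0.0
    return len(cand & line_signatures(known)) / len(cand)


# Two hypothetical files that share nothing but boilerplate lines.
OURS = """\
import java.io.IOException;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;

public abstract class OurInputFormat {
    int splitSize = 64;
}
"""

THEIRS = """\
import java.io.IOException;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;

public class SomeoneElsesFormat {
    String name = "other";
}
"""

score = containment(OURS, THEIRS)
print(f"shared line signatures: {score:.0%}")
```

In this sketch most of the candidate file "matches" even though the only lines in common are boilerplate, which is the kind of hit a reviewer then has to dismiss by hand.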
As a more practical example, I had a recent case where I was working through an analysis of some Hadoop-related code my company is considering open sourcing, which is primarily a collection of implementations of InputFormat and OutputFormat. A good number of our code files were flagged as potential matches, and when reviewed the only similarity was that we had the same set of imports as many other Hadoop ecosystem projects. This is of course exacerbated by the fact that many developers use IDEs which organise their imports! So I had to spend several hours checking each file and ticking boxes in Protex to say that this was original code and not plagiarised.

I would definitely recommend carrying out a POC and seeing what people make of it, but be aware that it can be a painful and time-consuming process. If the tool is indeed Protex then, being familiar with it, I would be willing to help out with a POC.

Cheers,

Rob

On 31/03/2014 01:52, "Roman Shaposhnik" <r...@apache.org> wrote:

>Hi!
>
>a few recent discussions around IP management
>in the Incubator have led to an interesting dialogue
>between the fine folks from Black Duck Software and
>yours truly.
>
>The main idea here is that, perhaps, Black Duck
>services can be as helpful to the open source
>communities as the ones provided by the likes
>of free Coverity scans.
>
>Of course, the best way to assess how much value
>this potential collaboration can bring to the ASF
>projects is to run a POC and see for ourselves.
>Also, I think, if there's a single place in the ASF that
>would benefit the most from an additional pair of eyes
>looking for potential IP related issues that would be
>incubating projects.
>
>Hence, I'd like to propose that we do just that: run
>a POC on a couple of projects identified by the
>Incubator community. I will be the central contact
>person for this, but I'd very much appreciate folks
>volunteering to help.
>
>This thread aims at three things:
>   1. collecting general feedback on whether
>       this is a good or bad idea.
>
>   2. folks volunteering to help with the POC
>       (if needed).
>
>   3. folks suggesting Incubator projects
>       for the first POC that we run with Black Duck.
>
>Please share your thoughts and feedback!
>
>Thanks,
>Roman.
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>For additional commands, e-mail: general-h...@incubator.apache.org
>