On Tue, Jan 22, 2013 at 5:58 PM, Pedro Giffuni <p...@apache.org> wrote:
> ----- Original message -----
>
>> From: Andrea Pescetti
>>
>> Pedro Giffuni wrote:
>>> It would be good to run a RAT scan over the website. We have not done
>>> anything to clean the content licensewise and we probably carry
>>> copyleft content, including code, there!
>>
>> The website contains gigabytes of materials for which we are probably
>> unable to trace detailed history and licensing, since they come from
>> multiple CVS repositories, then lost and migrated to multiple SVN
>> repositories, then lost and migrated to the current tree.
>>
>> So a RAT scan probably wouldn't yield anything actionable.
>>
>> The only thing we know for sure is that all those materials were
>> contributed to be put on the openoffice.org website and that we are
>> continuing to keep them online. Even if there is copyleft content or
>> code I believe it will be fine so long as we don't put it in a release
>> (and it won't happen that some site contents go into a release without
>> a thorough check).
>>
>
> If we are distributing code there it is our responsibility.
>
> I am afraid there are also tarballs that deserve special consideration.
> I recall we were carrying a GPL'd Slovenian dictionary (not sure if I finally
> got rid of it). Some content like the SDK should be verified for licensing
> content and updated.
>

If you have specific examples, that would be great. I think a RAT scan
on the website would be too noisy, and it only gets the static pages,
not the content on the wiki.

> The fact that information was transferred through CVS and SVN or whatever
> is irrelevant; we should know what we have, and ultimately after any cleanup
> SVN will remember what we had in there.
>
> I understand we are underpowered to fix all that but the biggest problem is
> that we don't have any accounting over the content there, so it's a can of
> worms waiting to be opened.
>

There is a range of potential issues, of varying severity:

1) Something hosted where we have no legal permission to host it.
That would be bad.

2) Something hosted where there is a suggestion that it is an
Apache-approved release but it isn't. That is a policy issue, not a
legal one. We could decide to add a disclaimer, or remove the
content. I'd take this on a case-by-case basis. There are parts of
the website, such as the forums and the wiki, where user content has
traditionally been hosted, under a variety of licences.

3) Distribution of significant files directly from the website, with
resulting bandwidth impact. This is an Infrastructure policy
violation, and the content would need to be moved.

There are perhaps other potential issues, but we'd really need to talk
specifics. In any case, I'd hope that all committers feel empowered to
fix such issues when they arise.

-Rob

> Pedro.
>