Hi,

There is so much to comment on... The world is not black & white, Assaf.

The fact that Savannah data is not easily parsable is a form of
protection, admittedly weak, but still. If it's so easy to grab the
data then let's not worry about publishing a dump ourselves.

This is also a legal matter, as soon as you deal with personal data
aggregation. Let's not confuse data and aggregated data: checking what
date you coded a feature is one thing, profiling your work-hours habits
by aggregating your activity is another.

In addition, in the current context of the NSA aggregating data, I think
it'd be a bad PR move to start shipping out most of our DB for the sake
of it.

Discussion with Savannah users: yes, there are a lot of users, but we
can still initiate a discussion, e.g. on planet.gnu.org or on
savannah-users, covering enough users to get representative feedback.

BUT as a prerequisite of all this, I'll re-ask my initial point more
clearly: what good is there in working specifically on the Savane
db-structure/project, which various Savannah Hackers, me included, have
failed to revive in the past, and which has now been dead for years,
when there are other forge projects alive?

Just for reference, I just spent the last month full-time revamping
FusionForge's build system and it's now fully packaged for Debian and
RedHat, with automated install. I'll spend next month implementing
no-cron, immediate system replication. And I don't even think
FusionForge is the most active project.

Check the "Unfork!" talk I gave at the GNU Hackers Meeting, whose video
will be posted soon, if you're not convinced already :)

- Sylvain

On Tue, Sep 02, 2014 at 09:49:35PM +0000, Karl Berry wrote:
> Hi Assaf,
>
>     1. Hacking on the GNU Savannah code itself will be easier, and more
>     inviting.
>     2. Bringing the current Savannah code up-to-date.
>     3. Examining the current databases as preparation for migration to a
>     new platform.
>
> All those are unquestionably (IMO) valid arguments, but don't speak to
> making a database dump *publicly* available. Available on the Savannah
> machines would be enough for those purposes.
>
>     4. Allowing interested people to explore the GNU Savannah public
>     data (which is already public), develop new useful features, and
>     finding new interesting statistics.
>
> Ok, that could be persuasive.
>
>     As I've written in the previous email, I consider "public" only
>     information that's available to non-logged users.
>
> That sounds very sensible to me.
>
>     I agree that there is a technical difference between making an
>     interested user jump through web-parsing hoops, and between
>     providing an SQL-based database which allows simple queries. But
>     there is no conceptual difference. The information is already out
>     there.
>
> I agree there is no conceptual difference, and I also agree there is an
> important technical difference.
>
>     Are the SQL commands I've listed in the previous emails adequate for
>     removing private information?
>
> As Sylvain said, it still seems much better to me, in principle, to
> extract only the public information than delete the private. Better to
> err on the side of less being included.
> select from bugs where privacy=0 or whatever ...
>
> thanks,
> k
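
To make the "extract only the public information" idea concrete, here is a minimal sketch of an allowlist-style export. It is only an illustration under loud assumptions: it uses Python's sqlite3 module with in-memory databases as a stand-in for the real database, and apart from the `bugs` table and `privacy` column mentioned above, every column name (`bug_id`, `summary`, `created`, `submitter_email`) and helper name is hypothetical, not the actual Savane schema.

```python
import sqlite3

# Hypothetical allowlist: table -> (columns to publish, row filter).
# Only `bugs` and `privacy` come from the thread; the rest is invented.
PUBLIC_EXPORT = {
    "bugs": (["bug_id", "summary", "created"], "privacy = 0"),
}


def export_public(src, dst):
    """Copy only allowlisted columns of public rows from src into dst.

    Anything not named in PUBLIC_EXPORT is never read, so a table or
    column that was forgotten stays private by default.
    """
    for table, (columns, row_filter) in PUBLIC_EXPORT.items():
        cols = ", ".join(columns)
        rows = src.execute(
            f"SELECT {cols} FROM {table} WHERE {row_filter}"
        ).fetchall()
        dst.execute(f"CREATE TABLE {table} ({cols})")
        placeholders = ", ".join("?" for _ in columns)
        dst.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    dst.commit()


def demo():
    """Build a toy source database and return the exported rows."""
    src = sqlite3.connect(":memory:")
    src.execute(
        "CREATE TABLE bugs (bug_id INTEGER, summary TEXT, created TEXT, "
        "submitter_email TEXT, privacy INTEGER)"
    )
    src.executemany(
        "INSERT INTO bugs VALUES (?, ?, ?, ?, ?)",
        [
            (1, "crash on startup", "2014-08-30", "a@example.org", 0),
            (2, "security issue", "2014-09-01", "b@example.org", 1),
        ],
    )
    dst = sqlite3.connect(":memory:")
    export_public(src, dst)
    return dst.execute("SELECT * FROM bugs ORDER BY bug_id").fetchall()


if __name__ == "__main__":
    print(demo())
```

The point of the design is the failure mode: with a "delete the private data" script, a forgotten column leaks; with an allowlist, a forgotten column is simply missing from the dump.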