If you want to unsubscribe, please find instructions at
http://apache.org/foundation/mailinglists.html

And the name of this list is dev@community.apache.org

Cheers
Niclas

On Thu, May 7, 2015 at 7:48 AM, Betty James <bsquar...@gmail.com> wrote:

> Oh my gosh.  How do I get off this thread?  I don't know how I got on, but I
> am just a totally ignorant individual using Open Office and trying to
> donate (which doesn't sound necessary anymore)....so unless you are in good
> shape and in your 70's try to figure out how I can get off the list!
>
> Betty B. James
>
> On Tue, May 5, 2015 at 7:33 AM, Boris Baldassari <
> castalia.laborat...@gmail.com> wrote:
>
> > Hi Folks,
> >
> > Sorry for the late answer on this thread. Don't know what has been done
> > since then, but I've some experience to share on this, so here are my 2c.
> >
> > * Parsing dates and time zones:
> > If you are to use Perl, the Date::Parse module handles dates and time
> > zones pretty well. As for Python, I don't know -- there is probably a
> > module for that too.
> > I used Date::Parse to parse ASF mboxes (notably for Ant and JMeter; the
> > data sets have been published here [0]), and it worked great. I do have a
> > Perl script to do that, which I can provide -- but as far as I'm aware I
> > have no access to the dev SCM, and I'm not sure Perl is the most common
> > language here, so please let me know.
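
For the Python side, the standard library's email.utils.parsedate_to_datetime
parses RFC 2822 style Date headers, including numeric UTC offsets. Below is a
minimal sketch of normalising such headers to UTC. The to_utc helper and the
sample header value are illustrative assumptions, not part of any script
mentioned in this thread.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def to_utc(date_header):
    """Parse an RFC 2822 Date header and return an aware datetime in UTC."""
    parsed = parsedate_to_datetime(date_header)
    if parsed.tzinfo is None:
        # A "-0000" offset yields a naive datetime; this sketch treats it as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


# Illustrative header value (the +0800 offset is made up for the example).
print(to_utc("Thu, 7 May 2015 07:48:00 +0800").isoformat())
```

Converting everything to UTC before storing it keeps later comparisons and
per-day aggregations independent of each sender's local time zone.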
> >
> > * Parsing mboxes for software repository data mining:
> > There is a suite of tools targeted at exactly this kind of task on GitHub:
> > Metrics Grimoire [1], developed (and used) by Bitergia [2]. I don't know
> > how they manage time zones, but the tool suite is widely used (see [3] or
> > [4] for examples), so I believe it is quite robust. It includes tools for
> > data retrieval as well as visualisation.
> >
> > * As for the feedback/thoughts about the architecture and formats:
> > I love the REST API idea proposed by Rob. It makes the data really easy to
> > access and retrieve on demand through scripts. CSV and JSON are my
> > favourite formats because they are, again, easy to parse and widely used --
> > nearly every language has some facility to read them, natively or through
> > a library.
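
To illustrate how little code either format needs, here is a small sketch that
writes the same metadata records as JSON and as CSV using only the Python
standard library. The field names and the line counts are made up for the
example; no schema has been agreed on in this thread.

```python
import csv
import io
import json

# Hypothetical per-message metadata records; fields and values are illustrative.
records = [
    {"list": "dev@community.apache.org", "date": "2015-04-27", "lines": 18},
    {"list": "dev@community.apache.org", "date": "2015-04-28", "lines": 4},
]


def to_json(rows):
    """Serialise the records as a JSON array of objects."""
    return json.dumps(rows, indent=2)


def to_csv(rows):
    """Serialise the records as CSV with a header row taken from the keys."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_json(records))
print(to_csv(records))
```

Note that CSV loses type information (every value is read back as a string),
while JSON keeps numbers as numbers.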
> >
> >
> > Cheers,
> >
> >
> > [0] http://castalia.solutions/datasets/
> > [1] https://metricsgrimoire.github.io/
> > [2] http://bitergia.com
> > [3] Eclipse Dashboard: http://dashboard.eclipse.org/
> > [4] OpenStack Dashboard: http://activity.openstack.org/dash/browser/
> >
> >
> >
> > --
> > Boris Baldassari
> > Castalia Solutions -- Elegant Software Engineering
> > Web: http://castalia.solutions
> > Phone: +33 6 48 03 82 89
> >
> >
> > On 28/04/2015 16:11, Rich Bowen wrote:
> >
> >>
> >>
> >> On 04/27/2015 09:36 AM, Shane Curcuru wrote:
> >>
> >>> I'm interested in working on some visualizations of mailing list
> >>> activity over time, in particular some simple analyses, like thread
> >>> length/participants and the like.  Given that the raw data can all be
> >>> precomputed from mbox archives, is there any semi-standard way to
> >>> distill and save metadata about mboxes?
> >>>
> >>> If we had a generic static database of past mail metadata and statistics
> >>> (i.e. not details of contents, but perhaps overall # of lines of text or
> >>> something), it would be interesting to see what kinds of visualizations
> >>> that different people would come up with.
> >>>
> >>> Anyone have pointers to either a data format or the best parsing library
> >>> for this?  I'm trying to think ahead, and work on the parsing, storing
> >>> statistics, and visualizations as separate pieces so it's easier for
> >>> different people to collaborate on something.
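
As one possible starting point for the parsing piece, Python's standard
mailbox module can iterate over an mbox file. The sketch below computes thread
length and participant count from the Message-ID, In-Reply-To and From
headers. The function names and the simple threading rule are assumptions made
for illustration; real archives may need the References header and date
sorting as well.

```python
import mailbox
from collections import defaultdict
from email.utils import parseaddr


def thread_stats(messages):
    """Return {root Message-ID: {"length": int, "participants": int}}."""
    root_of = {}                # Message-ID -> Message-ID of its thread root
    senders = defaultdict(set)  # root Message-ID -> set of sender addresses
    lengths = defaultdict(int)  # root Message-ID -> number of messages
    for msg in messages:
        msg_id = (msg.get("Message-ID") or "").strip()
        parent = (msg.get("In-Reply-To") or "").strip()
        # Assumes messages arrive in date order, so a parent is seen first;
        # a reply whose parent is unknown starts its own thread.
        root = root_of.get(parent, msg_id) if parent else msg_id
        root_of[msg_id] = root
        lengths[root] += 1
        senders[root].add(parseaddr(msg.get("From", ""))[1].lower())
    return {
        root: {"length": lengths[root], "participants": len(senders[root])}
        for root in lengths
    }


def stats_for_mbox(path):
    """Convenience wrapper: compute thread statistics for an mbox file."""
    return thread_stats(mailbox.mbox(path))
```

Because thread_stats only needs an iterable of message objects, the parsing
step stays separate from whatever storage or visualisation is built on top of
the resulting dictionary.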
> >>>
> >>
> >> Roberto posted something to the list a month or so ago about the efforts
> >> that he's been working on for this kind of thing. You might ping him.
> >>
> >> --Rich
> >>
> >>
> >>
> >
>



-- 
Niclas Hedhman, Software Developer
http://zest.apache.org - New Energy for Java
