On Sat, May 11, 2013 at 1:02 AM, janI <j...@apache.org> wrote:
> On 11 May 2013 01:20, Rob Weir <robw...@apache.org> wrote:
>
>> On Thu, May 9, 2013 at 3:07 AM, Andrea Pescetti <pesce...@apache.org>
>> wrote:
>> > janI wrote:
>> >>
>> >> On 9 May 2013 00:23, Rob Weir wrote:
>> >>>
>> >>> 5) Development is also made more difficult by the intrinsic complexity
>> >>> of the code base, the build system and the poor state of the developer
>> >>> documentation.
>> >>
>> >> yes, because we try to tell people to grasp the whole system in one go
>> >>
>> >> "start by building AOO", that is a nice way of making a developer feel
>> >> insecure.
>> >
>> >
>> > What can be done to improve this? Building OpenOffice is the first step
>> > for any core code contributor, I can't understand why this makes people
>> > feel insecure. But it's a critical point, so whatever we can do to
>> > improve this stage will help.
>> >
>> > For example, impatient people who do not use "./configure --help" will
>> > have a hard time figuring out the errors with dmake and epm, and
>> > downloading the prebuilt unowinreg from
>> > http://www.openoffice.org/tools/unowinreg_prebuild/
>> > could be simplified or automated... But none of this seems a tremendous
>> > improvement to me. Can we do better? Does
>> > http://wiki.openoffice.org/wiki/Documentation/Building_Guide_AOO
>> > need some rewriting?
>> >
>> >
>> >> What we need to do (in my opinion) is to define small tasks
>> >> (preferably "hot" topics), not bug fixes (remember, an office system is
>> >> not really "hot" for developers, and solving bugs is boring).
>> >
>> >
>> > The "hotness" concept is correct. For example, a key feature of
>> > OpenOffice 4 is the sidebar; it may well be one of the "hot" developments
>> > you describe. So far it has been developed in a branch, with no public
>> > communication, then moved to trunk, again with no public communication,
>> > then subject to QA (again, no public communication except for mailing
>> > lists).
>> >
>> > If we identified 3-5 small development tasks to make the sidebar better
>> > or fix the discovered bugs, and exposed them in a blog post, we could
>> > (realistically) attract 3-5 new developers. But the trade-off would be
>> > that Andre, or some other expert developer, must spend time on mentoring
>> > new developers instead of making the fixes himself, which would probably
>> > take him less time. And those developers would need to work on less
>> > documented (since they are evolving) features, so this is feasible only
>> > if an experienced developer is willing to invest a lot of time on it, as
>> > an investment to get more developers and better documentation.
>> >
>> >> people.apache.org/~jani/topCommit6mdr.txt
>> >> people.apache.org/~jani/topCommit1year.txt
>> >> people.apache.org/~jani/topCommit2year.txt
>> >
>> >
>> > Before we see direct links to these resources appearing everywhere and
>> > purported as official published statistics from the project, I'd
>> > recommend
>>
>> It took less than a day, and these numbers are already being used on
>> LWN.net:
>>
>> http://lwn.net/Articles/550079/
>>
>> There is an art to working on a high-profile project that is monitored
>> closely by detractors, and part of it is not to quote statistics
>> unless you are sure they are meaningful.  The comparison of this year
>> versus last year ignores the fact that last year had 12 months of data
>> while this year has had only 4, and that is easy to forget when the
>> data is quoted out of context.  But it makes a huge difference when
>> you consider that contributors come and go according to their
>> interests, and any longer period of time will show more of them.
>>
>
> I strongly disagree with you here, but to follow your point: I think the
> comparison of active committers is meaningful, otherwise I would not have
> written the mail. When you look at the number of active committers over
> time, the actual time period is not critical (if it is relatively big).
> The explanation of the numbers says why they are not comparable.
>

In any development community you will have some core members who
remain over a longer, but still finite period of time.  And you will
have some who come in, "scratch an itch", do what they want and leave.
At major transition points, major releases, some will feel a sense of
accomplishment and decide it is a good time to move on.  You'll see
this in every project.

So comparing number of committers in 1 week versus 1 month versus 3
months versus 1 year versus 2 years, etc., will always show that
relationship.  The longer period of time will always show more
committers.  Even with constant average number of committers this will
be true.  Even with moderate growth this will be true.
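
A minimal sketch of this effect, assuming a purely hypothetical project
in which 3 core committers are active every week and 2 short-term
committers appear in each week and then leave.  The names, the numbers
and the model itself are illustrative assumptions, not project data.
Weekly activity is constant at 5 committers, yet the count of distinct
committers grows with the length of the window:

```python
def weekly_committers(week, core=3, drive_by=2):
    """Committers active in one week of a hypothetical, constant-size project."""
    # Core members are the same people every week.
    names = {f"core-{i}" for i in range(core)}
    # Short-term contributors are new people each week (assumed, for illustration).
    names |= {f"visitor-{week}-{i}" for i in range(drive_by)}
    return names


def distinct_committers(start_week, n_weeks):
    """Count distinct committers seen in a window of n_weeks consecutive weeks."""
    seen = set()
    for week in range(start_week, start_week + n_weeks):
        seen |= weekly_committers(week)
    return len(seen)


# Same project, same weekly activity, different window lengths.
window_counts = {n: distinct_committers(0, n) for n in (1, 4, 13, 52, 104)}
for n, count in window_counts.items():
    print(f"{n:>3} weeks: {count} distinct committers")
```

In this toy model the 52-week window reports far more committers than
the 4-week window even though nothing about the project changed.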

You cannot fairly compare different time periods with different sample
lengths.  It doesn't tell you anything meaningful at all.  So when you
post something and claim it compares the number of committers "last
year" versus "this year", but this year is actually only a few months,
then you post numbers that are deceptive.  Maybe you didn't intend
it that way, but they are.

If you want to make a comparison over time you need to keep the length
of the period constant.  For example, compare January-April 2012 with
January-April 2013.  Or do numbers for rolling 12 month periods, or
something like that.  But if you mix different time periods and
different lengths of periods, and then label them with simplistic
labels like "this year" and "last year" you will have some people (as
happened this time) grab on to the numbers and not appreciate that
your numbers are incomparable.
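
As a rough sketch of the like-for-like comparison, assuming a commit
log is available as (date, committer) pairs: the entries and names
below are made up for illustration, and committers_between is a
hypothetical helper, not an existing tool.

```python
from datetime import date

# Hypothetical commit log: (commit date, committer id).  Not real project data.
commits = [
    (date(2012, 1, 15), "alice"),
    (date(2012, 3, 2), "bob"),
    (date(2012, 4, 20), "alice"),
    (date(2012, 9, 9), "carol"),
    (date(2012, 11, 30), "dave"),
    (date(2013, 2, 1), "alice"),
    (date(2013, 3, 17), "erin"),
    (date(2013, 4, 28), "bob"),
]


def committers_between(log, start, end):
    """Distinct committers with at least one commit in [start, end], inclusive."""
    return {who for when, who in log if start <= when <= end}


# Equal-length windows: the same four months in each year.
jan_apr_2012 = committers_between(commits, date(2012, 1, 1), date(2012, 4, 30))
jan_apr_2013 = committers_between(commits, date(2013, 1, 1), date(2013, 4, 30))
# Unequal window, shown only to illustrate the mismatch.
full_2012 = committers_between(commits, date(2012, 1, 1), date(2012, 12, 31))

print("Jan-Apr 2012:", len(jan_apr_2012))
print("Jan-Apr 2013:", len(jan_apr_2013))
print("All of 2012 :", len(full_2012))
```

With this made-up log, setting all of 2012 against January-April 2013
suggests a decline, while the two equal-length windows show the
opposite.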

> It cannot be correct that we cannot discuss facts in here... this mail
> thread has, in my opinion, been a good and productive discussion, which I
> don't think we would have had without the initial mail (I have tried to
> start exactly this discussion before).
>

We do need to be careful that our facts are thought out and are given
sufficient context and disclaimers that they do not immediately become
source material for Wikipedia.


> Just for the record, the number files do have a disclaimer, meaning the
> direct use made of them is not legal; furthermore, the first mail in the
> mail thread contains a description of the numbers.
>

I don't want to discourage you from looking at data or posting what
you find.  But if it is preliminary, unreviewed, perhaps subject to
methodological errors, provided to spur discussion, but not final
publication quality data, then it might be worth labeling it in such a
way that it is not immediately taken as official data from the
project.  This is something many of us have faced.

-Rob

> rgds
> jan I.
>
>
> -Rob
>>
>>
>> > putting the URL of this discussion (from markmail or mail-archive) in the
>> > files, so people can get your accompanying explanation and the
>> > follow-up. As you say, it's always dangerous to interpret numbers... but
>> > it's also very easy to play with numbers, so explanations are always
>> > needed when presenting numbers.
>> >
>> > Regards,
>> >   Andrea.
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscr...@openoffice.apache.org
>> > For additional commands, e-mail: dev-h...@openoffice.apache.org
>> >
>>
>>
>>
