I've been contributing to SO for a while now. Here are a few observations I'd like to contribute to the discussion:

The questions on SO are often entry-level. "Harder" questions (those that require expertise in a certain area) remain unanswered for a while. The same questions here on the list (they are often cross-posted) receive a faster turnaround.

Roughly speaking, there are two groups of questions: implementing things on Spark and running Spark. The second group is borderline under SO guidelines, as those questions often involve cluster setups, long logs and little idea of what's going on (mind you, they often come from people starting with Spark).

In my opinion, Stack Overflow offers a better Q/A experience. In particular, it has tooling in place to reduce duplicates, something that often overloads this list (the same "getting started" issues or "how to map, filter, flatmap" over and over again). That said, this list offers a richer forum, where the expertise pool is a lot deeper.

Also, while SO is fairly strict in requiring posters to show a minimal amount of effort in the question being asked, this list is quite tolerant of questions that show none. This is probably one element that makes the list 'lower impedance'.

One additional thing on SO is that [apache-spark] is a 'low rep' tag. Neither questions nor answers get significant voting, which reduces the 'rep gaming' factor (discouraging participation?).

Thinking about how to improve both platforms, SO [apache-spark] and this ML, and how to get the list back to "not overwhelming" message volumes, we could implement some 'load balancing' policies:

- encourage new users to use Stack Overflow; in particular, redirect newbie questions to SO the friendly way: "did you search SO already?" or a link to an existing question.
  - most "how to map, flatmap, filter, aggregate, reduce, ..." questions would fall under this category
- encourage domain experts to hang out on SO more often (my impression is that MLlib and GraphX are fairly underserved)
- have an 'escalation process' in place, where we could post 'interesting/hard/bug' questions from SO back to the list (or encourage the poster to do so)
- update our "community guidelines" at [http://spark.apache.org/community.html] to implement such policies.

Those are just some ideas on how to improve the community and better serve newcomers while avoiding overload of our existing expertise pool.

kr, Gerard.

On Thu, Jan 22, 2015 at 10:42 AM, Sean Owen <so...@cloudera.com> wrote:

> Yes, there is some project business like votes of record on releases that
> needs to be carried on in standard, simple accessible place and SO is not
> at all suitable.
>
> Nobody is stuck with Nabble. The suggestion is to enable a different
> overlay on the existing list. SO remains a place you can ask questions too.
> So I agree with Nick's take.
>
> BTW are there perhaps plans to split this mailing list into
> subproject-specific lists? That might also help tune in/out the subset of
> conversations of interest.
> On Jan 22, 2015 10:30 AM, "Petar Zecevic" <petar.zece...@gmail.com> wrote:
>
>>
>> Ok, thanks for the clarifications. I didn't know this list has to remain
>> as the only official list.
>>
>> Nabble is really not the best solution in the world, but we're stuck with
>> it, I guess.
>>
>> That's it from me on this subject.
>>
>> Petar
>>
>>
>> On 22.1.2015. 3:55, Nicholas Chammas wrote:
>>
>> I think a few things need to be laid out clearly:
>>
>>    1. This mailing list is the “official” user discussion platform. That
>>       is, it is sponsored and managed by the ASF.
>>    2. Users are free to organize independent discussion platforms
>>       focusing on Spark, and there is already one such platform in Stack
>>       Overflow under the apache-spark and related tags. Stack Overflow
>>       works quite well.
>>    3. The ASF will not agree to deprecating or migrating this user list
>>       to a platform that they do not control.
>>    4. This mailing list has grown to an unwieldy size and discussions
>>       are hard to find or follow; discussion tooling is also lacking. We
>>       want to improve the utility and user experience of this mailing list.
>>    5. We don’t want to fragment this “official” discussion community.
>>    6. Nabble is an independent product not affiliated with the ASF. It
>>       offers a slightly better interface to the Apache mailing list
>>       archives.
>>
>> So to respond to some of your points, pzecevic:
>>
>> Apache user group could be frozen (not accepting new questions, if that’s
>> possible) and redirect users to Stack Overflow (automatic reply?).
>>
>> From what I understand of the ASF’s policies, this is not possible. :(
>> This mailing list must remain the official Spark user discussion platform.
>>
>> Other thing, about new Stack Exchange site I proposed earlier. If a new
>> site is created, there is no problem with guidelines, I think, because
>> Spark community can apply different guidelines for the new site.
>>
>> I think Stack Overflow and the various Spark tags are working fine. I
>> don’t see a compelling need for a Stack Exchange dedicated to Spark, either
>> now or in the near future. Also, I doubt a Spark-specific site can pass the
>> 4 tests in the Area 51 FAQ <http://area51.stackexchange.com/faq>:
>>
>>    - Almost all Spark questions are on-topic for Stack Overflow
>>    - Stack Overflow already exists, it already has a tag for Spark, and
>>      nobody is complaining
>>    - You’re not creating such a big group that you don’t have enough
>>      experts to answer all possible questions
>>    - There’s a high probability that users of Stack Overflow would enjoy
>>      seeing the occasional question about Spark
>>
>> I think complaining won’t be sufficient.
>> :)
>>
>> Someone expressed a concern that they won’t allow creating a
>> project-specific site, but there already exist some project-specific sites,
>> like Tor, Drupal, Ubuntu…
>>
>> The communities for these projects are many, many times larger than the
>> Spark community is or likely ever will be, simply due to the nature of the
>> problems they are solving.
>>
>> What we need is an improvement to this mailing list. We need better
>> tooling than Nabble to sit on top of the Apache archives, and we also need
>> some way to control the volume and quality of mail on the list so that it
>> remains a useful resource for the majority of users.
>>
>> Nick
>>
>>
>> On Wed Jan 21 2015 at 3:13:21 PM pzecevic <petar.zece...@gmail.com>
>> wrote:
>>
>>> Hi,
>>> I tried to find the last reply by Nick Chammas (that I received in the
>>> digest) using the Nabble web interface, but I cannot find it (perhaps he
>>> didn't reply directly to the user list?). That's one example of Nabble's
>>> usability.
>>>
>>> Anyhow, I wanted to add my two cents...
>>>
>>> Apache user group could be frozen (not accepting new questions, if that's
>>> possible) and redirect users to Stack Overflow (automatic reply?). Old
>>> questions remain (and are searchable) on Nabble, new questions go to
>>> Stack Exchange, so no need for migration. That's the idea, at least, as
>>> I'm not sure if that's technically doable... Is it?
>>> dev mailing list could perhaps stay on Nabble (it's not that busy), or
>>> have a special tag on Stack Exchange.
>>>
>>> Other thing, about new Stack Exchange site I proposed earlier. If a new
>>> site is created, there is no problem with guidelines, I think, because
>>> Spark community can apply different guidelines for the new site.
>>>
>>> There is a FAQ about creating new sites:
>>> http://area51.stackexchange.com/faq
>>> It says: "Stack Exchange sites are free to create and free to use.
>>> All we ask is that you have an enthusiastic, committed group of expert
>>> users who check in regularly, asking and answering questions."
>>> I think this requirement is satisfied...
>>> Someone expressed a concern that they won't allow creating a
>>> project-specific site, but there already exist some project-specific
>>> sites, like Tor, Drupal, Ubuntu...
>>>
>>> Later, though, the FAQ also says:
>>> "If Y already exists, it already has a tag for X, and nobody is
>>> complaining"
>>> (then you should not create a new site). But we could complain :)
>>>
>>> The advantage of having a separate site is that users, who should have
>>> more privileges, would need to earn them through Spark questions and
>>> answers only. The other thing, already mentioned, is that the community
>>> could create Spark specific guidelines. There are also 'meta' sites for
>>> asking questions like this one, etc.
>>>
>>> There is a process for starting a site - it's not instantaneous. New site
>>> needs to go through private beta and public beta, so that could be a
>>> drawback.
>>>
>>>
>>> Like btiernay, I must say: there might be something about Apache projects
>>> and mailing lists that I do not know, so excuse me if that is the case...
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Discourse-A-proposed-alternative-to-the-Spark-User-list-tp20851p21299.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org
>>>