Hello,

This is a great start! I am interested in helping with the development of a
crowd sourcing application. The next step would be creating a set of
requirements for the web app. Would the ORP wiki be a good place to store
the requirements?

--Dan


On Tue, Sep 14, 2010 at 9:51 AM, Grant Ingersoll <[email protected]> wrote:

> I think the biggest hurdle we have in front of us is curating a data set
> that we can redistribute.  I'm in the process of uploading all the ASF
> public mail archives as of Sept. 13 to Amazon S3.  I also have some tools
> (thanks to Chris Rhodes) for processing this into Solr XML.  I think this
> would give us a standard corpus to start with and would mimic some
> enterprise search/eDiscovery tasks fairly well.
>
> At any rate, as with any community, the proof is in people stepping up to
> help out.  I like that so many people suggested we keep going.  As for what
> to do, I think the options are pretty wide open and there is opportunity for
> people to define the project w/o any previous encumbrances.
>
> Some ideas that have been kicked around in the past:
> 1. Creative-commons data set, judgments, queries
> 2. Open Street Map (spatial search)
> 3. Mail archives
> 4. A crowd sourcing application.  Given a set of documents and queries,
> have people provide judgments.  Ideally, this runs in a web container and we
> could probably even find resources to host it here.  Combining that with one
> of the items above, we would be on our way.  The app could also solicit
> queries by giving users an open search box and opportunities to browse the data.
>
> I know much of this is simplistic, but it is a start.
>
> -Grant
>
>
> On Sep 13, 2010, at 9:04 PM, Dan Cardin wrote:
>
> > Hello,
> >
> > I am new to ORP. I would like to contribute to the project. I do not have a
> > lot of experience in this field of IR, crowd sourcing or AI. If someone
> > could take the lead and set a forward path, I would be willing to contribute
> > my skill set to ORP.
> >
> > How can I help? I have a lot of experience doing software development and
> > system administration.
> >
> > Cheers,
> > --Dan
> >
> > On Mon, Sep 13, 2010 at 1:36 PM, Omar Alonso <[email protected]> wrote:
> >
> >> I think ORP is a great candidate for crowdsourcing/human computation. In
> >> the last year or so there's been quite a bit of research and applications
> >> on this. See the page for the SIGIR workshop on using crowdsourcing for IR
> >> evaluation:
> >> http://www.ischool.utexas.edu/~cse2010/
> >>
> >> Omar
> >>
> >> --- On Mon, 9/13/10, Itamar Syn-Hershko <[email protected]> wrote:
> >>
> >>> From: Itamar Syn-Hershko <[email protected]>
> >>> Subject: Re: Whither ORP?
> >>> To: [email protected]
> >>> Date: Monday, September 13, 2010, 9:33 AM
> >>> With the proper two-way open-source
> >>> development process (taking and then giving) I think it can
> >>> become an important part of open-IR technologies, just like
> >>> what Lucene did to the search engines world. What ORP has to
> >>> offer is of great interest to HebMorph, an open-source
> >>> project of mine trying to decide on what is the best way to
> >>> index and search Hebrew texts.
> >>>
> >>> To this end I decided to put some of the development
> >>> efforts of the HebMorph project into making tools for the
> >>> ORP. I have announced this before, but unfortunately I had
> >>> to attend to more pressing tasks before I could complete
> >>> this (and there was no response from the community
> >>> anyway...). Just in case you're interested in seeing what I
> >>> came up with so far: http://github.com/synhershko/Orev.
> >>>
> >>> IMHO, the ORP should stand by itself, and relate to
> >>> Lucene/Solr only as its basis framework for these initial
> >>> stages. Perhaps also try to attract more people who could
> >>> find an interest in what it has to offer, so it can really
> >>> start growing.
> >>>
> >>> Itamar.
> >>>
> >>> On 12/9/2010 1:29 PM, Grant Ingersoll wrote:
> >>>> On Sep 11, 2010, at 8:51 PM, Robert Muir wrote:
> >>>>
> >>>>
> >>>>> i propose we take what we have and import into lucene-java's benchmark
> >>>>> contrib.  it already has integration with wikipedia and reuters for perf
> >>>>> purposes, and the quality package is actually there anyways.  later, maybe
> >>>>> more people have time and contrib/benchmark evolves naturally... e.g. to
> >>>>> modules/benchmark with solr support as a first big step.
> >>>>>
> >>>> Yeah, that seems reasonable.  I have been thinking lately that it might
> >>>> be useful to pull our DocMaker stuff out separately from benchmark so that
> >>>> people have easy ways of generating content from things like Wikipedia, etc.
> >>>>
> >>>> Still, at the end of the day, I like what ORP _could_ bring to the table
> >>>> and to some extent I think that is lost by folding it into Lucene benchmark.
> >>>>
> >>>>
> >>>>> On Sep 11, 2010 7:33 PM, "Grant Ingersoll" <[email protected]> wrote:
> >>>>>
> >>>>>> Seems ORP isn't really catching on with people. I know personally I don't
> >>>>>> have the time I had hoped to have to get it going. At the same time, I
> >>>>>> really think it could be a good project. We've got some tools put together,
> >>>>>> but we still haven't done much about the bigger goal of a "self contained"
> >>>>>> evaluation.
> >>>>>>
> >>>>>> Any thoughts on how we should proceed with ORP?
> >>>>>>
> >>>>>> -Grant
> >>>>>>
> >>>>
> >>>>
> >>>>
> >>
> >>
> >>
> >>
>
> --------------------------
> Grant Ingersoll
> http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8
>
>
