riak search and solr/lucene

2010-11-04 Thread Joseph Lambert
I am using the PHP library for a project and was looking through the code to see what differentiates the Solr HTTP interface query versus the Lucene search (besides the syntax and the interface, etc) as paging is very useful for my code. From the PHP library with lucene I can do a search with lucen

Can we use Lucene components like analyzers ?

2010-11-04 Thread Prometheus WillSurvive
Hi friends, Riak Search uses the Lucene analyzer (Java). By the same logic, can we use or integrate other Lucene components, such as the highlighter, into Riak Search? Riak Search already calls outside Java components to benefit from Lucene's current analyzers. Any plans for this? Prometheus

Re: Map using a specific Time

2010-11-04 Thread Duff OMelia
> You need to parse the lastmodified time in the header like so: > > Date.parse(lastmodified) > > That method parses a date string into a unix int. I have a blog post about > it on my blog somewhere. This works wonderfully. Thanks so much for your help Alexander!
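For readers outside the JavaScript MapReduce context, the same idea can be sketched in Python: turn an HTTP-style Last-Modified string into epoch seconds so it can be compared against a cutoff time. This is only an illustration; the function names and the sample header value are hypothetical and do not come from the thread. Note also that JavaScript's Date.parse returns milliseconds since the epoch, not seconds.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def last_modified_to_epoch(header_value: str) -> int:
    """Convert an RFC 1123 date string (as used in Last-Modified) to epoch seconds."""
    parsed = parsedate_to_datetime(header_value)
    if parsed.tzinfo is None:
        # Assumption: a value without a usable zone is treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def modified_since(header_value: str, cutoff: datetime) -> bool:
    """Return True when the object was modified at or after the cutoff.

    The cutoff is expected to be a timezone-aware datetime.
    """
    return last_modified_to_epoch(header_value) >= int(cutoff.timestamp())


if __name__ == "__main__":
    sample = "Thu, 04 Nov 2010 16:02:59 GMT"  # hypothetical header value
    cutoff = datetime(2010, 11, 4, tzinfo=timezone.utc)
    print(last_modified_to_epoch(sample), modified_since(sample, cutoff))
```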

Re: Riak Recap for Nov. 1 - 2

2010-11-04 Thread Dan Reverri
Alexander is correct: a full bucket query on any bucket will perform a list-keys operation across all keys in the cluster. Thanks, Dan Daniel Reverri Developer Advocate Basho Technologies, Inc. d...@basho.com On Wed, Nov 3, 2010 at 9:25 PM, Alexander Sicular wrote: > Could we get some more clarification

riak-search Benchmar.k

2010-11-04 Thread Prometheus WillSurvive
Hi guys, a week ago I put data, ready to index in Riak Search via the Solr interface, on RapidShare to make it available to the community. I would love to get some benchmark results from you. Has anybody tested it? Prometheus

Date format for Riak Search and its JSON output

2010-11-04 Thread Nicolas Fouché
Hi, I can't find the right format for dates in Riak Search, for fields suffixed with "_dt". I tried rfc822 (e.g. "Thu, 04 Nov 2010 16:02:59 -"), and xmlschema/iso8601 (e.g. "2010-11-04T16:07:43Z"), but I get an {unhandled_type,date} error when requesting my data in JSON. No problem with XML.

Data stored in Search *and* KV ?

2010-11-04 Thread Nicolas Fouché
Hi, Following the discussion with seancribbs and Tv on #riak: http://pastie.org/private/acxekqyapbk7fz1hcp8udg I only index documents with the bucket precommit hook and I do all my searches via Map/Reduce queries. I'm not sure how Search works, if there is an inverted index and a forward index. I

Re: riak search and solr/lucene

2010-11-04 Thread Rusty Klophaus
Hi Joseph, Answers inline below. On Thu, Nov 4, 2010 at 12:49 AM, Joseph Lambert wrote: > I am using the PHP library for a project and was looking through the code > to see what differentiates the Solr HTTP interface query versus the Lucene > search (besides the syntax and the interface, etc) as

Re: Can we use Lucene components like analyzers ?

2010-11-04 Thread Rusty Klophaus
Hi Prometheus, That is definitely something we'll consider for the future, but it's currently not on the near-term roadmap. Best, Rusty On Thu, Nov 4, 2010 at 5:12 AM, Prometheus WillSurvive < prometheus.willsurv...@gmail.com> wrote: > Hi friends, > > RiakSearch using the lucene analyzer (java)

Re: Data stored in Search *and* KV ?

2010-11-04 Thread Rusty Klophaus
Hi Nicolas, Search stores a representation of the indexed object so that it knows what to do later if you delete the object, change the object, or re-index the object with a different analyzer. The representation allows us to delete the old inverted index entries correctly. We have considered add
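The point about keeping a representation of the indexed object can be illustrated with a toy inverted index. The sketch below is not Riak Search's implementation; the class name, method names, and the whitespace analyzer are all hypothetical. It only shows why remembering the terms a document produced makes it possible to remove stale postings when the document is changed, deleted, or re-indexed with a different analyzer.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index that remembers which terms each document produced."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document keys
        self.stored_terms = {}            # document key -> terms last indexed

    def index(self, key, text, analyzer=str.split):
        # Remove entries left by any earlier version before adding new ones.
        self.delete(key)
        terms = set(analyzer(text.lower()))
        for term in terms:
            self.postings[term].add(key)
        self.stored_terms[key] = terms

    def delete(self, key):
        # The stored representation says exactly which postings to clean up.
        for term in self.stored_terms.pop(key, set()):
            self.postings[term].discard(key)
            if not self.postings[term]:
                del self.postings[term]

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


if __name__ == "__main__":
    idx = TinyIndex()
    idx.index("doc1", "Riak Search")
    idx.index("doc1", "Riak KV")  # re-index: the old "search" posting is removed
    print(idx.search("search"), idx.search("kv"))
```

Without the stored terms, a re-index would have no way of knowing that "doc1" once appeared under "search", and the stale entry would linger.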

Re: Date format for Riak Search and its JSON output

2010-11-04 Thread Rusty Klophaus
Hi Nicolas, Thank you for the excellent description of the problem. There are currently issues/bugs around date support in Riak Search, looks like you found another. I'd recommend treating dates as strings for now. The issue you described is now tracked here: https://issues.basho.com/show_bug.cgi?
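One way to act on the "treat dates as strings" advice is to store a fixed-width UTC timestamp, since such strings compare in the same order as the moments they represent. The Python sketch below is only an illustration under that assumption: the helper names are hypothetical, and it says nothing about how Riak Search itself compares string fields.

```python
from datetime import datetime, timezone


def date_to_index_string(moment: datetime) -> str:
    """Format an aware datetime as a fixed-width UTC string such as 2010-11-04T16:07:43Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def in_range(value: str, start: str, end: str) -> bool:
    """Inclusive range check using plain string comparison."""
    return start <= value <= end


if __name__ == "__main__":
    created = date_to_index_string(datetime(2010, 11, 4, 16, 7, 43, tzinfo=timezone.utc))
    print(created, in_range(created, "2010-11-01T00:00:00Z", "2010-11-30T23:59:59Z"))
```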

Re: Date format for Riak Search and its JSON output

2010-11-04 Thread Nicolas Fouché
I monkey-patched Riak::Client::CurbBackend to get useful logs https://gist.github.com/662986 Best, Nicolas On Thu, Nov 4, 2010 at 7:42 PM, Rusty Klophaus wrote: > Hi Nicolas, > Thank you for the excellent description of the problem. There are currently > issues/bugs around date support in Riak S

Re: Data stored in Search *and* KV ?

2010-11-04 Thread Nicolas Fouché
Thanks, I understand now. These two products are still a bit "separate". Best, Nicolas On Thu, Nov 4, 2010 at 7:21 PM, Rusty Klophaus wrote: > Hi Nicolas, > Search stores a representation of the indexed object so that it knows what > to do later if you delete the object, change the object, or r

Re: Date format for Riak Search and its JSON output

2010-11-04 Thread Sean Cribbs
You know you can also do this: client.http.send(:curl).verbose = true Sean Cribbs Developer Advocate Basho Technologies, Inc. http://basho.com/ On Nov 4, 2010, at 8:15 PM, Nicolas Fouché wrote: > I monkey-patched Riak::Client::CurbBackend to get useful logs > https://gist.github.com/662986 >

Re: riak search and solr/lucene

2010-11-04 Thread Joseph Lambert
Rusty, Sorry, I meant Lucene search. Solr can be passed start and count, Lucene search can't be, but they share functions in the Erlang code. - Joe Lambert joseph.g.lamb...@gmail.com +86 13656213284 On Fri, Nov 5, 2010 at 2:15 AM, Rusty Klophaus wrote: > Hi Joseph, > > Answers inline below.

Re: riak search and solr/lucene

2010-11-04 Thread Joseph Lambert
Disregard that last message. What I meant was, in a Solr query, all the results are returned, then sorted, and only then is the chunk requested by the start and count parameters taken. Why not instead make the results of the search() function the input of a MapReduce job, and if the user adds s
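The sort-then-slice behaviour described in this message can be sketched in a few lines of Python. This is a hypothetical illustration, not the Erlang code under discussion: the function name, the result shape, and the "score" field are assumptions, while start and count follow the naming used in the message.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def page(results: List[T], start: int, count: int,
         sort_key: Callable[[T], object]) -> List[T]:
    """Sort the full result set, then return the requested window."""
    ordered = sorted(results, key=sort_key)  # cost depends on all results
    return ordered[start:start + count]      # only this slice is returned


if __name__ == "__main__":
    hits = [
        {"id": "a", "score": 0.2},
        {"id": "b", "score": 0.9},
        {"id": "c", "score": 0.5},
        {"id": "d", "score": 0.7},
    ]
    # Second page of two results, highest score first.
    print(page(hits, start=2, count=2, sort_key=lambda hit: -hit["score"]))
```

The sketch makes the cost visible: every result is collected and sorted even when only a small window is returned.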

Re: riak-search Benchmar.k

2010-11-04 Thread Pablo Borges
I haven't used your data, but I'm trying to benchmark Riak Search against Solr. For my test, I'm using each paragraph in the English version of Wikipedia as a document, as well as some sequential and random data in a couple of fields to reflect our current usage (which is a CMS). That's about 22 mi