Re: Does Java client support search with paging and sorting?

2012-07-20 Thread Brian Roach
Hello Lei,

The Java client currently doesn't use the Solr API for search; it uses 
Map/Reduce. Sadly, there's no support for those Solr parameters. The only way to 
do it currently would be to add phases that sort and then reduce the entire 
result set, which isn't exactly optimal.
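
To make the cost of that workaround concrete, here is a minimal sketch in plain Ruby (not the Java client, and not Riak's MapReduce API) of what emulating the 'start' and 'rows' parameters by sorting and then reducing amounts to. The `page` helper and the sample documents are hypothetical; the point is only that every matching result has to be gathered and sorted before a single page can be cut out.

```ruby
# Emulate Solr-style paging over an already-fetched result set.
# `results` stands in for everything the search matched; the helper
# name and the document fields are made up for illustration.
def page(results, sort_key:, start: 0, rows: 10)
  sorted = results.sort_by { |doc| doc[sort_key] }  # "sort" phase: touches every result
  sorted[start, rows] || []                         # "reduce" phase: keep one page
end

books = [
  { "id" => "b3", "title" => "Programming Ruby" },
  { "id" => "b1", "title" => "Programming Erlang" },
  { "id" => "b2", "title" => "Programming Pearls" }
]

first_page = page(books, sort_key: "title", start: 0, rows: 1)
puts first_page.map { |doc| doc["id"] }.inspect
```

The work grows with the size of the whole result set rather than with the size of the requested page, which is why native 'start' and 'rows' support is preferable.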

We do have plans to improve this situation on both sides. We're 
adding new features to the protocol buffers API that will support the 'rows' 
and 'start' parameters, which I am hoping to get into the client "soon". There's 
a bit of work there because it involves ripping out the old MR search code and 
retrofitting the new, which is going to require some interface-breaking changes 
above the old underlying PB client as well. I may also look at adding the 
Solr-like interface to the HTTP side of the client after that. 

Unfortunately, I don't have a firm timeline for you. The best I can offer at 
the moment is that the PB additions will hopefully land within the next month. 

As for help, we're always thrilled when other people make contributions :) If 
you were to add Solr API support to the original underlying HTTP client, 
it'd be awesome. 

Thanks,
Brian Roach

On Jul 19, 2012, at 2:41 PM, Lei Gu wrote:

> Hi,
> We are exploring using Riak as the persistent storage for our next project.
> Does the Java client support search with paging and sorting, like the web Solr 
> API?
> If yes, can you point us to the class/method that supports it?
> 
> Here is an example with page and number of rows per page set,
> 
> curl "http://localhost:8098/solr/books/select?start=0&rows=1&q=prog*"
> 
> If not, is there a plan to add the support to the Java client? Can we help?
> 
> Thanks.
> 
> -- Lei
> 
> 
> 
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com




Riak 1.2.0 RC2

2012-07-20 Thread Jared Morrow
Riak Users,

Riak 1.2.0 RC2 has been built and uploaded with fixes for issues reported both 
by users on this list and by internal testing.

The downloads page has been updated with the new build:  
http://basho.com/resources/downloads/

Also, the release notes have been updated with bugs fixed in RC2:  
https://github.com/basho/riak/blob/1.2/RELEASE-NOTES.md

The new set of bugs fixed is the following:

  * riak_api - Add riak_core as application dep to riak_api.app
  * riak_api - Register riak_api_stat mod with riak_core at start up
  * Add eleveldb:close - Fixes MANIFEST file missing bug
  * riak_kv - Call eleveldb:close before destroy
  * riak_control - Resolve base64 cookie truncation race condition.
  * Fix FreeBSD package permissions on sbin
  * Create SmartOS SMF service for epmd

Thanks as always for the feedback!

-Basho





Is it possible/wise to modify riak search index objects?

2012-07-20 Thread Metin Akat
Hi,

I am using riak to store (relatively large) text files. I store them as
normal riak objects where the value is the text of the file. Now I want to
index and search them. All is fine, I just enabled the "standard" search
pre-commit hook for that bucket and they get indexed nicely. But, there is
one tricky requirement. I need to be able to index and search some metadata
about these files. For example date of submission, size of file, type
(internal business logic) of file etc.

I have been thinking quite a lot about this recently. Asked several times
on #riak. I got one answer suggesting that I create a second "metadata"
riak object for each file, link it to the "file object" and index it
separately. That's not really what I want, because I need to be able to
execute "combined" queries, like value: AND date:.

So, here is the ideal solution that I'm thinking about: it would be
great if it were possible to modify the Riak Search index object. After the
file is submitted and indexed, I could just fetch the index and
add some more fields to it.
I see there is a bucket with the search index objects that's automatically
created by Riak Search. So I guess it is indeed possible, though I don't
know what to expect. Is it a good idea? If not, what else could I do in
order to solve the problem?

Regards,
Metin


Data.parse() does not seem to be working in a MapReduce phase

2012-07-20 Thread Ryan Lazuka
Data.parse() does not seem to be working in a MapReduce phase.
http://codepad.org/sT0E5ch5

Here is the resulting array:

[{"description":"","signed-in_date":"07-20-2012","signed-in_time":"9:43 pm","date-time":null},
 {"description":"","signed-in_date":"07-20-2012","signed-in_time":"9:41 pm","date-time":null}]

For example, it's trying to parse this date/time: "2012-07-20T21:41:00",
but that just results in a null value, as in the "date-time" properties
above.


Inconsistent results with secondary indexes and spaces

2012-07-20 Thread Paul Gross
I'm seeing different results on different platforms when performing a 2i 
query for a value that contains a space. On OS X, I find the object. On an 
Ubuntu Vagrant image used by Travis CI, I do not.


For example, here is my test script:

   require 'riak'

   client = Riak::Client.new
   bucket = client.bucket("test")
   # Start from an empty bucket
   bucket.keys.each { |k| bucket.delete(k) }

   # Store one object carrying two binary secondary indexes
   object = ::Riak::RObject.new(bucket, "key")
   object.content_type = "text/plain"
   object.data = "hello"
   object.indexes = {"with_space_bin" => "with space",
                     "without_space_bin" => "without_space"}
   object.store

   # Query each index by exact value
   puts "Found with space" if bucket.get_index("with_space_bin", "with space").any?
   puts "Found without space" if bucket.get_index("without_space_bin", "without_space").any?

When I connect to the Riak on OS X, it prints both "Found with space" and 
"Found without space". When I connect to the Riak running on Ubuntu, it only 
prints "Found without space". I'm running the Ruby code from my Mac both 
times, so the client library is exactly the same (riak-client 1.0.4). Is 
there a difference in the way Riak handles spaces on different 
platforms? Possibly a difference in Erlang versions? Both Riaks are 1.1.2.
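
One thing that might be worth ruling out (a guess, not a confirmed cause) is how the space in the index value is encoded on its way to each node. Over HTTP, a 2i exact-match query is a URL of the form /buckets/<bucket>/index/<index>/<value>, and common Ruby helpers encode a space in two different ways. The sketch below only builds the two candidate URLs so they can be tried with curl against both nodes; the `index_url` helper is hypothetical and not part of riak-client.

```ruby
require 'erb'
require 'cgi'

# Build the HTTP 2i query URL for an exact-match lookup.
# Hypothetical helper for illustration; not part of riak-client.
def index_url(bucket, index, encoded_value, host: "localhost", port: 8098)
  "http://#{host}:#{port}/buckets/#{bucket}/index/#{index}/#{encoded_value}"
end

value = "with space"

percent_encoded = ERB::Util.url_encode(value)  # space becomes %20
form_encoded    = CGI.escape(value)            # space becomes +

puts index_url("test", "with_space_bin", percent_encoded)
puts index_url("test", "with_space_bin", form_encoded)
```

If the two nodes answer these URLs differently, the difference lies in how the server side decodes the value rather than in the client library.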


Here are the stats on my mac:

% curl localhost:8098/stats
{"vnode_gets":408,"vnode_puts":195,"vnode_index_reads":638,"vnode_index_writes":195,"vnode_index_writes_postings":234,"vnode_index_deletes":63,"vnode_index_deletes_postings":225,"read_repairs":0,"vnode_gets_total":408,"vnode_puts_total":195,"vnode_index_reads_total":638,"vnode_index_writes_total":195,"vnode_index_writes_postings_total":234,"vnode_index_deletes_total":63,"vnode_index_deletes_postings_total":225,"node_gets":0,"node_gets_total":136,"node_get_fsm_time_mean":0,"node_get_fsm_time_median":0,"node_get_fsm_time_95":0,"node_get_fsm_time_99":0,"node_get_fsm_time_100":0,"node_puts":0,"node_puts_total":65,"node_put_fsm_time_mean":0,"node_put_fsm_time_median":0,"node_put_fsm_time_95":0,"node_put_fsm_time_99":0,"node_put_fsm_time_100":0,"node_get_fsm_siblings_mean":0,"node_get_fsm_siblings_median":0,"node_get_fsm_siblings_95":0,"node_get_fsm_siblings_99":0,"node_get_fsm_siblings_100":0,"node_get_fsm_objsize_mean":0,"node_get_fsm_objsize_median":0,"node_get_fsm_objsize_95":0,"node_get_fsm_objsize_99":0,"node_get_fsm_objsize_100":0,"read_repairs_total":0,"coord_redirs_total":0,"precommit_fail":0,"postcommit_fail":0,"cpu_nprocs":141,"cpu_avg1":392,"cpu_avg5":387,"cpu_avg15":384,"mem_total":417450,"mem_allocated":4159748000,"nodename":"riak@127.0.0.1","connected_nodes":[],"sys_driver_version":"1.5","sys_global_heaps_size":0,"sys_heap_type":"private","sys_logical_processors":4,"sys_otp_release":"R14B04","sys_process_count":1359,"sys_smp_support":true,"sys_system_version":"Erlang 
R14B04 (erts-5.8.5) [source] [64-bit] [smp:4:4] [rq:4] 
[async-threads:64] [hipe] 
[kernel-poll:true]","sys_system_architecture":"i386-apple-darwin11.2.0","sys_threads_enabled":true,"sys_thread_pool_size":64,"sys_wordsize":8,"ring_members":["riak@127.0.0.1"],"ring_num_partitions":64,"ring_ownership":"[{'riak@127.0.0.1',64}]","ring_creation_size":64,"storage_backend":"riak_kv_eleveldb_backend","pbc_connects_total":0,"pbc_connects":0,"pbc_active":0,"ssl_version":"4.1.6","public_key_version":"0.13","runtime_tools_version":"1.8.6","basho_stats_version":"1.0.2","riak_search_version":"1.1.2","riak_kv_version":"1.1.2","bitcask_version":"1.5.1","luke_version":"0.2.5","erlang_js_version":"1.0.2","mochiweb_version":"1.5.1","inets_version":"5.7.1","riak_pipe_version":"1.1.2","merge_index_version":"1.1.0","cluster_info_version":"1.2.1","basho_metrics_version":"1.0.0","riak_control_version":"0.1.0","riak_core_version":"1.1.2","lager_version":"1.0.0","riak_sysmon_version":"1.1.2","webmachine_version":"1.9.1","crypto_version":"2.0.4","os_mon_version":"2.2.7","sasl_version":"2.1.10","stdlib_version":"1.17.5","kernel_version":"2.14.5","executing_mappers":0,"memory_total":24676960,"memory_processes":9487312,"memory_processes_used":9466216,"memory_system":15189648,"memory_atom":1032393,"memory_atom_used":1008563,"memory_binary":509904,"memory_code":9056222,"memory_ets":831328,"ignored_gossip_total":0,"rings_reconciled_total":0,"rings_reconciled":0,"gossip_received":0,"handoff_timeouts":0,"converge_delay_min":"undefined","converge_delay_max":-1,"converge_delay_mean":0,"converge_delay_last":"undefined","rebalance_delay_min":"undefined","rebalance_delay_max":-1,"rebalance_delay_mean":0,"rebalance_delay_last":"undefined","riak_kv_vnodes_running":64,"riak_kv_vnodeq_min":0,"riak_kv_vnodeq_median":0,"riak_kv_vnodeq_mean":0,"riak_kv_vnodeq_max":0,"riak_kv_vnodeq_total":0,"riak_pipe_vnodes_running":64,"riak_pipe_vnodeq_min":0,"riak_pipe_vnodeq_median":0,"riak_pipe_vnodeq_mean":0,"riak_pipe_vnodeq_max":0,"riak_pipe_vnodeq_total":0}


And here are the stats on ubuntu:

$ curl localhost:8098/stats
{"vnode_gets":465,"vnode_puts":219,"vnode_index_reads":572,"vnode_index_writes":219,"vnode_index_writes_postings":234,"vnode_index_deletes":72,"v

Re: Is it possible/wise to modify riak search index objects?

2012-07-20 Thread Alexander Sicular
Turn your text into a json obj. Maybe something like this:

{ "size": 100,
  "Name": "bla",
  "Date": "1/1/2012",
  "Raw_txt": "txt"
}
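
As a rough sketch of that idea in Ruby (field names follow the outline above; the values are placeholders), the metadata and the raw text can be wrapped in one JSON document so that a single stored object carries both:

```ruby
require 'json'

# Wrap the file's text and its metadata in a single JSON document.
# Field names mirror the outline above; the values are placeholders.
raw_text = "txt"

document = {
  "size"    => raw_text.bytesize,
  "Name"    => "bla",
  "Date"    => "1/1/2012",
  "Raw_txt" => raw_text
}

json = JSON.generate(document)
puts json

# Reading the file back means parsing the JSON and pulling out the text.
restored = JSON.parse(json)
puts restored["Raw_txt"]
```

Stored with a JSON content type, each top-level field should then be indexed under its own name, which is what a combined query across text and metadata needs.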


@siculars 
http://siculars.posterous.com

Sent from my iRotaryPhone




Re: Is it possible/wise to modify riak search index objects?

2012-07-20 Thread Metin Akat
I was thinking about this too, but as I said, these text files are
sometimes quite big: sometimes megabytes, and rarely tens of megabytes. They
are all "write once, read quite a lot". So having them as JSON is probably
going to put quite a lot of load onto Riak and my application (deserializing
a big chunk of JSON on every read). Of course, I might be wrong, and I'll
probably have to benchmark it, but I don't really feel very comfortable about
it. Besides potentially being a performance issue, it also feels quite
ugly to me. Have you done this? How big were the files? How's the performance?
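
A quick way to get a first number for that concern is to time the JSON round trip on a payload of the size in question. The sketch below uses only Ruby's standard library and a synthetic string of roughly 11 MB; it measures serialization and parsing in the application only, not anything Riak does, and the field names are the ones from the earlier suggestion.

```ruby
require 'json'
require 'benchmark'

# Synthetic stand-in for a large text file (27 bytes x 400,000 lines).
raw_text = "lorem ipsum dolor sit amet\n" * 400_000

document = { "size" => raw_text.bytesize, "Date" => "1/1/2012", "Raw_txt" => raw_text }

json = nil
restored = nil

# Time serialization and parsing separately.
encode_seconds = Benchmark.realtime { json = JSON.generate(document) }
decode_seconds = Benchmark.realtime { restored = JSON.parse(json) }

puts format("payload: %.1f MB", json.bytesize / 1_000_000.0)
puts format("encode:  %.3f s", encode_seconds)
puts format("decode:  %.3f s", decode_seconds)
```

Running it with real files instead of the synthetic string would give a more representative figure, since escaping cost depends on the content.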
