You're expected to Base64-encode it. UUID is simply the kind of value the field expects, like a date or an integer.

Eric Redmond, Engineer @ Basho


On Sun, Aug 10, 2014 at 5:03 PM, David James <davidcja...@gmail.com> wrote:

Thanks for the quick responses.

Eric: I don't understand. Why does Solr have the UUIDField (http://lucene.apache.org/solr/4_7_0/solr-core/org/apache/solr/schema/UUIDField.html) if it is not indexable? What is the nature of the limitation?

Jason: Thanks, I will consider Base 64 encoding.


On Sun, Aug 10, 2014 at 7:19 PM, Jason Campbell <xia...@xiaclo.net> wrote:
I like UUIDs for everything as well, although I expected compatibility issues with something. Base 64 encoding the binary value is a nice compromise for me, and takes 22 characters (if you drop the padding) instead of the usual 36 for the hyphenated hex format.

It would still require re-encoding all the keys, but it's a partial solution.
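
The 22-character encoding described above can be sketched in Python as follows. This is only an illustration, not anything Riak or Yokozuna provides: the helper names `uuid_to_key` and `key_to_uuid` are hypothetical, and the URL-safe Base64 alphabet is an assumption (the thread only says "Base 64"), chosen here so the key contains no `/` or `+`.

```python
import base64
import uuid


def uuid_to_key(u: uuid.UUID) -> str:
    """Encode the 16 raw bytes of a UUID as 22 URL-safe Base64 characters.

    Hypothetical helper: 16 bytes encode to 24 Base64 characters, of which
    the last two are '=' padding, so stripping the padding leaves 22.
    """
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def key_to_uuid(key: str) -> uuid.UUID:
    """Restore the stripped padding and decode the key back into a UUID."""
    padded = key + "=" * (-len(key) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))


u = uuid.uuid4()
key = uuid_to_key(u)

# The key is plain ASCII (and therefore valid UTF-8) and round-trips.
print(len(str(u)), len(key), key_to_uuid(key) == u)
```

The encoded key costs 22 bytes instead of 16, which sits between the raw binary form and the 36-character hyphenated form.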

From: Eric Redmond
Sent: Monday, 11 August 2014 9:15 AM
To: David James
Cc: riak-users
Subject: Re: Using UUID as keys is problematic for Riak Search

You're correct that yokozuna only supports utf8, because the Solr interface only supports utf8 (note that the failure happens when attempting to build a non-utf8 JSON add document command). There's not much we can do here at the moment, since we don't yet (and may never) support a custom interface to Solr that accepts arbitrary binary values. In the meantime, to use yokozuna, you'll have to encode your keys to utf8.

Eric Redmond, Engineer @ Basho


On Sun, Aug 10, 2014 at 4:01 PM, David James <davidcja...@gmail.com> wrote:

I'm using UUIDs for keys in Riak -- converted to bytes, not UTF-8 strings. (I'd rather spend 16 bytes for each key, not 36.)

As I understand it, Yokozuna maps the Riak key to _yz_id.

Here is the suggested schema from the documentation:

<!-- schema.xml -->
<field name="_yz_id" type="_yz_str" indexed="true" stored="true" multiValued="false" required="true"/>
<fieldType name="_yz_str" class="solr.StrField" sortMissingLast="true"/>

Would you expect this to work with Riak Search? I would hope so.

(Or must keys be UTF-8 strings?)

I get this error, which does not surprise me, given that the _yz_id is defined as a string:

==> log/error.log <==

2014-08-10 18:24:16.221 [error] <0.610.0>@yz_kv:index:206 failed to index object {<<"test-0001">>,<<94,143,33,35,45,180,78,164,151,237,72,81,56,13,28,250>>} with error {ucs,{bad_utf8_character_code}} because [{xmerl_ucs,from_utf8,1,[{file,"xmerl_ucs.erl"},{line,185}]},{mochijson2,json_encode_string,2,[{file,"src/mochijson2.erl"},{line,186}]},{mochijson2,'-json_encode_proplist/2-fun-0-',3,[{file,"src/mochijson2.erl"},{line,167}]},{lists,foldl,3,[{file,"lists.erl"},{line,1248}]},{mochijson2,json_encode_proplist,2,[{file,"src/mochijson2.erl"},{line,170}]},{mochijson2,'-json_encode_proplist/2-fun-0-',3,[{file,"src/mochijson2.erl"},{line,167}]},{lists,foldl,3,[{file,"lists.erl"},{line,1248}]},{mochijson2,json_encode_proplist,2,[{file,"src/mochijson2.erl"},{line,170}]}]
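
The key bytes in the log entry above can be checked outside Riak. The following Python sketch is an assumption-laden illustration (it uses Python's own UTF-8 decoder, not the `xmerl_ucs` code path from the stack trace, and `is_valid_utf8` is a hypothetical helper), but it shows why the key is rejected: the second byte, 143 (0x8F), is a UTF-8 continuation byte with no lead byte before it.

```python
import base64

# The 16 key bytes copied from the failed-index log entry.
key_bytes = bytes([94, 143, 33, 35, 45, 180, 78, 164,
                   151, 237, 72, 81, 56, 13, 28, 250])


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The raw binary key is not valid UTF-8.
raw_ok = is_valid_utf8(key_bytes)

# A Base64-encoded form of the same bytes is plain ASCII, hence valid UTF-8.
encoded = base64.urlsafe_b64encode(key_bytes).rstrip(b"=")
encoded_ok = is_valid_utf8(encoded)

print(raw_ok, encoded_ok)
```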

I don't think changing the schema.xml type for _yz_id to "solr.UUIDField" is a good idea.

What can I do?

Thanks,
David






_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

