Yokozuna and Spatial Search

2013-09-07 Thread Vincenzo Vitale
Hi,

I'm trying to get spatial search working with Yokozuna in my application
(develop branch, commit 601560bf9ea0859e598957c13733fbbb0e656e17, dated
6 September).

The JSON object looks like this:

{"where":{"latitude":7430019,"longitude":4210023,"geolocation_p":"7.430019,4.210023"},"timestamp":"2013-09-08T01:10:07.752Z"}

The field is named geolocation_p because a dynamic field for *_p is already
defined.

But the query:
http://127.0.0.1:8093/solr/my-index/select?q=*:*&fq={!geofilt}&spatial=true&pt=7.430019%2C4.210023&sfield=where_geolocation_p&d=1

returns the error:
can not use FieldCache on multivalued field:
where_geolocation_p_0_coordinate
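
For reference, the query above can be assembled programmatically so that the braces and the comma are escaped consistently. This is only a minimal sketch in Python: the helper name geofilt_url is made up for illustration, and the host, port, index and field names are simply the ones from the query above.

```python
from urllib.parse import urlencode


def geofilt_url(base, index, sfield, lat, lon, d):
    """Build a Solr select URL that filters on distance from a point."""
    params = [
        ("q", "*:*"),
        ("fq", "{!geofilt}"),    # spatial filter applied to the match-all query
        ("spatial", "true"),     # kept only because the original query sends it
        ("pt", f"{lat},{lon}"),  # centre point as "lat,lon"
        ("sfield", sfield),      # flattened field name, as in the original query
        ("d", str(d)),           # search radius, passed through unchanged
    ]
    # urlencode percent-escapes the braces, the comma and the colon
    return f"{base}/solr/{index}/select?{urlencode(params)}"


url = geofilt_url("http://127.0.0.1:8093", "my-index",
                  "where_geolocation_p", 7.430019, 4.210023, 1)
print(url)
```

The sketch only builds the URL; it does not change how the field is indexed, so on its own it would still hit the same FieldCache error.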


Looking at this:
http://stackoverflow.com/questions/7068605/solr-spatial-search-can-not-use-fieldcache-on-multivalued-field

it seems the problem is a missing parameter and a missing dynamic field
declaration for *_coordinate in the configuration file.

Is this the cause of the problem?
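
One way to test this hypothesis is to inspect the schema for the sub-field declaration that a LatLonType field relies on. The sketch below is in Python and makes several assumptions: the function name latlon_subfield_problems is hypothetical, SAMPLE_SCHEMA is an illustrative fragment modelled on Solr's stock example schema (it is not Yokozuna's actual default schema), and only the subFieldSuffix style of declaration is handled.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; NOT the real Yokozuna default schema.
SAMPLE_SCHEMA = """
<schema name="demo" version="1.5">
  <fields>
    <dynamicField name="*_p" type="location" indexed="true" stored="true"/>
    <dynamicField name="*_coordinate" type="tdouble" indexed="true"
                  stored="false" multiValued="false"/>
  </fields>
  <types>
    <fieldType name="location" class="solr.LatLonType"
               subFieldSuffix="_coordinate"/>
  </types>
</schema>
"""


def latlon_subfield_problems(schema_xml):
    """List problems with the dynamic fields backing LatLonType sub-fields."""
    root = ET.fromstring(schema_xml)
    dynamic = {f.get("name"): f for f in root.iter("dynamicField")}
    problems = []
    for ftype in root.iter("fieldType"):
        if ftype.get("class") != "solr.LatLonType":
            continue
        suffix = ftype.get("subFieldSuffix")
        if suffix is None:
            continue  # other declaration styles are out of scope for this sketch
        pattern = "*" + suffix
        field = dynamic.get(pattern)
        if field is None:
            problems.append(f"missing dynamicField {pattern}")
        elif field.get("multiValued") == "true":
            problems.append(f"dynamicField {pattern} is multiValued")
    return problems


print(latlon_subfield_problems(SAMPLE_SCHEMA))
```

If the check reports a missing or multivalued *_coordinate field, that would be consistent with the error above, since the sub-fields could then fall through to some other, multivalued declaration. That last step is a guess and not something the check itself proves.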

The _yz_default.xml files in the data directory seem to be overwritten every
time Riak is restarted. Is there a way to customize the Solr configuration
per bucket?


Thanks in advance,
Vincenzo.


--
If your e-mail inbox is out of control, check out http://sanebox.com/t/mmzve.
I love it.
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Yokozuna and Spatial Search

2013-09-08 Thread Vincenzo Vitale
I got it working with this change to the default conf:
https://github.com/basho/yokozuna/pull/169

Before doing this, I first tried creating my own schema, but the PUT request
hung.


V.




Should content-type not be required?

2013-09-10 Thread Vincenzo Vitale
Suppose I want to store just keys in a bucket, without any body. This makes
sense in scenarios where the key completely identifies the entity. Is it
possible to use the Riak HTTP API without including the Content-Type header?

Looking at the HTTP specification, Content-Type is not mandatory or even
suggested when the body is empty:

http://www.w3.org/Protocols/rfc2616/rfc2616-sec7.html#sec7.2.1

and it's probably a good argument to say that if the entity body is
empty then the content type doesn't really make sense.

Personally I think that the mere existence of an HTTP entity, with or
without a body, is sufficient to justify a "type"; it's unfortunate that no
higher-level type (an entity type, maybe?), decoupled from what is defined
as content, exists in the spec.

Frameworks like spray are quite strict about this and do not set any
content type when the content is empty:

https://github.com/spray/spray/blob/master/spray-http/src/main/scala/spray/http/HttpEntity.scala#L74

How should one deal with this situation? Adding fake content because of a
data store constraint doesn't seem right.



Thanks,

Vincenzo.



Re: Should content-type not be required?

2013-09-10 Thread Vincenzo Vitale
Hi Sam,
thanks for the fast reply.

In my company we normally use your third suggestion and indeed it's a
robust strategy.

At the moment we are relying on a Riak client based on spray (
https://github.com/agemooij/riak-scala-client), so even if we set the
Content-Type, it is dropped from the dispatched request when the body is
empty.

Our workaround is to send a one-byte body containing a carriage return.
It's not a big deal...
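
To make the workaround concrete, here is a minimal sketch in Python rather than in our actual Scala client. The helper name key_only_put and the bucket name keys-only are hypothetical; the /buckets/<bucket>/keys/<key> path and port 8098 are assumed to match a default Riak HTTP setup. The request is only built, not sent.

```python
from urllib.parse import quote
from urllib.request import Request


def key_only_put(base, bucket, key,
                 content_type="application/octet-stream", body=b"\r"):
    """Build a PUT whose body is a single carriage return byte."""
    # Assumed URL layout for storing an object over Riak's HTTP interface.
    url = f"{base}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"
    # A one-byte body keeps clients that drop Content-Type on empty
    # entities from stripping the header.
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": content_type})


req = key_only_put("http://127.0.0.1:8098", "keys-only", "abc")
print(req.get_method(), req.full_url, req.data)
```

Passing a vendor-specific type as content_type instead of application/octet-stream would follow the third suggestion quoted below.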

I wrote to the mailing list mainly because I was curious to know your
thoughts about this :)


Have a good day,
V.


On Wed, Sep 11, 2013 at 12:03 AM, Sam Elliott  wrote:

> The content-type is important for Riak KV: Various clients will use it to
> identify the difference between a response where the whole body is the key,
> or a response that contains siblings (if you have allow_mult=true).
>
> I suggest finding or creating a content type, so that this is clearer to
> your app. It will also allow you to version your objects better. Here are a
> few suggestions (of course replace the text between the < >):
> - application/octet-stream - this is usually used for binary data, and is
> the easiest thing to set the content-type to.
> - application/vnd.<company>.<type> - a
> vendor-specific type, which you can create yourself
> - application/vnd.<company>.<type>.v<version number> - another
> vendor-specific type, which supports versioning.
>
> You don't have to use the information in your final app, but it is used by
> riak and riak clients, so that's why we require it.
>
> --
> Sam Elliott
> Engineer
> sam.elli...@basho.com
> --
>
>


Re: Should content-type not be required?

2013-09-11 Thread Vincenzo Vitale
I'm sorry I cannot disclose all the details but essentially for this bucket
it's enough to check if a specific item is already present or not.
All the information identifying the item is part of the key so there is no
need to have anything in the body.

A generic example could be a discount voucher system where the voucher id
is unique. As soon as a voucher has been used, a new item is created in
the bucket *used-vouchers*; every time the system needs to validate a
voucher, it's enough to check whether that id already exists in the bucket.
Who actually used the voucher can be stored in another bucket (named orders,
maybe?) and linked to the voucher through a secondary index, at the cost of
slower performance compared to a key lookup.
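
The existence check described above could look roughly like this. It is a sketch in Python under stated assumptions: the function names are hypothetical, the URL layout is assumed to be Riak's /buckets/<bucket>/keys/<key> path, and the status handling assumes a HEAD or GET on the key answers 200 (or 300 when siblings exist) for a stored key and 404 for a missing one.

```python
from urllib.parse import quote

USED_BUCKET = "used-vouchers"  # bucket name taken from the example above


def used_voucher_url(base, voucher_id):
    """URL of the marker object for a voucher id (assumed URL layout)."""
    return f"{base}/buckets/{USED_BUCKET}/keys/{quote(voucher_id, safe='')}"


def voucher_already_used(status):
    """Interpret the HTTP status of a HEAD/GET on the marker object."""
    if status in (200, 300):  # found, possibly with siblings
        return True
    if status == 404:         # no marker stored: voucher not used yet
        return False
    # Anything else (timeouts, 5xx) is not an answer; let the caller decide.
    raise RuntimeError(f"unexpected status {status}")


print(used_voucher_url("http://127.0.0.1:8098", "SUMMER 2013"))
print(voucher_already_used(404))
```

Unexpected statuses raise instead of being mapped to "not used", so that a failed lookup is never mistaken for a valid voucher.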


Does it make sense?


Vincenzo




On Wed, Sep 11, 2013 at 4:22 AM, Sam Elliott  wrote:

> I'm interested to know why you're trying to store an empty body in Riak.
> Surely just don't even make the request to Riak? I guess I could be
> overlooking something obvious.
>
> Sam
> --
> Sam Elliott
> Engineer
> sam.elli...@basho.com
> --
>
>


Re: Should content-type not be required?

2013-09-12 Thread Vincenzo Vitale
Yep, if money is involved I guess eventual consistency can be painful
unless other checks are in place.

In our real use case this should not be a big issue.


On Wed, Sep 11, 2013 at 6:46 PM, Evan Vigil-McClanahan <
emcclana...@basho.com> wrote:

> It does make sense, but it isn't an ideal use-case for riak.  Eventual
> consistency means that existence checking under partition is always
> going to be a bit fraught.
>


Re: 2i at large scale?

2013-09-25 Thread Vincenzo Vitale
Basho people will know whether this is normal, but keep in mind that with
this setup you are storing all three copies of the data on the same machine,
which also hosts all 64 vnodes of the default ring.
I think Riak is designed for a cluster setup.
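
To illustrate the point with a toy model in Python: the sketch assumes partitions are handed out to nodes round-robin and that each key's three replicas land on three consecutive partitions. This is a simplification for illustration, not Riak's actual claim algorithm, and the function name is made up.

```python
def min_distinct_nodes(ring_size, node_count, n_val=3):
    """Worst case: how many distinct machines hold the n_val replicas."""
    # Toy ownership model: partition p belongs to node p mod node_count.
    owner = [p % node_count for p in range(ring_size)]
    worst = n_val
    for p in range(ring_size):
        # Replicas are placed on n_val consecutive partitions of the ring.
        preflist = {owner[(p + i) % ring_size] for i in range(n_val)}
        worst = min(worst, len(preflist))
    return worst


for nodes in (1, 4, 5):
    print(nodes, "node(s):", min_distinct_nodes(64, nodes), "distinct machine(s)")
```

With a single node every replica set collapses onto one machine, so the benchmark pays the cost of three writes without gaining any redundancy.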

Are you planning to run the same benchmark with more nodes?
On 25 Sep 2013 17:53, "Wagner Camarao"  wrote:

> Hi,
>
> I'm benchmarking 2i at scale of billion records, running one physical node
> locally with mostly default configs - except for LevelDB instead of
> Bitcask. Up to this point (14MM records in the bucket that's being indexed)
> it's still performing lookups well for my use case (read ~ 7ms using
> riak-ruby-client over http).
>
> However, along this process I've noticed riak to go down twice. First time
> (8MM records) I could just start it again and continue my benchmarking from
> the point it were left, but now at the second time (14MM records) when I
> started riak again, it took about 3 minutes to respond to my first request.
>
> What was happening during these long startup minutes, after my second
> crash?
>
> Up to which scale have you guys been successfully using secondary indexes?
>
> Any other ideas given my use case / benchmarking scenario?
>
> Thanks,
> Wagner
>