Riak mapreduce error

2017-02-06 Thread raghuveer sj
Hi Team,

I am trying to run MapReduce in Erlang.

curl -XPUT http://localhost:8098/buckets/training/keys/foo -H
'Content-Type: text/plain' -d 'caremad data goes here'
curl -XPUT http://localhost:8098/buckets/training/keys/bar -H
'Content-Type: text/plain' -d 'caremad caremad caremad caremad'
curl -XPUT http://localhost:8098/buckets/training/keys/baz -H
'Content-Type: text/plain' -d 'nothing to see here'
curl -XPUT http://localhost:8098/buckets/training/keys/bam -H
'Content-Type: text/plain' -d 'caremad caremad caremad'

*Running in the Erlang shell:*

ReFun = fun(O, _, Re) ->
            case re:run(riak_object:get_value(O), Re, [global]) of
                {match, Matches} -> [{riak_object:key(O), length(Matches)}];
                nomatch -> [{riak_object:key(O), 0}]
            end
        end.

code:which(riakc_pb_socket).
"./ebin/riakc_pb_socket.beam"

{ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087).
{ok,<0.36.0>}

riakc_pb_socket:ping(Pid).
pong

{ok, Re} = re:compile("caremad").
{ok,{re_pattern,0,0,0,
<<69,82,67,80,85,0,0,0,0,0,0,0,81,0,0,0,255,255,255,255,
  255,255,...>>}}

{ok, Riak} = riakc_pb_socket:start_link("127.0.0.1", 8087).
{ok,<0.42.0>}

riakc_pb_socket:mapred_bucket(Riak, <<"training">>, [{map, {qfun, ReFun},
Re, true}]).
* 1: variable 'ReFun' is unbound

I am trying to run the well-known Erlang MapReduce sample program and am stuck
at this error. Kindly help me out.

Regards,
Raghuveer
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Secondary indexes or Riak search ?

2017-02-06 Thread Russell Brown
It’s worth noting that secondary indexes (2i) have some other advantages over
Solr search. If you _can_ model your queries in 2i, then I'd recommend it.

Secondary indexes have a richer API than is currently documented. If you look
at https://docs.basho.com/riak/1.4.7/dev/using/2i/ you’ll see it documents a
feature that allows the index terms to be filtered via a regular expression.
There is also a feature that can return the actual Riak objects for a $key
index search. You can pack the index terms with data and return the terms in a
query, so that you don’t need a further object fetch (see return_terms in the docs).
Secondary indexes are written atomically with the object they index.
Operationally, they don’t require you to run a JVM and Solr alongside your Riak
nodes.
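
As a rough sketch using the Erlang client (the bucket and index names below
are invented for illustration, and this assumes a riak-erlang-client recent
enough to support the term_regex option):

```erlang
%% Regex-filtered 2i range query that returns the matching index terms
%% themselves, so no follow-up object fetch is needed.
{ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
{ok, Results} = riakc_pb_socket:get_index_range(
                  Pid,
                  <<"users">>,            %% hypothetical bucket
                  {binary_index, "name"}, %% the 2i field name_bin
                  <<"a">>, <<"z">>,       %% term range to scan
                  [{return_terms, true}, {term_regex, <<"^car">>}]).
```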

You have the tools with basho_bench to answer the question about performance
and overhead for your workload. I suspect that on “overhead” 2i wins, as there
is no JVM per node.

Modelling for 2i is perhaps harder: in the classical NoSQL way, you have to do
more work upfront when designing your queries.

I hope that helps a little. I worked quite a lot on 2i and never really
understood why riak-search was seen as a replacement; imo they’re complementary,
and you pick the one that best fits.

Cheers

Russell

On 2 Feb 2017, at 09:43, Alex Feng  wrote:

> Hello Riak-users,
> 
> I am currently using Riak search for some queries; since my queries are
> very simple, they could be fulfilled by secondary indexes as well.
> So my question is: which one has better performance and less overhead,
> assuming both can fulfill the query requirement?
> 
> Many thanks in advance.
> 
> Br,
> Alex




Re: Secondary indexes or Riak search ?

2017-02-06 Thread Alex Feng
Hi Russell,

It is really helpful, thank you a lot.
We are suffering from Solr crashes now and are considering switching to 2i.

Br,
Alex

2017-02-06 16:53 GMT+08:00 Russell Brown :

> [Russell’s reply above quoted in full; trimmed]


Re: Periodically solr down issue.

2017-02-06 Thread Magnus Kessler
Hi Alex,

org.apache.solr.client.solrj.SolrServerException: IOException occured when
>> talking to server at: http://nosql-2.dsdb:8093/internal_solr/production_scheduling
>>
>>
Please check that there is no other process, including an instance of Solr,
running on your machine that may own the TCP port 8093 when you restart
your Riak node. Also make sure that each Solr node can reach other Solr
nodes on this port. Have you got any firewall that blocks access to port
8093?
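
A quick way to check from the shell (a sketch; `ss` comes with iproute2 on
most Linux distributions, and `lsof -iTCP:8093 -sTCP:LISTEN` works as an
alternative):

```shell
# Does anything already listen on the internal Solr port 8093 on this host?
ss -ltn 2>/dev/null | grep ':8093' || echo "port 8093 is free"
```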

Kind Regards,

Magnus


-- 
Magnus Kessler
Client Services Engineer
Basho Technologies Limited

Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431


Re: Riak mapreduce error

2017-02-06 Thread Magnus Kessler
On 3 February 2017 at 18:31, raghuveer sj  wrote:

> [original message quoted in full; trimmed]
>
Hi Raghuveer,

I have run the steps you provided, and found that they work fine for me.
Can you let me know which version of Riak you are running this against, and
which version of Erlang is used on the client side? Has the
riak-erlang-client been compiled with the same Erlang version?

Kind Regards,

Magnus

-- 
Magnus Kessler
Client Services Engineer
Basho Technologies Limited

Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431


[Basho Riak] Fail To Update Document Repeatedly With Cluster of 5 Nodes

2017-02-06 Thread my hue
Dear Riak Team,

My team and I use Riak as the database for our production system, with a
cluster of 5 nodes.
While running in production, we hit a critical bug: updating a document
sometimes fails.
My colleagues and I debugged it and identified an issue with the following
scenario:

+  fetch document
+  change value of document
+  update document

Repeat this about 10 times and an update will fail. When a document is updated
continually, updates sometimes fail.

At first, the 5 nodes of the cluster ran Riak version 2.1.1.
After hitting the bug above, we upgraded to Riak version 2.2.0, and the issue
still occurs.

After many test runs, debugging with tcpdump on a Riak node:

*tcpdump -A -ttt -i {interface} src host {host} and dst port {port}*

together with the command:

*riak-admin status | grep "node_puts_map\|node_puts_map_total\|node_puts_total\|vnode_map_update_total\|vnode_puts_total"*

we confirmed that the Riak server does receive the update request.
However, we do not know why the Riak backend fails to update the document.
At the time of failure, the Riak server logs show nothing wrong.

We then removed the cluster and used a single Riak server, and saw that the
bug above never happens.

For that reason, we think it only happens when running as a cluster. We
researched the Basho Riak documentation, and our Riak configuration seems to
follow its suggestions. We are totally blocked on this issue and hope we can
get support from you, so that we can get stable behaviour from Riak for our
production system.
Thank you so much. We hope to hear from you soon.


* The following is our Riak node information:

Riak version: 2.2.0
OS: CentOS Linux release 7.2.1511
CPU: 4 cores
Memory: 4 GB
Riak configuration: the attached file "riak.conf"

*Note:*

- We mostly use the default Riak configuration, except that the storage
backend is multi:

storage_backend = multi
multi_backend.bitcask_mult.storage_backend = bitcask
multi_backend.bitcask_mult.bitcask.data_root = /var/lib/riak/bitcask_mult
multi_backend.default = bitcask_mult


-

- Bucket type created with the following command:

riak-admin bucket-type create dev_restor '{"props":{"backend":"bitcask_mult","datatype":"map"}}'
riak-admin bucket-type activate dev_restor


-

- Bucket Type Status :

>> riak-admin bucket-type status dev_restor

dev_restor is active
young_vclock: 20
w: quorum
small_vclock: 50
rw: quorum
r: quorum
pw: 0
precommit: []
pr: 0
postcommit: []
old_vclock: 86400
notfound_ok: true
n_val: 3
linkfun: {modfun,riak_kv_wm_link_walker,mapreduce_linkfun}
last_write_wins: false
dw: quorum
dvv_enabled: true
chash_keyfun: {riak_core_util,chash_std_keyfun}
big_vclock: 50
basic_quorum: false
backend: <<"bitcask_mult">>
allow_mult: true
datatype: map
active: true
claimant: 'riak-node1@64.137.190.244'


-

- Bucket Property :

{"props":{"name":"menu","active":true,"allow_mult":true,"backend":"bitcask_mult","basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"claimant":"riak-node1@64.137.190.244","datatype":"map","dvv_enabled":true,"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":3,"name":"menu","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"quorum","rw":"quorum","search_index":"menu_idx","small_vclock":50,"w":"quorum","young_vclock":20}}



-

- Member status :

>> riak-admin member-status

================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid  18.8%  --  'riak-node1@64.137.190.244'
valid  18.8%  --  'riak-node2@64.137.247.82'
valid  18.8%  --  'riak-node3@64.137.162.64'
valid  25.0%  --  'riak-node4@64.137.161.229'
valid  18.8%  --  'riak-node5@64.137.217.73'

-------------------------------------------------------------------------------
Valid:5 / Leaving:0 / Exiting:0 / Joining:0 / Down:0



-

- Ring

>> riak-admin status | grep ring

ring_creation_size : 64
ring_members : ['riak-node1@64.137.190.244','riak-node2@64.137.247.82','riak-node3@64.137.162.64','riak-node4@64.137.161.229','riak-node5@64.137.217.73']
ring_num_partitions : 64
ring_ownership : <<"[{'riak-node2@64.137.247.82',12},\n {'

Re: [Basho Riak] Fail To Update Document Repeatedly With Cluster of 5 Nodes

2017-02-06 Thread John Daily
Originally I suspected the context which allows Riak to resolve conflicts was 
not present in your data, but I see it in your map structure. Thanks for 
supplying such a detailed description.

How fast is your turnaround time between an update and a fetch? Even if the 
cluster is healthy it’s not impossible to see a timeout between nodes, which 
could result in a stale retrieval. Have you verified whether the stale data 
persists?

A single node cluster gives an advantage that you’ll never see in a real 
cluster: a perfectly synchronized clock. It also reduces (but does not 
completely eliminate) the possibility of an internal timeout between processes.

-John

> On Feb 6, 2017, at 1:02 PM, my hue  wrote:
> 
> Dear Riak Team,
> 
> I and my team used riak as database for my production with an cluster 
> including 5 nodes. 
> While production run, we meet an critical bug that is sometimes fail to 
> update document. 
> I and my colleagues performed debug and detected an issue with the scenario 
> as follow: 
> 
> +  fetch document  
> +  change value of document 
> +  update document
> 
> Repeat about 10 times and will meet fail. With the document is updated 
> continually, 
> sometimes will face update fail.
> 
> The first time,  5 nodes of cluster we used riak version 2.1.1.  
> After meet above bug, we upgraded to use riak version 2.2.0 and this issue 
> still occurs.
> 
> Via many time test,  debug using  Tcpdump at riak node :
> 
> tcpdump -A -ttt  -i {interface} src host {host} and dst port {port} 
> 
> And together with the command: 
> 
> riak-admin status | grep "node_puts_map\| node_puts_map_total\| 
> node_puts_total\| vnode_map_update_total\| vnode_puts_total\"
> 
> we  got that the riak server already get the update request. 
> However, do not know why riak backend fail to update document.  
> At the fail time,  from riak server log everything is ok. 
> 
> Then we removed cluster and use a single riak server,  and see that above bug 
> never happen.
>  
> For that reason, think that is only happen with cluster work. We took 
> research on basho riak document and our riak configure 
> seems that like suggestion from document.  We totally blocked on this issue 
> and hope that can get support from you  
> so that can obtain a stable work from riak database for our production. 
> Thank you so much.  Hope that can get your reply soon.
> 
> 
> * The following is our riak node information : 
> 
> Riak version:  2.2.0
> OS :  CentOS Linux release 7.2.1511
> Cpu :  4 core
> Memory : 4G  
> Riak configure : the attached file "riak.conf"
> 
> Note : 
> 
> - We mostly using default configure of riak configure except that  we used 
> storage backend is multi  
> 
> storage_backend = multi
> multi_backend.bitcask_mult.storage_backend = bitcask
> multi_backend.bitcask_mult.bitcask.data_root = /var/lib/riak/bitcask_mult
> multi_backend.default = bitcask_mult
> 
> -
> 
> - Bucket type created with the following command:
> 
> riak-admin bucket-type create dev_restor 
> '{"props":{"backend":"bitcask_mult","datatype":"map"}}'
> riak-admin bucket-type activate dev_restor
> 
> -
> 
> - Bucket Type Status :
> 
> >> riak-admin bucket-type status dev_restor
> 
> dev_restor is active
> young_vclock: 20
> w: quorum
> small_vclock: 50
> rw: quorum
> r: quorum
> pw: 0
> precommit: []
> pr: 0
> postcommit: []
> old_vclock: 86400
> notfound_ok: true
> n_val: 3
> linkfun: {modfun,riak_kv_wm_link_walker,mapreduce_linkfun}
> last_write_wins: false
> dw: quorum
> dvv_enabled: true
> chash_keyfun: {riak_core_util,chash_std_keyfun}
> big_vclock: 50
> basic_quorum: false
> backend: <<"bitcask_mult">>
> allow_mult: true
> datatype: map
> active: true
> claimant: 'riak-node1@64.137.190.244 '
> 
> -
> 
> - Bucket Property :
> 
> {"props":{"name":"menu","active":true,"allow_mult":true,"backend":"bitcask_mult","basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"claimant":"riak-node1@64.137.190.244
>  
> ","datatype":"map","dvv_enabled":true,"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":3,"name":"menu","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"quorum","rw":"quorum","search_index":"menu_idx","small_vclock":50,"w":"quorum","young_vclock":20}}
> 
> 
> -
> 
> - Member status :
> 
> >> riak-admin member-status
> 
> 

Re: [Basho Riak] Fail To Update Document Repeatedly With Cluster of 5 Nodes

2017-02-06 Thread Russell Brown
What operation are you performing? It looks like the map is a single-level map
of last-write-wins registers. Are you updating a value? Is there a chance that
the clock on the node handling the update is behind the value in the
lww-register?

Have you tried using the `modify_type` operation in riakc_pb_socket, which does
the fetch/update sequence for you?
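
A minimal sketch (the bucket type, bucket, key, and register field below are
hypothetical, and this assumes your map holds registers):

```erlang
%% modify_type fetches the map, applies the update fun, and writes it
%% back with the fetched causal context in one round trip.
{ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
ok = riakc_pb_socket:modify_type(
       Pid,
       fun(Map) ->
               riakc_map:update({<<"status">>, register},
                                fun(R) -> riakc_register:set(<<"new_value">>, R) end,
                                Map)
       end,
       {<<"dev_restor">>, <<"menu">>}, <<"some_key">>,
       [create]).
```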

Anything in the error logs on any of the nodes?

Is the opaque context identical from the fetch and then later after the update?

Thanks

Russell

On 6 Feb 2017, at 19:11, John Daily  wrote:

> Originally I suspected the context which allows Riak to resolve conflicts was 
> not present in your data, but I see it in your map structure. Thanks for 
> supplying such a detailed description.
> 
> How fast is your turnaround time between an update and a fetch? Even if the 
> cluster is healthy it’s not impossible to see a timeout between nodes, which 
> could result in a stale retrieval. Have you verified whether the stale data 
> persists?
> 
> A single node cluster gives an advantage that you’ll never see in a real 
> cluster: a perfectly synchronized clock. It also reduces (but does not 
> completely eliminate) the possibility of an internal timeout between 
> processes.
> 
> -John
> 
>> On Feb 6, 2017, at 1:02 PM, my hue  wrote:
>> 
>> Dear Riak Team,
>> 
>> I and my team used riak as database for my production with an cluster 
>> including 5 nodes. 
>> While production run, we meet an critical bug that is sometimes fail to 
>> update document. 
>> I and my colleagues performed debug and detected an issue with the scenario 
>> as follow: 
>> 
>> +  fetch document  
>> +  change value of document 
>> +  update document
>> 
>> Repeat about 10 times and will meet fail. With the document is updated 
>> continually, 
>> sometimes will face update fail.
>> 
>> The first time,  5 nodes of cluster we used riak version 2.1.1.  
>> After meet above bug, we upgraded to use riak version 2.2.0 and this issue 
>> still occurs.
>> 
>> Via many time test,  debug using  Tcpdump at riak node :
>> 
>> tcpdump -A -ttt  -i {interface} src host {host} and dst port {port} 
>> 
>> And together with the command: 
>> 
>> riak-admin status | grep "node_puts_map\| node_puts_map_total\| 
>> node_puts_total\| vnode_map_update_total\| vnode_puts_total\"
>> 
>> we  got that the riak server already get the update request. 
>> However, do not know why riak backend fail to update document.  
>> At the fail time,  from riak server log everything is ok. 
>> 
>> Then we removed cluster and use a single riak server,  and see that above 
>> bug never happen.
>>  
>> For that reason, think that is only happen with cluster work. We took 
>> research on basho riak document and our riak configure 
>> seems that like suggestion from document.  We totally blocked on this issue 
>> and hope that can get support from you  
>> so that can obtain a stable work from riak database for our production. 
>> Thank you so much.  Hope that can get your reply soon.
>> 
>> 
>> * The following is our riak node information : 
>> 
>> Riak version:  2.2.0
>> OS :  CentOS Linux release 7.2.1511
>> Cpu :  4 core
>> Memory : 4G  
>> Riak configure : the attached file "riak.conf"
>> 
>> Note : 
>> 
>> - We mostly using default configure of riak configure except that  we used 
>> storage backend is multi  
>> 
>> storage_backend = multi
>> multi_backend.bitcask_mult.storage_backend = bitcask
>> multi_backend.bitcask_mult.bitcask.data_root = /var/lib/riak/bitcask_mult
>> multi_backend.default = bitcask_mult
>> 
>> -
>> 
>> - Bucket type created with the following command:
>> 
>> riak-admin bucket-type create dev_restor 
>> '{"props":{"backend":"bitcask_mult","datatype":"map"}}'
>> riak-admin bucket-type activate dev_restor
>> 
>> -
>> 
>> - Bucket Type Status :
>> 
>> >> riak-admin bucket-type status dev_restor
>> 
>> dev_restor is active
>> young_vclock: 20
>> w: quorum
>> small_vclock: 50
>> rw: quorum
>> r: quorum
>> pw: 0
>> precommit: []
>> pr: 0
>> postcommit: []
>> old_vclock: 86400
>> notfound_ok: true
>> n_val: 3
>> linkfun: {modfun,riak_kv_wm_link_walker,mapreduce_linkfun}
>> last_write_wins: false
>> dw: quorum
>> dvv_enabled: true
>> chash_keyfun: {riak_core_util,chash_std_keyfun}
>> big_vclock: 50
>> basic_quorum: false
>> backend: <<"bitcask_mult">>
>> allow_mult: true
>> datatype: map
>> active: true
>> claimant: 'riak-node1@64.137.190.244'
>> 
>> -
>> 
>> - Bucket Property :
>> 
>> {"props":{"name":"menu","active":true,"allow_mult":true,"backend":"bitcask_mult","basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_