Keep in mind that, apart from custom settings, buckets are just a prefix on keys, so you could create your own "buckets" by just adding the strings directly to your keys.
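As an illustration of that key-prefix idea, here is a minimal sketch assuming the Erlang riakc client; the module name, the single bucket name, and the "tweets/" prefix are all made up:

%% Sketch: embed the namespace in the key instead of creating a new
%% bucket with custom properties per logical grouping.
-module(ns_sketch).
-export([put_in_ns/4, get_in_ns/3]).

-define(BUCKET, <<"fanzo">>).   %% one bucket, default properties

put_in_ns(Pid, Namespace, Id, Value) ->
    Key = <<Namespace/binary, $/, Id/binary>>,   %% e.g. <<"tweets/354879">>
    riakc_pb_socket:put(Pid, riakc_obj:new(?BUCKET, Key, Value)).

get_in_ns(Pid, Namespace, Id) ->
    riakc_pb_socket:get(Pid, ?BUCKET, <<Namespace/binary, $/, Id/binary>>).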
Your current solution is effectively doing that, but keep that option in mind if you decide you need another 50k buckets with a slightly different set of default values: you could create one alternative bucket with the new settings and run your own namespaces inside that.

Thanks for the follow up.

-John

On Aug 5, 2013, at 11:33 AM, Paul Ingalls <p...@fanzo.me> wrote:

> We are creating a lot of buckets because I needed the extra namespace. My alternative was constantly updating a value, which in some cases would grow the value very large. I figured keeping the key/value groups small would be quite a bit faster, and since the docs said you could have as many buckets as you like, I thought I was good. I just didn't pay close enough attention to the "except when" clause… :)
>
> Thanks for the help everyone. I'll have some more questions soon. I ran into some challenges over the weekend that I want to learn more about…
>
> Paul
>
> Paul Ingalls
> Founder & CEO Fanzo
> p...@fanzo.me
> @paulingalls
> http://www.linkedin.com/in/paulingalls
>
> On Aug 2, 2013, at 5:08 PM, John Daily <jda...@basho.com> wrote:
>
>> 50k definitely is a fair few. What's the objective for that many, if I may ask? Always interesting to hear about different data models.
>>
>> Sent from my iPhone
>>
>> On Aug 2, 2013, at 7:56 PM, Paul Ingalls <p...@fanzo.me> wrote:
>>
>>> Not that many, I didn't let it run that long since the performance was so poor. Maybe 50k or so...
>>>
>>> Paul Ingalls
>>> Founder & CEO Fanzo
>>> p...@fanzo.me
>>> @paulingalls
>>> http://www.linkedin.com/in/paulingalls
>>>
>>> On Aug 2, 2013, at 3:54 PM, John Daily <jda...@basho.com> wrote:
>>>
>>>> Excellent news. How many buckets with custom settings did you create?
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On Aug 2, 2013, at 6:51 PM, Paul Ingalls <p...@fanzo.me> wrote:
>>>>
>>>>> For those interested, I identified my performance problem.
>>>>>
>>>>> I was creating a lot of buckets, and the properties did not match the default bucket properties of the node, so getting the bucket was taking 300-400 milliseconds instead of 3-4. Apparently creating buckets with non-default bucket properties is not just a bad idea, but a REALLY bad idea.
>>>>>
>>>>> I changed the default to be what I wanted for most of my buckets (allow_mult=true), tweaked the code to handle siblings where I wasn't before, and now I'm getting lots of operations per second.
>>>>>
>>>>> Thanks for the help, everyone!
>>>>>
>>>>> Paul
>>>>>
>>>>> Paul Ingalls
>>>>> Founder & CEO Fanzo
>>>>> p...@fanzo.me
>>>>> @paulingalls
>>>>> http://www.linkedin.com/in/paulingalls
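For context, the cluster-wide default Paul changed lives in app.config under the riak_core section, e.g. {default_bucket_props, [{allow_mult, true}]}. With allow_mult=true, a read can return siblings that the client must merge; a minimal resolution sketch with the Erlang riakc client follows. The merge shown here (sort and deduplicate the raw values) is only a placeholder; real merge logic is application-specific.

%% Sketch: fetch, merge any sibling values, and write back one value.
-module(sibling_sketch).
-export([resolve/3]).

resolve(Pid, Bucket, Key) ->
    {ok, Obj} = riakc_pb_socket:get(Pid, Bucket, Key),
    case riakc_obj:get_values(Obj) of
        [_OnlyValue] ->
            {ok, Obj};                                  %% no siblings
        Values ->
            Merged   = merge(Values),                   %% app-specific
            Updated0 = riakc_obj:update_value(Obj, Merged),
            %% keep one sibling's metadata so the put is unambiguous
            Updated  = riakc_obj:update_metadata(Updated0,
                         hd(riakc_obj:get_metadatas(Obj))),
            riakc_pb_socket:put(Pid, Updated, [return_body])
    end.

%% Placeholder merge: union of the raw sibling values.
merge(Values) ->
    term_to_binary(lists:usort(Values)).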
>>>>> On Aug 2, 2013, at 12:00 AM, Paul Ingalls <p...@fanzo.me> wrote:
>>>>>
>>>>>> One thing I am doing different than the benchmark is creating a lot of buckets…
>>>>>>
>>>>>> Paul Ingalls
>>>>>> Founder & CEO Fanzo
>>>>>> p...@fanzo.me
>>>>>> @paulingalls
>>>>>> http://www.linkedin.com/in/paulingalls
>>>>>>
>>>>>> On Aug 1, 2013, at 11:52 PM, Paul Ingalls <p...@fanzo.me> wrote:
>>>>>>
>>>>>>> The objects are either tweet JSONs, so between 1-2k, or simple strings with some links.
>>>>>>>
>>>>>>> I am using secondary indexes with timestamps so I can do range queries. Would this significantly impact performance? I am not indexing for search...
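With the LevelDB backend, secondary index entries are stored alongside the object, so each indexed put carries some extra write cost, though nothing on the order of the bucket-properties penalty described above. A rough sketch of the timestamp pattern with riakc; the "ts" index name and module are made up, and the exact return shape of the range query varies by client version:

%% Sketch: tag each object with an integer timestamp index, then
%% range-query it by time window.
-module(ts_sketch).
-export([put_with_ts/5, range/4]).

put_with_ts(Pid, Bucket, Key, Value, TsMillis) ->
    Obj = riakc_obj:new(Bucket, Key, Value),
    MD  = riakc_obj:set_secondary_index(
            riakc_obj:get_update_metadata(Obj),
            [{{integer_index, "ts"}, [TsMillis]}]),
    riakc_pb_socket:put(Pid, riakc_obj:update_metadata(Obj, MD)).

%% Keys whose "ts" index falls within [FromTs, ToTs].
range(Pid, Bucket, FromTs, ToTs) ->
    riakc_pb_socket:get_index_range(Pid, Bucket,
                                    {integer_index, "ts"}, FromTs, ToTs).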
>>>>>>> Here are some more bench results. Obviously using the Azure load balancer is a bad idea. Fortunately I'm not.
>>>>>>>
>>>>>>> This was run from my laptop using 8 concurrent connections and the pb client, using the load balancer in Azure. Lots of errors.
>>>>>>>
>>>>>>> <summary.png>
>>>>>>>
>>>>>>> This was run on a VM in the same cloud service using 8 concurrent workers and the pb client, but clustered directly (I input all the IP addresses to riakc_pb_ips). Much better ops, so it must be something with my usage that isn't getting captured by the benchmark.
>>>>>>>
>>>>>>> <summary.png>
>>>>>>>
>>>>>>> Paul Ingalls
>>>>>>> Founder & CEO Fanzo
>>>>>>> p...@fanzo.me
>>>>>>> @paulingalls
>>>>>>> http://www.linkedin.com/in/paulingalls
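For anyone reproducing runs like these, a basho_bench configuration along the following lines drives all nodes directly over protocol buffers rather than through a load balancer. The IPs are placeholders, and option names and generator settings should be checked against the basho_bench_driver_riakc_pb version in use:

%% Sketch of a basho_bench config (an Erlang terms file).
{mode, max}.
{duration, 10}.                        %% minutes
{concurrent, 8}.                       %% 8 workers, as in the runs above
{driver, basho_bench_driver_riakc_pb}.
{riakc_pb_ips, [{10,0,0,1}, {10,0,0,2}, {10,0,0,3}]}.  %% placeholder IPs
{riakc_pb_replies, 1}.
{key_generator, {int_to_bin_bigendian, {uniform_int, 100000}}}.
{value_generator, {fixed_bin, 2000}}.  %% ~2 KB values, like the tweet JSONs
{operations, [{get, 4}, {update, 1}]}.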
>>>>>>> On Aug 1, 2013, at 10:46 PM, Jeremy Ong <jer...@quarkgames.com> wrote:
>>>>>>>
>>>>>>>> What is the average size of one of your objects? Are you indexing?
>>>>>>>>
>>>>>>>> On Thu, Aug 1, 2013 at 10:23 PM, Paul Ingalls <p...@fanzo.me> wrote:
>>>>>>>> Running them on separate machines.
>>>>>>>>
>>>>>>>> I just ran basho_bench against the service from my laptop. Here is the result; in this case I am just hitting one node. Will try it again shortly using a load balancer to hit all the nodes.
>>>>>>>>
>>>>>>>> <summary.png>
>>>>>>>>
>>>>>>>> Paul Ingalls
>>>>>>>> Founder & CEO Fanzo
>>>>>>>> p...@fanzo.me
>>>>>>>> @paulingalls
>>>>>>>> http://www.linkedin.com/in/paulingalls
>>>>>>>>
>>>>>>>> On Aug 1, 2013, at 8:57 PM, Jared Morrow <ja...@basho.com> wrote:
>>>>>>>>
>>>>>>>>> Also, are you generating the load from the same VMs that Riak is running on, or do you have separate machines generating load?
>>>>>>>>>
>>>>>>>>> On Thursday, August 1, 2013, Jeremy Ong wrote:
>>>>>>>>> What Erlang version did you build with? How are you load balancing between the nodes? What kind of disks are you using?
>>>>>>>>>
>>>>>>>>> On Thu, Aug 1, 2013 at 7:53 PM, Paul Ingalls <p...@fanzo.me> wrote:
>>>>>>>>> > FYI, 2 more nodes died at the end of the last test. Storm, which I'm using to put data in, kills the topology a bit abruptly; perhaps the nodes don't like a client going away like that?
>>>>>>>>> >
>>>>>>>>> > Log from one of the nodes:
>>>>>>>>> >
>>>>>>>>> > 2013-08-02 02:27:23 =ERROR REPORT====
>>>>>>>>> > Error in process <0.4959.0> on node 'riak@riak004' with exit value:
>>>>>>>>> > {badarg,[{riak_core_stat,vnodeq_len,1,[{file,"src/riak_core_stat.erl"},{line,181}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[{file,"src/riak_core_stat.erl"},{line,172}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[...
>>>>>>>>> >
>>>>>>>>> > 2013-08-02 02:27:33 =ERROR REPORT====
>>>>>>>>> > Error in process <0.5055.0> on node 'riak@riak004' with exit value:
>>>>>>>>> > {badarg,[{riak_core_stat,vnodeq_len,1,[{file,"src/riak_core_stat.erl"},{line,181}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[{file,"src/riak_core_stat.erl"},{line,172}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[...
>>>>>>>>> >
>>>>>>>>> > 2013-08-02 02:27:51 =ERROR REPORT====
>>>>>>>>> > Error in process <0.5228.0> on node 'riak@riak004' with exit value:
>>>>>>>>> > {badarg,[{riak_core_stat,vnodeq_len,1,[{file,"src/riak_core_stat.erl"},{line,181}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[{file,"src/riak_core_stat.erl"},{line,172}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[...
>>>>>>>>> >
>>>>>>>>> > and the log from the other node:
>>>>>>>>> >
>>>>>>>>> > 2013-08-02 00:09:39 =ERROR REPORT====
>>>>>>>>> > Error in process <0.4952.0> on node 'riak@riak007' with exit value:
>>>>>>>>> > {badarg,[{riak_core_stat,vnodeq_len,1,[{file,"src/riak_core_stat.erl"},{line,181}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[{file,"src/riak_core_stat.erl"},{line,172}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[...
>>>>>>>>> >
>>>>>>>>> > 2013-08-02 00:09:44 =ERROR REPORT====
>>>>>>>>> > ** State machine <0.2368.0> terminating
>>>>>>>>> > ** Last event in was unregistered
>>>>>>>>> > ** When State == active
>>>>>>>>> > ** Data ==
>>>>>>>>> > {state,114179815416476790484662877555959610910619729920,riak_kv_vnode,{deleted,{state,114179815416476790484662877555959610910619729920,riak_kv_eleveldb_backend,{state,<<>>,"/mnt/datadrive/riak/data/leveldb/114179815416476790484662877555959610910619729920",[{create_if_missing,true},{max_open_files,128},{use_bloomfilter,true},{write_buffer_size,58858594}],[{add_paths,[]},{allow_strfun,false},{anti_entropy,{on,[]}},{anti_entropy_build_limit,{1,3600000}},{anti_entropy_concurrency,2},{anti_entropy_data_dir,"/mnt/datadrive/riak/data/anti_entropy"},{anti_entropy_expire,604800000},{anti_entropy_leveldb_opts,[{write_buffer_size,4194304},{max_open_files,20}]},{anti_entropy_tick,15000},{create_if_missing,true},{data_root,"/mnt/datadrive/riak/data/leveldb"},{fsm_limit,50000},{hook_js_vm_count,2},{http_url_encoding,on},{included_applications,[]},{js_max_vm_mem,8},{js_thread_stack,16},{legacy_stats,true},{listkeys_backpressure,true},{map_js_vm_count,8},{mapred_2i_pipe,true},{mapred_name,"mapred"},{max_open_files,128},{object_format,v1},{reduce_js_vm_count,6},{stats_urlpath,"stats"},{storage_backend,riak_kv_eleveldb_backend},{use_bloomfilter,true},{vnode_vclocks,true},{write_buffer_size,58858594}],[],[],[{fill_cache,false}],true,false},{dict,0,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},undefined,3000,1000,100,100,true,true,undefined}},riak@riak003,none,undefined,undefined,undefined,{pool,riak_kv_worker,10,[]},undefined,107615}
>>>>>>>>> > ** Reason for termination =
>>>>>>>>> > **
>>>>>>>>> > {badarg,[{eleveldb,close,[<<>>],[]},{riak_kv_eleveldb_backend,stop,1,[{file,"src/riak_kv_eleveldb_backend.erl"},{line,149}]},{riak_kv_vnode,terminate,2,[{file,"src/riak_kv_vnode.erl"},{line,836}]},{riak_core_vnode,terminate,3,[{file,"src/riak_core_vnode.erl"},{line,847}]},{gen_fsm,terminate,7,[{file,"gen_fsm.erl"},{line,586}]},{proc_lib,init_p_do_apply,3,[{file,"

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com