Kevin

We are using HTTP (we have tried PB without any performance gain), with
riak-java-client as the client library.

--
Jan Buchholdt
Software Pilot
Trifork A/S
Cell +45 50761121



On 2010-11-08 14:20, Kevin Smith wrote:
 Jan -

 Which protocol (HTTP or protocol buffers) and client lib are you using?

 --Kevin
 On Nov 8, 2010, at 6:36 AM, Jan Buchholdt wrote:

 We are evaluating Riak for a project, but are having a hard time making it fast 
enough for our needs.

 Our model is very simple and looks like this:

 ---------------------               * ---------------------
 |      Person       | --------------> |     Document      |
 ---------------------                 ---------------------

 We have a set of persons and each person can have many documents.

 Our typical queries are:

 Get an overview of all of a person's documents. This query returns the person 
along with a subset of the data from each of that person's documents.
 Get document by id.

 Our requirement is that these queries complete in under 100 ms at a load of 
10 requests per second or less.

 The size of the data:
 A document is approximately 1 kB.
 No data is stored for a person except the person identifier.
 Around 6 million persons.
 Each person has from 0 to a couple of thousand documents.
 All in all we have 120 million documents.
 Most persons have no more than 1 to 10 documents, but we have a few 
"heavy" persons with 500 to 1000 documents.

 Riak setup:
 4 Nodes.
 Hardware configuration for each node:
 HP ProLiant DL360 G7
 18 GB RAM
 SAS disks
 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz Proc 1
 Solaris 10 update 9

 We use the default Bitcask storage engine.
 We replicate data to 3 machines when it is written.
 Reads are served from just one machine.
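
 As a rough sanity check on these numbers, the sketch below works out how much 
data each node has to hold. It assumes that "1 kB" means 1000 bytes and that the 
3 replicas are spread evenly over the 4 nodes, and it ignores per-object 
overhead, so it is an order-of-magnitude estimate only.

```python
# Back-of-envelope sizing from the figures quoted in this message.
DOCUMENTS = 120_000_000   # "120 million documents"
DOC_SIZE_BYTES = 1_000    # "approximately 1 kB" (assumed to mean 1000 bytes)
REPLICAS = 3              # data is replicated to 3 machines
NODES = 4
RAM_PER_NODE_GB = 18

raw_gb = DOCUMENTS * DOC_SIZE_BYTES / 1e9      # one copy of every document
replicated_gb = raw_gb * REPLICAS              # all copies across the cluster
per_node_gb = replicated_gb / NODES            # assumes an even spread
cacheable_fraction = RAM_PER_NODE_GB / per_node_gb  # upper bound, ignores other RAM use

print(f"raw data:          {raw_gb:.0f} GB")
print(f"with replication:  {replicated_gb:.0f} GB")
print(f"per node:          {per_node_gb:.0f} GB")
print(f"fits in RAM (max): {cacheable_fraction:.0%}")
```

 Under these assumptions each node holds about 90 GB of document data against 
18 GB of RAM, so most reads of rarely touched documents would have to go to disk.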

 We tried implementing our data model using Riak links as described below:

 Persons are stored in a person bucket using their person identifier as key
 /person/{personid}
 Documents are saved in another bucket
 /document/{documentid}
 On each person we store links to that person's documents.
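
 For illustration, the links stored on a person could be expressed over HTTP as 
a Link header like the one built below. This is only a sketch: the /riak URL 
prefix and the riaktag value "document" are assumptions, and the helper name is 
made up for this example.

```python
def person_link_header(document_ids, tag="document", prefix="/riak"):
    """Build a Link header value pointing from a person to its documents.

    The prefix and tag are assumptions, not values taken from this thread.
    """
    return ", ".join(
        f'<{prefix}/document/{doc_id}>; riaktag="{tag}"' for doc_id in document_ids
    )


header = person_link_header(["doc1", "doc2"])
print("Link:", header)
```

 Every document a person owns adds one entry to this header, so a "heavy" 
person carries 500 to 1000 link entries on a single object.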

 We are having problems with the query fetching all the documents for a person. 
 Reading all the documents for a person is done using a link walk. The link walk 
starts by reading all the document keys using the personid. It then fetches all 
the documents.
 For persons with 1 - 5 documents the response times are often over 100 ms. And for 
the "heavy" persons with many documents the response times are several seconds. But 
we are very new to Riak and are probably using the wrong approach.
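
 The link walk described above corresponds to a single HTTP GET. The sketch 
below builds the request path, assuming the /riak prefix and the 
bucket,tag,keep form of a link-walk step; the helper name is hypothetical.

```python
from urllib.parse import quote


def link_walk_path(person_id, bucket="document", tag="_", keep="1", prefix="/riak"):
    """Path for a one-step link walk from a person to its linked documents.

    "_" is used as a wildcard for the tag and keep="1" asks for the results
    of this step to be returned (both are assumptions about the setup).
    """
    return f"{prefix}/person/{quote(person_id, safe='')}/{bucket},{tag},{keep}"


print(link_walk_path("1234"))
```

 The response to such a request contains every linked document in full, which 
is why the cost grows with the number of documents per person.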

 Below are our thoughts (having almost no experience with Riak):

 The chosen data model is good for writes. Writing a new document results in 3 
operations against Riak: writing the document using its id as key, reading the 
person to get all of the person's document links, and appending the new 
document's key to the person's links and writing the person back.
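
 The three operations can be sketched as follows. An in-memory dictionary 
stands in for Riak here; CountingStore and add_document are hypothetical names 
for this example and are not part of riak-java-client or any Riak API.

```python
class CountingStore:
    """In-memory stand-in for a key/value store that counts operations."""

    def __init__(self):
        self.data = {}
        self.ops = 0

    def get(self, bucket, key):
        self.ops += 1
        return self.data.get((bucket, key))

    def put(self, bucket, key, value):
        self.ops += 1
        self.data[(bucket, key)] = value


def add_document(store, person_id, doc_id, body):
    store.put("document", doc_id, body)                  # 1: write the document
    person = store.get("person", person_id) or {"links": []}  # 2: read the person
    links = list(person["links"]) + [doc_id]             # append the new key
    store.put("person", person_id, {"links": links})     # 3: write the person back


store = CountingStore()
add_document(store, "p1", "d1", "first document")
print(store.ops, store.data[("person", "p1")])
```

 The cost of a write stays at three operations however many documents the 
person already has; only the size of the person object grows.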

 Reading, using a link walk, is slow because it is expensive to fetch many 
documents, even though the link walk can read their keys right away by reading 
the links for the person. Even though we have 4 nodes and link walks are 
parallelized, many documents need to be retrieved from one node. Having to fetch, 
for example, 100 documents from one node (one disk) is expensive. We do not know 
how data is stored, but are afraid Riak is doing a lot of disk seeks.
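
 To put a rough number on the disk-seek concern, the sketch below estimates the 
fetch time if every document read misses the cache and costs one seek. The 8 ms 
seek time is an assumed figure, not a measurement from this cluster, and the 
model ignores network and processing time.

```python
import math


def fetch_time_ms(documents, nodes, seek_ms=8.0):
    """Worst-case time if each document costs one seek and each node reads
    its share one after another. seek_ms is an assumed value."""
    per_node = math.ceil(documents / nodes)
    return per_node * seek_ms


for docs, nodes in [(5, 4), (100, 4), (100, 1), (1000, 4)]:
    print(f"{docs:5d} documents on {nodes} node(s): {fetch_time_ms(docs, nodes):7.0f} ms")
```

 Under these assumptions, 100 documents on a single disk already take 800 ms, 
and even a perfectly even spread over 4 nodes takes 200 ms, so seeks alone could 
explain response times well above the 100 ms target.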

 We are considering another, more denormalized approach where we write all the 
documents for a person in one "blob". But then we are afraid our writes will 
become slow, because when adding a new document the blob must be read, the new 
document inserted, and the blob written back.
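
 The write cost of the blob approach can be estimated from the document size 
alone. The sketch below assumes 1 kB per document and ignores any encoding 
overhead; the function name is made up for this example.

```python
DOC_SIZE_KB = 1  # "approximately 1 kB" per document


def blob_write_cost_kb(existing_docs):
    """Data read and written when one document is added to a person's blob."""
    read_kb = existing_docs * DOC_SIZE_KB            # fetch the current blob
    written_kb = (existing_docs + 1) * DOC_SIZE_KB   # store the blob plus the new document
    return read_kb, written_kb


for n in (10, 1000):
    read_kb, written_kb = blob_write_cost_kb(n)
    print(f"{n:5d} existing documents: read {read_kb} kB, write {written_kb} kB")
```

 For a typical person with up to 10 documents this is about 10 kB in each 
direction; for a "heavy" person with 1000 documents each added document moves 
about 1 MB each way, in exchange for reading the whole overview with one fetch.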

 We could really use some input. Are our assumptions wrong? (We have not yet 
dug into the problems.) Is there a good data model for our requirements?
 We haven't looked at Riak Search at all. Maybe it could solve some of our 
problems.





 _______________________________________________
 riak-users mailing list
 riak-users@lists.basho.com
 http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

