Hi,

What I was hoping to avoid was MongoDB shifting data in a collection to 
make room for new elements in a document (i.e. RowKey), as it appears to 
be quite an expensive operation. I noticed a correlation between many of 
the big spikes in my results and the following kind of entry in 
mongodb.log (queries taking >100ms are logged by default). Most of the 
slow updates had the "moved" key set on the log entry, e.g.:

update ogm_test_database.associations_BlogEntry query: { _id: ObjectId('4f957de9ca8af8159d604763') } update: { $push: { rows: { table: "BlogEntry", columns: { id: 52105 }, tuple: { id: 52105, author_a_id: 15134 } } } } idhack:1 moved:1 1831ms
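
For reference, that update corresponds to roughly the following against the 
2.x Java driver (just a sketch, with the names copied from the log above; 
not the actual dialect code). Each appended row grows the document, and once 
it outgrows its allocated record MongoDB has to move it:

    import com.mongodb.BasicDBObject;
    import com.mongodb.DB;
    import com.mongodb.DBCollection;
    import com.mongodb.Mongo;
    import org.bson.types.ObjectId;

    public class PushExample {
        public static void main(String[] args) throws Exception {
            Mongo mongo = new Mongo("localhost", 27017);
            DB db = mongo.getDB("ogm_test_database");
            DBCollection assoc = db.getCollection("associations_BlogEntry");

            BasicDBObject query = new BasicDBObject("_id",
                    new ObjectId("4f957de9ca8af8159d604763"));
            // One association row, pushed onto the embedded "rows" array
            BasicDBObject row = new BasicDBObject("table", "BlogEntry")
                    .append("columns", new BasicDBObject("id", 52105))
                    .append("tuple", new BasicDBObject("id", 52105)
                            .append("author_a_id", 15134));
            assoc.update(query, new BasicDBObject("$push",
                    new BasicDBObject("rows", row)));

            mongo.close();
        }
    }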


Since then I've read the following page, which explains that an amount 
of padding is added using a heuristic, calculated based on whether and 
how much you grow documents in a collection during its lifetime. So for 
"real world" use maybe this isn't such a problem. In the performance 
tests we work with an empty collection, so there tend to be a number of 
these spikes before the heuristic kicks in:

http://www.mongodb.org/display/DOCS/Padding+Factor
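
You can watch the heuristic at work via collStats, and for the 
empty-collection case one known workaround is to pre-pad new documents with 
a throwaway field and then $unset it, so the record keeps its allocated 
size and the document can grow in place. A sketch against the 2.x Java 
driver (the "_padding" field name and 2048-byte size are arbitrary choices 
of mine, not anything OGM does):

    import com.mongodb.BasicDBObject;
    import com.mongodb.CommandResult;
    import com.mongodb.DB;
    import com.mongodb.DBCollection;
    import com.mongodb.Mongo;

    public class PaddingCheck {
        public static void main(String[] args) throws Exception {
            Mongo mongo = new Mongo("localhost", 27017);
            DB db = mongo.getDB("ogm_test_database");
            DBCollection assoc = db.getCollection("associations_BlogEntry");

            // collStats reports the padding factor the heuristic has
            // settled on (1.0 = no padding; it rises as documents move)
            CommandResult stats = db.command(
                    new BasicDBObject("collStats", "associations_BlogEntry"));
            System.out.println("paddingFactor: " + stats.get("paddingFactor"));

            // Pre-pad, then strip the filler; the allocated record size
            // is kept, leaving room for future $push operations
            BasicDBObject doc = new BasicDBObject("_id", "some-id")
                    .append("_padding", new byte[2048]);
            assoc.insert(doc);
            assoc.update(new BasicDBObject("_id", "some-id"),
                    new BasicDBObject("$unset",
                            new BasicDBObject("_padding", 1)));

            mongo.close();
        }
    }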


On 04/25/2012 08:13 AM, Emmanuel Bernard wrote:
> Hi Alan and all,
>
> I have been researching the spikes issue you encountered in the stress test 
> from a theoretical point of view.
> You were trying a different association storage approach (splitting 
> associations into one row per document rather than the whole association per 
> document). Does that give better results?
>
> I am skeptical for a few reasons. MongoDB has a global write lock per mongod 
> process (they are working on a more fine-grained solution), so if the spikes 
> are due to lock contention, shuffling data won't help much. Also make sure 
> you use MongoDB 2.0 instead of 1.8, as 2.0 yields the lock on page faults, 
> which should solve a lot of these spike problems.
>
> I have found this blog entry to be quite insightful 
> http://blog.pythonisito.com/2011/12/mongodbs-write-lock.html
>
> Generally speaking, if all data can stay in memory, MongoDB should behave 
> wonderfully.
>
> Which leads to my demo and the time difference between Infinispan (5s) and 
> MongoDB (20s). I can see several reasons:
>
> - we don't really batch operations in the MongoDB dialect, and we should:
>   accumulate operations and apply them at the end of the flush operation in
>   one batch. That will require some new infrastructure from OGM's engine,
>   though, to tell the dialect when to "flush".
> - my VM might swap memory to disk, which would explain the difference
> - or it could be that Infinispan is simply 4 times faster, which would not
>   be too surprising as Infinispan is in-process.
>
> Emmanuel
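
On the batching point above: the pre-2.6 wire protocol has no bulk update 
command, so "one batch" would mostly mean deferring the round trips until 
flush time. A minimal sketch of the accumulate-and-flush idea (all class 
and method names here are hypothetical, not OGM or driver API):

    import com.mongodb.DBCollection;
    import com.mongodb.DBObject;
    import java.util.ArrayList;
    import java.util.List;

    public class BatchingUpdateQueue {

        private static final class PendingUpdate {
            final DBCollection collection;
            final DBObject query;
            final DBObject update;
            PendingUpdate(DBCollection c, DBObject q, DBObject u) {
                this.collection = c;
                this.query = q;
                this.update = u;
            }
        }

        private final List<PendingUpdate> pending =
                new ArrayList<PendingUpdate>();

        // Called by the dialect instead of executing each update immediately
        public void enqueue(DBCollection collection, DBObject query,
                DBObject update) {
            pending.add(new PendingUpdate(collection, query, update));
        }

        // Called once, when the engine tells the dialect to flush
        public void flush() {
            for (PendingUpdate op : pending) {
                op.collection.update(op.query, op.update);
            }
            pending.clear();
        }
    }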
