> Last week, I got more servers from another HW provider with more
> CPU/RAM/disks, 12 disks in each storage node. This deployment of the
> Swift cluster kept up better performance for a longer time.
> Unfortunately, after 15,000,000 objects the performance dropped to half
> and the failures appeared. I am concerned about whether a particular
> (total number of objects / number of disks) ratio will cause such an
> effect in large deployments (e.g. cloud storage providers, telecoms,
> banks, etc.)

We also see lower-than-expected performance, and we have many files on the
nodes. We currently have about 15 million files per filesystem/disk on our
object servers, with 6 disks per machine (90 million files on one node) and
48 GB of memory.

The object servers are I/O-bound when we do PUTs, effectively doing about 1
write per disk we have in the cluster (while also doing about 2 reads per
disk at the same time).
This is way lower than expected, especially since we also use flashcache
with 10 GB of caching per disk and the container nodes are on separate
hardware with SSDs.

One of the theories we have is that the inode tree of the filesystem no
longer fits in memory, which could result in a lot of extra I/Os to the
disks.
Also, the object-replicator walks through all the files, effectively
busting the inode cache continuously.
A way to test this theory is to add more memory to the nodes and see
whether this helps / moves the issue up by a few million files, but we
haven't had the resources to test this out yet.
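
A cheaper sanity check of the theory is to look at how much memory the
inode and dentry slabs actually occupy, and compare that with a
back-of-the-envelope estimate for 90 million cached inodes. Rough sketch
below; note the ~1 KB per xfs_inode and ~200 bytes per dentry figures are
assumptions about typical slab object sizes, not measured values, and
reading /proc/slabinfo usually needs root:

  #!/usr/bin/env python
  # Rough check of inode/dentry cache footprint via /proc/slabinfo,
  # plus a back-of-the-envelope estimate for 90M cached inodes.
  # Per-object sizes in the estimate are assumptions, not measurements.

  SLABS = ('xfs_inode', 'dentry')

  with open('/proc/slabinfo') as f:
      for line in f:
          fields = line.split()
          if fields and fields[0] in SLABS:
              num_objs = int(fields[2])   # total allocated objects
              objsize = int(fields[3])    # bytes per object
              mb = num_objs * objsize / (1024.0 * 1024.0)
              print('%s: %d objects x %d bytes = %.0f MB' %
                    (fields[0], num_objs, objsize, mb))

  # Estimate: keeping every inode of 90M files cached.
  files = 90 * 10**6
  inode_bytes = 1024    # assumed xfs_inode slab object size
  dentry_bytes = 200    # assumed dentry slab object size
  total_gb = files * (inode_bytes + dentry_bytes) / (1024.0 ** 3)
  print('estimated cache for %d files: ~%.0f GB (vs. 48 GB RAM here)'
        % (files, total_gb))

With those assumed sizes the estimate comes out well above the 48 GB we
have per node, which would fit the symptoms, but we'd still want to
confirm it with more memory or real slab numbers.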

Cheers,
Robert van Leeuwen
