Hello Robert,

Very interesting experiences, and thank you very much for sharing them with us.

Based on your experiences I ran some tests with a 256-byte inode size. I use 
Ubuntu with kernel 3.5.0 and it is working properly.
That should reduce memory consumption and perform better once I store 
many more objects.
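(For anyone repeating this test: 256 bytes is what a recent mkfs.xfs uses by default, so a fresh Swift object disk could be prepared roughly like this. The device and mount point are hypothetical, and mkfs destroys any data on the device.)

```shell
# Hypothetical device/mount point -- adjust, and never run against a disk in use.
mkfs.xfs -i size=256 -L d1 /dev/sdb1
mount -o noatime /dev/sdb1 /srv/node/d1
# verify the inode size actually in effect
xfs_info /srv/node/d1 | grep isize
```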

Deleting objects shows very strange behavior. We delete about 700,000 objects 
during the night and the load rises by 50 percent:

http://www.schuermann.net/temp/delete.png

This is the network graph during deletion:

http://www.schuermann.net/temp/delete-traffic.png

Normal?

Regards,
Klaus


From: Robert van Leeuwen [mailto:robert.vanleeu...@spilgames.com] 
Sent: Wednesday, August 28, 2013 09:34
To: openstack@lists.openstack.org
Subject: Re: [Openstack] [SWIFT] PUTs and GETs getting slower

Just a follow-up on this thread, because I've taken some time to write up our 
experiences:
http://engineering.spilgames.com/openstack-swift-lots-small-files/

Klaus, 

Answering your question on initial sync times:
Yes, we also see long initial syncs.
For us it takes a few days for a new node to be synced.
Usually it goes pretty quickly at first (30 MB/second), and the performance 
gradually degrades as the disks start filling up and the machines run 
low on memory.
We have about 6TB on a node to sync.

Cheers,
Robert van Leeuwen


________________________________________
From: Klaus Schürmann [klaus.schuerm...@mediabeam.com]
Sent: Tuesday, August 20, 2013 9:04 AM
To: Maximiliano Venesio; Robert van Leeuwen
Cc: openstack@lists.openstack.org
Subject: Re: [Openstack] [SWIFT] PUTs and GETs getting slower
Hi,
 
after adding additional disks and storing the account- and container-server on 
SSDs the performance is much better:
 
Before:
GETs      average               620 ms
PUTs     average               1900 ms
 
After:
GETs      average               280 ms
PUTs     average               1100 ms
 
Only the rebalance process took days to sync all the data to the additional 
five disks (before, each storage node had 3 disks). I used a concurrency of 4. 
One round to replicate all partitions took over 24 hours. After five days the 
replication process takes only 300 seconds.
Each additional disk now has 300 GB of data stored. Is such a duration normal 
for syncing the data?
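(For reference: the concurrency of 4 mentioned above is the object-replicator setting; in a standard Swift layout it would be configured roughly like this. A sketch, not an actual config file.)

```ini
# /etc/swift/object-server.conf -- sketch, only the relevant section shown
[object-replicator]
# number of parallel replication workers; 4 is the value used in this thread
concurrency = 4
```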
 
Thanks
Klaus
 
 
From: Maximiliano Venesio [mailto:maximiliano.vene...@mercadolibre.com] 
Sent: Thursday, August 8, 2013 17:26
To: Robert van Leeuwen
Cc: openstack@lists.openstack.org
Subject: Re: [Openstack] [SWIFT] PUTs and GETs getting slower
 
Hi Robert, 
 
I was reading your post and it is interesting, because we have similar Swift 
deployments and use cases.
We are storing millions of small images in our Swift cluster: 32 storage nodes 
with 12 x 2 TB HDDs + 2 SSDs each, and we see a total average of 200k requests 
per minute across the whole cluster.
In terms of disk utilization, we average 50% across all our disks, yet we are 
using only 15% of their total capacity.
When I look at used inodes on our object nodes with "df -i" we hit about 17 
million inodes per disk.
 
So that seems like a big number of inodes considering that we are using just 15% of 
the total capacity. A difference here is that we are using a 512-byte inode 
size and we have a large amount of memory.
Also, one of our disks is always close to 100% utilization, and this is caused 
by the object-auditor, which scans all our disks continuously.
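(If the auditor keeps pinning a disk at 100%, Swift can rate-limit it; a sketch of the relevant object-server.conf section, with illustrative values rather than tuned recommendations:)

```ini
[object-auditor]
# throttle the continuous audit scan so it cannot saturate a single disk
files_per_second = 20
bytes_per_second = 10000000
```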
 
So we were also thinking about changing the kind of disks we use, towards 
smaller and faster disks.
It would be really useful to know what kind of disks you are using in your old 
and new storage nodes, to compare with our case.
 
 
Cheers,
Max


 

 
Maximiliano Venesio 
#melicloud CloudBuilders
Arias 3751, Piso 7 (C1430CRG) 
Ciudad de Buenos Aires - Argentina
Cel: +549(11) 15-3770-1853
Tel : +54(11) 4640-8411
 
On Tue, Aug 6, 2013 at 11:54 AM, Robert van Leeuwen 
<robert.vanleeu...@spilgames.com> wrote:
Could you check your disk IO on the container/object nodes?

We have quite a lot of files in Swift, and for comparison purposes I played a 
bit with COSbench to see where we hit the limits.
We currently max out at about 200-300 PUT requests/second, and the bottleneck 
is the disk IO on the object nodes.
Our account/container nodes are on SSDs and are not a limiting factor.

You can look for IO bottlenecks with e.g. "iostat -x 10" (this will refresh the 
view every 10 seconds).
During the benchmark I see some of the disks hitting 100% utilization.
That we hit the IO limits with just 200 PUTs a second has to do with the 
number of files on the disks.
When I look at used inodes on our object nodes with "df -i" we hit about 60 
million inodes per disk.
(a significant part of those are actually directories; I calculated about 30 
million actual files based on the number of objects in Swift)
We use flashcache in front of those disks and it is still REALLY slow; just 
doing an "ls" can take up to 30 seconds.
Adding lots of memory should probably help with caching the inodes, but that 
is quite challenging.
I am not sure how big a directory is in the XFS inode tree, but just for the files:
30 million x 1 KB inodes = 30 GB.
And that is just one disk :)
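(Robert's estimate above can be reproduced with a line of shell arithmetic: 30 million files at the old 1 KB inode size, directories excluded as he notes.)

```shell
# 30 million files x 1024-byte inodes, in (decimal) GB per disk
inode_size=1024
files=30000000
echo "$(( inode_size * files / 1000000000 )) GB of inode metadata per disk"
# prints: 30 GB of inode metadata per disk
```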

We still use the old recommended inode size of 1 KB; the default of 256 bytes 
can be used now with recent kernels:
https://lists.launchpad.net/openstack/msg24784.html

So some time ago we decided to go for nodes with more, smaller and faster 
disks, and with more memory.
Those machines are not even close to their limits; however, we still have more 
"old" nodes, so performance is limited by those machines.
At this moment it is sufficient for our use case but I am pretty confident we 
would be able to
significantly improve performance by adding more of those machines and doing 
some re-balancing of the load.

Cheers,
Robert van Leeuwen
_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
 
