Hi list,
         We have hit and reproduced this issue several times: ceph will 
suicide because "FileStore: sync_entry timed out" after very heavy random IO on 
top of RBD.
         My test environment is:
                            4-node ceph cluster with 20 HDDs for OSDs and 4 
Intel DC S3700 SSDs for journals per node, i.e. 80 spindles in total
                            48 VMs spread across 12 physical nodes, with 48 
RBDs attached to the VMs 1:1 via Qemu
                            Ceph 0.58
                            XFS as the OSD filesystem
         I am using Aiostress (something like FIO) to generate random write 
requests on top of each RBD.

         From ceph -w, ceph reports a very high op rate (10000+ ops/s), but 
theoretically 80 spindles can only provide up to 150*80/2 = 6000 IOPS for 4K 
random writes.
         When digging into the code, I found that the OSD writes data to the 
page cache and then returns. Although it calls ::sync_file_range, that syscall 
does not guarantee the data is on disk when it returns; it is an asynchronous 
call. So the situation is: random writes are extremely fast since they only go 
to the journal and the page cache, but once syncing starts, it takes a very 
long time. Because of this speed gap between the journal and the OSD data 
disks, the amount of data that needs to be synced keeps growing, and the sync 
will eventually exceed the 600s timeout.
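
         To illustrate what I mean, here is a minimal standalone C sketch (not 
Ceph's actual FileStore code; the file name "testfile" and the 64 MB write are 
made up for the example): sync_file_range() with SYNC_FILE_RANGE_WRITE only 
kicks off writeback of the dirty pages and returns immediately, while 
fdatasync() is the call that actually waits until the data is durable on disk.

/* Sketch: sync_file_range() initiates writeback but does not wait;
 * fdatasync() blocks until the data reaches stable storage. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const size_t len = 64 * 1024 * 1024;           /* 64 MB of dirty data */
    char *buf = malloc(len);
    if (!buf)
        return 1;
    memset(buf, 0xab, len);

    int fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (write(fd, buf, len) != (ssize_t)len) {     /* lands in the page cache */
        perror("write");
        return 1;
    }

    /* Returns as soon as writeback is queued; the data may still be only
     * in the page cache at this point. */
    if (sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE) < 0)
        perror("sync_file_range");
    printf("sync_file_range returned (data not necessarily on disk)\n");

    /* This is what actually waits for the data to hit the disk. */
    if (fdatasync(fd) < 0)
        perror("fdatasync");
    printf("fdatasync returned (data durable)\n");

    close(fd);
    free(buf);
    return 0;
}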

         For more information: I have tried to reproduce this with rados 
bench, but failed.

         Could you please let me know if you need any more information, or 
whether you have any ideas for a solution? Thanks.
            Xiaoxi
