Hello,

I applied the suggested modifications to one of our ATS clusters last Friday. Unfortunately, I had IO/Wait issues during the weekend.
The modified cluster is not the one that gets the most load: I have 2 clusters, and the one I modified is the parent of the other. But this time I noticed something special: when IO/Wait starts increasing, ATS cache write failures do the same (key "proxy.process.cache.write.failure" from the http_stats plugin). When I restart the ATS service, the cache write failures stop, and so does the IO/Wait. So I now think the problem is related to the cache configuration. I'll investigate this option but, as always, comments and ideas are welcome :)

Regards,
Jean-Baptiste Favre

On 16/06/2014 18:01, Jean Baptiste Favre wrote:
> Hello Jay,
>
> Strange, I did not see your reply :-/
>
> For now, I've added more VMs and have no more IO/Wait issues.
> Anyway, I'll try your suggestions.
>
> I made some checks. Here are the results:
>
> On 15/06/2014 22:32, jtomo...@yahoo.com.INVALID wrote:
>> Hi Baptiste, sorry for my late reply to your issue.
>>
>> I suggest some environment and software settings, considering 4GB of RAM and
>> 4 CPU threads:
>>
>> 1- check if ATS is linked with libhwloc library (ldd bin/traffic_server |
>> grep libhwloc) if not, recompile using it
> Not linked. Will try it
>
>> 2- remove irqbalance (for Ubuntu distro: sudo apt-get purge irqbalance)
> Not installed
>
>> 3- reserve the last core (CPU3) to disk IRQs. In this case, distribute
>> network IRQs among all cores and set ATS threads to use the first 3 cores
>> (CPU0, CPU1, CPU2).
>>
>> # set disk IRQs to CPU3
>> for DEV in vmw_pvscsi ata_piix; do
>>   for IRQ in $(grep $DEV /proc/interrupts | cut -d: -f1 | sed "s/ //g"); do
>>     echo 08 > /proc/irq/$IRQ/smp_affinity
>>   done
>> done
>>
>> # distribute network IRQs among CPU0-CPU3
>> IRQ_NET_1=$(grep eth0-rxtx-1 /proc/interrupts | cut -d: -f1 | sed "s/ //g")
>> IRQ_NET_2=$(grep eth0-rxtx-2 /proc/interrupts | cut -d: -f1 | sed "s/ //g")
>> IRQ_NET_3=$(grep eth0-rxtx-3 /proc/interrupts | cut -d: -f1 | sed "s/ //g")
>> echo 02 > /proc/irq/$IRQ_NET_1/smp_affinity
>> echo 04 > /proc/irq/$IRQ_NET_2/smp_affinity
>> echo 08 > /proc/irq/$IRQ_NET_3/smp_affinity
> Will try
>
>> 4- set these values at records.config :
>> CONFIG proxy.config.cache.ram_cache.size INT 3G
>> CONFIG proxy.config.cache.ram_cache_cutoff INT 4M
>> CONFIG proxy.config.exec_thread.limit INT 3
>> CONFIG proxy.config.cache.threads_per_disk INT 12
>> CONFIG proxy.config.task_threads INT 6
> Will try
>
>> 5- also consider setting these disk tweaks :
>>
>> # set noop scheduler for storage disks
>> for DEV in sdb sdc sdd; do
>>   echo 1024 > /sys/block/$DEV/queue/nr_requests
>>   echo noop > /sys/block/$DEV/queue/scheduler
>>   echo 8192 > /sys/block/$DEV/queue/read_ahead_kb
>> done
> Will do it as well
>
>> Feel free to post here your iostat -x 1 results after applying these changes
>>
>> Cheers
>> Jay Tomolek
>
> Regards,
> Jean-Baptiste
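
For reference, the smp_affinity values in step 3 of the quoted suggestions are hexadecimal CPU bitmasks in which bit N selects CPU N, so 08 is CPU3 and 02, 04, 08 are CPU1, CPU2, CPU3. The sketch below only converts between a CPU index and its mask so the values can be checked before writing them; it assumes a POSIX shell and fewer than 32 CPUs, and the helper names cpu_mask and mask_cpus are hypothetical, not part of ATS or of the scripts above.

```shell
#!/bin/sh
# Hypothetical helpers for checking smp_affinity values.
# They only do arithmetic; nothing is written to /proc.

# cpu_mask N: print the two-digit hex mask that selects only CPU N.
cpu_mask() {
    printf '%02x\n' $((1 << $1))
}

# mask_cpus MASK: print the CPU indexes selected by a hex mask,
# separated by spaces, lowest CPU first.
mask_cpus() {
    mask=$((0x$1))
    cpu=0
    out=""
    while [ "$mask" -gt 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            out="${out:+$out }$cpu"
        fi
        mask=$((mask >> 1))
        cpu=$((cpu + 1))
    done
    echo "$out"
}

# Example: mask for the core reserved for disk IRQs (CPU3).
cpu_mask 3
```

Running `mask_cpus` on each value before echoing it into /proc/irq/$IRQ/smp_affinity is a cheap way to confirm which cores a given IRQ will actually land on.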