On 9/4/2018 8:50 PM, Subhra Mazumdar wrote:
> On 08/31/2018 09:09 AM, Steven Sistare wrote:
>> On 8/30/2018 4:24 PM, subhra mazumdar wrote:
>>> Introduce pipe_ll_usec field for pipes that indicates the amount of micro
>>> seconds a thread should spin if pipe is empty or full before sleeping. This
>>> is similar to network sockets. Workloads like hackbench in pipe mode
>>> benefits significantly from this by avoiding the sleep and wakeup overhead.
>>> Other similar usecases can benefit. pipe_wait_flag is used to signal any
>>> thread busy waiting. pipe_busy_loop_timeout checks if spin time is over.
>>>
>>> Signed-off-by: subhra mazumdar <subhra.mazum...@oracle.com>
>>> ---
>>>  include/linux/pipe_fs_i.h | 19 +++++++++++++++++++
>>>  1 file changed, 19 insertions(+)
>>>
>>> diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
>>> index e7497c9..fdfd2a2 100644
>>> --- a/include/linux/pipe_fs_i.h
>>> +++ b/include/linux/pipe_fs_i.h
>>> @@ -1,6 +1,8 @@
>>>  #ifndef _LINUX_PIPE_FS_I_H
>>>  #define _LINUX_PIPE_FS_I_H
>>> +#include <linux/sched/clock.h>
>>> +
>>>  #define PIPE_DEF_BUFFERS 16
>>>  #define PIPE_BUF_FLAG_LRU 0x01 /* page is on the LRU */
>>> @@ -54,6 +56,8 @@ struct pipe_inode_info {
>>>      unsigned int waiting_writers;
>>>      unsigned int r_counter;
>>>      unsigned int w_counter;
>>> +    unsigned int pipe_ll_usec;
>>> +    unsigned long pipe_wait_flag;
>>>      struct page *tmp_page;
>>>      struct fasync_struct *fasync_readers;
>>>      struct fasync_struct *fasync_writers;
>>> @@ -157,6 +161,21 @@ static inline int pipe_buf_steal(struct
>>> pipe_inode_info *pipe,
>>>      return buf->ops->steal(pipe, buf);
>>>  }
>>> +static inline unsigned long pipe_busy_loop_current_time(void)
>>> +{
>>> +    return (unsigned long)(local_clock() >> 10);
>> Why ">> 10" ? local_clock() has nanosec units, and you compare to the tunable
>> pipe_ll_usec which has microsec units. Should be ">> 3". Better yet, redefine
>> the tunable to have nanosec units. I suspect you will need very large values
>> of the tunable to show similar results.
> It's 2^10. I don't think using nanosec units is necessary. It is unlikely
> data will be read or written in nano seconds. sk_busy_loop_timeout for
> sockets uses micro seconds too.

Ah, you are using 2^10 as an approximation of 1000. OK.

- Steve

>>
>>
>> Also, since this type of optimization consumes extra CPU cycles that could
>> be used by other tasks, show the overall CPU utilization before and after
>> the optimization, such as by using "time hackbench ...".
> OK.
>
> Thanks,
> Subhra
>>
>> - Steve
>>
>>> +}
>>> +
>>> +static inline bool pipe_busy_loop_timeout(struct pipe_inode_info *pipe,
>>> +                                          unsigned long start_time)
>>> +{
>>> +    unsigned long bp_usec = READ_ONCE(pipe->pipe_ll_usec);
>>> +    unsigned long end_time = start_time + bp_usec;
>>> +    unsigned long now = pipe_busy_loop_current_time();
>>> +
>>> +    return time_after(now, end_time);
>>> +}
>>> +
>>>  /* Differs from PIPE_BUF in that PIPE_SIZE is the length of the actual
>>>     memory allocation, whereas PIPE_BUF makes atomicity guarantees. */
>>>  #define PIPE_SIZE PAGE_SIZE
>>>
>
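
For readers following the units question in this exchange, here is a minimal
user-space sketch of the arithmetic. It is not the kernel code: the names
ns_to_approx_usec, after, busy_loop_timed_out, effective_spin_ns and
sketch_selfcheck are hypothetical, and "now" is passed in as a parameter
rather than read from local_clock(). It assumes only what the thread states:
the clock is in nanoseconds, the tunable is in microseconds, and a shift by
10 stands in for a division by 1000.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: approximate ns -> us with a shift, as the patch does.
 * One shifted unit is 1024 ns, not 1000 ns. */
static inline uint64_t ns_to_approx_usec(uint64_t ns)
{
	return ns >> 10;
}

/* Real spin budget in ns for a tunable expressed in shifted units. */
static inline uint64_t effective_spin_ns(uint64_t usec)
{
	return usec << 10;
}

/* Wraparound-tolerant "a is after b", written in the spirit of the
 * kernel's time_after(); relies on the usual two's-complement cast. */
static inline bool after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical stand-in for pipe_busy_loop_timeout(): the caller supplies
 * the start time, the budget and the current time, all in shifted units. */
static inline bool busy_loop_timed_out(unsigned long start,
				       unsigned long budget,
				       unsigned long now)
{
	return after(now, start + budget);
}

/* Small usage example with values worked out by hand. */
static inline void sketch_selfcheck(void)
{
	/* 1,000,000 ns is 1000 us exactly, but only 976 shifted units. */
	assert(ns_to_approx_usec(1000000) == 976);
	/* A tunable of 100 therefore spins for 102,400 ns, about 2.4% long. */
	assert(effective_spin_ns(100) == 102400);
	/* The deadline itself does not count as expired; one unit later does. */
	assert(!busy_loop_timed_out(10, 5, 15));
	assert(busy_loop_timed_out(10, 5, 16));
}
```

Under these assumptions the shift makes each unit of the tunable about 2.4%
longer than a true microsecond, which is small next to the sleep and wakeup
cost the patch is trying to avoid, and the signed-difference comparison keeps
the timeout check correct when the counter wraps.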