Sounds good :) Brad, many thanks for the explanation.

On Thu, Mar 16, 2017 at 12:42 PM, Brad Hubbard <bhubb...@redhat.com> wrote:
> On Thu, Mar 16, 2017 at 4:33 PM, nokia ceph <nokiacephus...@gmail.com>
> wrote:
> > Hello Brad,
> >
> > I meant for this parameter bdev_aio_max_queue_depth, Sage suggested try
> > diff values, 128, 1024, 4096. So my doubt how this calculation happens?
> > Is this related to memory?
>
> The bdev_aio_max_queue_depth parameter represents the nr_events
> argument to the libaio io_setup function.
>
> int io_setup(unsigned nr_events, aio_context_t *ctx_idp);
>
> From the man page for io_setup:
>
> "The io_setup() system call creates an asynchronous I/O context
> suitable for concurrently processing nr_events operations."
>
> The current theory we are working with is that io_submit is returning
> EAGAIN because nr_events is too small at the default of 32. Therefore
> we have suggested raising this value. There is no real calculation
> involved in the values Sage is suggesting other than they are
> *larger*. It's a matter of playing with the value to see if, and when,
> the error messages go away. If we know a larger value reduces or
> eradicates the error we can then turn our focus more to *why*. Longer
> term this can assist us in setting a more reasonable default.
>
> >
> > Thanks
> >
> >
> > On Thu, Mar 16, 2017 at 11:53 AM, Brad Hubbard <bhubb...@redhat.com>
> > wrote:
> >>
> >> On Thu, Mar 16, 2017 at 4:15 PM, nokia ceph <nokiacephus...@gmail.com>
> >> wrote:
> >> > Hello,
> >> >
> >> > We are running latest kernel - 3.10.0-514.2.2.el7.x86_64 { RHEL 7.3 }
> >> >
> >> > Sure I will try to alter this directive - bdev_aio_max_queue_depth and
> >> > will share our results.
> >> >
> >> > Could you please explain how this calculation happens?
> >>
> >> What calculation are you referring to?
> >>
> >> > Thanks
> >> >
> >> >
> >> > On Wed, Mar 15, 2017 at 7:54 PM, Sage Weil <s...@newdream.net> wrote:
> >> >>
> >> >> On Wed, 15 Mar 2017, Brad Hubbard wrote:
> >> >> > +ceph-devel
> >> >> >
> >> >> > On Wed, Mar 15, 2017 at 5:25 PM, nokia ceph
> >> >> > <nokiacephus...@gmail.com> wrote:
> >> >> > > Hello,
> >> >> > >
> >> >> > > We suspect these messages not only at the time of OSD creation. But
> >> >> > > in idle conditions also. May I know what is the impact of these
> >> >> > > error? Can we safely ignore this? Or is there any way/config to fix
> >> >> > > this problem
> >> >> > >
> >> >> > > Few occurrence for these events as follows:---
> >> >> > >
> >> >> > > ====
> >> >> > > 2017-03-14 17:16:09.500370 7fedeba61700 4 rocksdb: (Original Log Time 2017/03/14-17:16:09.453130) [default] Level-0 commit table #60 started
> >> >> > > 2017-03-14 17:16:09.500374 7fedeba61700 4 rocksdb: (Original Log Time 2017/03/14-17:16:09.500273) [default] Level-0 commit table #60: memtable #1 done
> >> >> > > 2017-03-14 17:16:09.500376 7fedeba61700 4 rocksdb: (Original Log Time 2017/03/14-17:16:09.500297) EVENT_LOG_v1 {"time_micros": 1489511769500289, "job": 17, "event": "flush_finished", "lsm_state": [2, 4, 6, 0, 0, 0, 0], "immutable_memtables": 0}
> >> >> > > 2017-03-14 17:16:09.500382 7fedeba61700 4 rocksdb: (Original Log Time 2017/03/14-17:16:09.500330) [default] Level summary: base level 1 max bytes base 268435456 files[2 4 6 0 0 0 0] max score 0.76
> >> >> > >
> >> >> > > 2017-03-14 17:16:09.500390 7fedeba61700 4 rocksdb: [JOB 17] Try to delete WAL files size 244090350, prev total WAL file size 247331500, number of live WAL files 2.
> >> >> > >
> >> >> > > 2017-03-14 17:34:11.610513 7fedf3a71700 -1 bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 6
> >> >> >
> >> >> > These errors come from here.
> >> >> >
> >> >> > void KernelDevice::aio_submit(IOContext *ioc)
> >> >> > {
> >> >> > ...
> >> >> >     int r = aio_queue.submit(*cur, &retries);
> >> >> >     if (retries)
> >> >> >       derr << __func__ << " retries " << retries << dendl;
> >> >> >
> >> >> > The submit function is this one which calls libaio's io_submit
> >> >> > function directly and increments retries if it receives EAGAIN.
> >> >> >
> >> >> > #if defined(HAVE_LIBAIO)
> >> >> > int FS::aio_queue_t::submit(aio_t &aio, int *retries)
> >> >> > {
> >> >> >   // 2^16 * 125us = ~8 seconds, so max sleep is ~16 seconds
> >> >> >   int attempts = 16;
> >> >> >   int delay = 125;
> >> >> >   iocb *piocb = &aio.iocb;
> >> >> >   while (true) {
> >> >> >     int r = io_submit(ctx, 1, &piocb); <-------------NOTE
> >> >> >     if (r < 0) {
> >> >> >       if (r == -EAGAIN && attempts-- > 0) { <-------------NOTE
> >> >> >         usleep(delay);
> >> >> >         delay *= 2;
> >> >> >         (*retries)++;
> >> >> >         continue;
> >> >> >       }
> >> >> >       return r;
> >> >> >     }
> >> >> >     assert(r == 1);
> >> >> >     break;
> >> >> >   }
> >> >> >   return 0;
> >> >> > }
> >> >> >
> >> >> >
> >> >> > From the man page.
> >> >> >
> >> >> > IO_SUBMIT(2)          Linux Programmer's Manual          IO_SUBMIT(2)
> >> >> >
> >> >> > NAME
> >> >> >        io_submit - submit asynchronous I/O blocks for processing
> >> >> > ...
> >> >> > RETURN VALUE
> >> >> >        On success, io_submit() returns the number of iocbs submitted
> >> >> >        (which may be 0 if nr is zero). For the failure return, see
> >> >> >        NOTES.
> >> >> >
> >> >> > ERRORS
> >> >> >        EAGAIN Insufficient resources are available to queue any iocbs.
> >> >> >
> >> >> > I suspect increasing bdev_aio_max_queue_depth may help here but some
> >> >> > of the other devs may have more/better ideas.
> >> >>
> >> >> Yes--try increasing bdev_aio_max_queue_depth. It defaults to 32; try
> >> >> changing it to 128, 1024, or 4096 and see if these errors go away.
> >> >>
> >> >> I've never been able to trigger this on my test boxes, but I put in the
> >> >> warning to help ensure we pick a good default.
> >> >>
> >> >> What kernel version are you running?
> >> >>
> >> >> Thanks!
> >> >> sage
> >> >
> >> >
> >>
> >>
> >> --
> >> Cheers,
> >> Brad
> >
> >
> --
> Cheers,
> Brad
>
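
For anyone following the thread who wants to put numbers on the retry loop quoted above: the sketch below just redoes the backoff arithmetic (16 attempts, 125us initial delay, doubling after each sleep). The helper names total_backoff_us and longest_sleep_us are made up for illustration and are not part of Ceph; the only inputs taken from the quoted code are the 16 and the 125. It assumes every io_submit call in the loop returns EAGAIN.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not Ceph code): total microseconds slept by a retry
// loop whose delay starts at initial_delay_us and doubles after each sleep,
// assuming all `attempts` retries are used.
std::uint64_t total_backoff_us(int attempts, std::uint64_t initial_delay_us) {
  assert(attempts >= 0 && attempts < 32);  // keep the doubling well in range
  std::uint64_t total = 0;
  std::uint64_t delay = initial_delay_us;
  for (int i = 0; i < attempts; ++i) {
    total += delay;  // mirrors usleep(delay) in the quoted loop
    delay *= 2;      // mirrors delay *= 2
  }
  return total;
}

// Hypothetical helper (not Ceph code): the longest single sleep in that
// sequence, i.e. the delay used on the final retry.
std::uint64_t longest_sleep_us(int attempts, std::uint64_t initial_delay_us) {
  assert(attempts < 32);
  if (attempts <= 0) return 0;
  return initial_delay_us << (attempts - 1);
}
```

With the quoted values, total_backoff_us(16, 125) comes to 8191875us (about 8.2 seconds) and longest_sleep_us(16, 125) to 4096000us (about 4.1 seconds). For the logged "aio_submit retries 6", total_backoff_us(6, 125) gives 7875us, i.e. under 8ms of sleeping, if all six retries came from a single submit call (the counter is shared across the submits in one aio_submit, so they may not have).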
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com