On Monday April 16, [EMAIL PROTECTED] wrote:
>
> cfq_dispatch_insert() was called with rq == 0. This one is getting really
> annoying... and md is involved again (RAID0 this time.)

Yeah... weird.
RAID0 is so light-weight and so different from RAID1 or RAID5 that I
feel fairly safe concluding that the problem isn't in or near md.
But that doesn't help you.

This really feels like a locking problem.

The problem occurs when ->next_rq is NULL, but ->sort_list.rb_node is
not NULL. That happens plenty of times in the code (particularly as
the first request is inserted) but always under ->queue_lock so it
should never be visible to cfq_dispatch_insert.

Except that drivers/scsi/ide-scsi.c:idescsi_eh_reset calls
elv_next_request which could ultimately call __cfq_dispatch_requests
without taking ->queue_lock (that I can see). But you probably aren't
using ide-scsi (does anyone?).

Given that interrupts are always disabled when queue_lock is taken, it
might be useful to add
	WARN_ON(!irqs_disabled());
every time ->next_rq is set.
Something like the following. It might show something useful.... if
we are lucky.

NeilBrown


diff .prev/block/cfq-iosched.c ./block/cfq-iosched.c
--- .prev/block/cfq-iosched.c	2007-04-17 15:01:36.000000000 +1000
+++ ./block/cfq-iosched.c	2007-04-17 15:02:25.000000000 +1000
@@ -628,6 +628,7 @@ static void cfq_remove_request(struct re
 {
 	struct cfq_queue *cfqq = RQ_CFQQ(rq);
 
+	BUG_ON(!irqs_disabled());
 	if (cfqq->next_rq == rq)
 		cfqq->next_rq = cfq_find_next_rq(cfqq->cfqd, cfqq, rq);
 
@@ -1810,6 +1811,7 @@ cfq_rq_enqueued(struct cfq_data *cfqd, s
 	/*
 	 * check if this request is a better next-serve candidate)) {
 	 */
+	BUG_ON(!irqs_disabled());
 	cfqq->next_rq = cfq_choose_req(cfqd, cfqq->next_rq, rq);
 	BUG_ON(!cfqq->next_rq);
 
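
For anyone who wants to see the idea outside the kernel, here is a rough userspace sketch of the same check. It is only a toy model under stated assumptions: struct toy_queue and every toy_* name are hypothetical stand-ins invented for illustration, a plain boolean plays the role of "queue_lock held with interrupts disabled", and a counter plays the role of the warning firing. None of this is cfq code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cfq queue; not the kernel's structure. */
struct toy_queue {
	bool lock_held;   /* models "->queue_lock held, irqs disabled" */
	int nr_sorted;    /* models a non-empty ->sort_list */
	int *next_rq;     /* models ->next_rq */
	int violations;   /* how often next_rq was set without the lock */
};

static void toy_lock(struct toy_queue *q)   { q->lock_held = true; }
static void toy_unlock(struct toy_queue *q) { q->lock_held = false; }

/* Every write to next_rq goes through here, so the check cannot be skipped. */
static void toy_set_next_rq(struct toy_queue *q, int *rq)
{
	if (!q->lock_held)
		q->violations++;   /* stands in for the warning firing */
	q->next_rq = rq;
}

/* What the dispatcher relies on: a non-empty sort list implies next_rq. */
static bool toy_invariant_ok(const struct toy_queue *q)
{
	return q->nr_sorted == 0 || q->next_rq != NULL;
}

/* The sort list grows first and next_rq catches up one step later. */
static void toy_enqueue(struct toy_queue *q, int *rq)
{
	q->nr_sorted++;
	/* the invariant is briefly false here when this is the first request */
	toy_set_next_rq(q, q->next_rq ? q->next_rq : rq);
}

/* One locked enqueue, then one from a caller that forgot the lock. */
static int toy_demo(void)
{
	struct toy_queue q = {0};
	int rq_a = 1, rq_b = 2;

	toy_lock(&q);
	toy_enqueue(&q, &rq_a);
	toy_unlock(&q);
	assert(toy_invariant_ok(&q));

	toy_enqueue(&q, &rq_b);   /* unlocked path: gets counted */
	return q.violations;
}
```

Calling toy_demo() from a main() returns the number of unlocked updates it caught. The point of funnelling every assignment through one helper is the same as in the patch: the check sits next to the write, so whichever caller skips the lock is the one that trips it.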