Why are highly busy zIIPs worse than highly busy CPs?

Peter Hunkeler Thu, 07 Jun 2018 00:27:02 -0700

Here and there I read statements about zIIP utilization, such as:


- "You should not utilize one zIIP more than 30%, two zIIPs more than 60%..."
- "A task may become delayed for up to 3.2 ms (actually ZIIPAWMT) before the 
busy zIIP asks for help from a CP".



For this discussion, let's assume equal-speed CPs and zIIPs, a reasonable 
CP-to-zIIP ratio, and more than one processor of each kind.


It has long been a strength of IBM Z (and all its predecessors) that the CPs 
in an LPAR can be utilized well above 90% without major problems arising. As I 
understand it, this has changed lately, but some 85% (?) should still be fine.


Now, all work running on zIIPs was once work running on CPs (and still is if 
there are no zIIPs). So the work is no different (apart from much of it running 
under an SRB instead of a TCB), and the response time requirement is no 
different. Right?


If so, how come busy zIIPs are said to be more of a problem than busy CPs? If 
the work can accept some queueing when run on CPs, why not when run on zIIPs? 
Queueing theory should apply equally to both.
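
To put rough numbers on the queueing argument, here is a minimal sketch using 
the textbook M/M/c (Erlang C) model. The model itself is an assumption on my 
part (Poisson arrivals, exponential service times, one shared queue) and says 
nothing about how z/OS actually dispatches work; the function names are made up 
for illustration. It only shows that the same formula applies to any pool of 
identical processors, and that the result depends on the number of processors 
in the pool, not on what they are called.

```python
from math import factorial

def erlang_c(servers, utilization):
    """Probability that arriving work has to queue in an M/M/c system."""
    offered = servers * utilization  # offered load in Erlangs
    top = offered**servers / factorial(servers) / (1.0 - utilization)
    below = sum(offered**k / factorial(k) for k in range(servers))
    return top / (below + top)

def mean_wait(servers, utilization):
    """Mean queueing delay, expressed in multiples of the mean service time."""
    return erlang_c(servers, utilization) / (servers * (1.0 - utilization))

# Same 50% utilization, different pool sizes (pool sizes are arbitrary examples).
for servers in (1, 2, 8):
    print(servers, round(mean_wait(servers, 0.5), 3))
```

With one processor at 50% busy, the mean wait equals one mean service time; 
with two it drops to a third of that.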



When a processor is 50% busy, then 50% of the time there is at least one ready 
task, the one executing. Maybe there are some more waiting on the work queue. 
But that 50% figure says nothing about the delay of the tasks on the work queue.


In a simplified case, assume 5 tasks of equal priority, each one quickly (say, 
after 0.5 ms) reaching the point where it has to give up the processor for a 
very short period of time before being requeued on the work queue. They all 
work that way constantly for 30 seconds in a row, then become undispatchable 
for the remaining 30 seconds of that 50% busy minute. During the first 30 
seconds, the zIIP is 100% busy, and after 3.2 ms (ZIIPAWMT), the zIIP will ask 
a CP for help.


None of the tasks has been delayed by 3.2 ms, although the zIIP recognized that 
its work queue had not become empty for 3.2 ms and asked for help. On the 
contrary, the work has gotten better service because two processors are now 
serving the single work queue. (Again, for simplicity, I am not taking 
priorities into account here.)
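
The arithmetic behind the 0.5 ms case can be sketched as follows. This assumes 
strict round-robin on a single zIIP, a pause of zero length between giving up 
the processor and being requeued, and no dispatching overhead; the function 
name is hypothetical.

```python
ZIIPAWMT_US = 3200  # the 3.2 ms threshold quoted above, in microseconds

def round_robin_wait_us(n_tasks, slice_us):
    """Steady-state wait of one task on one processor:
    every other task gets one slice before this task runs again."""
    return (n_tasks - 1) * slice_us

wait_us = round_robin_wait_us(n_tasks=5, slice_us=500)
print(wait_us, wait_us < ZIIPAWMT_US)  # 2000 True
```

So each task waits about 2 ms per round, below the threshold, even though the 
work queue itself is never empty during the busy 30 seconds.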


Same case, but the tasks work for 1 ms each time. Now, as long as the zIIP has 
not asked for help, the last task on the work queue always waits more than 
3.2 ms before it is redispatched. But the zIIP will ask for help after 3.2 ms, 
and the delay for the tasks will shrink.
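
A small simulation of the 1 ms case, comparing one processor with two serving 
the same queue. It is only a sketch under simplifying assumptions: identical 
tasks, FIFO requeueing with a zero-length pause, no priorities, and the helping 
CP treated as a second equal processor from the start rather than joining 
after ZIIPAWMT has expired. The function name is hypothetical.

```python
from collections import deque

def max_wait_us(n_tasks, n_procs, slice_us, dispatches=1000):
    """Longest time any task spends on the work queue before being dispatched."""
    ready = deque([0] * n_tasks)  # time at which each queued task became ready
    free_at = [0] * n_procs       # time at which each processor is next free
    worst = 0
    for _ in range(dispatches):
        # The processor that frees up first takes the task at the queue head.
        proc = min(range(n_procs), key=free_at.__getitem__)
        became_ready = ready.popleft()
        start = max(free_at[proc], became_ready)
        worst = max(worst, start - became_ready)
        # The task runs one slice, then is requeued immediately.
        free_at[proc] = start + slice_us
        ready.append(start + slice_us)
    return worst

print(max_wait_us(5, 1, 1000))  # 4000: zIIP alone, above the 3200 us threshold
print(max_wait_us(5, 2, 1000))  # 2000: zIIP plus one helping CP
```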


Isn't this a better situation for zIIP work than for non-zIIP work? In the 
same scenario on CPs, there is no one to help.


Any thoughts?


--
Peter Hunkeler

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
