Hi Bill, On Fri, May 27, 2011 at 11:26 AM, J. William Campbell <jwilliamcampb...@comcast.net> wrote: > On 5/26/2011 4:28 PM, Graeme Russ wrote:
>> >> Why mess around with bit shifting (which you would then have to cludge >> into >> your platform code) when carting around a 64-bit value is relatively >> cheap, >> transparent and poratble (all all supported up-to-date tool chains) >> > I really STRONGLY disagree with this statement. If you actually needed 64 > bit variables, fine use them. But as I have already shown, you do not need > them in general. We are computing a 32 bit result. There is some entropy > argument that says you shouldn't need 64 bits to do so. Another way to look > at it is that converting the top 32 bit word and the bottom 32 bit word to > ms separately can be easier than doing them both together at once. However, > as we will see below, I do agree we need two 32 bit words to make this > process go smoothly. I just don't agree that they should/will constitute a > 64 bit binary word. See below. >>>> >>>> - void timer_isr() >>>> - Optional (particularly if tick counter rollover period is >>>> exceptionally log which is usually the case for native 64-bit tick >>>> counters) >>>> - Simply calls sync_timebase() >>>> - For platforms without any tick counter, this can implement one >>>> (but accuracy will be harmed due to usage of disable_interrupts() >>>> and >>>> enable_interrupts() in U-Boot >>>> >>>> So to get the new API up and running, only two functions are mandatory: >>>> >>>> get_ticks() which reads the hardware tick counter and deals with any >>>> 'funny >>>> stuff' including rollovers, short timers (12-bit for example), composite >>>> counters (16-bit 0-999 microsecond + 16-bit millisecond) and maintains a >>>> 'clean' 64-bit tick counter which rolls over from all 1's to all 0's. >>>> The >>> >>> I think it is the task of get_ticks to return the hardware tick counter >>> as >>> an increasing counter, period. The counter may wrap at some final count >>> that is not all ones. That is ok. Sync_timebase deals with the rollovers >>> if >> >> The hardware tick counter may, the 64-bit software tick counter maintained >> by get_ticks() may not > >>> necessary. get_ticks is very lightweight. get_ticks should deal with >>> decrementing counters by returning the complement of the counter. The >>> sc520 >>> case is a bit more complex if you intend to use the 0-999 and 16 bit >>> millisec registers, in that you do need to add them to the previous value >>> to >> >> As I mentioned in another post, this is a problem for the platform >> maintainer and is abstracted away throught the platform specific >> implementation of get_ticks() >> >>> make an increasing counter. Sync_timebase "likes" short counters in that >>> they are easy to convert to millisec and tick remainders. >> >> The compiler should handle using 64-bit rather than 32-bit transparently > > True enough. But you don't need 64 bit variables at this point two 32 bit > ones work just fine, in fact better in most cases. Remember, we are not dealing with a high performance OS here. The primary goal is portability - Performance optimisations (which do not break portability) can be performed later >>>> >>>> 64-bit tick counter does not need to be reset to zero ever (even on >>>> startup >>>> - sync_timebase tacks care of all the details) >>> >>> True, but sync_timebase does have to be initialized (as does the timer >>> itself in most cases, so this is not a restriction). >> >> This can be done in timer_init() via a call to sync_timebase() after the >> timer has been configured. This should bring everything into line >> >>>> ticks_per_millisecond() simply return the number of ticks in a >>>> millisecond >>>> - This may as simple as: >>>> >>>> inline u64 ticks_per_millisecond(void) >>>> { >>>> return CONFIG_SYS_TICK_PER_MS; >>>> } >>>> >>>> But it may be trickier if you have a programmable tick frequency >>> >>> You will have to call the routine that initializes sync_timebase. This >>> routine should have a name, like void init_sync_timebase(void)? >>>> >>>> The optional timer ISR is required if the tick counter has a short roll >>>> over duration (short is up to you - 1 second is short, 1 hour might be, >>>> 1 >>>> century is not) >>>> >>>> Regards, >>>> >>>> Graeme >>>> >>> It is probably true that sync_timebase should have a parameter flag. The >>> reason is that if the timer isr is called only when the timer wraps, then >>> the calls to sync_timebase may be slightly more than a full timer period >>> apart. (due to interrupt latency). Therefore, when the timer difference >>> is >>> computed, if the current update is due to a wrap AND the previous update >>> is >>> due to a wrap, the difference should be approximately 1 wrap. If it comes >>> up >>> real short, you must add a wrap. This isn't necessary if the routine is >>> called more often than once per wrap. Also, when sync_timebase is called >>> in >> >> timer_isr() MUST be called more often than the rollover period of the >> underlying hardware tick counter - This is a requirement > > The equality case can be made to work. If the extension of the counter is > done in the interrupt routine, not in get_ticks, get_ticks just needs to > read the msb of the counter, read the lsb of the counter, then verify that > the msb has not changed. If you have interrupts that work, that is the > easiest way to go. If the lsb of the counter has represents 1 ms or less, > you can just drop it (equivalent to the what the PPC does now). If the > interrupt is slower than that, you must use at least part of the LSB. If you > don't have interrupts, the point is moot. So now we have a complicated ISR and a complicated get_ticks() and you have to change get_ticks() when you decide to implement an ISR >>> >>> get_timer, you must first disable interrupts and then enable interrupts >>> after sync_timebase returns >> >> Why? - get_ticks() provides an atomic read of the hardware tick counter. >> If get_ticks() needs to disable and enable interrupts to do so, that is a >> problem for the platform maintainer >> >> Admittedly, sync_timebase() will not be re-entrant, but how will it ever >> be called concurrently? - Ah, I see - a call to get_timer() interrupted >> by the timer ISR :) > > Yes, that is the problem. I have come to the view that two 32 bit words are > the best approach. Note that the lsb may actually not fill the full 32 bits. Urghhh > The top 32 bits are the rollover count and the bottom 32 bits are the > current counter. If the counter is a full 32 bits, so much the better. Ahhhhh - Lets keep it that way > Again, one could put them together inside the interrupt routine , but it is > easier to check for a changed value if you don't do this. Otherwise, you > have to check both words. It also makes the isr faster. It is just an As I said before - Simple First, Fast Later > increment of the overflow counter, like the PPC is now. I happen to think it > is easier to convert the two 32 bit words to milliseconds one at a time, but > if you feel you must use 64 bit words, that is fine too. Just remember, the > counter does not always fill the entire bottom word. Urghhh > In cases where there are no interrupts, get_ticks has to detect that the > timer has "backed up" and increment the overflow counter itself, unless the > counter is 64 bits to begin with and overflow is impossible anyway. > get_ticks should NOT try to detect overflows if interrupts are available. If > it got both words before an interrupt happened, the answer is correct. If it > got an interrupt in between fetching the words, the event will be detected > and the value re-fetched. All sync_timebase would do now is convert the > returned value to milliseconds. > > So, if you have a 64 bit hardware counter, get_ticks reads and returns it. > Else if you have interrupts, get_ticks reads the overflow counter into the > msb. Next, it reads the hardware timer into the lsb. If the counter is a > down counter, the lsb is = to the counter max - the lsb. The msb is then > checked to make sure it hasn't changed, if it has, repeat the process. All > the interrupt routine does is increase the overflow count. > If you don't have interrupts get_ticks reads the hardware counter into the > lsb. If the counter is a down counter, the lsb is = to the counter max - the > lsb. If the lsb is less than it was in the previous call to get ticks, the > overflow counter is increased. get_ticks then loads the overflow counter > into the msb. > > sync_timebase converts the msb and lsb into millisec. It may do this by a 64 > bit divide, or some shifting to align the lsb with then msb and the a 64 bit > divide, or a bunch of 32 bit fractional multiplies, or any such approach > that works. > > How does that sound? The fact that you have described three different implementations of get_ticks() with two of these differentiated between whether you have interrupts or not immediately suggests this solution is inherently more complex and less maintainable. Lets say you have a platform with a 32-bit tick counter running at a reasonably long rollover time so you decide not to use interrupts. Then you create a new platform with the same tick counter, but it runs much faster and you realise you need to implement the interrupt routine to make get_timer() work for long enough periods - Fine, you add an ISR for the new platform that calls sync_timebase - No other changes are required. The last thing we want is for the 64-bit tick counter to be conceptually different across platforms I just realised - the ISR _does not need to call the sync_timebase at all_ The ISR only needs to call get_ticks(), so it will be fast anyway Regards, Graeme _______________________________________________ U-Boot mailing list U-Boot@lists.denx.de http://lists.denx.de/mailman/listinfo/u-boot