Thank you Matthew, that is excellent.
--Chuck

On Mon, Mar 13, 2017 at 1:58 PM, Matthew Lai <m...@matthewlai.ca> wrote:

> Code:
>
> ---------------------------------------------------------------------
>
> #include <libopencm3/stm32/rcc.h>
> #include <libopencm3/stm32/timer.h>
>
> int main(void)
> {
>         rcc_periph_clock_enable(RCC_TIM3);
>         timer_set_period(TIM3, 1);
>         return 0;
> }
>
> ---------------------------------------------------------------------
>
> Using default compiler options for f4:
>
> ---------------------------------------------------------------------
>
> 080001ac <main>:
>  80001ac:    b508          push    {r3, lr}
>  80001ae:    f640 0001     movw    r0, #2049    ; 0x801
>  80001b2:    f000 f80b     bl    80001cc <rcc_periph_clock_enable>
>  80001b6:    2101          movs    r1, #1
>  80001b8:    4802          ldr    r0, [pc, #8]    ; (80001c4 <main+0x18>)
>  80001ba:    f000 f805     bl    80001c8 <timer_set_period>
>  80001be:    2000          movs    r0, #0
>  80001c0:    bd08          pop    {r3, pc}
>  80001c2:    bf00          nop
>  80001c4:    40000400     .word    0x40000400
>
> 080001c8 <timer_set_period>:
>  80001c8:    62c1          str    r1, [r0, #44]    ; 0x2c
>  80001ca:    4770          bx    lr
>
> 080001cc <rcc_periph_clock_enable>:
>  80001cc:    0943          lsrs    r3, r0, #5
>  80001ce:    f103 4380     add.w    r3, r3, #1073741824    ; 0x40000000
>  80001d2:    f503 330e     add.w    r3, r3, #145408    ; 0x23800
>  80001d6:    f000 021f     and.w    r2, r0, #31
>  80001da:    6819          ldr    r1, [r3, #0]
>  80001dc:    2001          movs    r0, #1
>  80001de:    4090          lsls    r0, r2
>  80001e0:    4308          orrs    r0, r1
>  80001e2:    6018          str    r0, [r3, #0]
>  80001e4:    4770          bx    lr
>
> ---------------------------------------------------------------------
> The store in rcc_periph_clock_enable happens at 80001e2. After returning
> to main(), we take 2 cycles to load arguments to timer_set_period, then
> calls it, and timer_set_period writes to the timer register in the first
> instruction.
>
> If my math is correct, the second write happens on the 5th cycle after the
> first write. This violates the errata sheet constraint by one cycle. For
> APB peripherals, the errata sheet says we need to wait 1 + (AHB/APB)
> cycles, which in this case would be 5 (most people use (AHB/APB = 4).
>
> Now this is with default compiler options on f4, with the lowest APB
> prescaler. With a higher prescaler, it would be even worse. With -flto, it
> would be even worse. On f7, we are very very far from the 8 cycles required
> (keeping in mind that Cortex-M7 can execute 2 instructions per cycle).
>
> So I believe we are hitting it on all series (that require this delay), at
> default compiler settings and clock settings provided by libopencm3.
>
> Matthew
>
>
> On 03/13/2017 08:15 PM, Chuck McManis wrote:
>
> Matthew,
>
> I'm really grateful you brought up the warning you read in the F7 manual.
> Clearly if someone tried to read or write a register and it wasn't ready
> the system would hard fault and that would be bad.
>
> I have reviewed the code and the warning in the datasheet and believe it
> is impossible to hit this situation with the current library. Certainly on
> the F0-F4 series and unlikely on the F7 series.
>
> It sounded like you still had concerns. If you do, it would be super
> helpful if you had an example that demonstrated this problem and especially
> helpful if that example could be recreated on one of the ST demo boards.
>
> --Chuck
>
>
> On Mon, Mar 13, 2017 at 1:01 PM, Matthew Lai <m...@matthewlai.ca> wrote:
>
>> Chuck, I'm not sure why you feel that way, but having disagreements and
>> discussions about technical aspects of a project is usually a good thing.
>>
>> I am also not sure how I am not being helpful by pointing out something
>> on the errata sheets that would be a potential source of bugs, some ways it
>> can happen, and some ways to mitigate it, for different chips, based on the
>> chip manufacturer's recommendations.
>>
>> Use any combination of compiler flags to do what? Prove that there will
>> be a less than 16 instructions delay between clock enable and the first
>> peripheral register write? Does that really need proving? You don't even
>> need to change compiler flags for that.
>>
>> I also have no idea what ST's supplied software has anything to do with
>> any of this.
>>
>> Can we go back to technical discussions now?
>>
>> If so, here is how Chromium OS fixed it:
>> <https://chromium-review.googlesource.com/c/246181/>
>> https://chromium-review.googlesource.com/c/246181/
>>
>> Matthew
>> On 03/13/2017 07:41 PM, Chuck McManis wrote:
>>
>> Matthew, it feels like you are trying to create an argument here.
>>
>> If you want to be helpful, and this is actually a problem given the way
>> libopencm3 is built, then *reproduce the problem* with libopencm3. You can
>> use any combination of compiler flags you want, you just can't change the
>> source inside of libopencm3.
>>
>> *If* you can reproduce the problem then that will show what the window of
>> opportunity is. If you cannot, then you will see that Karl is correct in
>> that it won't occur in libopencm3.
>>
>> If having this issue is such a problem for you, then by all means use the
>> ST Micro supplied software which is written presumably by their experts.
>>
>> ---Chuck
>>
>> On Mon, Mar 13, 2017 at 10:57 AM, Matthew Lai < <m...@matthewlai.ca>
>> m...@matthewlai.ca> wrote:
>>
>>> On 03/13/2017 01:25 PM, Karl Palsson wrote:
>>>
>>>> Matthew Lai <m...@matthewlai.ca> wrote:
>>>>
>>>> I would ignore it.  There's very few places where it will really
>>>>>> matter, and the people who will need it, will be handling it themselves
>>>>>> anyway, not using high level helper functions.
>>>>>>
>>>>> It would mean things like:
>>>>> rcc_periph_clock_enable(RCC_TIM3);
>>>>> timer_set_period(TIM3, 1024);
>>>>>
>>>>> Would violate this, at high prescalar settings (or low
>>>>> prescalar settings with LTO). That is actually the example
>>>>> given here:
>>>>> https://github.com/libopencm3/libopencm3/blob/master/lib/stm
>>>>> 32/common/timer_common_all.c#L70
>>>>>
>>>>> It would be a potential source of very difficult to find bugs.
>>>>>
>>>> There's plenty of edge cases if you really want to go looking.
>>>> I'm not really entirely sure how far to go trying to protect
>>>> people. [1]
>>>>
>>> That is true, but in this case, an user that uses libopencm3 functions
>>> to setup clocks, enables clock for a timer, and uses it right away will be
>>> bitten.
>>>
>>> I'd say we should protect the user in this case, since 1) it's a very
>>> common sequence, 2) most users probably don't know about it (it's not
>>> mentioned at all in reference manuals until f7), 3) resulting bugs will be
>>> very hard to find, and 4) I think there's a pretty easy fix (see below).
>>>
>>> APB1 is usually AHB/4 on f2, f4, and f7 at least. That means 5 cycles
>>> are required for f2 and f4, and 8 cycles on f7. I wouldn't want to rely on
>>> the compiler to happen to generate that much delay, especially on f7 which
>>> is dual-issue (8 cycles can mean 16 instructions).
>>>
>>>> Since I imagine rcc_periph_clock_enable() to not be
>>>>> performance-critical in most applications, wouldn't adding the
>>>>> wait states be better default behaviour, and people who really
>>>>> need to enable clocks very quickly can go to registers
>>>>> directly?
>>>>>
>>>> It's more that I'm not sure you can do it neatly, and be robust
>>>> enough to be worth trying. You'll need to interpret what bus a
>>>> peripheral is on, as well as what speed it's currently running
>>>> at. Doing this using the rcc_axx_speed globals is... ok, but
>>>> they're not set if people didn't use clock helpers to set them
>>>> up.
>>>>
>>>> Adding your peripheral speed as a parameter is a gross user API.
>>>> For the vast majority of use cases, this is simply never going to
>>>> be a problem. Doing any sort of meaningful cycle delay is going
>>>> to just add code that most people will never need.
>>>>
>>>> In my opinion at least :)
>>>>
>>> According to the errata sheets, it's very easy on pre-f7 chips. All
>>> that's required is a DSB barrier instruction. We don't actually have to do
>>> any cycle counting, or determining which bus the peripheral is on.
>>>
>>> Unfortunately that option is not mentioned for f7, which requires 2x
>>> peripheral clock cycles (could be because of the new AXIM bus). However,
>>> there's still no need to figure out which bus a peripheral is on - we can
>>> always just do enough delay for APB1. There's no harm in delaying for
>>> longer than necessary, unless someone needs to enable clocks really
>>> quickly. Figuring out APB1 frequency is not easy. I would just use the
>>> globals and add a note in the documentation that users are on their own if
>>> they are setting up their own clocks.
>>>
>>> Matthew
>>>
>>
>>
>>
>
>
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
libopencm3-devel mailing list
libopencm3-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libopencm3-devel

Reply via email to