ppisa commented on issue #15319:
URL: https://github.com/apache/nuttx/issues/15319#issuecomment-2871153779

   @tmedicci I have found a little more time to look at actual NuttX and 
Expressif HAL code to analyze its state. You are right that there is attempt to 
count beyond unusable low limits
   ```
   #define PCNT_HIGH_LIMIT 1000
   #define PCNT_LOW_LIMIT  -1000
   ```
   defined in 
https://github.com/apache/nuttx/blob/master/boards/risc-v/esp32c6/common/src/esp_board_pcnt.c#L44
   
   There is code adjusting `unit->accum_value ` in 
https://github.com/apache/nuttx/blob/master/arch/risc-v/src/common/espressif/esp_pcnt.c#L397.
   
   From my analysis, there is a problem with race condition, short window in 
which the read, when value read from `esp_position()` in 
https://github.com/apache/nuttx/blob/master/arch/risc-v/src/common/espressif/esp_qencoder.c#L291can
 be off by `unit->config.low_limit` or `unit->config.high_limit`. The timing 
which leds to that misread is when `esp_position()` is called, it disables 
interrupt, then it calls IOCTL for `esp_pcnt_unit_get_count()` 
https://github.com/apache/nuttx/blob/master/arch/risc-v/src/common/espressif/esp_pcnt.c#L633C12-L633C35
 and the event on phase A or phase B arrives during that windows. Interrupt is 
disabled so `esp_pcnt_isr_default()` is not processed yet but if overflow in 
the counter occurs then the value is off by `unit->config.low_limit` or 
`unit->config.high_limit`. The same appears even if the 
`esp_pcnt_unit_get_count()` is called from PCNT driver but because interrupt 
disable covers in this case the shorter code region then probability is much 
lower. I 
 have some some thoughts how that can be solved but to do it right it is not 
easy. You should test if similar problems do not affect Espressif IDF and 
Zephyr code. It could lead to really big problems in some a little more serious 
PMSM control etc...
   
   The practical side, @matiamic has found some time and tested state and found 
that for him mainline does not solve overflow at all, position is limited for 
some QENC channels to -1000 to +1000 range, for other does not work at all and 
attempt to enable some of them leads to the build errors.
   
   The code in Martin @Vajnar's fix-qe branch works for him, but again with 
limited range
   
   https://github.com/Vajnar/nuttx/commits/fix-qe/
   
   As the quick fix, I suggest to use fix-qe branch and extend range to -0x4000 
to +0x4000, then the result is 14 bit binary modulo arithmetic which allows to 
extend to 32-bits for the changes smaller than -0x2000 to +0x2000 between reads 
calls by next simple code added to motion control use application.
   
   The monition controller state variable
   ```
     uint32_t pos_act = 0;
   ```
   and actual read and extension
   ```
     int32_t pos_inc = read value from even actual problematic QENC driver
   
     pos_inc -= pos_act;
   
     pos_inc <<= 32 - 14;
     pos_inc >>= 32 - 14; // bit 13 se rozšíří do 14 až 31.
   
     pos_act += pos_inc;
   ```
   In long term, I suggest to think about the actual state, for sure small 
range like -1000, +1000 even if the problems are fixed, means invoking 
interrupts to often, we use incremental counters even with 10k slots, 40k 
counts bet revolution in industrial applications, this would overload chips 
like ESP32 by interrupts. The range which is not power of two makes really 
complex to solve natural extensions in the software. I consider as the best to 
leave QENC without any limits, then rules for 16-bit overflow arithmetic apply 
and it is easy to extend if the read is called at least two times per overflow. 
If extension is not guaranteed by reads done often enough then I can imagine 
some extension by interrupts.
   
   In general, ignoring my initial report about suspicious changes has allowed 
multiple levels of breakage, even build errors reported for some configs, 
misread time widow, excessive time blocking the scheduler and wasting time of 
users  etc. which has been in mainline for almost one half of the year.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to