On 07/25/2016 07:48 AM, Sathya Perla wrote:
-----Original Message-----
From: Guilherme G. Piccoli [mailto:gpicc...@linux.vnet.ibm.com]

Temperature values on the be2net driver are made available to userspace via
the hwmon abstraction, so tools like lm-sensors can present them to the user.
The driver provides hwmon structures for each adapter's function.
Nevertheless, the temperature information comes from fw queries performed by
be_worker() with some frequency, and this procedure is called with a single
function as argument; this means that the temperature value is updated only
in the specific function that was passed to be_worker().

This can lead to inconsistency in the temperature reported by a function, or
in a worse scenario, some functions might be unable to provide temperature
info to userspace if they weren't fed with this information from fw in a
be_worker() run.

Hi, I'm wondering: if you are OK with the temperature value being 128s old
(2/2 patch), then why is it a problem if two different functions report a
temperature value that was queried a few seconds apart?
Also, you'll not have a scenario where the FW cmd succeeds for one function
and fails for other functions. It's a common FW for the entire adapter.


This patch changes the way temperature is set in the be2net driver. Any time
the fw query is performed, it will set the temperature value for all
functions of the adapter, instead of only setting the temperature of the
function passed to be_worker().
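
The difference between the two behaviours can be sketched in plain userspace C. This is only a model, not the driver code: struct adapter_model, struct func_state, TEMP_INVALID, MAX_FUNCS and the update_*() helpers are all hypothetical names chosen for illustration.

```c
#include <stddef.h>

#define TEMP_INVALID (-1)   /* hypothetical "no reading yet" marker */
#define MAX_FUNCS 4         /* hypothetical number of functions per adapter */

/* Hypothetical per-function state; the real driver keeps this elsewhere. */
struct func_state {
    int temp_c;
};

/* Hypothetical model of one adapter with several PCI functions. */
struct adapter_model {
    struct func_state funcs[MAX_FUNCS];
    size_t nfuncs;
};

/* Mark every function as having no temperature reading yet. */
void adapter_init(struct adapter_model *a, size_t nfuncs)
{
    a->nfuncs = nfuncs > MAX_FUNCS ? MAX_FUNCS : nfuncs;
    for (size_t i = 0; i < a->nfuncs; i++)
        a->funcs[i].temp_c = TEMP_INVALID;
}

/* Behaviour described as the problem: only one function is updated. */
void update_single(struct adapter_model *a, size_t idx, int temp_c)
{
    if (idx < a->nfuncs)
        a->funcs[idx].temp_c = temp_c;
}

/* Behaviour the patch describes: one fw reading reaches all functions. */
void update_all(struct adapter_model *a, int temp_c)
{
    for (size_t i = 0; i < a->nfuncs; i++)
        a->funcs[i].temp_c = temp_c;
}
```

With update_single() called repeatedly for the same index, the remaining entries stay at TEMP_INVALID, which is the symptom discussed in this thread.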
If the possible inconsistency across functions is indeed a problem, then a
simpler solution would be to issue the FW cmd synchronously when the sysfs
attr is read, i.e., in the be_hwmon_show_temp() routine itself.
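
A rough userspace sketch of that "query on read" idea follows. It is not the be2net code: fw_query_fn, show_temp_sync() and the two stub queries are hypothetical stand-ins for the fw command and the sysfs show routine. The only fact relied on is that hwmon temperature attributes are reported in millidegrees Celsius.

```c
#include <stdio.h>

/* Hypothetical fw query: returns 0 and stores degrees Celsius on success. */
typedef int (*fw_query_fn)(int *temp_c);

/*
 * Read path that issues the fw query on every read and formats the result
 * in millidegrees Celsius, the unit hwmon uses for temperature attributes.
 * Returns the number of characters written, or -1 if the query failed.
 */
int show_temp_sync(fw_query_fn query, char *buf, size_t len)
{
    int temp_c;

    if (query(&temp_c) != 0)
        return -1;
    return snprintf(buf, len, "%d\n", temp_c * 1000);
}

/* Stub standing in for a successful fw command (assumed value). */
int fake_fw_query(int *temp_c)
{
    *temp_c = 47;
    return 0;
}

/* Stub standing in for a failed fw command. */
int failing_fw_query(int *temp_c)
{
    (void)temp_c;
    return -1;
}
```

Every read costs one fw command in this model, so the latency and load of the read path depend directly on how fast the fw command completes.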


Hi Sathya, thanks very much for your quick reply. I agree with you that a 1 or 2 second inconsistency wouldn't do any harm, but the main problem we're seeing is that be_worker() is called with a single function as its parameter. In our case, the last function is passed as the argument to be_worker() multiple times in a row, so its temperature gets updated while the other functions' temperatures stay set as invalid.

Regarding running the temperature update in be_hwmon_show_temp(), that was an idea too, but I was afraid of delaying this output too much. Imagine some userspace tool reads the hwmon attributes for all functions at almost the "same time": supposing the fw command can't run in parallel, the "last" read would need to wait for 4 fw commands to complete before showing its output. Besides, in a worse scenario, some "not-friendly" tool might issue lots of hwmon reads per second and thus lots of fw commands, which does not seem like a good idea. Of course, we can avoid this last case by implementing a counter or timer in be_hwmon_show_temp() to cap the number of fw cmds in a time frame.
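
To make the timer idea concrete, here is a small userspace sketch of a read path that reuses a cached value until it is older than a chosen age. All names (struct temp_cache, cached_temp_read(), stub_fw_query(), max_age_ms) are hypothetical, and the caller passes in the current time so the logic can be exercised without a real clock.

```c
#include <stdint.h>

/* Hypothetical fw query: returns 0 and stores degrees Celsius on success. */
typedef int (*fw_query_fn)(int *temp_c);

/* Hypothetical cache shared by all readers of the temperature. */
struct temp_cache {
    int      temp_c;    /* last value obtained from fw */
    uint64_t stamp_ms;  /* time of the last successful fw query */
    int      valid;     /* 0 until the first successful query */
    unsigned fw_calls;  /* number of fw queries issued, for illustration */
};

/*
 * Return the cached temperature, issuing a new fw query only when there is
 * no cached value or the cached value is at least max_age_ms old.
 * Returns 0 on success, -1 if a needed fw query failed.
 */
int cached_temp_read(struct temp_cache *c, fw_query_fn query,
                     uint64_t now_ms, uint64_t max_age_ms, int *out)
{
    if (!c->valid || now_ms - c->stamp_ms >= max_age_ms) {
        int t;

        if (query(&t) != 0)
            return -1;
        c->temp_c = t;
        c->stamp_ms = now_ms;
        c->valid = 1;
        c->fw_calls++;
    }
    *out = c->temp_c;
    return 0;
}

/* Stub standing in for the fw command (assumed value). */
int stub_fw_query(int *temp_c)
{
    *temp_c = 52;
    return 0;
}
```

However many reads arrive inside one max_age_ms window, this model issues at most one fw command for them, which bounds the load a misbehaving tool could generate.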

I'd appreciate your advice on how you'd prefer to address this issue.
Thanks,


Guilherme


thanks!
-Sathya

