[PATCH 1/2] regulator: 88pg86x: add DT bindings document
The only device-specific node names are "buck1" and "buck2" for the two regulators present on the device. Sleep mode GPIO and per-regulator GPIO enable pins are not exposed (the driver does not support them either). Signed-off-by: Alexander Monakov --- .../devicetree/bindings/regulator/88pg86x.txt | 22 ++ 1 file changed, 22 insertions(+) create mode 100644 Documentation/devicetree/bindings/regulator/88pg86x.txt diff --git a/Documentation/devicetree/bindings/regulator/88pg86x.txt b/Documentation/devicetree/bindings/regulator/88pg86x.txt new file mode 100644 index ..13b7f49a2ea8 --- /dev/null +++ b/Documentation/devicetree/bindings/regulator/88pg86x.txt @@ -0,0 +1,22 @@ +Marvell 88PG867/88PG868 voltage regulators + +Required properties: +- compatible: one of "marvell,88pg867", "marvell,88pg868"; +- reg: I2C slave address. + +Optional subnodes for regulators: "buck1", "buck2", using common regulator +bindings given in . + +Example: + + pg868@19 { + compatible = "marvell,88pg868"; + reg = <0x19>; + + vcpu: buck1 { + regulator-boot-on; + regulator-always-on; + regulator-min-microvolt = <100>; + regulator-max-microvolt = <135>; + }; + }; -- 2.11.0
[PATCH 2/2] regulator: 88pg86x: new i2c dual regulator chip
This chip is found on Google Chromecast and Valve Steam Link devices. It provides two DC regulators with I2C voltage control, separate GPIO enable pins and one sleep mode pin. This driver does not expose GPIO functionality, but supports voltage control in 1.0-2.2V range, based on I2C register information given in Chromecast kernel driver by Jisheng Zhang. Cc: Jisheng Zhang Signed-off-by: Alexander Monakov --- Hello, Some high-level information on the chip is available in Marvell's PDF titled "DC/DC Power Regulators Product Portfolio" [1]. Chromecast's drivers show two linear ranges: 1.0-1.6V at 25mV steps and 1.6-2.2V at 50mV steps. The brochure says that programmable range is 0.6-3.3V, though, so this patch doesn't support the full voltage range. I've also opted not to support GPIO sleep and enable pins over concerns that it would be a dead weight. I'd like to use this for controlling the voltage regulator on Steam Link device. Thanks. Alexander [1] https://www.marvell.com/power-management/assets/Marvell-DC-DC-Power-Regulators-Product-Portfolio.pdf drivers/regulator/88pg86x.c | 114 drivers/regulator/Kconfig | 9 drivers/regulator/Makefile | 1 + 3 files changed, 124 insertions(+) create mode 100644 drivers/regulator/88pg86x.c diff --git a/drivers/regulator/88pg86x.c b/drivers/regulator/88pg86x.c new file mode 100644 index ..d5ef55c81185 --- /dev/null +++ b/drivers/regulator/88pg86x.c @@ -0,0 +1,114 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include + +static const struct regulator_ops pg86x_ops = { + .set_voltage_sel = regulator_set_voltage_sel_regmap, + .get_voltage_sel = regulator_get_voltage_sel_regmap, + .list_voltage = regulator_list_voltage_linear_range, +}; + +static const struct regulator_linear_range pg86x_buck1_ranges[] = { + REGULATOR_LINEAR_RANGE( 0, 0, 10, 0), + REGULATOR_LINEAR_RANGE(100, 11, 34, 25000), + REGULATOR_LINEAR_RANGE(160, 35, 47, 5), +}; + +static const struct regulator_linear_range pg86x_buck2_ranges[] = { + REGULATOR_LINEAR_RANGE( 0, 0, 15, 0), + REGULATOR_LINEAR_RANGE(100, 16, 39, 25000), + REGULATOR_LINEAR_RANGE(160, 40, 52, 5), +}; + +static const struct regulator_desc pg86x_regulators[] = { + { + .id = 0, + .type = REGULATOR_VOLTAGE, + .name = "buck1", + .of_match = of_match_ptr("buck1"), + .n_voltages = 11 + 24 + 13, + .linear_ranges = pg86x_buck1_ranges, + .n_linear_ranges = 3, + .vsel_reg = 0x24, + .vsel_mask = 0xff, + .ops = &pg86x_ops, + .owner = THIS_MODULE + }, + { + .id = 1, + .type = REGULATOR_VOLTAGE, + .name = "buck2", + .of_match = of_match_ptr("buck2"), + .n_voltages = 16 + 24 + 13, + .linear_ranges = pg86x_buck2_ranges, + .n_linear_ranges = 3, + .vsel_reg = 0x13, + .vsel_mask = 0xff, + .ops = &pg86x_ops, + .owner = THIS_MODULE + }, +}; + +static const struct regmap_config pg86x_regmap = { + .reg_bits = 8, + .val_bits = 8, +}; + +static int pg86x_i2c_probe(struct i2c_client *i2c) +{ + int id, ret; + struct regulator_config config = {.dev = &i2c->dev}; + struct regmap *regmap = devm_regmap_init_i2c(i2c, &pg86x_regmap); + + if (IS_ERR(regmap)) { + ret = PTR_ERR(regmap); + dev_err(&i2c->dev, "regmap init failed: %d\n", ret); + return ret; + } + + for (id = 0; id < ARRAY_SIZE(pg86x_regulators); id++) { + struct regulator_dev *rdev; + rdev = devm_regulator_register(&i2c->dev, + &pg86x_regulators[id], + &config); + if (IS_ERR(rdev)) { + ret = PTR_ERR(rdev); + dev_err(&i2c->dev, "failed to register %s: %d\n", + pg86x_regulators[id].name, ret); + return ret; + } + } + return 0; +} + +static const struct of_device_id pg86x_dt_ids [] = { + { .compatible = "marvell,88pg867" }, + { .compatible = "marvell,88pg868" }, + { } +}; +MODULE_DEVICE_TABLE(of, pg86x_dt_ids); + +static const struct i2c_device_id pg86x_i2c_id[] = { + { "88pg867", }, + { "88pg868", }, + { } +}; +MODULE_DEVICE_TABLE(i2c, pg86x_i2c_id); + +static struct i2c_driver pg86x_regulator_driver =
Re: [PATCH] iommu/amd: Fix event counter availability check
Hi, Adding Shuah Khan to Cc: I've noticed you've seen this issue on Ryzen 2400GE; can you have a look at the patch? Would be nice to know if it fixes the problem for you too. Thanks. Alexander On Fri, 29 May 2020, Alexander Monakov wrote: > The driver performs an extra check if the IOMMU's capabilities advertise > presence of performance counters: it verifies that counters are writable > by writing a hard-coded value to a counter and testing that reading that > counter gives back the same value. > > Unfortunately it does so quite early, even before pci_enable_device is > called for the IOMMU, i.e. when accessing its MMIO space is not > guaranteed to work. On Ryzen 4500U CPU, this actually breaks the test: > the driver assumes the counters are not writable, and disables the > functionality. > > Moving init_iommu_perf_ctr just after iommu_flush_all_caches resolves > the issue. This is the earliest point in amd_iommu_init_pci where the > call succeeds on my laptop. > > Signed-off-by: Alexander Monakov > Cc: Joerg Roedel > Cc: Suravee Suthikulpanit > Cc: io...@lists.linux-foundation.org > --- > > PS. I'm seeing another hiccup with IOMMU probing on my system: > pci :00:00.2: can't derive routing for PCI INT A > pci :00:00.2: PCI INT A: not connected > > Hopefully I can figure it out, but I'd appreciate hints. > > drivers/iommu/amd_iommu_init.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c > index 5b81fd16f5fa..1b7ec6b6a282 100644 > --- a/drivers/iommu/amd_iommu_init.c > +++ b/drivers/iommu/amd_iommu_init.c > @@ -1788,8 +1788,6 @@ static int __init iommu_init_pci(struct amd_iommu > *iommu) > if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE)) > amd_iommu_np_cache = true; > > - init_iommu_perf_ctr(iommu); > - > if (is_rd890_iommu(iommu->dev)) { > int i, j; > > @@ -1891,8 +1889,10 @@ static int __init amd_iommu_init_pci(void) > > init_device_table_dma(); > > - for_each_iommu(iommu) > + for_each_iommu(iommu) { > iommu_flush_all_caches(iommu); > + init_iommu_perf_ctr(iommu); > + } > > if (!ret) > print_iommu_info(); > > base-commit: 75caf310d16cc5e2f851c048cd597f5437013368 >
Re: [PATCH] iommu/amd: Fix event counter availability check
On Mon, 1 Jun 2020, Suravee Suthikulpanit wrote: > > Moving init_iommu_perf_ctr just after iommu_flush_all_caches resolves > > the issue. This is the earliest point in amd_iommu_init_pci where the > > call succeeds on my laptop. > > According to your description, it should just need to be anywhere after the > pci_enable_device() is called for the IOMMU device, isn't it? So, on your > system, what if we just move the init_iommu_perf_ctr() here: No, this doesn't work, as I already said in the paragraph you are responding to. See my last sentence in the quoted part. So the implication is init_device_table_dma together with subsequent cache flush is also setting up something that is necessary for counters to be writable. Alexander
Re: [PATCH] iommu/amd: Fix event counter availability check
> Instead of blindly moving the code around to a spot that would just work, > I am trying to understand what might be required here. In this case, > the init_device_table_dma()should not be needed. I suspect it's the IOMMU > invalidate all command that's also needed here. > > I'm also checking with the HW and BIOS team. Meanwhile, could you please give > the following change a try: Yes, this also fixes the problem for me. Alexander > > diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c > index 5b81fd16f5fa..b07458cc1b0b 100644 > --- a/drivers/iommu/amd_iommu_init.c > +++ b/drivers/iommu/amd_iommu_init.c > @@ -1875,6 +1875,8 @@ static int __init amd_iommu_init_pci(void) > ret = iommu_init_pci(iommu); > if (ret) > break; > + iommu_flush_all_caches(iommu); > + init_iommu_perf_ctr(iommu); > } > > /*
Re: [PATCH v2] drm/amd/display: use correct scale for actual_brightness
On Tue, 4 Aug 2020, Alexander Monakov wrote: > Documentation for sysfs backlight level interface requires that > values in both 'brightness' and 'actual_brightness' files are > interpreted to be in range from 0 to the value given in the > 'max_brightness' file. > > With amdgpu, max_brightness gives 255, and values written by the user > into 'brightness' are internally rescaled to a wider range. However, > reading from 'actual_brightness' gives the raw register value without > inverse rescaling. This causes issues for various userspace tools such > as PowerTop and systemd that expect the value to be in the correct > range. > > Introduce a helper to retrieve internal backlight range. Use it to > reimplement 'convert_brightness' as 'convert_brightness_from_user' and > introduce 'convert_brightness_to_user'. > > Bug: https://bugzilla.kernel.org/show_bug.cgi?id=203905 > Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1242 > Cc: Alex Deucher > Cc: Nicholas Kazlauskas > Signed-off-by: Alexander Monakov > --- > v2: split convert_brightness to &_from_user and &_to_user (Nicholas) Nicholas, does this implement the kind of split you had in mind? > .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 81 +-- > 1 file changed, 40 insertions(+), 41 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > index 710edc70e37e..b60a763f3f95 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > @@ -2881,51 +2881,50 @@ static int set_backlight_via_aux(struct dc_link > *link, uint32_t brightness) > return rc ? 0 : 1; > } > > -static u32 convert_brightness(const struct amdgpu_dm_backlight_caps *caps, > - const uint32_t user_brightness) > +static int get_brightness_range(const struct amdgpu_dm_backlight_caps *caps, > + unsigned *min, unsigned *max) > { > - u32 min, max, conversion_pace; > - u32 brightness = user_brightness; > - > if (!caps) > - goto out; > + return 0; > > - if (!caps->aux_support) { > - max = caps->max_input_signal; > - min = caps->min_input_signal; > - /* > - * The brightness input is in the range 0-255 > - * It needs to be rescaled to be between the > - * requested min and max input signal > - * It also needs to be scaled up by 0x101 to > - * match the DC interface which has a range of > - * 0 to 0x > - */ > - conversion_pace = 0x101; > - brightness = > - user_brightness > - * conversion_pace > - * (max - min) > - / AMDGPU_MAX_BL_LEVEL > - + min * conversion_pace; > + if (caps->aux_support) { > + // Firmware limits are in nits, DC API wants millinits. > + *max = 1000 * caps->aux_max_input_signal; > + *min = 1000 * caps->aux_min_input_signal; > } else { > - /* TODO > - * We are doing a linear interpolation here, which is OK but > - * does not provide the optimal result. We probably want > - * something close to the Perceptual Quantizer (PQ) curve. > - */ > - max = caps->aux_max_input_signal; > - min = caps->aux_min_input_signal; > - > - brightness = (AMDGPU_MAX_BL_LEVEL - user_brightness) * min > -+ user_brightness * max; > - // Multiple the value by 1000 since we use millinits > - brightness *= 1000; > - brightness = DIV_ROUND_CLOSEST(brightness, AMDGPU_MAX_BL_LEVEL); > + // Firmware limits are 8-bit, PWM control is 16-bit. > + *max = 0x101 * caps->max_input_signal; > + *min = 0x101 * caps->min_input_signal; > } > + return 1; > +} > > -out: > - return brightness; > +static u32 convert_brightness_from_user(const struct > amdgpu_dm_backlight_caps *caps, > + uint32_t brightness) > +{ > + unsigned min, max; > + > + if (!get_brightness_range(caps, &min, &max)) > + return brightness; > + > + // Rescale 0..255 to min..max > + return min + DIV_ROUND_CLOSEST((max - min) * brightness, > +
Re: Non-deterministically boot into dark screen with `amdgpu`
Hi, you should Сс a specialized mailing list and a relevant maintainer, otherwise your email is likely to be ignored as LKML is an incredibly high-volume list. Adding amd-gfx and Alex Deucher. More thoughts below. On Sun, 9 Aug 2020, Ignat Insarov wrote: > Hello! > > This is an issue report. I am not familiar with the Linux kernel > development procedure, so please direct me to a more appropriate or > specialized medium if this is not the right avenue. > > My laptop (Ryzen 7 Pro CPU/GPU) boots into dark screen more often than > not. Screen blackness correlates with a line in the `systemd` journal > that says `RAM width Nbits DDR4`, where N is either 128 (resulting in > dark screen) or 64 (resulting in a healthy boot). The number seems to > be chosen at random with bias towards 128. This has been going on for > a while so here is some statistics: > > * 356 boots proceed far enough to attempt mode setting. > * 82 boots set RAM width to 64 bits and presumably succeed. > * 274 boots set RAM width to 128 bits and presumably fail. > > The issue is prevented with the `nomodeset` kernel option. > > I reported this previously (about a year ago) on the forum of my Linux > distribution.[1] The issue still persists as of linux 5.8.0. > > The details of my graphics controller, as well as some journal > excerpts, can be seen at [1]. One thing that has changed since then is > that on failure, there now appears a null pointer dereference error. I > am attaching the log of kernel messages from the most recent failed > boot — please request more information if needed. > > I appreciate any directions and advice as to how I may go about fixing > this annoyance. > > [1]: https://bbs.archlinux.org/viewtopic.php?id=248273 On the forum you show that in the "success" case there's one less "BIOS signature incorrect" message. This implies that amdgpu_get_bios() in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c gets the video BIOS from a different source. If that happens every time (one "signature incorrect" message for "success", two for "failure") that may be relevant to the problem you're experiencing. If you don't mind patching and rebuilding the kernel I suggest adding debug printks to the aforementioned function to see exactly which methods fail with wrong signature and which succeeds. Also might be worthwhile to check if there's a BIOS update for your laptop. Alexander
RE: amd-pmc s2idle driver issues
On Tue, 22 Dec 2020, Deucher, Alexander wrote: > > I've tried the "platform/x86: amd-pmc: Add AMD platform support for > > S2Idle" > > patch on my Acer Swift SF314-42 laptop (Renoir SoC, Ryzen 4500U CPU) and > > hit the following issues: > > > > 1. The driver doesn't bind to any device. It has the following binding > > table: > > > > +static const struct acpi_device_id amd_pmc_acpi_ids[] = { > > + {"AMDI0005", 0}, > > + {"AMD0004", 0}, > > + { } > > +}; > > > > This laptop has "AMD0005" instead. Adding it to the list allows the driver > > to > > successfully probe. > > > > 2. The debugfs interface does not seem to be very helpful. It shows > > > > SMU FW Info: > > > > It's not very informative. The code seems to be fetching SMU version from > > mmio, so I guess the file should be saying "FW version" rather than "FW > > Info", and then, I think version number is not supposed to be "-1". > > > > Does your platform support modern standby? You may have to select between > legacy S3 and modern standby in the sbios. Yes. Out-of-the-box it's a "modern standby" laptop. There's a "hidden" bios menu with extra settings that apparently allows to select legacy S3. I did not change it, so I'm testing the "modern" mode. Note that this driver fetches SMU version from MMIO, which looks odd to me: elsewhere (i.e. in the amdgpu driver) SMU version is retrieved by issuing the corresponding SMU command, as far as I can tell. Alexander
RE: amd-pmc s2idle driver issues
On Tue, 22 Dec 2020, Deucher, Alexander wrote: > > Yes. Out-of-the-box it's a "modern standby" laptop. There's a "hidden" > > bios menu with extra settings that apparently allows to select legacy S3. > > I did not change it, so I'm testing the "modern" mode. > > > > Note that this driver fetches SMU version from MMIO, which looks odd to > > me: > > elsewhere (i.e. in the amdgpu driver) SMU version is retrieved by issuing > > the > > corresponding SMU command, as far as I can tell. > > There are multiple interfaces to the SMU. It's shared by the entire SoC on > APUs. Just pointing that out because evidently this interface does not work on this laptop, producing all-ones instead of something resembling a version number. Which APU generations does this driver support? If it does not support Renoir (yet?) it should be documented in the Kconfig text. Is Renoir support related to missing AMD0005 ACPI id binding, and borked version number info? Alexander
AHCI SATA Runtime PM
Hello, Hans, Linux PM folks, I'm looking for clarification regarding this patch discussion: https://patchwork.kernel.org/project/linux-pm/patch/20180420101834.15783-2-0v3rdr...@gmail.com/ Hans said, > Ah, so the AHCI code has runtime pm enabled by default (so there > is another pm_runtime_allow() somewhere, but then disables it for > unused ports to make hotpluging something into those ports work. I have a laptop where two unused AHCI SATA controllers are present (but obviously nothing can be hotplugged into them). Apparently due to the above, they do not enter runtime autosuspend. The problem is, these "ATA port" nodes don't seem to participate in udev hierarchy, so it's unclear how I'm supposed to automatically re-enable runtime PM for them. For PCI device nodes, I have ACTION=="add", SUBSYSTEM=="pci", TEST=="power/control", ATTR{power/control}="auto" but ata1/uevent is empty, so there's no obvious way to write the corresponding UDev rule. Prior to discovering the above patch discussion, I have filed a bug: https://bugzilla.kernel.org/show_bug.cgi?id=211837 Does the above correctly reflect how AHCI PM is supposed to be? If so, what is the proper way to enable runtime PM for unused ATA ports? Thank you. Alexander
Re: unknown NMI on AMD Rome
On Tue, 16 Mar 2021, Adam Borowski wrote: > On Tue, Mar 16, 2021 at 04:45:02PM +0100, Jiri Olsa wrote: > > hi, > > when running 'perf top' on AMD Rome (/proc/cpuinfo below) > > with fedora 33 kernel 5.10.22-200.fc33.x86_64 > > > > we got unknown NMI messages: > > > > [ 226.700160] Uhhuh. NMI received for unknown reason 3d on CPU 90. > > [ 226.700162] Do you have a strange power saving mode enabled? > > [ 226.700163] Dazed and confused, but trying to continue > > > > also when discussing ths with Borislav, he managed to reproduce easily > > on his AMD Rome machine > > Likewise, 3c on Pinnacle Ridge. I've also seen it on Renoir, and it appears related to PMU interrupt racing against C-state entry/exit. Disabling C2 and C3 via 'cpupower' is enough to avoid those NMIs in my case. IIRC there were a few patches related to this area from AMD in the past. Alexander
Re: [PATCH v2] iommu/amd: Fix extended features logging
On Sun, 11 Apr 2021, Joe Perches wrote: > > v2: avoid pr_info(""), change pci_info() to pr_info() for a nicer > > solution > > > > drivers/iommu/amd/init.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c > > index 596d0c413473..62913f82a21f 100644 > > --- a/drivers/iommu/amd/init.c > > +++ b/drivers/iommu/amd/init.c > > @@ -1929,8 +1929,8 @@ static void print_iommu_info(void) > > pci_info(pdev, "Found IOMMU cap 0x%hx\n", iommu->cap_ptr); > > > > > > if (iommu->cap & (1 << IOMMU_CAP_EFR)) { > > - pci_info(pdev, "Extended features (%#llx):", > > -iommu->features); > > + pr_info("Extended features (%#llx):", iommu->features); > > + > > for (i = 0; i < ARRAY_SIZE(feat_str); ++i) { > > if (iommu_feature(iommu, (1ULL << i))) > > pr_cont(" %s", feat_str[i]); > > How about avoiding all of this by using a temporary buffer > and a single pci_info. I think it is mostly up to the maintainers, but from my perspective, it's not good to conflate such a simple bugfix with the substantial rewrite you are proposing (which also increases code complexity). My two-line patch is a straightforward fix to a bug that people already agreed needs to be fixed (just the previous attempt turned out to be insufficient). If there's a desire to eliminate pr_cont calls (which I wouldn't support in this instance), the rewrite can go in separately from the bugfix. A major problem with landing a simple bugfix together with a rewrite in a big patch is that if a rewrite causes a problem, the whole patch gets reverted and we end up without a trivial bugfix. And, once again: can we please not emit the feature list via pci_info, the line is long enough already even without the pci bus location info. Joerg, are you satisfied with my v2 patch, are you waiting for anything before picking it up? Alexander > > Miscellanea: > o Move the feat_str and i declarations into the if block for locality > > --- > drivers/iommu/amd/init.c | 29 ++--- > 1 file changed, 18 insertions(+), 11 deletions(-) > > diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c > index 321f5906e6ed..0d219044726e 100644 > --- a/drivers/iommu/amd/init.c > +++ b/drivers/iommu/amd/init.c > @@ -1943,30 +1943,37 @@ static int __init iommu_init_pci(struct amd_iommu > *iommu) > > static void print_iommu_info(void) > { > - static const char * const feat_str[] = { > - "PreF", "PPR", "X2APIC", "NX", "GT", "[5]", > - "IA", "GA", "HE", "PC" > - }; > struct amd_iommu *iommu; > > for_each_iommu(iommu) { > struct pci_dev *pdev = iommu->dev; > - int i; > > pci_info(pdev, "Found IOMMU cap 0x%x\n", iommu->cap_ptr); > > if (iommu->cap & (1 << IOMMU_CAP_EFR)) { > - pci_info(pdev, "Extended features (%#llx):", > - iommu->features); > + static const char * const feat_str[] = { > + "PreF", "PPR", "X2APIC", "NX", "GT", "[5]", > + "IA", "GA", "HE", "PC" > + }; > + int i; > + char features[128] = ""; > + int len = 0; > + > for (i = 0; i < ARRAY_SIZE(feat_str); ++i) { > - if (iommu_feature(iommu, (1ULL << i))) > - pr_cont(" %s", feat_str[i]); > + if (!iommu_feature(iommu, BIT_ULL(i))) > + continue; > + len += scnprintf(features + len, > + sizeof(features) - len, > + " %s", feat_str[i]); > } > > if (iommu->features & FEATURE_GAM_VAPIC) > - pr_cont(" GA_vAPIC"); > + len += scnprintf(features + len, > + sizeof(features) - len, > + " %s", "GA_vPIC"); > > - pr_cont("\n"); > + pci_info(pdev, "Extended features (%#llx):%s\n", > + iommu->features, features); > } > } > if (irq_remapping_enabled) { > > >
[PATCH] cpufreq: Kconfig: fix documentation links
User documentation for cpufreq governors and drivers has been moved to admin-guide; adjust references from Kconfig entries accordingly. Remove references from undocumented cpufreq drivers, as well as the 'userspace' cpufreq governor, for which no additional details are provided in the admin-guide text. Fixes: 2a0e49279850 ("cpufreq: User/admin documentation update and consolidation") Signed-off-by: Alexander Monakov Cc: Rafael Wysocki Cc: Viresh Kumar Cc: linux...@vger.kernel.org --- drivers/cpufreq/Kconfig | 23 ++- 1 file changed, 6 insertions(+), 17 deletions(-) diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig index 85de313ddec2..c3038cdc6865 100644 --- a/drivers/cpufreq/Kconfig +++ b/drivers/cpufreq/Kconfig @@ -13,7 +13,8 @@ config CPU_FREQ clock speed, you need to either enable a dynamic cpufreq governor (see below) after boot, or use a userspace tool. - For details, take a look at . + For details, take a look at + . If in doubt, say N. @@ -140,8 +141,6 @@ config CPU_FREQ_GOV_USERSPACE To compile this driver as a module, choose M here: the module will be called cpufreq_userspace. - For details, take a look at . - If in doubt, say Y. config CPU_FREQ_GOV_ONDEMAND @@ -158,7 +157,8 @@ config CPU_FREQ_GOV_ONDEMAND To compile this driver as a module, choose M here: the module will be called cpufreq_ondemand. - For details, take a look at linux/Documentation/cpu-freq. + For details, take a look at + . If in doubt, say N. @@ -182,7 +182,8 @@ config CPU_FREQ_GOV_CONSERVATIVE To compile this driver as a module, choose M here: the module will be called cpufreq_conservative. - For details, take a look at linux/Documentation/cpu-freq. + For details, take a look at + . If in doubt, say N. @@ -246,8 +247,6 @@ config IA64_ACPI_CPUFREQ This driver adds a CPUFreq driver which utilizes the ACPI Processor Performance States. - For details, take a look at . - If in doubt, say N. endif @@ -271,8 +270,6 @@ config LOONGSON2_CPUFREQ Loongson2F and it's successors support this feature. - For details, take a look at . - If in doubt, say N. config LOONGSON1_CPUFREQ @@ -282,8 +279,6 @@ config LOONGSON1_CPUFREQ This option adds a CPUFreq driver for loongson1 processors which support software configurable cpu frequency. - For details, take a look at . - If in doubt, say N. endif @@ -293,8 +288,6 @@ config SPARC_US3_CPUFREQ help This adds the CPUFreq driver for UltraSPARC-III processors. - For details, take a look at . - If in doubt, say N. config SPARC_US2E_CPUFREQ @@ -302,8 +295,6 @@ config SPARC_US2E_CPUFREQ help This adds the CPUFreq driver for UltraSPARC-IIe processors. - For details, take a look at . - If in doubt, say N. endif @@ -318,8 +309,6 @@ config SH_CPU_FREQ will also generate a notice in the boot log before disabling itself if the CPU in question is not capable of rate rounding. - For details, take a look at . - If unsure, say N. endif -- 2.30.0
Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
On Tue, 20 Apr 2021, Suthikulpanit, Suravee wrote: > David / Joerg, > > On 4/10/2021 5:03 PM, David Coe wrote: > > > > The immediately obvious difference is the with the enormous count seen on > > mem_dte_mis on the older Ryzen 2400G. Will do some RTFM but anyone > > with comments and insight? > > > > 841,689,151,202,939 amd_iommu_0/mem_dte_mis/ (33.44%) > > > > Otherwise, all seems to running smoothly (especially for a distribution > > still in β). Bravo and many thanks all! > > The initial hypothesis is that the issue happens only when users specify more > number of events than > the available counters, which Perf will time-multiplex the events onto the > counters. > > Looking at the Perf and AMD IOMMU PMU multiplexing logic, it requires: > 1. Stop the counter (i.e. set CSOURCE to zero to stop counting) > 2. Save the counter value of the current event > 3. Reload the counter value of the new event (previously saved) > 4. Start the counter (i.e. set CSOURCE to count new events) > > The problem here is that when the driver writes zero to CSOURCE register in > step 1, this would enable power-gating, > which prevents access to the counter and result in writing/reading value in > step 2 and 3. > > I have found a system that reproduced this case (w/ unusually large number of > count), and debug the issue further. > As a hack, I have tried skipping step 1, and it seems to eliminate this issue. > However, this is logically incorrect, > and might result in inaccurate data depending on the events. > > Here are the options: > 1. Continue to look for workaround for this issue. > 2. Find a way to disable event time-multiplexing (e.g. only limit the number > of counters to 8) >if power gating is enabled on the platform. > 3. Back to the original logic where we had the pre-init check of the counter > vlues, which is still the safest choice >at the moment unless If the "power-gated" counter only ignores writes, but yields correct values on reads, you don't need to change its value. 0. When initializing the counter, save its value as 'raw_counter_value'. When switching: 1. Set CSOURCE to zero (pauses the counter). 2. Read current counter value ('new_value'). 3. Add '(new_value - raw_counter_value) & mask' to the current event count; where 'mask' is '(1ULL << 48) - 1' to account for overflow correctly. 4. Assign 'raw_counter_value = new_value'. 5. Set CSOURCE to new configuration value. Modifying the hardware counter value is only needed to get overflow interrupts, but there's no code to handle them on Linux (since the interrupts are asynchronous, and given the nature of the events, I don't see how it would be useful). Alexander
Re: unknown NMI on AMD Rome
On Wed, 17 Mar 2021, Peter Zijlstra wrote: > On Wed, Mar 17, 2021 at 09:48:29AM +0100, Ingo Molnar wrote: > > > https://developer.amd.com/wp-content/resources/56323-PUB_0.78.pdf > > > > So: > > > > > > 1215 IBS (Instruction Based Sampling) Counter Valid Value > > May be Incorrect After Exit From Core C6 (CC6) State > > > > Description > > > > If a core's IBS feature is enabled and configured to generate an > > interrupt, including NMI (Non-Maskable > > Interrupt), and the IBS counter overflows during the entry into the Core > > C6 (CC6) state, the interrupt may be > > issued, but an invalid value of the valid bit may be restored when the > > core exits CC6. > > Potential Effect on System > > > > The operating system may receive interrupts due to an IBS counter event, > > including NMI, and not observe an > > valid IBS register. Console messages indicating "NMI received for unknown > > reason" have been observed on > > Linux systems. > > > > Suggested Workaround: None > > Fix Planned: No fix planned > > Should be simple enough to disable CC6 while IBS is in use. Kim, can you > please make that happen? Wouldn't that "magically" significantly speed up workloads running under 'perf top', in case they don't saturate the CPUs? Scheduling gets much snappier if the target CPU doesn't need to wake up from deep sleep :) Alternatively, would you consider adding the errata reference to the printk message when IBS is in use, and rate-limit it so it doesn't flood dmesg? Then the user will know what's going on, and may choose to temporarily disable C-states using the 'cpupower' tool. Alexander
[PATCH] iommu/amd: Fix extended features logging
print_iommu_info prints the EFR register and then the decoded list of features on a separate line: pci :00:00.2: AMD-Vi: Extended features (0x206d73ef22254ade): PPR X2APIC NX GT IA GA PC GA_vAPIC The second line is emitted via 'pr_cont', which causes it to have a different ('warn') loglevel compared to the previous line ('info'). Commit 9a295ff0ffc9 attempted to rectify this by removing the newline from the pci_info format string, but this doesn't work, as pci_info calls implicitly append a newline anyway. Restore the newline, and call pr_info with empty format string to set the loglevel for subsequent pr_cont calls. The same solution is used in EFI and uvesafb drivers. Fixes: 9a295ff0ffc9 ("iommu/amd: Print extended features in one line to fix divergent log levels") Signed-off-by: Alexander Monakov Cc: Paul Menzel Cc: Joerg Roedel Cc: Suravee Suthikulpanit Cc: io...@lists.linux-foundation.org --- drivers/iommu/amd/init.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 596d0c413473..a25e241eff1c 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -1929,8 +1929,11 @@ static void print_iommu_info(void) pci_info(pdev, "Found IOMMU cap 0x%hx\n", iommu->cap_ptr); if (iommu->cap & (1 << IOMMU_CAP_EFR)) { - pci_info(pdev, "Extended features (%#llx):", + pci_info(pdev, "Extended features (%#llx):\n", iommu->features); + + pr_info(""); + for (i = 0; i < ARRAY_SIZE(feat_str); ++i) { if (iommu_feature(iommu, (1ULL << i))) pr_cont(" %s", feat_str[i]); -- 2.30.0
Re: [PATCH] iommu/amd: Fix extended features logging
On Sun, 11 Apr 2021, Paul Menzel wrote: > > The second line is emitted via 'pr_cont', which causes it to have a > > different ('warn') loglevel compared to the previous line ('info'). > > > > Commit 9a295ff0ffc9 attempted to rectify this by removing the newline > > from the pci_info format string, but this doesn't work, as pci_info > > calls implicitly append a newline anyway. > > Hmm, did I really screw that up during my testing? I am sorry about that. > > I tried to wrap my head around, where the newline is implicitly appended, and > only found the definitions below. > > include/linux/pci.h:#define pci_info(pdev, fmt, arg...) > dev_info(&(pdev)->dev, fmt, ##arg) > > include/linux/dev_printk.h:#define dev_info(dev, fmt, ...) > \ > include/linux/dev_printk.h: _dev_info(dev, dev_fmt(fmt), > ##__VA_ARGS__) > > include/linux/dev_printk.h:__printf(2, 3) __cold > include/linux/dev_printk.h:void _dev_info(const struct device *dev, const > char *fmt, ...); > > include/linux/compiler_attributes.h:#define __printf(a, b) > __attribute__((__format__(printf, a, b))) Yeah, it's not obvious: it happens in kernel/printk/printk.c:vprintk_store where it does if (dev_info) lflags |= LOG_NEWLINE; It doesn't seem to be documented; I think it prevents using pr_cont with "rich" printk facilities that go via _dev_info. I suspect it quietly changed in commit c313af145b9bc ("printk() - isolate KERN_CONT users from ordinary complete lines"). > In the discussion *smpboot: CPU numbers printed as warning* [1] John wrote: > > > It is supported to provide loglevels for CONT messages. The loglevel is > > then only used if the append fails: > > > > pr_cont(KERN_INFO "message part"); > > > > I don't know if we want to go down that path. But it is supported. Yeah, I saw that, but decided to go with the 'pr_info("")' solution, because it is less magic, and already used in two other drivers. pr_info("") will also prepend 'AMD-Vi:' to the feature list, which is nice. Best regards, Alexander
Re: [PATCH] iommu/amd: Fix extended features logging
On Sun, 11 Apr 2021, Joe Perches wrote: > (cc'ing the printk maintainers) > [snip] > > This shouldn't be necessary. > If this is true then a lot of output logging code broke. See also my response to Paul at https://lore.kernel.org/lkml/alpine.lnx.2.20.13.2104111410340.11...@monopod.intra.ispras.ru/ Alexander
Re: [PATCH] iommu/amd: Fix extended features logging
On Sun, 11 Apr 2021, John Ogness wrote: > > pr_info("") will also prepend 'AMD-Vi:' to the feature list, which is > > nice. > > I'd rather fix dev_info callers to allow pr_cont and then fix any code > that is using this workaround. Note that existing two users of pr_info("") that I found are not using that as a workaround: efi.c is using that when it announced a list of EFI tables: efi: ACPI=0xadffe000 ACPI 2.0=0xadffe014 TPMFinalLog=0xadf2d000 ESRT=0xab70d618 SMBIOS=0xab70b000 SMBIOS 3.0=0xab709000 RNG=0xab707b98 TPMEventLog=0x9602c018 and uvesafb.c similarly uses it to print a list of conditionally-present strings. In both cases it is really a standalone message, not a continuation for something previously printed. In contrast, my patch deals with a continuation line. I wouldn't really like the decoded feature list to appear on the same line as the previous message, because it would end up quite long: [0.266332] pci :00:00.2: AMD-Vi: Extended features (0x206d73ef22254ade): [0.266334] AMD-Vi: PPR X2APIC NX GT IA GA PC GA_vAPIC I think a good compromise is to change the first line from pci_info to pr_info, losing the pci bus address. I'll send a v2. Alexander
[PATCH v2] iommu/amd: Fix extended features logging
print_iommu_info prints the EFR register and then the decoded list of features on a separate line: pci :00:00.2: AMD-Vi: Extended features (0x206d73ef22254ade): PPR X2APIC NX GT IA GA PC GA_vAPIC The second line is emitted via 'pr_cont', which causes it to have a different ('warn') loglevel compared to the previous line ('info'). Commit 9a295ff0ffc9 attempted to rectify this by removing the newline from the pci_info format string, but this doesn't work, as pci_info calls implicitly append a newline anyway. Printing the decoded features on the same line would make it quite long. Instead, change pci_info() to pr_info() to omit PCI bus location info, which is shown in the preceding message anyway. This results in: pci :00:00.2: AMD-Vi: Found IOMMU cap 0x40 AMD-Vi: Extended features (0x206d73ef22254ade): PPR X2APIC NX GT IA GA PC GA_vAPIC AMD-Vi: Interrupt remapping enabled Fixes: 9a295ff0ffc9 ("iommu/amd: Print extended features in one line to fix divergent log levels") Link: https://lore.kernel.org/lkml/alpine.lnx.2.20.13.2104112326460.11...@monopod.intra.ispras.ru Signed-off-by: Alexander Monakov Cc: Paul Menzel Cc: Joerg Roedel Cc: Suravee Suthikulpanit Cc: io...@lists.linux-foundation.org --- v2: avoid pr_info(""), change pci_info() to pr_info() for a nicer solution drivers/iommu/amd/init.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 596d0c413473..62913f82a21f 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -1929,8 +1929,8 @@ static void print_iommu_info(void) pci_info(pdev, "Found IOMMU cap 0x%hx\n", iommu->cap_ptr); if (iommu->cap & (1 << IOMMU_CAP_EFR)) { - pci_info(pdev, "Extended features (%#llx):", -iommu->features); + pr_info("Extended features (%#llx):", iommu->features); + for (i = 0; i < ARRAY_SIZE(feat_str); ++i) { if (iommu_feature(iommu, (1ULL << i))) pr_cont(" %s", feat_str[i]); -- 2.30.0
amd-pmc s2idle driver issues
Hi folks, I've tried the "platform/x86: amd-pmc: Add AMD platform support for S2Idle" patch on my Acer Swift SF314-42 laptop (Renoir SoC, Ryzen 4500U CPU) and hit the following issues: 1. The driver doesn't bind to any device. It has the following binding table: +static const struct acpi_device_id amd_pmc_acpi_ids[] = { + {"AMDI0005", 0}, + {"AMD0004", 0}, + { } +}; This laptop has "AMD0005" instead. Adding it to the list allows the driver to successfully probe. 2. The debugfs interface does not seem to be very helpful. It shows SMU FW Info: It's not very informative. The code seems to be fetching SMU version from mmio, so I guess the file should be saying "FW version" rather than "FW Info", and then, I think version number is not supposed to be "-1". (and I'm afraid I cannot use the driver, as there seems to be an issue with GPU resume: sometimes the screen is frozen or black after resume, so I need to reboot the laptop :( ) Alexander
Re: [PATCH] ACPI/osl: speedup grace period in acpi_os_map_cleanup
On Sun, Nov 9, 2014 at 1:13 AM, Paul E. McKenney wrote: > > Did anyone try replacing the synchronize_rcu() with > synchronize_rcu_expedited()? That should provide substantial speedups > over synchronize_rcu(). I've just briefly tested it on my laptop, and it also helps to avoid the issue. Alexander -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] phy: berlin-usb: adjust USB_PHY_RX_CTRL init flags
Make the value written into the USB_PHY_RX_CTRL configuration register match 0xAA79 value written by manufacturer-supplied kernels for Sony NSZ-GS7 (Berlin2 SoC), Google Chromecast and Valve Steam Link (BG2CD). This fixes timeouts communicating to the internal hub on Steam Link. Cc: Kishon Vijay Abraham I Cc: Antoine Tenart Signed-off-by: Alexander Monakov --- Hello, I'm trying to run mainline kernel on "Steam Link" device built around Marvell BG2CD SoC, and in doing so I've found that USB almost always fails to initialize with timeout errors when connecting to the hub: [0.223563] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002 [0.223580] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [0.223596] usb usb1: Product: EHCI Host Controller [0.223608] usb usb1: Manufacturer: Linux 4.15.0-00014-gc6f881cd62dc-dirty ehci_hcd [0.223624] usb usb1: SerialNumber: ci_hdrc.0 [0.223985] hub 1-0:1.0: USB hub found [0.224022] hub 1-0:1.0: 1 port detected [0.524726] ci_hdrc ci_hdrc.0: port 1 reset error -110 [1.044673] ci_hdrc ci_hdrc.0: port 1 reset error -110 [1.399721] mwifiex_sdio mmc0:0001:1: info: FW download over, size 814048 bytes [1.484666] ci_hdrc ci_hdrc.0: port 1 reset error -110 [2.113451] mwifiex_sdio mmc0:0001:1: WLAN FW is active [2.113725] sdio platform data not available [2.192570] mwifiex_sdio mmc0:0001:1: info: MWIFIEX VERSION: mwifiex 1.0 (15.68.7.p112) [2.192599] mwifiex_sdio mmc0:0001:1: driver_version = mwifiex 1.0 (15.68.7.p112) [2.213433] usb 1-1: new high-speed USB device number 2 using ci_hdrc [2.294676] ci_hdrc ci_hdrc.0: port 1 reset error -110 and so on until... [ 19.254658] ci_hdrc ci_hdrc.0: port 1 reset error -110 [ 19.913433] usb usb1-port1: Cannot enable. Maybe the USB cable is bad? [ 19.913472] usb usb1-port1: unable to enumerate USB device I've inspected how the original kernel performs phy initialization, and found that the value written into the USB_PHY_RX_CTRL register differs in one field. Correcting the field is enough to stabilize it in my case. I don't know where the _260 choice in mainline source comes from. I've checked three publicly available kernels for devices with this phy, and they all write the same value, 0xAA79. FWIW, this is not the only difference I've found. Other kernels also: - do not touch TX config registers; - drive a particular GPIO low before and high after configuring the phy. I am unsure if anything needs to or can be done about those differences. Thanks. Alexander drivers/phy/marvell/phy-berlin-usb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/phy/marvell/phy-berlin-usb.c b/drivers/phy/marvell/phy-berlin-usb.c index 2017751ede26..8f2b5cae360f 100644 --- a/drivers/phy/marvell/phy-berlin-usb.c +++ b/drivers/phy/marvell/phy-berlin-usb.c @@ -127,7 +127,7 @@ static int phy_berlin_usb_power_on(struct phy *phy) writel(V2I_VCO_RATIO(0x5) | R_ROTATE_0 | ANA_TEST_DC_CTRL(0x5), priv->base + USB_PHY_ANALOG); writel(PHASE_FREEZE_DLY_4_CL | ACK_LENGTH_16_CL | SQ_LENGTH_12 | - DISCON_THRESHOLD_260 | SQ_THRESHOLD(0xa) | LPF_COEF(0x2) | + DISCON_THRESHOLD_270 | SQ_THRESHOLD(0xa) | LPF_COEF(0x2) | INTPL_CUR_30, priv->base + USB_PHY_RX_CTRL); writel(TX_VDD12_13 | TX_OUT_AMP(0x3), priv->base + USB_PHY_TX_CTRL1); -- 2.11.0
Re: [musl] Broken vDSO on built kernel
On Wed, 8 Jun 2016, fREW Schmidt wrote: > I was debugging an issue I ran into ( > https://github.com/docker/docker/issues/23378) and after chatting with the > folks in #musl and we triaged it down to a broken vDSO (tested by running > `strace date` and seeing a clock_gettime call.) To provide a bit more detail, we've found that the vdso mappend into the application has no dynamic symbols; Glibc ignores the vdso, so it continues to work, but musl segfaults since the vdso is invalid. I'm pasting below `readelf -aW` output on the vdso dumped on the affected system (via 'fwrite((void*)getauxval(AT_SYSINFO_EHDR), 8192, 1, stdout)'; as you can see, there's no dynamic symbols and symbol hash tables. It may be a toolchain bug since there was no issue with 4.6 kernel (and reportedly there were no significant vdso changes merged into 4.7 -- the issue is seen on Ubuntu's 4.7rc2). Alexander ELF Header: Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 Class: ELF64 Data: 2's complement, little endian Version: 1 (current) OS/ABI:UNIX - System V ABI Version: 0 Type: EXEC (Executable file) Machine: Advanced Micro Devices X86-64 Version: 0x1 Entry point address: 0x600 Start of program headers: 64 (bytes into file) Start of section headers: 2976 (bytes into file) Flags: 0x0 Size of this header: 64 (bytes) Size of program headers: 56 (bytes) Number of program headers: 4 Size of section headers: 64 (bytes) Number of section headers: 10 Section header string table index: 9 Section Headers: [Nr] Name TypeAddress OffSize ES Flg Lk Inf Al [ 0] NULL 00 00 00 0 0 0 [ 1] .rodata PROGBITS0120 000120 000340 00 WA 0 0 1 [ 2] .note NOTE0460 000460 3c 00 A 0 0 4 [ 3] .eh_frame_hdr PROGBITS049c 00049c 3c 00 A 0 0 4 [ 4] .eh_frame PROGBITS04d8 0004d8 000120 00 A 0 0 8 [ 5] .text PROGBITS0600 000600 0004c9 00 AX 0 0 16 [ 6] .altinstructions PROGBITS0ac9 000ac9 34 00 A 0 0 1 [ 7] .altinstr_replacement PROGBITS0afd 000afd 0c 00 AX 0 0 1 [ 8] .comment PROGBITS 000b09 2e 01 MS 0 0 1 [ 9] .shstrtab STRTAB 000b37 67 00 0 0 1 Key to Flags: W (write), A (alloc), X (execute), M (merge), S (strings), l (large) I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown) O (extra OS processing required) o (OS specific), p (processor specific) There are no section groups in this file. Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0x00 0x 0x 0x000b09 0x000b09 R E 0x1000 DYNAMIC0x00 0x 0x 0x00 0x00 R 0x8 NOTE 0x000460 0x0460 0x0460 0x3c 0x3c R 0x4 GNU_EH_FRAME 0x00049c 0x049c 0x049c 0x3c 0x3c R 0x4 Section to Segment mapping: Segment Sections... 00 .rodata .note .eh_frame_hdr .eh_frame .text .altinstructions .altinstr_replacement 01 02 .note 03 .eh_frame_hdr There is no dynamic section in this file. There are no relocations in this file. The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported. No version information found in this file. Displaying notes found at file offset 0x0460 with length 0x003c: Owner Data size Description Linux0x0004 Unknown note type: (0x) GNU 0x0014 NT_GNU_BUILD_ID (unique build ID bitstring) Build ID: 496e9db3f494533ffaaf39737eb1023938861349
[PATCH] intel_idle: mention assumption that wbinvd is not needed
Intel SDM does not explicitly say that entering a C-state via MWAIT will implicitly flush CPU caches as appropriate for that C-state. However, documentation for individual Intel CPU generations does mention this behavior. Since intel_idle binds to any Intel CPU with MWAIT, mention this assumption on MWAIT behavior. In passing, reword opening comment to make it clear that driver can load on any future Intel CPU with MWAIT. Signed-off-by: Alexander Monakov Cc: Rafael J. Wysocki --- Hi, I noticed that one of significant optimizations of intel_idle over acpi_idle is elision of explicit wbinvd: ACPI requires the OS to flush caches when entering C3, and Linux issues an explicit wbinvd to do that, but intel_idle simply issues mwait with the expectation that the CPU will automatically flush caches if needed. To me this is a fairly subtle point that became even more subtle following the update to intel_idle that made it capable to bind to old and future Intel CPUs with MWAIT (by the way, thanks for that!) Can you take this patch to spell out the assumption? drivers/idle/intel_idle.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index f4495841bf68..1e5666cf8763 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -8,7 +8,7 @@ */ /* - * intel_idle is a cpuidle driver that loads on specific Intel processors + * intel_idle is a cpuidle driver that loads on all Intel CPUs with MWAIT * in lieu of the legacy ACPI processor_idle driver. The intent is to * make Linux more efficient on these processors, as intel_idle knows * more than ACPI, as well as make Linux more immune to ACPI BIOS bugs. @@ -20,7 +20,11 @@ * All CPUs have same idle states as boot CPU * * Chipset BM_STS (bus master status) bit is a NOP - * for preventing entry into deep C-stats + * for preventing entry into deep C-states + * + * CPU will flush caches as needed when entering a C-state via MWAIT + * (in contrast to entering ACPI C3, where acpi_idle driver is + * itself responsible for flushing, via WBINVD) */ /* -- 2.26.2
Re: [PATCH] intel_idle: mention assumption that wbinvd is not needed
On Mon, 12 Oct 2020, Rafael J. Wysocki wrote: > > @@ -20,7 +20,11 @@ > > * All CPUs have same idle states as boot CPU > > * > > * Chipset BM_STS (bus master status) bit is a NOP > > - * for preventing entry into deep C-stats > > + * for preventing entry into deep C-states > > + * > > + * CPU will flush caches as needed when entering a C-state via MWAIT > > I would rephrase this to mention that the above actually is an assumption. This comment block is by itself a list of assumptions. It begins with heading "Design Assumptions" and then lists two assumptions. This patch adds a third one. With that clarified, do you still need me to change this hunk? > > > + * (in contrast to entering ACPI C3, where acpi_idle driver is > > And mentioning acpi_idle here is not needed; it would be sufficient to > say something like "in which case the WBINVD instruction needs to be > executed to flush the caches". I see, thanks, I will change this for v2 once the above is cleared up. Thanks. Alexander
[PATCH v2] intel_idle: mention assumption that wbinvd is not needed
Intel SDM does not explicitly say that entering a C-state via MWAIT will implicitly flush CPU caches as appropriate for that C-state. However, documentation for individual Intel CPU generations does mention this behavior. Since intel_idle binds to any Intel CPU with MWAIT, list this assumption of MWAIT behavior. In passing, reword opening comment to make it clear that the driver can load on any old and future Intel CPU with MWAIT. Signed-off-by: Alexander Monakov Cc: Rafael J. Wysocki --- v2: reword remark about WBINVD (Rafael) drivers/idle/intel_idle.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index f4495841bf68..6d87f2129119 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -8,7 +8,7 @@ */ /* - * intel_idle is a cpuidle driver that loads on specific Intel processors + * intel_idle is a cpuidle driver that loads on all Intel CPUs with MWAIT * in lieu of the legacy ACPI processor_idle driver. The intent is to * make Linux more efficient on these processors, as intel_idle knows * more than ACPI, as well as make Linux more immune to ACPI BIOS bugs. @@ -20,7 +20,11 @@ * All CPUs have same idle states as boot CPU * * Chipset BM_STS (bus master status) bit is a NOP - * for preventing entry into deep C-stats + * for preventing entry into deep C-states + * + * CPU will flush caches as needed when entering a C-state via MWAIT + * (in contrast to entering ACPI C3, in which case the WBINVD + * instruction needs to be executed to flush the caches) */ /* -- 2.26.2
[PATCH] iommu/amd: Fix event counter availability check
The driver performs an extra check if the IOMMU's capabilities advertise presence of performance counters: it verifies that counters are writable by writing a hard-coded value to a counter and testing that reading that counter gives back the same value. Unfortunately it does so quite early, even before pci_enable_device is called for the IOMMU, i.e. when accessing its MMIO space is not guaranteed to work. On Ryzen 4500U CPU, this actually breaks the test: the driver assumes the counters are not writable, and disables the functionality. Moving init_iommu_perf_ctr just after iommu_flush_all_caches resolves the issue. This is the earliest point in amd_iommu_init_pci where the call succeeds on my laptop. Signed-off-by: Alexander Monakov Cc: Joerg Roedel Cc: Suravee Suthikulpanit Cc: io...@lists.linux-foundation.org --- PS. I'm seeing another hiccup with IOMMU probing on my system: pci :00:00.2: can't derive routing for PCI INT A pci :00:00.2: PCI INT A: not connected Hopefully I can figure it out, but I'd appreciate hints. drivers/iommu/amd_iommu_init.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c index 5b81fd16f5fa..1b7ec6b6a282 100644 --- a/drivers/iommu/amd_iommu_init.c +++ b/drivers/iommu/amd_iommu_init.c @@ -1788,8 +1788,6 @@ static int __init iommu_init_pci(struct amd_iommu *iommu) if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE)) amd_iommu_np_cache = true; - init_iommu_perf_ctr(iommu); - if (is_rd890_iommu(iommu->dev)) { int i, j; @@ -1891,8 +1889,10 @@ static int __init amd_iommu_init_pci(void) init_device_table_dma(); - for_each_iommu(iommu) + for_each_iommu(iommu) { iommu_flush_all_caches(iommu); + init_iommu_perf_ctr(iommu); + } if (!ret) print_iommu_info(); base-commit: 75caf310d16cc5e2f851c048cd597f5437013368 -- 2.26.2
[PATCH 3/3] EDAC/amd64: Add AMD family 17h model 60h PCI IDs
Add support for AMD Renoir (4000-series Ryzen CPUs). Signed-off-by: Alexander Monakov Cc: Thomas Gleixner Cc: Borislav Petkov Cc: x...@kernel.org Cc: Yazen Ghannam Cc: Brian Woods Cc: Clemens Ladisch Cc: Jean Delvare Cc: Guenter Roeck Cc: linux-hw...@vger.kernel.org Cc: linux-e...@vger.kernel.org --- drivers/edac/amd64_edac.c | 14 ++ drivers/edac/amd64_edac.h | 3 +++ 2 files changed, 17 insertions(+) diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c index 1136500c5f53..d50365e9217a 100644 --- a/drivers/edac/amd64_edac.c +++ b/drivers/edac/amd64_edac.c @@ -2319,6 +2319,16 @@ static struct amd64_family_type family_types[] = { .dbam_to_cs = f17_addr_mask_to_cs_size, } }, + [F17_M60H_CPUS] = { + .ctl_name = "F17h_M60h", + .f0_id = PCI_DEVICE_ID_AMD_17H_M60H_DF_F0, + .f6_id = PCI_DEVICE_ID_AMD_17H_M60H_DF_F6, + .max_mcs = 2, + .ops = { + .early_channel_count= f17_early_channel_count, + .dbam_to_cs = f17_addr_mask_to_cs_size, + } + }, [F17_M70H_CPUS] = { .ctl_name = "F17h_M70h", .f0_id = PCI_DEVICE_ID_AMD_17H_M70H_DF_F0, @@ -3357,6 +3367,10 @@ static struct amd64_family_type *per_family_init(struct amd64_pvt *pvt) fam_type = &family_types[F17_M30H_CPUS]; pvt->ops = &family_types[F17_M30H_CPUS].ops; break; + } else if (pvt->model >= 0x60 && pvt->model <= 0x6f) { + fam_type = &family_types[F17_M60H_CPUS]; + pvt->ops = &family_types[F17_M60H_CPUS].ops; + break; } else if (pvt->model >= 0x70 && pvt->model <= 0x7f) { fam_type = &family_types[F17_M70H_CPUS]; pvt->ops = &family_types[F17_M70H_CPUS].ops; diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h index abbf3c274d74..52b5d03eeba0 100644 --- a/drivers/edac/amd64_edac.h +++ b/drivers/edac/amd64_edac.h @@ -120,6 +120,8 @@ #define PCI_DEVICE_ID_AMD_17H_M10H_DF_F6 0x15ee #define PCI_DEVICE_ID_AMD_17H_M30H_DF_F0 0x1490 #define PCI_DEVICE_ID_AMD_17H_M30H_DF_F6 0x1496 +#define PCI_DEVICE_ID_AMD_17H_M60H_DF_F0 0x1448 +#define PCI_DEVICE_ID_AMD_17H_M60H_DF_F6 0x144e #define PCI_DEVICE_ID_AMD_17H_M70H_DF_F0 0x1440 #define PCI_DEVICE_ID_AMD_17H_M70H_DF_F6 0x1446 #define PCI_DEVICE_ID_AMD_19H_DF_F00x1650 @@ -293,6 +295,7 @@ enum amd_families { F17_CPUS, F17_M10H_CPUS, F17_M30H_CPUS, + F17_M60H_CPUS, F17_M70H_CPUS, F19_CPUS, NUM_FAMILIES, -- 2.26.2
[PATCH 0/3] k10temp and EDAC support for AMD Renoir
Hello! This simple patch series adds support for new AMD Ryzen CPU generation (model 17h family 60h) by adding the new PCI IDs as appropriate in the amd_nb, k10temp and amd64_edac drivers. The change in k10temp is the one most important for users (temperature sensors of the new CPUs), the amd_nb patch is a prerequisite for that, and finally edac_amd64 patch is necessary, because otherwise loading that module produces a warning due to incomplete support. Thank you. Alexander Monakov (3): x86/amd_nb: add AMD family 17h model 60h PCI IDs hwmon: (k10temp) Add AMD family 17h model 60h PCI match EDAC/amd64: Add AMD family 17h model 60h PCI IDs arch/x86/kernel/amd_nb.c | 5 + drivers/edac/amd64_edac.c | 14 ++ drivers/edac/amd64_edac.h | 3 +++ drivers/hwmon/k10temp.c | 1 + include/linux/pci_ids.h | 1 + 5 files changed, 24 insertions(+) Cc: Thomas Gleixner Cc: Borislav Petkov Cc: x...@kernel.org Cc: Yazen Ghannam Cc: Brian Woods Cc: Clemens Ladisch Cc: Jean Delvare Cc: Guenter Roeck Cc: linux-hw...@vger.kernel.org Cc: linux-e...@vger.kernel.org -- 2.26.2
[PATCH 1/3] x86/amd_nb: add AMD family 17h model 60h PCI IDs
Add PCI IDs for AMD Renoir (4000-series Ryzen CPUs). This is necessary to enable support for temperature sensors via the k10temp module. Signed-off-by: Alexander Monakov Cc: Thomas Gleixner Cc: Borislav Petkov Cc: x...@kernel.org Cc: Yazen Ghannam Cc: Brian Woods Cc: Clemens Ladisch Cc: Jean Delvare Cc: Guenter Roeck Cc: linux-hw...@vger.kernel.org Cc: linux-e...@vger.kernel.org --- arch/x86/kernel/amd_nb.c | 5 + include/linux/pci_ids.h | 1 + 2 files changed, 6 insertions(+) diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c index b6b3297851f3..18f6b7c4bd79 100644 --- a/arch/x86/kernel/amd_nb.c +++ b/arch/x86/kernel/amd_nb.c @@ -18,9 +18,11 @@ #define PCI_DEVICE_ID_AMD_17H_ROOT 0x1450 #define PCI_DEVICE_ID_AMD_17H_M10H_ROOT0x15d0 #define PCI_DEVICE_ID_AMD_17H_M30H_ROOT0x1480 +#define PCI_DEVICE_ID_AMD_17H_M60H_ROOT0x1630 #define PCI_DEVICE_ID_AMD_17H_DF_F40x1464 #define PCI_DEVICE_ID_AMD_17H_M10H_DF_F4 0x15ec #define PCI_DEVICE_ID_AMD_17H_M30H_DF_F4 0x1494 +#define PCI_DEVICE_ID_AMD_17H_M60H_DF_F4 0x144c #define PCI_DEVICE_ID_AMD_17H_M70H_DF_F4 0x1444 #define PCI_DEVICE_ID_AMD_19H_DF_F40x1654 @@ -33,6 +35,7 @@ static const struct pci_device_id amd_root_ids[] = { { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_ROOT) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M10H_ROOT) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M30H_ROOT) }, + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M60H_ROOT) }, {} }; @@ -50,6 +53,7 @@ static const struct pci_device_id amd_nb_misc_ids[] = { { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_DF_F3) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M10H_DF_F3) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M30H_DF_F3) }, + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M60H_DF_F3) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CNB17H_F3) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M70H_DF_F3) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_DF_F3) }, @@ -65,6 +69,7 @@ static const struct pci_device_id amd_nb_link_ids[] = { { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M10H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M30H_DF_F4) }, + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M60H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M70H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CNB17H_F4) }, diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h index 1dfc4e1dcb94..3155f5ada02e 100644 --- a/include/linux/pci_ids.h +++ b/include/linux/pci_ids.h @@ -550,6 +550,7 @@ #define PCI_DEVICE_ID_AMD_17H_DF_F30x1463 #define PCI_DEVICE_ID_AMD_17H_M10H_DF_F3 0x15eb #define PCI_DEVICE_ID_AMD_17H_M30H_DF_F3 0x1493 +#define PCI_DEVICE_ID_AMD_17H_M60H_DF_F3 0x144b #define PCI_DEVICE_ID_AMD_17H_M70H_DF_F3 0x1443 #define PCI_DEVICE_ID_AMD_19H_DF_F30x1653 #define PCI_DEVICE_ID_AMD_CNB17H_F30x1703 -- 2.26.2
[PATCH 2/3] hwmon: (k10temp) Add AMD family 17h model 60h PCI match
Add support for retrieving Tdie and Tctl on AMD Renoir (4000-series Ryzen CPUs). It appears SMU offsets for reading current/voltage and CCD temperature have changed for this generation (reads from currently used offsets yield zeros), so those features cannot be enabled so trivially. Signed-off-by: Alexander Monakov Cc: Thomas Gleixner Cc: Borislav Petkov Cc: x...@kernel.org Cc: Yazen Ghannam Cc: Brian Woods Cc: Clemens Ladisch Cc: Jean Delvare Cc: Guenter Roeck Cc: linux-hw...@vger.kernel.org Cc: linux-e...@vger.kernel.org --- drivers/hwmon/k10temp.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/hwmon/k10temp.c b/drivers/hwmon/k10temp.c index 9915578533bb..8f12995ec133 100644 --- a/drivers/hwmon/k10temp.c +++ b/drivers/hwmon/k10temp.c @@ -632,6 +632,7 @@ static const struct pci_device_id k10temp_id_table[] = { { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_17H_DF_F3) }, { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_17H_M10H_DF_F3) }, { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_17H_M30H_DF_F3) }, + { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_17H_M60H_DF_F3) }, { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_17H_M70H_DF_F3) }, { PCI_VDEVICE(HYGON, PCI_DEVICE_ID_AMD_17H_DF_F3) }, {} -- 2.26.2
[PATCH] iommu/amd: fix over-read of ACPI UID from IVRS table
IVRS parsing code always tries to read 255 bytes from memory when retrieving ACPI device path, and makes an assumption that firmware provides a zero-terminated string. Both of those are bugs: the entry is likely to be shorter than 255 bytes, and zero-termination is not guaranteed. With Acer SF314-42 firmware these issues manifest visibly in dmesg: AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR0\xf0\xa5, rdevid:160 AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR1\xf0\xa5, rdevid:160 AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR2\xf0\xa5, rdevid:160 AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR3>\x83e\x8d\x9a\xd1... The first three lines show how the code over-reads adjacent table entries into the UID, and in the last line it even reads garbage data beyond the end of the IVRS table itself. Since each entry has the length of the UID (uidl member of ivhd_entry struct), use that for memcpy, and manually add a zero terminator. Avoid zero-filling hid and uid arrays up front, and instead ensure the uid array is always zero-terminated. No change needed for the hid array, as it was already properly zero-terminated. Fixes: 2a0cb4e2d423c ("iommu/amd: Add new map for storing IVHD dev entry type HID") Signed-off-by: Alexander Monakov Cc: Joerg Roedel Cc: io...@lists.linux-foundation.org --- drivers/iommu/amd_iommu_init.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c index 6be3853a5d97..faed3d35017a 100644 --- a/drivers/iommu/amd_iommu_init.c +++ b/drivers/iommu/amd_iommu_init.c @@ -1329,8 +1329,8 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, } case IVHD_DEV_ACPI_HID: { u16 devid; - u8 hid[ACPIHID_HID_LEN] = {0}; - u8 uid[ACPIHID_UID_LEN] = {0}; + u8 hid[ACPIHID_HID_LEN]; + u8 uid[ACPIHID_UID_LEN]; int ret; if (h->type != 0x40) { @@ -1347,6 +1347,7 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, break; } + uid[0] = '\0'; switch (e->uidf) { case UID_NOT_PRESENT: @@ -1361,8 +1362,8 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, break; case UID_IS_CHARACTER: - memcpy(uid, (u8 *)(&e->uid), ACPIHID_UID_LEN - 1); - uid[ACPIHID_UID_LEN - 1] = '\0'; + memcpy(uid, &e->uid, e->uidl); + uid[e->uidl] = '\0'; break; default: -- 2.26.2
Re: [PATCH] iommu/amd: Fix event counter availability check
On Tue, 2 Jun 2020, Shuah Khan wrote: > I changed the logic to read config to get max banks and counters > before checking if counters are writable and tried writing to all. > The result is the same and all of them aren't writable. However, > when disable the writable check and assume they are, I can run [snip] This is similar to what I did. I also noticed that counters can be successfully used with perf if the initial check is ignored. I was considering sending a patch to remove the check and adjust the event counting logic to use counters as read-only, but after a bit more investigation I've noticed how late pci_enable_device is done, and came up with this patch. It's a path of less resistance: I'd expect maintainers to be more averse to removing the check rather than fixing it so it works as intended (even though I think the check should not be there in the first place). However: The ability to modify the counters is needed only for sampling the events (getting an interrupt when a counter overflows). There's no code to do that for these AMD IOMMU counters. A solution I would prefer is to not write to those counters at all. It would simplify or even remove a bunch of code. I can submit a corresponding patch if there's general agreement this path is ok. What do you think? Alexander
schedutil issue with serial workloads
Hello, this is a question/bugreport about behavior of schedutil on serial workloads such as rsync, or './configure', or 'make install'. These workloads are such that there's no single task that takes a substantial portion of CPU time, but at any moment there's at least one runnable task, and overall the workload is compute-bound. To run the workload efficiently, cpufreq governor should select a high frequency. Assume the system is idle except for the workload in question. Sadly, schedutil will select the lowest frequency, unless the workload is confined to one core with taskset (in which case it will select the highest frequency, correctly though somewhat paradoxically). This sounds like it should be a known problem, but I couldn't find any mention of it in the documentation. I was able to replicate the effect with a pair of 'ping-pong' programs that get a token, burn some cycles to simulate work, and pass the token. Thus, each program has 50% CPU utilization. To repeat my test: gcc -O2 pingpong.c -o pingpong mkfifo ping mkfifo pong taskset -c 0 ./pingpong 100 < ping > pong & taskset -c 1 ./pingpong 100 < pong > ping & echo > ping #include #include int main(int argc, char *argv[]) { unsigned i, n; sscanf(argv[1], "%u", &n); for (;;) { char c; read(0, &c, 1); for (i = n; i; i--) asm("" :: "r"(i)); write(1, &c, 1); } } Alexander
Re: schedutil issue with serial workloads
On Fri, 5 Jun 2020, Rafael J. Wysocki wrote: > > This sounds like it should be a known problem, but I couldn't find any > > mention of it in the documentation. > > Well, what would you expect to happen instead of what you see? Not sure why you ask. Named workloads are pretty common for example on "build-bot" machines. If you don't work exclusively on the kernel you probably recognize that on, let's say, more "traditionally engineered" packages ./configure can take 10x more wall-clock time than subsequent 'make -j $(nproc)', and if schedutil makes the problem worse by consistently choosing the lowest possible frequency for the configure phase, that's a huge pitfall that's worth fixing or documenting. To answer your question, assuming schedutil is intended to become a good choice for a wide range of use-cases, I'd expect it to choose a high frequency, ideally the highest, and definitely not the lowest. I think I was pretty transparent about that in my initial mail. I understand there is no obvious fix and inventing one may prove difficult. Short term, better Kconfig help text to help people make a suitable choice for their workloads would be nice. I'd also like to point out that current Kconfig is confusingly worded where it says "If the utilization is frequency-invariant, ...". It can be interpreted as "the workload is producing the same utilization irrespective of frequency", but what it actually refers to is the implementation detail of how the "utilization" metric is interpreted. Does that sentence need to be in Kconfig help at all? Thanks. Alexander
Re: [PATCH] tools/power turbostat: call pread64 in kernel directly
Hi, I am not the original submitter, but I have answers and a proper patch :) On Fri, 21 Aug 2020, Len Brown wrote: > Re: offset size > > The offsets on this file are the MSR offsets. > What MSR are you trying to access at offset 0xc0010299? This MSR is particular is part of AMD RAPL (energy measurements) interface. > Re: pread vs pread64 > > If I take on faith that you have some kind of 32-bit execution > environment that makes pread into pread32 instead of pread64, and that > truncates an off_t to 32-bits from 64-bits, and it actually makes > sense to request a read at this large offset... The problem here stems from the backward compatibility in Glibc: off_t is 32-bit on 32-bit x86, unless compiled with -D_FILE_OFFSET_BITS=64. This macro should be used for all new code. Distros should enable it for all builds, but when one builds turbostat 'by hand', they hit the issue. > would we really have to invoke syscall() directly -- couldn't we > invoke pread64() directly? (eg. below) No, the proper fix is to pass -D_FILE_OFFSET_BITS=64 to the compiler. Here's the patch: ---8<--- From: Alexander Monakov Date: Sun, 23 Aug 2020 23:27:02 +0300 Subject: [PATCH] turbostat: build with _FILE_OFFSET_BITS=64 For compatibility reasons, Glibc off_t is a 32-bit type on 32-bit x86 unless _FILE_OFFSET_BITS=64 is defined. Add this define, as otherwise reading MSRs with index 0x8000 and above attempts a pread with a negative offset, which fails. Signed-off-by: Alexander Monakov --- tools/power/x86/turbostat/Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/power/x86/turbostat/Makefile b/tools/power/x86/turbostat/Makefile index 2b6551269e43..40ae44402eec 100644 --- a/tools/power/x86/turbostat/Makefile +++ b/tools/power/x86/turbostat/Makefile @@ -12,6 +12,7 @@ turbostat : turbostat.c override CFLAGS += -O2 -Wall -I../../../include override CFLAGS += -DMSRHEADER='"../../../../arch/x86/include/asm/msr-index.h"' override CFLAGS += -DINTEL_FAMILY_HEADER='"../../../../arch/x86/include/asm/intel-family.h"' +override CFLAGS += -D_FILE_OFFSET_BITS=64 override CFLAGS += -D_FORTIFY_SOURCE=2 %: %.c -- 2.26.2
[PATCH] drm/amd/display: use correct scale for actual_brightness
Documentation for sysfs backlight level interface requires that values in both 'brightness' and 'actual_brightness' files are interpreted to be in range from 0 to the value given in the 'max_brightness' file. With amdgpu, max_brightness gives 255, and values written by the user into 'brightness' are internally rescaled to a wider range. However, reading from 'actual_brightness' gives the raw register value without inverse rescaling. This causes issues for various userspace tools such as PowerTop and systemd that expect the value to be in the correct range. Introduce a helper to retrieve internal backlight range. Extend the existing 'convert_brightness' function to handle conversion in both directions. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=203905 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1242 Cc: Alex Deucher Signed-off-by: Alexander Monakov --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 73 --- 1 file changed, 32 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 710edc70e37e..03e21e7b7917 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -2881,51 +2881,42 @@ static int set_backlight_via_aux(struct dc_link *link, uint32_t brightness) return rc ? 0 : 1; } -static u32 convert_brightness(const struct amdgpu_dm_backlight_caps *caps, - const uint32_t user_brightness) +static int get_brightness_range(const struct amdgpu_dm_backlight_caps *caps, + unsigned *min, unsigned *max) { - u32 min, max, conversion_pace; - u32 brightness = user_brightness; - if (!caps) - goto out; + return 0; - if (!caps->aux_support) { - max = caps->max_input_signal; - min = caps->min_input_signal; - /* -* The brightness input is in the range 0-255 -* It needs to be rescaled to be between the -* requested min and max input signal -* It also needs to be scaled up by 0x101 to -* match the DC interface which has a range of -* 0 to 0x -*/ - conversion_pace = 0x101; - brightness = - user_brightness - * conversion_pace - * (max - min) - / AMDGPU_MAX_BL_LEVEL - + min * conversion_pace; + if (caps->aux_support) { + // Firmware limits are in nits, DC API wants millinits. + *max = 1000 * caps->aux_max_input_signal; + *min = 1000 * caps->aux_min_input_signal; } else { - /* TODO -* We are doing a linear interpolation here, which is OK but -* does not provide the optimal result. We probably want -* something close to the Perceptual Quantizer (PQ) curve. -*/ - max = caps->aux_max_input_signal; - min = caps->aux_min_input_signal; - - brightness = (AMDGPU_MAX_BL_LEVEL - user_brightness) * min - + user_brightness * max; - // Multiple the value by 1000 since we use millinits - brightness *= 1000; - brightness = DIV_ROUND_CLOSEST(brightness, AMDGPU_MAX_BL_LEVEL); + // Firmware limits are 8-bit, PWM control is 16-bit. + *max = 0x101 * caps->max_input_signal; + *min = 0x101 * caps->min_input_signal; } + return 1; +} -out: - return brightness; +static u32 convert_brightness(const struct amdgpu_dm_backlight_caps *caps, + const uint32_t brightness, int from_user) +{ + unsigned min, max; + + if (!get_brightness_range(caps, &min, &max)) + return brightness; + + if (from_user) + // Rescale 0..255 to min..max + return min + DIV_ROUND_CLOSEST((max - min) * brightness, + AMDGPU_MAX_BL_LEVEL); + + if (brightness < min) + return 0; + // Rescale min..max to 0..255 + return DIV_ROUND_CLOSEST(AMDGPU_MAX_BL_LEVEL * (brightness - min), +max - min); } static int amdgpu_dm_backlight_update_status(struct backlight_device *bd) @@ -2941,7 +2932,7 @@ static int amdgpu_dm_backlight_update_status(struct backlight_device *bd) link = (struct dc_link *)dm->backlight_link; - brightness = convert_brightness(&caps, bd->props.brightness); + brightness = convert_brightness(&caps, bd->props.brightness, 1);
Re: [PATCH] drm/amd/display: use correct scale for actual_brightness
On Tue, 4 Aug 2020, Kazlauskas, Nicholas wrote: > This is a cleaner change the other proposed patch since it doesn't need to Can you give a URL to the other patch please? > modify the exist conversion functions but I'd be worried about broken > userspace relying on 0-255 as the only acceptable range. Not sure what you mean by this. Userspace simply reads the maximum value from max_brightness sysfs file. On other gpu/firmware combinations it can be 7 or 9 for example, it just happens to be 255 with modern amdgpu. Minimum value is always zero. Value seen in max_brightness remains 255 with this patch, so as far as userspace is concerned nothing is changed apart from value given by actual_brightness file. Alexander > > Not an expert on existing implementations to point out a specific one though. > > Regards, > Nicholas Kazlauskas > > On 2020-08-03 4:02 p.m., Alexander Monakov wrote: > > Documentation for sysfs backlight level interface requires that > > values in both 'brightness' and 'actual_brightness' files are > > interpreted to be in range from 0 to the value given in the > > 'max_brightness' file. > > > > With amdgpu, max_brightness gives 255, and values written by the user > > into 'brightness' are internally rescaled to a wider range. However, > > reading from 'actual_brightness' gives the raw register value without > > inverse rescaling. This causes issues for various userspace tools such > > as PowerTop and systemd that expect the value to be in the correct > > range. > > > > Introduce a helper to retrieve internal backlight range. Extend the > > existing 'convert_brightness' function to handle conversion in both > > directions. > > > > Bug: https://bugzilla.kernel.org/show_bug.cgi?id=203905 > > Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1242 > > Cc: Alex Deucher > > Signed-off-by: Alexander Monakov > > --- > > .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 73 --- > > 1 file changed, 32 insertions(+), 41 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > index 710edc70e37e..03e21e7b7917 100644 > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > @@ -2881,51 +2881,42 @@ static int set_backlight_via_aux(struct dc_link > > *link, uint32_t brightness) > > return rc ? 0 : 1; > > } > > -static u32 convert_brightness(const struct amdgpu_dm_backlight_caps > > *caps, > > - const uint32_t user_brightness) > > +static int get_brightness_range(const struct amdgpu_dm_backlight_caps > > *caps, > > + unsigned *min, unsigned *max) > > { > > - u32 min, max, conversion_pace; > > - u32 brightness = user_brightness; > > - > > if (!caps) > > - goto out; > > + return 0; > > - if (!caps->aux_support) { > > - max = caps->max_input_signal; > > - min = caps->min_input_signal; > > - /* > > -* The brightness input is in the range 0-255 > > -* It needs to be rescaled to be between the > > -* requested min and max input signal > > -* It also needs to be scaled up by 0x101 to > > -* match the DC interface which has a range of > > -* 0 to 0x > > -*/ > > - conversion_pace = 0x101; > > - brightness = > > - user_brightness > > - * conversion_pace > > - * (max - min) > > - / AMDGPU_MAX_BL_LEVEL > > - + min * conversion_pace; > > + if (caps->aux_support) { > > + // Firmware limits are in nits, DC API wants millinits. > > + *max = 1000 * caps->aux_max_input_signal; > > + *min = 1000 * caps->aux_min_input_signal; > > } else { > > - /* TODO > > -* We are doing a linear interpolation here, which is OK but > > -* does not provide the optimal result. We probably want > > -* something close to the Perceptual Quantizer (PQ) curve. > > -*/ > > - max = caps->aux_max_input_signal; > > - min = caps->aux_min_input_signal; > > - > > - brightness = (AMDGPU_MAX_BL_LEVEL - user_brightness) * min > > -
[PATCH v2] drm/amd/display: use correct scale for actual_brightness
Documentation for sysfs backlight level interface requires that values in both 'brightness' and 'actual_brightness' files are interpreted to be in range from 0 to the value given in the 'max_brightness' file. With amdgpu, max_brightness gives 255, and values written by the user into 'brightness' are internally rescaled to a wider range. However, reading from 'actual_brightness' gives the raw register value without inverse rescaling. This causes issues for various userspace tools such as PowerTop and systemd that expect the value to be in the correct range. Introduce a helper to retrieve internal backlight range. Use it to reimplement 'convert_brightness' as 'convert_brightness_from_user' and introduce 'convert_brightness_to_user'. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=203905 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1242 Cc: Alex Deucher Cc: Nicholas Kazlauskas Signed-off-by: Alexander Monakov --- v2: split convert_brightness to &_from_user and &_to_user (Nicholas) .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 81 +-- 1 file changed, 40 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 710edc70e37e..b60a763f3f95 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -2881,51 +2881,50 @@ static int set_backlight_via_aux(struct dc_link *link, uint32_t brightness) return rc ? 0 : 1; } -static u32 convert_brightness(const struct amdgpu_dm_backlight_caps *caps, - const uint32_t user_brightness) +static int get_brightness_range(const struct amdgpu_dm_backlight_caps *caps, + unsigned *min, unsigned *max) { - u32 min, max, conversion_pace; - u32 brightness = user_brightness; - if (!caps) - goto out; + return 0; - if (!caps->aux_support) { - max = caps->max_input_signal; - min = caps->min_input_signal; - /* -* The brightness input is in the range 0-255 -* It needs to be rescaled to be between the -* requested min and max input signal -* It also needs to be scaled up by 0x101 to -* match the DC interface which has a range of -* 0 to 0x -*/ - conversion_pace = 0x101; - brightness = - user_brightness - * conversion_pace - * (max - min) - / AMDGPU_MAX_BL_LEVEL - + min * conversion_pace; + if (caps->aux_support) { + // Firmware limits are in nits, DC API wants millinits. + *max = 1000 * caps->aux_max_input_signal; + *min = 1000 * caps->aux_min_input_signal; } else { - /* TODO -* We are doing a linear interpolation here, which is OK but -* does not provide the optimal result. We probably want -* something close to the Perceptual Quantizer (PQ) curve. -*/ - max = caps->aux_max_input_signal; - min = caps->aux_min_input_signal; - - brightness = (AMDGPU_MAX_BL_LEVEL - user_brightness) * min - + user_brightness * max; - // Multiple the value by 1000 since we use millinits - brightness *= 1000; - brightness = DIV_ROUND_CLOSEST(brightness, AMDGPU_MAX_BL_LEVEL); + // Firmware limits are 8-bit, PWM control is 16-bit. + *max = 0x101 * caps->max_input_signal; + *min = 0x101 * caps->min_input_signal; } + return 1; +} -out: - return brightness; +static u32 convert_brightness_from_user(const struct amdgpu_dm_backlight_caps *caps, + uint32_t brightness) +{ + unsigned min, max; + + if (!get_brightness_range(caps, &min, &max)) + return brightness; + + // Rescale 0..255 to min..max + return min + DIV_ROUND_CLOSEST((max - min) * brightness, + AMDGPU_MAX_BL_LEVEL); +} + +static u32 convert_brightness_to_user(const struct amdgpu_dm_backlight_caps *caps, + uint32_t brightness) +{ + unsigned min, max; + + if (!get_brightness_range(caps, &min, &max)) + return brightness; + + if (brightness < min) + return 0; + // Rescale min..max to 0..255 + return DIV_ROUND_CLOSEST(AMDGPU_MAX_BL_LEVEL * (brightness - min), +max - min); } static int
Re: [PATCH v2] drm/amd/display: use correct scale for actual_brightness
Ping. On Tue, 4 Aug 2020, Alexander Monakov wrote: > Documentation for sysfs backlight level interface requires that > values in both 'brightness' and 'actual_brightness' files are > interpreted to be in range from 0 to the value given in the > 'max_brightness' file. > > With amdgpu, max_brightness gives 255, and values written by the user > into 'brightness' are internally rescaled to a wider range. However, > reading from 'actual_brightness' gives the raw register value without > inverse rescaling. This causes issues for various userspace tools such > as PowerTop and systemd that expect the value to be in the correct > range. > > Introduce a helper to retrieve internal backlight range. Use it to > reimplement 'convert_brightness' as 'convert_brightness_from_user' and > introduce 'convert_brightness_to_user'. > > Bug: https://bugzilla.kernel.org/show_bug.cgi?id=203905 > Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1242 > Cc: Alex Deucher > Cc: Nicholas Kazlauskas > Signed-off-by: Alexander Monakov > --- > v2: split convert_brightness to &_from_user and &_to_user (Nicholas) > > .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 81 +-- > 1 file changed, 40 insertions(+), 41 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > index 710edc70e37e..b60a763f3f95 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > @@ -2881,51 +2881,50 @@ static int set_backlight_via_aux(struct dc_link > *link, uint32_t brightness) > return rc ? 0 : 1; > } > > -static u32 convert_brightness(const struct amdgpu_dm_backlight_caps *caps, > - const uint32_t user_brightness) > +static int get_brightness_range(const struct amdgpu_dm_backlight_caps *caps, > + unsigned *min, unsigned *max) > { > - u32 min, max, conversion_pace; > - u32 brightness = user_brightness; > - > if (!caps) > - goto out; > + return 0; > > - if (!caps->aux_support) { > - max = caps->max_input_signal; > - min = caps->min_input_signal; > - /* > - * The brightness input is in the range 0-255 > - * It needs to be rescaled to be between the > - * requested min and max input signal > - * It also needs to be scaled up by 0x101 to > - * match the DC interface which has a range of > - * 0 to 0x > - */ > - conversion_pace = 0x101; > - brightness = > - user_brightness > - * conversion_pace > - * (max - min) > - / AMDGPU_MAX_BL_LEVEL > - + min * conversion_pace; > + if (caps->aux_support) { > + // Firmware limits are in nits, DC API wants millinits. > + *max = 1000 * caps->aux_max_input_signal; > + *min = 1000 * caps->aux_min_input_signal; > } else { > - /* TODO > - * We are doing a linear interpolation here, which is OK but > - * does not provide the optimal result. We probably want > - * something close to the Perceptual Quantizer (PQ) curve. > - */ > - max = caps->aux_max_input_signal; > - min = caps->aux_min_input_signal; > - > - brightness = (AMDGPU_MAX_BL_LEVEL - user_brightness) * min > -+ user_brightness * max; > - // Multiple the value by 1000 since we use millinits > - brightness *= 1000; > - brightness = DIV_ROUND_CLOSEST(brightness, AMDGPU_MAX_BL_LEVEL); > + // Firmware limits are 8-bit, PWM control is 16-bit. > + *max = 0x101 * caps->max_input_signal; > + *min = 0x101 * caps->min_input_signal; > } > + return 1; > +} > > -out: > - return brightness; > +static u32 convert_brightness_from_user(const struct > amdgpu_dm_backlight_caps *caps, > + uint32_t brightness) > +{ > + unsigned min, max; > + > + if (!get_brightness_range(caps, &min, &max)) > + return brightness; > + > + // Rescale 0..255 to min..max > + return min + DIV_ROUND_CLOSEST((max - min) * brightness, > +AMDGPU_MAX_BL_LEVEL); > +}
Re: [PATCH] iommu/amd: Fix event counter availability check
On Tue, 16 Jun 2020, Suravee Suthikulpanit wrote: > > > Instead of blindly moving the code around to a spot that would just work, > > > I am trying to understand what might be required here. In this case, > > > the init_device_table_dma()should not be needed. I suspect it's the IOMMU > > > invalidate all command that's also needed here. > > > > > > I'm also checking with the HW and BIOS team. Meanwhile, could you please > > > give > > > the following change a try: > > Hello. Can you give any update please? > > > > Alexander > > > > Sorry for late reply. I have a reproducer and working with the HW team to > understand the issue. > I should be able to provide update with solution by the end of this week. Hello, hope you are doing well. Has this investigation found anything? Thanks. Alexander
Re: [PATCH] iommu/amd: Fix event counter availability check
On Tue, 16 Jun 2020, Suravee Suthikulpanit wrote: > > > On 6/1/20 4:01 PM, Alexander Monakov wrote: > > > > On Mon, 1 Jun 2020, Suravee Suthikulpanit wrote: > > > > > > > > > > Moving init_iommu_perf_ctr just after iommu_flush_all_caches > > > > > > resolves the issue. This is the earliest point in amd_iommu_init_pci > > > > > > where the call succeeds on my laptop. > > > > > According to your description, it should just need to be anywhere > > > > > after the pci_enable_device() is called for the IOMMU device, isn't > > > > > it? So, on your system, what if we just move the init_iommu_perf_ctr() > > > > > here: > > > > No, this doesn't work, as I already said in the paragraph you are > > > > responding to. See my last sentence in the quoted part. > > > > > > > > So the implication is init_device_table_dma together with subsequent > > > > cache flush is also setting up something that is necessary for counters > > > > to be writable. > > > > > > > > Alexander > > > > > > > Instead of blindly moving the code around to a spot that would just work, > > > I am trying to understand what might be required here. In this case, > > > the init_device_table_dma()should not be needed. I suspect it's the IOMMU > > > invalidate all command that's also needed here. > > > > > > I'm also checking with the HW and BIOS team. Meanwhile, could you please > > > give the following change a try: > > Hello. Can you give any update please? > > > > Alexander > > > > Sorry for late reply. I have a reproducer and working with the HW team to > understand the issue. > I should be able to provide update with solution by the end of this week. Hi, can you share any information (two more weeks have passed)? Alexander
Re: [PATCH] hwmon: (k10temp) Add AMD family 17h model 60h probe
Hi, I've already said in my patch submission (which was Cc'ed to LKML, linux-edac and linux-hwmon so you should have been able to find it): >> It appears SMU offsets for reading current/voltage and CCD temperature >> have changed for this generation (reads from currently used offsets >> yield zeros), so those features cannot be enabled so trivially. In https://github.com/FlyGoat/ryzen_nb_smu/issues/3 there some reverse-engineered info that indicates that for Renoir, SMU region has been moved to 0x6F000. Its layout also changed, as far as I can tell. (also, please ask yourself if you want to offer the maintainers an apology) Alexander
[PATCH] EDAC/amd64: Dump registers before checking ECC
Move dump_misc_regs earlier so decoded register contents are shown even if ECC cannot be enabled. The dump gives info such as number of memory controllers, so it serves as a sensible smoke-check even on platforms without ECC memory. Signed-off-by: Alexander Monakov Cc: Borislav Petkov Cc: linux-e...@vger.kernel.org --- Hi, I used this patch when testing EDAC on AMD Family 17h Model 60h CPU, which needs another patch (adding PCI ids) that I intend to send shortly. I think it would be useful to have this upstream. drivers/edac/amd64_edac.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c index f91f3bc1e0b2..1136500c5f53 100644 --- a/drivers/edac/amd64_edac.c +++ b/drivers/edac/amd64_edac.c @@ -3533,6 +3533,8 @@ static int probe_one_instance(unsigned int nid) goto err_enable; } + dump_misc_regs(pvt); + if (!ecc_enabled(pvt)) { ret = -ENODEV; @@ -3559,8 +3561,6 @@ static int probe_one_instance(unsigned int nid) goto err_enable; } - dump_misc_regs(pvt); - return ret; err_enable: -- 2.26.2
Re: [PATCH] iommu/amd: Fix event counter availability check
On Mon, 1 Jun 2020, Suravee Suthikulpanit wrote: > Alexander > > On 6/1/20 4:01 PM, Alexander Monakov wrote: > > On Mon, 1 Jun 2020, Suravee Suthikulpanit wrote: > > > > > > Moving init_iommu_perf_ctr just after iommu_flush_all_caches resolves > > > > the issue. This is the earliest point in amd_iommu_init_pci where the > > > > call succeeds on my laptop. > > > > > > According to your description, it should just need to be anywhere after > > > the > > > pci_enable_device() is called for the IOMMU device, isn't it? So, on your > > > system, what if we just move the init_iommu_perf_ctr() here: > > > > No, this doesn't work, as I already said in the paragraph you are responding > > to. See my last sentence in the quoted part. > > > > So the implication is init_device_table_dma together with subsequent cache > > flush is also setting up something that is necessary for counters to be > > writable. > > > > Alexander > > > > Instead of blindly moving the code around to a spot that would just work, > I am trying to understand what might be required here. In this case, > the init_device_table_dma()should not be needed. I suspect it's the IOMMU > invalidate all command that's also needed here. > > I'm also checking with the HW and BIOS team. Meanwhile, could you please give > the following change a try: Hello. Can you give any update please? Alexander > diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c > index 5b81fd16f5fa..b07458cc1b0b 100644 > --- a/drivers/iommu/amd_iommu_init.c > +++ b/drivers/iommu/amd_iommu_init.c > @@ -1875,6 +1875,8 @@ static int __init amd_iommu_init_pci(void) > ret = iommu_init_pci(iommu); > if (ret) > break; > + iommu_flush_all_caches(iommu); > + init_iommu_perf_ctr(iommu); > } > > /* > -- > 2.17.1 > > Thanks, > Suravee >
Re: [PATCH 0/3] perf/x86/rapl: Enable RAPL for AMD Fam17h
On Fri, 15 May 2020, Stephane Eranian wrote: > The series first moves the rapl.c file to common perf_events x86 and then > adds the support. > From the user's point of view, the interface is identical with > /sys/devices/power. The energy-pkg event is the only one supported. AMD also has per-core energy metering via MSR 0xc001029a, and I wonder if you have plans to expose it to perf as well. I see it does not fit so nicely with the existing code (as it's per-core instead of per-die). The turbostat tool already exposes these per-core readings: CoreCPU Avg_MHz Busy% Bzy_MHz TSC_MHz CorWatt PkgWatt - - 3951100.00 3951237354.92 30.04 0 0 3945100.00 394523708.9729.98 1 1 3945100.00 394523709.11 2 2 3945100.00 394523708.96 4 3 3946100.00 394623709.32 5 4 3946100.00 394623709.11 6 5 3946100.00 394623709.39 turbostat sums the per-core energy figures to show the per-socket 54.92W value. Though as you can see on this example, the figure is not in agreement with the per-socket MSR you're using in your patch. Maybe the per-core values are less reliable, but I believe I have a test that shows per-package figure to be inaccurate as well. It would be nice if AMD clarified how the counters work. And, for what (little) it's worth, the series is: Tested-by: Alexander Monakov Thank you. Alexander
[tip: ras/core] hwmon: (k10temp) Add AMD family 17h model 60h PCI match
The following commit has been merged into the ras/core branch of tip: Commit-ID: 279f0b3a4b80660fba6faadc2ca2fa426bf3f7e9 Gitweb: https://git.kernel.org/tip/279f0b3a4b80660fba6faadc2ca2fa426bf3f7e9 Author:Alexander Monakov AuthorDate:Sun, 10 May 2020 20:48:41 Committer: Borislav Petkov CommitterDate: Fri, 22 May 2020 18:39:07 +02:00 hwmon: (k10temp) Add AMD family 17h model 60h PCI match Add support for retrieving Tdie and Tctl on AMD Renoir (4000-series Ryzen CPUs). It appears SMU offsets for reading current/voltage and CCD temperature have changed for this generation (reads from currently used offsets yield zeros), so those features cannot be enabled so trivially. Signed-off-by: Alexander Monakov Signed-off-by: Borislav Petkov Acked-by: Guenter Roeck Link: https://lkml.kernel.org/r/20200510204842.2603-3-amona...@ispras.ru --- drivers/hwmon/k10temp.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/hwmon/k10temp.c b/drivers/hwmon/k10temp.c index 3f37d5d..7ba82e0 100644 --- a/drivers/hwmon/k10temp.c +++ b/drivers/hwmon/k10temp.c @@ -632,6 +632,7 @@ static const struct pci_device_id k10temp_id_table[] = { { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_17H_DF_F3) }, { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_17H_M10H_DF_F3) }, { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_17H_M30H_DF_F3) }, + { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_17H_M60H_DF_F3) }, { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_17H_M70H_DF_F3) }, { PCI_VDEVICE(HYGON, PCI_DEVICE_ID_AMD_17H_DF_F3) }, {}
[tip: ras/core] EDAC/amd64: Add AMD family 17h model 60h PCI IDs
The following commit has been merged into the ras/core branch of tip: Commit-ID: b6bea24d41519e8c31e4798f1c1a3f67e540c5d0 Gitweb: https://git.kernel.org/tip/b6bea24d41519e8c31e4798f1c1a3f67e540c5d0 Author:Alexander Monakov AuthorDate:Sun, 10 May 2020 20:48:42 Committer: Borislav Petkov CommitterDate: Fri, 22 May 2020 18:43:13 +02:00 EDAC/amd64: Add AMD family 17h model 60h PCI IDs Add support for AMD Renoir (4000-series Ryzen CPUs). Signed-off-by: Alexander Monakov Signed-off-by: Borislav Petkov Acked-by: Yazen Ghannam Link: https://lkml.kernel.org/r/20200510204842.2603-4-amona...@ispras.ru --- drivers/edac/amd64_edac.c | 14 ++ drivers/edac/amd64_edac.h | 3 +++ 2 files changed, 17 insertions(+) diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c index 6bdc5bb..42024a8 100644 --- a/drivers/edac/amd64_edac.c +++ b/drivers/edac/amd64_edac.c @@ -2316,6 +2316,16 @@ static struct amd64_family_type family_types[] = { .dbam_to_cs = f17_addr_mask_to_cs_size, } }, + [F17_M60H_CPUS] = { + .ctl_name = "F17h_M60h", + .f0_id = PCI_DEVICE_ID_AMD_17H_M60H_DF_F0, + .f6_id = PCI_DEVICE_ID_AMD_17H_M60H_DF_F6, + .max_mcs = 2, + .ops = { + .early_channel_count= f17_early_channel_count, + .dbam_to_cs = f17_addr_mask_to_cs_size, + } + }, [F17_M70H_CPUS] = { .ctl_name = "F17h_M70h", .f0_id = PCI_DEVICE_ID_AMD_17H_M70H_DF_F0, @@ -3354,6 +3364,10 @@ static struct amd64_family_type *per_family_init(struct amd64_pvt *pvt) fam_type = &family_types[F17_M30H_CPUS]; pvt->ops = &family_types[F17_M30H_CPUS].ops; break; + } else if (pvt->model >= 0x60 && pvt->model <= 0x6f) { + fam_type = &family_types[F17_M60H_CPUS]; + pvt->ops = &family_types[F17_M60H_CPUS].ops; + break; } else if (pvt->model >= 0x70 && pvt->model <= 0x7f) { fam_type = &family_types[F17_M70H_CPUS]; pvt->ops = &family_types[F17_M70H_CPUS].ops; diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h index abbf3c2..52b5d03 100644 --- a/drivers/edac/amd64_edac.h +++ b/drivers/edac/amd64_edac.h @@ -120,6 +120,8 @@ #define PCI_DEVICE_ID_AMD_17H_M10H_DF_F6 0x15ee #define PCI_DEVICE_ID_AMD_17H_M30H_DF_F0 0x1490 #define PCI_DEVICE_ID_AMD_17H_M30H_DF_F6 0x1496 +#define PCI_DEVICE_ID_AMD_17H_M60H_DF_F0 0x1448 +#define PCI_DEVICE_ID_AMD_17H_M60H_DF_F6 0x144e #define PCI_DEVICE_ID_AMD_17H_M70H_DF_F0 0x1440 #define PCI_DEVICE_ID_AMD_17H_M70H_DF_F6 0x1446 #define PCI_DEVICE_ID_AMD_19H_DF_F00x1650 @@ -293,6 +295,7 @@ enum amd_families { F17_CPUS, F17_M10H_CPUS, F17_M30H_CPUS, + F17_M60H_CPUS, F17_M70H_CPUS, F19_CPUS, NUM_FAMILIES,
[tip: ras/core] x86/amd_nb: Add AMD family 17h model 60h PCI IDs
The following commit has been merged into the ras/core branch of tip: Commit-ID: a4e91825d7e1252f7cba005f1451e5464b23c15d Gitweb: https://git.kernel.org/tip/a4e91825d7e1252f7cba005f1451e5464b23c15d Author:Alexander Monakov AuthorDate:Sun, 10 May 2020 20:48:40 Committer: Borislav Petkov CommitterDate: Fri, 22 May 2020 18:24:40 +02:00 x86/amd_nb: Add AMD family 17h model 60h PCI IDs Add PCI IDs for AMD Renoir (4000-series Ryzen CPUs). This is necessary to enable support for temperature sensors via the k10temp module. Signed-off-by: Alexander Monakov Signed-off-by: Borislav Petkov Acked-by: Yazen Ghannam Acked-by: Guenter Roeck Link: https://lkml.kernel.org/r/20200510204842.2603-2-amona...@ispras.ru --- arch/x86/kernel/amd_nb.c | 5 + include/linux/pci_ids.h | 1 + 2 files changed, 6 insertions(+) diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c index b6b3297..18f6b7c 100644 --- a/arch/x86/kernel/amd_nb.c +++ b/arch/x86/kernel/amd_nb.c @@ -18,9 +18,11 @@ #define PCI_DEVICE_ID_AMD_17H_ROOT 0x1450 #define PCI_DEVICE_ID_AMD_17H_M10H_ROOT0x15d0 #define PCI_DEVICE_ID_AMD_17H_M30H_ROOT0x1480 +#define PCI_DEVICE_ID_AMD_17H_M60H_ROOT0x1630 #define PCI_DEVICE_ID_AMD_17H_DF_F40x1464 #define PCI_DEVICE_ID_AMD_17H_M10H_DF_F4 0x15ec #define PCI_DEVICE_ID_AMD_17H_M30H_DF_F4 0x1494 +#define PCI_DEVICE_ID_AMD_17H_M60H_DF_F4 0x144c #define PCI_DEVICE_ID_AMD_17H_M70H_DF_F4 0x1444 #define PCI_DEVICE_ID_AMD_19H_DF_F40x1654 @@ -33,6 +35,7 @@ static const struct pci_device_id amd_root_ids[] = { { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_ROOT) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M10H_ROOT) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M30H_ROOT) }, + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M60H_ROOT) }, {} }; @@ -50,6 +53,7 @@ static const struct pci_device_id amd_nb_misc_ids[] = { { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_DF_F3) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M10H_DF_F3) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M30H_DF_F3) }, + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M60H_DF_F3) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CNB17H_F3) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M70H_DF_F3) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_DF_F3) }, @@ -65,6 +69,7 @@ static const struct pci_device_id amd_nb_link_ids[] = { { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M10H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M30H_DF_F4) }, + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M60H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M70H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CNB17H_F4) }, diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h index 1dfc4e1..3155f5a 100644 --- a/include/linux/pci_ids.h +++ b/include/linux/pci_ids.h @@ -550,6 +550,7 @@ #define PCI_DEVICE_ID_AMD_17H_DF_F30x1463 #define PCI_DEVICE_ID_AMD_17H_M10H_DF_F3 0x15eb #define PCI_DEVICE_ID_AMD_17H_M30H_DF_F3 0x1493 +#define PCI_DEVICE_ID_AMD_17H_M60H_DF_F3 0x144b #define PCI_DEVICE_ID_AMD_17H_M70H_DF_F3 0x1443 #define PCI_DEVICE_ID_AMD_19H_DF_F30x1653 #define PCI_DEVICE_ID_AMD_CNB17H_F30x1703