Re: [LKP] [drm/mgag200] 90f479ae51: vm-scalability.median -18.8% regression

2019-07-31 Thread Huang, Ying
Hi, Daniel,

Daniel Vetter  writes:

> On Tue, Jul 30, 2019 at 10:27 PM Dave Airlie  wrote:
>>
>> On Wed, 31 Jul 2019 at 05:00, Daniel Vetter  wrote:
>> >
>> > On Tue, Jul 30, 2019 at 8:50 PM Thomas Zimmermann  
>> > wrote:
>> > >
>> > > Hi
>> > >
>> > > Am 30.07.19 um 20:12 schrieb Daniel Vetter:
>> > > > On Tue, Jul 30, 2019 at 7:50 PM Thomas Zimmermann 
>> > > >  wrote:
>> > > >> Am 29.07.19 um 11:51 schrieb kernel test robot:
>> > > >>> Greeting,
>> > > >>>
>> > > >>> FYI, we noticed a -18.8% regression of vm-scalability.median due to 
>> > > >>> commit:
>> > > >>>
>> > > >>> commit: 90f479ae51afa45efab97afdde9b94b9660dd3e4 ("drm/mgag200: 
>> > > >>> Replace struct mga_fbdev with generic framebuffer emulation")
>> > > >>> https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git
>> > > >>>  master
>> > > >>
>> > > >> Daniel, Noralf, we may have to revert this patch.
>> > > >>
>> > > >> I expected some change in display performance, but not in VM. Since 
>> > > >> it's
>> > > >> a server chipset, probably no one cares much about display 
>> > > >> performance.
>> > > >> So that seemed like a good trade-off for re-using shared code.
>> > > >>
>> > > >> Part of the patch set is that the generic fb emulation now maps and
>> > > >> unmaps the fbdev BO when updating the screen. I guess that's the cause
>> > > >> of the performance regression. And it should be visible with other
>> > > >> drivers as well if they use a shadow FB for fbdev emulation.
>> > > >
>> > > > For fbcon we shouldn't need to do any maps/unmaps at all, this is for the
>> > > > fbdev mmap support only. If the testcase mentioned here tests fbdev
>> > > > mmap handling it's pretty badly misnamed :-) And as long as you don't
>> > > > have an fbdev mmap there shouldn't be any impact at all.
>> > >
>> > > The ast and mgag200 have only a few MiB of VRAM, so we have to get the
>> > > fbdev BO out if it's not being displayed. If not being mapped, it can be
>> > > evicted and make room for X, etc.
>> > >
>> > > To make this work, the BO's memory is mapped and unmapped in
>> > > drm_fb_helper_dirty_work() before being updated from the shadow FB. [1]
>> > > That fbdev mapping is established on each screen update, more or less.
>> > > From my (yet unverified) understanding, this causes the performance
>> > > regression in the VM code.
>> > >
>> > > The original code in mgag200 used to kmap the fbdev BO while it's being
>> > > displayed; [2] and the drawing code only mapped it when necessary (i.e.,
>> > > not being displayed). [3]
>> >
>> > Hm yeah, this vmap/vunmap is going to be pretty bad. We indeed should
>> > cache this.
>> >
>> > > I think this could be added for VRAM helpers as well, but it's still a
>> > > workaround and non-VRAM drivers might also run into such a performance
>> > > regression if they use the fbdev's shadow fb.
>> >
>> > Yeah agreed, fbdev emulation should try to cache the vmap.
>> >
>> > > Noralf mentioned that there are plans for other DRM clients besides the
>> > > console. They would as well run into similar problems.
>> > >
>> > > >> The thing is that we'd need another generic fbdev emulation for ast 
>> > > >> and
>> > > >> mgag200 that handles this issue properly.
>> > > >
>> > > > Yeah, I don't think we want to jump the gun here.  If you can try to
>> > > > repro locally and profile where we're wasting CPU time, I hope that
>> > > > should shed some light on what's going wrong here.
>> > >
>> > > I don't have much time ATM and I'm not even officially at work until
>> > > late Aug. I'd send you the revert and investigate later. I agree that
>> > > using generic fbdev emulation would be preferable.
>> >
>> > Still not sure that's the right thing to do really. Yes it's a
>> > regression, but vm testcases shouldn't run a single line of fbcon or drm
>> > code. So why this is impacted so heavily by a silly drm change is very
>> > confusing to me. We might be papering over a deeper and much more
>> > serious issue ...
>>
>> It's a regression, the right thing is to revert first and then work
>> out the right thing to do.
>
> Sure, but I have no idea whether the testcase is doing something
> reasonable. If it's accidentally testing vm scalability of fbdev and
> there's no one else doing something this pointless, then it's not a
> real bug. Plus I think we're shooting the messenger here.
>
>> It's likely the test runs on the console and printfs stuff out while running.
>
> But why did we not regress the world if a few prints on the console
> have such a huge impact? We didn't get an entire stream of mails about
> breaking stuff ...

The regression does not seem to be related to the commit, yet we have
retested and confirmed it.  It is hard to understand what is happening.

Best Regards,
Huang, Ying
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH v1 06/10] device property: switch to use UUID API

2016-04-08 Thread Huang, Ying
Andy Shevchenko  writes:

> On Fri, 2016-02-26 at 16:11 +0200, Andy Shevchenko wrote:
>> On Thu, 2016-02-18 at 01:03 +0100, Rafael J. Wysocki wrote:
>> > 
>> > On Wednesday, February 17, 2016 02:17:24 PM Andy Shevchenko wrote:
>> > > 
>> > > Switch to use a generic UUID API instead of custom approach. It
>> > > allows to
>> > > define UUIDs, compare them, and validate.
>> []
>> 
>
> Summon initial author of the UUID library.
>
> Summary: the API of comparison functions is rather strange. What the
> point to not take pointers directly? (Moreover I hope compiler too
> clever not to make a copy of constant arguments there)
>
> I could only imagine the case you are trying to avoid temporary
> variables for constants like NULL_UUID.
>
> Issue with this is the ugliness in the users of that, in particularly
> present in ACPI (drivers/acpi/apei/ghes.c).
>
> I would like to have more clear interface for that. Perhaps we may add
> something like
>
> cmp_p(pointer, non-pointer);
> cmp_pp(pointer, pointer);
>
> to not break existing API for now.
>
> It would be useful for many cases in the kernel.

You can take a look at the drivers/acpi/apei/erst.c for uuid_le_cmp
usage.

#define CPER_CREATOR_PSTORE                                             \
        UUID_LE(0x75a574e3, 0x5052, 0x4b29, 0x8a, 0x8e, 0xbe, 0x2c,     \
                0x64, 0x90, 0xb8, 0x9d)

        if (uuid_le_cmp(rcd->hdr.creator_id, CPER_CREATOR_PSTORE) != 0)
                goto skip;

Looks better?

This is the typical use case I had in mind when I wrote uuid.h.

As for the uuid_le_cmp usage in drivers/acpi/apei/ghes.c,

        if (!uuid_le_cmp(*(uuid_le *)gdata->section_type,
                         CPER_SEC_PLATFORM_MEM)) {

the code does not look good, mainly because acpi_hest_generic_data is not
defined with uuid_le in mind.

struct acpi_hest_generic_data {
        u8 section_type[16];
        u32 error_severity;
        u16 revision;
        u8 validation_bits;
        u8 flags;
        u32 error_data_length;
        u8 fru_id[16];
        u8 fru_text[20];
};

If section_type were defined as uuid_le instead of u8[16], the
uuid_le_cmp usage would look better.  So I suggest using uuid_le/be in
data structure definitions in new code where possible.
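
To illustrate the suggestion above, here is a minimal user-space sketch,
not kernel code.  The uuid_le type, UUID_LE() and uuid_le_cmp() below are
stand-ins modeled on the by-value interface discussed in this thread; the
structs record_raw and record_typed and both helper functions are
hypothetical names used only for illustration.

```c
#include <stdint.h>
#include <string.h>

/* User-space stand-in for uuid_le: 16 raw bytes wrapped in a struct. */
typedef struct { uint8_t b[16]; } uuid_le;

/* Builds a uuid_le value; the first three fields are stored little-endian. */
#define UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)              \
        ((uuid_le){{ (a) & 0xff, ((a) >> 8) & 0xff,                   \
                     ((a) >> 16) & 0xff, ((a) >> 24) & 0xff,          \
                     (b) & 0xff, ((b) >> 8) & 0xff,                   \
                     (c) & 0xff, ((c) >> 8) & 0xff,                   \
                     (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }})

/* By-value comparison: returns 0 when both UUIDs are equal. */
static inline int uuid_le_cmp(const uuid_le u1, const uuid_le u2)
{
        return memcmp(&u1, &u2, sizeof(uuid_le));
}

#define CPER_CREATOR_PSTORE                                           \
        UUID_LE(0x75a574e3, 0x5052, 0x4b29, 0x8a, 0x8e, 0xbe, 0x2c,   \
                0x64, 0x90, 0xb8, 0x9d)

/* Hypothetical record whose UUID field is declared as a plain byte array. */
struct record_raw {
        uint8_t creator_id[16];
};

/* Hypothetical record whose UUID field is declared as uuid_le. */
struct record_typed {
        uuid_le creator_id;
};

/* Byte-array field: the caller has to cast and dereference. */
int raw_is_pstore(const struct record_raw *r)
{
        return uuid_le_cmp(*(const uuid_le *)r->creator_id,
                           CPER_CREATOR_PSTORE) == 0;
}

/* uuid_le field: the comparison reads like one on a primitive type. */
int typed_is_pstore(const struct record_typed *r)
{
        return uuid_le_cmp(r->creator_id, CPER_CREATOR_PSTORE) == 0;
}
```

Both helpers compare the same 16 bytes; only the declaration of the field
differs, and that is what decides whether the call site needs the cast.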

Best Regards,
Huang, Ying

>> > 
>> > > 
>> > > +static const uuid_le ads_uuid =
>> > > +UUID_LE(0xdbb8e3e6, 0x5886, 0x4ba6,
>> > > +0x87, 0x95, 0x13, 0x19, 0xf5, 0x2a, 0x96, 0x6b);
>> > >  
>> > >  static bool acpi_enumerate_nondev_subnodes(acpi_handle scope,
>> > >        const union
>> > > acpi_object
>> > > *desc,
>> > > @@ -138,7 +136,7 @@ static bool
>> > > acpi_enumerate_nondev_subnodes(acpi_handle scope,
>> > >         || links->type != ACPI_TYPE_PACKAGE)
>> > >     break;
>> > >  
>> > > -if (memcmp(uuid->buffer.pointer, ads_uuid,
>> > > sizeof(ads_uuid)))
>> > > +if (uuid_le_cmp(*(uuid_le *)uuid->buffer.pointer,
>> > > ads_uuid))
>> > Maybe it's too late, but I don't quite understand the pointer
>> > manipulations here.
>> > 
>> > I can see why you need a type conversion (although it looks ugly),
>> > but why do you
>> > need to dereference it too?
>> The function takes that kind of type on input. The other variants are
>> not compiled.
>> Perhaps we better change uuid_{lb}e_cmp() first to take normal
>> pointers, though I think the initial idea was to get type checking at
>> compile time.
>> 


[PATCH v1 06/10] device property: switch to use UUID API

2016-04-09 Thread huang ying
On Fri, Apr 8, 2016 at 6:00 PM, Andy Shevchenko
 wrote:
> On Fri, 2016-04-08 at 09:27 +0800, Huang, Ying wrote:
>> Andy Shevchenko  writes:
>>
>> >
>> > On Fri, 2016-02-26 at 16:11 +0200, Andy Shevchenko wrote:
>> > >
>> > > On Thu, 2016-02-18 at 01:03 +0100, Rafael J. Wysocki wrote:
>> > > >
>> > > >
>> > > > On Wednesday, February 17, 2016 02:17:24 PM Andy Shevchenko
>> > > > wrote:
>> > > > >
>> > > > >
>> > > > > Switch to use a generic UUID API instead of custom approach.
>> > > > > It
>> > > > > allows to
>> > > > > define UUIDs, compare them, and validate.
>> > > []
>> > >
>> > Summon initial author of the UUID library.
>> >
>> > Summary: the API of comparison functions is rather strange. What the
>> > point to not take pointers directly? (Moreover I hope compiler too
>> > clever not to make a copy of constant arguments there)
>> >
>> > I could only imagine the case you are trying to avoid temporary
>> > variables for constants like NULL_UUID.
>> >
>> > Issue with this is the ugliness in the users of that, in
>> > particularly
>> > present in ACPI (drivers/acpi/apei/ghes.c).
>> >
>> > I would like to have more clear interface for that. Perhaps we may
>> > add
>> > something like
>> >
>> > cmp_p(pointer, non-pointer);
>> > cmp_pp(pointer, pointer);
>> >
>> > to not break existing API for now.
>> >
>> > It would be useful for many cases in the kernel.
>> You can take a look at the drivers/acpi/apei/erst.c for uuid_le_cmp
>> usage.
>>
>> #define CPER_CREATOR_PSTORE \
>>         UUID_LE(0x75a574e3, 0x5052, 0x4b29, 0x8a, 0x8e, 0xbe, 0x2c, \
>>                 0x64, 0x90, 0xb8, 0x9d)
>>
>>         if (uuid_le_cmp(rcd->hdr.creator_id, CPER_CREATOR_PSTORE) != 0)
>>                 goto skip;
>>
>> Looks better?
>
> I don't quite understand the issues with
>
> if (uuid_le_cmp(&rcd->hdr.creator_id, &CPER_CREATOR_PSTORE) != 0)

I tried to make uuid_le look like a primitive data type, and UUID
constants look like primitive-type constants, where possible.  If we can
define the data as uuid_le/be, the code will look just like that.  But if
there are too many places where we cannot use uuid_le/be directly, I am
OK with converting the interface to use pointers instead.

> or, like I mentioned previously, we may introduce _cmp_p() and use like
>
> if (uuid_le_cmp_p(&rcd->hdr.creator_id, CPER_CREATOR_PSTORE) != 0)

Personally, I don't like this interface.  It is better for the two
parameters to have the same data type.
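
For comparison, the three interface shapes under discussion can be
sketched side by side in user space.  This is only an illustration under
assumptions: uuid_le is a stand-in type, uuid_le_cmp_pp() and
uuid_le_cmp_p() are hypothetical names following the cmp_pp/cmp_p naming
proposed earlier in the thread, and sign_of() and uuid_cmp_forms_agree()
are helpers invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* User-space stand-in for uuid_le: 16 raw bytes wrapped in a struct. */
typedef struct { uint8_t b[16]; } uuid_le;

/* By-value form: both parameters have the same type, uuid_le. */
static inline int uuid_le_cmp(const uuid_le u1, const uuid_le u2)
{
        return memcmp(&u1, &u2, sizeof(uuid_le));
}

/* Hypothetical pointer/pointer form: same type on both sides, no copies. */
static inline int uuid_le_cmp_pp(const uuid_le *u1, const uuid_le *u2)
{
        return memcmp(u1, u2, sizeof(uuid_le));
}

/* Hypothetical mixed form: pointer on the left, value on the right. */
static inline int uuid_le_cmp_p(const uuid_le *u1, const uuid_le u2)
{
        return memcmp(u1, &u2, sizeof(uuid_le));
}

/* Reduces a memcmp-style result to -1, 0 or 1. */
static inline int sign_of(int x)
{
        return (x > 0) - (x < 0);
}

/* Returns 1 when all three forms order the two UUIDs the same way. */
int uuid_cmp_forms_agree(const uuid_le *a, const uuid_le *b)
{
        int by_value = sign_of(uuid_le_cmp(*a, *b));
        int by_ptr   = sign_of(uuid_le_cmp_pp(a, b));
        int mixed    = sign_of(uuid_le_cmp_p(a, *b));

        return by_value == by_ptr && by_ptr == mixed;
}
```

All three forms run the same memcmp over 16 bytes, so the choice is about
how call sites read and about type symmetry rather than about results.
Whether the by-value forms actually copy their arguments after inlining
depends on the compiler and is not shown here.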

> if it looks better (again, I don't know if compiler is going to copy the last 
> argument).
>
>>
>> This is the typical use case in mind when I write the uuid.h.
>>
>> As for uuid_le_cmp usage in drivers/acpi/apei/ghes.c,
>>
>>   if (!uuid_le_cmp(*(uuid_le *)gdata->section_type,
>>                    CPER_SEC_PLATFORM_MEM)) {
>
> Ditto
>
> if (!uuid_le_cmp_p((uuid_le *)gdata->section_type,
> CPER_SEC_PLATFORM_MEM)) {
>
>>
>> The code looks not good mainly because acpi_hest_generic_data is not
>> defined with uuid_le in mind.
>>
>> struct acpi_hest_generic_data {
>>   u8 section_type[16];
>>   u32 error_severity;
>>   u16 revision;
>>   u8 validation_bits;
>>   u8 flags;
>>   u32 error_data_length;
>>   u8 fru_id[16];
>>   u8 fru_text[20];
>> };
>>
>> If section_type was defined as uuid_le instead of u8[16], the
>> uuid_le_cmp usage would look better.  So I suggest to use uuid_le/be
>> in
>> data structure definition in new code if possible.
>
> This is understandable for such structures, but we might get a UUID from
> a buffer which is pointer to u8. It's not possible to convert to uuid_*
> since it's too generic stuff and might require to introduce
> ACPI_TYPE_UUID with standardization and all necessary work. Apparently
> not the shortest way.

If this is just a special case that seldom occurs, we can simply work
around it with *(uuid_le/be *)buf.  If it is common, we can change the
interface or add a new one.
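
If the buffer case does turn out to be common, one possible shape for
such a new interface is sketched below in user space.  Everything here is
an assumption for illustration: uuid_le is a stand-in type, and
uuid_le_from_buf() and buf_matches_uuid() are hypothetical helpers, not
existing kernel functions.  The sketch copies the bytes out of the buffer
instead of casting the pointer, and checks the length first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* User-space stand-in for uuid_le: 16 raw bytes wrapped in a struct. */
typedef struct { uint8_t b[16]; } uuid_le;

/* By-value comparison: returns 0 when both UUIDs are equal. */
static inline int uuid_le_cmp(const uuid_le u1, const uuid_le u2)
{
        return memcmp(&u1, &u2, sizeof(uuid_le));
}

/*
 * Hypothetical helper: copy a UUID out of a generic byte buffer.
 * Returns 0 and fills *out on success, -1 if an argument is NULL or the
 * buffer holds fewer than 16 bytes.
 */
int uuid_le_from_buf(const uint8_t *buf, size_t len, uuid_le *out)
{
        if (!buf || !out || len < sizeof(*out))
                return -1;
        memcpy(out, buf, sizeof(*out));
        return 0;
}

/*
 * Hypothetical helper: returns 1 if the buffer starts with the wanted
 * UUID, 0 otherwise (including when the buffer is too short).
 */
int buf_matches_uuid(const uint8_t *buf, size_t len, const uuid_le want)
{
        uuid_le got;

        if (uuid_le_from_buf(buf, len, &got))
                return 0;
        return uuid_le_cmp(got, want) == 0;
}
```

With a helper of this kind the comparison itself keeps the by-value form,
and the cast disappears from the call sites that only have a u8 pointer.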

Best Regards,
Huang, Ying


Re: [PATCH] mm: convert totalram_pages, totalhigh_pages and managed_pages to atomic.

2018-10-23 Thread Huang, Ying
Arun KS  writes:

> Remove managed_page_count_lock spinlock and instead use atomic
> variables.
>
> Suggested-by: Michal Hocko 
> Suggested-by: Vlastimil Babka 
> Signed-off-by: Arun KS 
>
> ---
> As discussed here,
> https://patchwork.kernel.org/patch/10627521/#22261253

My 2 cents.  I think you should include at least part of the discussion
in the patch description to make it more readable by itself.

Best Regards,
Huang, Ying