RE: [PATCH 1/3] lib/string_helpers: Consolidate yesno() implementation
> > > > +static inline const char *yesno(bool v) { return v ? "yes" : "no"; } > > return "yes\0no" + v * 4; > > :-) except '"no\0\0yes" + v * 4' works a bit better. - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH 1/3] lib/string_helpers: Consolidate yesno() implementation
> > > +static inline const char *yesno(bool v) { return v ? "yes" : "no"; } return "yes\0no" + v * 4; :-) - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH 1/3] lib/string_helpers: Consolidate yesno() implementation
> > except '"no\0\0yes" + v * 4' works a bit better. > > Is it a C code obfuscation contest? That would be: return &(v * 3)["no\0yes"]; :-) - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH 0/3] lib/string_helpers: Add a few string helpers
... > Yeah, and I am sorry for bikeshedding. Honestly, I do not know what is > better. This is why I do not want to block this series when others > like this. > > My main motivation is to point out that: > > enabledisable(enable) > > might be, for some people, more eye bleeding than > > enable ? "enable" : "disable" Indeed - you need to look the former up, wasting brain time. > The problem is not that visible with yesno() and onoff(). But as you said, > onoff() confliscts with variable names. And enabledisable() sucks. > As a result, there is a non-trivial risk of two mass changes: > > now: > > - contition ? "yes" : "no" > + yesno(condition) > > a few moths later: > > - yesno(condition) > + str_yes_no(condition) Followed by: - str_yes_no(x) + no_yes_str(x) David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH] drm/amdgpu: do not initialise global variables to 0 or NULL
From: Greg KH > Sent: 02 November 2020 20:11 > > On Mon, Nov 02, 2020 at 02:43:45PM -0500, Alex Deucher wrote: > > On Mon, Nov 2, 2020 at 1:42 PM Deepak R Varma wrote: > > > > > > Initializing global variable to 0 or NULL is not necessary and should > > > be avoided. Issue reported by checkpatch script as: > > > ERROR: do not initialise globals to 0 (or NULL). > > > > I agree that this is technically correct, but a lot of people don't > > seem to know that so we get a lot of comments about this code for the > > variables that are not explicitly set. Seems less confusing to > > initialize them even if it not necessary. I don't have a particularly > > strong opinion on it however. > > The kernel coding style is to do it this way. You even save space and > time by doing it as well :) Uninitialised globals end up as 'named common' (variables that are their own code section - from FORTRAN) until the final link puts them into the .bss. Globals initialised to 0 go into the .bss of the object file being created. So both end up in the final .bss. If the code goes into a module you aren't allowed 'common' data in a module to every global must be initialised. I'm surprised checkpatch complains. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: Fix realloc of ptr
From: t...@redhat.com > Sent: 26 February 2022 15:59 > > From: Tom Rix > > Clang static analysis reports this error > amdgpu_debugfs.c:1690:9: warning: 1st function call > argument is an uninitialized value > tmp = krealloc_array(tmp, i + 1, > ^~~ > > realloc will free tmp, so tmp can not be garbage. > And the return needs to be checked. Are you sure? A quick check seems to show that krealloc() behaves the same way as libc realloc() and the pointer isn't freed on failure. David > Fixes: 5ce5a584cb82 ("drm/amdgpu: add debugfs for reset registers list") > Signed-off-by: Tom Rix > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c > index 9eb9b440bd438..159b97c0b4ebc 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c > @@ -1676,7 +1676,7 @@ static ssize_t > amdgpu_reset_dump_register_list_write(struct file *f, > { > struct amdgpu_device *adev = (struct amdgpu_device > *)file_inode(f)->i_private; > char reg_offset[11]; > - uint32_t *tmp; > + uint32_t *tmp = NULL; > int ret, i = 0, len = 0; > > do { > @@ -1688,6 +1688,10 @@ static ssize_t > amdgpu_reset_dump_register_list_write(struct file *f, > } > > tmp = krealloc_array(tmp, i + 1, sizeof(uint32_t), GFP_KERNEL); > + if (!tmp) { > + ret = -ENOMEM; > + goto error_free; > + } > if (sscanf(reg_offset, "%X %n", &tmp[i], &ret) != 1) { > ret = -EINVAL; > goto error_free; > -- > 2.26.3 - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
From: Matthew Wilcox > Sent: 28 February 2022 20:16 > > On Mon, Feb 28, 2022 at 12:10:24PM -0800, Linus Torvalds wrote: > > We can do > > > > typeof(pos) pos > > > > in the 'for ()' loop, and never use __iter at all. > > > > That means that inside the for-loop, we use a _different_ 'pos' than > > outside. > > Then we can never use -Wshadow ;-( I'd love to be able to turn it on; > it catches real bugs. > > > +#define list_for_each_entry(pos, head, member) > > \ > > + for (typeof(pos) pos = list_first_entry(head, typeof(*pos), member); > > \ > > +!list_entry_is_head(pos, head, member);\ > > pos = list_next_entry(pos, member)) Actually can't you use 'pos' to temporarily hold the address of 'member'. Something like: for (pos = (void *)head; \ pos ? ((pos = (void *)pos - offsetof(member)), 1) : 0; \ pos = (void *)pos->next) So that 'pos' is NULL if the loop terminates. No pointers outside structures are generated. Probably need to kill list_entry_is_head() - or it just checks for NULL. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
From: Linus Torvalds > Sent: 28 February 2022 19:56 > On Mon, Feb 28, 2022 at 4:19 AM Christian König > wrote: > > > > I don't think that using the extra variable makes the code in any way > > more reliable or easier to read. > > So I think the next step is to do the attached patch (which requires > that "-std=gnu11" that was discussed in the original thread). > > That will guarantee that the 'pos' parameter of list_for_each_entry() > is only updated INSIDE the for_each_list_entry() loop, and can never > point to the (wrongly typed) head entry. > > And I would actually hope that it should actually cause compiler > warnings about possibly uninitialized variables if people then use the > 'pos' pointer outside the loop. Except > > (a) that code in sgx/encl.c currently initializes 'tmp' to NULL for > inexplicable reasons - possibly because it already expected this > behavior > > (b) when I remove that NULL initializer, I still don't get a warning, > because we've disabled -Wno-maybe-uninitialized since it results in so > many false positives. > > Oh well. > > Anyway, give this patch a look, and at least if it's expanded to do > "(pos) = NULL" in the entry statement for the for-loop, it will avoid > the HEAD type confusion that Jakob is working on. And I think in a > cleaner way than the horrid games he plays. > > (But it won't avoid possible CPU speculation of such type confusion. > That, in my opinion, is a completely different issue) > > I do wish we could actually poison the 'pos' value after the loop > somehow - but clearly the "might be uninitialized" I was hoping for > isn't the way to do it. > > Anybody have any ideas? > > Linus diff --git a/include/linux/list.h b/include/linux/list.h index dd6c2041d09c..bab995596aaa 100644 --- a/include/linux/list.h +++ b/include/linux/list.h @@ -634,10 +634,10 @@ static inline void list_splice_tail_init(struct list_head *list, * @head: the head for your list. * @member:the name of the list_head within the struct. */ -#define list_for_each_entry(pos, head, member) \ - for (pos = list_first_entry(head, typeof(*pos), member);\ -!list_entry_is_head(pos, head, member);\ -pos = list_next_entry(pos, member)) +#define list_for_each_entry(pos, head, member) \ + for (typeof(pos) __iter = list_first_entry(head, typeof(*pos), member); \ +!list_entry_is_head(__iter, head, member) && (((pos)=__iter),1); \ +__iter = list_next_entry(__iter, member)) /** * list_for_each_entry_reverse - iterate backwards over list of given type. I think you actually want: !list_entry_is_head(__iter, head, member) ? (((pos)=__iter),1) : (((pos) = NULL),0); Which can be done in the original by: !list_entry_is_head(pos, head, member) ? 1 : (((pos) = NULL), 0); Although it would be safer to have (without looking up the actual name): for (item *__item = head; \ __item ? (((pos) = list_item(__item, member)), 1) : (((pos) = NULL), 0); __item = (pos)->member) The local does need 'fixing' to avoid shadowing for nested loops. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
From: Linus Torvalds > Sent: 01 March 2022 19:07 > On Mon, Feb 28, 2022 at 2:29 PM James Bottomley > wrote: > > > > However, if the desire is really to poison the loop variable then we > > can do > > > > #define list_for_each_entry(pos, head, member) \ > > for (pos = list_first_entry(head, typeof(*pos), member);\ > > !list_entry_is_head(pos, head, member) && ((pos = NULL) == > > NULL; \ > > pos = list_next_entry(pos, member)) > > > > Which would at least set pos to NULL when the loop completes. > > That would actually have been excellent if we had done that > originally. It would not only avoid the stale and incorrectly typed > head entry left-over turd, it would also have made it very easy to > test for "did I find an entry in the loop". > > But I don't much like it in the situation we are now. > > Why? Mainly because it basically changes the semantics of the loop > _without_ any warnings about it. And we don't actually get the > advantage of the nicer semantics, because we can't actually make code > do > > list_for_each_entry(entry, ) { > .. > } > if (!entry) > return -ESRCH; > .. use the entry we found .. > > because that would be a disaster for back-porting, plus it would be a > flag-day issue (ie we'd have to change the semantics of the loop at > the same time we change every single user). > > So instead of that simple "if (!entry)", we'd effectively have to > continue to use something that still works with the old world order > (ie that "if (list_entry_is_head())" model). > > So we couldn't really take _advantage_ of the nicer semantics, and > we'd not even get a warning if somebody does it wrong - the code would > just silently do the wrong thing. > > IOW: I don't think you are wrong about that patch: it would solve the > problem that Jakob wants to solve, and it would have absolutely been > much better if we had done this from the beginning. But I think that > in our current situation, it's actually a really fragile solution to > the "don't do that then" problem we have. Can it be resolved by making: #define list_entry_is_head(pos, head, member) ((pos) == NULL) and double-checking that it isn't used anywhere else (except in the list macros themselves). The odd ones I just found are fs/locks.c mm/page_reporting.c security/apparmor/apparmorfs.c (3 times) net/xfrm/xfrm_ipcomp.c#L244 is buggy. (There is a WARN_ON() then it just carries on regardless!) There are only about 25 uses of list_entry_is_head(). There are a lot more places where these lists seem to be scanned by hand. I bet a few of those aren't actually right either. (Oh at 3am this morning I thought it was a different list type that could have much the same problem!) Another plausible solution is a variant of list_foreach_entry() that does set the 'entry' to NULL at the end. Then code can be moved over in stages. I'd reorder the arguments as well as changing the name! David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
From: Linus Torvalds > Sent: 01 March 2022 23:03 > > On Tue, Mar 1, 2022 at 2:58 PM David Laight wrote: > > > > Can it be resolved by making: > > #define list_entry_is_head(pos, head, member) ((pos) == NULL) > > and double-checking that it isn't used anywhere else (except in > > the list macros themselves). > > Well, yes, except for the fact that then the name is entirely misleading... > > And somebody possibly uses it together with list_first_entry() etc, so > it really is completely broken to mix that change with the list > traversal change. Probably true :-( Actually adding list_entry_not_found() as a synonym for list_entry_is_head() and changing the 25ish places that use it after a loop might work. Once that is done the loop can be changed at the same time as list_entry_not_found(). That won't affect the in-tree callers. (and my out of tree modules don't use those lists - so I don't care about that!) Having said that there are so few users of list_entry_is_head() it is reasonable to generate two new names. One for use after list_for_each_entry() and one for list_next_entry(). Then the change all the call sites. After that list_entry_is_head() can be deleted - breaking out of tree compiles. Finally list_for_each_entry() can be rewritten to set NULL at the end of the list. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
From: Xiaomeng Tong > Sent: 02 March 2022 09:31 > > On Mon, 28 Feb 2022 16:41:04 -0800, Linus Torvalds > wrote: > > > > But basically to _me_, the important part is that the end result is > > maintainable longer-term. > > I couldn't agree more. And because of that, I stick with the following > approach because it's maintainable longer-term than "type(pos) pos" one: > Implements a new macro for each list_for_each_entry* with _inside suffix. > #define list_for_each_entry_inside(pos, type, head, member) I think that it would be better to make any alternate loop macro just set the variable to NULL on the loop exit. That is easier to code for and the compiler might be persuaded to not redo the test. It also doesn't need an extra variable defined in the for() statement so can be back-ported to older kernels without required declaration in the middle of blocks. OTOH there may be alternative definitions that can be used to get the compiler (or other compiler-like tools) to detect broken code. Even if the definition can't possibly generate a working kerrnel. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
From: Xiaomeng Tong > Sent: 03 March 2022 02:27 > > On Wed, 2 Mar 2022 14:04:06 +0000, David Laight > wrote: > > I think that it would be better to make any alternate loop macro > > just set the variable to NULL on the loop exit. > > That is easier to code for and the compiler might be persuaded to > > not redo the test. > > No, that would lead to a NULL dereference. Why, it would make it b ethe same as the 'easy to use': for (item = head; item; item = item->next) { ... if (...) break; ... } if (!item) return; > The problem is the mis-use of iterator outside the loop on exit, and > the iterator will be the HEAD's container_of pointer which pointers > to a type-confused struct. Sidenote: The *mis-use* here refers to > mistakely access to other members of the struct, instead of the > list_head member which acutally is the valid HEAD. The problem is that the HEAD's container_of pointer should never be calculated at all. This is what is fundamentally broken about the current definition. > IOW, you would dereference a (NULL + offset_of_member) address here. Where? > Please remind me if i missed something, thanks. > > Can you share your "alternative definitions" details? thanks! The loop should probably use as extra variable that points to the 'list node' in the next structure. Something like: for (xxx *iter = head->next; iter == &head ? ((item = NULL),0) : ((item = list_item(iter),1)); iter = item->member->next) { ... With a bit of casting you can use 'item' to hold 'iter'. > > > OTOH there may be alternative definitions that can be used to get > > the compiler (or other compiler-like tools) to detect broken code. > > Even if the definition can't possibly generate a working kerrnel. > > The "list_for_each_entry_inside(pos, type, head, member)" way makes > the iterator invisiable outside the loop, and would be catched by > compiler if use-after-loop things happened. It is also a compete PITA for anything doing a search. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
From: Xiaomeng Tong > Sent: 03 March 2022 07:27 > > On Thu, 3 Mar 2022 04:58:23 +0000, David Laight wrote: > > on 3 Mar 2022 10:27:29 +0800, Xiaomeng Tong wrote: > > > The problem is the mis-use of iterator outside the loop on exit, and > > > the iterator will be the HEAD's container_of pointer which pointers > > > to a type-confused struct. Sidenote: The *mis-use* here refers to > > > mistakely access to other members of the struct, instead of the > > > list_head member which acutally is the valid HEAD. > > > > The problem is that the HEAD's container_of pointer should never > > be calculated at all. > > This is what is fundamentally broken about the current definition. > > Yes, the rule is "the HEAD's container_of pointer should never be > calculated at all outside the loop", but how do you make sure everyone > follows this rule? > Everyone makes mistakes, but we can eliminate them all from the beginning > with the help of compiler which can catch such use-after-loop things. > > > > IOW, you would dereference a (NULL + offset_of_member) address here. > > > >Where? > > In the case where a developer do not follows the above rule, and mistakely > access a non-list-head member of the HEAD's container_of pointer outside > the loop. For example: > struct req{ > int a; > struct list_head h; > } > struct req *r; > list_for_each_entry(r, HEAD, h) { > if (r->a == 0x10) > break; > } > // the developer made a mistake: he didn't take this situation into > // account where all entries in the list are *r->a != 0x10*, and now > // the r is the HEAD's container_of pointer. > r->a = 0x20; > Thus the "r->a = 0x20" would dereference a (NULL + offset_of_member) > address here. That is just a bug. No different to failing to check anything else might 'return' a NULL pointer. Because it is a NULL dereference you find out pretty quickly. The existing loop leaves you with a valid pointer to something that isn't a list item. > > > Please remind me if i missed something, thanks. > > > > > > Can you share your "alternative definitions" details? thanks! > > > > The loop should probably use as extra variable that points > > to the 'list node' in the next structure. > > Something like: > > for (xxx *iter = head->next; > > iter == &head ? ((item = NULL),0) : ((item = > > list_item(iter),1)); > > iter = item->member->next) { > >... > > With a bit of casting you can use 'item' to hold 'iter'. > > you still can not make sure everyone follows this rule: > "do not use iterator outside the loop" without the help of compiler, > because item is declared outside the loop. That one has 'iter' defined in the loop. > BTW, to avoid ambiguity,the "alternative definitions" here i asked is > something from you in this context: > "OTOH there may be alternative definitions that can be used to get > the compiler (or other compiler-like tools) to detect broken code. > Even if the definition can't possibly generate a working kerrnel." I was thinking of something like: if ((pos = list_first)), 1) pos = NULL else so that unchecked dereferences after the loop will be detectable as NULL pointer offsets - but that in itself isn't enough to avoid other warnings. > > > The "list_for_each_entry_inside(pos, type, head, member)" way makes > > > the iterator invisiable outside the loop, and would be catched by > > > compiler if use-after-loop things happened. > > > It is also a compete PITA for anything doing a search. > > You mean it would be a burden on search? can you show me some examples? The whole business of having to save the pointer to the located item before breaking the loop, remembering to have set it to NULL earlier etc. It is so much better if you can just do: if (found) break; David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH 0/6] Remove usage of list iterator past the loop body
From: Dan Carpenter > Sent: 07 March 2022 15:01 > > Updating this API is risky because some places rely on the old behavior > and not all of them have been updated. Here are some additional places > you might want to change. I really can't help thinking that trying to merge this patch is actually impossible. It affects far too many different parts of the tree. Since (I believe) this is a doubly linked list with forwards and backwards pointers that point to a 'node' (not that there is a nice comment to that effect in the header - and there are lots of ways to do linked lists) the 'head' pretty much has to be a 'node'. I'd write the following new defines (but I might be using the old names here): list_first(head, field) First item, NULL if empty. list_last(head, field) Last item NULL if empty. list_next(head, item, field) Item after 'item', NULL if last. list_prev(head, item. field) Item before 'item', NULL if first. You get (something like): #define list_first(head, field) \ head->next == &head ? NULL : list_item(head->next, field) (probably needs typeof(item) from somewhere). The iterator loop is then just: #define loop_iterate(item, head, field) \ for (item = list_first(head, field); item; \ item = list_next(head, item, field) I'm not sure, but making the 'head' be a structure that contains a single member that is a 'node' might help type checking. Then all the code that uses the current defines can slowly be moved over (probably a couple of releases) before the existing defines are deleted. That should simplify all the open-coded search loops that are just as likely to be buggy (possibly more so). David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: mainline build failure for x86_64 allmodconfig with clang
... > * NOTE: > * This file is gcc-parsable HW gospel, coming straight from HW engineers. I never trust hardware engineers to write code :-) (Although at the moment they trust me to write VHDL...) David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: mainline build failure for x86_64 allmodconfig with clang
From: Arnd Bergmann > Sent: 05 August 2022 20:32 ... > One thing to try would be to condense a function call like > > dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport( > ... > /* more arguments */); > > into calling conventions that take a pointer to 'mode_lib->vba' and another > one to 'v', so these are no longer passed on the stack individually. Or, if it is only called once (I can't find the source) force it to be inlined. Or just shoot the software engineer who thinks 100 arguments is sane. :-) David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH] drm/amd/display: fix i386 frame size warning
From: Hamza Mahfooz > Sent: 18 August 2022 17:49 > > Addresses the following warning: > drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3596:6: > error: stack frame > size (2092) exceeds limit (2048) in > 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe- > larger-than] > void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib > *mode_lib) > ^ > > UseMinimumDCFCLK() is eating away at > dml30_ModeSupportAndSystemConfigurationFull()'s stack space, so use a > pointer to struct vba_vars_st instead of passing lots of large arrays > as parameters by value. > > Signed-off-by: Hamza Mahfooz > --- > .../dc/dml/dcn30/display_mode_vba_30.c| 295 -- > 1 file changed, 63 insertions(+), 232 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c > b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c > index 876b321b30ca..b7fa003ffe06 100644 > --- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c > +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c > @@ -396,64 +396,10 @@ static void CalculateUrgentBurstFactor( > > static void UseMinimumDCFCLK( > struct display_mode_lib *mode_lib, > - int MaxInterDCNTileRepeaters, > + struct vba_vars_st *v, You should probably add 'const' in there. Thinks will likely break if v->xxx gets changed. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [RFC PATCH 1/9] apple-gmux: use cpu_to_be32 instead of manual reorder
From: Hans de Goede > Sent: 10 February 2023 19:33 > > Hi, > > On 2/10/23 20:09, Hans de Goede wrote: > > Hi, > > > > On 2/10/23 05:48, Orlando Chamberlain wrote: > >> Currently it manually flips the byte order, but we can instead use > >> cpu_to_be32(val) for this. > >> > >> Signed-off-by: Orlando Chamberlain > >> --- > >> drivers/platform/x86/apple-gmux.c | 18 ++ > >> 1 file changed, 2 insertions(+), 16 deletions(-) > >> > >> diff --git a/drivers/platform/x86/apple-gmux.c > >> b/drivers/platform/x86/apple-gmux.c > >> index 9333f82cfa8a..e8cb084cb81f 100644 > >> --- a/drivers/platform/x86/apple-gmux.c > >> +++ b/drivers/platform/x86/apple-gmux.c > >> @@ -94,13 +94,7 @@ static u32 gmux_pio_read32(struct apple_gmux_data > >> *gmux_data, int port) > >> static void gmux_pio_write32(struct apple_gmux_data *gmux_data, int port, > >> u32 val) > >> { > >> - int i; > >> - u8 tmpval; > >> - > >> - for (i = 0; i < 4; i++) { > >> - tmpval = (val >> (i * 8)) & 0xff; > >> - outb(tmpval, gmux_data->iostart + port + i); > >> - } > >> + outl(cpu_to_be32(val), gmux_data->iostart + port); > >> } > >> > >> static int gmux_index_wait_ready(struct apple_gmux_data *gmux_data) > > > > The ioport / indexed-ioport accessed apple_gmux-es likely are (part of?) > > LPC bus devices . Looking at the bus level you are now changing 4 io > > accesses with a size of 1 byte, to 1 32 bit io-access. > > Correction to myself, re-reading the LPC specification, then > if I'm right and this is a LPC device then all IO in/out accesses > are always 1 byte accesses. Since the LPC bus only supports 16 / 32 > bit accesses for DMA cycles. > > So presumably the outl() would get split into 4 separate 8 bit > (port) IO accesses. I wonder if there is something obscure and the order of the 4 bytes writes matters? In any case writing as: iostart = gmux_data->iostart + port; outb(val, iostart); outb(val >> 8, iostart + 1); outb(val >> 16, iostart + 2); outb(val >> 24, ioctart + 3); almost certainly generates better code. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: Calculating array sizes in C - was: Re: Build regressions/improvements in v6.2-rc1
From: Segher Boessenkool > Sent: 20 January 2023 10:54 ... > > > I suggest to file a bug against gcc complaining about a "spurious > > > warning", and using "-Werror -Wno-error-sizeof-pointer-div" until gcc is > > > adapted to not emit the warning about the pointer division if the result > > > is not used. > > Yeah. If the first operand of a conditional operator is non-zero, the > second operand is not evaluated, and if the first is zero, the third > operand is not evaluated. It is better if we do not warn about > something we do not evaluate. In cases like here where it is clear at > compile time which branch is taken, that shouldn't be too hard. > > Can someone please file a GCC PR? With reduced testcase preferably. It's not a bug. All the operands of the conditional operator have to be valid. It might be that the optimiser can discard one, but that happens much later on. Even the operands of choose_expr() have to be valid - but can have different types. I'm not sure what the code is trying to do or why it is failing. Was it a fail in userspace - where the option to allow sizeof (void) isn't allowed. FWIW you can check for a compile-time NULL (or 0) with: #define is_null(x) (sizeof *(0 : (void *)(x) ? (int *)0) != 1) Although that is a compile-time error for non-NULL unless 'void *' arithmetic is allowed. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH] Optimized division operation to shift operation
From: Christian König > Sent: 15 April 2020 08:57 > Am 15.04.20 um 09:41 schrieb Jani Nikula: > > On Tue, 14 Apr 2020, Alex Deucher wrote: > >> On Tue, Apr 14, 2020 at 9:05 AM Bernard Zhao wrote: > >>> On some processors, the / operate will call the compiler`s div lib, > >>> which is low efficient, We can replace the / operation with shift, > >>> so that we can replace the call of the division library with one > >>> shift assembly instruction. > > This was applied already, and it's not in a driver I look after... but > > to me this feels like something that really should be > > justified. Using >> instead of / for multiples of 2 division mattered 20 > > years ago, I'd be surprised if it still did on modern compilers. > > I have similar worries, especially since we replace the "/ (4 * 2)" with > ">> 3" it's making the code just a bit less readable. > > And that the code runs exactly once while loading the driver and pushing > the firmware into the hardware. So performance is completely irrelevant > here. Force the division to be unsigned and the compiler will use a shift. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: AMDGPU and 16B stack alignment
From: Arnd Bergmann > Sent: 15 October 2019 08:19 > > On Tue, Oct 15, 2019 at 9:08 AM S, Shirish wrote: > > On 10/15/2019 3:52 AM, Nick Desaulniers wrote: > > > My gcc build fails with below errors: > > > > dcn_calcs.c:1:0: error: -mpreferred-stack-boundary=3 is not between 4 and 12 > > > > dcn_calc_math.c:1:0: error: -mpreferred-stack-boundary=3 is not between 4 > > and 12 > > > > While GPF observed on clang builds seem to be fixed. > > Ok, so it seems that gcc insists on having at least 2^4 bytes stack > alignment when > SSE is enabled on x86-64, but does not actually rely on that for > correct operation > unless it's using sse2. So -msse always has to be paired with > -mpreferred-stack-boundary=3. > > For clang, it sounds like the opposite is true: when passing 16 byte > stack alignment > and having sse/sse2 enabled, it requires the incoming stack to be 16 > byte aligned, > but passing 8 byte alignment makes it do the right thing. > > So, should we just always pass $(call cc-option, -mpreferred-stack-boundary=4) > to get the desired outcome on both? It probably won't solve the problem. You need to find all the asm blocks that call back into C and ensure they maintain the required stack alignment. This might be possible in the kernel, but is almost impossible in userspace. ISTR that gcc arbitrarily changed the stack alignment for i386 a few years ago. While it helped code generation it broke a lot of things. I can't remember the correct set of options to get the stack alignment code added only where it was needed (generates a double %bp frame). David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx