Re: RE: Loop vectorizer optimization questions

2024-01-09 Thread Tamar Christina via Gcc
Hi,

The 01/08/2024 22:46, 钟居哲 wrote:
> Oh. It's nice to see you have support min/max index reduction.
> 
> I knew your patch can handle this following:
> 
> 
> int idx = ii;
> int max = mm;
> for (int i = 0; i < n; ++i) {
>   int x = a[i];
>   if (max < x) {
> max = x;
> idx = i;
>   }
> }
> 
> But I wonder whether your patch can handle this:
> 
> int idx = ii;
> int max = mm;
> for (int i = 0; i < n; ++i) {
>   int x = a[i];
>   if (max <= x) {
> max = x;
> idx = i;
>   }
> }
> 

The last version of the patch we sent handled all conditionals:

https://inbox.sourceware.org/gcc-patches/db9pr08mb6603dccb35007d83c6736167f5...@db9pr08mb6603.eurprd08.prod.outlook.com/

There are some additional testcases in the patch for all these as well.

> Will you continue to work on min/max with index ?

I don't know if I'll have the free time to do so, that's the reason I haven't 
resent the new one.
The engineer who started it no longer works for Arm.

> Or you want me to continue this work base on your patch ?
> 
> I have an initial patch which roughly implemented LLVM's approach but turns 
> out Richi doesn't want me to apply LLVM's approach so your patch may be more 
> reasonable than LLVM's approach.
> 

When Richi reviewed it he wasn't against the approach in the patch 
https://inbox.sourceware.org/gcc-patches/nycvar.yfh.7.76.2105071320170.9...@zhemvz.fhfr.qr/
but he wanted the concept of a dependent reduction to be handle more 
generically, so we could extend it in the future.

I think, from looking at Richi's feedback is that he wants 
vect_recog_minmax_index_pattern to be more general. We've basically hardcoded 
the reduction type,
but it could just be a property on STMT_VINFO.

Unless I'm mistaken the patch already relies on first finding both reductions, 
but we immediately try to resolve the relationship using 
vect_recog_minmax_index_pattern.
Instead I think what Richi wanted was for us to keep track of reductions that 
operate on the same induction variable and after we finish analysing all 
reductions we
try to see if any reductions we kept track of can be combined.

Basically just separate out the discovery and tieing of the reductions.

Am I right here Richi?

I think the codegen part can mostly be used as is, though we might be able to 
do better for VLA.

So it should be fairly straight forward to go from that final patch to what 
Richi wants, but.. I just lack time.

If you want to tackle it that would be great :)

Thanks,
Tamar

> Thanks.
> 
> juzhe.zh...@rivai.ai
> 
> From: Tamar Christina
> Date: 2024-01-09 01:50
> To: 钟居哲; gcc
> CC: rdapp.gcc; 
> richard.guenther
> Subject: RE: Loop vectorizer optimization questions
> >
> > Also, another question is that I am working on min/max reduction with 
> > index, I
> > believe it should be in GCC-15, but I wonder
> > whether I can pre-post for review in stage 4, or I should post patch 
> > (min/max
> > reduction with index) when GCC-15 is open.
> >
> 
> FWIW, We tried to implement this 5 years ago 
> https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534518.html
> and you'll likely get the same feedback if you aren't already doing so.
> 
> I think Richard would prefer to have a general framework these kinds of 
> operations.  We never got around to doing so
> and it's still on my list but if you're taking care of it 
> 
> Just though I'd point out the previous feedback.
> 
> Cheers,
> Tamar
> 
> > Thanks.
> >
> >
> > juzhe.zh...@rivai.ai

-- 


Re: Re: Loop vectorizer optimization questions

2024-01-09 Thread juzhe.zh...@rivai.ai
I see. Thanks Tamar.

I am willing to to investigate Arm's initial patch to see what else we need in 
that patch.

Since min/max reduction with index can improve SPEC performance, I will take a 
look at it in GCC-15.

Thanks a lot !



juzhe.zh...@rivai.ai
 
From: Tamar Christina
Date: 2024-01-09 16:59
To: 钟居哲
CC: richard.guenther; rdapp.gcc; gcc
Subject: Re: RE: Loop vectorizer optimization questions
Hi,
 
The 01/08/2024 22:46, 钟居哲 wrote:
> Oh. It's nice to see you have support min/max index reduction.
> 
> I knew your patch can handle this following:
> 
> 
> int idx = ii;
> int max = mm;
> for (int i = 0; i < n; ++i) {
>   int x = a[i];
>   if (max < x) {
> max = x;
> idx = i;
>   }
> }
> 
> But I wonder whether your patch can handle this:
> 
> int idx = ii;
> int max = mm;
> for (int i = 0; i < n; ++i) {
>   int x = a[i];
>   if (max <= x) {
> max = x;
> idx = i;
>   }
> }
> 
 
The last version of the patch we sent handled all conditionals:
 
https://inbox.sourceware.org/gcc-patches/db9pr08mb6603dccb35007d83c6736167f5...@db9pr08mb6603.eurprd08.prod.outlook.com/
 
There are some additional testcases in the patch for all these as well.
 
> Will you continue to work on min/max with index ?
 
I don't know if I'll have the free time to do so, that's the reason I haven't 
resent the new one.
The engineer who started it no longer works for Arm.
 
> Or you want me to continue this work base on your patch ?
> 
> I have an initial patch which roughly implemented LLVM's approach but turns 
> out Richi doesn't want me to apply LLVM's approach so your patch may be more 
> reasonable than LLVM's approach.
> 
 
When Richi reviewed it he wasn't against the approach in the patch 
https://inbox.sourceware.org/gcc-patches/nycvar.yfh.7.76.2105071320170.9...@zhemvz.fhfr.qr/
but he wanted the concept of a dependent reduction to be handle more 
generically, so we could extend it in the future.
 
I think, from looking at Richi's feedback is that he wants 
vect_recog_minmax_index_pattern to be more general. We've basically hardcoded 
the reduction type,
but it could just be a property on STMT_VINFO.
 
Unless I'm mistaken the patch already relies on first finding both reductions, 
but we immediately try to resolve the relationship using 
vect_recog_minmax_index_pattern.
Instead I think what Richi wanted was for us to keep track of reductions that 
operate on the same induction variable and after we finish analysing all 
reductions we
try to see if any reductions we kept track of can be combined.
 
Basically just separate out the discovery and tieing of the reductions.
 
Am I right here Richi?
 
I think the codegen part can mostly be used as is, though we might be able to 
do better for VLA.
 
So it should be fairly straight forward to go from that final patch to what 
Richi wants, but.. I just lack time.
 
If you want to tackle it that would be great :)
 
Thanks,
Tamar
 
> Thanks.
> 
> juzhe.zh...@rivai.ai
> 
> From: Tamar Christina
> Date: 2024-01-09 01:50
> To: 钟居哲; gcc
> CC: rdapp.gcc; 
> richard.guenther
> Subject: RE: Loop vectorizer optimization questions
> >
> > Also, another question is that I am working on min/max reduction with 
> > index, I
> > believe it should be in GCC-15, but I wonder
> > whether I can pre-post for review in stage 4, or I should post patch 
> > (min/max
> > reduction with index) when GCC-15 is open.
> >
> 
> FWIW, We tried to implement this 5 years ago 
> https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534518.html
> and you'll likely get the same feedback if you aren't already doing so.
> 
> I think Richard would prefer to have a general framework these kinds of 
> operations.  We never got around to doing so
> and it's still on my list but if you're taking care of it 
> 
> Just though I'd point out the previous feedback.
> 
> Cheers,
> Tamar
> 
> > Thanks.
> >
> >
> > juzhe.zh...@rivai.ai
 
-- 
 


Re: Loop vectorizer optimization questions

2024-01-09 Thread Richard Biener via Gcc
On Mon, Jan 8, 2024 at 2:57 PM 钟居哲  wrote:
>
> Hi, Richard.
>
> I saw this following code:
>
>   if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
> {
>   if (direct_internal_fn_supported_p (IFN_VCOND_MASK_LEN, vectype,
>   OPTIMIZE_FOR_SPEED))
> return false;
>   else
> vect_record_loop_mask (loop_vinfo, masks, ncopies, vectype, NULL);
> }
>
> for early break, current early break is not sufficient to support target with 
> length partial vector so that we are not able to enable early break for RVV.
>
> I wonder if I want to support this in middle-end, is it allowed in GCC-14 ? 
> Or should I defer to GCC-15.

Defer to GCC 15.

> Also, another question is that I am working on min/max reduction with index, 
> I believe it should be in GCC-15, but I wonder
> whether I can pre-post for review in stage 4, or I should post patch (min/max 
> reduction with index) when GCC-15 is open.

You can always post patches for review, just don't expect timely ones ;)

> Thanks.
> 
> juzhe.zh...@rivai.ai


Re: [PATCH] panic: suppress gnu_printf warning

2024-01-09 Thread Manuel López-Ibáñez via Gcc

On 07/01/2024 19:21, Andrew Morton wrote:

On Sun,  7 Jan 2024 17:16:41 +0800 Baoquan He  wrote:


with GCC 13.2.1 and W=1, there's compiling warning like this:

kernel/panic.c: In function ?__warn?:
kernel/panic.c:676:17: warning: function ?__warn? might be a candidate for 
?gnu_printf? format attribute [-Wsuggest-attribute=format]
   676 | vprintk(args->fmt, args->args);
   | ^~~

The normal __printf(x,y) adding can't fix it. So add workaround which
disables -Wsuggest-attribute=format to mute it.

...

--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -666,8 +666,13 @@ void __warn(const char *file, int line, void *caller, 
unsigned taint,
pr_warn("WARNING: CPU: %d PID: %d at %pS\n",
raw_smp_processor_id(), current->pid, caller);
  
+#pragma GCC diagnostic push

+#ifndef __clang__
+#pragma GCC diagnostic ignored "-Wsuggest-attribute=format"
+#endif
if (args)
vprintk(args->fmt, args->args);
+#pragma GCC diagnostic pop
  
  	print_modules();

__warn() clearly isn't such a candidate.  I'm suspecting that gcc's
implementation of this warning is pretty crude.  Is it a new thing in
gcc-13.2?


I suspect the warning is about vprintk(), which does seem a printf-like 
function but something (early inlining?) may be messing up the context and GCC 
warns about __warn(). This may be bug: 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28492


If vprintk() already has the format attribute, then the messed up function name 
may be confusing GCC into warning again about it.


Best wishes,

Manuel.



Re: [PATCH 6/4] libbacktrace: Add loaded dlls after initialize

2024-01-09 Thread Björn Schäpers

Am 07.01.2024 um 18:03 schrieb Eli Zaretskii:

Date: Sun, 7 Jan 2024 17:07:06 +0100
Cc: i...@google.com, gcc-patc...@gcc.gnu.org, gcc@gcc.gnu.org
From: Björn Schäpers 


That was about GetModuleHandle, not about GetModuleHandleEx.  For the
latter, all Windows versions that support it also support "wide" APIs.
So my suggestion is to use GetModuleHandleExW here.  However, you will
need to make sure that notification_data->dll_base is declared as
'wchar_t *', not 'char *'.  If dll_base is declared as 'char *', then
only GetModuleHandleExA will work, and you will lose the ability to
support file names with non-ASCII characters outside of the current
system codepage.


The dll_base is a PVOID. With the GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS flag
GetModuleHandleEx does not look for a name, but uses an adress in the module to
get the HMODULE, so you cast it to char* or wchar_t* depending on which function
you call. Actually one could just cast the dll_base to HMODULE, at least in
win32 on x86 the HMODULE of a dll is always its base adress. But to make it
safer and future proof I went the way through GetModuleHandeEx.


In that case, you an call either GetModuleHandeExA or
GetModuleHandeExW, the difference is minor.


Here an updated version without relying on TEXT or TCHAR, directly calling 
GetModuleHandleExW.


Kind regards,
Björn.From a8e1e64ccb56158ec8a7e5de0d5228f3f6f7e5c4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Bj=C3=B6rn=20Sch=C3=A4pers?= 
Date: Sat, 6 Jan 2024 22:53:54 +0100
Subject: [PATCH 3/3] libbacktrace: Add loaded dlls after initialize

libbacktrace/Changelog:

* pecoff.c [HAVE_WINDOWS_H]:
  (dll_notification_data): Added
  (dll_notification_context): Added
  (dll_notification): Added
  (backtrace_initialize): Use LdrRegisterDllNotification to load
  debug information of later loaded
  dlls.
---
 libbacktrace/pecoff.c | 104 +-
 1 file changed, 103 insertions(+), 1 deletion(-)

diff --git a/libbacktrace/pecoff.c b/libbacktrace/pecoff.c
index 647baa39640..d973a26f05a 100644
--- a/libbacktrace/pecoff.c
+++ b/libbacktrace/pecoff.c
@@ -62,6 +62,34 @@ POSSIBILITY OF SUCH DAMAGE.  */
 #undef Module32Next
 #endif
 #endif
+
+#if defined(_ARM_)
+#define NTAPI
+#else
+#define NTAPI __stdcall
+#endif
+
+/* This is a simplified (but binary compatible) version of what Microsoft
+   defines in their documentation. */
+struct dll_notifcation_data
+{
+  ULONG reserved;
+  /* The name as UNICODE_STRING struct. */
+  PVOID full_dll_name;
+  PVOID base_dll_name;
+  PVOID dll_base;
+  ULONG size_of_image;
+};
+
+#define LDR_DLL_NOTIFICATION_REASON_LOADED 1
+
+typedef LONG NTSTATUS;
+typedef VOID CALLBACK (*LDR_DLL_NOTIFICATION)(ULONG,
+ struct dll_notifcation_data*,
+ PVOID);
+typedef NTSTATUS NTAPI (*LDR_REGISTER_FUNCTION)(ULONG,
+   LDR_DLL_NOTIFICATION, PVOID,
+   PVOID*);
 #endif
 
 /* Coff file header.  */
@@ -912,7 +940,8 @@ coff_add (struct backtrace_state *state, int descriptor,
   return 0;
 }
 
-#if defined(HAVE_WINDOWS_H) && !defined(HAVE_TLHELP32_H)
+#ifdef HAVE_WINDOWS_H
+#ifndef HAVE_TLHELP32_H
 static void
 free_modules (struct backtrace_state *state,
  backtrace_error_callback error_callback, void *data,
@@ -958,6 +987,51 @@ get_all_modules (struct backtrace_state *state,
 }
 }
 #endif
+struct dll_notification_context
+{
+  struct backtrace_state *state;
+  backtrace_error_callback error_callback;
+  void *data;
+};
+
+VOID CALLBACK
+dll_notification (ULONG reason,
+struct dll_notifcation_data *notification_data,
+PVOID context)
+{
+  char module_name[MAX_PATH];
+  int descriptor;
+  struct dll_notification_context* dll_context =
+(struct dll_notification_context*) context;
+  struct backtrace_state *state = dll_context->state;
+  void *data = dll_context->data;
+  backtrace_error_callback error_callback = dll_context->data;
+  fileline fileline;
+  int found_sym;
+  int found_dwarf;
+  HMODULE module_handle;
+
+  if (reason != LDR_DLL_NOTIFICATION_REASON_LOADED)
+return;
+
+  if (!GetModuleHandleExW (GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS
+  | GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT,
+  (wchar_t*) notification_data->dll_base,
+  &module_handle))
+return;
+
+  if (!GetModuleFileNameA ((HMODULE) module_handle, module_name, MAX_PATH - 1))
+return;
+
+  descriptor = backtrace_open (module_name, error_callback, data, NULL);
+
+  if (descriptor < 0)
+return;
+
+  coff_add (state, descriptor, error_callback, data, &fileline, &found_sym,
+   &found_dwarf, (uintptr_t) module_handle);
+}
+#endif
 
 /* Initialize the backtrace data we need from an ELF executable.  At
the ELF