Re: GSoC 2025: In-Memory Filesystem for GPU Offloading Tests

2025-03-11 Thread Andrew Stubbs

On 10/03/2025 22:56, Arijit Kumar Das wrote:

Hello Andrew,

Thank you for the detailed response! This gives me a much clearer 
picture of how things work.


Regarding the two possible approaches:

  * I personally find *Option A (self-contained in-memory FS)* more
interesting, and I'd like to work on it first.

  * However, if *Option B (RPC-based host FS access)* is the preferred
approach for GSoC, I’d be happy to work on that as well.


I'll defer to Thomas who proposed the project and volunteered to act as 
GSoC mentor. :)


Have fun!

Andrew


Re: What branch to use for the Algol 68 front-end

2025-03-11 Thread Jose E. Marchesi via Gcc


>> Hi Jose,
>>
>> On Sat, Mar 08, 2025 at 02:17:52PM +0100, Jose E. Marchesi wrote:
>>> > Since you already have a fork on the (experimental) forge we could
>>> > also move your fork under https://forge.sourceware.org/gcc that way
>>> > you can experiment with merge requests if you like. Even if all your
>>> > patches still go to the algol68 list first.
>>> 
>>> I am all for continuing using the forge if it is useful for the
>>> experiment.
>>
>> I think it would. Also I think it would be fun to have one of the
>> oldest language frontends use the most modern development environments
>> :)
>
> :)
>
>>> Let's see.  Right now the resources we are using are:
>>> 
>>>   Website:  https://gcc.gnu.org/wiki/Algol68FrontEnd
>>>   Mailing list: algo...@gcc.gnu.org
>>>   Bugzilla: https://gcc.gnu.org/bugzilla
>>>   Git repo: https://forge.sourceware.org/jemarch/a68-gcc (branch a68)
>>> 
>>> If I understand your suggestion right, we would simply move the git repo
>>> to:
>>> 
>>>   https://forge.sourceware.org/gcc/algol68 (branch a68)
>>
>> Yes. To do that you should become a member/owner of the gcc
>> organization. https://forge.sourceware.org/org/gcc An existing
>> member/owner should be able to accept you.
>> https://forge.sourceware.org/org/gcc/members
>
> Ok.  I have asked in IRC.  Hopefully someone can add my user to that
> list.

This is done.

>> Then you can move your repository/fork to the gcc organization by
>> going to the existing repository Settings at
>> https://forge.sourceware.org/jemarch/algol68/settings. Scroll down to
>> "Danger zone" and click on "Transfer ownership". The new owner would be
>> gcc. It does need to be accepted by one of the gcc Owners.
>
> Note that the repo is currently named jemarch/a68-gcc, but I see all the
> repos under the gcc organization start with "gcc":
>
>gcc-mirror
>gcc-wwwdocs-mirror
>gcc-TEST
>gcc-wwwdocs-TEST
>
> Probably it is a good idea to be consistent, so I will move the
> jemarch/a68-gcc repo to gcc/a68-gcc and then rename it to gcc/gcc-a68.

This is also done.  The transfer and rename worked perfectly well.  The
forge is even able to redirect from jemarch/a68-gcc to gcc/gcc-a68 by
itself.

>> Now you can set up a Team inside the gcc organization of people who
>> help maintain this repository. You can do this by either creating a
>> list of collaborators or picking an existing gcc team at
>> https://forge.sourceware.org/gcc/algol68/settings/collaboration
>
> Which one is the default group of people who can admin the repo?  The
> members of the gcc organization?

jwakely created a Team in the gcc organization to administer the
gcc/gcc-a68 repo.

>> (BTW. This would also be a good time to set the Default branch to
>> a68.)
>
> Good idea.  I just did that in jemarch/a68-gcc
>
>>> Then people could fork it, send patch series based on their forks to be
>>> reviewed in the mailing list, get feedback, do corrections, rinse and
>>> repeat.  Then when the series has been OKed a PR can be made and the
>>> maintainer just merges it.  Is this the idea?
>>
>> Yes, that would be the traditional email workflow extended to the
>> forge. Note that (very unforge-like) you (the algol68 team members)
>> can also directly push to the repository through ssh, sidestepping
>> creating a merge request, if that is how you want to do things.
>
> Sounds good!
>
>>> Is it possible to get emails sent to the mailing list when merges
>>> happen, merge requests are issued, etc?
>>
>> People can get emails by clicking the "Watch" button on the
>> repository.
>
> I think that's probably enough for now.  If somebody sends a merge
> request via the web, some admin can contact the person asking her to
> send a patch series to the mailing list for review.
>
>> We could try creating a "fake" account that has the mailing list as
>> its email address. But I am a little hesitant about experimenting with
>> "fake" users. It might be good if the owner of a repo could add an
>> email address to the Watch list. Maybe this should be a new feature
>> request to Forgejo.


Re: GCC does not optimize well enough with vectors on bitshift

2025-03-11 Thread Qwert Nerdish via Gcc
Correct link is https://godbolt.org/z/GfeTobMvs

On Mon, Mar 10, 2025 at 4:45 PM Qwert Nerdish  wrote:

> On this godbolt link at https://godbolt.org/z/GfeTobMvs, the two C source
> codes behave identically.
> Yet the second source code does not use vectors and is 30% slower when I
> tested it.
>


GCC does not optimize well enough with vectors on bitshift

2025-03-11 Thread Qwert Nerdish via Gcc
On this godbolt link at https://godbolt.org/z/GfeTobMvs, the two C source
codes behave identically.
Yet the second source code does not use vectors and is 30% slower when I
tested it.


Re: GCC does not optimize well enough with vectors on bitshift

2025-03-11 Thread Jason Merrill via Gcc
Thanks for the report!  Please file it at
https://gcc.gnu.org/bugzilla/enter_bug.cgi?product=gcc

> On Mon, Mar 10, 2025 at 10:47 AM Qwert Nerdish via Gcc wrote:
>
> > Correct link is https://godbolt.org/z/GfeTobMvs
> >
> > On Mon, Mar 10, 2025 at 4:45 PM Qwert Nerdish 
> > wrote:
> >
> > > On this godbolt link at https://godbolt.org/z/GfeTobMvs, the two C
> > > source codes behave identically.
> > > Yet the second source code does not use vectors and is 30% slower
> > > when I tested it.
> > >
> >
>
>
> --
> Matt
> (he/him)
>
>


GSOC interest in Extend the static analysis pass, [OpenACC]

2025-03-11 Thread Kaaden Ruman via Gcc
Hello, my name is Kaaden and I am a student at the University of Alberta in 
Canada. I am interested in pursuing the "Extend the static analysis pass" idea 
as a medium size project. 

I have cloned and built gcc, run the testsuite, and would like a nudge in the
direction of what to look at next. I searched in bugzilla for terms like 
"OpenACC" and "static analysis". Bug 118627 looks like it might be a good 
candidate for a first patch. Any guidance on the former or suggestions for 
files that would be most helpful for me to read and try to understand would be 
greatly appreciated. I am open to any input.

Also, I noticed that this link in the project idea description is dead: 
https://mid.mail-archive.com/875yp9pyyx.fsf@euler.schwinge.homeip.net

More about me: I have previous experience in compilers (and working in large
codebases) from interning at IBM, where I worked on their COBOL compiler and
binary optimiser (mostly backend). I have also taken a compiler course with a
project building an LLVM-based compiler (mostly frontend).

I have also taken and enjoyed a GPU programming course, so extending checks for
OpenACC caught my eye. In that course we mainly worked with CUDA but discussed
other tools such as OpenACC as well, so I have some familiarity.

Thanks,
Kaaden

GSOC: Guidance on LTO, and Static Analysis Projects

2025-03-11 Thread Yatindra Indoria via Gcc
Hello,

I am an engineering student. I’ve worked on high-frequency trading systems
and done research on the Linux kernel I/O and memory subsystems. I have been
looking to start contributing to GCC for quite some time now, as my work
utilises it extensively :)

But GCC is complex, and I believe the mentorship from GSoC will enable me to
start. I have of course gone through the “Before you Apply” section (and have
also built GCC and executed the test suite). I’d also like to mention that
David’s newbie guide and Prof Uday Khedker’s content have played a huge role
in motivating me to apply.

I wish to know what contributions I can make towards Link-Time
Optimization (LTO). It would be amazing if someone could point out who I
should discuss this with.

Additionally, I went through the linked issues in “Extend the static
analysis pass”, mentored by David Malcolm. Refactoring the format-string
logic would familiarise me with the GCC codebase, and my experience with
the Linux kernel and CPython APIs may be of use in integrating the checker
there.

I’m hence looking into both for now, and intend to decide based on the
information I get about LTO from here on.

Thanks and regards,
Yatindra Indoria


Supporting “crippled” MIPS implementations as cpu option

2025-03-11 Thread mzpqnxow via Gcc
Hello,

Direct questions listed at the end for the impatient :)

Hopefully my mail client wraps the text properly; if not, I apologize in
advance. I haven’t used this client for mailing list posts before…

I’m looking for information on GCC patch submission, hoping someone can
provide some guidance: specifically, the formal or informal guidelines used
when determining whether a submission is, in principle, something that is
welcomed, assuming it meets technical guidelines. I do not want to send a
patch if it won’t be considered, as I don’t want to waste developers’ time.

What I’m talking about: I would like to understand how open the project is
to accepting small patches that add support for a CPU (-mcpu targets) when
they’re very slight variations within a supported architecture.

Why: I have the occasional need to have gcc emit MIPS1 instructions that do
not contain a subset of loads and stores (the lwl/swl/lwr/swr instructions).
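
For concreteness, here is a hypothetical C snippet (not from the original
mail) of the kind of code where GCC targeting MIPS1 typically emits an
lwl/lwr pair for the unaligned load; the names are purely illustrative:

/* Unaligned 32-bit load: on MIPS1 this is normally expanded into
   lwl/lwr, i.e. exactly the instructions these Realtek cores trap on.
   Inspect the output with e.g. -O2 -march=mips1 -S. */
#include <stdint.h>

struct __attribute__ ((packed)) header
{
  uint8_t tag;
  uint32_t value;     /* starts at offset 1, so it is misaligned */
};

uint32_t
read_value (const struct header *h)
{
  return h->value;    /* the compiler must cope with the misalignment */
}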

Context/History: There are several patches floating around that do this,
primarily from developers working on embedded networking development on a
few Realtek SoCs in the rtl819x family. These CPUs do not support these
instructions due to patent issues, and will SIGILL when encountering them.
I’m not aware of any work to contribute these upstream after a few quick
searches. There is extensive information available on OpenWrt [1].

Some specific questions:

1. Are any gcc MIPS experts aware of a way to accomplish what I described
above (no lwl/swl/lwr/swr instructions) *without* patching GCC? I have not
yet tested -mno-struct-align/-mstrict-align, though they sound promising. I
plan to give them a try; it may render this all moot. I may be
misunderstanding precisely what they do.
2. Are these sorts of patches (-mcpu target, unpopular architecture)
generally considered currently?
3. If “it depends” - what are the primary considerations? Does the
popularity of the target factor into it? I admit that this CPU is not
popular, and though it can still be found in embedded network devices, it
is only getting less common.
4. If such a patch produces code that is inherently incompatible with glibc
(or some other core dependency of a common toolchain), is that considered a
blocker? What I’m referring to here is a (theoretical) example where glibc
uses the “bad” instructions in an asm block as part of some low-level
component (IIRC, sigsetjmp, some of the loader functionality, etc. use
hand-written asm).

I understand this is a pretty niche thing, only benefitting a subset of GCC
users, so I’m not expecting a lot of willingness to accept such a patch,
despite its relative simplicity. I’m appreciative of any advice, guidance,
or commentary the community has to offer, even if the answer is effectively
“don’t bother” :)

Thanks!

1. https://openwrt.org/docs/techref/hardware/soc/soc.realtek


Re: [RFC] RISC-V: Go PLT for CALL/JUMP/RVC_JUMP if `h->plt.offset' isn't -1

2025-03-11 Thread Nelson Chu
On Mon, Mar 10, 2025 at 11:33 AM Fangrui Song  wrote:

> >> (lld has a quite simple model where undefined non-weak and undefined
> >> weak symbols are handled in a unified way.
> >> A symbol is preemptible if:
> >>
> >> * -shared or at least one input file is DSO, and
> >> * the symbol is undefined or exported (to .dynsym due to
> >> --export-dynamic/--dynamic-list/referenced by DSO/etc), and
> >> * other conditions that the symbol is preemptible
> >>
> >> Then, a preemptible symbol might need a PLT and associated JUMP_SLOT
> >> relocation.)
>
> Thanks for the example! I see that the branch target is now more
> meaningful after the patch.
>

Thanks for the detailed description of lld's behavior for reference.  The
behavior you described above is the main purpose of this patch: when "at
least one input file is a DSO but --no-pie", undefined weak and non-weak
symbols used in calls will also be preemptible and go through the PLT.
Palmer and I were having a hard time figuring out how to describe this
problem concisely...  Thanks for helping us out :-)
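
As a purely hypothetical C-level illustration (not taken from the patch or
this thread), the affected situation is a call to a symbol that nothing in
the link defines; with --no-pie and at least one shared-object input, such
a call would be treated as preemptible and routed through a PLT entry:

/* Illustrative only: maybe_hook is undefined and weak, so no input need
   provide it.  Under the behavior discussed above, the call relocation
   would be satisfied via the PLT so the dynamic linker can resolve it. */
extern void maybe_hook (void) __attribute__ ((weak));

void
run (void)
{
  if (maybe_hook)     /* address is null if nothing provides it */
    maybe_hook ();    /* CALL/JUMP reloc -> PLT when preemptible */
}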

Nelson


enquiry about GSoC

2025-03-11 Thread Manish Mathe via Gcc
Hi, I am Manish, a 2nd-year B.Tech student from India, and I have been using
C for almost a year now, mostly to solve DSA problems on LeetCode. I
started learning C++ a month ago, so I have a bit of an idea of it too.

I was curious about the project "Simple file system for use during Nvidia
and AMD GPU code generation testing". I do own a laptop with an Nvidia GPU
and have learnt file I/O in my 1st year. I wanted to know the expected
skills for working on this project so that I can brush up on those topics
before the start and learn more if required.

I also wanted to know if there are any other ways of communicating with the
group or directly with the mentor to get a clearer idea of the project.

Thank you.


Re: GCC does not optimize well enough with vectors on bitshift

2025-03-11 Thread Matt Godbolt
While this doesn't affect your example in this particular case: please
don't use `-march=native` on Compiler Explorer for these examples - this
will pick whatever architecture your individual query is served from, which
may be any of the available AMD or Intel CPUs we run on. There ought to be
a pop-up warning you against doing this (I'm not sure why it didn't show
up).

Please use a specific architecture e.g. `-march=skylake-avx512` -
https://godbolt.org/z/GvTcqasqK

Thanks, Matt :)

On Mon, Mar 10, 2025 at 10:47 AM Qwert Nerdish via Gcc wrote:

> Correct link is https://godbolt.org/z/GfeTobMvs
>
> On Mon, Mar 10, 2025 at 4:45 PM Qwert Nerdish 
> wrote:
>
> > On this godbolt link at https://godbolt.org/z/GfeTobMvs, the two C
> > source codes behave identically.
> > Yet the second source code does not use vectors and is 30% slower when I
> > tested it.
> >
>


-- 
Matt
(he/him)


Re: GSoC 2025: In-Memory Filesystem for GPU Offloading Tests

2025-03-11 Thread Andrew Stubbs

On 10/03/2025 15:37, Arijit Kumar Das via Gcc wrote:

Hello GCC Community!

I am Arijit Kumar Das, a second-year engineering undergraduate from NIAMT
Ranchi, India. While my major isn’t Computer Science, my passion for system
programming, embedded systems, and operating systems has driven me toward
low-level development. Programming has always fascinated me—it’s like
painting with logic, where each block of code works in perfect
synchronization.

The project mentioned in the subject immediately caught my attention, as I
have been exploring the idea of a simple hobby OS for my Raspberry Pi Zero.
Implementing an in-memory filesystem would be an exciting learning
opportunity, closely aligning with my interests.

I have carefully read the project description and understand that the goal
is to modify *newlib* and the *run tools* to redirect system calls for file
I/O operations to a virtual, volatile filesystem in host memory, as the GPU
lacks its own filesystem. Please correct me if I’ve misunderstood any
aspect.


That was the first of two options suggested.  The other option is to 
implement a pass-through RPC mechanism so that the runtime actually can 
access the real host file-system.


Option A is more self-contained, but requires inventing a filesystem and 
ultimately will not help all the tests pass.


Option B has more communication code, but doesn't require storing 
anything manually, and eventually should give full test coverage.


A simple RPC mechanism already exists for the use of printf (actually 
"write") on GCN, but was not necessary on NVPTX (a "printf" text output 
API is provided by the driver).  The idea is to use a shared memory ring 
buffer that the host "run" tool polls while the GPU kernel is running.
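
To make the shape of that mechanism concrete, here is a rough sketch of such
a shared ring buffer (the names and layout are illustrative assumptions, not
the actual GCN console protocol): the device writes records into host-visible
memory, and the host "run" tool polls them while the kernel runs.

#include <stdint.h>
#include <unistd.h>

#define RING_SLOTS 64
#define SLOT_BYTES 256

struct ring_slot
{
  volatile uint32_t ready;        /* set by the device once data is valid */
  uint32_t length;                /* payload bytes actually used */
  char data[SLOT_BYTES];          /* e.g. text for "write", or an RPC record */
};

struct console_ring
{
  volatile uint32_t write_index;  /* advanced by the device */
  volatile uint32_t read_index;   /* advanced by the host poller */
  struct ring_slot slots[RING_SLOTS];
};

/* Host side: drain any completed slots until the kernel finishes. */
static void
poll_ring (struct console_ring *ring, volatile int *kernel_done)
{
  while (!*kernel_done)
    while (ring->read_index != ring->write_index)
      {
        struct ring_slot *slot = &ring->slots[ring->read_index % RING_SLOTS];
        if (!slot->ready)
          break;
        write (STDOUT_FILENO, slot->data, slot->length);
        slot->ready = 0;
        ring->read_index++;
      }
}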



I have set up the GCC source tree and am currently browsing relevant files
in the *gcc/testsuite* directory. However, I am unsure *where the run tools
source files are located and how they interact with newlib system calls.*
Any guidance on this would be greatly appreciated so I can get started as
soon as possible!


You'll want to install the toolchain following the instructions at 
https://gcc.gnu.org/wiki/Offloading and try running some simple OpenMP 
target kernels first.  Newlib isn't part of the GCC repo, so if you 
can't find the files then that's probably why!
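
A minimal OpenMP target kernel of the kind meant here might look like the
following (illustrative only; the exact -foffload target name and options
depend on how your offloading toolchain was configured):

/* hello-offload.c: offload a reduction to the GPU, print on the host.
   Build with something like:
     gcc -fopenmp -foffload=amdgcn-amdhsa hello-offload.c
   (or -foffload=nvptx-none for NVPTX). */
#include <stdio.h>

int
main (void)
{
  int sum = 0;

#pragma omp target map(tofrom: sum)
#pragma omp parallel for reduction(+: sum)
  for (int i = 0; i < 1000; i++)
    sum += i;

  printf ("sum = %d\n", sum);
  return 0;
}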


The "run" tools are installed as part of the offload toolchain, albeit 
hidden under the "libexec" directory because they're really only used 
for testing. You can find the sources with the config/nvptx or 
config/gcn backend files.


User code is usually written using OpenMP or OpenACC, in which case the
libgomp target plugins serve the same function as the "run" tools. These
too could use the file-system access, but it's not clear that there's a
common use-case for that.  Such cases should at least fail gracefully
though (as they do now).


Currently, system calls such as "open" simply return EACCES
("permission denied"), so the stub implementations are fairly easy to
understand (e.g. newlib/libc/sys/amdgcn/open.c).  The task would be to
insert new code there that actually does something.  You do not need to
modify the compiler itself.
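
For reference, such a stub is roughly of this shape (a sketch based on the
description above, not the exact newlib source, which may differ in detail);
the project would replace the body with code that either consults an
in-memory filesystem or forwards the request to the host:

/* Sketch of a newlib syscall stub in the spirit of
   newlib/libc/sys/amdgcn/open.c: today every path is rejected, and the
   GSoC work would put a real implementation here. */
#include <errno.h>

int
open (const char *pathname, int flags, ...)
{
  errno = EACCES;   /* "permission denied" for every file, for now */
  return -1;
}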


Hope that helps

Andrew



Best regards,
Arijit Kumar Das.

*GitHub:* https://github.com/ArijitKD
*LinkedIn:* https://linkedin.com/in/arijitkd