Re: GSoC 2025: In-Memory Filesystem for GPU Offloading Tests
On 10/03/2025 22:56, Arijit Kumar Das wrote:
> Hello Andrew,
>
> Thank you for the detailed response! This gives me a much clearer picture of how things work.
>
> Regarding the two possible approaches:
>
> * I personally find *Option A (self-contained in-memory FS)* more interesting, and I'd like to work on it first.
> * However, if *Option B (RPC-based host FS access)* is the preferred approach for GSoC, I’d be happy to work on that as well.

I'll defer to Thomas who proposed the project and volunteered to act as GSoC mentor. :)

Have fun!

Andrew
Re: What branch to use for the Algol 68 front-end
>> Hi Jose,
>>
>> On Sat, Mar 08, 2025 at 02:17:52PM +0100, Jose E. Marchesi wrote:
>>> > Since you already have a fork on the (experimental) forge we could
>>> > also move your fork under https://forge.sourceware.org/gcc that way
>>> > you can experiment with merge requests if you like. Even if all your
>>> > patches still go to the algol68 list first.
>>>
>>> I am all for continuing using the forge if it is useful for the
>>> experiment.
>>
>> I think it would. Also I think it would be fun to have one of the
>> oldest language frontends use the most modern development environments
>> :)
>
> :)
>
>>> Let's see. Right now the resources we are using are:
>>>
>>> Website: https://gcc.gnu.org/wiki/Algol68FrontEnd
>>> Mailing list: algo...@gcc.gnu.org
>>> Bugzilla: https://gcc.gnu.org/bugzilla
>>> Git repo: https://forge.sourceware.org/jemarch/a68-gcc (branch a68)
>>>
>>> If I understand your suggestion right, we would simply move the git repo
>>> to:
>>>
>>> https://forge.sourceware.org/gcc/algol68 (branch a68)
>>
>> Yes. To do that you should become a member/owner of the gcc
>> organization. https://forge.sourceware.org/org/gcc An existing
>> member/owner should be able to accept you.
>> https://forge.sourceware.org/org/gcc/members
>
> Ok. I have asked in IRC. Hopefully someone can add my user to that
> list.

This is done.

>> Then you can move your repository/fork to the gcc organization by
>> going to the existing repository Settings at
>> https://forge.sourceware.org/jemarch/algol68/settings Scroll down to
>> "Danger zone" and click on "Transfer ownership" The new owner would be
>> gcc. It does need to be accepted by one of the gcc Owners.
>
> Note that the repo is currently named jemarch/a68-gcc, but I see all the
> repos under the gcc organization start with "gcc":
>
> gcc-mirror
> gcc-wwwdocs-mirror
> gcc-TEST
> gcc-wwwdocs-TEST
>
> Probably it is a good idea to be consistent, so I will move the
> jemarch/a68-gcc repo to gcc/a68-gcc and then rename it to gcc/gcc-a68.

This is also done. The transfer and rename worked perfectly well. The forge is even able to redirect from jemarch/a68-gcc to gcc/gcc-a68 by itself.

>> Now you can set up a Team inside the gcc organization of people who
>> help maintain this repository. You can do this by either creating a
>> list of collaborators or picking an existing gcc team at
>> https://forge.sourceware.org/gcc/algol68/settings/collaboration
>
> Which one is the default group of people who can admin the repo? The
> members of the gcc organization?

jwakely created a Team in the gcc organization to administer the gcc/gcc-a68 repo.

>> (BTW. This would also be a good time to set the Default branch to
>> a68.)
>
> Good idea. I just did that in jemarch/a68-gcc
>
>>> Then people could fork it, send patch series based on their forks to be
>>> reviewed in the mailing list, get feedback, do corrections, rinse and
>>> repeat. Then when the series has been OKed a PR can be made and the
>>> maintainer just merges it. Is this the idea?
>>
>> Yes, that would be the traditional email workflow extended to the
>> forge. Note that (very unforge-like) you (the algol68 team members)
>> can also directly push to the repository through ssh, sidestepping
>> creating a merge request if that is how you want to do things.
>
> Sounds good!
>
>>> Is it possible to get emails sent to the mailing list when merges
>>> happen, merge requests are issued, etc?
>>
>> People can get emails by clicking the "Watch" button on the
>> repository.
>
> I think that's probably enough for now. If somebody sends a merge
> request via the web, some admin can contact the person asking her to
> send a patch series to the mailing list for review.

>> We could try creating a "fake" account that has the mailing list as
>> email address. But I am a little hesitant experimenting with "fake"
>> users. It might be good if the owner of a repo could add an email
>> address to the Watch list. Maybe this should be a new feature request
>> to Forgejo.
Re: GCC does not optimize well enough with vectors on bitshift
Correct link is https://godbolt.org/z/GfeTobMvs

On Mon, Mar 10, 2025 at 4:45 PM Qwert Nerdish wrote:
> On this godbolt link at https://godbolt.org/z/GfeTobMvs, the two C source
> codes behave identical.
> Yet the second source code does not use vectors and is 30% slower when I
> tested it.
GCC does not optimize well enough with vectors on bitshift
On this godbolt link at https://godbolt.org/z/GfeTobMvs, the two C source codes behave identically. Yet the second source code does not use vectors and is 30% slower when I tested it.
Re: GCC does not optimize well enough with vectors on bitshift
Thanks for the report! Please file it at https://gcc.gnu.org/bugzilla/enter_bug.cgi?product=gcc

On Mon, Mar 10, 2025 at 10:47 AM Qwert Nerdish via Gcc wrote:
> > Correct link is https://godbolt.org/z/GfeTobMvs
> >
> > On Mon, Mar 10, 2025 at 4:45 PM Qwert Nerdish wrote:
> >
> > > On this godbolt link at https://godbolt.org/z/GfeTobMvs, the two C source
> > > codes behave identical.
> > > Yet the second source code does not use vectors and is 30% slower when I
> > > tested it.
>
> --
> Matt
> (he/him)
GSOC interest in Extend the static analysis pass, [OpenACC]
Hello, my name is Kaaden and I am a student at the University of Alberta in Canada. I am interested in pursuing the "Extend the static analysis pass" idea as a medium size project.

I have cloned and built gcc, run the testsuite, and would like a nudge in the direction of what to look at next. I searched in bugzilla for terms like "OpenACC" and "static analysis". Bug 118627 looks like it might be a good candidate for a first patch. Any guidance on the former, or suggestions for files that would be most helpful for me to read and try to understand, would be greatly appreciated. I am open to any input.

Also, I noticed that this link in the project idea description is dead:
https://mid.mail-archive.com/875yp9pyyx.fsf@euler.schwinge.homeip.net

More about me: I have previous experience in compilers (and working in large codebases) from interning at IBM, where I worked on their COBOL compiler and binary optimiser (mostly backend). I have also taken a compiler course with a project of an LLVM-based compiler (mostly frontend). I have also taken, and enjoyed, a GPU programming course, so extending checks for OpenACC caught my eye. In that course we mainly worked with CUDA but discussed other tools such as OpenACC as well, so I have some familiarity.

Thanks,
Kaaden
GSOC: Guidance on LTO, and Static Analysis Projects
Hello,

I am an engineering student. I’ve worked on high frequency trading systems, and (research) on the Linux kernel I/O and memory subsystems. I have been looking to start contributing to GCC for quite some time now, as my work utilises it extensively :) But GCC is complex, and I believe the mentorship from GSOC will enable me to start. I have of course gone through the “Before you Apply” section (and have also built GCC and executed the test suite). I’d also like to mention that David’s newbie guide and Prof Uday Khedkar’s content have played a huge role in motivating me to apply.

I wish to know what contributions I can make towards Link-Time Optimization. It would be amazing if someone could point out who I should discuss this with.

Additionally, I went through the linked issues in “Extend the static analysis pass” mentored by David Malcolm. Refactoring the format-string logic would familiarise me with the GCC codebase, and my experience with the Linux kernel and CPython APIs may be of use in integrating the checker there.

I’m hence looking into both for now, and intend to decide based on the information I get about LTO from here on.

Thanks and regards,
Yatindra Indoria
Supporting “crippled” MIPS implementations as cpu option
Hello,

Direct questions are listed at the end, for the impatient :) Hopefully my mail client wraps the text properly; if not, I apologize in advance. I haven’t used this client for mailing list posts before…

I’m looking for information on GCC patch submission, hoping someone can provide some guidance. Specifically, the formal or informal guidelines used when determining whether a specific submission is, in principle, something that is welcomed, assuming it meets technical guidelines. I do not want to send a patch if it won’t be considered, as I don’t want to waste developers’ time.

What I’m talking about: I would like to understand how open the project is to accepting small patches that add support for a CPU (-mcpu targets) when they’re very slight variations within a supported architecture.

Why: I have the occasional need to have gcc emit MIPS1 code that does not contain a subset of loads and stores (the lwl/swl/lwr/swr instructions).

Context/History: There are several patches floating around that do this, primarily from developers working on embedded networking development on a few Realtek SoCs in the rtl819x family. These CPUs do not support these instructions due to patent issues, and will SIGILL when encountering them. I’m not aware of any work to contribute these upstream after a few quick searches. There is extensive information available on OpenWRT [1].

Some specific questions:

1. Are any gcc mips experts aware of a way to accomplish what I described above (no lwl/swl/lwr/swr instructions) *without* patching GCC? I have not yet tested -mno-struct-align/-mstrict-align, though they sound promising. I plan to give them a try; it may render this all moot. I may be misunderstanding precisely what they do.

2. Are these sorts of patches (mcpu target, unpopular architecture) generally considered currently?

3. If “it depends” - what are the primary considerations? Does the popularity of the target factor into it? I admit that this CPU is not popular, and though it can still be found in embedded network devices, it is only getting less common.

4. If such a patch produces code that is inherently incompatible with glibc (or some other core dependency of a common toolchain), is that considered a blocker? What I’m referring to here is a (theoretical) example where glibc uses the “bad” instructions in an asm block as part of some low-level component (iirc, sigsetjmp, some of the loader functionality, etc. use hand-written asm).

I understand this is a pretty niche thing, only benefitting a subset of GCC users, so I’m not expecting a lot of willingness to accept such a patch, despite its relative simplicity. I’m appreciative of any advice, guidance or commentary the community has to offer, even if the answer is effectively “don’t bother” :)

Thanks!

1. https://openwrt.org/docs/techref/hardware/soc/soc.realtek
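[Editor's note: for concreteness, a hypothetical C fragment of the kind that typically makes GCC use the unaligned-access instructions on MIPS1. A packed struct forces a misaligned word load, which the compiler normally expands to lwl/lwr (swl/swr for stores) when the ISA is assumed to have them. This is only an illustrative sketch, not code taken from the OpenWRT patches mentioned above.]

/* Hypothetical example: a misaligned 32-bit load of the sort that MIPS1
   code generation usually implements with lwl/lwr (and swl/swr for
   stores) -- the instructions the rtl819x cores cannot execute.  */

#include <stdint.h>

struct __attribute__ ((packed)) header
{
  uint8_t tag;
  uint32_t value;   /* at offset 1, i.e. not word-aligned */
};

uint32_t
read_value (const struct header *h)
{
  return h->value;  /* with -march=mips1 this typically becomes lwl/lwr */
}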
Re: [RFC] RISC-V: Go PLT for CALL/JUMP/RVC_JUMP if `h->plt.offset' isn't -1
On Mon, Mar 10, 2025 at 11:33 AM Fangrui Song wrote:
>> (lld has a quite simple model where undefined non-weak and undefined
>> weak symbols are handled in a unified way.
>> A symbol is preemptible if:
>>
>> * -shared or at least one input file is a DSO, and
>> * the symbol is undefined or exported (to .dynsym due to
>>   --export-dynamic/--dynamic-list/referenced by DSO/etc), and
>> * other conditions under which the symbol is preemptible
>>
>> Then, a preemptible symbol might need a PLT and associated JUMP_SLOT
>> relocation.)
>
> Thanks for the example! I see that the branch target is now more
> meaningful after the patch.

Thanks for the detailed description of lld's behavior for reference. The behavior you described above is the main purpose of this patch: when "at least one input file is a DSO but --no-pie", undefined weak and non-weak symbols used in calls will also be preemptible and go through the PLT. Palmer and I were having a hard time figuring out how to describe this problem briefly... Thanks for helping us out :-)

Nelson
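[Editor's note: a tiny hypothetical C example of the kind of reference being discussed; it is not taken from the patch or its testsuite. The function name maybe_hook is made up.]

/* Hypothetical illustration: an undefined weak function reference.
   When the link involves a shared object (or -shared) and the symbol
   remains undefined/exported, the call below may be routed through a
   PLT entry with an associated JUMP_SLOT relocation, as described in
   the thread above.  */

extern void maybe_hook (void) __attribute__ ((weak));

void
run_hooks (void)
{
  if (maybe_hook)   /* weak: address may be null at run time */
    maybe_hook ();  /* resolved via the PLT if the symbol is preemptible */
}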
enquiry about GSoC
Hi,

I am Manish, a 2nd-year B.Tech student from India. I have been using C for almost a year now, mostly to solve DSA problems on LeetCode, and I started learning C++ a month ago, so I have a bit of an idea of it too.

I was curious about the project "Simple file system for use during Nvidia and AMD GPU code generation testing". I own a laptop with an Nvidia GPU and learnt file I/O in my 1st year. I wanted to know the expected skills for working on this project so that I can brush up on those topics before the start and learn more if required. I also wanted to know if there are any other ways of communicating with the group, or directly with the mentor, to get a clearer idea of the project.

Thank you.
Re: GCC does not optimize well enough with vectors on bitshift
While this doesn't affect your example in this particular case: please don't use `-march=native` on Compiler Explorer for these examples - this will pick whatever architecture your individual query is served from, which may be any of the available AMD or Intel CPUs we run on. There ought to be a pop-up warning you against doing this (I'm not sure why it didn't show up). Please use a specific architecture, e.g. `-march=skylake-avx512` - https://godbolt.org/z/GvTcqasqK

Thanks, Matt :)

On Mon, Mar 10, 2025 at 10:47 AM Qwert Nerdish via Gcc wrote:
> Correct link is https://godbolt.org/z/GfeTobMvs
>
> On Mon, Mar 10, 2025 at 4:45 PM Qwert Nerdish wrote:
>
> > On this godbolt link at https://godbolt.org/z/GfeTobMvs, the two C
> > source codes behave identical.
> > Yet the second source code does not use vectors and is 30% slower when I
> > tested it.

--
Matt
(he/him)
Re: GSoC 2025: In-Memory Filesystem for GPU Offloading Tests
On 10/03/2025 15:37, Arijit Kumar Das via Gcc wrote:
> Hello GCC Community!
>
> I am Arijit Kumar Das, a second-year engineering undergraduate from NIAMT Ranchi, India. While my major isn’t Computer Science, my passion for system programming, embedded systems, and operating systems has driven me toward low-level development. Programming has always fascinated me—it’s like painting with logic, where each block of code works in perfect synchronization.
>
> The project mentioned in the subject immediately caught my attention, as I have been exploring the idea of a simple hobby OS for my Raspberry Pi Zero. Implementing an in-memory filesystem would be an exciting learning opportunity, closely aligning with my interests.
>
> I have carefully read the project description and understand that the goal is to modify *newlib* and the *run tools* to redirect system calls for file I/O operations to a virtual, volatile filesystem in host memory, as the GPU lacks its own filesystem. Please correct me if I’ve misunderstood any aspect.

That was the first of two options suggested. The other option is to implement a pass-through RPC mechanism so that the runtime actually can access the real host file-system.

Option A is more self-contained, but requires inventing a filesystem and ultimately will not help all the tests pass. Option B has more communication code, but doesn't require storing anything manually, and eventually should give full test coverage.

A simple RPC mechanism already exists for the use of printf (actually "write") on GCN, but was not necessary on NVPTX (a "printf" text output API is provided by the driver). The idea is to use a shared memory ring buffer that the host "run" tool polls while the GPU kernel is running.

> I have set up the GCC source tree and am currently browsing relevant files in the *gcc/testsuite* directory. However, I am unsure *where the run tools source files are located and how they interact with newlib system calls.* Any guidance on this would be greatly appreciated so I can get started as soon as possible!

You'll want to install the toolchain following the instructions at https://gcc.gnu.org/wiki/Offloading and try running some simple OpenMP target kernels first.

Newlib isn't part of the GCC repo, so if you can't find the files then that's probably why! The "run" tools are installed as part of the offload toolchain, albeit hidden under the "libexec" directory because they're really only used for testing. You can find the sources with the config/nvptx or config/gcn backend files.

User code is usually written using OpenMP or OpenACC, in which case the libgomp target plugins serve the same function as the "run" tools. These too could use the file-system access, but it's not clear that there's a common use-case for that. The case should at least fail gracefully though (as they do now).

Currently, system calls such as "open" simply return EACCES ("permission denied"), so the stub implementations are fairly easy to understand (e.g. newlib/libc/sys/amdgcn/open.c). The task would be to insert new code there that actually does something. You do not need to modify the compiler itself.

Hope that helps

Andrew

> Best regards,
> Arijit Kumar Das.
>
> *GitHub:* https://github.com/ArijitKD
> *LinkedIn:* https://linkedin.com/in/arijitkd
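[Editor's note: purely as an illustrative sketch of the kind of change Option A implies, the following shows how a stub such as the open() replacement mentioned above could consult a small in-memory table instead of failing unconditionally. This is not actual newlib code; the names memfs_entry, MEMFS_MAX_FILES and memfs_open are hypothetical.]

/* Illustrative sketch only -- not actual newlib code.  A minimal
   in-memory file table that an open() stub could consult instead of
   returning an error unconditionally.  */

#include <errno.h>
#include <fcntl.h>
#include <string.h>

#define MEMFS_MAX_FILES 16
#define MEMFS_MAX_PATH  64

struct memfs_entry
{
  char path[MEMFS_MAX_PATH];  /* name as seen by the test program */
  char *data;                 /* backing store (host or device memory) */
  unsigned int size;          /* current length in bytes */
  int in_use;                 /* slot allocated?  */
};

static struct memfs_entry memfs_table[MEMFS_MAX_FILES];

/* Return a small integer "descriptor" (past stdin/stdout/stderr), or -1
   with errno set, mirroring the usual open() contract.  */
int
memfs_open (const char *path, int flags)
{
  /* Reuse an existing entry if the file was created earlier.  */
  for (int i = 0; i < MEMFS_MAX_FILES; i++)
    if (memfs_table[i].in_use && strcmp (memfs_table[i].path, path) == 0)
      return 3 + i;

  /* Otherwise create a new, empty file if requested.  */
  if (flags & O_CREAT)
    for (int i = 0; i < MEMFS_MAX_FILES; i++)
      if (!memfs_table[i].in_use)
        {
          strncpy (memfs_table[i].path, path, MEMFS_MAX_PATH - 1);
          memfs_table[i].path[MEMFS_MAX_PATH - 1] = '\0';
          memfs_table[i].data = 0;
          memfs_table[i].size = 0;
          memfs_table[i].in_use = 1;
          return 3 + i;
        }

  errno = (flags & O_CREAT) ? ENFILE : ENOENT;
  return -1;
}

[Under Option B, the same entry point would instead marshal the path and flags into the shared-memory ring buffer described above and wait for the host-side "run" tool to perform the real open().]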