Re: Welcome GCC GSoC 2020 participants

Martin Jambor Thu, 21 May 2020 13:00:54 -0700

Hello Tony,

On Wed, May 20 2020, y2s1982 . wrote:
> Hello Martin,
> On Mon, May 18, 2020 at 11:32 AM Martin Jambor <mjam...@suse.cz> wrote:
>
>> Hello Tony,
>>
>> sorry for not getting back to you last week.  Time seems to fly even
>> faster now that I'm forced to work from home :-/  Furthermore, both Jakub
>> and I have been preparing for a big OpenMP meeting that takes place
>> this week.
>
> Wow, is that meeting something I may attend? It sounds like an amazing
> learning opportunity.

Unfortunately no, it is a meeting of representatives of "members of the
OpenMP Architecture Review Board" and is not open to the public.

>> On Tue, May 12 2020, y2s1982 . wrote:
>> > This is Tony Sim. I am excited to report that I have been getting some
>> > documentation out of the way in the past few weeks.
>> > - I have submitted a signed application to ass...@gnu.org. I had a
>> > correspondence with Craig and have submitted a signed document last
>> > Thursday.
[...]
>
> I got a reply from Craig yesterday with the form that has both signatures.
> It seems that part is now complete.
>

Great, it is good to know this has been sorted out.

[...]

> Oh, that makes sense: private keys shouldn't be shared. I do have a public
> key, so I will share it when given the opportunity. I will reach out to the
> Compile Farm later this week to inquire more on the progress of the
> application.

Good.  The compile farm will only come in handy once we get to testing
stuff anyway (assuming you have a reasonable computer to work on).

[...]

>>
>> About "how do I contribute" question you asked on IRC: I think it's
>> going to take a little while before we get to that but once you have the
>> FSF copyright assignment and something to contribute, we'll get you an
>> account allowing you to commit after approval directly to the upstream
>> repo - after approval means once your patch has been officially approved
>> by the respective maintainer on the gcc-patches mailing list.
>>
>> At this point, I'd suggest that you simply clone our git repo and start
>> experimenting.  Patch-via-email is likely to be the most used way of
>> discussing code.  At some point we'll probably need a branch, which can
>> initially sit either on the gcc git server or anywhere else, really.
>> There are mirrors on both gitlab (https://gitlab.com/x86-gcc/gcc) and
>> github (https://github.com/gcc-mirror/gcc), for example, and on many
>> other git hosting services.  Whatever fits your philosophical or
>> practical preferences.
>
> If it is all the same, and since I am familiar with working on github, may
> I work on github?  I took the liberty of creating the fork of gcc-mirror to
> my account. I would like to create a major develop branch within the fork,
> and create minor develop branches from that branch. I would also like to
> plan out my tasks using their Issue tracking system. The minor develop
> branch code would be reviewed via PR by any interested parties,
> particularly Jakub, after which it would be squash-merged to the major
> develop branch of the fork.  We can discuss further on the interval for the
> patch-via-email process to merge the code to upstream, which I assume would
> happen when the code reaches certain maturity, or at least at the end of
> this project.

If that is how you like to work, I guess we can try it out.  Just please
keep in mind that:

1) We are used to reviewing patches in our email clients and prefer it
   to reviews in web-based tools.  I have quite a lot of customizations
   in place that I am used to and so prefer it to
   one-method-fits-everyone web tools.

2) Do not spend too much time thinking about how to organize the
   project.  The time is better spent actually thinking about the
   project itself, particularly because I expect this one to entail a
   lot of experimenting with an occasional dead end.

> I also would like to know how often I should pull from upstream to
> keep the fork up to date.

I personally pull every Monday, sometimes every other Monday if I feel I
am particularly short of time.  As long as it does not cause any painful
merging issues, I'd suggest a similar cadence.  If there are issues,
falling behind a few weeks is OK if it helps you focus, I think.

[...]

>>
>> I know that this delayed response might suggest otherwise, but email
>> will be the main communication method for the project.  I believe Jakub
>> strongly prefers email too, perhaps even more than I do.
>
> I am comfortable with email. The community has been very generous on IRC
> front, too, so I guess I will use both methods.
>
>> More often than not it will be a good idea to CC the gcc mailing list on
>> any email regarding the project.  It is not just the two of us who can
>> help you with issues.  We also generally prefer working in the open.
>>
> Okay. I have CC'd the gcc mailing list. I hope I chose the correct one.

Yes, you indeed have.

>> Having said that, if you'd like to do a hangouts video call to say hello
>> to each other and perhaps to discuss some issues with setting up your
>> work, I personally am definitely happy to do that too.  As a regular
>> communication tool, I did not find videoconferencing to be very useful
>> in the past (but I guess I can be persuaded to try again).
>
> Hmm. In my last coop that ended during the pandemic, we used the video
> conferencing tool to do daily stand-ups so the team can keep tabs on how
> different parts of the project are going and give suggestions as needed. A
> little off-topic, but how often would you like to discuss my progress of
> the project?

So... ideally the stream of emails discussing the overall approach,
followed by a stream of patches and reviews would make it completely
unnecessary to ask you for some kind of regular status reports.
Nevertheless, if some task takes you more than 4-5 working days in which
you don't get back to us, please send us a quick summary of what you
have been working on.  This arrangement of course means that you need to
reach out to us if you believe you are stuck, so please do.

But let me reiterate that I am willing to try a videoconference or two
if you think it would be useful at any point.

>> > As the next step, I planned on following the instructions on
>> > https://gcc.gnu.org/wiki/SummerOfCode#Before_you_apply. I also remember
>> > suggestions that I should learn to read the dump files, which is going to
>> > be the second thing I will try out.
>>
>> OK, all very important, how is it going?
>>
>> In particular, look at (selected few) libgomp testcases in
>> libgomp/testsuite/libgomp.c/ (let's focus only on C for now).  So after
>> you've built gcc (remember to disable bootstrap and stuff), enter the
>> "x86_64-pc-linux-gnu/libgomp/" subdirectory within your build directory
>> (assuming you are on an x86_64) and run:
>>
>>   make -j -k check RUNTESTFLAGS="c.exp"
>>
>> and wait for the tests to be run which can take quite a while.  After
>> they are, inspect the generated file in testsuite/libgomp.log which
>> contains (among other things) the exact command lines used to compile
>> the testcases.  Compile them yourself, add dumping options (I think the
>> most interesting for you are -fdump-tree-original -fdump-tree-gimple
>> -fdump-tree-omplower -fdump-tree-ompexp -fdump-tree-ssa and especially
>> -fdump-tree-optimized), inspect the dumps and start asking questions.
>
>
> I had built gcc but without disabling bootstrap. I read up on it, and was
> wondering what was the need for disabling it. Is it to save time on
> compilation by not compiling the entire gcc?

Yeah, the reason is compile time, but the point is not to avoid
compiling the "entire" gcc but rather not to compile it multiple times.
With bootstrap enabled, GCC is first built using the system compiler,
then again using itself to be optimized with its latest and greatest
optimization passes and then again using the optimized version to
verify that the result is the same, bit-for-bit, to catch any errors in
the latest and greatest optimizations.

But you can also disable building stuff you don't need.  In your case it
is for example enough to build only C and C++ front-ends using the
switch --enable-languages=c,c++ at configure time.  Later on you may
want to add fortran.

> Also, I tried compiling one of my old OpenMP assignments with the flags
> mentioned in the instruction.  The compiled code was using work-sharing to
> perform reduction. It generated 255 files, and I observed .gimple,
> .omplower, .ompexp, and .ompexpssa2.  I am still learning what just
> happened to my code, of course :)  I wrote some of my findings in my blog:
> http://shavedyak.blogspot.com/2020/05/debugging-with-gcc-gimple.html
> I haven't looked at the optimized output, so I will look at that, too.  My
> understanding is that each of the 255 files represents a transformation
> from the .gimple file, each either translating the code to gimple or
> optimizing before they are used to create the binary.

Generally speaking, yes.  But it seems like you also used -fdump-rtl-all
option, which dumps the RTL intermediate representation of each function
after each pass.  RTL is a lower-level representation than gimple and is
very different from gimple.  I believe you can safely skip those now -
those tell you what lowest-level optimizations were performed, what
register allocation did etc.

Ditto for dumps generated by inter-procedural (IPA) passes and
associated (symbol table, call graph) infrastructure, produced with
-fdump-ipa-* options.  Those are a whole different topic.

I suggested you look at (selected) gimple dumps because I think those
are the easiest to read, not unlike a primitive C, and will show you
what GCC produces for various OpenMP constructs in a form that is easier
to grok than assembly.  You can have many gimple (tree) dumps generated
because there's one after each optimization pass.  But since your focus
is not on optimization, the last gimple dump called ".optimized" is
probably the only really interesting one for you.  That shows the final
gimple before it is translated to RTL (reduce optimization level to -O1
if optimizations like inlining confuse you).
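
To have something small to produce such a dump from, a minimal sketch
like the following could serve as input.  The function name sum_to is
made up for illustration and does not come from the libgomp testsuite.

```c
/* Hypothetical example: sums 1..n with an OpenMP work-sharing reduction.
   Without -fopenmp the pragma is ignored and the loop runs serially,
   which produces the same result.  */
long
sum_to (int n)
{
  long sum = 0;
#pragma omp parallel for reduction(+:sum)
  for (int i = 1; i <= n; i++)
    sum += i;
  return sum;
}
```

Assuming the file is saved as sum.c, compiling it with something like
"gcc -O1 -fopenmp -fdump-tree-optimized -c sum.c" should leave a dump
file whose name ends in ".optimized" in the current directory.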

And by the way, even though gimple representation of the same code is
very similar on different architectures, there is a myriad of details
(e.g. all the type sizes) which make it very much
architecture-dependent.
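
As a small illustration of such a detail, consider the sketch below; the
helper name bytes_for_longs is hypothetical.  The sizeof expression is
resolved at compile time, so one would expect the dumps to show a plain
constant in its place, and that constant depends on the target.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes needed for n elements of type
   long.  sizeof (long) is a compile-time constant, so its value (for
   example 4 or 8, depending on the target) ends up in the gimple.  */
size_t
bytes_for_longs (size_t n)
{
  return n * sizeof (long);
}
```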

>>
>> Dumps will show you what the compiler produces but most of the work in
>> this project will probably be done in the run-time library libgomp.  So
>> look at its source, the generated dump files should show you what are
>> the entry points and when they are called.  Please make sure you
>> understand how the library works for simple OpenMP (example) programs.
>> Ask more questions.
>>
> I will try compiling the test cases you mentioned and try to understand the
> gimple more in depth. I will also try to see which part of the libgomp is
> making the translation. Is it correct for me to assume that libgomp is all
> about reading C code and manipulating GIMPLE?
>

No, GCC, the compiler, reads C and then takes the code through several
intermediate representations, one of which is gimple, optimizes it and
produces assembly.

If that C file contains OpenMP directives (and you compile with
-fopenmp), many of those are converted in one way or another into calls
into the "GNU offloading and multi-processing (run-time) library":
libgomp.  It used to be just the GNU OpenMP library but now it is also the
run-time library for OpenACC.

For example, #pragma omp parallel is compiled in a way that the body of
the construct is outlined into a special artificial function and the
construct itself is compiled into a call to a function GOMP_parallel,
with a reference to the function with the body passed in one of the
parameters.  In the optimized gimple dump, the function is called
__builtin_GOMP_parallel which I admit is slightly confusing, but it is
the same thing - and the concept should be well visible in the dump.

GOMP_parallel is a function in libgomp.  Grep-ing for it in the libgomp
subdirectory finds it in parallel.c.  From the dump you should have a good
idea what it receives in its parameters.  Reading a large chunk of
libgomp source code starting there - and perhaps at other such entry
points - is probably a good idea.
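
The outlining idea can be sketched by hand in plain C.  This is only an
illustration under stated assumptions, not what GCC actually emits:
fake_parallel is a hypothetical stand-in that calls the outlined
function sequentially instead of starting threads, and the names
shared_data, outlined_body and run_region are made up.  The parameter
list of the real GOMP_parallel should be checked in parallel.c.

```c
/* Variables the region refers to are passed through a structure; the
   compiler builds something similar for the real outlined function.  */
struct shared_data
{
  int *counter;
};

/* The body of the parallel construct, outlined into its own function.  */
static void
outlined_body (void *arg)
{
  struct shared_data *d = arg;
  *d->counter += 1;
}

/* Hypothetical stand-in for the libgomp entry point: it receives the
   outlined function and its data, but simply calls the function
   num_threads times in sequence rather than in real threads.  */
static void
fake_parallel (void (*fn) (void *), void *data, unsigned num_threads)
{
  for (unsigned i = 0; i < num_threads; i++)
    fn (data);
}

/* Roughly what a function containing "#pragma omp parallel" turns
   into: set up the shared data, then hand the body to the runtime.  */
int
run_region (unsigned num_threads)
{
  int counter = 0;
  struct shared_data d = { &counter };
  fake_parallel (outlined_body, &d, num_threads);
  return counter;
}
```

Because the stand-in runs the body sequentially, the unsynchronized
increment is safe here; with real threads it would be a data race.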

>>
>> > I also feel that I should be reading up on OMPD documentation more
>> > thoroughly, as well. Please feel free to give any suggestions on this
>> > path.
>> >
>>
>> That's exactly right.  I currently can only point to the specification
>> itself at openmp.org and to the only public implementation in (LLVM)
>> libomp.  I personally have only had a superficial look at the former and
>> none at all at the latter, so cannot provide much advice at this point.
>> But this is the point where the actual project work starts ...so feel
>> free to ask yet more questions.
>>
> I skimmed through the documentation to familiarize myself with the
> interface. I
> would have to read more on it as I go through the development.
> I also looked at the clang project. I could see how some of the document
> was used to create headers and constants. What I didn't get is their
> references to gdb: does that mean something different in clang or is that
> referencing GCC's gdb? An entire folder is dedicated to gdb-wrapper, for
> example, and a commit history also references gdb.

I can only guess they indeed refer to GNU gdb.  

>>
>> > Furthermore, since this is supposed to be about community bonding, I was
>> > wondering if there's any suggestions you might have for me in this
>> > regard.
>>
>> Please be persistent if people fail to respond to you for a long time :-)
>> Ping me or Jakub if we have not replied for a few days.  Ping your
>> patch on the mailing list if it has not been reviewed for two weeks.  It
>> unfortunately can happen that we keep postponing a reply for a few days
>> too many.
>>
> Okay. I will keep that time frame in mind.  Also, are there any suggestions
> to the work schedule defined in my proposal? Should I change anything from
> it?

IMHO the first 4-6 weeks look good, then we'll re-adjust.

>>
>> >
>> > I look forward to working with you and the community.
>>
>> This is going to be a very difficult project but I am very happy to see
>> it getting started!  I wish you the best of luck!
>>
> Thank you very much :)

You're very welcome!

Martin
