Re: Moving to git

2015-08-24 Thread Jakub Jelinek
On Thu, Aug 20, 2015 at 04:09:39PM -0400, Jason Merrill wrote:
> On 08/20/2015 02:23 PM, Jeff Law wrote:
> >I suspect Jakub will strongly want to see some kind of commit hook to
> >associate something similar to an SVN id to each git commit to support
> >his workflow where the SVN ids are associated with the compiler
> >binaries he keeps around for very fast bisection.  I think when we
> >talked about it last year, he just needs an increasing # for each
> >commit, presumably starting with whatever the last SVN ID is when we
> >make the change.
> 
> Jakub: How about using git bisect instead, and identify the compiler
> binaries with the git commit sha1?

That is really not useful.  While you speed up bisection somewhat by avoiding
network traffic and communication with a server, there is still significant
time spent on actually building the compiler.  So, if you use bisection only
occasionally, git bisect may be useful, but if you use it often, it is
still too slow.  The way I use bisection is that I have a cc1/cc1plus/f951
compiler already built either for every 50-200 commits (that is on my ws) or
for every non-library commit to the branch that could affect the compiler (no
testsuite changes etc.).  And for those, identifying them by sha1
hashes is significantly worse than using a monotonically increasing small
number: sha1 hashes are impossible to remember, and you don't know what is
earlier and what is later from just looking at them.

The revision ids are also useful for bugzilla: an r123456
link in text pointing to http://gcc.gnu.org/r123456 is significantly
shorter than a sha1 hash in the text, and again gives an idea of what is
earlier and what is later.

Looking at man git-notes, can we e.g. in some git commit hook
do notes.rewrite. and assign to each trunk or release branch
commit the revision ids starting from the last svn rev id we'll get before
the conversion (and for the converted commits from svn too), remember it
both in the Notes: of the commit and perhaps in some file on
the side (or files, say one for every 1000 revision ids), which would
translate the revision ids to the sha1 hashes?
Then http://gcc.gnu.org/r123456 could keep working, we could mention the
revision numbers in gcc-cvs mails too (here I'd prefer not to send diffs
to that mailing list, but only lists of changed files like now, plus URL to
the commit, the revision id and sha1 hash), and it could be mentioned in gcc
--version too (again, it holds more information than the much longer sha1 sum).
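
The side-file idea could look roughly like the following. This is only a sketch in Python, assuming one plain-text file per 1000 revision ids holding "revision sha1" lines; the file naming scheme and the helpers bucket_path, record and lookup are hypothetical, not an existing git or GCC facility.

```python
from pathlib import Path
from typing import Optional

BUCKET_SIZE = 1000  # "say one for every 1000 revision ids"


def bucket_path(root: Path, rev: int) -> Path:
    """Return the side file that covers revision id `rev` (hypothetical naming)."""
    start = (rev // BUCKET_SIZE) * BUCKET_SIZE
    return root / f"r{start}-r{start + BUCKET_SIZE - 1}.map"


def record(root: Path, rev: int, sha1: str) -> None:
    """Append one 'revision sha1' line to the bucket file for `rev`."""
    root.mkdir(parents=True, exist_ok=True)
    with bucket_path(root, rev).open("a", encoding="ascii") as handle:
        handle.write(f"{rev} {sha1}\n")


def lookup(root: Path, rev: int) -> Optional[str]:
    """Translate a revision id back to its sha1, or None if it is unknown."""
    path = bucket_path(root, rev)
    if not path.exists():
        return None
    for line in path.read_text(encoding="ascii").splitlines():
        number, sha1 = line.split()
        if int(number) == rev:
            return sha1
    return None
```

A web redirect for http://gcc.gnu.org/r123456 would then only need to open one small file per request instead of asking the repository.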

Jakub


Re: Moving to git

2015-08-24 Thread Andreas Schwab
Jakub Jelinek  writes:

> And for those really identifying them by sha1 hashes is significantly
> worse than using monotonically increasing small number, sha1 hashes
> are impossible to remember, and you don't know what is earlier and
> what is later from just looking at it.

git describe gives you such a number (relative to a tag).
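
For reference, git describe prints either a bare tag name or "tag-N-gHASH", where N is the number of commits since the tag. A small Python sketch of splitting that string so the count can be compared; the tag name in the usage note is made up for illustration, and parse_describe is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# "<tag>-<commits since tag>-g<abbreviated sha1>", as printed by git describe.
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")


def parse_describe(text: str) -> Tuple[str, int, Optional[str]]:
    """Split git describe output into (tag, commits since tag, short sha1)."""
    text = text.strip()
    match = DESCRIBE_RE.match(text)
    if match is None:
        # Exactly on a tag: git describe prints just the tag name.
        return text, 0, None
    return match["tag"], int(match["count"]), match["sha"]
```

With a hypothetical tag, parse_describe("some-release-tag-123-gf57da59") yields the tag, the count 123 and the short hash f57da59. The count only orders commits relative to the same tag.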

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: Moving to git

2015-08-24 Thread Jonathan Wakely
On 24 August 2015 at 09:17, Jakub Jelinek wrote:
> The revision ids are also useful for bugzilla, r123456
> links in text pointing to http://gcc.gnu.org/r123456 is significantly
> shorter

The first six characters of the sha1 are usually enough to
unambiguously identify a commit, so we could easily have
https://gcc.gnu.org/git/f00baa or something similar, if we don't use
git-notes to add a revision to the commits.
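
A sketch of what such a short-hash lookup has to do, in Python: accept a prefix, and refuse it when it is unknown or matches more than one commit. The function name and the hashes in the usage note are hypothetical; a real server would more likely ask git itself to resolve the abbreviation.

```python
from typing import Iterable


def resolve_prefix(prefix: str, known_hashes: Iterable[str]) -> str:
    """Return the single full hash that starts with `prefix`."""
    prefix = prefix.lower()
    matches = [h for h in known_hashes if h.startswith(prefix)]
    if not matches:
        raise KeyError(f"no commit starts with {prefix!r}")
    if len(matches) > 1:
        # "usually enough" is not "always enough": report the ambiguity.
        raise ValueError(f"{prefix!r} is ambiguous ({len(matches)} commits)")
    return matches[0]
```

For example, with two made-up hashes "f00baa" + "0" * 34 and "f00bab" + "0" * 34, the prefix "f00baa" resolves, while "f00ba" is rejected as ambiguous.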


Re: Moving to git - bibisect ...

2015-08-24 Thread Michael Meeks
Hi Jakub,

On Mon, 2015-08-24 at 10:17 +0200, Jakub Jelinek wrote:
> > Jakub: How about using git bisect instead, and identify the compiler
> > binaries with the git commit sha1?
> 
> That is really not useful.  While you speed up bisection somewhat by avoiding
> network traffic and communication with a server, there is still significant
> time spent on actually building the compiler.

In LibreOffice land (thanks to Bjoern Michaelsen) we use and publish
binary bisection repositories (bibisect). It takes on the order of an
hour or more on some cutting edge hardware to build each of our binaries -
for most people longer - so we archive our live, runnable commits as you
do - but we check those images into a new git repository.

Each of those is checked in with a commit message that points to the
source hash.

> The way I use bisection is that either I have for every 50-200
> commits a cc1/cc1plus/f951 compiler already built (that is on my ws) or for
> every non-library commit to the branch that could affect the compiler (no
> testsuite changes etc.).

So in our model, those would all go in git and get packed with an
aggressive git gc. We publish these repositories too[1] - with thousands
of binaries built inside them, so non-technical QA guys can download them
and locate the right developer to blame for their pet regression (long
after the date). Interestingly, mostly non-technical QA guys have done this
for several hundred regressions in the last few years.

>   And for those really identifying them by sha1
> hashes is significantly worse than using monotonically increasing small
> number, sha1 hashes are impossible to remember, and you don't know what is
> earlier and what is later from just looking at it.

That's of course true; the hashes are a pain - but bisecting in the
binary repository is easy enough I think - and there is IIRC some degree
of built-in tooling for running scripts/tests on each version to
automate that (I'm sure you have something like that already).

https://wiki.documentfoundation.org/QA/Bibisect#Introduction

has some fluff on our approach.

Of course, aside from that git takes quite some learning to love ;-)
but as/when you're there you wonder how you lived through RCS, CVS, SVN,
etc.

HTH,

Michael.

[1] - this of course involves some horrors of different Linux and ABI
issues and so on that (I hope) gcc would be less prone to problems with.
-- 
 michael.me...@collabora.com  <><, Pseudo Engineer, itinerant idiot



Re: Moving to git

2015-08-24 Thread Eric S. Raymond
Jonathan Wakely :
> On 24 August 2015 at 09:17, Jakub Jelinek wrote:
> > The revision ids are also useful for bugzilla, r123456
> > links in text pointing to http://gcc.gnu.org/r123456 is significantly
> > shorter
> 
> The first six characters of the sha1 are usually enough to
> unambiguously identify a commit, so we could easily have
> https://gcc.gnu.org/git/f00baa or something similar, if we don't use
> git-notes to add a revision to the commits.

I recommend *against* using hashes to identify commits.  Here's what I said
about it in the NTPsec developers guidelines.

=== How to refer to previous commits ===

The best (most human-friendly) way to reference a commit is by quoting its
summary line.  If you need to disambiguate, give its date and author.

The worst way is to quote its git hash, because humans are not good at
keeping random strings of hex digits in working memory.  Besides, hashes
will break if the history is ever moved to another VCS or the repository
has to be surgically altered.
-- 
Eric S. Raymond <http://www.catb.org/~esr/>


Re: Moving to git

2015-08-24 Thread Jonathan Wakely
On 24 August 2015 at 11:42, Eric S. Raymond wrote:
> Jonathan Wakely :
>> On 24 August 2015 at 09:17, Jakub Jelinek wrote:
>> > The revision ids are also useful for bugzilla, r123456
>> > links in text pointing to http://gcc.gnu.org/r123456 is significantly
>> > shorter
>>
>> The first six characters of the sha1 are usually enough to
>> unambiguously identify a commit, so we could easily have
>> https://gcc.gnu.org/git/f00baa or something similar, if we don't use
>> git-notes to add a revision to the commits.
>
> I recommend *against* using hashes to identify commits.  Here's what I said
> about it in the NTPsec developers guidelines.
>
> === How to refer to previous commits ===
>
> The best (most human-friendly) way to reference a commit is by quoting its
> summary line.  If you need to disambiguate, give its date and author.

That doesn't really work if we want Bugzilla to automatically turn
something that looks like a reference to a commit into a hyperlink.
Currently I can say "caused by r227043" in a bugzilla comment and it
links to the relevant commit. I don't really want to have to say
"caused by libstdc++/67294 Don't run timed mutex tests on Darwin" or
"caused by
Author: Jonathan Wakely 
Date:   Thu Aug 20 20:36:19 2015 +
"

It's pretty simple for Bugzilla to look for "r\d+" in comments and
create a hyperlink to https://gcc.gnu.org/\1 without accessing the
repository at all. It would not be practical (for every bugzilla
comment) to search the repo for "libstdc++/67294 Don't run timed mutex
tests on Darwin" to identify a specific commit and create a link to
it.
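
The pattern match described here can be sketched in a few lines of Python. The regular expression and the linkify helper are illustrative assumptions, not Bugzilla's actual implementation; the digits are captured in a group so they can be reused in the URL.

```python
import re

# An "r" followed by digits, standing alone as a word (so "error123" is left alone).
REVISION_RE = re.compile(r"\br(\d+)\b")


def linkify(comment: str) -> str:
    """Turn rNNNNNN references into links without touching the repository."""
    return REVISION_RE.sub(r'<a href="https://gcc.gnu.org/r\1">r\1</a>', comment)
```

linkify("caused by r227043") gives 'caused by <a href="https://gcc.gnu.org/r227043">r227043</a>'; no repository access is needed because the link is derived purely from the text.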

> The worst way is to quote its git hash, because humans are not good at
> keeping random strings of hex digits in working memory.  Besides, hashes
> will break if the history is ever moved to another VCS or the repository
> has to be surgically altered.

We have that situation now with the subversion commit IDs we refer to
in Bugzilla; that doesn't mean they aren't useful.


Re: Offer of help with move to git

2015-08-24 Thread Tom Tromey
Eric> In the meantime, I'm enclosing a contributor map that will need to be
Eric> filled in by whoever does the conversion.  The right sides should become
Eric> full names and preferred email addresses.

It's probably worth starting with the map I used when converting gdb.
There is a lot of overlap between the sets of contributors.

See the file "Total-merged-user-map" here:

https://github.com/tromey/gdb-git-migration

Tom


Re: Moving to git

2015-08-24 Thread Richard Earnshaw
On 24/08/15 12:43, Jonathan Wakely wrote:
> On 24 August 2015 at 11:42, Eric S. Raymond wrote:
>> Jonathan Wakely :
>>> On 24 August 2015 at 09:17, Jakub Jelinek wrote:
>>>> The revision ids are also useful for bugzilla, r123456
>>>> links in text pointing to http://gcc.gnu.org/r123456 is significantly
>>>> shorter
>>>
>>> The first six characters of the sha1 are usually enough to
>>> unambiguously identify a commit, so we could easily have
>>> https://gcc.gnu.org/git/f00baa or something similar, if we don't use
>>> git-notes to add a revision to the commits.
>>
>> I recommend *against* using hashes to identify commits.  Here's what I said
>> about it in the NTPsec developers guidelines.
>>
>> === How to refer to previous commits ===
>>
>> The best (most human-friendly) way to reference a commit is by quoting its
>> summary line.  If you need to disambiguate, give its date and author.
> 
> That doesn't really work if we want Bugzilla to automatically turn
> something that looks like a reference to a commit into a hyperlink.
> Currently I can say "caused by r227043" in a bugzilla comment and it
> links to the relevant commit. I don't really want to have to say
> "caused by libstdc++/67294 Don't run timed mutex tests on Darwin" or
> "caused by
> Author: Jonathan Wakely 
> Date:   Thu Aug 20 20:36:19 2015 +
> "
> 
> It's pretty simple for Bugzilla to look for "r\d+" in comments and
> create a hyperlink to https://gcc.gnu.org/\1 without accessing the
> repository at all. It would not be practical (for every bugzilla
> comment) to search the repo for "libstdc++/67294 Don't run timed mutex
> tests on Darwin" to identify a specific commit and create a link to
> it.
> 

Something like 'git:' ought to be easy enough to write into
BZ (you normally cut-and-paste in this sort of case, anyway) and should
be trivial to write a scanner for.  It's not like these numbers really
have to be ascending in this case.


>> The worst way is to quote its git hash, because humans are not good at
>> keeping random strings of hex digits in working memory.  Besides, hashes
>> will break if the history is ever moved to another VCS or the repository
>> has to be surgically altered.
> 
> We have that situation now with the subversion commit IDs we refer to
> in Bugzilla, that doesn't mean it isn't useful.
> 



Re: Offer of help with move to git

2015-08-24 Thread Joseph Myers
On Sun, 23 Aug 2015, Eric S. Raymond wrote:

> Whoever you designate to lead the conversion *should read this

Jason Merrill  is leading the conversion.

> Here's a particular warning: though you may be using git-svn for live
> gatewaying, it is *not* the way to go for full-history conversions, as
> it is likely to screw up the history in ways that are not apparent
> from merely looking at the head revision.  The dump analyzer in my
> reposurgeon tool does a substantially better job.

Hence my suggestion in  
of reconverting and then combining with the existing git-svn history via 
renaming all the refs in the existing git repository, so as to preserve 
the validity of commit references and git-only branches there while having 
the main copy of the history properly converted.

There are definitely oddities from git-svn not knowing exactly which 
subdirectories of /branches are themselves branches, and which are 
containers for multiple branches (something that will need configuring 
correctly for the proper conversion - I think it should be possible to 
tell in a fairly automated way, e.g. if the directory contains ChangeLog 
it's almost certainly a branch and if it doesn't it's probably a container 
for branches but should be checked manually to make sure).
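
The ChangeLog heuristic could be sketched like this in Python. It works on plain directory listings, so it makes no assumption about how the listings are obtained from SVN; the paths in the usage note are invented, and anything without a ChangeLog is only flagged for manual checking, as suggested above.

```python
from typing import Dict, List


def classify(entries: List[str]) -> str:
    """Guess whether a directory under /branches is a branch or a container."""
    if "ChangeLog" in entries:
        return "branch"  # almost certainly a branch
    return "container?"  # probably a container, but check by hand


def classify_all(listings: Dict[str, List[str]]) -> Dict[str, str]:
    """Apply the heuristic to a mapping of directory path -> child names."""
    return {path: classify(children) for path, children in listings.items()}
```

For instance, classify_all({"/branches/some-branch": ["ChangeLog", "gcc"], "/branches/some-vendor": ["branch-a", "branch-b"]}) marks the first as a branch and the second as "container?".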

I don't know what either git-svn or reposurgeon make of the times when 
trunk was accidentally deleted and then recreated as an SVN copy of a 
pre-deletion revision (what we want to avoid for the proper conversion is 
those looking like deletion and recreation of all files in trunk - commits 
that don't change the tree at all, or complete omission of the deletion 
and subsequent recreation, would be fine).

> pretty bulletproof. Problems are most likely if the SVN repo was
> previously converted from CVS, a process which in the past tended

It was converted from CVS.  More precisely, from two CVS repositories: the 
gcc2 repository (1988-1999, starting as a collection of RCS files and with 
not many files version controlled before 1992 and documentation not 
version controlled for years after then), and what started as the EGCS 
repository (1997-2005).  The two repositories were combined by a custom 
version of CVS (work done by Ian Taylor) to produce the input to cvs2svn.  
gcc2 changes between the start of EGCS in 1997 and 1999 when development 
in the gcc2 project ended were moved to /branches/premerge-fsf-branch as 
part of the combination process (pre-EGCS gcc2 changes are on trunk).

A few branches in the repository that started as the EGCS repository, the 
history of which branches was particularly messed up by rebasing (branch 
tags having been moved from one revision to another, leaving behind 
unnamed branches), were deliberately omitted from the conversion to SVN to 
avoid it generating large amounts of very messy and not particularly 
useful history in the resulting repository.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Offer of help with move to git

2015-08-24 Thread Joseph Myers
On Sun, 23 Aug 2015, Andreas Schwab wrote:

> Florian Weimer  writes:
> 
> > Okay, it's not a big deal for me if my older contributions are
> > attributed to Red Hat.  I was just wondering.
> 
> Since subversion doesn't store author names, only committer names, the
> real attributions are in the ChangeLog anyway.

That's true, but there are also enough cases of people e.g. having typos 
in their email addresses in the ChangeLog entries that I'd be wary of 
using them to determine per-commit attributions (as opposed to providing a 
first indication of corresponding author name / email for a username) 
without careful checking.

In most cases (for post-1997 commits, not for history from the gcc2 
repository but that probably had far fewer committers) author names (if 
not emails) can probably be found in /etc/passwd (we rarely delete user 
accounts on sourceware).  And of course starting with the map used for 
binutils-gdb should reduce the number of usernames we need to find 
conversions for.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Offer of help with move to git

2015-08-24 Thread Eric S. Raymond
Tom Tromey :
> Eric> In the meantime, I'm enclosing a contributor map that will need to be
> Eric> filled in by whoever does the conversion.  The right sides should become
> Eric> full names and preferred email addresses.
> 
> It's probably worth starting with the map I used when converting gdb.
> There is a lot of overlap between the sets of contributors.
> 
> See the file "Total-merged-user-map" here:
> 
> https://github.com/tromey/gdb-git-migration
> 
> Tom

Thanks!
-- 
Eric S. Raymond <http://www.catb.org/~esr/>


Re: Moving to git

2015-08-24 Thread Jeff Law

On 08/24/2015 02:17 AM, Jakub Jelinek wrote:
>On Thu, Aug 20, 2015 at 04:09:39PM -0400, Jason Merrill wrote:
>>On 08/20/2015 02:23 PM, Jeff Law wrote:
>>>I suspect Jakub will strongly want to see some kind of commit hook to
>>>associate something similar to an SVN id to each git commit to support
>>>his workflow where the SVN ids are associated with the compiler
>>>binaries he keeps around for very fast bisection.  I think when we
>>>talked about it last year, he just needs an increasing # for each
>>>commit, presumably starting with whatever the last SVN ID is when we
>>>make the change.
>>
>>Jakub: How about using git bisect instead, and identify the compiler
>>binaries with the git commit sha1?
>
>That is really not useful.  While you speed up bisection somewhat by avoiding
>network traffic and communication with a server, there is still significant
>time spent on actually building the compiler.

I thought the suggestion was to use the git hash to identify the builds
you save.

So you'd use git bisect merely to get the hash id.  Once you've got the
git hash, you can then use that to find the right cc1/cc1plus/f951 that
you'd previously built.

It's not perfect (since you can't just look at git hashes and know which
one is newer).


Jeff


Re: Moving to git

2015-08-24 Thread Jakub Jelinek
On Mon, Aug 24, 2015 at 09:34:41AM -0600, Jeff Law wrote:
> On 08/24/2015 02:17 AM, Jakub Jelinek wrote:
> >On Thu, Aug 20, 2015 at 04:09:39PM -0400, Jason Merrill wrote:
> >>On 08/20/2015 02:23 PM, Jeff Law wrote:
> >>>I suspect Jakub will strongly want to see some kind of commit hook to
> >>>associate something similar to an SVN id to each git commit to support
> >>>his workflow where the SVN ids are  associated with the compiler
> >>>binaries he keeps around for very fast bisection.  I think when we
> >>>talked about it last year, he just needs an increasing # for each
> >>>commit, presumably starting with whatever the last SVN ID is when we
> >>>make the change.
> >>
> >>Jakub: How about using git bisect instead, and identify the compiler
> >>binaries with the git commit sha1?
> >
> >That is really not useful.  While you speed up bisection somewhat by avoiding
> >network traffic and communication with a server, there is still significant
> >time spent on actually building the compiler.
> I thought the suggestion was to use the git hash to identify the builds you
> save.
> 
> So you'd use git bisect merely to get the hash id.  Once you've got the git
> hash, you can then use that to find the right cc1/cc1plus/f951 that you'd
> previously built.
> 
> It's not perfect (since you can't just look at git hashes and know which one is
> newer).

But then you are forced to use git bisect all the time, because the hashes
don't tell you anything.
Most often even before writing a script I try a couple of compiler versions
by hand if I have some extra info (this used to work 3 years ago, broke in
the last couple of days, etc.).
Perhaps I could touch the cc1.sha1hash files with timestamps corresponding to
the date/time of the commit, and keep them sorted in some file manager by
timestamps, still it would be worse usability-wise.
Not to mention we should keep the existing r123456 comments in bugzilla
working, and I'm not convinced keeping a SVN version of the repository
(frozen) for that purpose is the best idea.
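
The timestamp workaround mentioned above could be sketched as follows in Python. It assumes the commit time is already known as a Unix timestamp (for example from git show -s --format=%ct); the helper names and the cc1.<hash> file names are illustrative only.

```python
import os
from pathlib import Path
from typing import List


def stamp(binary: Path, commit_time: int) -> None:
    """Set the access and modification time of a saved binary to the commit time."""
    os.utime(binary, (commit_time, commit_time))


def oldest_first(directory: Path) -> List[Path]:
    """List saved binaries ordered by the commit time stored in their mtime."""
    return sorted(directory.iterdir(), key=lambda path: path.stat().st_mtime)
```

After stamping, any file manager or ls -lt shows the saved compilers in commit order even though their names are hashes.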

Jakub


Re: Moving to git

2015-08-24 Thread Jeff Law

On 08/24/2015 09:43 AM, Jakub Jelinek wrote:
>On Mon, Aug 24, 2015 at 09:34:41AM -0600, Jeff Law wrote:
>>On 08/24/2015 02:17 AM, Jakub Jelinek wrote:
>>>On Thu, Aug 20, 2015 at 04:09:39PM -0400, Jason Merrill wrote:
>>>>On 08/20/2015 02:23 PM, Jeff Law wrote:
>>>>>I suspect Jakub will strongly want to see some kind of commit hook to
>>>>>associate something similar to an SVN id to each git commit to support
>>>>>his workflow where the SVN ids are associated with the compiler
>>>>>binaries he keeps around for very fast bisection.  I think when we
>>>>>talked about it last year, he just needs an increasing # for each
>>>>>commit, presumably starting with whatever the last SVN ID is when we
>>>>>make the change.
>>>>
>>>>Jakub: How about using git bisect instead, and identify the compiler
>>>>binaries with the git commit sha1?
>>>
>>>That is really not useful.  While you speed up bisection somewhat by avoiding
>>>network traffic and communication with a server, there is still significant
>>>time spent on actually building the compiler.
>>
>>I thought the suggestion was to use the git hash to identify the builds you
>>save.
>>
>>So you'd use git bisect merely to get the hash id.  Once you've got the git
>>hash, you can then use that to find the right cc1/cc1plus/f951 that you'd
>>previously built.
>>
>>It's not perfect (since you can't just look at git hashes and know which one is
>>newer).
>
>But then you are forced to use git bisect all the time, because the hashes
>don't tell you anything.

True.

>Most often even before writing a script I try a couple of compiler versions
>by hand if I have some extra info (this used to work 3 years ago, broke in
>the last couple of days, etc.).

A map of key hashes would probably be helpful with this kind of thing.
Major releases, key branch->trunk merge points and the like.

It'd still be somewhat worse usability-wise for you, but it ought to be
manageable.

And like I said before, I'd support a git hook which bumped some kind of
index at each commit for your workflow.

>Perhaps I could touch the cc1.sha1hash files with timestamps corresponding to
>the date/time of the commit, and keep them sorted in some file manager by
>timestamps, still it would be worse usability-wise.
>Not to mention we should keep the existing r123456 comments in bugzilla
>working, and I'm not convinced keeping a SVN version of the repository
>(frozen) for that purpose is the best idea.

I'd like to keep the old ones working, but new references should
probably be using the hash id and commit name.

As for how to best keep the old r123456 links working, I don't know.
Presumably those could be mapped behind the scenes to a git id.


Jeff



Re: Moving to git

2015-08-24 Thread Richard Earnshaw
On 24/08/15 16:43, Jakub Jelinek wrote:
> On Mon, Aug 24, 2015 at 09:34:41AM -0600, Jeff Law wrote:
>> On 08/24/2015 02:17 AM, Jakub Jelinek wrote:
>>> On Thu, Aug 20, 2015 at 04:09:39PM -0400, Jason Merrill wrote:
>>>> On 08/20/2015 02:23 PM, Jeff Law wrote:
>>>>> I suspect Jakub will strongly want to see some kind of commit hook to
>>>>> associate something similar to an SVN id to each git commit to support
>>>>> his workflow where the SVN ids are associated with the compiler
>>>>> binaries he keeps around for very fast bisection.  I think when we
>>>>> talked about it last year, he just needs an increasing # for each
>>>>> commit, presumably starting with whatever the last SVN ID is when we
>>>>> make the change.
>>>>
>>>> Jakub: How about using git bisect instead, and identify the compiler
>>>> binaries with the git commit sha1?
>>>
>>> That is really not useful.  While you speed up bisection somewhat by avoiding
>>> network traffic and communication with a server, there is still significant
>>> time spent on actually building the compiler.
>> I thought the suggestion was to use the git hash to identify the builds you
>> save.
>>
>> So you'd use git bisect merely to get the hash id.  Once you've got the git
>> hash, you can then use that to find the right cc1/cc1plus/f951 that you'd
>> previously built.
>>
>> It's not perfect (since you can't just look at git hashes and know which one is
>> newer).
> 
> But then you are forced to use git bisect all the time, because the hashes
> don't tell you anything.
> Most often even before writing a script I try a couple of compiler versions
> by hand if I have some extra info (this used to work 3 years ago, broke in
> the last couple of days, etc.).
> Perhaps I could touch the cc1.sha1hash files with timestamps corresponding to
> the date/time of the commit, and keep them sorted in some file manager by
> timestamps, still it would be worse usability-wise.
> Not to mention we should keep the existing r123456 comments in bugzilla
> working, and I'm not convinced keeping a SVN version of the repository
> (frozen) for that purpose is the best idea.
> 
>   Jakub
> 

Why not use the output of 'git show -s --format=%ct-%h'?

$ git show -s --format=%ct-%h master
1440153969-f57da59

That gives you a unix timestamp for the commit, followed by the hash.
Now you've got a fully ordered way of referring to the commit, but still
have access to the hash code.
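
A short Python sketch of ordering such "timestamp-hash" labels. The timestamp is compared as a number, since a plain string sort would misplace timestamps with a different number of digits. Only the first label in the usage note comes from the command output above; the others are invented, and the ordering is only as reliable as the commit timestamps themselves.

```python
from typing import List, Tuple


def parse_label(label: str) -> Tuple[int, str]:
    """Split '1440153969-f57da59' into (1440153969, 'f57da59')."""
    timestamp, short_hash = label.split("-", 1)
    return int(timestamp), short_hash


def sort_labels(labels: List[str]) -> List[str]:
    """Order labels from oldest to newest commit time."""
    return sorted(labels, key=parse_label)
```

sort_labels(["1440153969-f57da59", "999999999-0123abc", "1430000000-abcdef0"]) returns the 999999999 label first and the 1440153969 label last.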




Power ELFv2 ABI now openly published

2015-08-24 Thread Bill Schmidt
At Cauldron this year, several people complained to me that our latest
ABI document was behind a registration wall.  I'm happy to say that
we've finally gotten past the issues that were holding it there, and it
is now openly available at:

https://members.openpowerfoundation.org/document/dl/576  

Thanks,
Bill



Re: Offer of help with move to git

2015-08-24 Thread Eric S. Raymond
Joseph Myers :
> Hence my suggestion in  
> of reconverting and then combining with the existing git-svn history via 
> renaming all the refs in the existing git repository, so as to preserve 
> the validity of commit references and git-only branches there while having 
> the main copy of the history properly converted.

Sorry, but I can't even imagine how to recombine in that way with the tools
I have.  If you still think it's worth trying after seeing the reposurgeon
conversion I deliver, we can investigate that I suppose.

> I don't know what either git-svn or reposurgeon make of the times when 
> trunk was accidentally deleted and then recreated as an SVN copy of a 
> pre-deletion revision (what we want to avoid for the proper conversion is 
> those looking like deletion and recreation of all files in trunk - commits 
> that don't change the tree at all, or complete omission of the deletion 
> and subsequent recreation, would be fine).

git-svn often fluffs that general kind of delete-recreate case
pretty badly; reposurgeon's analyzer takes them in stride.  I have
a whole bunch of regression tests from pathological repos that I keep
around to verify this.

Another similar case is when a branch was created by a non-SVN copy
followed by a commit, losing ancestry information - this is a
relatively common operator error that reposurgeon had to learn to cope
with early on.  Most other translation tools (including git-svn) lose
their cookies here.

Hairballs like these are why reposurgeon has its own internal parser for
the SVN dumpfile format, the only one that exists outside the SVN
suite itself and the exception to the general rule that reposurgeon
consumes the fast-import-stream output of exporters in order to
read repositories.  I couldn't achieve robustness in the presence
of common metadata malformations in any less drastic way.

> It was converted from CVS.  More precisely, from two CVS repositories: the 
> gcc2 repository (1988-1999, starting as a collection of RCS files and with 
> not many files version controlled before 1992 and documentation not 
> version controlled for years after then), and what started as the EGCS 
> repository (1997-2005).  The two repositories were combined by a custom 
> version of CVS (work done by Ian Taylor) to produce the input to cvs2svn.  
> gcc2 changes between the start of EGCS in 1997 and 1999 when development 
> in the gcc2 project ended were moved to /branches/premerge-fsf-branch as 
> part of the combination process (pre-EGCS gcc2 changes are on trunk).

Uh oh.  This sounds like it could be a recipe for serious grief.

While Ian is certainly smart and persistent enough to have made
something coherent out of that kind of mess, older versions of cvs2svn
were defect amplifiers that would turn even minor metadata glitches in
CVS into large tracts of scar tissue in the translated SVN, which in
turn tend not to get noticed until you try to up-convert from the
SVN. Cleaning up this kind of artifact was one of the major original
motivations for reposurgeon.

The fact that you had to *combine* CVS repositories hints that I
may be about to encounter an entirely new class of malformations.
Oh joy, oh rapture... :-(

> A few branches in the repository that started as the EGCS repository, the 
> history of which branches was particularly messed up by rebasing (branch 
> tags having been moved from one revision to another, leaving behind 
> unnamed branches), were deliberately omitted from the conversion to SVN to 
> avoid it generating large amounts of very messy and not particularly 
> useful history in the resulting repository.

I'll be glad not to have those problems...

We'll know soon enough how bad things are.  It's taken me the better
part of three days to mirror the SVN, in part because your hosting site
is randomly dropping the connection once every several hours, but I'm now
up to 208213, which is 91% of the way to the end.

Once I have a complete mirror and can do a trial conversion, I'll be
able to run a 'lint' command that is pretty good at finding cvs2svn
conversion artifacts.

I'll have to regenerate the empty contributor map, too.  When I made the
first one I didn't know that mirroring had been interrupted by a host timeout;
I only had commits up to mid-2005.

The GCC repo is pretty huge, but I've been hunting mastodons like it
for years now - there's a row of trophy heads in the reposurgeon
documentation.  I ended up building a machine with a processor and
cache specifically designed to handle non-parallelizable graph-theory
computations multiple gigabytes wide - SMP is no help here and you
want extra-large primary memory caches. On this hardware, conversion
runs will merely be painfully slow rather than die-of-old-age
interminable.
-- 
Eric S. Raymond <http://www.catb.org/~esr/>


Re: Moving to git

2015-08-24 Thread Joseph Myers
On Mon, 24 Aug 2015, Jakub Jelinek wrote:

> Not to mention we should keep the existing r123456 comments in bugzilla
> working, and I'm not convinced keeping a SVN version of the repository
> (frozen) for that purpose is the best idea.

Of course you keep the SVN repository there indefinitely, read-only, just 
as you can still check out the old CVS repository.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Offer of help with move to git

2015-08-24 Thread Joseph Myers
On Mon, 24 Aug 2015, Eric S. Raymond wrote:

> Joseph Myers :
> > Hence my suggestion in  
> > of reconverting and then combining with the existing git-svn history via 
> > renaming all the refs in the existing git repository, so as to preserve 
> > the validity of commit references and git-only branches there while having 
> > the main copy of the history properly converted.
> 
> Sorry, but I can't even imagine how to recombine in that way with the tools
> I have.  If you still think it's worth trying after seeing the reposurgeon
> conversion I deliver, we can investigate that I suppose.

I'm pretty sure it should be doable with pure git.  Do something with git 
for-each-ref on (a copy of) the old repository to script renaming of all 
refs so the two repositories don't have any conflicting refs.  Then use 
git fast-import to import all the content of one repository into the other 
(or add one repository as a remote to the other, fetch and then rename all 
the refs from remotes/).  Then git gc --aggressive to repack it all.  
Optionally, add a "git merge -s ours" commit to new master to show it as 
merged from the git-svn master, if it's considered beneficial to converge 
the history like that.
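
[Editorial sketch, not part of the original message.]  The ref-renaming step could be scripted roughly as below.  This is only a minimal illustration under assumptions: the input is taken to be the output of git for-each-ref --format='%(objectname) %(refname)' run on a copy of the old repository, the result is meant to be fed to git update-ref --stdin, and the target namespace refs/git-svn/ is a hypothetical choice, not something proposed in the thread.

```python
def rename_ref(refname, namespace="refs/git-svn/"):
    """Map e.g. refs/heads/master to refs/git-svn/heads/master.

    The namespace is a hypothetical choice made for this sketch.
    """
    if not refname.startswith("refs/"):
        raise ValueError("not a full ref name: %r" % refname)
    return namespace + refname[len("refs/"):]


def update_ref_script(for_each_ref_output, namespace="refs/git-svn/"):
    """Build input for `git update-ref --stdin` that renames every ref.

    Each input line is assumed to be "<sha> <refname>".  For each ref we
    emit a `create` of the renamed ref and a `delete` of the old one.
    """
    commands = []
    for line in for_each_ref_output.splitlines():
        if not line.strip():
            continue
        sha, refname = line.split(" ", 1)
        commands.append("create %s %s" % (rename_ref(refname, namespace), sha))
        commands.append("delete %s %s" % (refname, sha))
    return "".join(command + "\n" for command in commands)


if __name__ == "__main__":
    # Placeholder object names, for illustration only.
    print(update_ref_script("aaaa refs/heads/master\nbbbb refs/tags/v1\n"), end="")
```

After such a rename the two repositories no longer share ref names, so one can be fetched into the other without conflicts.  HEAD in the renamed copy would be left pointing at a ref that no longer exists, which is harmless only because it is a throwaway copy.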

People with existing git-only branches might need to take extra care the 
first time they merge after this is done (maybe merge up to the last 
revision of the git-svn master then do their own -s ours merge from the 
new master, if git doesn't get it right automatically given such a merge 
commit on master), but I don't see any reason this approach shouldn't work 
to keep existing references to git-svn commit hashes meaningful (without 
needing to have a renamed git-svn repository sit on the side for that 
purpose) and to keep existing git-only branches (of which there are lots) 
usable.  And with blobs and hopefully most tree objects shared between the 
two histories, I hope this won't make the repository too much larger.

> The GCC repo is pretty huge, but I've been hunting mastodons like it
> for years now - there's a row of trophy heads in the reposurgeon
> documentation.  I ended up building a machine with a processor and
> cache specifically designed to handle non-parallelizable graph-theory
> computations multiple gigabytes wide - SMP is no help here and you
> want extra-large primary memory caches. On this hardware, conversion
> runs will merely be painfully slow rather than die-of-old-age
> interminable.

FWIW, Jason's own trial conversion with reposurgeon got up to at least 
45GB memory consumption on a 32GB repository.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Moving to git

2015-08-24 Thread Jakub Jelinek
On Mon, Aug 24, 2015 at 10:22:22AM +0200, Andreas Schwab wrote:
> Jakub Jelinek  writes:
> 
> > And for those really identifying them by sha1 hashes is significantly
> > worse than using monotonically increasing small number, sha1 hashes
> > are impossible to remember, and you don't know what is earlier and
> > what is later from just looking at it.
> 
> git describe gives you such a number (relative to a tag).

But it is not unique across different branches, and furthermore, what is the
fast way to go from the number to a commit?  If you have to git describe
all the commits to find the one you are looking for, then it is not what I
want...
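
[Editorial sketch, not part of the original message.]  One way to get a fast number-to-commit lookup is to build the table once per branch rather than describing every commit.  The following is a minimal illustration under assumptions: the commit list is taken to come from something like git rev-list --reverse --first-parent on the branch, and the starting number is a hypothetical value standing in for the last SVN revision.

```python
def build_index(shas_oldest_first, first_number):
    """Assign consecutive numbers to commits, oldest first.

    Returns (number -> sha, sha -> number).  `first_number` is a
    hypothetical starting point, e.g. one past the last SVN revision.
    """
    by_number = {}
    by_sha = {}
    for offset, sha in enumerate(shas_oldest_first):
        number = first_number + offset
        by_number[number] = sha
        by_sha[sha] = number
    return by_number, by_sha


if __name__ == "__main__":
    # Placeholder object names, for illustration only.
    by_number, by_sha = build_index(["aaaa", "bbbb", "cccc"], 227000)
    print(by_number[227001])
```

Lookup in either direction is then a dictionary access.  The numbers are only unique per branch unless each branch gets its own prefix or its own table, which is the uniqueness concern raised above.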

Jakub


porting to lra

2015-08-24 Thread shmeel gutl

are there any guidelines as to what needs to be done in the backend to
enable lra for 5.2? when I turn it on I get two types of errors. 1) insn
not recognized because fp hasn't been converted yet, and 2) max number
of generated reload insns.

any pointers will be appreciated

shmeel




Re: Offer of help with move to git

2015-08-24 Thread Eric S. Raymond
Joseph Myers:
> FWIW, Jason's own trial conversion with reposurgeon got up to at least 
> 45GB memory consumption on a 32GB repository.

I have no trouble believing that at *all*.
-- 
Eric S. Raymond <http://www.catb.org/~esr/>


Re: porting to lra

2015-08-24 Thread Vladimir Makarov

On 08/24/2015 02:43 PM, shmeel gutl wrote:

> are there any guidelines as to what needs to be done in the backend to
> enable lra for 5.2?

Unfortunately, switching from reload to LRA can be a difficult task.
The reload pass is driven by many target hooks.  Because LRA uses
different algorithms, these hooks can be misleading for it.

> when I turn it on I get two types of errors. 1) insn
> not recognized because fp hasn't been converted yet, and 2) max number
> of generated reload insns.
>
> any pointers will be appreciated

I did several LRA ports; they had different problems, and the changes
were made in different parts of the machine-dependent files.  I can only
say that porting mostly requires reworking (sometimes switching off)
hooks used by reload.  Sometimes very small changes in hooks are enough;
sometimes a lot of changes are needed (powerpc required the most effort,
as the port uses a lot of tricks).

Could I ask what target you are trying to port to LRA?  I can look
at it and estimate how much effort the port will need.





Re: Moving to git

2015-08-24 Thread Andreas Schwab
Jakub Jelinek  writes:

> But it is not unique across different branches, and furthermore, what is the
> fast way to go from the number to a commit?

The git describe output is a valid ref by itself.
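
[Editorial sketch, not part of the original message.]  The usual shape of git describe output is <tag>-<count>-g<abbreviated sha>, and git resolves such a string through the abbreviated object name after the "-g".  A minimal illustration of pulling the pieces apart follows; the tag name and hash in the example are made up for illustration.

```python
import re

# <tag>-<count>-g<abbrev>; the tag itself may contain dashes.
_DESCRIBE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")


def parse_describe(text):
    """Split describe output into (tag, commits since tag, abbreviated sha).

    If the commit is exactly at a tag, describe prints only the tag name;
    in that case the count is 0 and the sha is None.
    """
    text = text.strip()
    match = _DESCRIBE.match(text)
    if match is None:
        return text, 0, None
    return match.group("tag"), int(match.group("count")), match.group("sha")


if __name__ == "__main__":
    # Hypothetical describe output, for illustration only.
    print(parse_describe("gcc-5_2_0-release-120-g1a2b3c4"))
```

The count gives the ordering within one line of history from the tag, and the abbreviated sha is what makes the whole string usable wherever a commit is expected.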

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: Offer of help with move to git

2015-08-24 Thread Frank Ch. Eigler
Joseph Myers  writes:

> [...]
> FWIW, Jason's own trial conversion with reposurgeon got up to at least 
> 45GB memory consumption on a 32GB repository.

(The host sourceware.org box has 72GB.)

- FChE


Re: Offer of help with move to git

2015-08-24 Thread Jeff Law

On 08/24/2015 01:46 PM, Frank Ch. Eigler wrote:
> Joseph Myers  writes:
>
> > [...]
> > FWIW, Jason's own trial conversion with reposurgeon got up to at least
> > 45GB memory consumption on a 32GB repository.
>
> (The host sourceware.org box has 72GB.)

And if Jason really needs it, we've got considerably larger systems in 
our test farm that he could provision for this task.


Jeff


Re: CFI directives and dynamic stack alignment

2015-08-24 Thread Steve Ellcey
On Tue, 2015-08-18 at 09:23 +0930, Alan Modra wrote:
> On Mon, Aug 17, 2015 at 10:38:22AM -0700, Steve Ellcey wrote:

> OK, then you need to emit a .cfi directive to say the frame top is
> given by the temp hard reg sometime after that assignment and before
> sp is aligned in the prologue, and another .cfi directive when copying
> to the pseudo.  It's a while since I looked at the CFI code in gcc,
> but arranging this might be as simple as setting RTX_FRAME_RELATED_P
> on the insns involved.
> 
> If -fasynchronous-unwind-tables, then you'll also need to track the
> frame in the epilogue.
> 
> > This function (fn2) ends with a call to abort, which is noreturn, so the
> > optimizer sees that the epilogue is dead code and GCC determines that
> > there is no need to save the old stack pointer since it will never get
> > restored.   I guess I need to tell GCC to save the stack pointer in
> > expand_prologue even if it never sees a use for it.  I guess I need to
> > make the temporary register where I save $sp volatile or do something
> > else so that the assignment (and its associated .cfi) is not deleted by
> > the optimizer.
> 
> Ah, I see.  Yes, the temp and pseudo are not really dead if they are
> needed for unwinding.

Yes, I was originally thinking I just had to make the temp and pseudo
regs volatile so that the assignments would not get removed.  But it
appears that I need the epilogue code too (even if I never get there
because of a call to abort, which GCC knows is non-returning) so that I
have the needed .cfi directives there.  I am thinking I should add an
edge from the entry_block to the exit_block so that the exit block is
never removed by the optimizer.  I assume this edge would need to be
abnormal and/or fake, but I am not sure which (if either) of these edge
types would be appropriate here.

Steve Ellcey
sell...@imgtec.com