Re: [lldb-dev] [cfe-dev] [llvm-dev] GitHub anyone?

2016-05-31 Thread Tom Honermann via lldb-dev
On 5/31/2016 4:46 PM, Mehdi Amini via cfe-dev wrote:
> Apparently I wasn't very clear: llvm and clang (and the other projects)
> would simply be decoupled, individual git repositories. You would be able to
> check them out however you want and commit to them individually.
> There would be an extra "integration repository" on top that would only 
> provide the service that tells "r12345 is llvm:36c941c clang:eaf492b 
> compiler-rt:6d77ea5". This repository should be managed transparently by some 
> server-side integration.
> The provided scripting I was referring to would just be a convenience that
> uses this extra layer of metadata ("integration repository") to check out
> the other individual repositories together at the right "rev-lock"
> revision.
> This is not in your way if you don't want to use it, but it provides this
> "single monotonically increasing revision number across multiple
> repositories" that is convenient for some people.
>
> Makes sense?

Yes, makes sense; we have been doing exactly this for the last few 
months.  We created our own integration repo (to host our own build 
integration scripts) and cloned the llvm and clang repos from (I assume) 
https://github.com/llvm-mirror as sub-modules within it.  We're just 
using the native git command line to manage things and, so far, so good.
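
As a rough illustration of that layout, the sketch below builds an
integration repo that pins two component repos as submodules.  It uses
local stand-in repos rather than the real mirrors, so every name and path
in it (llvm-src, clang-src, integration, WORK) is an assumption and not
the setup described above.

```shell
#!/bin/sh
# Sketch: an integration repo that pins two component repos as submodules.
# All repository names and paths here are hypothetical local stand-ins.
set -e

# Fixed identity so commits work in an unconfigured environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

WORK=$(mktemp -d)
cd "$WORK"

# Stand-ins for the llvm and clang repositories.
for name in llvm clang; do
   git init -q "${name}-src"
   (cd "${name}-src" && echo "$name" > README &&
      git add README && git commit -q -m "Initial $name commit")
done

# The integration repo records exactly one commit per component.
git init -q integration
cd integration
for name in llvm clang; do
   # protocol.file.allow is needed only because the stand-ins are local paths.
   git -c protocol.file.allow=always submodule --quiet add \
      "$WORK/${name}-src" "$name"
done
git commit -q -m "Pin llvm and clang"

# Show the commit pinned for each submodule.
git submodule status
```

Each commit in the integration repo then names one specific revision of
every component, which is the "rev-lock" information discussed above.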

We're still working on getting a full continuous integration process in 
place (right now we manually pull periodically), but expect to have that 
soon.  The CI process is just to inform us of conflicts and allow us to 
resolve them proactively; we don't release product based on trunk.

Tom.

___
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [cfe-dev] Git Move: GitHub+modules proposal

2016-06-29 Thread Tom Honermann via lldb-dev
On 6/29/2016 10:03 AM, Renato Golin via cfe-dev wrote:
> Since the umbrella project cannot see the sub-modules' commits without
> some form of update, there are two ways to do this:
>
> P. Per push: Every push (not commit) on all other repositories will
> trigger a hook that will hit a URL on our server, telling it to
> generate an incremental ID, update some umbrella's SeqID property (or
> even a commit SHA) and update the sub-modules.

How would you coordinate dependent updates to the sub-modules?  For 
example, in the case where someone makes a change to the LLVM sub-module 
that requires changes to the Clang sub-module?  Would there be some way 
for a developer to push both sets of updates as an atomic update to the 
umbrella project?  It probably doesn't matter often, as long as the 
updates to both sub-modules are pushed close together in time.

Tom.



Re: [lldb-dev] [llvm-dev] Sequential ID Git hook

2016-06-30 Thread Tom Honermann via lldb-dev
On 6/30/2016 7:43 AM, Renato Golin via llvm-dev wrote:
> Given the nature of our project's repository structure, triggers in
> each repository can't just update their own sequential ID (like
> Gerrit) because we want a sequence in order for the whole project, not
> just each component. But it's clear to me that we have to do something
> similar to Gerrit, as this has been proven to work on a larger
> infrastructure.

I'm assuming that pushes to submodules will result in a (nearly) 
immediate commit/push to the umbrella repo to update it with the new 
submodule head.  Otherwise, checking out the umbrella repo won't get you 
the latest submodule updates.

Since updates to the umbrella project are needed to synchronize it with
updates to sub-modules, it seems to me that if you want an ID that
applies to all projects, it would have to be coordinated relative
to the umbrella project.

>   Design decisions
>
> This could be a pre/post-commit trigger on each repository that
> receives an ID from somewhere (TBD) and updates the commit message.
> When the umbrella project synchronises, it'll already have the
> sequential number in. In this case, the umbrella project is not
> necessary for anything other than bisect, buildbots and releases.

I recommend using git tag rather than updating the commit message 
itself.  Tags are more versatile.

> I personally believe that having the trigger in the umbrella project
> will be harder to implement and more error prone.

Relative to a SQL database and a server, I think managing the ID from 
the umbrella repository would be much simpler and more reliable.

Managing IDs from a repo using git metadata is pretty simple.  Here's
an example script that creates a repo, stores a counter as a git note on
a well-known blob, and allocates a push tag for each of a sequence of
commits (here I'm simulating pushes of individual commits rather than
using git hooks, for simplicity).  I'm not a git expert, so there may be
better ways of doing this, but I don't know of any problems with this
approach.

#!/bin/sh

rm -rf repo

# Create a repo.
mkdir repo
cd repo
git init

# Create a well-known object to attach the counter to.
PUSH_OBJ=$(echo "push ID" | git hash-object -w --stdin)
echo "PUSH_OBJ: $PUSH_OBJ"

# Initialize the push ID to 0 (stored as a note on the well-known object).
git notes add -m 0 "$PUSH_OBJ"

# Simulate some commits and pushes.
for i in 1 2 3; do
   echo "$i" > "file$i"
   git add "file$i"
   git commit -m "Added file$i" "file$i"
   # Read, increment, and store the counter, then tag the new commit.
   PUSH_TAG=$(git notes show "$PUSH_OBJ")
   PUSH_TAG=$((PUSH_TAG+1))
   git notes add -f -m "$PUSH_TAG" "$PUSH_OBJ"
   git tag -m "push-$PUSH_TAG" "push-$PUSH_TAG"
done

# List commits with push tags.
git log --decorate=full


Running the above shows a git log with the tags:

commit a4ca4a0b54d5fb61a2dacbab5732d00cf8216029 (HEAD, tag: refs/tags/push-3, refs/heads/master)
...
 Added file3

commit e98e2669569d5cfb15bf4cd1f268507873bcd63f (tag: refs/tags/push-2)
...
 Added file2

commit 5c7f29107838b4af91fe6fa5c2fc5e3769b87bef (tag: refs/tags/push-1)
...
 Added file1


The above script is not transaction safe: reading, incrementing, and
storing the counter are separate commands, so concurrent runs could
allocate the same ID.  In a real deployment, git hooks would be used and
would rely on push locks to synchronize updates.  Those hooks could also
distribute ID updates to the submodules to keep them synchronized.
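
To make the locking idea concrete, here is a minimal sketch that
serializes ID allocation with a lock directory.  It is one possible
approach and not necessarily how a hook on a real server would do it; the
names LOCK_DIR and next_push_id and the throwaway repo are hypothetical.

```shell
#!/bin/sh
# Sketch: serialize push-ID allocation with a lock directory.
# LOCK_DIR, next_push_id and the demo repo are hypothetical names.
set -e

# Fixed identity so the notes commits work in an unconfigured environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

REPO=$(mktemp -d)
cd "$REPO"
git init -q

# Counter stored as a note on a well-known blob, starting at 0.
PUSH_OBJ=$(echo "push ID" | git hash-object -w --stdin)
git notes add -m 0 "$PUSH_OBJ"

LOCK_DIR="$REPO/.git/push-id.lock"

next_push_id() {
   # mkdir either creates the directory or fails, so it acts as a mutex.
   until mkdir "$LOCK_DIR" 2>/dev/null; do
      sleep 1
   done
   id=$(git notes show "$PUSH_OBJ")
   id=$((id + 1))
   git notes add -f -m "$id" "$PUSH_OBJ" >/dev/null 2>&1
   rmdir "$LOCK_DIR"
   echo "$id"
}

FIRST=$(next_push_id)
SECOND=$(next_push_id)
echo "allocated: $FIRST $SECOND"
```

A production hook would also need to clean up a stale lock if the holder
dies between mkdir and rmdir; that is left out here.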

Tom.


Re: [lldb-dev] [cfe-dev] [llvm-dev] Sequential ID Git hook

2016-07-01 Thread Tom Honermann via lldb-dev
On 6/30/2016 5:20 PM, Robinson, Paul via cfe-dev wrote:
> We were using tags for a while in our own SVN->git conversion internally.
> (git branch is pushed to SVN and the SVN r-number used to create a tag.)
> They are convenient for some things, but each tag adds a new (if small)
> file to .git/tags and I don't know that it really scales well when you
> are talking about (long term) hundreds of thousands of them.  That was
> not what tags were designed for.

We're using tags in this manner for our internal repos and LLVM/Clang 
mirrors and haven't experienced any problems.  We're at ~50k tags for 
our most used repo, so not quite at hundreds of thousands yet.

When I look in .git/refs/tags of one of my repos, I do *not* see 50k 
files; I see ~400.  I'm not sure what causes some to appear here and 
others not.
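
One likely explanation, assuming git's default "files" ref backend: refs
start out as loose files under .git/refs/ and are later folded into the
single .git/packed-refs file by 'git pack-refs', which 'git gc' runs.
The sketch below uses a throwaway repo with hypothetical tag names to
show the loose tag files disappearing while the tags themselves remain.

```shell
#!/bin/sh
# Sketch: loose tag files vs. packed refs (assumes the "files" ref backend).
# The repo and tag names are hypothetical.
set -e

# Fixed identity so the commit works in an unconfigured environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

REPO=$(mktemp -d)
cd "$REPO"
git init -q
echo 1 > file && git add file && git commit -q -m "Added file"

# Create a handful of lightweight tags; each starts as a loose file.
for i in 1 2 3 4 5; do
   git tag "push-$i"
done
LOOSE_BEFORE=$(ls .git/refs/tags 2>/dev/null | wc -l | tr -d ' ')

# Fold all refs into .git/packed-refs and prune the loose files.
git pack-refs --all
LOOSE_AFTER=$(ls .git/refs/tags 2>/dev/null | wc -l | tr -d ' ')

# The tags still exist; they are now lines in packed-refs.
TOTAL=$(git for-each-ref refs/tags | wc -l | tr -d ' ')
PACKED=$(grep -c 'refs/tags/' .git/packed-refs)

echo "loose before: $LOOSE_BEFORE, loose after: $LOOSE_AFTER"
echo "total tags: $TOTAL, packed: $PACKED"
```

If that is what happened, the ~400 files would be the tags created since
the last pack, and 'git for-each-ref refs/tags | wc -l' would report the
full count.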

I don't see how this use of tags is not representative of what tags were 
designed for.  They are designed to label a commit.  That seems to match 
well what is desired here.

> We've since stopped creating the tags, and gotten used to not having
> them.  We do the 'rev-list --count' trick which mainly gets recorded as
> one component of the version number, and it has been working for us.

As I understand it, 'git rev-list --count HEAD' requires walking the 
entire commit history.  Perhaps the performance is actually ok in 
practice, but I would be concerned about scaling with this approach as well:

$ time git rev-list --count HEAD
115968

real    0m1.170s
user    0m1.100s
sys     0m0.064s

> I think having the number in the commit log (even if it's just for the
> superproject) would be preferable.  You can use 'git log --grep' to
> find a particular rev if you need to.

Grepping every commit doesn't seem like the most scalable option either.
I did a quick test on a large repo.  First a grep for an identifier:

$ time git log --grep 
...
real    0m1.450s
user    0m1.340s
sys     0m0.092s

Then I did the same for the associated push tag:

$ time git log -n 1 
...
real    0m0.048s
user    0m0.024s
sys     0m0.016s
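
For reference, the two kinds of lookup can be reproduced on a small
throwaway repo.  This is only a sketch: the "push-id: N" commit message
convention and the tag names are made up for the example, and a repo
this small says nothing about timing.

```shell
#!/bin/sh
# Sketch: find a commit by push tag vs. by grepping commit messages.
# The "push-id: N" message convention and tag names are hypothetical.
set -e

# Fixed identity so commits and tags work in an unconfigured environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

REPO=$(mktemp -d)
cd "$REPO"
git init -q

# Record the ID both in the commit message and as an annotated tag.
for i in 1 2 3; do
   echo "$i" > "file$i"
   git add "file$i"
   git commit -q -m "Added file$i (push-id: $i)"
   git tag -m "push-$i" "push-$i"
done

# Tag lookup: resolves the ref directly, without walking history.
BY_TAG=$(git rev-parse "push-2^{commit}")

# Message search: walks history from HEAD until a message matches.
BY_GREP=$(git log -n 1 --format=%H -F --grep="push-id: 2")

echo "by tag:  $BY_TAG"
echo "by grep: $BY_GREP"
```

Both commands land on the same commit; the difference is that the grep
has to read commit messages until it finds a match.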

Tom.


Re: [lldb-dev] [cfe-dev] [llvm-dev] Sequential ID Git hook

2016-07-01 Thread Tom Honermann via lldb-dev
On 6/30/2016 6:18 PM, Jim Rowan via cfe-dev wrote:
>
> On Jun 30, 2016, at 2:25 PM, Robinson, Paul via llvm-dev
> <llvm-...@lists.llvm.org> wrote:
>
> (talking about lots of tags)
>
>> I don't know that it really scales well when you
>> are talking about (long term) hundreds of thousands of them.
>
> I can say from experience that it does not scale well.  After some
> time, everyone would start feeling the pain.

Can you elaborate on this?  As I mentioned in another email, we're at
~50k tags in one repo and not having any problems.  I can't see why git
would fundamentally have scaling or performance issues in conjunction
with lots of tags.  Perhaps some user interfaces were failing to scale well?

Tom.
