I don't think we should do any of that. It's too complicated -- and I don't see the reason to even do it.
There's a need for the "llvm-project" repository -- that's been discussed plenty -- but where does the need for a separate "id" that must be pushed into all of the sub-projects come from? This is the first I've heard of that as a thing that needs to be done. There was a previous discussion about putting a sequential ID in the "llvm-project" repo commit messages (although even that, I'd say, is unnecessary), but not anywhere else.

On Thu, Jun 30, 2016 at 7:42 AM, Renato Golin via llvm-dev <llvm-...@lists.llvm.org> wrote:
> Now that we seem to be converging to an acceptable Git model, there
> was only one remaining doubt, and that's how the trigger to update a
> sequential ID will work. I've been in contact with GitHub folks, and
> this is in line with their suggestions...
>
> Given the nature of our project's repository structure, triggers in
> each repository can't just update their own sequential ID (like
> Gerrit) because we want a sequence in order for the whole project, not
> just each component. But it's clear to me that we have to do something
> similar to Gerrit, as this has been proven to work on a larger
> infrastructure.
>
> Adding an incremental "Change-ID" to the commit message should
> suffice, in the same way we have for SVN revisions now, if we can
> guarantee that:
>
> 1. The ID will be unique across *all* projects
> 2. Earlier pushes will get lower IDs than later ones
>
> Other things are not important:
>
> 3. We don't need the ID space to be complete (i.e., we can jump from
> 123 to 125 if some error happens)
> 4. We don't need an ID for every "commit", but for every push. A
> multi-commit push is a single feature, and doing so will help
> buildbots build the whole set as one change. Reverts should also be
> done in one go.
>
> What's left for the near future:
>
> 5. We don't yet handle multi-repository patch-sets. A way to
> implement this is via manual Change-ID manipulation (explained below).
> Not hard, but not a priority.
>
>
> Design decisions
>
> This could be a pre/post-commit trigger on each repository that
> receives an ID from somewhere (TBD) and updates the commit message.
> When the umbrella project synchronises, it'll already have the
> sequential number in. In this case, the umbrella project is not
> necessary for anything other than bisect, buildbots and releases.
>
> I personally believe that having the trigger in the umbrella project
> will be harder to implement and more error prone.
>
> The server has to have some kind of locking mechanism. Web services
> normally spawn dozens of "listeners", meaning multiple pushes won't
> fail to get a response, since the lock will be further down, after the
> web server.
>
> Therefore, the lock for the unique increment ID has to be elsewhere.
> The easiest thing I can think of is a SQL database with auto-increment
> ID. Example:
>
> Initially:
> sql> create table LLVM_ID ( id int not null primary key
> auto_increment, repository varchar(255) not null, hash varchar(255) not null );
> sql> alter table LLVM_ID auto_increment = 300000;
>
> On every request:
> sql> insert into LLVM_ID (repository, hash) values ('$repo_name', '$hash');
> sql> select last_insert_id(); -> return
>
> and then print the "last insert id" back to the user in the body of
> the page, so the hook can update the Change-ID on the commit message.
> The repo/hash info is more for logging, debugging and conflict
> resolution purposes.
>
> We also must limit the web server to only accept connections from
> GitHub's servers, to avoid abuse. Other repos in GitHub could still
> abuse it, and we can go further if it becomes a problem, but given point
> (3) above, we may fix that only if it does happen.
>
> This solution doesn't scale to multiple servers, nor does it help BPC
> planning. Given the size of our needs, that's not relevant.
>
>
> Problems
>
> If the server goes down, given point (3), we may not be able to
> reproduce locally the same sequence as the server would.
> This means SVN-based bisects and releases would not be possible during
> down times. But Git bisect and everything else would.
>
> Furthermore, even if a local script can't reproduce exactly what the
> server would do, it still can make it linear for bisect purposes,
> fixing the local problem. I can't see a situation in which we need the
> sequence for any other purpose.
>
> Upstream and downstream releases can easily wait a day or two in the
> unlucky situation that the server goes down at the exact time the
> release will be branched.
>
> Migrations and backups also work well, and if we use some cloud
> server, we can easily take snapshots every week or so, migrate images
> across the world, etc. We don't need duplication, read-only scaling,
> multi-master, etc., since only the web service will be writing/reading
> from it.
>
> All in all, a "robust enough" solution for our needs.
>
>
> Bundle commits
>
> Just FYI, here's a proposal that appeared in the "commit message
> format" round of emails a few months ago, and that can work well for
> bundling commits together, but will need more complicated SQL
> handling.
>
> The current proposal is to have one ID per push. This is easy by using
> auto_increment. But if we want to have one ID per multiple pushes, on
> different repositories, we'll need to have the same ID on two or more
> "repo/hash" pairs.
>
> On the commit level, the developer adds a temporary hash, possibly
> generated by a local script in 'utils'. Example:
>
> Commit-ID: 68bd83f69b0609942a0c7dc409fd3428
>
> This ID will have to be the same on both (say) LLVM and Clang commits.
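
For illustration only, here is a minimal sketch of what such a local script might do. Nothing like it exists in 'utils' as far as this thread says; the function name `add_temporary_commit_id` and the choice of a random UUID as the hash are assumptions, picked only because they yield a 32-character hex string like the example above.

```python
import uuid


def add_temporary_commit_id(message, commit_id=None):
    """Append a 'Commit-ID: <hash>' line to a commit message.

    Hypothetical helper: pass the commit_id returned by the first call
    when preparing the matching commit in the second repository, so
    both commits carry the same temporary hash.
    """
    if commit_id is None:
        # Assumption: any sufficiently random 32-hex-digit string will do.
        commit_id = uuid.uuid4().hex
    new_message = message.rstrip("\n") + "\n\nCommit-ID: " + commit_id + "\n"
    return new_message, commit_id
```

The LLVM commit would call it without a `commit_id`, and the Clang commit would pass the value returned by the first call.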
>
> The script will then take that hash, generate an ID, and then if it
> receives two or more pushes with such hashes, it'll return the *same*
> ID, say 123456, in which case the Git hooks on all projects will
> update the commit message by replacing the original Commit-ID with:
>
> Commit-ID: 123456
>
> To avoid hash clashes in the future, the server script can refuse
> existing hashes that are a few hours old and return an error, in which
> case the developer generates a new hash, updates all commit messages
> and re-pushes.
>
> If there is no Commit-ID, or if it's empty, we just insert a new empty
> line, get the auto increment ID and return. Meaning, empty Commit-IDs
> won't "match" any other.
>
> To solve this on the server side, a few ways are possible:
>
> A. We stop using primary_key auto_increment, handle the increment in
> the script and use SQL transactions.
>
> This would be feasible, but more complex and error prone. I suggest we
> go down that route only if keeping the repo/hash information is really
> important.
>
> B. We ditch keeping record of repo/hash and just re-use the ID, but
> record the original string, so we can match later.
>
> This keeps it simple and will work for our purposes, but we'll lose
> the ability to debug problems if they happen in the future.
>
> C. We improve the SQL design to have two tables:
>
> LLVM_ID:
> * ID: int PK auto
> * Key: varchar null
>
> LLVM_PUSH:
> * LLVM_ID: int FK (LLVM_ID:ID)
> * Repo: varchar not null
> * Push: varchar not null
>
> Every new push updates both tables, returns the ID. Pushes with the
> same Key re-use the ID and update only LLVM_PUSH, returns the same ID.
>
> This is slightly more complicated, will need to code scripts to gather
> information (for logging, debug), but gives us both benefits
> (debug+auto_increment) in one package. As a start, I'd recommend we
> take this route even before the script supports it. But it could be
> simple enough that we add support for it right from the beginning.
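
To make option C concrete, here is a rough sketch of the two-table scheme. It is not a proposed implementation: it uses Python's built-in sqlite3 in place of the MySQL-style database the quoted SQL implies, the names `open_db`, `allocate_id` and `commit_key` are made up for the example, and it leaves out the "refuse hashes a few hours old" rule and the web service itself.

```python
import sqlite3


def open_db(path=":memory:", first_id=300000):
    """Create the two tables from option C (SQLite stand-in, assumed names)."""
    db = sqlite3.connect(path)
    db.executescript("""
        create table if not exists LLVM_ID (
            id integer primary key autoincrement,
            commit_key text unique      -- temporary Commit-ID, null if none
        );
        create table if not exists LLVM_PUSH (
            llvm_id integer not null references LLVM_ID(id),
            repo text not null,
            push text not null
        );
    """)
    with db:
        # SQLite's counterpart of "alter table ... auto_increment = N":
        # seed the autoincrement counter once, if it has no entry yet.
        db.execute(
            "insert into sqlite_sequence (name, seq) "
            "select 'LLVM_ID', ? where not exists "
            "(select 1 from sqlite_sequence where name = 'LLVM_ID')",
            (first_id - 1,),
        )
    return db


def allocate_id(db, repo, push_hash, commit_key=None):
    """Return the sequential ID for a push, re-using it for a known key."""
    with db:  # one transaction per request
        row = None
        if commit_key:
            row = db.execute(
                "select id from LLVM_ID where commit_key = ?", (commit_key,)
            ).fetchone()
        if row is None:
            # New push (or first push with this key): take a fresh ID.
            cur = db.execute(
                "insert into LLVM_ID (commit_key) values (?)",
                (commit_key or None,),
            )
            llvm_id = cur.lastrowid
        else:
            # Same key seen before: re-use its ID, only log the push.
            llvm_id = row[0]
        db.execute(
            "insert into LLVM_PUSH (llvm_id, repo, push) values (?, ?, ?)",
            (llvm_id, repo, push_hash),
        )
    return llvm_id
```

In this sketch, two pushes carrying the same temporary Commit-ID (say one to LLVM and one to Clang) get the same number back, while pushes without one each get the next number in the sequence.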
>
> I vote for option C.
>
>
> Deployment
>
> I recommend we code this, set up a server, and leave it running for a
> while on our current mirrors *before* we do the move. A simple plan is to:
>
> * Develop the server, hooks and set it running without updating the
> commit message.
> * We follow the logs, make sure everything is sane
> * Change the hook to start updating the commit message
> * We follow the commit messages, move some buildbots to track GitHub
> (SVN still master)
> * When all bots are live tracking GitHub and all developers have moved, we
> flip.
>
> Sounds good?
>
> cheers,
> --renato
> _______________________________________________
> LLVM Developers mailing list
> llvm-...@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>