In mid-April, Ingo announced the availability of his conversion
from CVS to a flat patchset format:

    From: Ingo Molnar <[EMAIL PROTECTED]>
    Subject: full kernel history, in patchset format
    Message-ID: <[EMAIL PROTECTED]>

    the history data starts at 2.4.0 and ends at 2.6.12-rc2. I've included a 
    script that will apply all the patches in order and will create a 
    pristine 2.6.12-rc2 tree.
    ...
    note: i kept the patches the cvsps utility generated as-is, to have a 
    verifiable base to work on. There were a very small amount of deltas 
    missed (about a dozen), probably resulting from CVS related errors, 
    these are included in the diff-CVS-to-real patch. Also, the patch format 

I was futzing with the script in Ingo's tarball, which
originally used "patch".  After converting it to use
'git-apply', I had some trouble applying the patches, which
eventually led me to find and fix a corner-case bug:
git-apply did not handle files with an incomplete line (a last
line with no terminating newline) correctly in some cases.
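
For readers unfamiliar with the term, an incomplete line is a
final line that lacks its terminating newline.  The sketch below
only illustrates how such a file shows up in a unified diff; the
file names are made up and are not taken from the patchset.

```shell
# Hypothetical file names, for illustration only.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

printf 'first\nlast' >old.txt      # "last" has no newline: an incomplete line
printf 'first\nlast\n' >new.txt    # same text, properly terminated

# diff exits with status 1 when the files differ, which is expected here,
# so that status is deliberately ignored.
patch_text=$(LC_ALL=C diff -u old.txt new.txt || true)
printf '%s\n' "$patch_text"
```

The hunk carries a "\ No newline at end of file" marker after
the "-last" line, and a patch applier has to honor that marker
to reproduce the file exactly.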

After I fixed that problem, the script still found some more
errors in the patchset, but after manual inspection they looked
to me like problems on the patch generation side, not on the
patch application side.  I only checked 2.4.0, 2.6.9, 2.6.11
and 2.6.12-rc2, but the trees built were byte-for-byte
equivalent, except for the file executable bits, which are not
preserved in the patch series.
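
One way to make such a comparison is sketched below.  The
directory names are made up, and these are not necessarily the
commands used for the check above.  The point is that "diff -r"
looks at contents only, so a lost executable bit has to be
checked separately.

```shell
# Hypothetical directories standing in for a rebuilt tree and a pristine one.
tmp=$(mktemp -d) || exit 1
mkdir "$tmp/built" "$tmp/pristine"

printf '#!/bin/sh\necho hi\n' >"$tmp/built/script"
cp "$tmp/built/script" "$tmp/pristine/script"
chmod 644 "$tmp/built/script"      # executable bit lost, as with plain patches
chmod 755 "$tmp/pristine/script"   # executable bit present in the original

# diff -r compares file contents and ignores permission bits.
if diff -r "$tmp/built" "$tmp/pristine" >/dev/null
then content=same
else content=differ
fi

# Check the executable bit on each side separately.
built_exec=no
if test -x "$tmp/built/script"; then built_exec=yes; fi
pristine_exec=no
if test -x "$tmp/pristine/script"; then pristine_exec=yes; fi

echo "content: $content, built exec: $built_exec, pristine exec: $pristine_exec"
```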

The patch attached to this message is not for inclusion in the
git source tree.  It is the script I used for the conversion.
For it to work, you will need the following patches to apply.c,
which will be sent separately:

  [PATCH 1/2] apply.c: handle incomplete lines correctly.
  [PATCH 2/2] apply.c: --exclude=fnmatch-pattern option.

I did this not because I was particularly interested in the
ancient kernel history, but because I wanted to see how well
packs perform.  Here are some numbers that may be of interest.

    26M pack-000002.pack    18M pack-015360.pack
    48M pack-001024.pack    21M pack-016384.pack
    22M pack-002048.pack    18M pack-017408.pack
    20M pack-003072.pack    19M pack-018432.pack
    21M pack-004096.pack    17M pack-019456.pack
    24M pack-005120.pack    22M pack-020480.pack
    20M pack-006144.pack    20M pack-021504.pack
    20M pack-007168.pack    17M pack-022528.pack
    24M pack-008192.pack    23M pack-023552.pack
    19M pack-009216.pack    16M pack-024576.pack
    24M pack-010240.pack    21M pack-025600.pack
    20M pack-011264.pack    19M pack-026624.pack
    23M pack-012288.pack    18M pack-027648.pack
    21M pack-013312.pack    17M pack-028237.pack
    18M pack-014336.pack

The script makes a full pack after importing 2.4.0 (which is
patchset #2), and then makes an incremental pack every 1024
commits, so the baseline pack is 26MB and the first incremental,
up to patchset #1024, is 48MB.  The incrementals average around
20MB per 1024 commits.  The repository with the full history,
repacked into a single pack, is 203MB (370291 objects).
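
As a rough cross-check, the sizes in the listing above can be
summed.  This is only back-of-the-envelope arithmetic on the
rounded figures, so the results are approximate.

```shell
# Pack sizes in MB, copied from the listing (left column, then right column).
sizes="26 48 22 20 21 24 20 20 24 19 24 20 23 21 18
18 21 18 19 17 22 20 17 23 16 21 19 18 17"

total=0
count=0
for s in $sizes
do
  total=$((total + s))
  count=$((count + 1))
done

# The first entry (26MB) is the full baseline pack; the rest are incrementals.
incr_total=$((total - 26))
incr_count=$((count - 1))

echo "all packs:    ${total}MB in $count packs"
echo "incrementals: ${incr_total}MB in $incr_count packs," \
     "about $((incr_total / incr_count))MB each"
```

The sum comes to about 616MB across the 29 packs, roughly three
times the 203MB of the single repacked pack.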


------------
A script to slurp full 2.4.0->2.6.12-rc2 history.

Create an empty directory, put the "build-git-tree" script in it, 
and extract Ingo's CVSPS conversion result, available at:

  http://kernel.org/pub/linux/kernel/people/mingo/Linux-2.6-patchset/

in it.  Make sure the definition of the variable PS matches the
name of the directory the tarball was extracted into, and run
the script.  Some hours later, you will have a linux/ directory
whose .git subdirectory has a GITified full 2.4.0->2.6.12-rc2
history.
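
Since the run takes hours, it may be worth checking the layout
first.  The helper below is a hypothetical sketch and is not part
of the script.  It only looks for the two paths the script reads
from the patchset directory, patches/1.patch and logo.gif.

```shell
# check_layout DIR: succeed only if DIR looks like the extracted patchset.
# Hypothetical helper, not part of build-git-tree.
check_layout () {
  ps=$1
  for need in "$ps/patches/1.patch" "$ps/logo.gif"
  do
    if ! test -f "$need"
    then
      echo "missing: $need" >&2
      return 1
    fi
  done
  return 0
}
```

It could be used as "check_layout "$PS" && sh build-git-tree",
with PS set to the same directory name as in the script.  The
script sets END and CPP with ": ${END=28237}" and
": ${CPP=1024}", so both can be overridden from the environment,
for example with "END=2048 sh build-git-tree" for a shorter
trial run.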

Now if we had a mechanism to graft a later history which starts
at 2.6.12-rc2 on top of this earlier history leading up to
it,...  ;-)

Signed-off-by: Junio C Hamano <[EMAIL PROTECTED]>
---

 build-git-tree |  177 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 177 insertions(+), 0 deletions(-)

diff --git a/build-git-tree b/build-git-tree
new file mode 100755
--- /dev/null
+++ b/build-git-tree
@@ -0,0 +1,177 @@
+#!/bin/sh
+
+PS=linux-2.4.0-to-2.6.12-rc2-patchset
+cat build-git-tree >build-git-tree-next
+cat >sayVersion <<\EOF
+default:
+	@echo "v$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION)"
+EOF
+
+: >duplicate-tags
+
+: ${END=28237}
+: ${CPP=1024}
+
+sedScript='
+    /^Date: /{
+       s/^Date: \(.*\)/\1/
+        s/'\''/'\''\\'\'''\''/g
+       s/^.*/DATE='\''&'\''/p
+    }
+    /^Author: /{
+       s/^Author: \(.*\)/\1/
+        s/'\''/'\''\\'\'''\''/g
+       s/^.*/AUTHOR='\''&'\''/p
+    }
+    /^Log:$/{
+       n
+       b LOOP
+    }
+    b
+   : LOOP
+    /^BKrev: /{
+       g
+       s/^\n//
+       s/\n$//
+       s/'\''/'\''\\'\'''\''/g
+       s/^.*/LOG='\''&'\''/p
+       q
+    }
+    H
+    n
+    b LOOP
+'
+
+rm -fr errs linux pack && mkdir -p linux/Documentation errs pack || exit
+cp $PS/logo.gif linux/Documentation
+
+cd linux
+git-init-db
+git-ls-files --cached -z |
+xargs -0 -r git-update-cache --add --remove --
+find ?* -type f -print0 | xargs -0 -r git-update-cache --add --
+
+N=1 P=
+while expr $N \<= $END >/dev/null
+do
+  NN=$(printf "%06d" $N)
+  FILE=../$PS/patches/$N.patch
+
+  e=`sed -ne "$sedScript" $FILE` &&
+  eval "$e" &&
+
+  GIT_AUTHOR_NAME="$AUTHOR" &&
+  GIT_AUTHOR_EMAIL="$AUTHOR" &&
+  GIT_COMMITTER_NAME="$AUTHOR" &&
+  GIT_COMMITTER_EMAIL="$AUTHOR" &&
+  GIT_AUTHOR_DATE="+0000 $DATE" &&
+  GIT_COMMITTER_DATE="+0000 $DATE" &&
+
+  export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME \
+         GIT_COMMITTER_EMAIL GIT_AUTHOR_DATE GIT_COMMITTER_DATE &&
+
+  echo "* $NN - $AUTHOR - $DATE" &&
+
+  git-apply --exclude='BitKeeper/*' --index --summary --apply \
+       <$FILE >../errs/$NN.out 2>&1 || {
+    # Special case.
+    patch -E -p1 <$FILE >../errs/$NN.spc 2>&1
+    sed -ne 's|^File \(.*\) is not empty after patch, as expected$|\1|p' \
+        <../errs/$NN.spc |
+    while read path
+    do
+        echo "* expected to be empty: $path"
+        ls -l "$path"
+       rm -f "$path"
+    done >.tmp
+    cat .tmp >>../errs/$NN.spc
+    rm -f .tmp
+
+    # Some patches (like 678) do not have Index line for everything,
+    # so looking for Index: line is not good enough.
+    sed -n -e 's|^--- linux/\([^ 	]*\).*|\1|p' \
+          -e 's|^+++ linux/\([^ 	]*\).*|\1|p' $FILE |
+    sort -u |
+    xargs -r git-update-cache --add --remove --
+    if test -d BitKeeper
+    then
+      find BitKeeper -type f -print |
+      while read path
+      do
+       echo removing "$path"
+      done | tee -a ../errs/$NN.spc
+      find BitKeeper -type f -print0 |
+      xargs -0 -r git-update-cache --force-remove --
+      rm -fr BitKeeper
+    fi
+  }
+
+  T=`git-write-tree` &&
+  C=$(echo "$LOG" | git-commit-tree $T $P) &&
+  echo $C >.git/HEAD &&
+
+  P="-p $C" || exit
+
+  # Look at the Makefile change and make a tag.
+  git-diff-tree -p $C Makefile |
+  sed -ne '
+    /^[-+]VERSION =/{
+       p
+       q
+    }
+    /^[-+]PATCHLEVEL =/{
+       p
+       q
+    }
+    /^[-+]SUBLEVEL =/{
+       p
+       q
+    }
+    /^[-+]EXTRAVERSION =/{
+       p
+       q
+    }
+  ' >.tmp
+  if test -s .tmp
+  then
+      v=$((sed -ne '/^VERSION/p;/^PATCHLEVEL/p;/^SUBLEVEL/p;/^EXTRAVERSION/p' \
+          Makefile; cat ../sayVersion) | make -f -)
+      if test -f ".git/refs/tags/$v"
+      then
+          echo "* $v (duplicate)"
+         echo "$C      $v" >>../duplicate-tags
+      else
+         echo "$C" >".git/refs/tags/$v"
+         echo "* $v"
+      fi
+  fi
+  rm -f .tmp
+
+  # 70a29f4bd97bbb78fac1cc7f87c13fb08d1a12cd == v2.4.0.6
+
+  # Pack
+  if expr \( $N = 2 \) \| \( $N % $CPP = 0 \) >/dev/null
+  then
+    {
+       echo "* packing"
+       du -sh .git/objects &&
+       case "$N" in
+       2)
+           pack=$(git-rev-list --objects $C | \
+                  git-pack-objects ../pack/pack-$NN)
+           ;;
+       *)
+           pack=$(git-rev-list --unpacked --objects $C | \
+                  git-pack-objects --incremental ../pack/pack-$NN)
+           ;;
+       esac &&
+       ln ../pack/pack-$NN-$pack.idx ../pack/pack-$NN-$pack.pack \
+               .git/objects/pack/. &&
+       git-prune-packed &&
+       du -sh .git/objects
+    } 2>&1 | tee ../errs/$NN.packlog
+  fi
+  test -f ../errs/$NN.spc || rm -f ../errs/$NN.out
+
+  N=`expr $N + 1`
+done

