> They [hg/mercurial] are infinitely faster, more reliable, and more useful.
> And in some ways they are even conceptually simpler (I never quite understood
> some of the most subtle points of replica, like why it keeps saying it
> needs to update files that were already updated if I happen to have
> some local changes elsewhere, even when I have had them explained to
> me repeatedly, of course that is due to my own intellectual
> limitations, but...)

charles hit the main point — it's hard to fix things without
identifying the problem.

i do not use replica to update my machine, but this is a reflection
of the fact that legitimate changes to files i've never changed can
remove functionality that i use, like il.  i suspect that i'm a special
case in this regard and no amount of revision control fanciness
can save me.

i have used replica to move history between fs with different
on-disk layouts.  http://www.quanstro.net/plan9/history.pdf
has all the gory details.  i've been able to successfully
replicate about 3000 filesystems without a single problem.
but i had pretty controlled circumstances and i didn't have
any fs trouble.

in the case of sources, i think something is going wrong while
updatedb is running.  (hg or mercurial would likely suffer the same
fate, btw.)  from the log on sources:

; grep 386/init plan9.log
1196890780 540 a 386/init - 775 sys sys 1179372116 101487
1209616216 98 c 386/init - 775 sys sys 1209614871 101498
1219773605 695 a 386/init - 775 sys sys 1209614871 101498
1232008203 15347 d 386/init - 775 sys sys 1209614871 0
1232038858 697 a 386/init - 775 sys sys 1209614871 101498

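the 3rd column is the action: 'a' for add, 'c' for change and 'd'
for delete.  notice that other than the first and second lines,
all the actions are surprising: the file is added again with the
same mtime and length, deleted, and then added back unchanged.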
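
a rough sketch of how log lines like these could be checked mechanically. it is portable c, not plan 9 c, and everything in it is an assumption made for illustration: the field order is simply read off the excerpt above, and the names Entry, State, parseentry and suspicious are hypothetical, not part of replica.

```c
#include <stdio.h>

/* one log line; field names are guesses based on the excerpt */
typedef struct Entry Entry;
struct Entry {
	long time;		/* when the entry was written */
	long gen;		/* sequence number */
	char verb;		/* 'a' add, 'c' change, 'd' delete, 'm' metadata */
	char path[256];
	char spath[256];	/* "-" in the excerpt */
	unsigned mode;		/* octal permissions */
	char uid[32], gid[32];
	long mtime;
	long length;
};

/* what the log has said about one path so far */
typedef struct State State;
struct State {
	int exists;		/* start at 0 */
	long mtime, length;	/* start at -1; last values from an add or change */
};

/* parse one line; returns 0 on success, -1 if it does not have ten fields */
int
parseentry(const char *line, Entry *e)
{
	int n;

	n = sscanf(line, "%ld %ld %c %255s %255s %o %31s %31s %ld %ld",
		&e->time, &e->gen, &e->verb, e->path, e->spath,
		&e->mode, e->uid, e->gid, &e->mtime, &e->length);
	return n == 10 ? 0 : -1;
}

/* returns 1 if the entry is odd given the history in st, then updates st */
int
suspicious(State *st, const Entry *e)
{
	int bad = 0;

	switch(e->verb){
	case 'a':
		/* add of a live file, or re-add of identical data after a delete */
		if(st->exists || (st->mtime == e->mtime && st->length == e->length))
			bad = 1;
		st->exists = 1;
		st->mtime = e->mtime;
		st->length = e->length;
		break;
	case 'c':
		if(!st->exists)
			bad = 1;
		st->mtime = e->mtime;
		st->length = e->length;
		break;
	case 'm':
		if(!st->exists)
			bad = 1;
		break;
	case 'd':
		if(!st->exists)
			bad = 1;
		st->exists = 0;	/* keep mtime and length to spot a spurious re-add */
		break;
	default:
		bad = 1;
	}
	return bad;
}
```

fed the five 386/init lines in order, starting from a State of {0, -1, -1}, this flags the third and fifth entries. the delete itself passes the check, but the identical re-add after it is what gives it away as spurious.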

since several people have reported venti errors when
connecting to sources recently, it seems to me that the
most logical explanation is that when the update process
ran on 15 jan, venti was mia.  then replica interpreted
the error to mean that 386/init had been deleted.
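
the distinction at stake, sketched in posix c (so errno stands in for the plan 9 error string, and none of this is replica's code): only "no such file" is evidence that a file was deleted. any other failure means the walker could not look, and should not be logged as a delete. the names Present, Deleted, Unknown, classify and probe are hypothetical.

```c
#include <errno.h>
#include <sys/stat.h>

enum { Present, Deleted, Unknown };

/* map an errno from a failed stat to what it says about the file */
int
classify(int err)
{
	switch(err){
	case 0:
		return Present;
	case ENOENT:
	case ENOTDIR:
		return Deleted;		/* the name really is gone */
	default:
		return Unknown;		/* EIO, ETIMEDOUT, ...: the server may just be sick */
	}
}

/* look at one path and report which of the three cases applies */
int
probe(const char *path)
{
	struct stat st;

	if(stat(path, &st) == 0)
		return Present;
	return classify(errno);
}
```

a walker built this way would write a 'd' entry only for Deleted and stop, or skip the file, on Unknown.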

supposing this is correct, it would be logical to add this
patch, which i added to replica for the history paper.
i think that adding a strcmp for "venti error"
to this patch would do the trick.

/n/sources/plan9//sys/src/cmd/replica/updatedb.c:77,87 - updatedb.c:77,91
                        change = 1;
                }
        }else{
-               if((d.mode&DMDIR)==0 && (od.mtime!=d.mtime || od.length!=d.length)){
+               if((od.mode&DMDIR) != (d.mode&DMDIR)){
+                       xlog('d', new, &d);
+                       xlog('a', new, &d);
+                       change = 1;
+               }else if((d.mode&DMDIR)==0 && (od.mtime!=d.mtime || od.length!=d.length)){
                        xlog('c', new, &d);
                        change = 1;
                }
-               if((!uid&&strcmp(od.uid,d.uid)!=0)
+               else if((!uid&&strcmp(od.uid,d.uid)!=0)
                || strcmp(od.gid,d.gid)!=0 
                || od.mode!=d.mode){
                        xlog('m', new, &d);
/n/sources/plan9//sys/src/cmd/replica/updatedb.c:97,135 - updatedb.c:101,106
  }
  
  void
- warn(char *msg, void*)
- {
-       char *p;
- 
-       fprint(2, "warning: %s\n", msg);
- 
-       /* find the %r in "can't open foo: %r" */
-       p = strstr(msg, ": ");
-       if(p)
-               p += 2;
- 
-       /*
-        * if the error is about a remote server failing,
-        * then there's no point in continuing to look
-        * for changes -- we'll think everything got deleted!
-        *
-        * actual errors i see are:
-        *      "i/o on hungup channel" for a local hangup
-        *      "i/o on hungup channel" for a timeout (yank the network wire)
-        *      "'/n/sources/plan9' Hangup" for a remote hangup
-        * the rest is paranoia.
-        */
-       if(p){
-               if(cistrstr(p, "hungup") || cistrstr(p, "Hangup")
-               || cistrstr(p, "rpc error")
-               || cistrstr(p, "shut down")
-               || cistrstr(p, "i/o")
-               || cistrstr(p, "connection"))
-                       sysfatal("suspected network or i/o error - bailing out");
-       }
- }
- 
- void
  usage(void)
  {
        fprint(2, "usage: replica/updatedb [-c] [-p proto] [-r root] [-t now n] [-u uid] [-x path]... db [paths]\n");
/n/sources/plan9//sys/src/cmd/replica/updatedb.c:184,190 - updatedb.c:155,161
        nmatch = argc-1;
  
        db = opendb(argv[0]);
-       if(rdproto(proto, root, walk, warn, nil) < 0)
+       if(rdproto(proto, root, walk, nil, nil) < 0)
                sysfatal("rdproto: %r");
  
        if(!changesonly){
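
the substring test in warn() with the extra string added might look roughly like this. it is a portable c sketch, not a patch: cifind is a hypothetical stand-in for plan 9's cistrstr, fatalwarning is a made-up name, and the exact wording "venti error" is an assumption about what the server reports.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* case-insensitive substring search; stand-in for cistrstr */
static const char*
cifind(const char *s, const char *sub)
{
	size_t i, n;

	n = strlen(sub);
	for(; *s; s++){
		for(i = 0; i < n; i++)
			if(tolower((unsigned char)s[i]) != tolower((unsigned char)sub[i]))
				break;
		if(i == n)
			return s;
	}
	return NULL;
}

/* substrings that suggest the server, not the file, is the problem */
static const char *fatal[] = {
	"hungup", "hangup", "rpc error", "shut down", "i/o", "connection",
	"venti error",	/* the proposed addition; wording is an assumption */
};

/* returns 1 if the text after ": " looks like a network or storage failure */
int
fatalwarning(const char *msg)
{
	const char *p;
	size_t i;

	p = strstr(msg, ": ");
	if(p == NULL)
		return 0;
	p += 2;
	for(i = 0; i < sizeof fatal / sizeof fatal[0]; i++)
		if(cifind(p, fatal[i]) != NULL)
			return 1;
	return 0;
}
```

a caller would bail out, as warn() does with sysfatal, whenever fatalwarning returns 1, so that a sick venti stops the walk instead of turning into a run of 'd' entries.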


- erik
