I will test it with a table derived mask and see what happens.   I would
like the table to be large enough to encompass 19x19 GO too, but I'll
test it with a small table first.

- Don

On Fri, 2006-12-08 at 00:39 +0100, Antoine de Maricourt wrote:
> Last time I checked MT was under 20cc per call (I assume it's
> inlined). On a P4, the shift operator is supposed to have a 4 cc
> latency. The dependency chain to get the mask yields to 25 cc latency
> before the mask can be used (4 per shift + 1 for OR) * 5. So it's
> quite possible that this sequence dominates the call to rand(). Just
> give a try... unless you use another processor with a better latency
> for shift operations. But even with those processors it might be
> benefic to replace 15 instructions by only one. Of course if you make
> a specific implementation for small n (below 256), you can remove the
> 2 last shift as well.
> 
> Antoine
> > I'm pretty sure the time of this function is dominated by the call to
> > rand(), but it never occurred to do a table lookup for the mask,
> > interesting idea.
> > 
> > 
> > - Don
> > 
> > 
> > On Thu, 2006-12-07 at 22:36 +0100, Antoine de Maricourt wrote:
> >   
> > > If this randint routine is critical, you can save some calls to rand()
> > > when you know that n is always below some value (see my previous post
> > > about bitmap go).
> > > For instance if n < 128 (probably true for 9x9 go), you could try:
> > > 
> > > while (true) {
> > >   r = rand();
> > > 
> > >   if ((r & v) < n) return (r & v);
> > >   r >>= 7;
> > >   if ((r & v) < n) return (r & v);
> > >   r >>= 7;
> > >   ... // 2 more times
> > > }
> > > 
> > > Also it could be better to read the mask (v) from a table (provided
> > > the upper limit of n is small enough). At least on a P4, it will avoid
> > > a long data dependency chain and might give a (probably small)
> > > improvment. It might even be benefical on other processors.
> > > 
> > > Antoine
> > > 
> > >     
> > > > On Thu, 2006-12-07 at 16:05 +0100, Łukasz Lew wrote:
> > > >   
> > > >       
> > > > >     ii = pm::rand () % empty_v_cnt;         // TODO improve speed "%"
> > > > >     
> > > > >         
> > > > Try this,  I think it could be faster, not sure,  but has the advantage
> > > > that it's slightly more correct.
> > > > 
> > > > // returns an integer between 0 and n-1 inclusive
> > > > // 
> > > > unsigned long randint(unsigned long n) 
> > > > {
> > > >   unsigned long   v =  n;
> > > >   unsigned long   r;
> > > > 
> > > >   v--;
> > > >   v |= v >> 1;
> > > >   v |= v >> 2;
> > > >   v |= v >> 4;
> > > >   v |= v >> 8;
> > > >   v |= v >> 16;
> > > > 
> > > >   do  { r = rand(); } while ( (r & v) >= n );
> > > > 
> > > >   return( r & v );
> > > > }
> > > > 
> > > > 
> > > > 
> > > > - Don
> > > > 
> > > > 
> > > > _______________________________________________
> > > > computer-go mailing list
> > > > computer-go@computer-go.org
> > > > http://www.computer-go.org/mailman/listinfo/computer-go/
> > > > 
> > > > 
> > > >   
> > > >       
> > 
> > 
> > 
> >   

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to