date:20061215

Re: [computer-go] Fast Board implementation

2006-12-15 Thread Urban Hafner


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


On Dec 11, 2006, at 19:22 , Łukasz Lew wrote:


Soon I will publish the code on my web page.
But I don't have a web page yet. :)

The second issue is a licence.
Can I use my Go Board implementation in commercial program if I
publish it on Gnu?
If no, then can I change the licence when I want to?


Just to make it clear. If you want to just provide your implementation
to everybody that's OK. But if you want to do "joint development" (as
you stated in your first mail) you might not want to use the GPL as you
can only relicense your own code and not the code written by everyone
else. So in this case something like a BSD style license would be better
(though that might allow others to use your code in a commercial
program, too).

Anyway, I you are interested I could provide a website and/or Subversion
repository for your code.

Urban
- --
http://bettong.net - Urban's Blog



-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (Darwin)

iD8DBQFFgm2mggNuVCIrEyURAqDyAJ0djicZ3BSv43YJ4yuEFj1f3WV6agCdECww
ahYws7UhqQ8ep/pGFnfFRbc=
=iXaQ
-END PGP SIGNATURE-
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Fast Board implementation

2006-12-15 Thread Zach Wegner


I could pack all of the information into a single array.  Here is one
possible scheme which fits in 32 bits and accommodates all board sizes:

  3 bits to identify white/black/empty/off-board
  9 bits for the hash key (the reference stone)
  9 bits for pseudo liberty count
  9 bits for stone count


That's quite close to the data structure I've been working on. So far I have:
2 bits to identify black/white/empty (no off-board, I'm not really a
"mailboxer")
2 bit to identify liberties for both sides
1 bit for parity, used with floodfills to track which squares have been updated
9 bit for string reference: I was thinking of using this as a sort of
linked list. Now I'm not so sure, it might be better to have each
element link to a root element. It is possible however to keep a
doubly linked list and a base reference...

For the rest, I'm not so sure what to put in; this is my first Go
program. The pseudo liberty count might be useful, or maybe the real
liberty count. I was thinking of having a linked list of liberties for
a string. However a liberty can belong to more than one string/color.

Also a topic that I've thought about a lot is bitboards (I come from
the chess world). It seems the most natural representation for Go as
there is only one piece type and floodfills are used all the time.
However the use of them seems to be quite esoteric, the only
implementation I have seen is 1 word = 1 row. I would rather use more
data in each word and have each word represent a region more
rectangular in shape, but this makes things much more complicated,
especially if you support multiple board sizes.

Zach
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

[computer-go] Re:Re:Are there researches about human annotation to game records?

2006-12-15 Thread 荒木伸夫

Thank you for responding to my question, everyone.

>Nijhuis
Great! Thank you for your advice.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Fast Board implementation

2006-12-15 Thread Łukasz Lew

On 12/15/06, Don Dailey <[EMAIL PROTECTED]> wrote:

> At the time of Your post I've had it already implemented and regarded
> it like "my sweet secret" :)

I don't know how sweet this secret is, but it does help.  I just
implemented it in Lazarus and I get about a 20% speedup.  That's not
bad, but nothing to write home about.  To a chess programmer this is
rather enormous!  I am guessing that the win would be much greater on
bigger boards.

In principle it would probably be FAR faster, but it adds extra
complexity,
operations and overhead to executing a move which subtracts some from
the
savings.

The question is: "what is the best way to implement it?"  The problem
is that you now must track pseudo liberty counts for every chain on
the board.  This is more complicated obviously.

Every time a stone is placed or removed, 4 additional entries must be
accessed in order to update the pseudo liberty counts.

It's still a good trade-off because the testing for whether a move
is a capture or suicide is cheap.

why this is cheap?
Isn't it more expensive than with normal liberties?

I'm reviewing my design now and I think I see some ways to improve.

Here is the current  data structure (assuming the 9x9 board):

  1.  A 10x11 board (actually 10x11 plus 1 for diagonal checking.)
  0 = empty
  1 = white
  2 = black
  4 = off-board points.

  2.  Another 10x11 board-like array - if there is a occupied point on
  the board this array contains the index (hash key) of a "string"
  data type.

I have here union set.

  3.  A list of string data types (a struct in C):
1.  stone count
2.  pseudo liberties for the whole string
3.  a reference stone.

To save time, the list of strings is really more like a hash table,
not a proper list.  It's a perfect hash table because the keys will
never collide.  I use the reference stone as the key for each string
which is generally the first stone placed in the string.  So the hash
key computation is just a table lookup and is the only thing the
second table is used for.

So to summarize, I really have 3 10x11 arrays, one of them is an array
of structs in C (actually in D) and is viewed as a hash table.

I could pack all of the information into a single array.  Here is one
possible scheme which fits in 32 bits and accommodates all board sizes:

  3 bits to identify white/black/empty/off-board
  9 bits for the hash key (the reference stone)
  9 bits for pseudo liberty count
  9 bits for stone count

Usually packing is bad idea, but sometimes is good when you need to
fit in cache or
if You have operations that do not need unpacking.

(I have packed 3 local counters packed in one int)

It's kind of interesting when you think about it.   A single array does
triple duty as a:

1. Go board.
2. Hash table.
3. lookup table - to get the "hash key" (table index) of the data
for
   the string that occupies this point.

The hash table bits in the table are usually irrelevant, it is only
valid for the single point that represent the hash table entry for a
given string.

So this scheme does no list manipulations and has a fixed size in
memory.  It's compact and I think fast.  Is there anything obvious I'm
missing that simplifies things?

By the way, stone count field is helpful but not necessary.  But it's
trivial to keep updated and it's convenient.

Using the compact representation would slow down some of the
operations which are in fast inner loops now, but would probably be a
win overall, perhaps a big win because I'm not manipulating several
different arrays.

So I will also try this compact implementation unless someone has a
superior scheme to suggest.

- Don

Best Regards,
Lukasz

PS
I've spent a lot of time on cleaning up the board code to prepare it to release.
It will arrive before end of the week :)

On Mon, 2006-12-11 at 20:00 +0100, Łukasz Lew wrote:
> At the time of Your post I've had it already implemented and regarded
> it like "my sweet secret" :) , so I was expecting that when You
> published it everybody will appreciate, and start using. But I guess
> it wasn't easy to see benefits.
> I wonder how many really good ideas came through this list unnoticed.
>
> Best Regards,
> Lukasz
>
>
> On 12/11/06, House, Jason J. <[EMAIL PROTECTED]> wrote:
> > Wow.  Did some of my early posts on liberties/chains actually get used
> > by someone?  Or did the idea of pseudo-liberties and union sets exist
> > before my post(s) on the topic?  I remember a lot of negative feedback
> > on pseudo-liberties at the time of my post.
> >
> > -Original Message-
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of Lukasz Lew
> > Sent: Monday, December 11, 2006 1:31 PM
> > To: [EMAIL PROTECTED]
> > Cc: computer-go
> > Subject: Re: [computer-go] Fast Board implementation
> >
> > Pseudo liberties of a group is a sum of liberties of each stone,
> > example:
> >
> > O
> > OXXXO
>

Re: [computer-go] Fast Board implementation

2006-12-15 Thread Łukasz Lew

On 12/15/06, Luke Gustafson <[EMAIL PROTECTED]> wrote:

Are you using the union-find algorithm with path compression for joining
strings?  It's very simple and very fast.
http://en.wikipedia.org/wiki/Disjoint-set_data_structure#Disjoint-set_forests

The other main thing to consider is reducing branch mispredictions, which
(I'm guessing) could very well use more clock cycles on a P4 than performing
the actual calculations themselves.  I think this is why the next_stone list
is faster than using e.g. a recursive algorithm, which involves a lot of
if's that cannot be predicted.

I confirm branches are most costly.
Removing 1 not needed "if" gave me speedup of 5%.

Compact data I think is unnecessary, although perhaps eliminating the String
struct would speed things up a tad.  In fact, on a P4, shift operations are
slower than L1 cache memory retrievals, so fancy packing of data is not
worthwhile.  As long as all data fits in the L1 cache it should be max
speed.

:)

I am working on a program myself, and I will be porting it to assembly at
some point; right now I am just experimenting with the algorithm design in
C.  (Although this happens pretty slowly since I am very busy with work.)  I
think that this sort of algorithm is perfect for assembly language, where
hand-optimization can be much better than compiler optimization.  I will let
you know how it turns out compared to Lukasz :)

I took a lot of willpower to stop myself from rewriting in assembler :)
I guess I would obtain then about 50-60 pps  :) but any more complex
than random moves external algorithm is the bottle neck.

Anyway I would like to advertise HLA.

>> At the time of Your post I've had it already implemented and regarded
>> it like "my sweet secret" :)
>
> I don't know how sweet this secret is, but it does help.  I just
> implemented it in Lazarus and I get about a 20% speedup.  That's not
> bad, but nothing to write home about.  To a chess programmer this is
> rather enormous!  I am guessing that the win would be much greater on
> bigger boards.
>
> In principle it would probably be FAR faster, but it adds extra
> complexity,
> operations and overhead to executing a move which subtracts some from
> the
> savings.
>
> The question is: "what is the best way to implement it?"  The problem
> is that you now must track pseudo liberty counts for every chain on
> the board.  This is more complicated obviously.
>
> Every time a stone is placed or removed, 4 additional entries must be
> accessed in order to update the pseudo liberty counts.
>
> It's still a good trade-off because the testing for whether a move
> is a capture or suicide is cheap.
>
> I'm reviewing my design now and I think I see some ways to improve.
>
> Here is the current  data structure (assuming the 9x9 board):
>
>  1.  A 10x11 board (actually 10x11 plus 1 for diagonal checking.)
>  0 = empty
>  1 = white
>  2 = black
>  4 = off-board points.
>
>  2.  Another 10x11 board-like array - if there is a occupied point on
>  the board this array contains the index (hash key) of a "string"
>  data type.
>
>  3.  A list of string data types (a struct in C):
>1.  stone count
>2.  pseudo liberties for the whole string
>3.  a reference stone.
>
> To save time, the list of strings is really more like a hash table,
> not a proper list.  It's a perfect hash table because the keys will
> never collide.  I use the reference stone as the key for each string
> which is generally the first stone placed in the string.  So the hash
> key computation is just a table lookup and is the only thing the
> second table is used for.
>
> So to summarize, I really have 3 10x11 arrays, one of them is an array
> of structs in C (actually in D) and is viewed as a hash table.
>
> I could pack all of the information into a single array.  Here is one
> possible scheme which fits in 32 bits and accommodates all board sizes:
>
>  3 bits to identify white/black/empty/off-board
>  9 bits for the hash key (the reference stone)
>  9 bits for pseudo liberty count
>  9 bits for stone count
>
> It's kind of interesting when you think about it.   A single array does
> triple duty as a:
>
>1. Go board.
>2. Hash table.
>3. lookup table - to get the "hash key" (table index) of the data
> for
>   the string that occupies this point.
>
> The hash table bits in the table are usually irrelevant, it is only
> valid for the single point that represent the hash table entry for a
> given string.
>
> So this scheme does no list manipulations and has a fixed size in
> memory.  It's compact and I think fast.  Is there anything obvious I'm
> missing that simplifies things?
>
> By the way, stone count field is helpful but not necessary.  But it's
> trivial to keep updated and it's convenient.
>
> Using the compact representation would slow down some of the
> operations which are in fast inner loops now, but would probably be a
> win overall, perhaps

Re: [computer-go] Fast Board implementation

2006-12-15 Thread Łukasz Lew

Thanks :)
I will stick to GPL and use Bazar model (The Cathedral and the Bazaar).
I will provide tarball and darcs repository :)

Best regards,
Lukasz Lew

On 12/15/06, Urban Hafner <[EMAIL PROTECTED]> wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Dec 11, 2006, at 19:22 , Łukasz Lew wrote:

> Soon I will publish the code on my web page.
> But I don't have a web page yet. :)
>
> The second issue is a licence.
> Can I use my Go Board implementation in commercial program if I
> publish it on Gnu?
> If no, then can I change the licence when I want to?

Just to make it clear. If you want to just provide your implementation
to everybody that's OK. But if you want to do "joint development" (as
you stated in your first mail) you might not want to use the GPL as you
can only relicense your own code and not the code written by everyone
else. So in this case something like a BSD style license would be better
(though that might allow others to use your code in a commercial
program, too).

Anyway, I you are interested I could provide a website and/or Subversion
repository for your code.

Urban
- --
http://bettong.net - Urban's Blog

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (Darwin)

iD8DBQFFgm2mggNuVCIrEyURAqDyAJ0djicZ3BSv43YJ4yuEFj1f3WV6agCdECww
ahYws7UhqQ8ep/pGFnfFRbc=
=iXaQ
-END PGP SIGNATURE-
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Fast Board implementation

2006-12-15 Thread steve uurtamo

> I confirm branches are most costly.
> Removing 1 not needed "if" gave me speedup of 5%.

do you mean that the 'if' was never evaluated,
or that it always evaluated the same way, or
that it was handled elsewhere?  i'm stunned
that a single 'if' was 5% of the execution time
of your code.  it might make more sense if your code
was 20 'if' evaluations, all of which had empty
bodies.

s.

__
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Fast Board implementation

2006-12-15 Thread Łukasz Lew

if (is_player (color_at[v]))
 chain_at[v].find_root ()->inc_lib_cnt ();

I removed the if

chain_at[v].find_root ()->inc_lib_cnt ();

And added some code elsewhere to make it correct.

this code is executed for every neighbour intersection (v) of
intersection that just turned into empty (was captured)

Removing the "if" was more than 4% speedup.

Lukasz

On 12/15/06, steve uurtamo <[EMAIL PROTECTED]> wrote:

> I confirm branches are most costly.
> Removing 1 not needed "if" gave me speedup of 5%.

do you mean that the 'if' was never evaluated,
or that it always evaluated the same way, or
that it was handled elsewhere?  i'm stunned
that a single 'if' was 5% of the execution time
of your code.  it might make more sense if your code
was 20 'if' evaluations, all of which had empty
bodies.

s.

__
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Slow KGS computer Go Tournament

2006-12-15 Thread David Doshay


We are working hard to be ready on short notice, and are shooting
in the dark about how to use the extra time. So far we are not able
to bring up a SlugGo that would take advantage of that much time.

But we will keep trying and, as you said:


You can regard it as a test slow tournament.



Cheers,
David



On 14, Dec 2006, at 7:43 PM, Don Dailey wrote:


Hi Nick,

I'm not going to be ready for such a tournament - but I really want to
be.

I hope you have another one at a later date.

- Don


On Thu, 2006-12-14 at 12:49 +, Nick Wedd wrote:
The 2006 Slow KGS computer Go tournament will be next week,  
starting at

00:00 on Monday 18th UCT (GMT).


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Fast Board implementation

2006-12-15 Thread steve uurtamo

> if (is_player (color_at[v]))
>   chain_at[v].find_root ()->inc_lib_cnt ();
> 
> I removed the if
> 
> chain_at[v].find_root ()->inc_lib_cnt ();
> 
> And added some code elsewhere to make it correct.

so it wasn't the "is_player"?

s.

__
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Fast Board implementation

2006-12-15 Thread Łukasz Lew

On 12/15/06, steve uurtamo <[EMAIL PROTECTED]> wrote:

> if (is_player (color_at[v]))
>   chain_at[v].find_root ()->inc_lib_cnt ();
>
> I removed the if
>
> chain_at[v].find_root ()->inc_lib_cnt ();
>
> And added some code elsewhere to make it correct.

so it wasn't the "is_player"?

Oh, sorry I forgot.

static bool is_player (t color) { return (color & (~1)) == 0; }

This is 1 CC so I don't think so. :)

Lukasz

s.

__
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Fast Board implementation

2006-12-15 Thread steve uurtamo

> Oh, sorry I forgot.
> 
> static bool is_player (t color) { return (color &
> (~1)) == 0; }
> 
> This is 1 CC so I don't think so. :)

:)

i guess that since the entries of color_at
had to be looked up for each v (presumably),
the pred_br_eq (or whatever the x86 version
is called) instruction couldn't do prediction?

that's a pretty dang impressive minimal example.

s.

__
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Fast Board implementation

2006-12-15 Thread Don Dailey

Another report on D programming.   It appears that D programs are about
1.5X slower in general - at least for what I'm doing.

At one point I reported about 7% slower, but that was a mistake on my
part.

The difference in programming effort and just plain clean code is quite
large - D was well designed and after doing a lot of programming in D,
it's a real drag going back to C!

But it appears that you can have your cake and eat it too by just making
a low level library in C,  and doing most of the other stuff in D.D
and C are link compatible and there is no effort in making this work -
it just works.  

I have indeed been able to experiment more quickly and easily doing it
in D, so I have come up with what seems like a nice formula:

  1.  Start by doing everything in D.
  2.  Optimize and experiment.
  3.  When you are happy,  redo the low level stuff in C.   It will
link 
  right in.
  4.  You can still continue to experiment.

Porting to C from D is quite easy because D is pretty much a cleaned up
version of C.   It's easy to make this port when you know exactly what
you want.

I've been able to do a substantial amount of experimentation using D and
the D program is now just slightly faster than the old C program due to
the improved
data structures.I should be able to get close to 25,000 games per
second if
I get the 1.5 X improvement going to C.   This is only about 3K of code
that will
have to be written in C,  where much of the code is identical to C
anyway.

- Don

On Thu, 2006-12-14 at 22:51 -0500, Luke Gustafson wrote:
> Are you using the union-find algorithm with path compression for joining 
> strings?  It's very simple and very fast.
> http://en.wikipedia.org/wiki/Disjoint-set_data_structure#Disjoint-set_forests
> 
> The other main thing to consider is reducing branch mispredictions, which 
> (I'm guessing) could very well use more clock cycles on a P4 than performing 
> the actual calculations themselves.  I think this is why the next_stone list 
> is faster than using e.g. a recursive algorithm, which involves a lot of 
> if's that cannot be predicted.
> 
> Compact data I think is unnecessary, although perhaps eliminating the String 
> struct would speed things up a tad.  In fact, on a P4, shift operations are 
> slower than L1 cache memory retrievals, so fancy packing of data is not 
> worthwhile.  As long as all data fits in the L1 cache it should be max 
> speed.
> 
> I am working on a program myself, and I will be porting it to assembly at 
> some point; right now I am just experimenting with the algorithm design in 
> C.  (Although this happens pretty slowly since I am very busy with work.)  I 
> think that this sort of algorithm is perfect for assembly language, where 
> hand-optimization can be much better than compiler optimization.  I will let 
> you know how it turns out compared to Lukasz :)
> 
> >> At the time of Your post I've had it already implemented and regarded
> >> it like "my sweet secret" :)
> >
> > I don't know how sweet this secret is, but it does help.  I just
> > implemented it in Lazarus and I get about a 20% speedup.  That's not
> > bad, but nothing to write home about.  To a chess programmer this is
> > rather enormous!  I am guessing that the win would be much greater on
> > bigger boards.
> >
> > In principle it would probably be FAR faster, but it adds extra
> > complexity,
> > operations and overhead to executing a move which subtracts some from
> > the
> > savings.
> >
> > The question is: "what is the best way to implement it?"  The problem
> > is that you now must track pseudo liberty counts for every chain on
> > the board.  This is more complicated obviously.
> >
> > Every time a stone is placed or removed, 4 additional entries must be
> > accessed in order to update the pseudo liberty counts.
> >
> > It's still a good trade-off because the testing for whether a move
> > is a capture or suicide is cheap.
> >
> > I'm reviewing my design now and I think I see some ways to improve.
> >
> > Here is the current  data structure (assuming the 9x9 board):
> >
> >  1.  A 10x11 board (actually 10x11 plus 1 for diagonal checking.)
> >  0 = empty
> >  1 = white
> >  2 = black
> >  4 = off-board points.
> >
> >  2.  Another 10x11 board-like array - if there is a occupied point on
> >  the board this array contains the index (hash key) of a "string"
> >  data type.
> >
> >  3.  A list of string data types (a struct in C):
> >1.  stone count
> >2.  pseudo liberties for the whole string
> >3.  a reference stone.
> >
> > To save time, the list of strings is really more like a hash table,
> > not a proper list.  It's a perfect hash table because the keys will
> > never collide.  I use the reference stone as the key for each string
> > which is generally the first stone placed in the string.  So the hash
> > key computation is just a table lookup and is the only thing the
> > second table i

Re: [computer-go] Fast Board implementation

2006-12-15 Thread steve uurtamo

> I should be able to get close to
> 25,000 games per
> second if
> I get the 1.5 X improvement going to C.   This is
> only about 3K of code
> that will
> have to be written in C,  where much of the code is
> identical to C
> anyway.

is this that much easier than just starting and
finishing in C?

s.

__
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Fast Board implementation

2006-12-15 Thread Łukasz Lew

On 12/16/06, Don Dailey <[EMAIL PROTECTED]> wrote:

Another report on D programming.   It appears that D programs are about
1.5X slower in general - at least for what I'm doing.

At one point I reported about 7% slower, but that was a mistake on my
part.

The difference in programming effort and just plain clean code is quite
large - D was well designed and after doing a lot of programming in D,
it's a real drag going back to C!

Is it?
I mean what features of D You've used that make it inconvenient to go back to C?

But it appears that you can have your cake and eat it too by just making
a low level library in C,  and doing most of the other stuff in D.D
and C are link compatible and there is no effort in making this work -
it just works.

I have indeed been able to experiment more quickly and easily doing it
in D, so I have come up with what seems like a nice formula:

  1.  Start by doing everything in D.
  2.  Optimize and experiment.
  3.  When you are happy,  redo the low level stuff in C.   It will
link
  right in.
  4.  You can still continue to experiment.

Porting to C from D is quite easy because D is pretty much a cleaned up
version of C.   It's easy to make this port when you know exactly what
you want.

I've been able to do a substantial amount of experimentation using D and
the D program is now just slightly faster than the old C program due to
the improved
data structures.I should be able to get close to 25,000 games per
second if
I get the 1.5 X improvement going to C.   This is only about 3K of code
that will
have to be written in C,  where much of the code is identical to C
anyway.

- Don

On Thu, 2006-12-14 at 22:51 -0500, Luke Gustafson wrote:
> Are you using the union-find algorithm with path compression for joining
> strings?  It's very simple and very fast.
> http://en.wikipedia.org/wiki/Disjoint-set_data_structure#Disjoint-set_forests
>
> The other main thing to consider is reducing branch mispredictions, which
> (I'm guessing) could very well use more clock cycles on a P4 than performing
> the actual calculations themselves.  I think this is why the next_stone list
> is faster than using e.g. a recursive algorithm, which involves a lot of
> if's that cannot be predicted.
>
> Compact data I think is unnecessary, although perhaps eliminating the String
> struct would speed things up a tad.  In fact, on a P4, shift operations are
> slower than L1 cache memory retrievals, so fancy packing of data is not
> worthwhile.  As long as all data fits in the L1 cache it should be max
> speed.
>
> I am working on a program myself, and I will be porting it to assembly at
> some point; right now I am just experimenting with the algorithm design in
> C.  (Although this happens pretty slowly since I am very busy with work.)  I
> think that this sort of algorithm is perfect for assembly language, where
> hand-optimization can be much better than compiler optimization.  I will let
> you know how it turns out compared to Lukasz :)
>
> >> At the time of Your post I've had it already implemented and regarded
> >> it like "my sweet secret" :)
> >
> > I don't know how sweet this secret is, but it does help.  I just
> > implemented it in Lazarus and I get about a 20% speedup.  That's not
> > bad, but nothing to write home about.  To a chess programmer this is
> > rather enormous!  I am guessing that the win would be much greater on
> > bigger boards.
> >
> > In principle it would probably be FAR faster, but it adds extra
> > complexity,
> > operations and overhead to executing a move which subtracts some from
> > the
> > savings.
> >
> > The question is: "what is the best way to implement it?"  The problem
> > is that you now must track pseudo liberty counts for every chain on
> > the board.  This is more complicated obviously.
> >
> > Every time a stone is placed or removed, 4 additional entries must be
> > accessed in order to update the pseudo liberty counts.
> >
> > It's still a good trade-off because the testing for whether a move
> > is a capture or suicide is cheap.
> >
> > I'm reviewing my design now and I think I see some ways to improve.
> >
> > Here is the current  data structure (assuming the 9x9 board):
> >
> >  1.  A 10x11 board (actually 10x11 plus 1 for diagonal checking.)
> >  0 = empty
> >  1 = white
> >  2 = black
> >  4 = off-board points.
> >
> >  2.  Another 10x11 board-like array - if there is a occupied point on
> >  the board this array contains the index (hash key) of a "string"
> >  data type.
> >
> >  3.  A list of string data types (a struct in C):
> >1.  stone count
> >2.  pseudo liberties for the whole string
> >3.  a reference stone.
> >
> > To save time, the list of strings is really more like a hash table,
> > not a proper list.  It's a perfect hash table because the keys will
> > never collide.  I use the reference stone as the key for each string
> > which is generally the first stone pl

Re: [computer-go] Fast Board implementation

2006-12-15 Thread Don Dailey

On Sat, 2006-12-16 at 01:47 +0100, Łukasz Lew wrote:
> On 12/16/06, Don Dailey <[EMAIL PROTECTED]> wrote:
> > Another report on D programming.   It appears that D programs are about
> > 1.5X slower in general - at least for what I'm doing.
> >
> > At one point I reported about 7% slower, but that was a mistake on my
> > part.
> >
> > The difference in programming effort and just plain clean code is quite
> > large - D was well designed and after doing a lot of programming in D,
> > it's a real drag going back to C!
> 
> Is it?
> I mean what features of D You've used that make it inconvenient to go back to 
> C?

A good part of it is just the extra syntax in C.   D isn't expressive
like Ruby or Python, but it is expressive enough that you are willing to
try something that you might not want to in C without getting a cup of
coffee first.   

Even just the foreach loop, which is a petty thing I admit, but it makes
life easier and the code looks cleaner.   

You can get rid of most of the pointer ugliness.  I really like that.
My C 
program is full of functions where you pass pointers to something:

  MakeMove( position *p, mv_t mv )

   p->bd[mv] = 1 + (p->ctm & 1);

  etc...

In D you declare arguments as "in", "out" or "inout" and forget about
the
pointer syntax such as  &x, *x  blah blah blah. When you have a lot
of
complex expressions this gets rather convoluted in C.

arrays are real nice - you can slice and dice them.  You can sort them
without
all the setup code of c.   You can sort arrays of structs by putting a
comparison
function right inside the structure.

Most of these things can be done in C with extra setup code.   You don't
need foreach loops, you can add a couple of lines of code and get the
same functionality in C.But the author of D payed attention to all
the little annoyances of C and fixed many of them.   And he fixed a lot
of stuff that 
generates bugs.   You can write code a little faster in D, but you can
write finished bug-free code a LOT faster.

It's a whole like of little things like that.  Array concatenation,
associative arrays,  dynamic arrays, etc.   Associative arrays  is a
killer feature even though every language has libraries - it's not the
same as having it built into
the language.   

I probably wouldn't use associate arrays in parts of the program that I
expect to port to C,  but it's hard to live without associative arrays
and I've always wanted them in C.

Here is a good page to give a quick comparison with C:

  http://www.digitalmars.com/d/comparison.html

- Don

> >
> > But it appears that you can have your cake and eat it too by just making
> > a low level library in C,  and doing most of the other stuff in D.D
> > and C are link compatible and there is no effort in making this work -
> > it just works.
> >
> > I have indeed been able to experiment more quickly and easily doing it
> > in D, so I have come up with what seems like a nice formula:
> >
> >   1.  Start by doing everything in D.
> >   2.  Optimize and experiment.
> >   3.  When you are happy,  redo the low level stuff in C.   It will
> > link
> >   right in.
> >   4.  You can still continue to experiment.
> >
> > Porting to C from D is quite easy because D is pretty much a cleaned up
> > version of C.   It's easy to make this port when you know exactly what
> > you want.
> >
> > I've been able to do a substantial amount of experimentation using D and
> > the D program is now just slightly faster than the old C program due to
> > the improved
> > data structures.I should be able to get close to 25,000 games per
> > second if
> > I get the 1.5 X improvement going to C.   This is only about 3K of code
> > that will
> > have to be written in C,  where much of the code is identical to C
> > anyway.
> >
> > - Don
> >
> >
> > On Thu, 2006-12-14 at 22:51 -0500, Luke Gustafson wrote:
> > > Are you using the union-find algorithm with path compression for joining
> > > strings?  It's very simple and very fast.
> > > http://en.wikipedia.org/wiki/Disjoint-set_data_structure#Disjoint-set_forests
> > >
> > > The other main thing to consider is reducing branch mispredictions, which
> > > (I'm guessing) could very well use more clock cycles on a P4 than 
> > > performing
> > > the actual calculations themselves.  I think this is why the next_stone 
> > > list
> > > is faster than using e.g. a recursive algorithm, which involves a lot of
> > > if's that cannot be predicted.
> > >
> > > Compact data I think is unnecessary, although perhaps eliminating the 
> > > String
> > > struct would speed things up a tad.  In fact, on a P4, shift operations 
> > > are
> > > slower than L1 cache memory retrievals, so fancy packing of data is not
> > > worthwhile.  As long as all data fits in the L1 cache it should be max
> > > speed.
> > >
> > > I am working on a program myself, and I will be porting it to assembly at
> > > some point; right now I am just experimenting with the algorithm design in
> > > C.

Re: [computer-go] Fast Board implementation

2006-12-15 Thread Don Dailey

On Fri, 2006-12-15 at 16:28 -0800, steve uurtamo wrote:
> > I should be able to get close to
> > 25,000 games per
> > second if
> > I get the 1.5 X improvement going to C.   This is
> > only about 3K of code
> > that will
> > have to be written in C,  where much of the code is
> > identical to C
> > anyway.
> 
> is this that much easier than just starting and
> finishing in C?

This is a tried and true paradigm,  you program in one language and 
use wrapped up libraries written in C.The only difference is that
I'm writing in a convenient C dialect and I don't have to wrap the
library.This is a very natural thing.

Porting a program from D to C is a small one-time project.  99 percent 
of my time is spent experimenting with the code, rewriting sections of
it, trying new things.

Anything that makes that 99% better and faster is worth it.   I don't
mind
spending 2 or 3 hours porting to C after spending weeks experimenting
with a new algorithm.  

You could do all of this with some other high level language too, such
as Java, but with a great deal more effort.  

I am making a judgment that far less effort will be spent for a given
quality of product doing most of the dev work in a better language.   

- Don

> s.
> 
> __
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around 
> http://mail.yahoo.com 

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Fast Board implementation

Re: [computer-go] Fast Board implementation

[computer-go] Re:Re:Are there researches about human annotation to game records?

Re: [computer-go] Fast Board implementation

Re: [computer-go] Fast Board implementation

Re: [computer-go] Fast Board implementation

Re: [computer-go] Fast Board implementation

Re: [computer-go] Fast Board implementation

Re: [computer-go] Slow KGS computer Go Tournament

Re: [computer-go] Fast Board implementation

Re: [computer-go] Fast Board implementation

Re: [computer-go] Fast Board implementation

Re: [computer-go] Fast Board implementation

Re: [computer-go] Fast Board implementation

Re: [computer-go] Fast Board implementation

Re: [computer-go] Fast Board implementation

Re: [computer-go] Fast Board implementation

17 matches

Site Navigation

Mail list logo

Footer information