Mark Boon wrote:
> Thanks for the pointer Don, that might be worth looking into at some
> point.
>
> When you say 'amazing accuracy despite this speed and simplicity' I
> still have to ask: "how fast?". I think 10usec is actually pretty
> fast. But when talking about using it during MC playouts for example,
> it's still rather slow. This is due to the pile-up of numbers. A
> (relatively) big area to look for, and a large number of occasions to
> do the lookup.
>
> The patterns don't necessarily need to be hand-crafted. But throwing a
> million generated patterns at it also has its price in terms of
> performance.
Pattern lookup has nothing to do with the number of patterns it was
trained on of course.   Think of this as a kind of neural net.    The
data produced is compact, and  is a function of the number of
attributes,  and the number of classes (this is a classifier) and not
the amount of patterns used for training.   

Therefore 10 million training patterns would take the same storage as 1
million or even just one.    I generate 16 versions of each pattern for
each possible color and orientation and the training time, I'm
guessing,  is dominated by the time it takes to read the training file
(a text file) into memory, parse the fields, and rotate the patterns.   
Of course I don't really care about the training time itself since this
is a one time operation but it's fast nevertheless,  not like training a
neural net or running a genetic algorithm.   All it is, is parsing data
and collecting it into a data structure to be able to calculate
probabilities from.  

If you are just interested in patterns generated the old fashion way,
hand crafted without learning or generalization (other than wild
cards),   then this has nothing to offer for you.    The techniques for
doing this have been around for decades and you are an expert in them
and there is nothing new in this.     David Fotland, and many of the top
GO people have very fast pattern matchers that work well and are
super-optimized to do what they do very quickly.   

I definitely agree with you, when it comes to pure state of the art
pattern matching (the old fashioned way) that variable pattern shapes
with wild-card matching is very important.   This is all fine for a
programmer who is also an expert GO player,   but for someone like
myself there is no other choice (other than taking a few years to become
a GO expert)  but to explore this in a machine learning context.  

Naive Bayes is called "naive" for a reason.   It makes assumptions about
the independence of observations that are not true,  however it has been
shown both empirically and formally that it can very often perform
surprisingly well despite these limitations even in domains where
attributes are highly correlated.    In GO,  patterns attributes of
course are highly correlated probably to an absurd degree,  but there
are many simple enhancements that correct the shortcomings while
maintaining most of the simplicity.  

But it is a whole different thing from what you are doing. 

- Don


>
> Mark
>
>
> On 26-mrt-08, at 16:17, Don Dailey wrote:
>
>> Mark,
>>
>> I am doing some experimentation with something similar to patterns,  but
>> based  on Naive Bayes classifiers.   The idea is to use  Naive Bayes
>> classifiers to generalize patterns.     The classifiers would still be
>> focused on some constrained area of the board, such as the 5x5 matrix or
>> something,  but they would be extremely compact compared to patterns and
>> very good at generalizing.  I'm convinced they would have to be enhanced
>> with additional attributes concerning the points being considered,  but
>> one of their strengths is that you can pile on huge numbers of
>> attributes without killing the speed or memory consumption very
>> significantly.
>>
>> Recently there has been a huge surge of interest in Naive Bayes
>> Classifiers due to their simplicity and speed, and yet amazing accuracy
>> despite this speed and simplicity.    Nothing has been found that
>> improves all that much on Naive Bayes for many applications,  and some
>> simple improvements to Naive Bayes has put it in the same league as
>> other far more complex and computationally expensive methods such as
>> neural networks and decision trees.
>>
>> I have no idea whether I'm barking up the wrong tree - but I think it's
>> definitely worth taking a look at.     I don't think GO programs can be
>> improved much more by simply adding more hand crafted patterns - we need
>> to find ways to generalize knowledge in powerful ways.
>>
>> Naive Bayes is trained by example and it's trivial, basically a single
>> pass statistics gathering.   So you must basically show it a gazillion
>> sample patterns with "known" classifications.    You could build these
>> from games of strong players for instance.
>>
>> - Don
>>
>>
>>
>>
>> Mark Boon wrote:
>>> Lately I have been putting some effort into pattern-matching. Although
>>> I have made progress, the result was not as good as what I had hoped
>>> for by about an order of magnitude. This makes me wonder what is
>>> currently actually the state of the art of pattern matching in Go.
>>>
>>> Of course it's a bit of a vague question, as there are many possible
>>> forms of pattern-matching. Fixed-size patterns are the easiest since a
>>> hash-code can be used. Nothing is going to beat that in terms of
>>> speed, but its use is limited to some special occasions. Joseki is one
>>> and apparently 3x3 patterns are used in Monte-Carlo programs.
>>>
>>> But I think the most generally useful form is one that can do
>>> pattern-matching for variable shapes. (Or which can have don't-care
>>> points if you like.) I thought I had a solution that would be pretty
>>> close to the theoretical best performance. Formally proving that would
>>> probably be a dissertation in itself, most important for me is in
>>> itself it works and with modest memory requirements. That is the good
>>> part. The bad part is, if I'm right this is the best it can get I'm a
>>> bit disappointed it isn't faster. I'd rather be proven wrong here.
>>> It's written in Java so making it in C would possibly give a two-fold
>>> speedup, but that's all I can think of.
>>>
>>> What I have now takes 10-15 microseconds on a 2Ghz Intel computer to
>>> find all the patterns on a board (on average for 9x9, for 19x19 it's
>>> more like 15-20 usec) and it also gives me the 'new' patterns i.e.
>>> patterns that match now but didn't match the previous move (I believe
>>> that's a useful feature, but it's a detail). The set of patterns is
>>> under a thousand patterns. Somehow smaller sets don't go much faster,
>>> but larger sets do slow down, every ten-fold increase in number of
>>> patterns seems to double the time.
>>>
>>> So I was wondering if others cared to share the performance of their
>>> pattern-matcher. I just want to find out if I'm chasing a unicorn or
>>> not by trying to make something faster.
>>>
>>> Mark
>>>
>>>
>>> _______________________________________________
>>> computer-go mailing list
>>> computer-go@computer-go.org
>>> http://www.computer-go.org/mailman/listinfo/computer-go/
>>>
>> _______________________________________________
>> computer-go mailing list
>> computer-go@computer-go.org
>> http://www.computer-go.org/mailman/listinfo/computer-go/
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to