All,
let me chip in with some additional thoughts about massively parallel
hardware.
I recently implemented Monte Carlo playouts in CUDA, to run them on the GPU. It was a more or less "naive" implementation (read: a fairly straight port with optimised memory access patterns). I hope to have a more detailed post about this in the future, but in the meantime, let me assure you that on such a platform (i.e. SIMD/SIMT) it is not just the size of the tree that is the obstacle, but the fact that there is a tree at all - linked data structures and branching just don't work well. Other data structures may be required.
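To make the data-structure point concrete, here is a rough sketch in plain C++ (every name and size below is invented for illustration, not taken from my implementation): a pointer-linked tree node scatters reads across memory and invites divergent branching, while a structure-of-arrays playout batch lets thread i touch element i of each array, which is the coalesced, lockstep access pattern SIMT hardware wants.

#include <cstdint>

// What SIMT hardware dislikes: each step chases a pointer to an
// unpredictable address, and different threads branch differently.
struct LinkedNode {
    LinkedNode* children[362];  // one pointer per 19x19 move, plus pass
    int         visits;
    float       value;
};

// A flatter alternative for running many playouts in lockstep:
// one array per field, indexed by playout number.
const int kPlayouts = 256;      // playouts per batch; invented figure

struct PlayoutBatch {
    // board[p * kPlayouts + i] holds point p of playout i, so adjacent
    // threads read adjacent addresses (coalesced access).
    uint8_t  board[361 * kPlayouts];
    int32_t  toMove[kPlayouts];
    uint32_t rngState[kPlayouts];  // one RNG per playout, nothing shared
};

int main() {
    static PlayoutBatch batch = {};  // static: too large for the stack
    (void)batch;
    return 0;
}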
Of course, on the PC, it looks like people will continue to go down the track of "coarse-grained" parallelism, with each thread running a full simulation instance. We will see whether CPU manufacturers oblige and keep supporting that direction: scaling control units, caches, etc. across all cores becomes more and more expensive the more cores you have. If hardware goes the GPU route instead, your program architectures will have to change to get more performance.
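For anyone who has not seen it spelled out, that coarse-grained style can be sketched in a few lines of C++ (the Searcher type and the numbers are mine, purely illustrative): each thread owns a complete, independent search state, and results are merged only at the end.

#include <random>
#include <thread>
#include <vector>

// Minimal stand-in for one independent search instance: it owns its
// own statistics and its own RNG, so the threads share nothing while
// they run.
struct Searcher {
    std::vector<int> rootVisits;  // visits per candidate move
    std::mt19937     rng;
    explicit Searcher(unsigned seed) : rootVisits(361, 0), rng(seed) {}
    void run(int sims) {
        std::uniform_int_distribution<int> pick(0, 360);
        for (int i = 0; i < sims; ++i)
            ++rootVisits[pick(rng)];  // placeholder for a full simulation
    }
};

int main() {
    const int kThreads = 4, kSimsEach = 100000;  // invented figures
    std::vector<Searcher> searchers;
    for (int i = 0; i < kThreads; ++i)
        searchers.emplace_back(1234u + i);
    std::vector<std::thread> pool;
    for (auto& s : searchers)
        pool.emplace_back([&s] { s.run(kSimsEach); });
    for (auto& t : pool)
        t.join();
    // Merge only at the end: sum the per-move visit counts and play
    // the move with the highest combined total.
    std::vector<long> merged(361, 0);
    for (const auto& s : searchers)
        for (int m = 0; m < 361; ++m)
            merged[m] += s.rootVisits[m];
    return 0;
}

The appeal is that the threads never touch shared memory while searching; the cost is that each one needs its own copy of everything.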
Christian
Don Dailey wrote:
This is a great post, and some good observations. I agree with your
conclusions that CPU power is increasing faster than memory and memory
bandwidth. Let me give you my take on this.
In a nutshell, I believe memory will increasingly become the limiting factor no matter what direction we go. The only thing I would add to what you said is that we should recognize that there are no workarounds - heavy playouts won't solve the problem, but may be the pathway we have to go down to minimize it.
This is because it's the tree that has been the source of our recent
breakthroughs. When the tree stops scaling due to memory
limitations, so will our programs. When we are forced to focus on
heavier playouts because we cannot build bigger trees, then we are
back to the problem of trying to solve the game statically. At this
point we have regressed to our old non-scalable methods that have not
served us very well over the decades.
That is not to say that we won't continue to make progress, taking advantage of ever-increasing hardware. Memory will continue to get larger, just not at the same rate as processing power. We will
be forced to find ways to trade off CPU power for memory, and so
on. We will no doubt be pretty clever about this. We may be
forced to consider other kinds of tree search that do not require
maintaining the entire tree in memory. However, there is no tree
search that does not benefit from memory and memory bandwidth, so we
will always have the memory bottleneck to deal with, however this ends
up going.
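To sketch one such direction (the names and the reclamation policy here are invented, not anything a current program is known to do): give the tree a fixed node budget, address nodes by index rather than by pointer, and reclaim cold subtrees when the pool fills instead of letting the tree grow without bound.

#include <cstddef>
#include <vector>

// Nodes addressed by index into a preallocated pool, so the tree can
// never outgrow the memory budget it was given up front.
struct Node {
    int   firstChild;    // index of the first child slot, -1 if none
    int   numChildren;
    int   visits;
    float wins;
};

class NodePool {
public:
    explicit NodePool(std::size_t budget) : used_(0), pool_(budget) {}
    // Allocate n consecutive child slots. A return of -1 means the
    // budget is spent; the caller must prune cold subtrees (or throw
    // the tree away and rebuild) instead of growing further.
    int allocate(int n) {
        if (used_ + n > pool_.size()) return -1;
        int first = static_cast<int>(used_);
        used_ += static_cast<std::size_t>(n);
        for (int i = 0; i < n; ++i)
            pool_[first + i] = Node{-1, 0, 0, 0.0f};
        return first;
    }
    Node& at(int i) { return pool_[static_cast<std::size_t>(i)]; }
private:
    std::size_t       used_;
    std::vector<Node> pool_;
};

int main() {
    NodePool pool(1u << 20);  // about 16 MB of nodes: a hard ceiling
    int root = pool.allocate(1);
    pool.at(root).firstChild = pool.allocate(362);
    return 0;
}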
- Don
2009/5/12 Brian Sheppard <sheppar...@aol.com>
Summary: The trend in computer systems has been for CPU power to grow much faster than memory size. The implication of this trend for MCTS computer go implementations is that "heavy" playouts will have a significant cost advantage in the future.
I bought a Pentium D 3GHz system a few years back. It was equipped with 2 GB of memory. This was a good CPU at the time (second fastest available) with the maximum memory configuration. A comparable system nowadays would be an i7-940 with 6 GB of memory.
Now, a comparison of those systems shows the trend. The Pentium D has a Passmark rating (an industry-standard measure of computer power) of 770, and the i7-940 has a rating of 6095. The CPU is 7.9 times as fast, but the memory configuration is only 3 times as large.
This is only one memory configuration. You can buy larger memory configurations at higher cost, but that's the point: the CPU is accelerating faster than memory is growing.
The trend seems likely to intensify. Intel has already announced 6-core and 8-core systems for release later this year. It is not known what memory configuration will ship in such a system, but we cannot expect it to be double what ships in a quad-core.
In 2010 we have the Larrabee architecture, which puts a much larger number of cores on the chip. Intel isn't saying how many, but you can count on 16 cores at a minimum, with rapid doublings thereafter. Memory density seems unlikely to keep pace.
I will be watching this trend very closely, as one who has to pay for his own computer equipment.
The implication for MCTS and computer go: it pays to trade off CPU power to keep memory usage low.
The most direct way to keep memory low is to represent the tree using as few bytes as possible. The literature refers to techniques for "promoting" nodes after some number of trials, and I recall that perhaps Fuego gives an implementation. But I doubt that we know the limits. Here is a thought experiment: if you run the bzip compressor on your tree, would you get 10% compression? My guess is more like 95% compression.
The nodes in my own program are horribly fat. But I blame Mogo; they haven't published how they represent the tree! Joking aside, there is a lot of potential here.
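As a rough illustration of how much slack there can be (both layouts below are invented, not Mogo's or anyone else's): a node that keeps a pointer per candidate move weighs in at kilobytes, while an index-based node with 16-bit counters fits in a dozen bytes.

#include <cstdint>
#include <cstdio>

// Roughly the kind of "horribly fat" node a straightforward
// implementation ends up with:
struct FatNode {
    FatNode* parent;
    FatNode* children[362];  // a pointer per move on 19x19, plus pass
    double   wins;
    int      visits;
    int      move;
};

// A packed alternative: children live in one contiguous run of a
// global node array, so a 32-bit index replaces 362 pointers, and
// the counters shrink to 16 bits.
struct PackedNode {
    uint32_t firstChild;   // index into the node array
    uint16_t move;
    uint16_t numChildren;
    uint16_t visits;
    uint16_t wins;
};

int main() {
    std::printf("fat node: %zu bytes, packed node: %zu bytes\n",
                sizeof(FatNode), sizeof(PackedNode));
    return 0;
}

The 16-bit counters will overflow on heavily visited nodes, so a real implementation would have to saturate or periodically halve them - which is exactly the kind of CPU-for-memory trade under discussion.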
Another approach is to play well using a smaller tree size, which means using more CPU power on each trial. This is the "heavy playouts" conclusion. I speculate that adding knowledge to the tree search and trials will pay off, and can be beneficial even if the program is not stronger, because smaller memory usage means that the program has a higher limit at longer time controls.
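One standard lever in that direction (a sketch only; the threshold value and the commented-out helper names are made up) is delayed expansion: a leaf must be visited several times before it is given children, so memory is spent only where the search keeps returning.

// Sketch of delayed expansion: children are only allocated once a
// leaf has proven interesting. Raising kExpandThreshold shrinks the
// tree (less memory) at the cost of more playout work per node.
const int kExpandThreshold = 8;  // invented value; engines tune this

struct Leaf {
    int  visits;
    bool expanded;
};

// Hypothetical hook called each time a simulation reaches this leaf.
void OnLeafVisited(Leaf& leaf) {
    ++leaf.visits;
    if (!leaf.expanded && leaf.visits >= kExpandThreshold) {
        // AllocateChildren(leaf);  // only now does the node cost memory
        leaf.expanded = true;
    }
    // RunHeavyPlayout(...);        // the playout itself runs either way
}

int main() {
    Leaf leaf = {0, false};
    for (int i = 0; i < 20; ++i)
        OnLeafVisited(leaf);
    return 0;
}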
Note that benchmarking programs in "lightning" games on CGOS can bias our decisions in the wrong direction. If we use memory to make our programs faster, then it looks like a big win. But at tournament time controls that choice merely saturates RAM more quickly, at a smaller tree size.
So that's all I have to say. Just something to think about, and if you think of a good way to represent the tree, I hope you will return the favor.
Best,
Brian
--
Christian Nentwich
Director, Model Two Zero Ltd.
+44-(0)7747-061302
http://www.modeltwozero.com