Good morning,
GPUs are SIMD processors. Their hundreds of cores are great for highly
parallel calculations.
In GLSL/HLSL I can write a program which is evaluated for a very small
set of pixels (usually a 2x2 block or a single pixel). So at a resolution
of 10x10 the program basically runs 5x5=25 or 10x10=100 times in
parallel. Boost the resolution to more realistic values like 1080p and
you see how the many cores benefit the whole calculation.
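As a tiny illustration, here is a minimal GLSL fragment shader sketch:
the GPU runs main() once for every covered pixel, so at 1920x1080 the
same few lines execute roughly two million times per frame. The uniform
name and the gradient it draws are made up for this example.

#version 330 core

// One invocation of main() per pixel (fragment).
// uResolution is a hypothetical uniform the host program would set.
uniform vec2 uResolution;

out vec4 fragColor;

void main() {
    // gl_FragCoord tells this invocation which pixel it is shading.
    vec2 uv = gl_FragCoord.xy / uResolution;
    // Every pixel computes its color independently of all the others.
    fragColor = vec4(uv, 0.0, 1.0);
}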
This high parallelization can only really happen because most stuff is
independent. For example, when raytracing each ray is (almost)
independent of other rays.
Still, the 2x2 block is often calculated with dependencies, because for
texturing (and mip mapping) a fragment shader needs the "distance"
between neighboring pixels. This is why (for texturing) you may end up
with slower programs and some waiting time between threads: sometimes you
need the value from the neighboring thread and have to wait until it's
calculated.
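GLSL exposes this 2x2-quad mechanism directly through its derivative
functions; a small sketch (the sampler and varying names are invented for
illustration):

#version 330 core

uniform sampler2D uTexture;   // hypothetical texture bound by the host
in vec2 vUV;                  // interpolated texture coordinates
out vec4 fragColor;

void main() {
    // dFdx/dFdy return the difference of vUV between neighboring pixels
    // in the same 2x2 quad -- this is the "distance" used for mip mapping,
    // and the reason the quad's invocations cannot run fully independently.
    vec2 ddx = dFdx(vUV);
    vec2 ddy = dFdy(vUV);

    // textureGrad() picks the mip level explicitly from those derivatives;
    // a plain texture(uTexture, vUV) call does the same thing implicitly.
    fragColor = textureGrad(uTexture, vUV, ddx, ddy);
}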
Well, these are very language-specific details that matter for graphics,
but they apply similarly to other use cases. I can imagine that for
neural networks you can just write the code for one node and execute it
500 times for 500 nodes in parallel. Imagine having this beast on a CPU
with just 4 cores...
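A rough sketch of that idea as a GLSL compute shader: each invocation
evaluates one node of a layer, so one dispatch covers all 500 nodes in
parallel. The buffer layout, uniform names and ReLU activation are
assumptions made up for this example, not taken from any real framework.

#version 430

// One invocation per output node of the layer.
layout(local_size_x = 64) in;

// Assumed layout: weights is nodeCount x inputCount, row-major.
layout(std430, binding = 0) readonly buffer Inputs   { float inputs[];  };
layout(std430, binding = 1) readonly buffer Weights  { float weights[]; };
layout(std430, binding = 2) writeonly buffer Outputs { float outputs[]; };

uniform uint uInputCount;   // hypothetical uniforms set by the host program
uniform uint uNodeCount;

void main() {
    uint node = gl_GlobalInvocationID.x;
    if (node >= uNodeCount) return;

    // Weighted sum over all inputs for this one node.
    float sum = 0.0;
    for (uint i = 0u; i < uInputCount; i++) {
        sum += weights[node * uInputCount + i] * inputs[i];
    }
    // Simple ReLU activation; every node writes its output independently.
    outputs[node] = max(sum, 0.0);
}

For 500 nodes with a local size of 64 the host would dispatch 8 work
groups.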
I hope this helps you understand how GPU cores ("shaders") work.
Vulkan would indeed be interesting. Since we are only interested in the
compute part, it might even make our programs really small; the "hello
world" part of drawing triangles would be the "client" side (writing a
rasterizer, raymarcher, tracer, whatever). It could still be a lot of
lines of code, but maybe we still benefit from the 10% speedup.
I still have to understand how all this "shader compilation" stuff works.
In WebGL it's like, "here's my code, make a shader from it, then I tell
you it's a fragment shader", and the compilation happens automatically.
In UE shader compilation takes a long time, and I believe in Blender
shaders are also stored as precompiled binaries.
sirjofri
23.08.2021 06:13:53 Bakul Shah <ba...@iitbombay.org>:
Don't high end GPUs have thousands of "cores"? Even high end CPUs don't
have more than a few dozen cores, up to 128 or so. While each kind's
cores are very different, it seems to me the GPU/CPU paths have diverged
for good. Or we need some massive shift in programming languages +
compilers; I lack the imagination to see how. Still, the thought of CPUs
gaining the complexity of the graphics engine scares me!
-- Bakul
On Aug 22, 2021, at 12:09 PM, Paul Lalonde <paul.a.lalo...@gmail.com>
wrote:
I'm pretty sure we're still re-inventing, though it's the CPU's turn to
gain some of the complexity of the graphics engine.
Paul
On Sun, Aug 22, 2021, 12:05 PM Bakul Shah <ba...@iitbombay.org> wrote:
Thanks. Looks like Sutherland's "Wheel of Reincarnation"
(https://www2.cs.arizona.edu/~cscheid/reading/myer-sutherland-design-of-display-processors.pdf)
has not only stopped but exploded :-) Or stopped being applicable.
-- Bakul
On Aug 22, 2021, at 9:23 AM, Paul Lalonde <paul.a.lalo...@gmail.com>
wrote:
It got complicated because there's no stable interface or ISA. The
hardware evolved from fixed-function to programmable in a commercial
environment where the only meaningful measure was raw performance per
dollar at many price points. Every year the hardware spins and
becomes more performant, usually faster than Moore's law. With 3D
APIs hiding the hardware details there is no pressure to make the
hardware interface uniform, pretty, or neat. And with the need for
performance there are dozens of fixed function units that effectively
need their own sub-drivers while coordinating at high performance with
the other units.
The system diagrams for GPUs look complex, but they are radical
simplifications of what's really on the inside.
Intel really pioneered the open driver stacks, but performance
generally wasn't there. That might be changing now, but I don't know
if their recently announced discrete product line will be
driver-compatible.
Paul
On Sun, Aug 22, 2021 at 8:48 AM Bakul Shah <ba...@iitbombay.org>
wrote:
The FreeBSD amdgpu.ko is over 3 Mbytes of compiled code, not counting
the "firmware" that gets loaded on the GPU board. drm/amd/amdgpu has
200K+ lines of source code; drm/amd is over 2M lines. Intel's i915 seems
to be about 1/10th the AMD size. AIUI, this is Linux GPU driver code,
more or less unchanged (FreeBSD has shim code to use it). How did the
interface to a SIMD processor get so complicated?
…
-- Bakul