Good morning,

GPUs are SIMD processors. The hundreds of cores are great for highly parallel calculation.

In GLSL/HLSL I can write a program that is executed for a very small set of pixels (usually a 2x2 quad or a single pixel). So if you have a resolution of 10x10, the program is basically run 5x5=25 or 10x10=100 times in parallel. Boost the resolution to more realistic values like 1080p and you see how the many cores benefit the whole calculation.
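To make that concrete, here is a minimal GLSL fragment shader sketch. The main() below is launched once per pixel, with thousands of invocations in flight at the same time (the 1920x1080 resolution is hard-coded just for illustration; normally it would be passed in as a uniform):

```glsl
#version 330 core
// Minimal per-pixel program: this main() runs once for every pixel.
out vec4 fragColor;

void main() {
    // gl_FragCoord tells each invocation which pixel it is working on.
    // A simple gradient: the color depends only on this pixel's own
    // position, so no invocation has to wait for any other.
    fragColor = vec4(gl_FragCoord.x / 1920.0,
                     gl_FragCoord.y / 1080.0,
                     0.5, 1.0);
}
```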

This high degree of parallelization is only really possible because most of the work is independent. For example, when raytracing, each ray is (almost) independent of the other rays.

Still, the pixels within a 2x2 quad are often calculated dependently, because for texturing (and mip mapping) a fragment shader needs the "distance" of a value between two neighboring pixels. This is why (for texturing) you may end up with slower programs and some waiting time between threads: sometimes you need the value of the neighbor thread and have to wait until it's calculated.
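A sketch of where that neighbor dependency surfaces in GLSL: dFdx/dFdy return the difference of a value between a pixel and its neighbor in the 2x2 quad, and exactly those derivatives of the texture coordinate drive the mip level selection (the uniform/varying names here are made up for illustration):

```glsl
#version 330 core
// dFdx/dFdy compare a value against the neighboring pixel in the
// same 2x2 quad -- this is the cross-thread dependency.
uniform sampler2D tex;
in vec2 uv;
out vec4 fragColor;

void main() {
    vec2 ddx = dFdx(uv);   // change of uv toward the quad neighbor to the right
    vec2 ddy = dFdy(uv);   // change of uv toward the quad neighbor below
    // Equivalent to a plain texture(tex, uv), but with the derivatives
    // (and therefore the mip level selection) spelled out explicitly:
    fragColor = textureGrad(tex, uv, ddx, ddy);
}
```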

Well, these are very language-specific details that matter for graphics, but they apply similarly to other use cases. I can imagine that for neural networks you can just write the code for one node and execute it 500 times for 500 nodes in parallel. Imagine having this beast on a CPU with just 4 cores...
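A hypothetical GLSL compute-shader sketch of that idea, with one invocation computing one node (the buffer names, bindings, and the 500-input layer size are all made up for illustration):

```glsl
#version 430
// One invocation per "node" of a layer; dispatch ceil(nodes/64) groups.
layout(local_size_x = 64) in;

layout(std430, binding = 0) readonly  buffer Inputs  { float inputs[];  };
layout(std430, binding = 1) readonly  buffer Weights { float weights[]; };
layout(std430, binding = 2) writeonly buffer Outputs { float outputs[]; };

const uint N_INPUTS = 500u;  // illustrative layer width

void main() {
    uint node = gl_GlobalInvocationID.x;  // which node this invocation computes
    float sum = 0.0;
    for (uint i = 0u; i < N_INPUTS; i++)
        sum += weights[node * N_INPUTS + i] * inputs[i];
    outputs[node] = max(sum, 0.0);  // ReLU; all nodes computed in parallel
}
```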

I hope this helps you understand how GPU cores ("shaders") work.

Vulkan would indeed be interesting. Since we are only interested in the compute part, it might even make our programs really small; the "hello world" part of drawing triangles would be on the "client" side (writing a rasterizer, raymarcher, tracer, whatever). It could still be a lot of lines of code, but maybe we still benefit from the 10% speedup.

I still have to understand how all this "shader compilation" stuff works. In WebGL it's like: "here's my code, make a shader from it, and I'm telling you it's a fragment shader". Shader compilation happens automatically. In UE shader compilation takes a long time, and I believe in Blender, too, shaders are stored as precompiled binaries.

sirjofri

23.08.2021 06:13:53 Bakul Shah <ba...@iitbombay.org>:

Don't high-end GPUs have thousands of "cores"? Even high-end CPUs have no more than a few dozen cores, up to 128 or so. While the two kinds of cores are very different, it seems to me the GPU/CPU paths have diverged for good. Or we need some massive shift in programming languages + compilers; I lack the imagination to see how. Still, the thought of CPUs gaining the complexity of the graphics engine scares me!

-- Bakul

On Aug 22, 2021, at 12:09 PM, Paul Lalonde <paul.a.lalo...@gmail.com> wrote:

I'm pretty sure we're still re-inventing, though it's the CPU's turn to gain some of the complexity of the graphics engine.

Paul

On Sun, Aug 22, 2021, 12:05 PM Bakul Shah <ba...@iitbombay.org> wrote:
Thanks. Looks like Sutherland's "Wheel of Reincarnation"[https://www2.cs.arizona.edu/~cscheid/reading/myer-sutherland-design-of-display-processors.pdf] has not only stopped but exploded :-) Or stopped being applicable.

-- Bakul

On Aug 22, 2021, at 9:23 AM, Paul Lalonde <paul.a.lalo...@gmail.com> wrote:

It got complicated because there's no stable interface or ISA.  The hardware evolved from fixed-function to programmable in a commercial environment where the only meaningful measure was raw performance per dollar at many price points.  Every year the hardware spins and becomes more performant, usually faster than Moore's law.  With 3D APIs hiding the hardware details there is no pressure to make the hardware interface uniform, pretty, or neat.  And with the need for performance there are dozens of fixed function units that effectively need their own sub-drivers while coordinating at high performance with the other units.  The system diagrams for GPUs look complex, but they are radical simplifications of what's really on the inside.

Intel really pioneered the open driver stacks, but performance generally wasn't there.  That might be changing now, but I don't know if their recently announced discrete product line will be driver-compatible.

Paul


On Sun, Aug 22, 2021 at 8:48 AM Bakul Shah <ba...@iitbombay.org> wrote:
The FreeBSD amdgpu.ko is over 3Mbytes of compiled code. Not counting the "firmware" that gets loaded on the GPU board. drm/amd/amdgpu has 200K+ lines of source code. drm/amd over 2M lines of code. Intel's i915 seems to be about 1/10th the amd size. AIUI, this is linux GPU driver code, more or less unchanged (FreeBSD has shim code to use it). How did the interface to an SIMD processor get so complicated?

…



-- Bakul




------------------------------------------
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tad29bfc223dc4fbe-M40ea45711a1551fd53807b84
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
