On Fri, Dec 27, 2024 at 08:56:32AM -0800, Bakul Shah via 9fans wrote:
> This may be of some use to non-experts:
> 
> https://enccs.github.io/gpu-programming/
> 

I will therefore start with this before diving into the references Paul
Lalonde has given. Thanks!

> > On Dec 27, 2024, at 8:32 AM, Paul Lalonde <paul.a.lalo...@gmail.com> wrote:
> > 
> > GPUs have been my bread and butter for 20+ years.
> > 
> > The best introductory source continues to be Kayvon Fatahalian and Mike 
> > Houston's 2008 CACM paper: https://dl.acm.org/doi/10.1145/1400181.1400197
> > 
> > It says little about the software interface to the GPU, but does a very 
> > good job of motivating and describing the architecture.
> > 
> > The in-depth resource for modern GPU architecture is the Nvidia A100 tensor 
> > architecture paper: 
> > https://images.nvidia.com/aem-dam/en-zz/Solutions/data-center/nvidia-ampere-architecture-whitepaper.pdf.
> >   It's a slog, but clearly shows how compute has changed.  Particularly, 
> > much of the success is in turning branchy workloads with scattered memory 
> > accesses into much more bulk-oriented data streams that match well to the 
> > "natural" structure of the tensor cores.  The performance gains can be 
> > astronomical. I've personally made > 1000x - yes, that's *times* not 
> > percentages - speedups with some workloads.  There is very little compute 
> > that's "cpu-limited" at multi-second scales that can't benefit from these 
> > approaches, hence the death of non-GPU supercomputing.
> > 
> > Paul
> > 
> > 
> > 
> > On Fri, Dec 27, 2024 at 7:13 AM <tlaro...@kergis.com 
> > <mailto:tlaro...@kergis.com>> wrote:
> >> On Thu, Dec 26, 2024 at 10:24:23PM -0800, Ron Minnich wrote:
> >> [very interesting stuff]
> >> > 
> >> > Finally, why did something like this not ever happen? Because GPUs
> >> > came along a few years later and that's where all the parallelism in
> >> > HPC is nowadays. NIX was a nice idea, but it did not survive in the
> >> > GPU era.
> >> > 
> >> 
> >> GPUs are actually wreaking havoc on other kernels, with, in the Unix
> >> world, X11 being in bad shape for several reasons, one being that
> >> GPUs are not limited to graphical display---this tends to be
> >> anecdotal in some sense.
> >> 
> >> Can you elaborate on the GPU paradigm break? I tend to think that
> >> there is a main difference between "equals" sharing the same address
> >> space via an MMU, and auxiliary processors that are using another
> >> address space. A GPU, as far as I know (which is not much), is an
> >> auxiliary processor when the GPU is discrete, and is a specialized
> >> processor sharing the same address space when integrated (but I guess
> >> that an HPC system has discrete GPUs, perhaps with a specialized
> >> connection).
> >> 
> >> Do you know good references about:
> >> 
> >> - organizing processors depending on memory connection---I found
> >> mainly M. J. Flynn's paper(s) about this, but nothing more
> >> recent---and the impact on an OS design;
> >> 
> >> - IPC vs threads---from your description, it seems that your solution
> >> was multiplying processes, hence IPC, instead of multiplying
> >> threads---but nonetheless the sharing of differing memories remains,
> >> and is easier to solve with IPC than with threads;
> >> 
> >> - Present GPU architecture (supposing it is documented; it seems not
> >> totally from "General-Purpose Graphics Processor Architectures",
> >> Aamodt, Lun Fung, Rogers, SpringerVerlag) and the RISC-V approach,
> >> composing hardware by connecting dedicated elements, and vectors vs
> >> SIMT.
> >> 
> >> Thanks for sharing (what can be shared)!
> >> --
> >> Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> >>              http://www.kergis.com/
> >>             http://kertex.kergis.com/
> >> Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C
> > 
> > 9fans <https://9fans.topicbox.com/latest> / 9fans / see discussions 
> > <https://9fans.topicbox.com/groups/9fans> + participants 
> > <https://9fans.topicbox.com/groups/9fans/members> + delivery options 
> > <https://9fans.topicbox.com/groups/9fans/subscription>Permalink 
> > <https://9fans.topicbox.com/groups/9fans/T7692a612f26c8ec5-M515bf7357d2a3e968c25260f>
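
As a rough illustration of the point quoted above about turning branchy
workloads with scattered memory accesses into bulk-oriented data
streams, here is a minimal CPU-side sketch in Python. It is not GPU code
and is not taken from any of the references cited in this thread; the
function names (sum_branchy, sum_bulk) and the scale factors 2.0 and 0.5
are made up for the example. It only shows the shape of the
restructuring: gather first, then apply the same operation to every
element of a dense buffer.

```python
def sum_branchy(data, idx):
    """Per-element form: a scattered read and a data-dependent branch."""
    acc = 0.0
    for i in idx:
        x = data[i]            # scattered read through an index
        if x > 0.0:            # the branch taken depends on the data
            acc += 2.0 * x
        else:
            acc += 0.5 * x
    return acc


def sum_bulk(data, idx):
    """Bulk form: one gather pass, then uniform passes over dense data."""
    # Pass 1: gather the scattered elements into a contiguous buffer.
    dense = [data[i] for i in idx]
    # Pass 2: replace the branch by arithmetic on a 0/1 mask, so every
    # element goes through exactly the same operations.
    scale = [0.5 + 1.5 * (x > 0.0) for x in dense]
    # Pass 3: dense multiply-accumulate, the kind of regular stream that
    # wide hardware units are built to consume.
    return sum(s * x for s, x in zip(scale, dense))
```

Both functions compute the same sum; the second one merely separates the
irregular part (the gather) from the regular part (the arithmetic), and
on real hardware it is the regular part that would be handed to the wide
units.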

-- 
        Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
                     http://www.kergis.com/
                    http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C

------------------------------------------
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/T7692a612f26c8ec5-M719687d43692a16531bb4306
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
