I think there is a conflict between two different types of usage that
GPU architectures need to support. One is full-speed performance,
where the resource is fully owned and utilized for a single purpose
for a large chunk of time; the other is where the GPU is a scarce
resource that needs to be shared in short intervals (e.g. ML/AI
inference serving).

If I understand correctly, NIX's approach addresses the first case (at
least for cores).  Nvidia's Multi-Instance GPU (MIG) architecture
seems to help handle the second type of usage. The first case calls
for maximizing data transfer rates; the second needs clever scheduling
to minimize swapping (large/huge) model parameters in and out.
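
To make the second case concrete, here is a rough sketch of the kind
of scheduling I mean: queued inference requests are grouped per model
so that the expensive weight swap happens as rarely as possible. The
load_weights() and run() hooks are hypothetical stand-ins for whatever
the runtime actually exposes, not a real API.

# Sketch of the shared-GPU case: many models, one device, and the
# scheduler's job is to amortize weight loads rather than keep a
# single kernel saturated.  load_weights()/run() are placeholders.
from collections import defaultdict, deque

class SharedGpuScheduler:
    def __init__(self, load_weights, run):
        self.load_weights = load_weights  # expensive: moves params onto the GPU
        self.run = run                    # cheap once the model is resident
        self.queues = defaultdict(deque)  # model name -> pending requests
        self.resident = None              # model currently on the GPU

    def submit(self, model, request):
        self.queues[model].append(request)

    def drain(self):
        results = []
        while any(self.queues.values()):
            # Serve the resident model first, otherwise the deepest queue,
            # so each weight swap is amortized over many requests.
            if self.queues.get(self.resident):
                model = self.resident
            else:
                model = max(self.queues, key=lambda m: len(self.queues[m]))
            if model != self.resident:
                self.load_weights(model)  # the cost we are trying to avoid
                self.resident = model
            while self.queues[model]:
                results.append(self.run(model, self.queues[model].popleft()))
        return results

The real problem is harder, of course (fairness, latency bounds, MIG
partitions as another dimension), but its shape is "amortize the
swap", which is roughly the opposite of the first case's "own the
device and keep the data pipes full".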


On Fri, Dec 27, 2024 at 8:33 AM Paul Lalonde <paul.a.lalo...@gmail.com> wrote:
>
> GPUs have been my bread and butter for 20+ years.
>
> The best introductory source continues to be Kayvon Fatahalian and Mike 
> Houston's 2008 CACM paper: https://dl.acm.org/doi/10.1145/1400181.1400197
>
> It says little about the software interface to the GPU, but does a very good 
> job of motivating and describing the architecture.
>
> The in-depth resource for modern GPU architecture is the Nvidia A100 tensor 
> architecture paper: 
> https://images.nvidia.com/aem-dam/en-zz/Solutions/data-center/nvidia-ampere-architecture-whitepaper.pdf.
>   It's a slog, but clearly shows how compute has changed.  Particularly, much 
> of the success is in turning branchy workloads with scattered memory accesses 
> into much more bulk-oriented data streams that match well to the "natural" 
> structure of the tensor cores.  The performance gains can be astronomical. 
> I've personally made > 1000x - yes, that's *times* not percentages - speedups 
> with some workloads.  There is very little compute that's "cpu-limited" at 
> multi-second scales that can't benefit from these approaches, hence the death 
> of non-GPU supercomputing.
>
> Paul
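
As a toy illustration of the restructuring Paul describes (not his
actual workloads), here is a branchy, element-at-a-time loop next to
the bulk, predicated version of the same computation; NumPy stands in
for the wide SIMD/SIMT hardware here.

import numpy as np

def branchy(values, table, idx):
    # Scalar style: a data-dependent branch plus a gather, one element
    # at a time; exactly what starves wide hardware.
    out = np.empty_like(values)
    for i in range(len(values)):
        if values[i] > 0.0:
            out[i] = values[i] * table[idx[i]]
        else:
            out[i] = 0.0
    return out

def bulk(values, table, idx):
    # Bulk style: gather for every lane, then mask.  More raw work, but
    # one coherent stream the machine can keep fully busy.
    gathered = values * table[idx]
    return np.where(values > 0.0, gathered, 0.0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    v = rng.standard_normal(1_000_000)
    t = rng.standard_normal(4096)
    ix = rng.integers(0, 4096, size=v.shape)
    assert np.allclose(branchy(v, t, ix), bulk(v, t, ix))
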
>
>
>
> On Fri, Dec 27, 2024 at 7:13 AM <tlaro...@kergis.com> wrote:
>>
>> On Thu, Dec 26, 2024 at 10:24:23PM -0800, Ron Minnich wrote:
>> [very interesting stuff]
>> >
>> > Finally, why did something like this not ever happen? Because GPUs
>> > came along a few years later and that's where all the parallelism in
>> > HPC is nowadays. NIX was a nice idea, but it did not survive in the
>> > GPU era.
>> >
>> 
>> GPUs are actually wreaking havoc on other kernels, with, in the Unix
>> world, X11 being in bad shape for several reasons, one being that
>> GPUs are not limited to graphical display---this tends to be
>> anecdotal in some sense.
>> 
>> Can you elaborate on the GPU paradigm break? I tend to think that
>> there is a main difference between "equals" sharing the same address
>> space via the MMU, and auxiliary processors that use another address
>> space. A GPU, as far as I know (which is not much), is an auxiliary
>> processor when it is discrete, and a specialized processor sharing
>> the same address space when integrated (but I guess that an HPC
>> machine has discrete GPUs with perhaps a specialized interconnect).
>> 
>> Do you know good references about:
>> 
>> - organizing processors depending on memory connection---I found
>> mainly M. J. Flynn's paper(s) about this, but nothing more
>> recent---and the impact on an OS design;
>> 
>> - IPC vs. threads---from your description, it seems that your solution
>> was multiplying processes, hence IPC, instead of multiplying
>> threads---but the sharing of differing memories nonetheless remains,
>> and is easier to solve with IPC than with threads;
>> 
>> - present GPU architecture (supposing it is documented; it seems not
>> entirely so, judging from "General-Purpose Graphics Processor
>> Architectures", Aamodt, Lun Fung, Rogers, Springer-Verlag) and the
>> RISC-V approach of composing hardware by connecting dedicated
>> elements, and vectors vs. SIMT.
>> 
>> Thanks for sharing (what can be shared)!
>> --
>> Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
>>              http://www.kergis.com/
>>             http://kertex.kergis.com/
>> Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C
>

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T7692a612f26c8ec5-M4831fee22f0b3926b832d46f