Folks,

To clarify, following a couple of questions re: the sizes of Eiger data sets:

Sparsely measured Eiger data compress very well indeed - I have seen cases 
where the compressed data are around 100 kB / frame for an 18 megapixel detector.

The issue I was addressing below is that the pixels in RAM are typically 
worked with as 32 bit integers, even if the underlying acquisition used 16 
bits. Thus the volume of data the CPU has to work on is massively larger 
than the data on disk… even if most of the bits are 0’s :-)
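To put numbers on that, here is a back-of-the-envelope sketch in Python, using 
the frame count from the message below and the compressed size quoted above:

    # Disk vs RAM mismatch, using the figures quoted in this thread:
    # ~100 kB/frame compressed for sparse data, 32-bit pixels once in RAM.
    FRAMES = 7_200                   # a typical data set (see message below)
    PIXELS_PER_FRAME = 18_000_000    # Eiger 16M, ~18 megapixels
    COMPRESSED_PER_FRAME = 100_000   # ~100 kB/frame for sparse data

    on_disk = FRAMES * COMPRESSED_PER_FRAME   # compressed frames
    in_ram = FRAMES * PIXELS_PER_FRAME * 4    # unpacked to 32-bit integers

    print(f"on disk: {on_disk / 1e9:.1f} GB")   # ~0.7 GB
    print(f"in RAM:  {in_ram / 1e12:.2f} TB")   # ~0.5 TB, ~700x larger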

Best wishes Graeme



On 19 Feb 2020, at 08:08, Winter, Graeme (DLSLtd,RAL,LSCI) 
<graeme.win...@diamond.ac.uk> wrote:

Dear Ana,

To follow up on the contributions from others, there are some particular 
annoyances with MX processing which differentiate it from other “big data” or 
imaging problems.

In tomographic reconstruction you have a big block of data which needs (as a 
simplistic approximation) to be transformed by a bunch of trigonometric functions 
into another big block of data. The shape of the calculation is the same 
independent of the data itself, and overall this represents a massively 
parallel, computationally expensive problem, which makes it worth the cost of 
getting the data in and out of the GPU (this is not cheap) - though even in this 
case, the parallelism of modern CPUs means that a GPU win is not a given. These 
folks are usually the ones making a lot of noise about how awesome GPU boards 
are, and for their use case this is absolutely true.
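If you want to see the transfer cost for yourself, here is a minimal sketch - 
it assumes an NVIDIA card and the CuPy library, which are my choices for 
illustration, not anything specific to the codes discussed here:

    # Time a host -> GPU copy of one ~18 megapixel frame over PCIe.
    import time
    import numpy as np
    import cupy as cp

    # Roughly an Eiger 16M frame of 32-bit counts (~72 MB).
    frame = np.random.randint(0, 100, size=(4371, 4150), dtype=np.int32)

    start = time.perf_counter()
    gpu_frame = cp.asarray(frame)        # host -> device copy
    cp.cuda.Stream.null.synchronize()    # wait for the copy to finish
    print(f"host -> device: {time.perf_counter() - start:.4f} s")

Multiply that by thousands of frames, and in and out again, and the transfer 
cost quickly rivals the compute you hoped to save.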

In MX we have a particularly annoying problem: about half of the 
calculations are nicely parallel (spot finding, peak integration) and are 
memory bandwidth / CPU breadth limited, while the other half (indexing, 
refinement, scaling) are not very parallel and are CPU speed bound, so finding 
the best CPU architecture is hard to start with. In terms of GPUs, the data 
typically need to pass through main memory three times - for spot finding you 
need to look at every pixel, and integration typically needs to load full 
frames to extract the profiles and then fit them (the shoebox regions can be 
cached between these two passes, but they still need to pass in and out of the 
CPU). Since moving data in and out of memory is expensive and GPU memory is 
expensive, this is a problem. For reference, a typical Eiger 16M data set 
uncompressed needs about half a terabyte of RAM (7,200 frames * 18 megapixels * 
4 bytes), so in-memory processing presents real challenges. The image analysis 
calculations themselves are typically rather lightweight floating point work 
(e.g. summed area table calculations) without a lot of trigonometry.
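For a flavour of what a summed area table calculation looks like, here is a 
minimal numpy sketch of a local mean filter - illustrative only, not the 
actual DIALS spot finding code:

    # Local mean via a summed area table (integral image): the kind of
    # lightweight per-pixel arithmetic referred to above.
    import numpy as np

    def local_mean(image: np.ndarray, half: int = 3) -> np.ndarray:
        """Mean over a (2*half+1)^2 window around each pixel."""
        # Pad so every window fits, then build the SAT with two cumsums.
        padded = np.pad(image, half + 1, mode="edge").astype(np.float64)
        sat = padded.cumsum(axis=0).cumsum(axis=1)
        w = 2 * half + 1
        h, wid = image.shape
        # Each window sum is just four SAT lookups: D - B - C + A.
        total = (sat[w:w + h, w:w + wid] - sat[:h, w:w + wid]
                 - sat[w:w + h, :wid] + sat[:h, :wid])
        return total / (w * w)

Two passes over the image to build the table, then four memory reads and three 
additions per pixel for any window size - cheap arithmetic, dominated by 
memory traffic.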

All this, combined with the annoying habit of using words like “if” and “for” 
in the code (which kills GPU calculations dead), means that even for spot 
finding it’s not worth the effort of moving the data into a GPU - we DIALS 
folks looked into this a couple of years back with a specialist from NVIDIA.
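(For completeness: the standard workaround for the “if” problem is masked 
arithmetic, as in the toy numpy sketch below - though as noted, even with such 
tricks the port did not look worthwhile.)

    # Branches diverge GPU threads within a warp; a common fix is to
    # compute both sides and select with a mask. Toy numpy illustration:
    import numpy as np

    pixels = np.random.randint(0, 50, size=1_000_000)
    threshold = 10

    # Branchy form: if pixel > threshold, keep it, else zero it.
    # Branch-free, data-parallel form:
    mask = pixels > threshold
    kept = pixels * mask    # boolean mask acts as a 0/1 multiplier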

For what it’s worth we have spent some time looking at this here at Diamond, 
where we have a certain interest in speedy processing of MX data and the 
current (2020/02) best bang for buck appears to be AMD Rome.

We as a community have a challenge keeping up with high data rate 
beamlines at both synchrotrons and FELs - I feel it is important to keep an eye 
on emerging technology and make best use of it (and share experiences of using 
it!), but we should also keep in mind that the processing done in MX is actually 
rather well defined and mathematical at its heart. It is very unlikely that 
deep learning will help with the mathematical challenges we face [1], as we know 
exactly the calculations we need to do (which are very well documented in the 
literature - thank you to everyone who has written these up over the years); 
instead, a clear focus on making the maths fast is needed.

Up to the point where someone comes up with a completely new way of looking at 
the data, of course. I’m sure someone out there is looking at this :-)

On the topic of Raspberry Pi machines ;-) these are fun, but I would hate to 
look at the interconnect necessary to get enough boards working together to 
keep up with a single AMD Rome box…

best wishes Graeme

[1] with the possible exception of classifying individual found spots and other 
niche areas


On 19 Feb 2020, at 07:04, Leonarski Filip Karol (PSI) 
<filip.leonar...@psi.ch> wrote:

Dear Ana,

To benefit from a GPU architecture over a CPU, an algorithm needs to do quite 
significant number crunching – i.e. at least a certain number of floating 
point operations (FLOPs) per byte of data. It also needs to be highly 
parallel, preferably without conditional (if/else) statements. Finally, there 
is a variety of GPU architectures on the market and it is not at all obvious 
that code written for one GPU will be optimal on another; if the code is 
based on a general purpose library, it is easier to make sure that it runs 
efficiently on all GPU hardware.
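As a back-of-the-envelope illustration of the FLOPs-per-byte point (the kernel 
and GPU figures below are rough assumptions of mine, not measured values):

    # Arithmetic intensity: FLOPs per byte of data touched.
    # Hypothetical spot-finding-style kernel: read a 4-byte pixel and do
    # ~10 floating point ops on it (threshold tests, running sums).
    flops_per_pixel = 10
    bytes_per_pixel = 4
    intensity = flops_per_pixel / bytes_per_pixel   # 2.5 FLOPs/byte

    # A GPU with ~10 TFLOP/s compute and ~500 GB/s memory bandwidth only
    # becomes compute-bound above ~20 FLOPs/byte.
    ridge = 10e12 / 500e9

    print(f"kernel: {intensity} FLOPs/byte, GPU ridge: {ridge} FLOPs/byte")
    # 2.5 << 20: the kernel is bandwidth-bound; the GPU's FLOPs sit idle.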

I believe the combination of these factors makes a big difference between 
imaging and MX.

Image processing is limited by FFT performance, which demands floating point 
throughput. Libraries for FFTs on GPUs are standard and provided by the 
hardware vendors, so this is easy to implement.
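For example, a GPU FFT is only a few lines - this sketch assumes the CuPy 
library (which wraps NVIDIA's cuFFT); CuPy is my choice for illustration, not 
something any particular imaging code is committed to:

    # 2D FFT of an image on the GPU via a vendor FFT library.
    import numpy as np
    import cupy as cp

    image = np.random.rand(2048, 2048).astype(np.float32)

    gpu_image = cp.asarray(image)      # host -> device
    spectrum = cp.fft.fft2(gpu_image)  # dispatched to cuFFT
    result = cp.asnumpy(spectrum)      # device -> host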

On the other hand, MX algorithms for image processing, at least the ones I know 
of, do only a handful of FLOPs per pixel, and they would probably not benefit 
significantly from GPU processing even if ported to such an architecture – 
which would itself be a non-negligible effort. So while it is not impossible to 
imagine GPU-accelerated MX software, and hopefully people are working on this, 
it is not a low hanging fruit, as it is for GPU acceleration in imaging or cryo-EM.

On a side note, if one could find a way to use machine learning for data 
processing and implement the data processing pipeline in TensorFlow, then GPUs 
would pay off quickly.

Regarding Tim’s Raspberry Pi argument – it should be compared with the price of 
an NVIDIA Jetson, which is more or less an RPi with a GPU, and the difference 
is actually not that significant.

Best,
Filip


From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> 
on behalf of Ana Carolina de Mattos Zeri <ana.z...@lnls.br>
Reply to: Ana Carolina de Mattos Zeri <ana.z...@lnls.br>
Date: Tuesday, 18 February 2020 at 20:58
To: "CCP4BB@JISCMAIL.AC.UK" <CCP4BB@JISCMAIL.AC.UK>
Subject: [ccp4bb] MX data processing with GPUs??

Dear all,
We have asked this of a few people, but the question remains:
have any of you experienced/tried using GPU-based software to treat MX 
data, for reduction or subsequent image analysis?
Is it a lost battle?
How do you deal with the growing amount of data we are facing at synchrotrons 
and XFELs?
Here at the Manaca beamline at Sirius we will continue to support CPU-based 
software, but due to developments on the imaging beamlines, GPU machines are 
looking very attractive.
Many thanks in advance for your thoughts,
all the best
Ana


Ana Carolina Zeri, PhD
Manaca Beamline Coordinator (Macromolecular Micro and Nano Crystallography)
Brazilian Synchrotron Light Laboratory (LNLS)
Brazilian Center for Research in Energy and Materials (CNPEM)
Zip Code 13083-970, Campinas, Sao Paulo, Brazil.
(19) 3518-2498
www.lnls.br
ana.z...@lnls.br





