Re: [ccp4bb] am I doing this right?

Gergely Katona Fri, 22 Oct 2021 00:25:24 -0700

Hi,

I have more estimates for the same problem using a multinomial data 
distribution. I should have realized that for prediction, I do not have to deal 
with the infinite likelihood of 0 trials when observing only 0s on an image. 
Whenever 0 photons are generated by the latent process, the image is 
automatically empty. With this simplification, I still have to hide behind 
mathematical convenience and use a Gamma prior for the latent Poisson process, 
but observing 0 counts just increments the beta parameter by 1 compared to the 
prior belief. With equal photon capture probabilities, the mean counts are 
about 0.01 and the std is about 0.1 with a rate ~ Gamma(alpha=1, beta=0.1) 
prior. With a symmetric Dirichlet prior on the capture probabilities, the means 
appear unchanged, but the predicted stds start high at a very low concentration 
parameter and level off at a high concentration parameter. This becomes more 
apparent at high photon counts (high alpha of the Gamma distribution). The 
answer is different if we look at the std across the detector plane or across 
time for a single pixel.
Details of the calculation are below:


https://colab.research.google.com/drive/1NK43_3r1rH5lBTDS2rzIFDFNWqFfekrZ?usp=sharing
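
For readers who cannot open the notebook, the equal-capture-probability case can be sketched analytically. This is not the notebook's code, only a minimal sketch under stated assumptions: a patch of 100 pixels (the patch size used elsewhere in this thread), one empty image observed, beta taken as a rate parameter, and a hypothetical helper name `predictive_pixel_stats`.

```python
import math

def predictive_pixel_stats(alpha, beta, n_pixels, n_empty_images=1):
    """Predictive mean and std of one pixel's count after empty images.

    Assumes total rate ~ Gamma(alpha, rate=beta) and equal capture
    probability 1/n_pixels for every pixel (illustrative assumptions).
    """
    # Zero observed counts leave alpha unchanged; each image adds 1 to beta.
    post_alpha = alpha
    post_beta = beta + n_empty_images
    p = 1.0 / n_pixels

    rate_mean = post_alpha / post_beta
    rate_var = post_alpha / post_beta**2

    # Law of total variance: Poisson noise plus uncertainty in the rate.
    mean = p * rate_mean
    var = p * rate_mean + p**2 * rate_var
    return mean, math.sqrt(var)

mean, std = predictive_pixel_stats(alpha=1.0, beta=0.1, n_pixels=100)
print(f"mean = {mean:.4f}, std = {std:.4f}")
```

Under these assumptions the result lands close to the quoted figures of about 0.01 and 0.1. The Dirichlet case is not covered by this sketch.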

Best wishes,

Gergely

Gergely Katona, Professor, Chairman of the Chemistry Program Council
Department of Chemistry and Molecular Biology, University of Gothenburg
Box 462, 40530 Göteborg, Sweden
Tel: +46-31-786-3959 / M: +46-70-912-3309 / Fax: +46-31-786-3910
Web: http://katonalab.eu, Email: gergely.kat...@gu.se

From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> On Behalf Of Nave, Colin 
(DLSLtd,RAL,LSCI)
Sent: 21 October, 2021 19:21
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] am I doing this right?

Congratulations to James for starting this interesting discussion.

For those who are like me, nowhere near a black belt in statistics, the thread 
has included a number of distributions.  I have had to look up where these 
apply and investigate their properties.
As an example,
“The Poisson distribution is used to model the # of events in the future, 
Exponential distribution is used to predict the wait time until the very first 
event, and Gamma distribution is used to predict the wait time until the k-th 
event.”
A useful calculator for distributions can be found at
https://keisan.casio.com/menu/system/000000000540
a specific example is at
https://keisan.casio.com/exec/system/1180573179
where cumulative probabilities for a Poisson distribution can be found given 
values for x and lambda.
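
The same cumulative probabilities can be computed locally in a few lines. The following is a minimal sketch in plain Python; the function name `poisson_cdf` is illustrative, and lambda = 0.1 is simply the mean count per pixel implied by 10^5 photons spread over 10^6 pixels.

```python
import math

def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Poisson(lam), summing the pmf term by term."""
    term = math.exp(-lam)      # P(X = 0)
    total = term
    for k in range(1, x + 1):
        term *= lam / k        # P(X = k) from P(X = k - 1)
        total += term
    return total

p_empty = poisson_cdf(0, 0.1)         # probability that a pixel is empty
p_at_most_one = poisson_cdf(1, 0.1)   # probability of 0 or 1 counts
print(p_empty, p_at_most_one)
```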

The most appropriate prior is another issue which has come up e.g. is a flat 
prior appropriate? I can see that a different prior would be appropriate for 
different areas of the detector (e.g. 1 pixel instead of 100 pixels) but the 
most appropriate prior seems a bit arbitrary to me. One of James’ examples was 
10^5 background photons distributed among  10^6 pixels – what is the most 
appropriate prior for this case? I presume it is OK to update the prior after 
each observation but I understand that it can create difficulties if not done 
properly.

Being able to select the prior is sometimes seen as a strength of Bayesian 
methods. However, as a strong advocate of Bayesian methods once put it, this is 
a bit like Achilles boasting about his heel!

I hope for some agreement among the black belts. It would be good to end up 
with some clarity about the most appropriate probability distributions and 
priors. Also, have we got clarity about the question being asked?

Thanks to all for the interesting points.

Colin
From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> On Behalf Of Randy John Read
Sent: 21 October 2021 13:23
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] am I doing this right?

Hi Kay,

No, I still think the answer should come out the same if you have good reason 
to believe that all the 100 pixels are equally likely to receive a photon (for 
instance because your knowledge of the geometry of the source and the detector 
says the difference in their positions is insignificant, i.e. part of your 
prior expectation). Unless the exact position of the spot where you detect the 
photon is relevant, detecting 1 photon on a big pixel and detecting the same 
photon on 1 of 100 smaller pixels covering the same area are equivalent events. 
What should be different in the analysis, if you're thinking about individual 
pixels, is that the expected value for a photon landing on any of the pixels 
will be 100 times lower for each of the smaller pixels than the single big 
pixel, so that the expected value of their sum is the same. You won't get to 
that conclusion without having a different prior probability for the two cases 
that reflects the 100-fold lower flux through the smaller area, regardless of 
the total power of the source.
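
The sampling side of this argument can be checked numerically: the sum of 100 independent Poisson counts, each with 1/100 of the rate, has the same distribution as a single Poisson count with the full rate. The sketch below uses a total rate of 1.0 and hypothetical helper names, and it only addresses the likelihood. It says nothing about which prior is appropriate.

```python
import math

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

def convolve_truncated(a, b):
    """pmf of the sum of two independent counts, kept for k < len(a)."""
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(len(a))]

def pmf_of_sum(lam_total, n_pixels, k_max=10):
    """pmf of the summed counts of n_pixels pixels sharing lam_total equally."""
    small = [poisson_pmf(k, lam_total / n_pixels) for k in range(k_max + 1)]
    total = [1.0] + [0.0] * k_max   # empty sum: certainly 0 counts
    for _ in range(n_pixels):
        total = convolve_truncated(total, small)
    return total

big = [poisson_pmf(k, 1.0) for k in range(11)]   # one big pixel
summed = pmf_of_sum(1.0, 100)                    # 100 small pixels
max_diff = max(abs(x - y) for x, y in zip(big, summed))
print(max_diff)   # agreement up to floating-point rounding
```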

Best wishes,
Randy

On 21 Oct 2021, at 13:03, Kay Diederichs <kay.diederi...@uni-konstanz.de> wrote:

Randy,

I must admit that I am not certain about my answer, but I lean toward thinking 
that the result (of the two thought experiments that you describe) is not the 
same. I do agree that it makes sense that the expectation value is the same, 
and the math that I sketched in 
https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?A2=CCP4BB;bdd31b04.2110 actually 
shows this. But the variance? To me, a 100-pixel patch with all zeroes is no 
different from sequentially observing 100 pixels, one after the other. For the 
first of these pixels, I have no idea what the count is, until I observe it. 
For the second, I am less surprised that it is 0 because I observed 0 for the 
first. And so on, until the 100th. For the last one, my belief that I will 
observe a zero before I read out the pixel is much higher than for the first 
pixel. The variance is just the inverse of the amount of error (squared) that 
we assign to our belief in the expectation value. And that amount of belief is 
very different. I find it satisfactory that the sigma goes down with the sqrt() 
of the number of pixels.

Also, I don't find an error in the math of my posting of Mon, 18 Oct 2021 
15:00:42 +0100. I do think that a uniform prior is not realistic, but this 
does not seem to make much difference for the 100-pixel thought experiment.

We could change the thought experiment in the following way - you observe 99 
pixels with zero counts, and 1 with 1 count. Would you still say that both the 
big-pixel-single-observation and the 100-pixel experiment should give 
expectation value of 2 and variance of 2? I wouldn't.

Best wishes,
Kay

On Thu, 21 Oct 2021 09:00:23 +0000, Randy John Read <rj...@cam.ac.uk> wrote:

Just to be a bit clearer, I mean that the calculation of the expected value and 
its variance should give the same answer if you're comparing one pixel for a 
particular length of exposure with the sum obtained from either a larger number 
of smaller pixels covering the same area for the same length of exposure, or 
the sum from the same pixel measured for smaller time slices adding up to the 
same total exposure.

On 21 Oct 2021, at 09:54, Randy John Read <rj...@cam.ac.uk> wrote:

I would think that if this problem is being approached correctly, with the 
right prior, it shouldn't matter whether you collect the same signal 
distributed over 100 smaller pixels or the same pixel measured for the same 
length of exposure but with 100 time slices; you should get the same answer. So 
I would want to formulate the problem in a way where this invariance is 
satisfied. I thought it was, from some of the earlier descriptions of the 
problem, but this sounds worrying.

I think you're trying to say the same thing here, Kay. Is that right?

Best wishes,

Randy

On 21 Oct 2021, at 08:51, Kay Diederichs <kay.diederi...@uni-konstanz.de> wrote:

Hi Ian,

it is Iobs=0.01 and sigIobs=0.01 for one pixel, but adding 100 pixels each with 
variance=sigIobs^2=0.0001 gives 0.01, yielding a 100-pixel-sigIobs of 0.1 - 
different from the 1 you get. It is as if repeatedly observing the same count 
of 0 lowers the estimated error by sqrt(n), where n is the number of 
observations (100 in this case).
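
The two totals being compared (0.1 and 1) can be reproduced with simple arithmetic. The sketch below is one possible reading, not something stated by either poster: the value 0.1 follows if the 100 per-pixel errors are added as independent variances, and the value 1 follows if the patch total is taken as n times a single shared per-pixel rate, so that the per-pixel errors are fully correlated.

```python
import math

n = 100                      # pixels in the patch, all with zero counts

# Per-pixel posterior under the uniform prior: n * exp(-n * l)
mean_per_pixel = 1.0 / n     # Iobs = 0.01
var_per_pixel = 1.0 / n**2   # sigIobs^2 = 0.0001

# (a) Add the n per-pixel variances as if the errors were independent.
sig_total_independent = math.sqrt(n * var_per_pixel)

# (b) Treat the total as n * l with one shared rate l (fully correlated).
sig_total_shared = math.sqrt(n**2 * var_per_pixel)

print(sig_total_independent, sig_total_shared)
```

Both readings give the same expected total, n * mean_per_pixel = 1. They differ only in the error assigned to it.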

best wishes,
Kay

On Wed, 20 Oct 2021 13:08:33 +0100, Ian Tickle <ianj...@gmail.com> wrote:

Hi Kay

Can I just confirm that your result Iobs=0.01 sigIobs=0.01 is the estimate
of the true average intensity *per pixel* for a patch of 100 pixels?  So
then the total count for all 100 pixels is 1 with variance also 1, or in
general for k observed counts in the patch, expectation = variance = k+1
for the total count, irrespective of the number of pixels?  If so then that
agrees with my own conclusion.  It makes sense because Iobs=0.01
sigIobs=0.01 cannot come from a Poisson process (which obviously requires
expectation = variance = an integer), whereas the total count does come
from a Poisson process.

The difference from my approach is that you seem to have come at it via the
individual pixel counts whereas I came straight from the Agostini result
applied to the whole patch.  The number of pixels seems to me to be
irrelevant for the whole patch since the design of the detector, assuming
it's an ideal detector with DQE = 1 surely cannot change the photon flux
coming from the source: all ideal detectors whatever their pixel layout
must give the same result.  The number of pixels is then only relevant if
one needs to know the average intensity per pixel, i.e. the total and s.d.
divided by the number of pixels.  Note the pixels here need not even
correspond to the hardware pixels, they can be any arbitrary subdivision of
the detector surface.

Best wishes

-- Ian


On Tue, 19 Oct 2021 at 12:39, Kay Diederichs <kay.diederi...@uni-konstanz.de> wrote:

James,

I am saying that my answer to "what is the expectation and variance if I
observe a 10x10 patch of pixels with zero
counts?" is Iobs=0.01 sigIobs=0.01 (and Iobs=sigIobs=1 if there is only
one pixel) IF the uniform prior applies. I agree with Gergely and others
that this prior (with its high expectation value and variance) appears
unrealistic.

In your posting of Sat, 16 Oct 2021 12:00:30 -0700 you make a calculation
of Ppix that appears like a more suitable expectation value of a prior to
me. A suitable prior might then be 1/Ppix * e^(-l/Ppix) (Agostini §7.7.1).
The Bayesian argument is IIUC that the prior plays a minor role if you do
repeated measurements of the same value, because you use the posterior of
the first measurement as the prior for the second, and so on. What this
means is that your Ppix must play the role of a scale factor if you
consider the 100-pixel experiment.
However, for the 1-pixel experiment, having a more suitable prior should
be more important.
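
The effect of swapping the uniform prior for the exponential one can be sketched with the standard conjugate Gamma-Poisson update. The exponential prior 1/Ppix * e^(-l/Ppix) is a Gamma distribution with shape 1 and rate 1/Ppix, and the uniform prior corresponds to the improper limit of shape 1 and rate 0. The value Ppix = 0.1 below is an assumption for illustration only, as is the helper name `gamma_posterior`.

```python
def gamma_posterior(shape0, rate0, n_pixels, total_counts):
    """Posterior mean and variance of the per-pixel rate l.

    Prior: Gamma(shape0, rate0). Data: n_pixels pixels with
    total_counts photons in total (conjugate Gamma-Poisson update).
    """
    shape = shape0 + total_counts
    rate = rate0 + n_pixels
    return shape / rate, shape / rate**2

ppix = 0.1   # ASSUMED prior expectation per pixel, for illustration only

results = {}
for n in (1, 100):
    results[("exponential", n)] = gamma_posterior(1.0, 1.0 / ppix, n, 0)
    results[("uniform", n)] = gamma_posterior(1.0, 0.0, n, 0)

for key, (mean, var) in results.items():
    print(key, f"mean = {mean:.5f}", f"sigma = {var ** 0.5:.5f}")
```

With this assumed Ppix, the two priors give posterior means that differ by a factor of 11 for one pixel but only by a factor of 1.1 for 100 pixels, in line with the expectation that the prior matters most for the 1-pixel experiment.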

best,
Kay




On Mon, 18 Oct 2021 12:40:45 -0700, James Holton <jmhol...@lbl.gov> wrote:

Thank you very much for this Kay!

So, to summarize, you are saying the answer to my question "what is the
expectation and variance if I observe a 10x10 patch of pixels with zero
counts?" is:
Iobs = 0.01
sigIobs = 0.01     (defining sigIobs = sqrt(variance(Iobs)))

And for the one-pixel case:
Iobs = 1
sigIobs = 1

but in both cases the distribution is NOT Gaussian, but rather
exponential. And that means adding variances may not be the way to
propagate error.

Is that right?

-James Holton
MAD Scientist



On 10/18/2021 7:00 AM, Kay Diederichs wrote:
Hi James,

I'm a bit behind ...

My answer about the basic question ("a patch of 100 pixels each with
zero counts - what is the variance?") you ask is the following:

1) we all know the Poisson PDF (Probability Distribution Function)
P(k|l) = l^k*e^(-l)/k!  (where k stands for an integer >=0 and l is
lambda) which tells us the probability of observing k counts if we know l.
The PDF is normalized: SUM_over_k (P(k|l)) is 1 when k runs over 0...infinity.
2) you don't know before the experiment what l is, and you assume it is
some number x with 0<=x<=xmax (the xmax limit can be calculated by looking
at the physics of the experiment; it is finite and less than the overload
value of the pixel, otherwise you should do a different experiment). Since
you don't know that number, all the x values are equally likely - you use a
uniform prior.
3) what is the PDF P(l|k) of l if we observe k counts?  That can be
found with Bayes theorem, and it turns out that (due to the uniform prior)
the right hand side of the formula looks the same as in 1) : P(l|k) =
l^k*e^(-l)/k! (again, the ! stands for the factorial, it is not a semantic
exclamation mark). This is eqs. 7.42 and 7.43 in Agostini "Bayesian
Reasoning in Data Analysis".
3a) side note: if we calculate the expectation value for l, by
multiplying with l and integrating over l from 0 to infinity, we obtain
E(P(l|k))=k+1, and similarly for the variance (Agostini eqs 7.45 and 7.46)
4) for k=0 (zero counts observed in a single pixel), this reduces to
P(l|0)=e^(-l) for a single observation (pixel). (this is basic math; see
also §7.4.1 of Agostini.)
5) since we have n=100 independent pixels, we must multiply the
individual PDFs to get the overall PDF f, and also normalize to make the
integral over that PDF to be 1: the result is f(l|all 100 pixels are
0)=n*e^(-n*l). (basic math). A more Bayesian procedure would be to realize
that the posterior PDF P(l|0)=e^(-l) of the first pixel should be used as
the prior for the second pixel, and so forth until the 100th pixel. This
has the same result f(l|all 100 pixels are 0)=n*e^(-n*l) (Agostini § 7.7.2)!
6) the expectation value INTEGRAL_0_to_infinity over l*n*e^(-n*l) dl is
1/n .  This is 1 if n=1 as we know from 3a), and 1/100 for 100 pixels with
0 counts.
7) the variance is then INTEGRAL_0_to_infinity over
(l-1/n)^2*n*e^(-n*l) dl . This is 1/n^2
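
The integrals in steps 3a), 6) and 7) can be checked numerically. The following is a minimal sketch using a plain midpoint rule; the helper names and the integration limits (chosen so that the neglected tail is negligible) are illustrative choices, not part of the derivation.

```python
import math

def moments(pdf, upper, steps=200_000):
    """Normalisation, mean and variance of pdf on [0, upper] (midpoint rule)."""
    h = upper / steps
    norm = first = second = 0.0
    for i in range(steps):
        l = (i + 0.5) * h
        w = pdf(l) * h
        norm += w
        first += l * w
        second += l * l * w
    mean = first / norm
    return norm, mean, second / norm - mean**2

def single_pixel_posterior(k):
    # P(l|k) = l^k * e^(-l) / k!  (uniform prior, one pixel, k counts)
    return lambda l: l**k * math.exp(-l) / math.factorial(k)

def empty_patch_posterior(n):
    # f(l | all n pixels are 0) = n * e^(-n*l)
    return lambda l: n * math.exp(-n * l)

norm0, mean0, var0 = moments(single_pixel_posterior(0), upper=60.0)
norm1, mean1, var1 = moments(single_pixel_posterior(1), upper=60.0)
norm100, mean100, var100 = moments(empty_patch_posterior(100), upper=1.0)

print(mean0, var0)       # close to k+1 = 1 for k = 0
print(mean1, var1)       # close to k+1 = 2 for k = 1
print(mean100, var100)   # close to 1/n = 0.01 and 1/n^2 = 0.0001
```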

I find these results quite satisfactory. Please note that they deviate
from the MLE result: expectation value=0, variance=0 . The problem appears
to be that a Maximum Likelihood Estimator may give wrong results for small
n; something that I've read a couple of times but which appears not to be
universally known/taught. Clearly, the result in 6) and 7) for large n
converges towards 0, as it should be.
What this also means is that one should really work out the PDF instead
of just adding expectation values and variances (and arriving at 100 if all
100 pixels have zero counts) because it is contradictory to use a uniform
prior for all the pixels if OTOH these agree perfectly in being 0!

What this means for zero-dose extrapolation I have not thought about.
At least it prevents infinite weights!

Best,
Kay





########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/

------
Randy J. Read
Department of Haematology, University of Cambridge
Cambridge Institute for Medical Research     Tel: + 44 1223 336500
The Keith Peters Building                    Fax: + 44 1223 336827
Hills Road                                   E-mail: rj...@cam.ac.uk
Cambridge CB2 0XY, U.K.                      www-structmed.cimr.cam.ac.uk


--

This e-mail and any attachments may contain confidential, copyright and or 
privileged material, and are for the use of the intended addressee only. If you 
are not the intended addressee or an authorised recipient of the addressee 
please notify us of receipt by returning the e-mail and do not use, copy, 
retain, distribute or disclose the information in or attached to the e-mail.
Any opinions expressed within this e-mail are those of the individual and not 
necessarily of Diamond Light Source Ltd.
Diamond Light Source Ltd. cannot guarantee that this e-mail or any attachments 
are free from viruses and we cannot accept liability for any damage which you 
may sustain as a result of software viruses which may be transmitted in or with 
the message.
Diamond Light Source Limited (company no. 4375679). Registered in England and 
Wales with its registered office at Diamond House, Harwell Science and 
Innovation Campus, Didcot, Oxfordshire, OX11 0DE, United Kingdom

