Hi,

I have more estimates for the same problem, this time using a multinomial data distribution. I should have realized that, for prediction, I do not have to deal with the infinite likelihood of 0 trials when observing only 0s on an image: whenever 0 photons are generated by the latent process, the image is automatically empty. With this simplification, I still have to hide behind mathematical convenience and use a Gamma prior for the latent Poisson process, but observing 0 counts just increments the beta parameter by 1 compared to the prior belief. With equal photon capture probabilities, the mean counts are about 0.01 and the std is about 0.1 with a rate≈Gamma(alpha=1, beta=0.1) prior. With a symmetric Dirichlet prior on the capture probabilities, the means appear unchanged, but the predicted stds start high at a very low concentration parameter and level off at a high concentration parameter. This becomes more apparent at high photon counts (high alpha of the Gamma distribution). The answer is different depending on whether we look at the std across the detector plane or across time for a single pixel. Details of the calculation below:
https://colab.research.google.com/drive/1NK43_3r1rH5lBTDS2rzIFDFNWqFfekrZ?usp=sharing

Best wishes,

Gergely

Gergely Katona, Professor, Chairman of the Chemistry Program Council
Department of Chemistry and Molecular Biology, University of Gothenburg
Box 462, 40530 Göteborg, Sweden
Tel: +46-31-786-3959 / M: +46-70-912-3309 / Fax: +46-31-786-3910
Web: http://katonalab.eu, Email: gergely.kat...@gu.se

From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> On Behalf Of Nave, Colin (DLSLtd,RAL,LSCI)
Sent: 21 October, 2021 19:21
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] am I doing this right?

Congratulations to James for starting this interesting discussion.

For those who are like me, nowhere near a black belt in statistics, the thread has included a number of distributions. I have had to look up where these apply and investigate their properties. As an example, “The Poisson distribution is used to model the # of events in the future, Exponential distribution is used to predict the wait time until the very first event, and Gamma distribution is used to predict the wait time until the k-th event.” A useful calculator for distributions can be found at https://keisan.casio.com/menu/system/000000000540 and a specific example is at https://keisan.casio.com/exec/system/1180573179 where cumulative probabilities for a Poisson distribution can be found given values for x and lambda.

The most appropriate prior is another issue which has come up, e.g. is a flat prior appropriate? I can see that a different prior would be appropriate for different areas of the detector (e.g. 1 pixel instead of 100 pixels) but the most appropriate prior seems a bit arbitrary to me. One of James’ examples was 10^5 background photons distributed among 10^6 pixels – what is the most appropriate prior for this case? I presume it is OK to update the prior after each observation but I understand that it can create difficulties if not done properly.
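A minimal sketch of what "updating the prior after each observation" can look like, assuming a conjugate Gamma(alpha, beta) prior on the Poisson rate (the kind of prior Gergely mentions, with beta as a rate parameter). The function names and the example counts are hypothetical and only for illustration. Under that assumption, pixel-by-pixel updating and a single batch update give the same posterior:

```python
from fractions import Fraction


def update(alpha, beta, count):
    """One conjugate Gamma-Poisson step: a pixel with `count` photons
    turns a Gamma(alpha, beta) belief into Gamma(alpha + count, beta + 1)."""
    return alpha + count, beta + 1


def sequential(alpha, beta, counts):
    """Use the posterior after each pixel as the prior for the next one."""
    for c in counts:
        alpha, beta = update(alpha, beta, c)
    return alpha, beta


def batch(alpha, beta, counts):
    """Process all pixels in a single step."""
    return alpha + sum(counts), beta + len(counts)


# Assumed example: Gamma(alpha=1, beta=0.1) prior, 99 empty pixels and one photon.
prior = (Fraction(1), Fraction(1, 10))
counts = [0] * 99 + [1]

seq = sequential(*prior, counts)
bat = batch(*prior, counts)
assert seq == bat  # order and grouping of the observations do not matter

alpha, beta = seq
print("posterior mean of the per-pixel rate:", float(alpha / beta))
print("posterior std of the per-pixel rate: ", float(alpha) ** 0.5 / float(beta))
```

The equality holds only because the same Gamma family is used throughout; it says nothing about whether that family is the most appropriate prior.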
Being able to select the prior is sometimes seen as a strength of Bayesian methods. However, as a strong advocate of Bayesian methods once put it, this is a bit like Achilles boasting about his heel!

I hope for some agreement among the black belts. It would be good to end up with some clarity about the most appropriate probability distributions and priors. Also, have we got clarity about the question being asked?

Thanks to all for the interesting points.

Colin

From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> On Behalf Of Randy John Read
Sent: 21 October 2021 13:23
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] am I doing this right?

Hi Kay,

No, I still think the answer should come out the same if you have good reason to believe that all the 100 pixels are equally likely to receive a photon (for instance because your knowledge of the geometry of the source and the detector says the difference in their positions is insignificant, i.e. part of your prior expectation). Unless the exact position of the spot where you detect the photon is relevant, detecting 1 photon on a big pixel and detecting the same photon on 1 of 100 smaller pixels covering the same area are equivalent events.

What should be different in the analysis, if you're thinking about individual pixels, is that the expected value for a photon landing on any of the pixels will be 100 times lower for each of the smaller pixels than the single big pixel, so that the expected value of their sum is the same. You won't get to that conclusion without having a different prior probability for the two cases that reflects the 100-fold lower flux through the smaller area, regardless of the total power of the source.
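One way to illustrate this invariance is with a Gamma(alpha, beta) prior on the rate, which is an assumption made here for convenience and not the uniform prior discussed elsewhere in the thread. If the big-pixel rate has prior Gamma(a, b), the rate of each of m equal sub-pixels is m times lower and so has prior Gamma(a, m*b). The sketch below (hypothetical names, exact rational arithmetic) compares the posterior for the whole patch in both descriptions when every count is zero, and also shows what happens if the prior is not rescaled:

```python
from fractions import Fraction


def posterior(alpha, beta, counts):
    """Gamma-Poisson update for a rate shared by all observations in `counts`."""
    return alpha + sum(counts), beta + len(counts)


def gamma_mean_var(alpha, beta):
    """Mean and variance of Gamma(alpha, beta), beta being a rate parameter."""
    return alpha / beta, alpha / beta**2


m = 100                                # number of small pixels
a, b = Fraction(1), Fraction(1, 10)    # assumed prior on the big-pixel rate

# One big pixel, observed once with 0 counts.
big = gamma_mean_var(*posterior(a, b, [0]))

# m small pixels, all 0, with the prior rescaled to the m-fold lower flux.
mean_s, var_s = gamma_mean_var(*posterior(a, m * b, [0] * m))
small_total = (m * mean_s, m**2 * var_s)   # scale the rate back to the patch

# Same m small pixels, but reusing the big-pixel prior for each small pixel.
mean_u, var_u = gamma_mean_var(*posterior(a, b, [0] * m))
unscaled_total = (m * mean_u, m**2 * var_u)

print("big pixel:        ", big)
print("small, rescaled:  ", small_total)
print("small, unrescaled:", unscaled_total)
assert big == small_total
assert big != unscaled_total
```

With the rescaled prior the two descriptions agree exactly in mean and variance of the patch rate; with the same prior applied per small pixel they do not. This concerns the posterior of the rate only, not the predictive distribution of future counts.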
Best wishes,

Randy

On 21 Oct 2021, at 13:03, Kay Diederichs <kay.diederi...@uni-konstanz.de> wrote:

Randy,

I must admit that I am not certain about my answer, but I lean toward thinking that the result (of the two thought experiments that you describe) is not the same. I do agree that it makes sense that the expectation value is the same, and the math that I sketched in https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?A2=CCP4BB;bdd31b04.2110 actually shows this. But the variance?

To me, a 100-pixel patch with all zeroes is no different from sequentially observing 100 pixels, one after the other. For the first of these pixels, I have no idea what the count is, until I observe it. For the second, I am less surprised that it is 0 because I observed 0 for the first. And so on, until the 100th. For the last one, my belief that I will observe a zero before I read out the pixel is much higher than for the first pixel. The variance is just the inverse of the amount of error (squared) that we assign to our belief in the expectation value. And that amount of belief is very different. I find it satisfactory that the sigma goes down with the sqrt() of the number of pixels.

Also, I don't find an error in the math of my posting of Mon, 18 Oct 2021 15:00:42 +0100. I do think that a uniform prior is not realistic, but this does not seem to make much difference for the 100-pixel thought experiment.

We could change the thought experiment in the following way: you observe 99 pixels with zero counts, and 1 with 1 count. Would you still say that both the big-pixel-single-observation and the 100-pixel experiment should give an expectation value of 2 and a variance of 2? I wouldn't.
Best wishes,

Kay

On Thu, 21 Oct 2021 09:00:23 +0000, Randy John Read <rj...@cam.ac.uk> wrote:

Just to be a bit clearer, I mean that the calculation of the expected value and its variance should give the same answer if you're comparing one pixel for a particular length of exposure with the sum obtained from either a larger number of smaller pixels covering the same area for the same length of exposure, or the sum from the same pixel measured for smaller time slices adding up to the same total exposure.

On 21 Oct 2021, at 09:54, Randy John Read <rj...@cam.ac.uk> wrote:

I would think that if this problem is being approached correctly, with the right prior, it shouldn't matter whether you collect the same signal distributed over 100 smaller pixels or the same pixel measured for the same length of exposure but with 100 time slices; you should get the same answer. So I would want to formulate the problem in a way where this invariance is satisfied. I thought it was, from some of the earlier descriptions of the problem, but this sounds worrying.

I think you're trying to say the same thing here, Kay. Is that right?

Best wishes,

Randy

On 21 Oct 2021, at 08:51, Kay Diederichs <kay.diederi...@uni-konstanz.de> wrote:

Hi Ian,

it is Iobs=0.01 and sigIobs=0.01 for one pixel, but adding 100 pixels each with variance=sigIobs^2=0.0001 gives 0.01, yielding a 100-pixel-sigIobs of 0.1 - different from the 1 you get. As if repeatedly observing the same count of 0 lowers the estimated error by sqrt(n), where n is the number of observations (100 in this case).
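The two numbers being compared here (0.1 and 1) can be reproduced with two different propagation rules. This is a sketch of the arithmetic only, with hypothetical variable names; it does not settle which rule is appropriate. Adding variances treats the 100 per-pixel errors as independent, whereas multiplying sigma by the number of pixels treats the per-pixel value as one shared quantity that is simply scaled up to the patch:

```python
import math

n = 100          # number of pixels in the patch
iobs = 0.01      # per-pixel estimate quoted in the thread
sig_iobs = 0.01  # per-pixel sigma quoted in the thread

# Rule (a): errors treated as independent, so variances add.
var_sum = n * sig_iobs**2
sig_independent = math.sqrt(var_sum)   # approximately 0.1

# Rule (b): one shared per-pixel rate scaled by n, so sigma scales linearly.
sig_shared = n * sig_iobs              # approximately 1

total = n * iobs                       # approximately 1 in both cases

print(f"patch total                  = {total:.3f}")
print(f"sigma, independent errors    = {sig_independent:.3f}")
print(f"sigma, shared rate scaled up = {sig_shared:.3f}")
```

The ratio between the two sigmas is sqrt(n), which is the factor mentioned above.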
best wishes,

Kay

On Wed, 20 Oct 2021 13:08:33 +0100, Ian Tickle <ianj...@gmail.com> wrote:

Hi Kay

Can I just confirm that your result Iobs=0.01 sigIobs=0.01 is the estimate of the true average intensity *per pixel* for a patch of 100 pixels? So then the total count for all 100 pixels is 1 with variance also 1, or in general for k observed counts in the patch, expectation = variance = k+1 for the total count, irrespective of the number of pixels? If so then that agrees with my own conclusion. It makes sense because Iobs=0.01 sigIobs=0.01 cannot come from a Poisson process (which obviously requires expectation = variance = an integer), whereas the total count does come from a Poisson process.

The difference from my approach is that you seem to have come at it via the individual pixel counts whereas I came straight from the Agostini result applied to the whole patch. The number of pixels seems to me to be irrelevant for the whole patch since the design of the detector, assuming it's an ideal detector with DQE = 1, surely cannot change the photon flux coming from the source: all ideal detectors whatever their pixel layout must give the same result. The number of pixels is then only relevant if one needs to know the average intensity per pixel, i.e. the total and s.d. divided by the number of pixels. Note the pixels here need not even correspond to the hardware pixels, they can be any arbitrary subdivision of the detector surface.

Best wishes

-- Ian

On Tue, 19 Oct 2021 at 12:39, Kay Diederichs <kay.diederi...@uni-konstanz.de> wrote:

James,

I am saying that my answer to "what is the expectation and variance if I observe a 10x10 patch of pixels with zero counts?" is Iobs=0.01 sigIobs=0.01 (and Iobs=sigIobs=1 if there is only one pixel) IF the uniform prior applies.
I agree with Gergely and others that this prior (with its high expectation value and variance) appears unrealistic. In your posting of Sat, 16 Oct 2021 12:00:30 -0700 you make a calculation of Ppix that appears like a more suitable expectation value of a prior to me. A suitable prior might then be 1/Ppix * e^(-l/Ppix) (Agostini §7.7.1). The Bayesian argument is IIUC that the prior plays a minor role if you do repeated measurements of the same value, because you use the posterior of the first measurement as the prior for the second, and so on. What this means is that your Ppix must play the role of a scale factor if you consider the 100-pixel experiment. However, for the 1-pixel experiment, having a more suitable prior should be more important.

best,

Kay

On Mon, 18 Oct 2021 12:40:45 -0700, James Holton <jmhol...@lbl.gov> wrote:

Thank you very much for this Kay!

So, to summarize, you are saying the answer to my question "what is the expectation and variance if I observe a 10x10 patch of pixels with zero counts?" is:

Iobs = 0.01
sigIobs = 0.01 (defining sigIobs = sqrt(variance(Iobs)))

And for the one-pixel case:

Iobs = 1
sigIobs = 1

but in both cases the distribution is NOT Gaussian, but rather exponential. And that means adding variances may not be the way to propagate error.

Is that right?

-James Holton
MAD Scientist

On 10/18/2021 7:00 AM, Kay Diederichs wrote:

Hi James,

I'm a bit behind ...

My answer about the basic question ("a patch of 100 pixels each with zero counts - what is the variance?") you ask is the following:

1) we all know the Poisson PDF (Probability Distribution Function) P(k|l) = l^k*e^(-l)/k! (where k stands for an integer >=0 and l is lambda) which tells us the probability of observing k counts if we know l. The PDF is normalized: SUM_over_k (P(k|l)) for k=0...infinity is 1.
2) you don't know before the experiment what l is, and you assume it is some number x with 0<=x<=xmax (the xmax limit can be calculated by looking at the physics of the experiment; it is finite and less than the overload value of the pixel, otherwise you should do a different experiment). Since you don't know that number, all the x values are equally likely - you use a uniform prior.

3) what is the PDF P(l|k) of l if we observe k counts? That can be found with Bayes theorem, and it turns out that (due to the uniform prior) the right hand side of the formula looks the same as in 1): P(l|k) = l^k*e^(-l)/k! (again, the ! stands for the factorial, it is not a semantic exclamation mark). This is eqs. 7.42 and 7.43 in Agostini "Bayesian Reasoning in Data Analysis".

3a) side note: if we calculate the expectation value for l, by multiplying with l and integrating over l from 0 to infinity, we obtain E(P(l|k))=k+1, and similarly for the variance (Agostini eqs 7.45 and 7.46)

4) for k=0 (zero counts observed in a single pixel), this reduces to P(l|0)=e^(-l) for a single observation (pixel). (this is basic math; see also §7.4.1 of Agostini.)

5) since we have 100 independent pixels, we must multiply the individual PDFs to get the overall PDF f, and also normalize so that the integral over that PDF is 1: the result is f(l|all 100 pixels are 0)=n*e^(-n*l), where n is the number of pixels (here n=100). (basic math). A more Bayesian procedure would be to realize that the posterior PDF P(l|0)=e^(-l) of the first pixel should be used as the prior for the second pixel, and so forth until the 100th pixel. This has the same result f(l|all 100 pixels are 0)=n*e^(-n*l) (Agostini § 7.7.2)!

6) the expectation value INTEGRAL_0_to_infinity over l*n*e^(-n*l) dl is 1/n. This is 1 if n=1 as we know from 3a), and 1/100 for 100 pixels with 0 counts.

7) the variance is then INTEGRAL_0_to_infinity over (l-1/n)^2*n*e^(-n*l) dl. This is 1/n^2.

I find these results quite satisfactory.
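The integrals in 6) and 7) can be checked numerically. The sketch below uses only the Python standard library and a simple midpoint rule; the function names, the step count and the truncation of the upper limit at 60/n (beyond which the integrand is below e^-60) are choices made for this illustration and are not part of the original derivation.

```python
import math


def integrate(f, lo, hi, steps=200_000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


def posterior_moments(n):
    """Norm, mean and variance of f(l) = n*exp(-n*l), the posterior for the
    per-pixel rate l after n pixels with zero counts (uniform prior)."""
    def pdf(l):
        return n * math.exp(-n * l)

    upper = 60.0 / n  # truncation point; the neglected tail is about e^-60
    norm = integrate(pdf, 0.0, upper)
    mean = integrate(lambda l: l * pdf(l), 0.0, upper)
    var = integrate(lambda l: (l - mean) ** 2 * pdf(l), 0.0, upper)
    return norm, mean, var


for n in (1, 100):
    norm, mean, var = posterior_moments(n)
    print(f"n={n:3d}: norm={norm:.6f}  mean={mean:.6g} (1/n={1 / n:.6g})  "
          f"var={var:.6g} (1/n^2={1 / n**2:.6g})")
```

For n=1 and n=100 the numerical mean and variance agree with 1/n and 1/n^2 to within the integration error, so the per-pixel sigma equals the per-pixel expectation value in both cases.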
Please note that they deviate from the MLE result: expectation value=0, variance=0. The problem appears to be that a Maximum Likelihood Estimator may give wrong results for small n; something that I've read a couple of times but which appears not to be universally known/taught. Clearly, the result in 6) and 7) for large n converges towards 0, as it should be.

What this also means is that one should really work out the PDF instead of just adding expectation values and variances (and arriving at 100 if all 100 pixels have zero counts) because it is contradictory to use a uniform prior for all the pixels if OTOH these agree perfectly in being 0!

What this means for zero-dose extrapolation I have not thought about. At least it prevents infinite weights!

Best,

Kay

########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list hosted by www.jiscmail.ac.uk, terms & conditions are available at https://www.jiscmail.ac.uk/policyandsecurity/
------
Randy J. Read
Department of Haematology, University of Cambridge
Cambridge Institute for Medical Research
The Keith Peters Building, Hills Road, Cambridge CB2 0XY, U.K.
Tel: + 44 1223 336500 / Fax: + 44 1223 336827
E-mail: rj...@cam.ac.uk
www-structmed.cimr.cam.ac.uk