Re: [ccp4bb] am I doing this right?

James Holton Sun, 17 Oct 2021 10:12:50 -0700

Well Frank, I think it comes down to something I believe you were the first to call "dose slicing".

Like fine phi slicing, collecting a larger number of weaker images records the same photons, but with more information about the sample before it dies. In fine phi slicing the extra information allows you to do better background rejection, and in "dose slicing" the extra information is about radiation damage. We lose that information when we use longer exposures per image, and if you burn up the entire useful life of your crystal in one shot, then all information about how the spots decayed during the exposure is lost. Your data are also rather incomplete.

How much information is lost? Well, how much more disk space would be taken up, even after compression, if you collected only 1 photon per image? And kept collecting all the way out to 30 MGy in dose? That's about 1 million photons (images) per cubic micron of crystal. So, I'd say the amount of information lost is "quite a bit".

But what makes matters worse is that if you did collect this data set and preserved all information available from your crystal you'd have no way to process it. This is not because it's impossible, it's just that we don't have the software. Your only choice would be to go find images with the same "phi" value and add them together until you have enough photons/pixel to index it. Once you've got an indexing solution you can map every photon hit to a position in reciprocal space as well as give it a time/dose stamp. What do you do with that? You can do zero-dose extrapolation, of course! Damage-free data! Wouldn't that be nice. Or can you? The data you will have in hand for each reciprocal-space pixel might look something like:

tic tic .. tic . tic ... tic tic........tic ...........tic............................tic.

So. Eight photons. With time-of-arrival information. How do you fit a straight line to that? You could "bin" the data or do some kind of smoothing thing, but then you are losing information again. Perhaps also making ill-founded assumptions. You need error bars of some kind, and, better yet, the shape of the distribution implied by those error bars.

And all this makes me think somebody must have already done this. I'm willing to bet probably some time in the late 1700s to early 1800s. All we're really talking about here is augmenting maximum-likelihood estimation of an average value to maximum-likelihood estimation of a straight line. That is, slope and intercept, with sigmas on both. I suspect the proper approach is to first bring everything down to the exact information content of a single photon (or lack of a photon), and build up from there. If you are lucky enough to have a large number of photons then linear regression will work, and you are back to Diederichs (2003). But when you're photon-starved the statistics of single photons become more and more important. This led me to: is it k? or k+1? When k=0 getting this wrong could introduce a factor of infinity.
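
A minimal sketch of what "maximum-likelihood estimation of a straight line" could look like for time-stamped photons, assuming the count rate in one reciprocal-space pixel varies linearly with time, r(t) = intercept + slope*t. The log-likelihood is the standard unbinned one for an inhomogeneous Poisson process (the sum of log r(t_i) minus the integral of r over the exposure); it is not taken from this thread. The eight arrival times, the exposure length, the function names and the brute-force grid search are all invented for illustration.

```python
import math

def log_likelihood(times, intercept, slope, total_time):
    """Unbinned log-likelihood of photon arrival times for a rate
    r(t) = intercept + slope * t on the interval [0, total_time]."""
    # The rate has to stay positive over the whole exposure.
    if intercept <= 0 or intercept + slope * total_time <= 0:
        return -math.inf
    expected = intercept * total_time + 0.5 * slope * total_time ** 2
    return sum(math.log(intercept + slope * t) for t in times) - expected

def fit_line(times, total_time, steps=200):
    """Coarse grid search for the maximum-likelihood intercept and slope."""
    mean_rate = len(times) / total_time
    best = (-math.inf, None, None)
    for i in range(1, steps + 1):
        intercept = 4.0 * mean_rate * i / steps              # up to 4x the mean rate
        for j in range(-steps, steps + 1):
            slope = (4.0 * mean_rate / total_time) * j / steps
            ll = log_likelihood(times, intercept, slope, total_time)
            if ll > best[0]:
                best = (ll, intercept, slope)
    return best

# Hypothetical time stamps (arbitrary units) for eight photons in a 30-unit exposure.
ARRIVAL_TIMES = [0.5, 1.0, 2.5, 3.5, 5.5, 6.0, 11.0, 17.0]
best_ll, best_intercept, best_slope = fit_line(ARRIVAL_TIMES, total_time=30.0)
print(f"intercept = {best_intercept:.3f}, slope = {best_slope:.4f}, logL = {best_ll:.2f}")
```

No binning is involved: every photon enters with its own time stamp. With this few photons the best fit may well sit on the positivity constraint, and the sigmas on slope and intercept would still have to come from the shape of the likelihood surface, which this sketch does not attempt.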

So, perhaps the big "consequence of getting it wrong" is embarrassing myself by re-making a 200-year old mistake I am not currently aware of. I am confident a solution exists, but only recently started working on this. So, I figured ... ask the world?


-James Holton
MAD Scientist


On 10/17/2021 1:51 AM, Frank Von Delft wrote:

James, I've been watching the thread with fascination, but also the confusion of wild ignorance. I've finally realised why.
What I've missed is: what exactly makes the question so important? I've understood what brought it up, of course, but not the consequence of getting it wrong.
Frank

Sent from tiny silly touch screen
------------------------------------------------------------------------
*From:* James Holton <jmhol...@lbl.gov>
*Sent:* Saturday, 16 October 2021 20:01
*To:* CCP4BB@JISCMAIL.AC.UK
*Subject:* Re: [ccp4bb] am I doing this right?

Thank you everyone for your thoughtful and thought-provoking responses!

But, I am starting to think I was not as clear as I could have been
about my question.  I am actually concerning myself with background, not
necessarily Bragg peaks.  With Bragg photons you want the sum, but for
background you want the average.

What I'm getting at is: how does one properly weight a zero-photon
observation when it comes time to combine it with others? Hopefully
they are not all zero.  If they are, check your shutter.

So, ignoring Bragg photons for the moment (let us suppose it is a
systematic absence) what I am asking is: what is the variance, or,
better yet, what is the WEIGHT one should assign to the observation of
zero photons in a patch of 10x10 pixels?

In the absence of any prior knowledge this is a difficult question, but
a question we kind of need to answer if we want to properly measure data
from weak images.  So, what do we do?

Well, with the "I have no idea" uniform prior, it would seem that
expectation (Epix) and variance (Vpix) would be k+1 = 1 for each pixel,
and therefore the sum of Epix and Vpix over the 100 independent pixels is:

Epatch=Vpatch=100 photons
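
For reference, a short derivation of where k+1 comes from, assuming a flat prior on the true per-pixel mean. The symbol \lambda for that true mean is introduced here and is not used elsewhere in the thread.

```latex
P(k \mid \lambda) = \frac{\lambda^{k} e^{-\lambda}}{k!},
\qquad
p(\lambda \mid k)
  = \frac{P(k \mid \lambda)}{\int_0^\infty P(k \mid \lambda')\, d\lambda'}
  = \frac{\lambda^{k} e^{-\lambda}}{k!}

E[\lambda \mid k]
  = \int_0^\infty \lambda\, \frac{\lambda^{k} e^{-\lambda}}{k!}\, d\lambda
  = \frac{(k+1)!}{k!} = k+1

E[\lambda^{2} \mid k] = \frac{(k+2)!}{k!} = (k+1)(k+2),
\qquad
\mathrm{Var}[\lambda \mid k] = (k+1)(k+2) - (k+1)^{2} = k+1
```

With k=0 both are 1 per pixel, which is the per-pixel figure used above when each of the 100 pixels is treated on its own.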

I know that seems weird to assume 100 photons should have hit when we
actually saw none, but consider what that zero-photon count, all by
itself, is really telling you:
a) Epix > 20 ? No way. That is "right out". Given we know it's Poisson
distributed, and that background is flat, it is VERY unlikely you have E
that big when you saw zero. Cross all those E values off your list.
b) Epix=0 ? Well, that CAN be true, but other things are possible and
all of them are E>0. So, most likely E is not 0, but at least a little
bit higher.
c) Epix=1e-6 ?  Yeah, sure, why not?
d) Epix= -1e-6 ?  No. Don't be silly.
e) If I had to guess? Meh. 1 photon per pixel?  That would be k+1

I suppose my objection to E=V=0 is because V=0 implies infinite
confidence in the value of E, and that we don't have. Yes, it is true
that we are quite confident in the fact that we did not see any photons
this time, but remember that E and V are the mean and variance that
you would see if you did a million experiments under the same
conditions. We are trying to guess those from what we've got. Just
because you've seen zero a hundred times doesn't mean the 101st
experiment won't give you a count.  If it does, then maybe Epatch=0.01
and Epix=0.0001?  But what do you do before you see your first photon?
All you can really do is bracket it.

But what if you come up with a better prior than "I have no idea" ?
Well, we do have other pixels on the detector, and presuming the
background is flat, or at least smooth, maybe the average counts/pixel
is a better prior?
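
One conventional way to turn an average counts/pixel into a prior is the gamma conjugate prior that Gergely Katona mentions further down the thread. A minimal sketch follows, assuming the prior is a gamma distribution whose mean is the detector-wide average. The function name, the "prior_weight" knob (how many pixels' worth of evidence the prior is trusted for) and the example numbers are assumptions made here, not something settled in this discussion.

```python
def gamma_poisson_update(prior_mean, prior_weight, total_counts, n_pixels):
    """Conjugate update of a gamma prior on the per-pixel Poisson mean.

    prior_mean   : expected photons/pixel before looking at the patch
    prior_weight : gamma rate parameter, i.e. how many pixels of data the
                   prior counts for (a free choice)
    total_counts : photons actually seen in the patch
    n_pixels     : number of pixels in the patch
    """
    shape = prior_mean * prior_weight + total_counts
    rate = prior_weight + n_pixels
    posterior_mean = shape / rate
    posterior_variance = shape / rate ** 2
    return posterior_mean, posterior_variance

# Hypothetical numbers: a prior of 0.1 photons/pixel, trusted as much as
# 100 pixels of data, followed by a 10x10 patch that recorded nothing.
mean_pix, var_pix = gamma_poisson_update(0.1, 100.0, total_counts=0, n_pixels=100)
print(f"posterior per-pixel mean {mean_pix:.4f}, variance {var_pix:.6f}")
```

The answer depends strongly on prior_weight: a heavily trusted prior barely moves when the patch is empty, a weakly trusted one is pulled most of the way towards zero. Choosing that weight is exactly the judgment call under discussion.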

So, let us consider an ideal detector with 1e6 independent pixels. Let
us further say that 1e5 background photons have hit that detector.  I
want to still ignore Bragg photons because those have a very different
prior distribution to the background.  Let us say we have masked off all
the Bragg areas.

The average overall background is then 0.1 photons/pixel. Let us assign
that to the prior probability Ppix = 0.1.  Now let us look again at that
patch of 10x10 pixels with zero counts on it.  We expected to see 10,
but got 0.  What are the odds of that?  Pretty remote: exp(-10), or
about 1 in 22,000.

I suspect in this situation where such an unlikely event has occurred it
should perhaps be given a variance larger than 100. Perhaps quite a bit
larger?  Subsequent "sigma-weighted" summation would then squash its
contribution down to effectively 0. So, relative to any other
observation with even a shred of merit it would have no impact. Giving
it V=0, however? That can't be right.
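
To make the "sigma-weighted" summation concrete, here is a minimal sketch assuming the usual inverse-variance weights, w = 1/V. The observation values and variances are invented. The point is only that a very large variance effectively removes an observation from the average, while V=0 cannot be used as a weight at all.

```python
def weighted_mean(values, variances):
    """Inverse-variance weighted mean and the variance of that mean."""
    if any(v <= 0 for v in variances):
        raise ValueError("a variance of zero (or less) gives an infinite weight")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / total
    return mean, 1.0 / total

# Hypothetical observations: one zero-count patch and two ordinary ones.
values = [0.0, 12.0, 9.0]
modest = weighted_mean(values, [100.0, 12.0, 9.0])   # zero-count patch with V=100
huge = weighted_mean(values, [1e9, 12.0, 9.0])       # same patch with a huge variance
print(f"V=100: mean {modest[0]:.3f}   V=1e9: mean {huge[0]:.3f}")
```

Calling weighted_mean with a variance of 0 raises an error rather than silently letting one observation dominate everything else.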

But what if Ppix=0.01 ?  Then we expect to see zero counts on our
100-pixel patch about 1/3 of the time. Same for 1-photon observations.
Giving these two kinds of observations the same weight seems more
sensible, given the prior.
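
The odds quoted for the two priors can be checked directly from the Poisson formula. This is a small sketch; the function name and the loop are just for illustration.

```python
import math

def poisson_pmf(k, expected):
    """Probability of seeing exactly k counts when `expected` are expected."""
    return expected ** k * math.exp(-expected) / math.factorial(k)

PATCH_PIXELS = 100  # the 10x10 patch

for ppix in (0.1, 0.01):
    expected_patch = PATCH_PIXELS * ppix
    p_zero = poisson_pmf(0, expected_patch)
    p_one = poisson_pmf(1, expected_patch)
    print(f"Ppix={ppix}: expect {expected_patch:g} on the patch, "
          f"P(0 counts)={p_zero:.3g}, P(1 count)={p_one:.3g}")
```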

Another prior might be to take the flux and sample thickness into
account.  Given the cross section of light elements the expected
photons/pixel on most any detector would be:

Ppix = 1.2e-5*flux*exposure*thickness*omega/Npixels
where:
Ppix = expected photons/pixel
Npixels = number of pixels on the detector
omega  = fraction of scattered photons that hit it (about 0.5)
thickness = thickness of sample and loop in microns
exposure = exposure time in seconds
flux = incident beam flux in photons/s
1.2e-5 = 1e-4 cm/um * 1.2 g/cm^3 * 0.2 cm^2/g (cross section of oxygen)

If you don't know anything else about the sample, you can at least know
that.
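
The formula above as a small function, for anyone who wants to plug in their own beamline numbers. The function and argument names and the example values are made up for this sketch. Note that the three factors listed (1e-4 * 1.2 * 0.2) multiply to 2.4e-5, so the coefficient is exposed as a parameter, defaulting to the 1.2e-5 written in the formula, rather than hard-coded.

```python
def expected_photons_per_pixel(flux, exposure, thickness_um, n_pixels,
                               omega=0.5, coefficient=1.2e-5):
    """Expected background photons per pixel, following the formula as quoted.

    flux         : incident beam flux in photons/s
    exposure     : exposure time in seconds
    thickness_um : thickness of sample and loop in microns
    n_pixels     : number of pixels on the detector
    omega        : fraction of scattered photons that hit the detector
    coefficient  : scattering probability per micron of path (see note above)
    """
    return coefficient * flux * exposure * thickness_um * omega / n_pixels

# Hypothetical numbers, for illustration only.
ppix = expected_photons_per_pixel(flux=1e12, exposure=0.01,
                                  thickness_um=100.0, n_pixels=6e6)
print(f"expected background: {ppix:.3g} photons/pixel")
```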

Or am I missing something?

-James Holton
MAD Scientist


On 10/16/2021 12:47 AM, Kay Diederichs wrote:
> Dear Gergely,
>
> with "10 x 10 patch of pixels", I believe James means that he observes 100 neighbouring pixels each with 0 counts. Thus the frequentist view can be taken, and results in 0 as the variance, right?
>
> best,
> Kay
>
>
> On Fri, 15 Oct 2021 21:07:26 +0000, Gergely Katona <gergely.kat...@gu.se> wrote:
>
>> Dear James,
>>
>> Uniform distribution sounds like “I have no idea”, but a uniform distribution does not go from -inf to +inf. If I believe that every count from 0 to 65535 has the same probability, then I also expect counts with an average of 32768 on the image. It is not an objective belief in the end and probably not a very good idea for an X-ray experiment if the number of observations is small. Concerning which variance is the right one, the frequentist view requires frequencies to be observed. In the absence of frequencies, there is no error estimate. Bayesians at least can determine a single distribution as an answer without observations and that will be their prior belief of the variance. Again, I would avoid a uniform a priori distribution for the variance. For a Poisson distribution the convenient conjugate prior is the gamma distribution. It can control the magnitude of k and strength of belief with its location and scale parameter, respectively.
>>
>> Best wishes,
>>
>> Gergely
>>
>> Gergely Katona, Professor, Chairman of the Chemistry Program Council
>> Department of Chemistry and Molecular Biology, University of Gothenburg
>> Box 462, 40530 Göteborg, Sweden
>> Tel: +46-31-786-3959 / M: +46-70-912-3309 / Fax: +46-31-786-3910
>> Web: http://katonalab.eu, Email: gergely.kat...@gu.se
>>
>> From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> On Behalf Of James Holton
>> Sent: 15 October, 2021 18:06
>> To: CCP4BB@JISCMAIL.AC.UK
>> Subject: Re: [ccp4bb] am I doing this right?
>>
>> Well I'll be...
>>
>> Kay Diederichs pointed out to me off-list that the k+1 expectation and variance from observing k photons is in "Bayesian Reasoning in Data Analysis: A Critical Introduction" by Giulio D'Agostini. Granted, that is with a uniform prior, which I take as the Bayesian equivalent of "I have no idea".
>>
>> So, if I'm looking to integrate a 10 x 10 patch of pixels on a weak detector image, and I find that area has zero counts, what variance shall I put on that observation? Is it:
>>
>> a) zero
>> b) 1.0
>> c) 100
>>
>> Wish I could say there are no wrong answers, but I think at least two of those are incorrect,
>>
>> -James Holton
>> MAD Scientist
>> On 10/13/2021 2:34 PM, Filipe Maia wrote:
>> I forgot to add probably the most important. James is correct, the expected value of u, the true mean, given a single observation k is indeed k+1 and k+1 is also the mean square error of using k+1 as the estimator of the true mean.
>>
>> Cheers,
>> Filipe
>>
>> On Wed, 13 Oct 2021 at 23:17, Filipe Maia <fil...@xray.bmc.uu.se> wrote:
>> Hi,
>>
>> The maximum likelihood estimator for a Poisson distributed variable is equal to the mean of the observations. In the case of a single observation, it will be equal to that observation. As Graeme suggested, you can calculate the probability mass function for a given observation with different Poisson parameters (i.e. true means) and see that function peaks when the parameter matches the observation.
>>
>> The root mean squared error of the estimation of the true mean from a single observation k seems to be sqrt(k+2). Or to put it in another way, mean squared error, that is the expected value of (k-u)**2, for an observation k and a true mean u, is equal to k+2.
>>
>> You can see some example calculations at https://colab.research.google.com/drive/1eoaNrDqaPnP-4FTGiNZxMllP7SFHkQuS?usp=sharing
>>
>> Cheers,
>> Filipe
>>
>> On Wed, 13 Oct 2021 at 17:14, Winter, Graeme (DLSLtd,RAL,LSCI) <00006a19cead4548-dmarc-requ...@jiscmail.ac.uk> wrote:
>> This rang a bell to me last night, and I think you can derive this from first principles
>>
>> If you assume an observation of N counts, you can calculate the probability of such an observation for a given Poisson rate constant X. If you then integrate over all possible values of X to work out the central value of the rate constant which is most likely to result in an observation of N I think you get X = N+1
>>
>> I think it is the kind of calculation you can perform on a napkin, if memory serves
>>
>> All the best Graeme
>>
>>
>> On 13 Oct 2021, at 16:10, Andrew Leslie - MRC LMB <and...@mrc-lmb.cam.ac.uk> wrote:
>>
>> Hi Ian, James,
>>
>> I have a strong feeling that I have seen this result before, and it was due to Andy Hammersley at ESRF. I’ve done a literature search and there is a paper relating to errors in analysis of counting statistics (see below), but I had a quick look at this and could not find the (N+1) correction, so it must have been somewhere else. I have cc’d Andy on this Email (hoping that this Email address from 2016 still works) and maybe he can throw more light on this. What I remember at the time I saw this was the simplicity of the correction.
>>
>> Cheers,
>>
>> Andrew
>>
>> Reducing bias in the analysis of counting statistics data
>> Hammersley, A.P.; Antoniadis, A.
>> Nuclear Instruments & Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment
>> Volume 394, Issue 1-2, Pages 219-224
>> DOI 10.1016/S0168-9002(97)00668-2
>> Published 11 July 1997
>>
>>
>> On 12 Oct 2021, at 18:55, Ian Tickle <ianj...@gmail.com> wrote:
>>
>>
>> Hi James
>>
>> What the Poisson distribution tells you is that if the true count is N then the expectation and variance are also N. That's not the same thing as saying that for an observed count N the expectation and variance are N. Consider all those cases where the observed count is exactly zero. That can arise from any number of true counts, though as you noted larger values become increasingly unlikely. However those true counts are all >= 0 which means that the mean and variance of those true counts must be positive and non-zero. From your results they are both 1 though I haven't been through the algebra to prove it.
>>
>> So what you are saying seems correct: for N observed counts we should be taking the best estimate of the true value and variance as N+1. For reasonably large N the difference is small but if you are concerned with weak images it might start to become significant.
>>
>> Cheers
>>
>> -- Ian
>>
>>
>> On Tue, 12 Oct 2021 at 17:56, James Holton <jmhol...@lbl.gov> wrote:
>> All my life I have believed that if you're counting photons then the
>> error of observing N counts is sqrt(N).  However, a calculation I just
>> performed suggests it's actually sqrt(N+1).
>>
>> My purpose here is to understand the weak-image limit of data
>> processing. Question is: for a given pixel, if one photon is all you
>> got, what do you "know"?
>>
>> I simulated millions of 1-second experiments. For each I used a "true"
>> beam intensity (Itrue) between 0.001 and 20 photons/s. That is, for
>> Itrue= 0.001 the average over a very long exposure would be 1 photon
>> every 1000 seconds or so. For a 1-second exposure the observed count (N)
>> is almost always zero. About 1 in 1000 of them will see one photon, and
>> roughly 1 in a million will get N=2. I do 10,000 such experiments and
>> put the results into a pile.  I then repeat with Itrue=0.002,
>> Itrue=0.003, etc. All the way up to Itrue = 20. At Itrue > 20 I never
>> see N=1, not even in 1e7 experiments. With Itrue=0, I also see no N=1
>> events.
>> Now I go through my pile of results and extract those with N=1, and
>> count up the number of times a given Itrue produced such an event. The
>> histogram of Itrue values in this subset is itself Poisson, but with
>> mean = 2 ! If I similarly count up events where 2 and only 2 photons
>> were seen, the mean Itrue is 3. And if I look at only zero-count events
>> the mean and standard deviation is unity.
>>
>> Does that mean the error of observing N counts is really sqrt(N+1) ?
>>
>> I admit that this little exercise assumes that the distribution of Itrue
>> is uniform between 0.001 and 20, but given that one photon has been
>> observed Itrue values outside this range are highly unlikely. The
>> Itrue=0.001 and N=1 events are only a tiny fraction of the whole. So, I
>> would say that even if the prior distribution is not uniform, it is
>> certainly bracketed. Now, Itrue=0 is possible if the shutter didn't
>> open, but if the rest of the detector pixels have N=~1, doesn't this
>> affect the prior distribution of Itrue on our pixel of interest?
>>
>> Of course, two or more photons are better than one, but these days with
>> small crystals and big detectors N=1 is no longer a trivial situation.
>> I look forward to hearing your take on this. And no, this is not a trick.
>>
>> -James Holton
>> MAD Scientist
>>
>>########################################################################
>>
>> To unsubscribe from the CCP4BB list, click the following link:
>> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
>>
>> This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list hosted by www.jiscmail.ac.uk, terms & conditions are available at https://www.jiscmail.ac.uk/policyandsecurity/