On 2022-06-07 21:51, David V Glasgow via use-livecode wrote:
Quite a lot of stats and maths packages offer a feature whereby the N,
the Mean and the SD are variables specified by the user, and N random
numbers are then generated with the required mean and SD.  I remember
the venerable and excellent Hypercard  HyperStat
<https://link.springer.com/content/pdf/10.3758/BF03204668.pdf> (1993)
by David M Lane doing exactly that.

Or is there an elegant formula?  I have Googled about and can’t see
one, but maybe I don’t know the magic words.  And if someone wanted to
script this in LC what would be the best approach? (just general
guidance here, wouldn’t want anyone to invest their valuable time in
what is at present just vague musings)

Any hints from the stats gurus?

I'm not a stats guru but...

I think all you need to do here is to use some of the intrinsic 'properties' of the Mean and SD.

Let's say you have a collection X of numbers and constants c and k. Then the following are always true:

  P1: Mean(c * X) = c * Mean(X)
  P2: Mean(X + k) = k + Mean(X)
  P3: SD(c * X) = abs(c) * SD(X)
  P4: SD(X + k) = SD(X)

In English: scaling a set of numbers scales their mean by the same amount, and offsetting a set of numbers offsets their mean by the same amount. Similarly, scaling a set of numbers scales their SD by the same amount (ignoring the sign of the scale), and offsetting a set of numbers makes no difference to the SD (the SD is a relative quantity: it cares about distance from the mean, not magnitude).
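P1 to P4 are easy to check numerically. Here is a minimal sketch in Python (not LiveCode), using the standard `statistics` module. The data and the values of c and k are made up for illustration, and `statistics.stdev` is the sample SD.

```python
import math
import statistics

xs = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]   # made-up collection X
c, k = -3.0, 10.0                     # made-up scale and offset

scaled = [c * x for x in xs]          # c * X
shifted = [x + k for x in xs]         # X + k

# Each flag compares the two sides of one property.
p1 = math.isclose(statistics.mean(scaled), c * statistics.mean(xs))
p2 = math.isclose(statistics.mean(shifted), k + statistics.mean(xs))
p3 = math.isclose(statistics.stdev(scaled), abs(c) * statistics.stdev(xs))
p4 = math.isclose(statistics.stdev(shifted), statistics.stdev(xs))
print(p1, p2, p3, p4)
```

A negative c is used on purpose: it shows why P3 needs abs(c) rather than c.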

Now, hopefully we can agree that if you generate a set of random numbers, then scaling and offsetting them all by the same amounts does not reduce the randomness (randomness here means the numbers are uniformly distributed over the range of generation; if you scale and offset, all you are doing is changing the range, not the shape of the distribution).
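One way to make "changing the range, not the distribution" concrete is to check that every number keeps its relative position between the smallest and largest value after scaling and offsetting. The Python sketch below does that. The helper `relative_positions`, the seed, and the values of c and k are all illustrative assumptions, and it assumes c is positive (a negative c would mirror the positions).

```python
import math
import random

def relative_positions(values):
    """Map each value to its position in [0, 1] between the min and max."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

rng = random.Random(1)                 # fixed seed, chosen arbitrarily
source = [rng.random() for _ in range(1000)]
c, k = 15.0, 100.0                     # hypothetical positive scale and offset
target = [c * s + k for s in source]

# The scaled and offset numbers sit at the same relative positions as before.
same_shape = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(relative_positions(source), relative_positions(target))
)
print(same_shape)
```

Note that this also means the result stays uniformly distributed; the transformation does not turn the numbers into, say, a normal distribution.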

So with this in mind, let TMean and TSD be the target mean and target SD. Then:

  1. Generate N random numbers in the range [0, 1] - S0, ..., SN

  2. Compute SMean := Mean(S0, ..., SN)

  3. Compute SSD := SD(S0, ..., SN)

Now we take a small diversion from a sequence of enumerated steps to ask "what offset and scale do we need to apply to the set of numbers so that we get TMean and TSD, rather than SMean and SSD?".

The amount we need to scale by is mandated by the SD, specifically:

     c := TSD/SSD

If we scale our source numbers by c (which is positive, since both TSD and SSD are, so abs(c) = c) and apply SD then we see:

     SD(c * S0, ..., c * SN) = c * SD(S0, ..., SN) [P3 above]
                             = c * SSD
                             = TSD / SSD * SSD
                             = TSD

i.e. Our scaled input numbers give us the desired SD!

So now we just need to play the same 'game' with the Mean. We have:

     Mean(c * S0, ..., c * SN) = c * Mean(S0, ..., SN)
                               = c * SMean

However we really want a mean of TMean so define:

     k := TMean - c * SMean

Then if we translate our (scaled!) source numbers by k and apply Mean then we see:

     Mean(c * S0 + k, ..., c * SN + k) = c * Mean(S0, ..., SN) + k [P1 and P2 above]
                                       = c * SMean + k
                                       = c * SMean + TMean - c * SMean
                                       = TMean

i.e. Our scaled and offset input numbers give us the desired Mean!

Note that SD is invariant under offsetting (P4), so SD(c * S0 + k, ..., c * SN + k) = SD(c * S0, ..., c * SN) = TSD!
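The derivation can be checked on a small worked example. The Python sketch below uses four made-up source numbers and made-up targets (TMean = 100, TSD = 15), computes c and k as defined above, and then measures the mean and SD of the result.

```python
import statistics

source = [0.1, 0.4, 0.5, 0.9]        # made-up stand-ins for the source numbers
t_mean, t_sd = 100.0, 15.0           # made-up TMean and TSD

s_mean = statistics.mean(source)     # SMean
s_sd = statistics.stdev(source)      # SSD (sample SD)

c = t_sd / s_sd                      # scale, mandated by the SD
k = t_mean - c * s_mean              # offset, mandated by the mean

target = [c * s + k for s in source]
print(statistics.mean(target), statistics.stdev(target))
```

The printed values should match TMean and TSD up to floating-point rounding. The same holds with the population SD (`statistics.pstdev`), as long as one definition of SD is used both for SSD and for the final check.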

We can now return to our sequence of steps:

  4. Compute c := TSD/SSD

  5. Compute k := TMean - c * SMean

  6. Compute the target random numbers, Tn := c * Sn + k

So, assuming my maths above is correct, T0, ..., TN will still be 'random' (for some suitable definition of random), but have a Mean of TMean and an SD of TSD as desired.

In LiveCode Script, the above is something like:

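   function randomNumbers pN, pTMean, pTSD
      -- Steps 1-3: generate the source numbers, then measure their mean and SD.
      -- random(2^31) gives integers rather than values in [0, 1]; the scale
      -- and offset below absorb the difference.
      local tSource
      repeat pN times
         put random(2^31) & comma after tSource
      end repeat
      delete the last char of tSource -- drop the trailing comma

      local tSMean, tSSD
      put average(tSource) into tSMean
      put stdDev(tSource) into tSSD

      -- Steps 4-5: the scale and offset
      local tC, tK
      put pTSD / tSSD into tC
      put pTMean - tC * tSMean into tK

      -- Step 6: scale and offset each source number
      local tTarget
      repeat for each item tS in tSource
         put tC * tS + tK & comma after tTarget
      end repeat
      delete the last char of tTarget -- drop the trailing comma

      return tTarget
   end randomNumbers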
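For comparison, steps 1 to 6 can also be sketched in Python. This is an illustration under assumptions rather than a translation of any existing library routine. The function name `random_numbers` and the optional `rng` parameter are made up here, `random.random()` draws from [0, 1) rather than [0, 1], and the sample SD (`statistics.stdev`) is assumed.

```python
import random
import statistics

def random_numbers(n, t_mean, t_sd, rng=random):
    """Return n uniform random numbers rescaled to mean t_mean and sample SD t_sd."""
    if n < 2:
        raise ValueError("need at least two numbers to compute a sample SD")

    source = [rng.random() for _ in range(n)]   # step 1: source numbers
    s_mean = statistics.mean(source)            # step 2: SMean
    s_sd = statistics.stdev(source)             # step 3: SSD
    if s_sd == 0:
        raise ValueError("source numbers are all equal, cannot rescale")

    c = t_sd / s_sd                             # step 4: scale
    k = t_mean - c * s_mean                     # step 5: offset
    return [c * s + k for s in source]          # step 6: target numbers

# Usage: 50 numbers with mean 100 and SD 15, from a seeded generator.
values = random_numbers(50, 100.0, 15.0, random.Random(0))
print(statistics.mean(values), statistics.stdev(values))
```

Passing a seeded `random.Random` instance makes the output repeatable; leaving `rng` at its default uses the module-level generator.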

Hope this helps!

Mark.

--
Mark Waddingham ~ m...@livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps
