Thanks Duncan,
Indeed I have found a problem in my likelihood, and I'll work on the log
approach again. Regarding your comment: do you think it would make sense
to tackle a data set of a few million observations for parameter
estimation using MCMC? In my case we are talking about Griddy Gibbs
sampling.

One relevant idea is to break the observations into smaller chunks and
update sequentially, taking the posterior from one chunk as the prior
for the next. However, I don't feel confident that this would work,
since it is not the same situation as with conjugate priors, where you
have an analytic form of the posterior.
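
For what it's worth, a minimal sketch of what I have in mind, on a grid
over the mean with sigma assumed known and everything kept on the log
scale (the names and chunk sizes are just for illustration):

# sequential grid updating: each chunk's log-likelihood is added to the
# running log-posterior, which serves as the log-prior for the next chunk
set.seed(1)
obs   <- rnorm(1e5, 8, 3)                 # stand-in for a large data set
sigma <- 3                                # treated as known here
grid  <- seq(7.9, 8.1, length.out = 201)  # grid over the unknown mean
logpost <- rep(0, length(grid))           # flat prior on the log scale

for (chunk in split(obs, ceiling(seq_along(obs) / 1e4))) {
    ll <- sapply(grid, function(m) -0.5 * sum((chunk - m)^2) / sigma^2)
    logpost <- logpost + ll
}
logpost <- logpost - max(logpost)         # rescale before exponentiating
post <- exp(logpost) / sum(exp(logpost))
grid[which.max(post)]                     # should land close to 8

Of course this is exact only because the log-likelihood is additive over
independent chunks; my worry is whether the idea carries over to the
general case.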

Any pointers (papers, books) on large-scale data processing with MCMC,
and Gibbs sampling in particular, would be much appreciated.

Kind regards
Seref


On Tue, Oct 18, 2011 at 11:12 AM, Duncan Murdoch
<murdoch.dun...@gmail.com> wrote:

> On 11-10-18 4:30 AM, Seref Arikan wrote:
>
>> Hi Dan,
>> I've tried the log likelihood, but it reaches zero again if I work
>> with, say, 1000 samples.
>> I need an approach that would scale to quite large sample sizes. Surely I
>> can't be the first one to encounter this problem, and I'm sure I'm missing
>> an option that is embarrassingly obvious.
>>
>
> I think you are calculating the log likelihood incorrectly.  Don't
> calculate the likelihood and take the log; work out the formula for the log
> of the likelihood, and calculate that.  (If the likelihood contains a sum of
> terms, as in a mixture model, this takes some thinking, but it is still
> worthwhile.)
>
> With most models, it is just about impossible to cause the log likelihood
> to underflow if it is calculated carefully.
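>
> For the normal model in the original post, for instance, working out
> the formula gives something like the following sketch (dropping the
> same constant):
>
> joint_log_likelihood <- function(observations, mean, sigma) {
>     # log of sigma^(-n) * exp(-0.5 * sum((x - mean)^2) / sigma^2)
>     -length(observations) * log(sigma) -
>         0.5 * sum((observations - mean)^2) / sigma^2
> }
>
> And for a sum of terms inside the likelihood, the standard log-sum-exp
> trick keeps the computation stable, e.g.
>
> log_sum_exp <- function(logx) {
>     # log(sum(exp(logx))) without intermediate underflow/overflow
>     m <- max(logx)
>     m + log(sum(exp(logx - m)))
> }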
>
> Duncan Murdoch
>
>
>> Regards
>> Seref
>>
>> On Mon, Oct 17, 2011 at 6:15 PM, Nordlund, Dan (DSHS/RDA)
>> <nord...@dshs.wa.gov> wrote:
>>
>>> -----Original Message-----
>>>> From: r-help-boun...@r-project.org
>>>> [mailto:r-help-bounces@r-project.org] On Behalf Of Seref Arikan
>>>> Sent: Monday, October 17, 2011 9:11 AM
>>>> To: r-help@r-project.org
>>>> Subject: [R] Best practices for handling very small numbers?
>>>>
>>>> Greetings
>>>> I have been experimenting with sampling from posterior distributions
>>>> using
>>>> R. Assume that I have the following observations from a normal
>>>> distribution,
>>>> with an unscaled joint likelihood function:
>>>>
>>>> normsamples <- rnorm(1000, 8, 3)
>>>>
>>>> # unnormalized joint likelihood of a N(mean, sigma^2) sample
>>>> joint_likelihood <- function(observations, mean, sigma) {
>>>>     sigma^(-length(observations)) *
>>>>         exp(-0.5 * sum((observations - mean)^2) / sigma^2)
>>>> }
>>>>
>>>> the joint likelihood omits the constant 1/(2*pi)^(n/2), which is
>>>> what I want, because I've been experimenting with some crude
>>>> sampling methods. The problem is, with the samples above, the joint
>>>> likelihood becomes 0 very quickly.
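>>>>
>>>> For example, with the function above (sigma^(-1000) alone is far
>>>> below .Machine$double.xmin, so the product underflows to zero):
>>>>
>>>> joint_likelihood(normsamples, 8, 3)   # returns 0
>>>>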
>>>> I wanted to experiment with tens of thousands of observations, but
>>>> without a transformation that can handle very small values, my
>>>> sampling algorithms would not work.
>>>>
>>>> This is an attempt to use some sampling algorithms for Bayesian
>>>> parameter estimation. I do not want to resort to conjugacy, since I
>>>> am developing this to handle non-conjugate scenarios; I just wanted
>>>> to test it on a conjugate scenario, but I quickly realized that I'm
>>>> in trouble because the likelihood reaches 0 so quickly.
>>>>
>>>> Your feedback would be appreciated. It makes me wonder how JAGS/BUGS
>>>> handles this problem.
>>>>
>>>> Best regards
>>>> Seref
>>>>
>>>>
>>> Maybe you should work with the log-likelihood?
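>>>
>>> In R that could be as simple as, say,
>>>
>>> sum(dnorm(normsamples, mean = 8, sd = 3, log = TRUE))
>>>
>>> since dnorm(..., log = TRUE) returns the log density directly.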
>>>
>>>
>>> Hope this is helpful,
>>>
>>> Dan
>>>
>>> Daniel J. Nordlund
>>> Washington State Department of Social and Health Services
>>> Planning, Performance, and Accountability
>>> Research and Data Analysis Division
>>> Olympia, WA 98504-5204

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
