On Aug 12, 12:26 pm, Brandon <[EMAIL PROTECTED]> wrote:
>
> You are very correct about the Laplace adjustment. However, a more
> precise statement of my overall problem would involve training and
> testing which utilizes bigram probabilities derived in part from the
> Laplace adjustment; as I understand the workflow that I should follow,
> I can't allow myself to be constrained only to bigrams that actually
> exist in training or my overall probability when I run through testing
> will be thrown off to 0 as soon as a test bigram that doesn't exist in
> training is encountered. Hence my desire to find all possible bigrams
> in train (having taken steps to ensure proper set relations between
> train and test).
> The best way I can currently see to do this is with
> my current two-dictionary "caper", and by iterating over foo, not
> bar :)

I can't grok large chunks of the above, especially these troublesome
test bigrams that don't exist in training but which you desire to find
in train(ing?).

However, let's look at the mechanics: Are you now saying that your
original assertion "I am certain that all keys in bar belong to foo as
well" was not quite "precise"? If not, please explain why you think you
need to iterate (slowly) over foo in order to accomplish your stated
task.
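
For what it's worth, here is a minimal sketch of add-one (Laplace) smoothing that never enumerates "all possible bigrams". It is only an illustration under stated assumptions, not a reconstruction of your code: the names train_counts, laplace_prob, contexts, bigrams and vocab_size are hypothetical, the token list is made up, and the vocabulary is assumed to be fixed in advance. The point is that a missing key can simply be read as a count of 0 at lookup time, so an unseen test bigram still gets a nonzero probability without being stored anywhere.

```python
from collections import Counter


def train_counts(tokens):
    """Count bigrams and their left-hand contexts in a list of tokens."""
    # Each adjacent pair (w1, w2) is one observed bigram.
    bigrams = Counter(zip(tokens, tokens[1:]))
    # Every token except the last starts exactly one bigram.
    contexts = Counter(tokens[:-1])
    return contexts, bigrams


def laplace_prob(w1, w2, contexts, bigrams, vocab_size):
    """Add-one estimate of P(w2 | w1).

    A Counter returns 0 for a missing key (and does not insert it),
    so unseen bigrams and unseen contexts need no special handling.
    """
    return (bigrams[(w1, w2)] + 1) / (contexts[w1] + vocab_size)


# Made-up training data, purely for illustration.
tokens = "a b a b a c".split()
contexts, bigrams = train_counts(tokens)
vocab_size = len(set(tokens))

seen = laplace_prob("a", "b", contexts, bigrams, vocab_size)
unseen = laplace_prob("b", "c", contexts, bigrams, vocab_size)
```

With this shape, only the bigrams that actually occur in training are ever stored, and the loop at test time runs over the test bigrams, not over every key of a larger dictionary.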