Yes, a "bigram" in that demo is just two characters, which can be enough
to separate different character sets. -Xiangrui
On Wed, Oct 1, 2014 at 2:54 PM, Liquan Pei wrote:
The program computes hashed bigram frequencies, normalized by the total
number of bigrams, and then filters out zero values. Hashing is an effective
trick for vectorizing features. Take a look at
http://en.wikipedia.org/wiki/Feature_hashing
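
As a rough illustration of that description (not the demo's actual code), here is a minimal Python sketch. The function name hashed_bigram_frequencies, the default of 1000 buckets, and the use of zlib.crc32 as the hash are all assumptions made for this example; the demo may use a different hash and vector size.

```python
import zlib
from collections import Counter


def hashed_bigram_frequencies(text, num_features=1000):
    """Map a string to {bucket index: relative frequency} of its character bigrams.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    # Slide a two-character window over the text to get character bigrams.
    bigrams = [text[i:i + 2] for i in range(len(text) - 1)]
    if not bigrams:
        return {}

    # Hashing trick: map each bigram to one of num_features buckets.
    # zlib.crc32 is chosen here only because it is stable across runs.
    counts = Counter(
        zlib.crc32(b.encode("utf-8")) % num_features for b in bigrams
    )

    # Normalize by the total number of bigrams. Buckets that were never hit
    # are absent from the Counter, so zero values are already filtered out.
    total = len(bigrams)
    return {index: n / total for index, n in counts.items()}


features = hashed_bigram_frequencies("hello world")
```

The result is a sparse vector: only buckets that received at least one bigram are stored, and the stored values sum to 1. Two different bigrams can land in the same bucket, which is the usual trade-off of feature hashing.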
Liquan
On Wed, Oct 1, 2014 at 2:18 PM, Soumya Simanta wrote:
> I'm