Thanks! I know there is a Java implementation of LDA (the MALLET package), which I believe uses collapsed Gibbs sampling, and there are probably several C++ implementations as well; unfortunately I don't know Java or C++, so I can't personally benchmark against those. However, there are also MATLAB and R implementations, and those are two languages I know well enough to run some benchmarks against, so I may do that in the near future.
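For anyone unfamiliar with the approach MALLET takes: collapsed Gibbs sampling for LDA integrates out the topic proportions and topic-word distributions analytically, so the sampler only resamples each token's topic assignment from the count tables. A minimal language-agnostic sketch in Python (the toy corpus and hyperparameters below are my own illustration, not MALLET's defaults):

```python
import random

random.seed(0)

def collapsed_gibbs_lda(docs, V, K, alpha=0.1, beta=0.01, iters=200):
    """Collapsed Gibbs sampling for LDA: theta and phi are integrated out,
    so we only resample each token's topic assignment z from counts."""
    ndk = [[0] * K for _ in docs]          # topic counts per document
    nkw = [[0] * V for _ in range(K)]      # word counts per topic
    nk = [0] * K                           # total tokens per topic
    z = []                                 # topic assignment per token
    for d, doc in enumerate(docs):         # random initialization
        z.append([])
        for w in doc:
            k = random.randrange(K)
            z[d].append(k)
            ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                # remove current assignment
                ndk[d][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                # full conditional: p(z=k | rest) ∝ (ndk+α)(nkw+β)/(nk+Vβ)
                weights = [(ndk[d][t] + alpha) * (nkw[t][w] + beta)
                           / (nk[t] + V * beta) for t in range(K)]
                k = random.choices(range(K), weights=weights)[0]
                z[d][i] = k                # record new assignment
                ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    return z, ndk, nkw

# toy corpus: docs 0-1 draw from words {0,1,2}, docs 2-3 from words {3,4,5}
docs = [[0, 1, 2, 0, 1], [1, 2, 0, 2], [3, 4, 5, 3, 4], [4, 5, 3, 5]]
z, ndk, nkw = collapsed_gibbs_lda(docs, V=6, K=2)
```

The contrast with a variational Bayes package like TopicModelsVB.jl is that Gibbs sampling draws from the posterior stochastically, whereas VB optimizes a deterministic approximation to it.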
On Saturday, July 2, 2016 at 6:30:34 AM UTC-7, Cedric St-Jean wrote:
>
> Impressive work, especially with the documentation! Have you benchmarked
> it against other implementations?
>
> On Saturday, July 2, 2016 at 12:32:13 AM UTC-4, esproff wrote:
>>
>> Hi all!
>>
>> I have just released a new variational Bayes topic modeling package for
>> Julia, which can be found here:
>>
>> https://github.com/esproff/TopicModelsVB.jl
>>
>> The models included are:
>>
>> 1. Latent Dirichlet Allocation (LDA)
>> 2. Filtered Latent Dirichlet Allocation (fLDA)
>> 3. Correlated Topic Model (CTM)
>> 4. Filtered Correlated Topic Model (fCTM)
>> 5. Dynamic Topic Model (DTM)
>> 6. Collaborative Topic Poisson Factorization (CTPF)
>>
>> This is, as far as I can tell, the best open-source topic modeling
>> package to date. It's still a bit rough around the edges, and I suspect
>> a few edge-case bugs remain deep in the belly of one or two of the
>> algorithms, but overall it's polished enough that it needs to be tried
>> out by people besides myself.
>>
>> I'm open to collaborators, and I'm especially interested in adding GPGPU
>> support. Formally speaking, I'm trained as a mathematician, not a
>> computer scientist or software engineer, so if you're an expert in
>> GPGPU I'd be very interested in talking to you about adding this
>> functionality, since Bayesian learning can be *EXTREMELY*
>> computationally intensive. (You can contact me here or at
>> [email protected].)
>>
>> On the other hand, if you're more into the applied math / machine
>> learning side, there are still a number of models to implement, mostly
>> nonparametric versions of the ones I've implemented. However, I should
>> warn you that Bayesian nonparametrics is not for the faint of heart.
>>
>> Julia is a great language, and I hope you all like it as much as I do.
>> Of course the speed is the big seller, but I think its best feature may
>> be the ease with which one can dig down into the internals of the
>> language; considering how high-level the language is, this is truly a
>> masterstroke by the creators.
>>
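For anyone curious how the variational Bayes approach in the announcement differs from Gibbs sampling: instead of drawing samples, mean-field VB iterates closed-form coordinate-ascent updates on variational parameters. A minimal sketch of the per-document E-step for LDA (this is my own illustrative code, not taken from TopicModelsVB.jl; the `digamma` helper is a standard asymptotic approximation, since Python's stdlib lacks one):

```python
import math

def digamma(x):
    """Asymptotic approximation of psi(x), shifting x up for accuracy."""
    r = 0.0
    while x < 6:
        r -= 1.0 / x
        x += 1
    f = 1.0 / (x * x)
    return r + math.log(x) - 0.5 / x - f * (1/12 - f * (1/120 - f / 252))

def lda_doc_estep(words, log_beta, alpha, iters=50):
    """Mean-field VB E-step for one document in LDA: alternately update
    token responsibilities phi and the Dirichlet variational parameter
    gamma, holding the topic-word distributions log_beta fixed."""
    K = len(log_beta)
    gamma = [alpha + len(words) / K] * K   # standard uniform initialization
    phi = [[1.0 / K] * K for _ in words]
    for _ in range(iters):
        # E[log theta_k] under q(theta) = Dirichlet(gamma)
        elog = [digamma(g) for g in gamma]
        for n, w in enumerate(words):
            # phi_nk ∝ exp(E[log theta_k] + log beta_kw), normalized
            logits = [elog[k] + log_beta[k][w] for k in range(K)]
            m = max(logits)                # stabilize the softmax
            expl = [math.exp(x - m) for x in logits]
            s = sum(expl)
            phi[n] = [e / s for e in expl]
        gamma = [alpha + sum(phi[n][k] for n in range(len(words)))
                 for k in range(K)]
    return gamma, phi

# toy model: topic 0 favors word 0, topic 1 favors word 1
log_beta = [[math.log(0.9), math.log(0.1)],
            [math.log(0.1), math.log(0.9)]]
gamma, phi = lda_doc_estep([0, 0, 0, 0], log_beta, alpha=0.1)
```

Each update is deterministic and each has a closed form, which is what makes VB fast relative to sampling, at the cost of only approximating the true posterior.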
