Hi Kevin,

Yes, I saw that.  Slycoder's package uses collapsed Gibbs sampling, which 
may run faster than a standard VB implementation of LDA.  I hope to 
implement a collapsed VB algorithm for LDA soon, which should outperform 
both collapsed Gibbs sampling and standard VB.
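For context, collapsed Gibbs sampling for LDA fits in a few dozen lines. The sketch below is a minimal, illustrative Python/NumPy version of the standard algorithm; it is not the implementation from TopicModels.jl or TopicModelsVB.jl, and all names (`collapsed_gibbs_lda`, `alpha`, `beta`, etc.) are my own:

```python
import numpy as np

def collapsed_gibbs_lda(docs, V, K, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Minimal collapsed Gibbs sampler for LDA.

    docs: list of documents, each a list of integer word ids in [0, V)
    V: vocabulary size, K: number of topics
    alpha, beta: symmetric Dirichlet hyperparameters
    """
    rng = np.random.default_rng(seed)
    ndk = np.zeros((len(docs), K))   # document-topic counts
    nkw = np.zeros((K, V))           # topic-word counts
    nk = np.zeros(K)                 # total words per topic
    # random initial topic assignments, then tally the counts
    z = [rng.integers(K, size=len(doc)) for doc in docs]
    for d, doc in enumerate(docs):
        for n, w in enumerate(doc):
            k = z[d][n]
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                k = z[d][n]          # remove current assignment from counts
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # full conditional p(z_dn = k | everything else),
                # with theta and phi integrated out (hence "collapsed")
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][n] = k          # add back under the new assignment
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    # posterior mean estimate of the topic-word distributions
    phi = (nkw + beta) / (nk[:, None] + V * beta)
    return phi, ndk
```

Collapsed VB replaces the per-token sampling step above with a deterministic update of the token's variational distribution over topics, using the same collapsed counts, which is why it tends to converge faster than either the sampler or standard (uncollapsed) VB.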

On Saturday, July 2, 2016 at 5:22:13 PM UTC-7, Kevin Squire wrote:
>
> TopicModels.jl (https://github.com/slycoder/TopicModels.jl) has an 
> implementation of LDA. 
>
> Cheers,
>    Kevin
>
> On Saturday, July 2, 2016, esproff <[email protected]> wrote:
>
>> Thanks!
>>
>> I know there is a Java implementation of LDA (the MALLET package), which 
>> I believe uses collapsed Gibbs sampling, and there are probably multiple 
>> C++ implementations as well. Unfortunately I don't know Java or C++, so I 
>> can't personally benchmark against those. However, there are also Matlab 
>> and R implementations, two languages I do know well enough to run 
>> benchmarks against, so I may do that in the near future.
>>
>> On Saturday, July 2, 2016 at 6:30:34 AM UTC-7, Cedric St-Jean wrote:
>>>
>>> Impressive work, especially with the documentation! Have you benchmarked 
>>> it against other implementations?
>>>
>>> On Saturday, July 2, 2016 at 12:32:13 AM UTC-4, esproff wrote:
>>>>
>>>> Hi all!
>>>>
>>>> So I have just released a new variational Bayes topic modeling package 
>>>> for Julia, which can be found here:
>>>>
>>>> https://github.com/esproff/TopicModelsVB.jl
>>>>
>>>> The models included are:
>>>>
>>>>    1. Latent Dirichlet Allocation (LDA)
>>>>    2. Filtered Latent Dirichlet Allocation (fLDA)
>>>>    3. Correlated Topic Model (CTM)
>>>>    4. Filtered Correlated Topic Model (fCTM)
>>>>    5. Dynamic Topic Model (DTM)
>>>>    6. Collaborative Topic Poisson Factorization (CTPF)
>>>>
>>>> This is, as far as I can tell, the best open-source topic modeling 
>>>> package to date. It's still a bit rough around the edges, and I suspect 
>>>> a few edge-case bugs remain deep in the belly of one or two of the 
>>>> algorithms, but overall it's polished enough that it's ready to be tried 
>>>> out by people besides myself.
>>>>
>>>> I'm open to collaborators, and I'm especially interested in adding 
>>>> GPGPU support. Formally speaking, I'm trained as a mathematician, not a 
>>>> computer scientist or software engineer, so if you're an expert in GPGPU 
>>>> I'd be very interested in talking to you about adding this 
>>>> functionality, since Bayesian learning can be *extremely* 
>>>> computationally intensive. (You can contact me here or at 
>>>> [email protected].)
>>>>
>>>> On the other hand, if you're more into the applied math / machine 
>>>> learning side, there are still a number of models to implement, mostly 
>>>> non-parametric versions of the ones I've implemented. However, I should 
>>>> warn you that Bayesian nonparametrics is not for the faint of heart.
>>>>
>>>> Julia is a great language, and I hope you all like it as much as I do. 
>>>> Of course, speed is the big seller, but perhaps its best feature is the 
>>>> ease with which one can dig down into the internals of the language. 
>>>> Considering how high-level the language is, this is truly a masterstroke 
>>>> by its creators.
>>>>
>>>