It seems that the Vowpal Wabbit version is most similar to what is in
https://github.com/intel-analytics/TopicModeling/blob/master/src/main/scala/org/apache/spark/mllib/topicModeling/OnlineHDP.scala
Although the Intel version seems to implement the Hierarchical Dirichlet Process
(topics and subtopics) as
What machine learning algorithms are you interested in exploring or using?
Start from there, or better yet from the problem you are trying to solve, and
then the selection may be evident.
On Wednesday, August 5, 2015, praveen S wrote:
> I was wondering when one should go for MLlib or SparkR. What is t
Is Velox NOT open source?
On Saturday, June 20, 2015, Debasish Das wrote:
> Hi,
>
> The demo of end-to-end ML pipeline including the model server component at
> Spark Summit was really cool.
>
> I was wondering if the Model Server component is based upon Velox or it
> uses a completely different
Would the IndexedRDD feature provide what the Lookup RDD does?
I've been using a broadcast variable map for a similar kind of thing. It
probably is within 1GB, but I am interested to know whether the lookup (or
indexed) RDD might be better.
C
On Friday, June 5, 2015, Dmitry Goldenberg wrote:
> Thanks everyon
Would Tachyon be appropriate here?
On Friday, June 5, 2015, Evo Eftimov wrote:
> Oops, @Yiannis, sorry to be a party pooper but the Job Server is for Spark
> Batch Jobs (besides anyone can put something like that in 5 min), while I
> am under the impression that Dmitry is working on Spark Stream
Dani,
Folding in, I believe, refers to setting up your Gibbs sampler (or other
model) with the learned word and document topic proportions as computed by
Spark.
You might look at
https://lists.cs.princeton.edu/pipermail/topic-models/2014-May/002763.html
where Jones suggests summing across columns
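
To make the idea of folding in concrete, here is a minimal sketch under one common reading of the term, which is an assumption on my part: the topic-word distributions from the trained model are held fixed, and only the topic proportions of a new document are estimated. It uses a few EM iterations rather than a Gibbs sampler, and it is not what Spark or the linked post does. The names fold_in, doc_counts and topic_word are hypothetical.

```python
def fold_in(doc_counts, topic_word, iterations=50):
    """Estimate topic proportions for one new document, keeping topic_word fixed.

    doc_counts: {word_id: count} for the new document (hypothetical input format).
    topic_word: topic_word[k][w] = P(word w | topic k), taken from a trained model.
    """
    num_topics = len(topic_word)
    theta = [1.0 / num_topics] * num_topics  # start from uniform proportions
    for _ in range(iterations):
        new_theta = [0.0] * num_topics
        for word, count in doc_counts.items():
            # E-step: responsibility of each topic for this word.
            weights = [theta[k] * topic_word[k][word] for k in range(num_topics)]
            norm = sum(weights)
            if norm == 0.0:
                continue  # word has zero probability under every topic
            for k in range(num_topics):
                new_theta[k] += count * weights[k] / norm
        total = sum(new_theta)
        if total == 0.0:
            break  # nothing to update from; keep the previous estimate
        # M-step: renormalise the expected counts into proportions.
        theta = [value / total for value in new_theta]
    return theta


# Toy model: topic 0 favours word 0, topic 1 favours word 1.
toy_topic_word = [[0.9, 0.1], [0.1, 0.9]]
toy_theta = fold_in({0: 9, 1: 1}, toy_topic_word)
```

In the toy call, the document is dominated by word 0, so the estimate puts more weight on topic 0 than on topic 1.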
Heszak,
I have only glanced at it, but you should be able to incorporate tokens
approximating n-grams yourself, say by using the Lucene
ShingleAnalyzerWrapper API:
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/shingle/ShingleAnalyzerWrapper.html
You might also take a
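
As a rough illustration of what "tokens approximating n-grams" means, the sketch below builds word shingles in plain Python. It is not the Lucene API and makes no claim to match ShingleAnalyzerWrapper's exact output. The function name shingles and its parameters are hypothetical.

```python
def shingles(tokens, min_size=2, max_size=2, sep=" "):
    """Return word n-grams ("shingles") of every size from min_size to max_size."""
    if min_size < 1 or max_size < min_size:
        raise ValueError("need 1 <= min_size <= max_size")
    out = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            end = start + size
            if end <= len(tokens):
                out.append(sep.join(tokens[start:end]))
    return out


tokens = "please divide this sentence".split()
# Keep the original unigrams and append bigram and trigram tokens, so a
# bag-of-words model sees short phrases as extra vocabulary entries.
features = tokens + shingles(tokens, min_size=2, max_size=3)
```

The resulting features list can then be fed to whatever vocabulary or term-count step precedes the topic model.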
Yes,
The case is convincing for PMML with Oryx. I will also investigate
parameter server.
Cheers,
Charles
On Tuesday, November 18, 2014, Sean Owen wrote:
> I'm just using PMML. I haven't hit any limitation of its
> expressiveness, for the model types it supports. I don't think there
> is a point
Manish and others,
A follow-up question on my mind is whether there are protobuf (or other
binary format) frameworks in the vein of PMML. Perhaps scientific data
storage frameworks like NetCDF and ROOT are possible also.
I like the comprehensiveness of PMML, but as you mention, the complexity of
managem
Looking for something like scikit-learn's grid search module.
C
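
For context, the kind of grid search meant here evaluates every combination of candidate parameter values and keeps the best-scoring one. The sketch below is plain Python, not scikit-learn or Spark code. The helper grid_search, the scoring function toy_score, and the parameter names rank and reg are all hypothetical stand-ins for "train a model with these settings and score it".

```python
from itertools import product


def grid_search(score_fn, param_grid):
    """Score every parameter combination; return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(rank, reg):
    # Hypothetical score that peaks at rank=20, reg=0.1; a real one would
    # train a model and return a validation metric.
    return -((rank - 20) ** 2) - 100 * (reg - 0.1) ** 2


best_params, best_score = grid_search(
    toy_score, {"rank": [10, 20, 30], "reg": [0.01, 0.1, 1.0]}
)
```

The cost grows with the product of the grid sizes (nine evaluations here), which is why a distributed version is attractive.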
While I can't definitively speak to MLlib online learning,
I'm sure you're evaluating Vowpal Wabbit, for which there have been some Storm
integrations contributed.
Also you might look at FACTORIE, http://factorie.cs.umass.edu,
which at least provides an online LDA.
C
On Thursday, June 19, 20