I'm building an LDA Pipeline, currently with four stages: Tokenizer,
StopWordsRemover, CountVectorizer, and LDA. I would like to add more steps,
for example stemming and lemmatization, and also combined 1-grams and
2-grams (which I believe the default NGram class does not support, since it
produces only a single fixed n). Is there a way to add such steps? In
sklearn, you can create classes with fit() and transform() methods, and that
is enough. Is the same true in Spark ML (or is there something similar)?
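To make the 1-gram-plus-2-gram idea concrete, here is a minimal framework-independent sketch of the token-level logic. The function name is an assumption, and wiring it into a pipeline stage (e.g. inside a custom transform()) is left as the open question above; this only shows the combined n-gram output the stock single-n NGram stage doesn't produce in one pass.

```python
def unigrams_and_bigrams(tokens):
    """Sketch (assumed helper): return the original tokens plus
    space-joined bigrams, so one column carries both 1-grams and 2-grams."""
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return list(tokens) + bigrams

print(unigrams_and_bigrams(["spark", "ml", "pipeline"]))
# → ['spark', 'ml', 'pipeline', 'spark ml', 'ml pipeline']
```

A custom pipeline stage wrapping this would take the tokenized column as input and emit the combined list as its output column.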



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-add-custom-steps-to-Pipeline-models-tp27522.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
