PR
> for it.
> If anyone can take a look and suggest changes, it would be much
> appreciated.
>
> Thank you.
>
>
> 2017-11-15 1:11 GMT+00:00 Bago Amirbekian :
>
>> There is a known issue with VectorAssembler which causes it to fail in
>> streaming if any of
Since we are on Spark 2.2, I backported and fixed it. Here is the diff
against
https://github.com/apache/spark/blob/73fe1d8087cfc2d59ac5b9af48b4cf5f5b86f920/mllib/src/main/scala/org/apache/spark/ml/feature/VectorSizeHint.scala
24c24
< import org.apache.spark.ml.param.{Param, ParamMap, P
Bago,
I am finally able to create a case that fails consistently. I think the issue
is caused by the VectorAssembler in the model. In the new code I have two
features (one text, one numeric), which I have to run through a VectorAssembler
before passing them to LogisticRegression. Code and test data are below.
import
Bago,
The code I wrote does not reproduce the issue. In our case, the ML pipeline
is built from a UI in a particular way, so that a user can create a pipeline
behind the scenes using drag and drop. I have yet to dig deeper and recreate
the same thing as standalone code. Meanwhile I am sharing
Sure. I will get one over the weekend.
I have built an ML pipeline model on static Twitter data for sentiment
analysis. When I apply the model to a structured stream, it always throws
"Queries with streaming sources must be executed with writeStream.start()".
The model doesn't contain any of the documented "unsupported"
operations.
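
For readers following the thread, here is a rough sketch of the kind of setup being discussed: a pipeline fitted on static data, with a text feature and a numeric feature combined by VectorAssembler, then applied to a streaming DataFrame. It is not the poster's code. The column names (`text`, `num`, `label`), the JSON paths, the feature count, and the choice of Tokenizer and HashingTF are all assumptions made for illustration. VectorSizeHint ships with Spark 2.3 and later; on 2.2 it would have to be the backported class mentioned elsewhere in this thread. Whether the hint is actually needed depends on whether the upstream stage already records the vector size in the column metadata.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.{HashingTF, Tokenizer, VectorAssembler, VectorSizeHint}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{DoubleType, StringType, StructType}

object StreamingPipelineSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("streaming-pipeline-sketch")
      .master("local[2]")
      .getOrCreate()

    // Hypothetical schema: one text feature, one numeric feature, one label.
    val schema = new StructType()
      .add("text", StringType)
      .add("num", DoubleType)
      .add("label", DoubleType)

    // Static training data (hypothetical path).
    val training = spark.read.schema(schema).json("/tmp/sketch/train")

    val numTextFeatures = 1000 // assumed size of the text feature vector

    val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
    val hashingTF = new HashingTF()
      .setInputCol("words")
      .setOutputCol("textFeatures")
      .setNumFeatures(numTextFeatures)

    // Declares the vector size in the column metadata so that VectorAssembler
    // does not have to inspect a row of the (streaming) dataset to find it.
    val sizeHint = new VectorSizeHint()
      .setInputCol("textFeatures")
      .setSize(numTextFeatures)

    val assembler = new VectorAssembler()
      .setInputCols(Array("textFeatures", "num"))
      .setOutputCol("features")

    val lr = new LogisticRegression()
      .setFeaturesCol("features")
      .setLabelCol("label")

    val model = new Pipeline()
      .setStages(Array(tokenizer, hashingTF, sizeHint, assembler, lr))
      .fit(training)

    // Streaming input with the same schema (hypothetical path).
    val stream = spark.readStream.schema(schema).json("/tmp/sketch/stream")

    // transform() only builds the plan; nothing runs until start() is called.
    val query = model.transform(stream)
      .select("text", "prediction")
      .writeStream
      .format("console")
      .outputMode("append")
      .start()

    query.awaitTermination()
  }
}
```

The error quoted above is what Spark raises when an eager action such as `first()` is run on a streaming DataFrame, so any stage that looks at a row inside `transform()` would trigger it even though the pipeline contains no documented unsupported operation.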