Hi Guys,
We are fitting a logistic regression model using the following code.
val chiSqSelector = new ChiSqSelector()
  .setNumTopFeatures(10)
  .setFeaturesCol("VECTOR_1")
  .setLabelCol("TARGET")
  .setOutputCol("selectedFeatures")

val assembler = new VectorAssembler()
  .setInputCols(Array("FEATURES", "selectedFeatures"))
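The original message is cut off here. Assuming the intent is to feed the assembled vector into a LogisticRegression stage, a minimal sketch of the full pipeline might look like this (the Pipeline wiring, the `assembledFeatures` output column, and the `trainingDf` DataFrame are assumptions; only the column names VECTOR_1, FEATURES, and TARGET come from the snippet):

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.{ChiSqSelector, VectorAssembler}

// Select the 10 most predictive features from VECTOR_1 (chi-squared test).
val selector = new ChiSqSelector()
  .setNumTopFeatures(10)
  .setFeaturesCol("VECTOR_1")
  .setLabelCol("TARGET")
  .setOutputCol("selectedFeatures")

// Combine the original FEATURES vector with the selected features.
val assembler = new VectorAssembler()
  .setInputCols(Array("FEATURES", "selectedFeatures"))
  .setOutputCol("assembledFeatures")

val lr = new LogisticRegression()
  .setFeaturesCol("assembledFeatures")
  .setLabelCol("TARGET")

// Chain the stages so the selector runs before the assembler and the model.
// trainingDf (assumed) is a DataFrame with VECTOR_1, FEATURES, and TARGET columns.
val model = new Pipeline()
  .setStages(Array(selector, assembler, lr))
  .fit(trainingDf)
```

A Pipeline keeps the selector's fitted feature indices bundled with the model, so the same transformation is applied consistently at prediction time.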
Hello,
I am working with Spark SQL to query a Hive managed table (in ORC format).
My data is organized by partitions, and I have been asked to set indexes for
every 50,000 rows by setting ('orc.row.index.stride'='5').
Let's say that after evaluating a partition there are around 50 files in which
the data is
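For reference, `orc.row.index.stride` is the number of rows covered by each row-group index entry (ORC's default is 10,000), so a value of '5' indexes every 5 rows; indexing every 50,000 rows would need '50000'. It is normally set as a table property at creation time; a sketch, with a hypothetical table and placeholder columns:

```scala
// Hypothetical table 'events'; the stride value here matches the stated
// goal of one index entry per 50,000 rows.
spark.sql("""
  CREATE TABLE events (id BIGINT, payload STRING)
  PARTITIONED BY (dt STRING)
  STORED AS ORC
  TBLPROPERTIES ('orc.row.index.stride' = '50000')
""")
```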
Thanks a lot! You are right!
--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
Strangely, this is working only for a very small dataset of rows; for very
large datasets the partitioning apparently does not work. Is there a limit
to the number of columns or rows when repartitioning by multiple
columns?
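There is no documented row or column limit, but note that `repartition` hashes the column combination into a fixed number of partitions (defaulting to `spark.sql.shuffle.partitions`, 200), which can look like "not working" on large data. A sketch, with placeholder DataFrame and column names:

```scala
import org.apache.spark.sql.functions.col

// Hash-partition df (placeholder) on the pair (colA, colB) into 400 buckets;
// without the explicit count, spark.sql.shuffle.partitions is used.
val repartitioned = df.repartition(400, col("colA"), col("colB"))

// To partition the data on disk by multiple columns, partitionBy on write
// creates one directory per distinct (colA, colB) value pair.
df.write.partitionBy("colA", "colB").orc("/tmp/output")
```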
regards,
Imran
On Wed, Oct 18, 2017 at 11:00 AM, Imran Rajjad wrote: