I am on CentOS 7 and I use Spark 2.3.0. I have posted my code below. Logistic regression took 85 minutes and linear regression took 127 seconds…
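For anyone who wants to reproduce timings like these, here is a minimal sketch of one way to take them. The helper `timed` is hypothetical (it is not part of Spark or of the code in this thread), and the `sum` call is only a stand-in workload. In the Spark job the equivalent call would be something like `timed(lr.fit, trainData)`.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload so the sketch runs without Spark.
result, seconds = timed(sum, range(1000))
print("call took %.6f s" % seconds)
```

Timing only the `fit` call, rather than the whole script, keeps the CSV read and schema inference out of the comparison between the two models.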
My dataset, as I said, is 128 MB and contains 1000 features and ~100 classes.

import time

from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

# SparkSession
ss = SparkSession.builder.getOrCreate()

start = time.time()

# Read data (`file` is the path to the CSV, defined elsewhere)
trainData = ss.read.format("csv").option("inferSchema", "true").load(file)

# Calculate features: every column except the first (_c0, the label)
assembler = VectorAssembler(inputCols=trainData.columns[1:], outputCol="features")
trainData = assembler.transform(trainData)

# Drop columns: keep only _c0 and features
dropColumns = trainData.columns
dropColumns = [e for e in dropColumns if e not in ('_c0', 'features')]
trainData = trainData.drop(*dropColumns)

# Rename column from _c0 to label
trainData = trainData.withColumnRenamed("_c0", "label")

# Logistic regression
lr = LogisticRegression(maxIter=500, regParam=0.3, elasticNetParam=0.8)
lrModel = lr.fit(trainData)

# Output coefficients
print("Coefficients: " + str(lrModel.coefficientMatrix))

- Thodoris

> On 27 Apr 2018, at 22:50, Irving Duran <irving.du...@gmail.com> wrote:
>
> Are you reformatting the data correctly for logistic (meaning 0 & 1's) before
> modeling? What are OS and spark version you using?
>
> Thank You,
>
> Irving Duran
>
>
> On Fri, Apr 27, 2018 at 2:34 PM Thodoris Zois <z...@ics.forth.gr> wrote:
> Hello,
>
> I am running an experiment to test logistic and linear regression on spark
> using MLlib.
>
> My dataset is only 128MB and something weird happens. Linear regression takes
> about 127 seconds either with 1 or 500 iterations. On the other hand,
> logistic regression most of the times does not manage to finish either with 1
> iteration. I usually get memory heap error.
>
> In both cases I use the default cores and memory for driver and I spawn 1
> executor with 1 core and 2GBs of memory.
>
> Except that, I get a warning about NativeBLAS. I searched in the Internet and
> I found that I have to install libgfortran. Even if I did it the warning
> remains.
>
> Any ideas for the above?
>
> Thank you,
> - Thodoris