> algorithms with the ability to cache more data.
>
> Sincerely,
>
> DB Tsai
> ---
> Blog: https://www.dbtsai.com
>
>
> On Tue, Mar 3, 2015 at 2:27 PM, Gustavo Enrique Salazar Torres
> wrote:
> > Yeah, I can call count before that and it works.
> occurring within LBFGS. With the
> given stack trace, I'm not sure what part of LBFGS it's happening in.
>
> On Tue, Mar 3, 2015 at 2:27 PM, Gustavo Enrique Salazar Torres <
> gsala...@ime.usp.br> wrote:
>
>> Yeah, I can call count before that and it works.
> looks like it might be
> happening before the data even gets to LBFGS. (Perhaps the outer join
> you're trying to do is making the dataset size explode a bit.) Are you
> able to call count() (or any RDD action) on the data before you pass it to
> LBFGS?
>
> On Tue, Mar 3, 201
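A minimal version of the count() check suggested above (a sketch only;
the RDD name trainingData is illustrative, not from the original code):

    // Force materialization before handing the data to LBFGS; if the
    // outer join is what blows up, the failure surfaces here instead.
    trainingData.cache()
    val n = trainingData.count()
    println(s"training examples: $n")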
I will try.
I will let you know.
Thanks
On Tue, Mar 3, 2015 at 3:25 AM, Akhil Das
wrote:
> Can you try increasing your driver memory, reducing the executors and
> increasing the executor memory?
>
> Thanks
> Best Regards
>
> On Tue, Mar 3, 2015 at 10:09 AM, Gustavo Enrique Salazar T
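A sketch of the memory changes suggested above (the values are
placeholders, not from the thread; driver memory usually has to be set
at launch, e.g. spark-submit --driver-memory 8g, since the driver JVM is
already running by the time SparkConf is read):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.executor.memory", "6g")     // more memory per executor
      .set("spark.executor.instances", "4")   // fewer, larger executors (YARN)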
Hi Sam:
Shouldn't you define the table schema? I had the same problem in Scala and
then I solved it by defining the schema. I did this:
sqlContext.applySchema(dataRDD, tableSchema).registerTempTable(tableName)
Hope it helps.
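For reference, a fuller version of that pattern (Spark 1.2-era API; the
field names and types below are made up, and dataRDD has to be an
RDD[Row] whose values line up with the schema):

    import org.apache.spark.sql._

    val tableSchema = StructType(Seq(
      StructField("id", StringType, nullable = true),
      StructField("amount", DoubleType, nullable = true)))

    sqlContext.applySchema(dataRDD, tableSchema).registerTempTable(tableName)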
On Mon, Jan 5, 2015 at 7:01 PM, Sam Flint wrote:
> Below is the code th
Hi there:
I'm using LBFGS optimizer to train a logistic regression model. The code I
implemented follows the pattern shown in
https://spark.apache.org/docs/1.2.0/mllib-linear-methods.html but training
data is obtained from a Spark SQL RDD.
The problem I'm having is that LBFGS tries to count the e
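For context, the runLBFGS pattern from that docs page (a sketch; the
LibSVM input stands in for the Spark SQL RDD, and the hyperparameters
are the guide's, not the original code's). Note that, as far as I can
tell, runLBFGS counts its input internally to get the number of
examples, which would explain a count happening inside the optimizer:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.optimization.{LBFGS, LogisticGradient, SquaredL2Updater}
    import org.apache.spark.mllib.util.MLUtils

    val raw = MLUtils.loadLibSVMFile(sc, "data/sample_libsvm_data.txt")
    val numFeatures = raw.take(1)(0).features.size
    // (label, features-with-bias) pairs, cached across the iterations.
    val training = raw.map(x => (x.label, MLUtils.appendBias(x.features))).cache()

    val initialWeights = Vectors.dense(new Array[Double](numFeatures + 1))
    val (weights, loss) = LBFGS.runLBFGS(
      training,
      new LogisticGradient(),
      new SquaredL2Updater(),
      10,    // numCorrections
      1e-4,  // convergenceTol
      20,    // maxNumIterations
      0.1,   // regParam
      initialWeights)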
Hi there:
I have this dataset (about 12G) which I need to sort by key.
I used the sortByKey method but when I try to save the file to disk (HDFS
in this case), it seems that some tasks time out because they have
too much data to save and it doesn't fit in memory.
I say this because before the
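One thing that sometimes helps here (a sketch, not from the thread; the
RDD name and partition count are made up): spread the sort over more
partitions so each save task handles a smaller slice:

    // pairs: RDD[(K, V)]. More output partitions mean less data per task.
    val sorted = pairs.sortByKey(ascending = true, numPartitions = 400)
    sorted.saveAsTextFile("hdfs:///path/to/output")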