I have some suggestions you may try:
1) For the input RDD, use the persist method; this may save a lot of running time.
2) From the UI you can see that the cluster spends much of its time in the shuffle stage; this can be adjusted through configuration parameters such as
"spark.shuffle.memoryFraction" and "spark.memory.fraction" (see the sketch below).
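A minimal Scala sketch of both suggestions (the fraction values and the input path are only illustrative assumptions, not tuned settings):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Illustrative values only; tune them against what the UI shows for your job.
val conf = new SparkConf()
  .setAppName("example")                              // hypothetical app name
  .set("spark.shuffle.memoryFraction", "0.4")         // more room for shuffle buffers
  .set("spark.storage.memoryFraction", "0.4")         // correspondingly less room for cached blocks
val sc = new SparkContext(conf)

// Persist the input RDD so it is not recomputed by every later action or iteration.
val input = sc.textFile("hdfs:///path/to/input")      // hypothetical path
input.persist(StorageLevel.MEMORY_AND_DISK)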
Good luck.
Hi:
There is a small error in the source code of LDA.scala at line 180, as follows:
def setBeta(beta: Double): this.type = setBeta(beta)
which causes a "java.lang.StackOverflowError". It's easy to see the error: the setter calls itself instead of setting the value.
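For reference, a hedged sketch of why this overflows the stack and what the intended delegation presumably looks like (assuming setBeta is meant to be an alias for the topic-word concentration, the way setAlpha aliases the document-topic concentration):

// Current line 180: the setter calls itself, so every call recurses until the stack overflows.
def setBeta(beta: Double): this.type = setBeta(beta)

// Presumed fix (an assumption, not the committed patch): delegate to the real setter.
def setBeta(beta: Double): this.type = setTopicConcentration(beta)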
I am not sure this can help you. I have 57 million ratings, about 4 million users, and 4k items. I used 7-14 total executor cores and 13g of executor memory; the cluster has 4 nodes, each with 4 cores and at most 16g of memory.
I found that the following setting may help avoid this problem:
conf.set("spark.shuffle.memoryFraction","0.
I got the key point. The problem is in sc.sequenceFile: from the API description, "the RDD will create many references to the same object", so I revised the code from "sessions.getBytes" to "sessions.getBytes.clone", and it seems to work.
Thanks.
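A minimal sketch of that workaround, assuming the data is read with sc.sequenceFile as (Text, BytesWritable) pairs (the path and the key/value types here are assumptions for illustration):

import org.apache.hadoop.io.{BytesWritable, Text}

// sc.sequenceFile reuses the same Writable instances for every record in a partition,
// so copy the bytes out of the reused buffer before caching or collecting them.
val raw = sc.sequenceFile("/spark/sessions", classOf[Text], classOf[BytesWritable])   // hypothetical path
val safe = raw.map { case (key, sessions) =>
  (key.toString, sessions.getBytes.clone())   // clone() detaches the array from the reused buffer
}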
Hi
Recently I have had some problems with RDD behavior, concerning the "RDD.first" and "RDD.toArray" methods when the RDD has only one element. I get different results from the different methods on a one-element RDD, where I should get the same result. I will give more detail after the code.
You can try to decrease the rank value.
Hi
Recently I have had some problems with RDD behavior, concerning the "RDD.first" and "RDD.toArray" methods when the RDD has only one element. I can't get the correct element from the RDD. I will give more detail after the code.
My code was as follows:
//get an rdd with just one row RDD[(Long,A
I think you can try to set a lower spark.storage.memoryFraction, for example 0.4:
conf.set("spark.storage.memoryFraction","0.4") //default 0.6
You should supply more information about your input data.
For example, I generate an IndexedRowMatrix from data in the ALS algorithm's input format; my code looks like this:
val inputData = sc.textFile(fname).map { line =>
  val parts = line.trim.split(' ')
  (parts(0).toLong, parts(1).toInt, parts(2).
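The last line of the snippet is truncated in the archive; a self-contained sketch of the same idea, completed under the assumption that each line is "row column value" and that the number of columns is known up front:

import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.{IndexedRow, IndexedRowMatrix}

// Assumed format: "rowIndex colIndex value" per line, as in the ALS input described above.
val triples = sc.textFile(fname).map { line =>
  val parts = line.trim.split(' ')
  (parts(0).toLong, (parts(1).toInt, parts(2).toDouble))
}

val numCols = 3   // assumption: the number of columns is known
val rows = triples.groupByKey().map { case (rowIdx, cols) =>
  val (indices, values) = cols.toSeq.sortBy(_._1).unzip
  IndexedRow(rowIdx, Vectors.sparse(numCols, indices.toArray, values.toArray))
}
val matrix = new IndexedRowMatrix(rows)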
Hi
Recently I wanted to save a big RDD[(k,v)] in the form of index and data, so I decided to use a Hadoop MapFile. I tried some examples like this one: https://gist.github.com/airawat/6538748
The code runs well and generates an index and a data file. I can use the command
"hadoop fs -text /spark/out2
Thanks. After adding the line
spark.io.compression.codec org.apache.spark.io.LZ4CompressionCodec
to spark-defaults.conf, it runs well.
Yes, I use standalone mode. I have set "spark.io.compression.codec" with the code:
conf.set("spark.io.compression.codec","org.apache.spark.io.LZ4CompressionCodec")
It seems to have no influence on saveAsSequenceFile, which still uses snappy compression internally. Thanks.
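For what it's worth, spark.io.compression.codec only governs Spark's internal data (shuffle, broadcast, cached blocks); the compression of the sequence file itself comes from the Hadoop output side. If the goal is to avoid snappy on the output file, saveAsSequenceFile also accepts an explicit Hadoop codec; a hedged sketch (GzipCodec, the path, and the sample data are just examples):

import org.apache.hadoop.io.compress.GzipCodec
import org.apache.spark.SparkContext._

val rdd = sc.parallelize(Seq(("k1", "v1"), ("k2", "v2")))   // placeholder data
// Pass the codec for the sequence file explicitly instead of relying on the cluster default.
rdd.saveAsSequenceFile("/spark/out2", Some(classOf[GzipCodec]))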
Here is the error log; I abstract it as follows:
INFO [binaryTest---main]: before first
WARN [org.apache.spark.scheduler.TaskSetManager---Result resolver
thread-0]: Lost task 0.0 in stage 0.0 (TID 0, spark-dev136):
org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] null
org.xeri
Hi:
After updating Spark to version 1.1.0, I experienced a snappy error, which was posted here:
http://apache-spark-user-list.1001560.n3.nabble.com/Update-gcc-version-Still-snappy-error-tt15137.html
I avoided this problem with the code:
conf.set("spark.io.compression.codec","org.apache.spark.io.LZ4C
Hi:
I want to use SVD in my work. I tried some examples and have some confusions. The input is the 4*3 matrix as follows:
2 0 0
0 3 2
0 3 1
2 0 3
My input text file, which corresponds to the matrix in "row column value" form, is as follows:
0 0 2
1 1 3
1 2 2
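A hedged sketch of running the SVD on coordinate-format input like this with MLlib (the file name and the choice of k are assumptions for illustration):

import org.apache.spark.mllib.linalg.distributed.{CoordinateMatrix, MatrixEntry}

// Each input line is "rowIndex colIndex value", matching the listing above.
val entries = sc.textFile("svd_input.txt").map { line =>   // hypothetical file name
  val parts = line.trim.split(' ')
  MatrixEntry(parts(0).toLong, parts(1).toLong, parts(2).toDouble)
}
val mat = new CoordinateMatrix(entries).toRowMatrix()

// Keep at most 3 singular values (the matrix has 3 columns) and also compute U.
val svd = mat.computeSVD(3, computeU = true)
println(svd.s)   // the singular values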
I updated the Spark version from 1.0.2 to 1.1.0 and experienced a snappy version issue with the new Spark 1.1.0. After updating the glibc version, another issue occurred. I abstract the log as follows:
14/09/25 11:29:18 WARN [org.apache.hadoop.util.NativeCodeLoader---main]:
Unable to load native-hadoo