> Try comparing the output of rdd.collect and
> rdd.foreach(println)
>
> Thanks
> Best Regards
>
> On Wed, Sep 17, 2014 at 12:26 PM, vasiliy <zadonskiyd@> wrote:
>
>> it also appears in streaming hdfs fileStream
Full code example:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.serializer.KryoSerializer

def main(args: Array[String]) {
  val conf = new SparkConf().setAppName("ErrorExample").setMaster("local[8]")
    .set("spark.serializer", classOf[KryoSerializer].getName)
  val sc = new SparkContext(conf)
  val rdd = sc.hadoopFile(
    "hdfs://./user.avro",
    classOf[org.apache.avro.mapred.AvroInputFormat[MyAvroRecord]], // MyAvroRecord = the Avro record class (name truncated in the original)
    classOf[org.apache.avro.mapred.AvroWrapper[MyAvroRecord]],
    classOf[org.apache.hadoop.io.NullWritable])
  rdd.collect().foreach(println) // prints the last record over and over
  rdd.foreach(println)           // prints the expected records on the executors
}
It also appears in streaming HDFS fileStream.
Hello. I have a hadoopFile RDD and I tried to collect items to the driver
program, but it returns an array of identical records (all equal to the last
record of my file). My code is like this:

val rdd = sc.hadoopFile(
  "hdfs:///data.avro",
  classOf[org.apache.avro.mapred.AvroInputFormat[MyAvroRecord]],
  classOf[org.apache.avro.mapred.AvroWrapper[MyAvroRecord]],
  classOf[org.apache.hadoop.io.NullWritable])
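A likely cause, for what it's worth: Hadoop RecordReaders reuse a single
mutable object for every record, so rdd.collect ships back many references to
the one instance that ends up holding the last record. A minimal sketch of the
usual workaround, copying each datum out of the reused wrapper before
collecting (it assumes MyAvroRecord, the placeholder above, is an
Avro-generated specific record class):

val copied = rdd.map { case (wrapper, _) =>
  // Build a fresh record from the reused wrapper's datum (a deep copy),
  // so each element collected on the driver is a distinct object.
  MyAvroRecord.newBuilder(wrapper.datum()).build()
}
copied.collect().foreach(println) // now prints distinct records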
It works, thanks.
When you get a stream from ssc.fileStream(), Spark only processes files whose
timestamp is greater than the current timestamp, so the data already in HDFS
should not be processed again. You may have another problem, though: Spark
will not process files that were moved into your HDFS folder between restarts
of your application.
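To illustrate the timestamp behaviour described above, a minimal sketch (the
batch interval and the hdfs:///incoming directory are made-up values):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("FileStreamExample").setMaster("local[2]")
val ssc = new StreamingContext(conf, Seconds(10))
// Only files whose modification time is newer than the stream's start time
// get picked up; files that landed in the directory while the application
// was down are silently skipped.
val lines = ssc.textFileStream("hdfs:///incoming")
lines.print()
ssc.start()
ssc.awaitTermination()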
Hi, I have a question about the Spark SQL Thrift JDBC server.
Is there a best practice for Spark SQL deployment? If I understand right, the
script

./sbin/start-thriftserver.sh

starts the Thrift JDBC server in local mode. Are there script options for
running this server in yarn-cluster mode?
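For what it's worth, the docs say start-thriftserver.sh accepts the same
command-line options as bin/spark-submit (plus a --hiveconf option), so
pointing it at YARN should look roughly like the line below; whether the
long-running server can be driven in yarn-cluster mode is exactly the open
question:

./sbin/start-thriftserver.sh --master yarn-client --hiveconf hive.server2.thrift.port=10000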