Hello
I believe I followed the instructions here to get Spark to work on Windows.
The article refers to Win7, but it will work for Win10 as well:
http://nishutayaltech.blogspot.co.uk/2015/04/how-to-run-apache-spark-on-windows7-in.html
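(For reference, a minimal sketch of that setup, assuming winutils.exe sits under D:\winutils\bin; the property name is the usual Hadoop one:)

// Must run before the first SparkContext/SparkSession is created,
// so Hadoop can locate winutils.exe on Windows.
System.setProperty("hadoop.home.dir", "D:\\winutils")

import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("windows-smoke-test")
  .getOrCreate()

// Then, from a cmd prompt, make the Hive scratch dir writable:
//   D:\winutils\bin\winutils.exe chmod 777 D:\tmp\hive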
Jagat posted a similar link on winutils... I believe it would
Hey Marco/Jagat,
As I informed you earlier, I've already done those basic checks and
permission changes,
e.g. D:\winutils\bin\winutils.exe chmod 777 D:\tmp\hive, but to no avail. It
still throws the same error. In the first place, I do not understand,
without any manual change, how did t
Better to use coalesce instead of repartition.
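For example (a minimal sketch; the output path is illustrative):

// coalesce(1) merges partitions without a full shuffle, whereas
// repartition(1) always shuffles the whole dataset.
counts.coalesce(1).saveAsTextFile("hdfs://master:8020/user/abc/single")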
On Fri, Oct 20, 2017 at 9:47 PM, Marco Mistroni wrote:
> Use counts.repartition(1).save..
> Hth
>
>
> On Oct 20, 2017 3:01 PM, "Uğur Sopaoğlu" wrote:
>
> Actually, when I run the following code,
>
> val textFile = sc.textFile("Sample.txt")
> val counts = textFile.flatMap(line => line.split(" "))
Do you have the winutils binary relevant for your system?
This SO post has related information:
https://stackoverflow.com/questions/34196302/the-root-scratch-dir-tmp-hive-on-hdfs-should-be-writable-current-permissions
On 21 October 2017 at 03:16, Marco Mistroni wrote:
> Did u build spark or download the zip?
Right, that makes sense and I understood that.
The thing I'm wondering about (and I think the answer is 'no' at this
stage): when the optimizer is running and pushing predicates down, does it
take into account indexing and other storage layer strategies in
determining which predicates are processed?
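(For what it's worth, you can at least see which predicates reach the data
source by inspecting the physical plan; a minimal sketch, with a made-up
Parquet path and column:)

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
val people = spark.read.parquet("hdfs://master:8020/data/people")

// explain(true) prints the plans; predicates handed to the source show up
// under "PushedFilters: [...]" in the physical plan.
people.filter("age > 30").explain(true)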
Use counts.repartition(1).save..
Hth
On Oct 20, 2017 3:01 PM, "Uğur Sopaoğlu" wrote:
Actually, when I run the following code,
val textFile = sc.textFile("Sample.txt")
val counts = textFile.flatMap(line => line.split(" "))
.map(word => (word, 1))
.reduceByKey(_ + _)
Did u build spark or download the zip?
I remember having a similar issue...either you have to give write perm to
your /tmp directory or there's a spark config you need to override
This error is not 2.1 specific...let me get home and check my configs
I think I amended my /tmp permissions via xterm
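(If it's the config route, something along these lines redirects the
warehouse dir to a writable location; a sketch only, the path is
illustrative and the exact property depends on your setup:)

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  // Point the SQL warehouse at a directory the current user can write to.
  .config("spark.sql.warehouse.dir", "/home/me/spark-warehouse")
  .enableHiveSupport()
  .getOrCreate()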
Hi,
Any help, please? What could be the issue?
Thanks,
Aakash.
-- Forwarded message --
From: Aakash Basu
Date: Fri, Oct 20, 2017 at 1:00 PM
Subject: PySpark 2.1 Not instantiating properly
To: user
Hi all,
I have Spark 2.1 installed on my laptop, where I used to run all my
programs.
Here below, Gary:
filtered_df = spark.hiveContext.sql("""
SELECT
*
FROM
df
WHERE
type = 'type'
AND action = 'action'
AND audited_changes LIKE '---\ncompany_id:\n- %'
""")
filtered_df.registerTempTable("filtered_df")
You are using HQL to read.
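(The same predicate can also be written with the DataFrame API instead of
an HQL string; a sketch in Scala, assuming df is the DataFrame behind that
temp view:)

import org.apache.spark.sql.functions.col

val filteredDf = df.filter(
  col("type") === "type" &&
  col("action") === "action" &&
  col("audited_changes").like("---\ncompany_id:\n- %"))

filteredDf.createOrReplaceTempView("filtered_df")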
Actually, when I run the following code,
val textFile = sc.textFile("Sample.txt")
val counts = textFile.flatMap(line => line.split(" "))
.map(word => (word, 1))
.reduceByKey(_ + _)
It saves the results into more than one partition, like part-0,
part-1. I w
Hi
Could you just create an rdd/df out of what you want to save and store it
in hdfs?
Hth
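For instance (a minimal sketch; the collection and the HDFS path are
illustrative):

// Build an RDD from a local collection and save it under HDFS.
val toSave = Seq("first line", "second line")
sc.parallelize(toSave)
  .coalesce(1)
  .saveAsTextFile("hdfs://master:8020/user/abc/out")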
On Oct 20, 2017 9:44 AM, "Uğur Sopaoğlu" wrote:
> Hi all,
>
> In the word count example,
>
> val textFile = sc.textFile("Sample.txt")
> val counts = textFile.flatMap(line => line.split(" "))
>
I have seen a similar scenario where we load data from an RDBMS into a NoSQL
database… Spark made sense for velocity and parallel processing (and cost of
licenses :) ).
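(For illustration, the RDBMS side of such a pipeline usually starts with a
plain JDBC read; a minimal sketch with made-up connection details:)

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("rdbms-export").getOrCreate()

// Read a source table over JDBC (all connection values here are made up).
val source = spark.read.format("jdbc")
  .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL")
  .option("dbtable", "customers")
  .option("user", "spark_reader")
  .option("password", "secret")
  .load()

// From here, write out with whatever connector the target NoSQL store
// provides, e.g. source.write.format(...).save()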
> On Oct 15, 2017, at 21:29, Saravanan Thirumalai
> wrote:
>
> We are an investment firm and have an MDM platform in Oracle a
SK,
Have you considered:
Dataset<Row> df = spark.read().json(dfWithStringRowsContainingJson);
jg
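(In Scala, the equivalent for an RDD of JSON strings looks roughly like
this; the sample records are made up:)

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()

// An RDD of JSON strings with a "dynamic" structure.
val jsonRdd = spark.sparkContext.parallelize(Seq(
  """{"id": 1, "name": "a"}""",
  """{"id": 2, "extra": true}"""))

// Spark infers the schema from the JSON itself, so no explicit schema or
// encoder is needed. (Newer releases take a Dataset[String] instead.)
val df = spark.read.json(jsonRdd)
df.printSchema()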
> On Oct 11, 2017, at 16:35, sk skk wrote:
>
> Can we create a dataframe from a Java pair RDD of String? I don't have a
> schema, as it will be dynamic JSON. I gave the Encoders.STRING class.
Trying to improve the old solution.
Do we have a better text classifier now in Spark MLlib?
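(For context, the usual spark.ml route is a Pipeline of Tokenizer +
HashingTF + a classifier; a minimal sketch, assuming a "training" DataFrame
with "text" and numeric "label" columns:)

import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}

val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
val hashingTF = new HashingTF().setInputCol("words").setOutputCol("features")
val lr = new LogisticRegression().setMaxIter(10)

// Chain the stages and fit on the (assumed) labelled training DataFrame.
val pipeline = new Pipeline().setStages(Array(tokenizer, hashingTF, lr))
val model = pipeline.fit(training)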
Regards,
lmk
--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
Hi all,
In the word count example,
val textFile = sc.textFile("Sample.txt")
val counts = textFile.flatMap(line => line.split(" "))
.map(word => (word, 1))
.reduceByKey(_ + _)
counts.saveAsTextFile("hdfs://master:8020/user/abc")
I want to write collection of "*c
Hi all,
I have Spark 2.1 installed on my laptop, where I used to run all my
programs. PySpark wasn't used for around 1 month, and after starting it
now, I'm getting this exception (I've tried the solutions I could find on
Google, but to no avail).
Specs: Spark 2.1.1, Python 3.6, HADOOP 2.7, Windows