> Put the jets3t.properties file in your project, and
> check that it makes it into the JAR using jar tf yourfile.jar.
>
> Matei
>
> > On Dec 30, 2014, at 4:21 PM, durga wrote:
> >
> > I am not sure how I can pass a jets3t.properties file to spark-submit.
> > The --file option does not seem to work.
I am not sure how I can pass a jets3t.properties file to spark-submit.
The --file option does not seem to work.
Can someone please help me? My production Spark jobs sporadically hang
when reading S3 files.
Thanks,
-D
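As a quick check on the suggestion above, here is a minimal sketch (a plain
JVM resource lookup, nothing Spark-specific) that verifies from the driver
whether jets3t.properties actually made it onto the classpath:

// Returns null when jets3t.properties is not on the classpath.
val in = getClass.getResourceAsStream("/jets3t.properties")
println(if (in == null) "jets3t.properties NOT on classpath"
        else "jets3t.properties found on classpath")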
--
Hi All,
It seems the problem is a little more complicated.
The job hangs while reading an S3 file. Even if I kill the unix process that
started the job, the Spark job is not killed. It is still hung up there.
Now the questions are:
How do I find a Spark job based on its name?
How do I kill the Spark job?
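There is no definitive answer in this thread, but if the driver is still
responsive and you hold the SparkContext (e.g. in spark-shell), a minimal
sketch for cancelling work from code:

// Cancel all running and scheduled jobs on this context, then shut it down.
sc.cancelAllJobs()
sc.stop()

In standalone mode the Master web UI also lists running applications by
name, which helps with the first question.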
Please check the Spark version and Hadoop version in your mvn build as well
as in your local Spark setup. If the Hadoop versions do not match, you might
get this issue.
Thanks,
-D
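A sketch of what the version alignment can look like in an sbt build; the
version numbers below are placeholders and should match whatever the
cluster actually runs:

// build.sbt (sketch): keep Spark and Hadoop in lock-step with the cluster.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"    % "1.2.0" % "provided",
  "org.apache.hadoop" %  "hadoop-client" % "2.4.0" % "provided"
)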
--
Sen
> open a few hundred files on S3 to read
> from one node. It just blocks itself without error until a timeout later.
>
> On Monday, December 22, 2014, durga wrote:
>
>> Hi All,
>>
>> I am facing a strange issue sporadically. Occasionally my spark job is
>> hung
Hi All,
I am facing a strange issue sporadically. Occasionally my Spark job
hangs while reading S3 files. It does not throw an exception or make any
progress; it just hangs there.
Is this a known issue? Please let me know how I could solve it.
Thanks,
-D
--
One more question.
How would I submit additional jars to a spark-submit job? I used the --jars
option, but it does not seem to work, as explained earlier.
Thanks for the help,
-D
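One workaround to try (a sketch, not a confirmed fix for the --jars
behaviour) is shipping the dependency jars programmatically; the path below
is a placeholder:

import org.apache.spark.{SparkConf, SparkContext}

// Ship extra jars to the executors from code instead of --jars.
val conf = new SparkConf()
  .setAppName("myApp")
  .setJars(Seq("/path/to/extra-dependency.jar"))  // placeholder path
val sc = new SparkContext(conf)
// Or, on an already-running context:
sc.addJar("/path/to/extra-dependency.jar")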
--
Hi All,
I tried to make a combined.jar in a shell script. It works when I use
spark-shell, but with spark-submit I get the same issue.
Help is highly appreciated.
Thanks
-D
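For the "No suitable driver" symptom in this thread, one commonly suggested
step is forcing the JDBC driver class to register itself before the first
connection attempt; the URL and credentials below are placeholders. The
driver jar may also need to be on the driver's classpath (spark-submit's
--driver-class-path), not only in --jars.

import java.sql.DriverManager

// Force MySQL Connector/J to register itself with DriverManager.
Class.forName("com.mysql.jdbc.Driver")
val conn = DriverManager.getConnection(
  "jdbc:mysql://dbhost:3306/mydb", "user", "password")  // placeholders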
--
Hi, I am facing an issue with the MySQL jars and spark-submit.
I am not running in YARN mode.
spark-submit --jars $(echo mysql-connector-java-5.1.34-bin.jar | tr ' ' ',')
--class com.abc.bcd.GetDBSomething myjar.jar "abc" "bcd"
Any help is really appreciated.
Thanks,
-D
14/12/19 23:42:10 INFO Spar
> Can you try something like:
>
> //Get the last hour
> val d = (System.currentTimeMillis() - 3600 * 1000)
> val ex = "abc_" + d.toString().substring(0,7) + "*.json"
>
>
> Thanks
> Best Regards
>
> On Wed, Dec 17, 2014 at 5:05 A
Hi All,
I need help with a regex in my sc.textFile().
I have lots of files named with an epoch-millisecond timestamp,
e.g. abc_1418759383723.json.
Now I need to consume the last one hour of files using the epoch timestamp
mentioned above.
I tried a couple of options; nothing seems to work for me.
If anyone has any suggestions, please help.
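One possible approach, sketched below, relies on sc.textFile accepting a
comma-separated list of path globs; the abc_ prefix and flat directory
layout are taken from the example above, everything else is an assumption:

// The first 7 digits of a 13-digit epoch-millis timestamp cover a window of
// 10^6 ms (~16.7 min), so stepping by that amount spans the last hour, with
// some slack at the edges.
val now = System.currentTimeMillis()
val globs = ((now - 3600 * 1000L) to now by 1000000L)
  .map(t => "abc_" + t.toString.take(7) + "*.json")
  .distinct
val lastHour = sc.textFile(globs.mkString(","))
// Exact filtering on the full timestamp can still be applied afterwards.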
Hi,
I am using the program below in spark-shell to load and filter data from
the data sets. I get exceptions if I run the program multiple times; if I
restart the shell, it works fine.
1) Please let me know what I am doing wrong.
2) Also, is there a way to make the program better?
Thanks Mayur.
Is there any documentation/readme with a step-by-step process for
adding or deleting nodes?
Thanks,
D.
--
Hi All,
I have a question.
For my company, we are planning to use the spark-ec2 scripts to create a
cluster for us.
I understand that persistent HDFS keeps the HDFS data available across
cluster restarts.
The question is:
1) What happens if I destroy and re-create the cluster, do I lose the data?
a) If I lose
Hi
It seems I can only give --hadoop-major-version=2, and it picks 2.0.0.
How could I tell it to use 2.0.2?
Is there any --hadoop-minor-version variable I can use?
Thanks,
D.
--
Thanks Akhil
--
Hi,
I am trying to create a Spark cluster using the spark-ec2 file under the
spark1.0.1 directory.
1) I noticed that it always creates Hadoop version 1.0.4. Is there a way I
can override that? I would like to have Hadoop 2.0.2.
2) I also want to install Oozie along with it. Are there any scripts
available along
Thanks Chen
--
Hi Chen,
Thank you very much for your reply. I do not think I understand how I can
do the join using the Spark API. If you have time, could you please write
some code?
Thanks again,
D.
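In the meantime, a minimal sketch of a timestamp-keyed join; the value
types are assumptions, since DataSet2's definition is truncated elsewhere
in this thread:

import org.apache.spark.SparkContext._  // pair-RDD functions (pre-imported in spark-shell)

// Two RDDs keyed by the timestamp string; the values are stand-ins.
val ds1 = sc.parallelize(Seq(("2014-07-10T00:02:45.045", 98.4859)))
val ds2 = sc.parallelize(Seq(("2014-07-10T00:02:45.045", 22)))
// Inner join on the timestamp key; leftOuterJoin keeps unmatched left rows.
val joined = ds1.join(ds2)  // RDD[(String, (Double, Int))]
joined.collect().foreach(println)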
--
Hi Chen,
I am new to Spark as well as Spark SQL. Could you please explain how I
would create a table and run a query on top of it? That would be super
helpful.
Thanks,
D.
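A minimal Spark SQL sketch of the table-plus-query workflow (Spark 1.1+
API assumed; the case-class fields are made up to mirror the data in this
thread):

import org.apache.spark.sql.SQLContext

case class Reading(ts: String, value: Double, sensor: Int)

val sqlContext = new SQLContext(sc)
import sqlContext.createSchemaRDD  // implicit RDD -> SchemaRDD conversion

val readings = sc.parallelize(Seq(Reading("2014-07-10T00:02:45.045", 98.4859, 22)))
readings.registerTempTable("readings")
val recent = sqlContext.sql("SELECT ts, value FROM readings WHERE sensor = 22")
recent.collect().foreach(println)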
--
Hi
I have a peculiar problem.
I have two data sets (large ones).
Data set1:
((timestamp),iterable[Any]) => {
(2014-07-10T00:02:45.045+,ArrayBuffer((2014-07-10T00:02:45.045+,98.4859,22)))
(2014-07-10T00:07:32.618+,ArrayBuffer((2014-07-10T00:07:32.618+,75.4737,22)))
}
DataSet2:
((
Thanks for the reply.
I am trying to save a huge file, in my case 60GB. I think l.toSeq is
going to collect all the data into the driver, where I don't have that much
space. Is there any possibility of using something like a MultipleOutputFormat
class for a large file?
Thanks,
For smaller files I am able to save it, but for larger files I am getting a
heap space error. I am thinking it is due to "take". Can someone please help
me with this.
Thanks,
Durga
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._

val conf = new SparkConf()
  .setMaster(master)
  .setAppName(appName)
val sc = new SparkContext(conf)
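Since collecting to the driver (toSeq/take/collect) is what exhausts the
heap, the usual alternative is a distributed write. A minimal sketch, with
placeholder paths, assuming the data is processed as an RDD end to end:

// Write directly from the executors instead of collecting to the driver.
// Each partition becomes one part-NNNNN file in the output directory.
val data = sc.textFile("hdfs:///input/huge-file")       // placeholder path
val filtered = data.filter(_.nonEmpty)
filtered.saveAsTextFile("hdfs:///output/huge-filtered")  // placeholder path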