> 3.nabble.com/access-hdfs-file-name-in-map-td6551.html
>
> --
> Emre Sevinç
>
> On Fri, Feb 6, 2015 at 2:16 AM, Subacini B wrote:
>
>> Hi All,
>>
>> We have a filename with a timestamp, say ABC_1421893256000.txt, and the
>> timestamp needs to be extracted from the file name for further processing.
Hi All,
We have a filename with a timestamp, say ABC_1421893256000.txt, and the
timestamp needs to be extracted from the file name for further processing.
Is there a way to get the input file name picked up by a Spark Streaming
job?
Thanks in advance,
Subacini
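One possible approach, sketched below with the batch API (the same kind of
Hadoop RDD backs a file-based DStream): read the files through
newAPIHadoopFile so each partition can see its input split, then parse the
timestamp out of the file name. The input path and the name pattern are
assumptions based on the ABC_1421893256000.txt example; sc is an existing
SparkContext.

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.{FileSplit, TextInputFormat}
import org.apache.spark.rdd.NewHadoopRDD

// Assumed file-name pattern, e.g. ABC_1421893256000.txt -> 1421893256000
val TimestampPattern = """.*_(\d+)\.txt""".r

// newAPIHadoopFile returns a NewHadoopRDD, which exposes the input
// split (and therefore the file path) for each partition.
val hadoopRDD = sc
  .newAPIHadoopFile[LongWritable, Text, TextInputFormat]("/input/dir")
  .asInstanceOf[NewHadoopRDD[LongWritable, Text]]

val linesWithTimestamp = hadoopRDD.mapPartitionsWithInputSplit(
  (split, iter) => {
    val name = split.asInstanceOf[FileSplit].getPath.getName
    val ts = name match {
      case TimestampPattern(millis) => millis.toLong
      case _ => -1L // file name carries no timestamp
    }
    iter.map { case (_, line) => (ts, line.toString) }
  },
  preservesPartitioning = true)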
Hi All,
I have a cluster of 3 nodes [each 8 cores / 32 GB memory].
My program uses Spark Streaming with Spark SQL [Spark 1.1] and writes
incoming JSON to Elasticsearch and HBase. Below is my code; I receive
JSON files [input data varies from 30 MB to 300 MB] every 10 seconds.
Irrespective of 3 nodes
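The code itself didn't survive in this archive snippet; for context, a
minimal sketch of the Elasticsearch side of such a pipeline, assuming the
elasticsearch-hadoop connector is on the classpath (the input directory,
node address, and index/type names are placeholders):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.elasticsearch.spark.rdd.EsSpark

val conf = new SparkConf()
  .setAppName("json-to-elasticsearch")
  .set("es.nodes", "localhost") // assumed Elasticsearch endpoint

// 10-second batches, matching the arrival rate described above.
val ssc = new StreamingContext(conf, Seconds(10))

ssc.textFileStream("/data/incoming").foreachRDD { rdd =>
  if (rdd.take(1).nonEmpty) {
    // Each line is already a JSON document, so index it as-is.
    EsSpark.saveJsonToEs(rdd, "incoming/json")
  }
}

ssc.start()
ssc.awaitTermination()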
Hi,
Can someone help me? Any pointers would help.
Thanks
Subacini
On Fri, Dec 19, 2014 at 10:47 PM, Subacini B wrote:
> Hi All,
>
> Is there any API that can be used directly to write a SchemaRDD to HBase?
> If not, what is the best way to write a SchemaRDD to HBase?
>
> Thanks
> Subacini
>
Hi All,
Is there any API that can be used directly to write a SchemaRDD to HBase?
If not, what is the best way to write a SchemaRDD to HBase?
Thanks
Subacini
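As far as I know there was no direct SchemaRDD-to-HBase API in Spark 1.x;
the usual route is to map each Row to an HBase Put and write through
saveAsNewAPIHadoopDataset with TableOutputFormat. A rough sketch, where
the table name ("my_table"), column family ("cf"), and row schema (a
string key in column 0, an int in column 1) are all assumptions:

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapreduce.Job

val hbaseConf = HBaseConfiguration.create()
hbaseConf.set(TableOutputFormat.OUTPUT_TABLE, "my_table") // assumed table

val job = Job.getInstance(hbaseConf)
job.setOutputFormatClass(classOf[TableOutputFormat[ImmutableBytesWritable]])

// Assumed schema: column 0 is a string row key, column 1 is an int value.
val puts = schemaRDD.map { row =>
  val put = new Put(Bytes.toBytes(row.getString(0)))
  put.add(Bytes.toBytes("cf"), Bytes.toBytes("colB"), Bytes.toBytes(row.getInt(1)))
  (new ImmutableBytesWritable, put)
}

puts.saveAsNewAPIHadoopDataset(job.getConfiguration)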
Hi All,
How do I run multiple requests concurrently on the same cluster?
I have a program using a *Spark Streaming context* which reads *streaming
data* and writes it to HBase. It works fine; the problem is that when
multiple requests are submitted to the cluster, only the first request is
processed, as the entire cluster is taken by it.
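A guess, consistent with the standalone-mode thread further down: by
default the first application grabs every available core, so later
submissions sit waiting. Capping each application with spark.cores.max
leaves room for concurrent ones (the value here is illustrative):

import org.apache.spark.SparkConf

// Cap this application's total cores so other requests can be
// scheduled alongside it on the same cluster.
val conf = new SparkConf()
  .setAppName("streaming-to-hbase")
  .set("spark.cores.max", "8")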
Hi,
Can someone provide me pointers for this issue?
Thanks
Subacini
On Wed, Jul 2, 2014 at 3:34 PM, Subacini B wrote:
> Hi,
>
> The code below throws a compilation error, "not found: *value Sum*". Can
> someone help me on this? Do I need to add any jars or imports? Even for
> Count, the same error is thrown.
Hi,
http://mail-archives.apache.org/mod_mbox/spark-user/201403.mbox/%3cb75376b8-7a57-4161-b604-f919886cf...@gmail.com%3E
This says the Shark backend will be replaced with the Spark SQL engine in
the future.
Does that mean Spark will continue to support Shark + Spark SQL long
term? OR after some p
Hi,
The code below throws a compilation error, "not found: *value Sum*". Can
someone help me on this? Do I need to add any jars or imports? Even for
Count, the same error is thrown.

val queryResult = sql("select * from Table")
queryResult.groupBy('colA)('colA, Sum('colB) as 'totB).aggregate(Sum
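For anyone hitting the same error: in the Spark 1.0-era DSL, Sum and Count
are Catalyst expression classes that need an explicit import (a best-effort
reconstruction; the API shifted between those releases). Something like:

// Sum and Count live in the Catalyst expressions package; the usual
// "import sqlContext._" only brings in the DSL implicits, not these.
import org.apache.spark.sql.catalyst.expressions.{Count, Sum}

val queryResult = sql("select * from Table")
val totals = queryResult.groupBy('colA)('colA, Sum('colB) as 'totB)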
Hi All,
Running this join query

sql("SELECT * FROM A_TABLE A JOIN B_TABLE B WHERE
A.status=1").collect().foreach(println)

throws

Exception in thread "main" org.apache.spark.SparkException: Job aborted due
to stage failure: Task 1.0:3 failed 4 times, most recent failure: Exception
failure in T
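One thing stands out, though the stack trace is cut off: the JOIN has no
ON clause, so the query is a full cartesian product of the two tables
before the filter, which can easily fail tasks on tables of any real size.
Joining on an explicit key (the id column below is hypothetical) is
usually what was intended:

// Hypothetical join key "id"; without an ON clause the original
// query computes A_TABLE x B_TABLE in full.
val joined = sql(
  """SELECT * FROM A_TABLE A JOIN B_TABLE B
    |ON A.id = B.id
    |WHERE A.status = 1""".stripMargin)
joined.collect().foreach(println)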
> it will instead try to take all resources from a few nodes.
> On Jun 8, 2014 1:55 AM, "Subacini B" wrote:
>
>> Hi All,
>>
>> My cluster has 5 workers, each having 4 cores (so 20 cores total). It is
>> in standalone mode (not using Mesos or Yarn). I want two programs to run
>> at the same time.
Hi,
I am stuck here; my cluster is not efficiently utilized. Appreciate any
input on this.
Thanks
Subacini
On Sat, Jun 7, 2014 at 10:54 PM, Subacini B wrote:
> Hi All,
>
> My cluster has 5 workers, each having 4 cores (so 20 cores total). It is
> in standalone mode (not using Mesos or Yarn).
Hi All,
My cluster has 5 workers, each having 4 cores (so 20 cores total). It is
in standalone mode (not using Mesos or Yarn). I want two programs to run
at the same time, so I have configured "spark.cores.max=3", but when I run
the program it allocates three cores, taking one core from each worker,
making the application spread across three workers.
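Tying this to the reply quoted above: in standalone mode the
one-core-per-worker behavior comes from the master's spark.deploy.spreadOut
setting (true by default). Setting it to false on the master packs an
application's cores onto as few workers as possible, while spark.cores.max
caps the total per application:

import org.apache.spark.SparkConf

// Application side: cap total cores so two programs can share the
// cluster's 20 cores.
val conf = new SparkConf()
  .setAppName("app-one")
  .set("spark.cores.max", "3")

// Master side (not an application setting): spark.deploy.spreadOut false
// in the master's configuration makes those 3 cores come from one
// worker instead of one core on each of three workers.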