nd not others?
>
> It sounds like an interesting problem…
>
> On Jun 23, 2016, at 5:21 AM, Prabhu Joseph
> wrote:
>
> Hi All,
>
>On submitting 20 identical SQL queries in parallel to the Spark Thrift Server, the
> query execution time for some queries is less than a second while others take
much longer; the concurrency is limited
by the single Driver. How can we improve the concurrency, and what are the best
practices?
Thanks,
Prabhu Joseph
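A commonly suggested mitigation for this single-driver bottleneck (a sketch of the standard approach, not necessarily what was used in this thread; the pool name is illustrative) is to run the Thrift Server with FAIR scheduling so concurrent queries share the driver's scheduler instead of queuing FIFO:

```properties
# spark-defaults.conf (sketch): enable fair scheduling pools
spark.scheduler.mode            FAIR
# Optional: pools with weights/minShare are defined in an allocation file
spark.scheduler.allocation.file /path/to/fairscheduler.xml
```

With FIFO (the default), one long query can starve the other nineteen; FAIR lets short queries finish quickly alongside long ones, though all queries still share one driver JVM.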
replicate hot cached blocks, right?
>
>
> On Tuesday, March 8, 2016, Prabhu Joseph
> wrote:
>
>> Hi All,
>>
>> When a Spark Job is running, and one of the Spark Executors on Node A
>> has some partitions cached, later for some other stage the Scheduler tries to
shuffle files from an external service instead of
from each other, which offloads work from the Spark Executors.
We want to check whether a similar External Service is
implemented for transferring cached partitions to other executors.
Thanks, Prabhu Joseph
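For reference, the external shuffle service mentioned above is enabled on YARN roughly as follows (a sketch of the standard configuration; to my knowledge there is no equivalent service for serving cached partitions). On the Spark side set spark.shuffle.service.enabled=true, and register the auxiliary service on each NodeManager:

```xml
<!-- yarn-site.xml on each NodeManager (sketch) -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```

This keeps shuffle files available even when an executor is removed, which is why a similar mechanism for cached partitions is attractive.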
def parse(line: String): (String, LogClass) = {
  val pieces = line.split(' ')
  val level = pieces(2)
  val one = pieces(0)
  val two = pieces(1)
  (level, LogClass(one, two))
}
val output = logData.map(x => parse(x))
val partitioned = output.partitionBy(new ExactPartitioner(5)).persist()
val groups = partitioned.groupByKey(new ExactPartitioner(5))
groups.count()
output.partitions.size
partitioned.partitions.size
}
}
Thanks,
Prabhu Joseph
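The ExactPartitioner used above is not shown in the thread; a minimal sketch of such a class (assuming it simply routes each key by its hash, which the real class from the thread may not do) would be:

```scala
import org.apache.spark.Partitioner

// Hypothetical ExactPartitioner: maps each key to key.hashCode modulo n.
// Illustrative only; the original implementation is not in the thread.
class ExactPartitioner(n: Int) extends Partitioner {
  override def numPartitions: Int = n
  override def getPartition(key: Any): Int = {
    val mod = key.hashCode % n
    if (mod < 0) mod + n else mod // keep the partition id non-negative
  }
}
```

Because partitionBy and groupByKey above are given the same partitioner, the groupByKey can avoid a second shuffle of already co-located keys.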
:
> Looking at
>
> https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PatternLayout.html
>
> *WARNING* Generating the caller class information is slow. Thus, its use
> should be avoided unless execution speed is not an issue.
>
> On Sat, Feb 27, 2016 at 12:40 PM, Prabhu
16/02/27 15:34:40 ERROR org.apache.spark.Logging$class: Failed to create
any local dir.
16/02/27 15:34:40 INFO org.apache.spark.Logging$class: Shutdown hook called
16/02/27 15:34:40 INFO org.apache.spark.Logging$class: Deleting directory
/tmp/spark-5544c349-0393-4bd0-8aab-c20331a9a1cf
Thanks,
Prabhu Joseph
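A layout that avoids the slow caller lookup the javadoc warns about uses %c (logger name) rather than the location conversions %C, %M, %F, %L. A sketch of the relevant log4j.properties lines (appender name is illustrative):

```properties
# PatternLayout without caller/location info, which log4j warns is slow
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

%c{1} prints only the last component of the logger name, which is cheap because it is stored on the logger rather than derived from a stack walk.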
YARN-2026 has fixed the issue.
On Thu, Feb 25, 2016 at 4:17 AM, Prabhu Joseph
wrote:
> You are right, Hamel. It should get 10 TB / 2. And in hadoop-2.7.0, it is
> working fine. But in hadoop-2.5.1, it gets only 10 TB / 230. The same
> configuration is used in both versions.
> So I think
idea of what your queues/actual resource
> usage is like. Logs from each of your Spark applications would also be
> useful. Basically the more info the better.
>
> On Wed, Feb 24, 2016 at 2:52 PM Prabhu Joseph
> wrote:
>
>> Hi Hamel,
>>
>> Thanks for looki
ty
and reservation.
The question is how much preemption tries to preempt from queue A if it
holds the entire resource without releasing it. We are not able to share the
actual configuration, but the answer to this question will help us.
Thanks,
Prabhu Joseph
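For reference, CapacityScheduler preemption is switched on roughly like this (a sketch of the standard knobs; values and queue layout in the thread are unknown):

```xml
<!-- yarn-site.xml (sketch): enable the preemption monitor -->
<property>
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.monitor.policies</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy</value>
</property>
```

How much is preempted from an over-capacity queue is then governed by the policy's round/step settings and by each queue's maximum-capacity in capacity-scheduler.xml.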
On Wed, Feb 24, 2016 at 10:03 PM, Ham
r new YARN application type with similar behavior. We want
YARN to control this behavior by reclaiming resources that are held by the
first job for a long period.
Thanks,
Prabhu Joseph
perhaps old Java threading is used somewhere.
On Friday, February 19, 2016, Jörn Franke wrote:
> How did you configure YARN queues? What scheduler? Preemption ?
>
> > On 19 Feb 2016, at 06:51, Prabhu Joseph > wrote:
> >
> > Hi All,
> >
> >When running con
taking 2-3 times longer than A,
which shows that concurrency does not improve with a shared Spark Context.
[Spark Job Server]
Thanks,
Prabhu Joseph
Need your help to find a scenario where "No AMRMToken" will happen: a user is
added with a token, but later that token is missing. Is the token removed
because it expired?
Thanks,
Prabhu Joseph
On Wed, Feb 10, 2016 at 12:59 AM, Hari Shreedharan <
hshreedha...@cloudera.com> wrote:
> The cred
hadoop-2.5.1, and hence
spark.yarn.dist.files does not work with hadoop-2.5.1.
spark.yarn.dist.files works fine on hadoop-2.7.0, as CWD/* is included in the
container classpath through some bug fix. Searching for the JIRA.
Thanks,
Prabhu Joseph
On Wed, Feb 10, 2016 at 4:04 PM, Ted Yu wrote:
> H
of hbase
client jars; when I checked launch_container.sh, the Classpath does not have
$PWD/* and hence all the hbase client jars are ignored.
Is spark.yarn.dist.files not meant for adding jars into the executor classpath?
Thanks,
Prabhu Joseph
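A workaround often suggested when CWD/* is not on the container classpath (a sketch; file and jar names are placeholders, not from the thread) is to put the shipped files on the executor classpath explicitly:

```properties
# spark-defaults.conf (hypothetical paths)
spark.yarn.dist.files           /etc/hbase/conf/hbase-site.xml
spark.executor.extraClassPath   ./hbase-client.jar:./hbase-common.jar
spark.driver.extraClassPath     /path/to/hbase-client.jar
```

The executor paths are relative to the container working directory, where YARN localizes distributed files, while the driver path is a normal local path.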
On Tue, Feb 9, 2016 at 1:42 PM, Prabhu Joseph
wrote:
>
+ Spark-Dev
On Tue, Feb 9, 2016 at 10:04 AM, Prabhu Joseph
wrote:
> Hi All,
>
> A long running Spark job on YARN throws below exception after running
> for few days.
>
> yarn.ApplicationMaster: Reporter thread fails 1 time(s) in a row.
> org.apache.hadoop.yarn.exceptio
> must be the process of putting ..."
> - Edsger Dijkstra
>
> "If you pay peanuts you get monkeys"
>
>
> 2016-02-04 11:33 GMT+01:00 Prabhu Joseph :
>
>> Okay, the reason for the task delay within an executor, when some RDD
>> partitions are in memory and some in Hadoop, i
up and launching it on a
less-local node.
So after setting it to 0, all tasks started in parallel. But we learned that
it is better not to reduce it to 0.
On Mon, Feb 1, 2016 at 2:02 PM, Prabhu Joseph
wrote:
> Hi All,
>
>
> Sample Spark application which reads a logfile from hadoop (1.2GB
, saveAsHadoopFile runs fine.
What could be the reason for ExecutorLostFailure when the number of cores per
executor is high?
Error: ExecutorLostFailure (executor 3 lost)
16/02/02 04:22:40 WARN TaskSetManager: Lost task 1.3 in stage 15.0 (TID
1318, hdnprd-c01-r01-14):
Thanks,
Prabhu Joseph
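One frequent (though not the only) cause of ExecutorLostFailure with large executors is YARN killing the container for exceeding its memory limit, since more concurrent tasks mean more off-heap usage. A common mitigation sketch (value illustrative, property name as in Spark 1.x):

```properties
# spark-defaults.conf: extra off-heap headroom per executor, in MB
spark.yarn.executor.memoryOverhead 1024
```

Checking the NodeManager logs for "Container killed by YARN for exceeding memory limits" would confirm or rule out this cause.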
ores,
2.0 GB RAM
16/02/01 06:54:28 INFO AppClient$ClientEndpoint: Executor updated:
app-20160201065319-0014/2848 is now LOADING
16/02/01 06:54:28 INFO AppClient$ClientEndpoint: Executor updated:
app-20160201065319-0014/2848 is now RUNNING
....
Thanks,
Prabhu Joseph
application attempt, there are many
finishApplicationMaster requests causing the ERROR.
Need your help to understand in what scenario the above happens.
JIRA's related are
https://issues.apache.org/jira/browse/SPARK-1032
https://issues.apache.org/jira/browse/SPARK-3072
Thanks,
Prabhu Joseph