Hello Team,
I am trying to write a Dataset as a Parquet file in Append mode, partitioned
by a few columns. However, since the job is time-consuming, I would like to
enable DirectFileOutputCommitter (i.e., bypass the writes to the temporary
folder).
The version of Spark I am using is 2.3.1.
Can someone please help?
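For reference, a minimal sketch of the closest built-in option in stock
Spark 2.3.x: version 2 of the Hadoop FileOutputCommitter algorithm, which
skips the per-task rename through the temporary folder (note it is not
atomic, so a failed job can leave partial output behind). The table name
and partition columns are placeholders:

import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder()
  .appName("parquet-append")
  // v2 commits task output directly instead of renaming it on job commit
  .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
  .getOrCreate()

spark.table("events")
  .write
  .mode(SaveMode.Append)
  .partitionBy("year", "month")
  .parquet("/data/events_parquet")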
Hi All,
I have video surveillance data that needs to be processed in Spark. I am
looking into Spark + OpenCV. How do I load .mp4 video into an RDD? Can we
do this directly, or does the video need to be converted to a SequenceFile?
Thanks,
Padma CH
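One possible starting point, as a hedged sketch: sc.binaryFiles loads each
.mp4 whole as raw bytes, and OpenCV frame decoding would then run per file
on the workers. The path is a placeholder:

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("video-ingest"))

// RDD[(path, PortableDataStream)], one element per video file
val videos = sc.binaryFiles("hdfs:///surveillance/*.mp4")

videos.map { case (path, stream) =>
  val bytes = stream.toArray() // materialize; hand these to an OpenCV decoder here
  (path, bytes.length)
}.collect().foreach(println)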
I wouldn't necessarily "use Spark" to send the alert. Spark is in an
> important sense one library among many. You can have your application use
> any other library available for your language to send the alert.
>
> Marcin
>
> On Tue, Jul 12, 2016 at 9:25 AM, Priya Ch
>
Hi All,
I am building a real-time anomaly detection system where I am using k-means
to detect anomalies. Now, in order to send an alert to a mobile device or by
email, how do I send it using Spark itself?
Thanks,
Padma CH
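A minimal illustration of Marcin's point above, as a hedged sketch: the
alert goes out through an ordinary mail library (javax.mail here), called
from the driver or inside foreachRDD/foreachPartition. SMTP host and
addresses are placeholders:

import java.util.Properties
import javax.mail.{Message, Session, Transport}
import javax.mail.internet.{InternetAddress, MimeMessage}

def sendAlert(subject: String, body: String): Unit = {
  val props = new Properties()
  props.put("mail.smtp.host", "smtp.example.com") // placeholder SMTP relay
  val msg = new MimeMessage(Session.getInstance(props))
  msg.setFrom(new InternetAddress("alerts@example.com"))
  msg.setRecipients(Message.RecipientType.TO, "oncall@example.com")
  msg.setSubject(subject)
  msg.setText(body)
  Transport.send(msg)
}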
Has anyone resolved this?
Thanks,
Padma CH
On Wed, Jun 22, 2016 at 4:39 PM, Priya Ch
wrote:
> Hi All,
>
> I am running a Spark application with 1.8 TB of data (stored in Hive
> table format). I am reading the data using HiveContext and processing it.
> The cluster ha
Hi All,
I am running a Spark application with 1.8 TB of data (stored in Hive
table format). I am reading the data using HiveContext and processing it.
The cluster has 5 nodes in total, 25 cores per machine and 250 GB per node. I
am launching the application with 25 executors with 5 cores each
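A hedged sketch of the sizing described, for concreteness: 25 executors of
5 cores each spread 5 executors per 25-core/250 GB node, so something near
40g per executor leaves headroom. The memory value and table name are
illustrative, and spark.executor.instances is honored on YARN:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val conf = new SparkConf()
  .setAppName("hive-processing")
  .set("spark.executor.instances", "25")
  .set("spark.executor.cores", "5")
  .set("spark.executor.memory", "40g") // illustrative: ~5 executors per 250 GB node

val sc = new SparkContext(conf)
val hiveContext = new HiveContext(sc)
val df = hiveContext.sql("SELECT * FROM my_table") // placeholder query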
Hello Team,
I am trying to join 2 RDDs, where one is of size 800 MB and the
other is 190 MB. During the join step, my job halts and I don't see
progress in the execution.
This is the message I see on the console:
INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output
locations
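One common way out, as a hedged sketch: broadcast the smaller side and turn
the join into a map-side lookup, so neither RDD is shuffled. This assumes
the 190 MB side fits in memory once collected; names and sample data are
placeholders:

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("map-side-join"))

val big = sc.parallelize(Seq(("k1", "left1"), ("k2", "left2"))) // stands in for the 800 MB RDD
val small = sc.parallelize(Seq(("k1", 1), ("k2", 2)))           // stands in for the 190 MB RDD

val smallMap = sc.broadcast(small.collectAsMap()) // shipped once to each executor
val joined = big.flatMap { case (k, v) =>
  smallMap.value.get(k).map(w => (k, (v, w))) // inner-join semantics
}
joined.collect().foreach(println)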
Hi All,
I have two RDDs, A and B, where A is of size 30 MB and B is of size 7
MB. A.cartesian(B) is taking too much time. Is there any bottleneck in
the cartesian operation?
I am using Spark version 1.6.0.
Regards,
Padma Ch
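A hedged sketch of one way around it: since B is only 7 MB, collecting and
broadcasting it and then expanding per element of A produces the same pairs
as A.cartesian(B) without the shuffle. The sample data is illustrative:

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("cross"))

val a = sc.parallelize(1 to 1000)
val b = sc.parallelize(Seq("x", "y", "z"))

val bLocal = sc.broadcast(b.collect()) // B fits in memory, ship it whole
val pairs = a.flatMap(x => bLocal.value.map(y => (x, y))) // same pairs as a.cartesian(b)
pairs.count()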
which would convey the same.
On Wed, Jan 6, 2016 at 8:19 PM, Annabel Melongo
wrote:
> Priya,
>
> It would be helpful if you posted the entire trace log along with your code
> to help determine the root cause of the error.
>
> Thanks
>
>
> On Wednesday, Januar
f" on
> one of the spark executors (perhaps run it in a for loop, writing the
> output to separate files) until it fails and see which files are being
> opened, if there's anything that seems to be taking up a clear majority
> that might key you in on the culprit.
>
> O
iles"
> exception.
>
>
> On Tuesday, January 5, 2016 8:03 AM, Priya Ch <
> learnings.chitt...@gmail.com> wrote:
>
>
> Can someone shed light on this?
>
> Regards,
> Padma Ch
>
> On Mon, Dec 28, 2015 at 3:59 PM, Priya Ch
> wrote:
>
> Chris
On Mon, Sep 21, 2015 at 3:06 PM, Petr Novak wrote:
> add @transient?
>
> On Mon, Sep 21, 2015 at 11:27 AM, Priya Ch
> wrote:
>
>> Hello All,
>>
>> How can I pass a SparkContext as a parameter to a method in an object?
>> Passing the SparkContext is giving me a Task not serializable error.
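A hedged sketch of Petr's @transient suggestion: keep the SparkContext out
of any serialized closure by marking the field @transient and using it only
on the driver. The class and method names are placeholders:

import org.apache.spark.SparkContext

class Pipeline(@transient val sc: SparkContext) extends Serializable {
  // Runs on the driver; only the lambda inside flatMap ships to executors.
  def countWords(path: String): Long =
    sc.textFile(path).flatMap(_.split("\\s+")).count()
}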
; true. What is the possible solution for this?
Is this a bug in Spark 1.3.0? Would changing the scheduling mode to Standalone
or Mesos mode work fine?
Could someone please share their views on this?
On Sat, Sep 12, 2015 at 11:04 PM, Priya Ch
wrote:
> Hello All,
>
> When I push messages into
Hello All,
When I push messages into Kafka and read them in the streaming application,
I see the following exception.
I am running the application on YARN and am nowhere broadcasting the message
within the application. I am simply reading the message, parsing it,
populating the fields of a class, and then printing it.
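For concreteness, a hedged sketch of the flow described: read from Kafka,
parse, print. The ZooKeeper address, topic, group, and Event class are
placeholders; this uses the receiver-based KafkaUtils.createStream from
spark-streaming-kafka:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

case class Event(id: String, payload: String)

val ssc = new StreamingContext(new SparkConf().setAppName("kafka-read"), Seconds(5))
val lines = KafkaUtils.createStream(ssc, "zk-host:2181", "my-group", Map("events" -> 1))

lines.map { case (_, value) =>
  val parts = value.split(",", 2) // parse into the placeholder class
  Event(parts(0), if (parts.length > 1) parts(1) else "")
}.print()

ssc.start()
ssc.awaitTermination()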
combine the messages with the same primary key.
>
> Hope that helps.
>
> Greetings,
>
> Juan
>
>
> 2015-07-30 10:50 GMT+02:00 Priya Ch :
>
>> Hi All,
>>
>> Can someone share insights on this?
>>
>> On Wed, Jul 29, 2015 at 8:29 AM, Priya
Hi All,
Can someone share insights on this?
On Wed, Jul 29, 2015 at 8:29 AM, Priya Ch
wrote:
>
>
> Hi TD,
>
> Thanks for the info. I have the scenario like this.
>
> I am reading the data from a Kafka topic. Let's say Kafka has 3 partitions
> for the topic. I
s will guard against multiple attempts to
> run the task that inserts into Cassandra.
>
> See
> http://spark.apache.org/docs/latest/streaming-programming-guide.html#semantics-of-output-operations
>
> TD
>
> On Sun, Jul 26, 2015 at 11:19 AM, Priya Ch
> wrote:
>
>>
Hi All,
I have a problem when writing streaming data to Cassandra. Our existing
product is on an Oracle DB, in which, while writing data, locks are maintained
so that duplicates in the DB are avoided.
But as Spark has a parallel processing architecture, if more than 1 thread is
trying to write the same data
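A hedged sketch of TD's point above: make the write idempotent so a re-run
task overwrites instead of duplicating. Cassandra upserts by primary key,
so saveToCassandra from the spark-cassandra-connector is naturally
idempotent for a fixed key. Keyspace, table, host, and the Reading class
are placeholders:

import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

case class Reading(sensorId: String, ts: Long, value: Double)

val conf = new SparkConf()
  .setAppName("cassandra-write")
  .set("spark.cassandra.connection.host", "127.0.0.1") // placeholder host
val sc = new SparkContext(conf)

// Re-inserting the same (sensor_id, ts) row overwrites rather than
// duplicates, provided those columns form the table's primary key.
sc.parallelize(Seq(Reading("s1", 1438000000L, 0.5)))
  .saveToCassandra("telemetry", "readings")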
Hi All,
I configured a Kafka cluster on a single node, and I have a streaming
application which reads data from a Kafka topic using KafkaUtils. When I
execute the code in local mode from the IDE, the application runs fine.
But when I submit the same to the Spark cluster in standalone mode, I end up
with
Hi All,
I have Akka remote actors running on 2 nodes. I submitted the Spark
application from node1. In the Spark code, in one of the RDDs, I am sending a
message to the actor running on node1. My Spark code is as follows:
class ActorClient extends Actor with Serializable
{
import context._
val curre
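The snippet above is cut off in the archive; as a hedged sketch of the
pattern it describes, executor-side code can resolve the remote actor by
path and fire a message. System name, host, port, and actor name are
placeholders, and the ActorSystem must be created inside the task rather
than serialized into the closure:

import akka.actor.ActorSystem

def notifyActor(payload: String): Unit = {
  val system = ActorSystem("client")
  val remote = system.actorSelection("akka.tcp://driver@node1:2552/user/actorClient")
  remote ! payload  // fire-and-forget tell
  system.shutdown() // Akka 2.3.x API; use terminate() on later versions
}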
Hi All,
We have set up a 2-node cluster (NODE-DSRV05 and NODE-DSRV02), each node
having 32 GB RAM, 1 TB of hard disk capacity, and 8 CPU cores. We have set
up HDFS, which has 2 TB capacity, with a block size of 256 MB. When we try
to process a 1 GB file on Spark, we see the following exception:
14/11/
Hi Spark users/experts,
In the Spark source code (Master.scala & Worker.scala), when registering the
worker with the master, I see the usage of *persistenceEngine*. When we don't
specify spark.deploy.recoveryMode explicitly, what is the default value
used? This recovery mode is used to persist and re
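For what it's worth, a hedged note: when nothing is set, the standalone
Master falls back to recoveryMode "NONE", which keeps no persistent state
across restarts. ZooKeeper-based recovery is configured on the master
daemon, e.g. through SPARK_DAEMON_JAVA_OPTS; the addresses below are
placeholders:

-Dspark.deploy.recoveryMode=ZOOKEEPER
-Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181
-Dspark.deploy.zookeeper.dir=/spark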
the classpath? In Spark
>> 1.0, we use breeze 0.7, and in Spark 1.1 we use 0.9. If the breeze
>> version you used is different from the one that comes with Spark, you might
>> see class-not-found errors. -Xiangrui
>>
>> On Fri, Oct 3, 2014 at 4:22 AM, Priya Ch
>> wrote:
Hi Team,
When I try to use DenseMatrix from the breeze library in Spark, it throws
the following error:
java.lang.NoClassDefFoundError: breeze/storage/Zero
Can someone help me on this ?
Thanks,
Padma Ch
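Following Xiangrui's note above, a hedged build.sbt sketch that pins breeze
to the version bundled with the Spark in use; the versions shown are
illustrative, not prescriptive:

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-mllib" % "1.1.0",
  "org.scalanlp"     %% "breeze"      % "0.9"
)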
Hi,
I am using Spark 1.0.0. In my Spark code I am trying to persist an RDD to
disk with rdd.persist(StorageLevel.DISK_ONLY), but unfortunately I couldn't
find the location where the RDD was written to disk. I pointed
SPARK_LOCAL_DIRS and SPARK_WORKER_DIR to some other location rather than
using the default /
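A hedged sketch for locating the blocks: DISK_ONLY data lands under the
directories given by SPARK_LOCAL_DIRS / spark.local.dir on each worker,
inside a per-application spark-local-* subfolder. The input path is a
placeholder:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

val sc = new SparkContext(new SparkConf().setAppName("persist-demo"))

val rdd = sc.textFile("hdfs:///input/data.txt")
rdd.persist(StorageLevel.DISK_ONLY)
rdd.count() // forces evaluation so the blocks are actually written to disk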