Hi,
Using the command
val table = spark
.read
.format("org.apache.spark.sql.cassandra")
.options(Map( "table" -> "A", "keyspace" -> "B"))
.load()
one can load the whole table into a DataFrame. Instead, I want to run a
query against Cassandra and load just the result into a DataFrame (not the
whole table).
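One common approach, sketched below: load lazily and apply a filter; the Spark Cassandra Connector pushes eligible predicates (e.g. on partition key or indexed columns) down to Cassandra, so only the matching rows are read. The column name "user_id" is a hypothetical example, not from your schema.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()
import spark.implicits._

// load() is lazy; the filter below is pushed down to Cassandra where
// possible, so only the matching rows are fetched.
// "user_id" is a hypothetical column name.
val result = spark
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("table" -> "A", "keyspace" -> "B"))
  .load()
  .filter($"user_id" === 42)

You can check which predicates were actually pushed down with result.explain().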
Thanks for the tip!
On Wed, Jan 23, 2019 at 10:28 AM Moein Hosseini wrote:
> In this manner, your application has to create distinct jobs each time. So
> the first time, your driver creates the DAG and executes it with the help of
> the executors, then finishes the job and goes to sleep (Driver/Application). When
In this manner, your application has to create distinct jobs each time. So
the first time, your driver creates the DAG and executes it with the help of
the executors, then finishes the job and goes to sleep (Driver/Application).
When it wakes up, it will create a new Job and DAG, and so on.
Somehow the same as creating a cron-job.
I will be streaming data and am trying to understand how to get rid of old
data from a stream so it does not become too large. I will stream in one
large table of buying data and join that to another table of different
data. I need the last 14 days from the second table. I will not need data
that is older than that.
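A rough sketch of one way to bound this in Structured Streaming, assuming a Kafka source and an "event_time" column (both placeholders here): declaring a 14-day watermark lets Spark discard state older than 14 days in stateful operations such as joins and aggregations, instead of buffering the stream forever.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Hypothetical stream of buying data; broker and topic are placeholders.
val purchases = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "host:9092")
  .option("subscribe", "purchases")
  .load()
  .selectExpr("CAST(value AS STRING) AS raw", "timestamp AS event_time")

// State older than the watermark may be dropped by Spark.
val recent = purchases.withWatermark("event_time", "14 days")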
I’d recommend using a scheduler of some kind to trigger your job each hour,
and have the Spark job exit when it completes. Spark is not meant to run in
any type of “sleep mode”, unless you want to run a structured streaming job
and create a separate process to pull data from Cassandra and publish it
Hi Soheil,
Yes, it's possible to force your application to sleep after each job:
do {
  // Your Spark job goes here
  Thread.sleep(60 * 60 * 1000) // Thread.sleep takes milliseconds; this sleeps one hour
} while (true)
But maybe Airflow is a better option if you need a scheduler for your Spark job.
On Wed, Jan 23, 2019 at 9:26 AM Soheil Pourbafrani
wrote:
> Hi,
>
Hi,
I want to submit a job to a YARN cluster that reads data from Cassandra and
writes it to HDFS, every hour for example.
Is it possible to make the Spark application sleep in a while-true loop and
wake every hour to process data?
Hi All,
We are trying to enable encryption for Spark shuffle data on the local
filesystem, and we wanted to confirm our understanding of it. We are
currently working with Spark 2.4.
Our understanding is that Spark supports local storage encryption,
that is, "enabling local disk I/O encryption".
Hi, please tell me why you need to increase the time?
At 2019-01-22 18:38:29, "Chetan Khatri" wrote:
Hello Spark Users,
Can you please tell me how to increase the time a Spark job can stay in the
ACCEPTED state in YARN.
Thank you. Regards,
Chetan
I have checked the `process` method of the `Inbox` class.
If the message is not null, it will continue and handle the next message.
If the message is null, it will exit the loop.
This logic looks correct.
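For illustration, a simplified sketch of the loop shape being described; this is not the actual Spark source:

import java.util.concurrent.ConcurrentLinkedQueue

// Keep polling until poll() returns null (queue drained), handling each
// non-null message before polling again.
def drain[T](messages: ConcurrentLinkedQueue[T])(handle: T => Unit): Unit = {
  var message = messages.poll()
  while (message != null) {
    handle(message)
    message = messages.poll()
  }
}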
On 2019-01-23 11:35:59, "大啊" wrote:
Could you show how you hit this error?
At 2019-01-23 09:50:16, "kaishen
Could you show how you hit this error?
At 2019-01-23 09:50:16, "kaishen" wrote:
>Inbox.scala, line 158:
>message = messages.poll()
>if the message is not null, then it will be lost and never executed.
>Please help to verify this bug!
Inbox.scala, line 158:
message = messages.poll()
if the message is not null, then it will be lost and never executed.
Please help to verify this bug!
Hi Marcelo,
I took a thread dump with jstack and saw the ShutdownHookManager:
'''
"Thread-1" #19 prio=5 os_prio=0 tid=0x7f9b6828e800 nid=0x77cb waiting
on condition [0x7f9a123e3000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking
About deployment/serving, there is a SPIP:
https://issues.apache.org/jira/browse/SPARK-26247
From: Riccardo Ferrari
Sent: Tuesday, January 22, 2019 8:07 AM
To: User
Subject: I have trained a ML model, now what?
Hi list!
I am writing here to hear about your experience putting Spark ML models
into production at scale.
I know it is a very broad topic with many different faces depending on the
use case, requirements, user base and whatever else is involved in the task.
Still, I'd like to open a thread about this.
Hi,
We’ve set up the spark-history service (based on Spark 2.4) on K8S. The UI
works perfectly fine when running on a NodePort. We’re facing some issues
when running behind an ingress.
Please let us know what kind of inputs you need.
Thanks and Regards,
Abhishek
From: Battini Lakshman
Sent: Tuesday, January 22, 2019
Hello,
We are running Spark 2.4 on Kubernetes cluster, able to access the Spark UI
using "kubectl port-forward".
However, this Spark UI only shows the currently running Spark applications;
we would like to retain the 'completed' Spark application logs as well.
Could someone help us to set up the 'Spark History Server'?
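For what it's worth, a minimal sketch of the usual setup (the HDFS path is a placeholder): applications write event logs, and a separately started History Server renders the completed runs from the same directory.

import org.apache.spark.sql.SparkSession

// Persist event logs so completed applications can be replayed later.
// "hdfs:///spark-events" is a placeholder path.
val spark = SparkSession.builder()
  .config("spark.eventLog.enabled", "true")
  .config("spark.eventLog.dir", "hdfs:///spark-events")
  .getOrCreate()

Then point the History Server's spark.history.fs.logDirectory at the same path and start it with sbin/start-history-server.sh.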
You can try with Yarn node labels:
https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/NodeLabel.html
Then you can whitelist nodes.
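If you go that route, Spark on YARN can target labeled nodes; a sketch, where the label name "trusted" is a made-up example:

import org.apache.spark.sql.SparkSession

// Restrict where the AM and the executors are scheduled, by YARN node label.
// "trusted" is a hypothetical label defined in your YARN cluster.
val spark = SparkSession.builder()
  .config("spark.yarn.am.nodeLabelExpression", "trusted")
  .config("spark.yarn.executor.nodeLabelExpression", "trusted")
  .getOrCreate()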
> On 19.01.2019 at 00:20, Serega Sheypak wrote:
>
> Hi, is there any possibility to tell Scheduler to blacklist specific nodes in
> advance?
The new issue is https://issues.apache.org/jira/browse/SPARK-26688.
On Tue, Jan 22, 2019 at 11:30 AM Attila Zsolt Piros
wrote:
> Hi,
>
> >> Is it this one: https://github.com/apache/spark/pull/23223 ?
>
> No. My old development was https://github.com/apache/spark/pull/21068,
> which is closed.
Hello Spark Users,
Can you please tell me how to increase the time a Spark job can stay in the
*ACCEPTED* state in YARN.
Thank you. Regards,
Chetan
Hi,
>> Is it this one: https://github.com/apache/spark/pull/23223 ?
No. My old development was https://github.com/apache/spark/pull/21068,
which is closed.
This would be a new improvement, with a new Apache JIRA issue
(https://issues.apache.org) and a new GitHub pull request.
>> Can I try