That doesn't necessarily look like a Spark-related issue. Your
terminal seems to be displaying the glyph as a question mark, perhaps
because the font lacks that symbol?
On Fri, Nov 9, 2018 at 7:17 PM lsn24 wrote:
>
> Hello,
>
> Per the documentation default character encoding of spark is UTF-8. B
Hello,
Per the documentation, the default character encoding of Spark is UTF-8. But
when I try to read non-ASCII characters, Spark tends to read them as question
marks. What am I doing wrong? Below is my syntax:
val ds = spark.read.textFile("a .bz2 file from hdfs");
ds.show();
The string "KøBENHAVN"
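A quick way to tell whether the data was decoded wrongly or the terminal simply cannot render it is to print the Unicode code points. The snippet below is a minimal sketch: the HDFS paths are placeholders, and the ISO-8859-1 fallback assumes the file was not actually written as UTF-8.

import java.nio.charset.StandardCharsets
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.TextInputFormat

// Assumes an existing SparkSession named `spark`, as in the snippet above.
// Print each character with its code point: a correctly decoded ø shows U+00F8,
// while U+FFFD means the data itself was decoded to the replacement character
// and the terminal font is not the problem.
val ds = spark.read.textFile("hdfs:///path/to/file.bz2")  // placeholder path
ds.take(1).foreach { line =>
  println(line.map(c => f"$c U+${c.toInt}%04X").mkString(" "))
}

// If the file is actually Latin-1 rather than UTF-8, one workaround is to read
// the raw bytes with the RDD API and decode them explicitly.
val decoded = spark.sparkContext
  .hadoopFile[LongWritable, Text, TextInputFormat]("hdfs:///path/to/file.bz2")
  .map { case (_, text) =>
    new String(text.getBytes, 0, text.getLength, StandardCharsets.ISO_8859_1)
  }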
Great work Hyukjin! I'm not too familiar with R, but I'll take a look at
the PR.
Bryan
On Fri, Nov 9, 2018 at 9:19 AM Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:
> Thanks Hyukjin! Very cool results
>
> Shivaram
> On Fri, Nov 9, 2018 at 10:58 AM Felix Cheung wrote:
> >
> > Very
Another solution to the decimal case is using the capability API: use a
capability to signal that the table knows about `supports-decimal`. So
before the decimal support check, it would check
`table.isSupported("type-capabilities")`.
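A rough sketch of what that check could look like; the trait and the way the default is handled here are illustrative only, not the actual DataSourceV2 API.

// Illustrative only: hypothetical capability check with a backward-compatible default.
trait Table {
  def isSupported(capability: String): Boolean
}

def decimalIsSupported(table: Table): Boolean = {
  if (table.isSupported("type-capabilities")) {
    // The source knows about fine-grained type capabilities, so ask explicitly.
    table.isSupported("supports-decimal")
  } else {
    // Sources that predate type capabilities keep the old behavior and are
    // assumed to support decimals.
    true
  }
}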
On Fri, Nov 9, 2018 at 12:45 PM Ryan Blue wrote:
> For that ca
For that case, I think we would have a property that defines whether
supports-decimal is assumed or checked with the capability.
Wouldn't we have this problem no matter what the capability API is? If we
used a trait to signal decimal support, then we would have to deal with
sources that were writt
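For comparison, a hypothetical trait-based version of the same signal, which runs into the same question for sources written before the trait existed:

// Illustrative only: a marker trait instead of a capability string.
trait SupportsDecimal

def decimalIsSupported(source: AnyRef): Boolean = source match {
  case _: SupportsDecimal => true
  // Sources written before the trait was added land here, so Spark cannot tell
  // whether they truly lack decimal support or simply predate the trait.
  case _ => false
}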
"If there is no way to report a feature (e.g., able to read missing as
null) then there is no way for Spark to take advantage of it in the first
place"
Consider this (just a hypothetical scenario): we add "supports-decimal"
in the future, because we see that a lot of data sources don't support decima
Do you have an example in mind where we might add a capability and break
old versions of data sources?
These are really for being able to tell what features a data source has. If
there is no way to report a feature (e.g., able to read missing as null)
then there is no way for Spark to take advanta
How do we deal with forward compatibility? Consider this: Spark adds a new
"property". An existing data source actually supports that property, but since
it was never explicitly declared, the new version of Spark would treat that
data source as not supporting the property and would thus throw an exception.
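To make the concern concrete, here is a hypothetical old source that only knows the capability strings of its time; the API and names are illustrative:

// Illustrative only: an old source reports false for a capability string that
// did not exist when it was written, even if it supports the behavior.
trait Table { def isSupported(capability: String): Boolean }

class OldSource extends Table {
  private val known = Set("supports-decimal")
  override def isSupported(capability: String): Boolean = known.contains(capability)
}

def requireNewProperty(table: Table): Unit = {
  if (!table.isSupported("new-property")) {
    throw new UnsupportedOperationException("Table does not support new-property")
  }
}
// requireNewProperty(new OldSource()) throws, which is the forward-compatibility concern.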
That sounds reasonable to me
On Fri, Nov 9, 2018 at 2:26 AM Anastasios Zouzias wrote:
>
> Hi all,
>
> I ran into the following situation with Spark Structured Streaming (SS) using
> Kafka.
>
> In a project that I work on, there is already a secured Kafka setup where ops
> can issue an SSL certifica
Thanks Hyukjin! Very cool results
Shivaram
On Fri, Nov 9, 2018 at 10:58 AM Felix Cheung wrote:
>
> Very cool!
>
>
>
> From: Hyukjin Kwon
> Sent: Thursday, November 8, 2018 10:29 AM
> To: dev
> Subject: Arrow optimization in conversion from R DataFrame to Spark Da
Right now, it is up to the source implementation to decide what to do. I
think path-based tables (with no metastore component) treat an append as an
implicit create.
If you're thinking that relying on sources to interpret SaveMode is bad for
consistent behavior, I agree. That's why the community a
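As a concrete (hypothetical) illustration of that path-based behavior with the Spark 2.x DataFrameWriter; the output path is a placeholder:

import org.apache.spark.sql.{SaveMode, SparkSession}

// Illustration only: with a path-based source such as Parquet, appending to a
// path that does not exist yet is typically treated as an implicit create.
val spark = SparkSession.builder().appName("savemode-example").getOrCreate()
val df = spark.range(10).toDF("id")
df.write.mode(SaveMode.Append).parquet("hdfs:///tmp/append_example")  // placeholder path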
I'd have two places. First, a class that defines properties supported and
identified by Spark, like the SQLConf definitions. Second, in documentation
for the v2 table API.
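As a sketch of the first option, something loosely modeled on the SQLConf definitions; all capability names here are illustrative:

// Illustrative only: a single class that defines and documents the capability
// strings Spark recognizes, similar in spirit to how SQLConf centralizes configs.
object TableCapabilities {
  // The source can be asked for columns it does not contain and returns nulls.
  val READ_MISSING_AS_NULL = "read-missing-columns-as-null"
  // The source declares fine-grained type capabilities such as decimal support.
  val TYPE_CAPABILITIES = "type-capabilities"
  val SUPPORTS_DECIMAL = "supports-decimal"

  // One place to validate and document capability strings.
  val all: Set[String] = Set(READ_MISSING_AS_NULL, TYPE_CAPABILITIES, SUPPORTS_DECIMAL)
}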
On Fri, Nov 9, 2018 at 9:00 AM Felix Cheung wrote:
> One question is where will the list of capability strings be defined?
>
One question is where will the list of capability strings be defined?
From: Ryan Blue
Sent: Thursday, November 8, 2018 2:09 PM
To: Reynold Xin
Cc: Spark Dev List
Subject: Re: DataSourceV2 capability API
Yes, we currently use traits that have methods. Something
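A simplified sketch of that existing trait-with-methods pattern, close to (but not exactly) the Spark 2.3/2.4 DataSourceV2 reader mix-ins:

import org.apache.spark.sql.sources.Filter
import org.apache.spark.sql.types.StructType

// Simplified sketch of the mixin-trait style: a reader advertises a feature by
// mixing in a trait whose methods carry the feature's behavior.
trait DataSourceReader {
  def readSchema(): StructType
}

trait SupportsPushDownFilters extends DataSourceReader {
  // Accept filters, return the ones that could not be pushed down.
  def pushFilters(filters: Array[Filter]): Array[Filter]
  // Report the filters that were actually pushed.
  def pushedFilters(): Array[Filter]
}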
Very cool!
From: Hyukjin Kwon
Sent: Thursday, November 8, 2018 10:29 AM
To: dev
Subject: Arrow optimization in conversion from R DataFrame to Spark DataFrame
Hi all,
I am trying to introduce R Arrow optimization by reusing PySpark Arrow
optimization.
It boost
My Spark SQL job keeps failing with the error "Container exited with
a non-zero exit code 143".
I know that it is caused by the memory used exceeding the limit of
spark.yarn.executor.memoryOverhead. As shown below, the memory allocation
request failed at 18/11/08 17:36:05, then it RECEIVED
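A common mitigation (values are illustrative and workload-dependent) is to raise the off-heap headroom YARN grants each executor, set before the application starts:

import org.apache.spark.sql.SparkSession

// Illustrative values only; spark.yarn.executor.memoryOverhead is the older
// name of spark.executor.memoryOverhead and is given in MiB. These settings
// can also be passed with --conf on spark-submit.
val spark = SparkSession.builder()
  .appName("memory-overhead-example")
  .config("spark.yarn.executor.memoryOverhead", "2048")
  .config("spark.executor.memory", "6g")
  .getOrCreate()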
Thanks, this is great news.
Can you please let me know if dynamic resource allocation is available in
Spark 2.4?
I'm using Spark 2.3.2 on Kubernetes. Do I still need to provide executor
memory options as part of the spark-submit command, or will Spark manage the
required executor memory based on the Spark job's
Hi all,
I ran into the following situation with Spark Structured Streaming (SS) using
Kafka.
In a project that I work on, there is already a secured Kafka setup where
ops can issue an SSL certificate per "group.id", which should be predefined
(or, hopefully, at least its prefix should be predefined).
On the other
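For reference, a hedged sketch of pointing Structured Streaming at a TLS-secured Kafka cluster; the broker address, topic, and keystore paths are placeholders, and the kafka.-prefixed options are passed through to the underlying Kafka consumer.

// Assumes an existing SparkSession named `spark`; all values below are placeholders.
val stream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker1:9093")
  .option("subscribe", "events")
  .option("kafka.security.protocol", "SSL")
  .option("kafka.ssl.truststore.location", "/path/to/truststore.jks")
  .option("kafka.ssl.truststore.password", "********")
  .option("kafka.ssl.keystore.location", "/path/to/keystore.jks")
  .option("kafka.ssl.keystore.password", "********")
  .load()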