Hi,
I noticed that the “alter table add column” command is no longer allowed in Spark 2.0.
Are there any plans to support it in the future? (After all, it was supported in
Spark 1.6.x.)
Thanks.
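For concreteness, here is a minimal sketch (table and column names are made up) run against Spark 2.0 that reproduces the rejection; the same DDL used to go through HiveContext.sql in 1.6.x:

```scala
import org.apache.spark.sql.SparkSession

object AlterTableDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("alter-table-demo")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical table, just to have something to alter.
    spark.sql("CREATE TABLE IF NOT EXISTS demo_tbl (id INT) STORED AS PARQUET")

    // Accepted by the 1.6.x Hive support; on 2.0 this statement is rejected
    // at parse/analysis time with an "operation not allowed" style error
    // instead of adding the column.
    spark.sql("ALTER TABLE demo_tbl ADD COLUMNS (extra STRING)")

    spark.stop()
  }
}
```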
> On September 2, 2016, at 5:58 PM, 汪洋 wrote:
>
> Yeah, using an external shuffle service is a reasonable choice, but I think we
> will still face the same problems. We use SSDs to store shuffle files for
> performance reasons. If the shuffle files are not going to be used
> anymore, …
>
> It is described here:
> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-ExternalShuffleService.html
>
>
> ---
> Artur
>
> On
> Unless they are brutally killed.
>
> You can safely delete the directories when you are sure that the Spark
> applications related to them have finished. A crontab task may be used for
> automatic cleanup.
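As a reference for the setup being discussed, here is a small sketch of the relevant configuration (the paths and values are placeholders, not from the original thread): pointing spark.local.dir at the SSD mounts and enabling the external shuffle service so shuffle files stay readable after executors exit; cleaning up leftover directories afterwards (e.g. via cron, as suggested above) is outside this snippet.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: the property names are standard Spark settings, the values
// are made up for this hypothetical cluster.
val conf = new SparkConf()
  .setAppName("shuffle-on-ssd")
  // Shuffle (and other scratch) files are written to the SSD mount points.
  .set("spark.local.dir", "/mnt/ssd1/spark,/mnt/ssd2/spark")
  // Serve shuffle files from a long-running external service instead of the
  // executor, so executors can go away without losing the files.
  .set("spark.shuffle.service.enabled", "true")

val sc = new SparkContext(conf)
```

Note that spark.shuffle.service.enabled also assumes the shuffle service itself has been started on each node (for example as a YARN auxiliary service).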
>
>> On Sep 2, 2016, at 12:18, 汪洋 wrote:
>>
> On June 9, 2016, at 12:51 PM, Alexander Pivovarov wrote:
>
> reduceByKey(randomPartitioner, (a, b) => a + b) also gives an incorrect result
>
> Why does reduceByKey with a Partitioner exist, then?
>
> On Wed, Jun 8, 2016 at 9:22 PM, 汪洋 <tiandiwo...@icloud.com> wrote:
> Hi
Hi Alexander,
I don't think the result is guaranteed to be correct if an arbitrary Partitioner is
passed in.
I have created a notebook and you can check it out.
(https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/7973071962862063/2110745399505739/58107563000366
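To make the failure mode concrete, a small self-contained sketch (names and numbers are made up) of why an arbitrary partitioner breaks reduceByKey: the contract is that equal keys must map to the same partition, and a partitioner that scatters the same key across partitions leaves several partial sums that are never combined.

```scala
import scala.util.Random
import org.apache.spark.{Partitioner, SparkConf, SparkContext}

// A deliberately broken partitioner: the same key can land in different
// partitions on different calls.
class RandomPartitioner(parts: Int) extends Partitioner {
  override def numPartitions: Int = parts
  override def getPartition(key: Any): Int = Random.nextInt(parts)
}

object ReduceByKeyPartitionerDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setMaster("local[4]").setAppName("demo"))
    val data = sc.parallelize(Seq.fill(1000)(("k", 1)), numSlices = 8)

    // Default HashPartitioner: every ("k", _) record reaches one reducer,
    // so the output is the single pair ("k", 1000).
    println(data.reduceByKey(_ + _).collect().toSeq)

    // Random partitioner: partial sums for "k" end up in several partitions
    // and are never merged, so the output typically contains "k" more than
    // once, with sums that only add up to 1000 in total.
    println(data.reduceByKey(new RandomPartitioner(8), _ + _).collect().toSeq)

    sc.stop()
  }
}
```

As I understand it, the Partitioner overload exists so callers can control the number of partitions or co-partition with another RDD, but the partitioner still has to be deterministic per key.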
Hi all,
I noticed that HiveContext used to have a refreshTable() method, but it is gone in
branch-2.0.
Was it dropped intentionally? If so, how do we achieve similar functionality?
Thanks.
Yang
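A minimal sketch of what I believe is the 2.0 equivalent (the table name is made up): the refresh entry point moved to the session catalog, so the old HiveContext.refreshTable call becomes:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("refresh-demo")
  .enableHiveSupport()
  .getOrCreate()

// Invalidate cached metadata/data for the (hypothetical) table so the next
// query re-reads the current files, roughly what HiveContext.refreshTable did.
spark.catalog.refreshTable("my_table")
```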
Hi,
Currently the TakeOrderedAndProject operator in Spark SQL uses RDD’s
takeOrdered method. When we pass a large limit to the operator, however, it
returns up to partitionNum * limit records to the driver, which may cause an
OOM.
Are there any plans to deal with the problem in the community?
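To illustrate the concern (the sizes below are made up and only for illustration): RDD.takeOrdered keeps a bounded priority queue of up to `limit` elements per partition and then merges those queues on the driver, so the driver may have to hold on the order of numPartitions * limit records before producing the final answer.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TakeOrderedDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setMaster("local[4]").setAppName("top-k"))

    val rdd = sc.parallelize(1L to 10000000L, numSlices = 100)

    // With a "large" k, every partition can ship up to k pre-sorted records
    // back, so the driver sees roughly min(partitionSize, k) * numPartitions
    // elements before the per-partition queues are merged into the final top-k.
    val k = 500000
    val topK = rdd.takeOrdered(k)
    println(topK.length)

    sc.stop()
  }
}
```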
I think it cannot be right.
> On January 22, 2016, at 4:53 PM, 汪洋 wrote:
>
> Hi,
>
> Do we support distinct count in the OVER clause in Spark SQL?
>
> I ran a SQL query like this:
>
> select a, count(distinct b) over ( order by a rows between unbounded
> preceding and current row) from table limit 10
Hi,
Do we support distinct count in the OVER clause in Spark SQL?
I ran a SQL query like this:
select a, count(distinct b) over ( order by a rows between unbounded preceding
and current row) from table limit 10
Currently, it returns an error saying: expression ‘a’ is neither present in the
group by …
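For comparison, a small sketch (the table and data are made up) of what the engine does accept: a non-distinct count over the same running window, and a distinct count as an ordinary grouped aggregate; it is only the combination COUNT(DISTINCT ...) OVER (...) that is rejected with an error like the one above.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[2]").appName("window-demo").getOrCreate()

// A throwaway view with a few repeated (a, b) combinations.
spark.range(0, 100).selectExpr("id % 10 as a", "id % 3 as b").createOrReplaceTempView("t")

// Works: plain (non-distinct) count over a running window.
spark.sql(
  """select a,
    |       count(b) over (order by a rows between unbounded preceding and current row) as running_cnt
    |from t
    |limit 10""".stripMargin).show()

// Works: distinct count as a regular aggregate.
spark.sql("select a, count(distinct b) as distinct_b from t group by a").show()
```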
I get it, thanks!
> On December 31, 2015, at 3:00 AM, Michael Armbrust wrote:
>
> The goal here is to ensure that the non-deterministic value is evaluated only
> once, so the result won't change for a given row (i.e. when sorting).
>
> On Tue, Dec 29, 2015 at 10:57 PM, 汪洋 wrote:
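A conceptual sketch of the same idea at the DataFrame level (not the analyzer code itself): ordering by rand() only makes sense if the random value is fixed per row, and the analyzed plan shows the non-deterministic expression materialized once in a Project beneath the Sort rather than being re-evaluated while sorting.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.rand

val spark = SparkSession.builder().master("local[2]").appName("pull-out-demo").getOrCreate()

// Sort on a non-deterministic expression.
val df = spark.range(10).orderBy(rand())

// In the analyzed plan, rand() is "pulled out" into a Project that computes
// it once per row as a generated column, the Sort orders on that column,
// and an outer Project drops the helper column again.
df.explain(true)
```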
Hi fellas,
I am new to Spark and I have a newbie question. I am currently reading the
source code of the Spark SQL Catalyst analyzer. I don’t quite understand the
partial function in PullOutNondeterministic. What does “pull out” mean? Why do
we have to do the “pulling out”?
I would really appreciate …