Re: Partition parquet data by ENUM column

2015-07-21 Thread Ankit
Thanks a lot, Cheng. So it seems that even in Spark 1.3 and 1.4, Parquet ENUMs were treated as strings in Spark SQL, right? Does this mean partitioning on enums already works in previous versions too, since they are just treated as strings? Also, is there a good way to verify that the partitioning is

Unsubscribe

2021-12-16 Thread Ankit Maloo
Please do unsubscribe me from your mailing list.

Scala commands syntax shortcuts(alias)

2023-04-14 Thread Ankit Singla
*eval* option available. I would like to know if any option is available to reduce the keystrokes. I appreciate your help and time in advance. Regards, Ankit Singla

Re: Spark Multiple Hive Metastore Catalog Support

2023-04-17 Thread Ankit Gupta
++ User Mailing List Just a reminder, anyone who can help on this. Thanks a lot ! Ankit Prakash Gupta On Wed, Apr 12, 2023 at 8:22 AM Ankit Gupta wrote: > Hi All > > The question is regarding the support of multiple Remote Hive Metastore > catalogs with Spark. Starting Spark

Re: Spark Multiple Hive Metastore Catalog Support

2023-04-17 Thread Ankit Gupta
Thanks Elliot ! Let me check it out ! On Mon, 17 Apr, 2023, 10:08 pm Elliot West, wrote: > Hi Ankit, > > While not a part of Spark, there is a project called 'WaggleDance' that > can federate multiple Hive metastores so that they are accessible via a > singl

Exception while triggering spark job from remote jvm

2015-07-19 Thread ankit tyagi
Hi, I am using the code below to trigger a Spark job from a remote JVM. import org.apache.hadoop.conf.Configuration; import org.apache.spark.SparkConf; import org.apache.spark.deploy.yarn.Client; import org.apache.spark.deploy.yarn.ClientArguments; /** * @version 1.0, 15-Jul-2015 * @author ankit

Re: Exception while triggering spark job from remote jvm

2015-07-19 Thread ankit tyagi
Just to add more information. I have checked the status of this file, not a single block is corrupted. *[hadoop@ip-172-31-24-27 ~]$ hadoop fsck /ankit -files -blocks* *DEPRECATED: Use of this script to execute hdfs command is deprecated.* *Instead use the hdfs command for it.* Connecting to

Re: [Spark 1.5.2]Check Foreign Key constraint

2016-05-11 Thread Ankit Singhal
You can use joins as a substitute for subqueries. On Wed, May 11, 2016 at 1:27 PM, Divya Gehlot wrote: > Hi, > I am using Spark 1.5.2 with Apache Phoenix 4.4 > As Spark 1.5.2 doesn't support subqueries in WHERE conditions. > https://issues.apache.org/jira/browse/SPARK-4226 > > Is there any altern
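
As a rough illustration of the suggestion above, a foreign-key check that would normally be written with a NOT IN subquery can be expressed as a LEFT JOIN that keeps only unmatched rows. This is a minimal sketch, not code from the thread: it uses Python's built-in sqlite3 module rather than Spark or Phoenix, and the table names `parent` and `child` and the helper `orphan_child_ids` are hypothetical.

```python
import sqlite3


def orphan_child_ids(conn):
    """Return ids of child rows whose parent_id has no match in parent."""
    # LEFT JOIN keeps every child row; unmatched rows get NULL for p.id.
    query = """
        SELECT c.id
        FROM child AS c
        LEFT JOIN parent AS p ON c.parent_id = p.id
        WHERE p.id IS NULL
        ORDER BY c.id
    """
    return [row[0] for row in conn.execute(query)]


# Hypothetical sample data: child 12 points at a parent that does not exist.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER);
    INSERT INTO parent (id) VALUES (1), (2);
    INSERT INTO child (id, parent_id) VALUES (10, 1), (11, 2), (12, 3);
""")
violations = orphan_child_ids(conn)
```

The same join shape should carry over to a Spark SQL query over two registered tables, assuming the join columns are available on both sides.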

[jira] Ankit shared "SPARK-11213: Documentation for remote spark Submit for R Scripts from 1.5 on CDH 5.4" with you

2015-10-22 Thread Ankit (JIRA)
Ankit shared an issue with you --- > Documentation for remote spark Submit for R Scripts from 1.5 on CDH 5.4 > --- > > Key: SPARK-11213 >

Exception while submit spark job through yarn client

2015-07-29 Thread ankit tyagi
--driver-memory", "1000M", // path to your application's JAR file // required in yarn-cluster mode "--jar", "local:/home/ankit/Repository/Personalization/rtis/Cust360QueryDriver/target/SnapdealCustomer360QueryDriver.jar",

OOM in spark driver

2015-09-01 Thread ankit tyagi
Hi All, I am using spark-sql 1.3.1 with Hadoop 2.4.0. I am running a SQL query against Parquet files and wanted to save the result on S3, but it looks like the https://issues.apache.org/jira/browse/SPARK-2984 problem still occurs while saving data to S3. Hence, I am now saving the result on HDFS and with t

Re: spark streaming job stopped

2016-10-04 Thread Ankit Jindal
Hi Divya, Can you please provide the full logs or stack trace? Thanks, Ankit Jindal On Wed, Oct 5, 2016 at 10:29 AM, Divya Gehlot wrote: > Hi, >

Re: RDD order preservation through transformations

2017-09-13 Thread Ankit Maloo
AFAIK, the order of an RDD is maintained within a partition for map operations. A map operation cannot change the sequence within a partition, since a partition is local and computation happens one record at a time. On 13-Sep-2017 9:54 PM, "Suzen, Mehmet" wrote: I think the order has no meani
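
The reasoning above can be modelled with a toy example. This is a plain-Python sketch under stated assumptions, not the Spark API: partitions are represented as lists of lists, and `map_partitions` is a hypothetical helper that applies a function to each record in the order it is stored.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def map_partitions(partitions: List[List[T]], f: Callable[[T], U]) -> List[List[U]]:
    """Apply f to every record, partition by partition, one record at a time."""
    # Records never move between partitions and are visited in stored order,
    # so the output order inside each partition mirrors the input order.
    return [[f(record) for record in partition] for partition in partitions]


# Two hypothetical partitions with deliberately unsorted contents.
partitions = [[3, 1, 2], [9, 7]]
doubled = map_partitions(partitions, lambda x: x * 2)
```

In this model the result is `[[6, 2, 4], [18, 14]]`: each output partition has its records in the same relative order as the input.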

Re: An alternative logic to collaborative filtering works fine but we are facing run time issues in executing the job

2019-04-16 Thread Ankit Khettry
aster node resources. Try running the job in yarn mode and if the issue persists, try increasing the disc volumes. Best Regards Ankit Khettry On Wed, 17 Apr, 2019, 9:44 AM Balakumar iyer S, wrote: > Hi , > > > While running the following spark code in the cluster with following >

Re: BigDL and Analytics Zoo talks at upcoming Spark+AI Summit and Strata London

2019-04-19 Thread Khare, Ankit
Thanks for sharing. On 19. Apr 2019, at 01:35, Jason Dai wrote: Hi all, Please see below for a list of upcoming technical talks on BigDL and Analytics Zoo (https://github.com/intel-analytics/analytics-zoo/) in the coming weeks: * Engineers

Re: writing into oracle database is very slow

2019-04-19 Thread Khare, Ankit
Hi Jiang, We faced a similar issue, so we write the file and then use Sqoop to export the data to MSSQL. We achieved a great time benefit with this strategy. On 19. Apr 2019, at 10:47, spark receiver wrote: hi Jiang, i was facing the very same i

Re: Update / Delete records in Parquet

2019-04-23 Thread Khare, Ankit
connect to MSSQL and then get CDC data to Apache KUDU Total records. : 3 B Thanks Ankit From: Chetan Khatri Date: Tuesday, 23. April 2019 at 05:58 To: Jason Nerothin Cc: user Subject: Re: Update / Delete records in Parquet Hello Jason, Thank you for reply. My use case is that, first time I

Re: [Spark SQL]: Slow insertInto overwrite if target table has many partitions

2019-04-25 Thread Khare, Ankit
Why do you need 1 partition when 10 partitions are doing the job? Thanks Ankit From: vincent gromakowski Date: Thursday, 25. April 2019 at 09:12 To: Juho Autio Cc: user Subject: Re: [Spark SQL]: Slow insertInto overwrite if target table has many partitions Which metastore are you

Turning off Jetty Http Options Method

2019-04-30 Thread Ankit Jain
s has been addressed, please let us know too. -- Thanks & Regards, Ankit.

Re: Turning off Jetty Http Options Method

2019-04-30 Thread Ankit Jain
Aah - actually found https://issues.apache.org/jira/browse/SPARK-18664 - "Don't respond to HTTP OPTIONS in HTTP-based UIs" Does anyone know if this can be prioritized? Thanks Ankit On Tue, Apr 30, 2019 at 1:31 PM Ankit Jain wrote: > Hi Fellow Spark users, > We are

Re: Turning off Jetty Http Options Method

2019-04-30 Thread Ankit Jain
+ d...@spark.apache.org On Tue, Apr 30, 2019 at 4:23 PM Ankit Jain wrote: > Aah - actually found https://issues.apache.org/jira/browse/SPARK-18664 - > "Don't respond to HTTP OPTIONS in HTTP-based UIs" > > Does anyone know if this can be prioritized? > > Thanks

Re: Turning off Jetty Http Options Method

2019-04-30 Thread Ankit Jain
-band mechanism. In this case, allowing OPTIONS allowed a remote server compromise." Thanks Ankit On Tue, Apr 30, 2019 at 7:35 PM wrote: > If this is correct, *“This method exposes what all methods are supported > by the end point”*, I really don’t understand how’s that a security >

OOM Error

2019-09-06 Thread Ankit Khettry
them are even marked resolved. Can someone guide me as to how to approach this problem? I am using Databricks Spark 2.4.1. Best Regards Ankit Khettry

Re: OOM Error

2019-09-06 Thread Ankit Khettry
Nope, it's a batch job. Best Regards Ankit Khettry On Sat, 7 Sep, 2019, 6:52 AM Upasana Sharma, <028upasana...@gmail.com> wrote: > Is it a streaming job? > > On Sat, Sep 7, 2019, 5:04 AM Ankit Khettry > wrote: > >> I have a Spark job that consists of a large nu

Re: OOM Error

2019-09-07 Thread Ankit Khettry
Thanks Chris. Going to try it soon, maybe by setting spark.sql.shuffle.partitions to 2001. Also, I was wondering if it would help if I repartition the data by the fields I am using in group by and window operations? Best Regards Ankit Khettry On Sat, 7 Sep, 2019, 1:05 PM Chris Teoh, wrote: >

Re: OOM Error

2019-09-07 Thread Ankit Khettry
Sure folks, will try later today! Best Regards Ankit Khettry On Sat, 7 Sep, 2019, 6:56 PM Sunil Kalra, wrote: > Ankit > > Can you try reducing number of cores or increasing memory. Because with > below configuration your each core is getting ~3.5 GB. Otherwise your data > is s

SparkStreaming onStart not being invoked on CustomReceiver attached to master with multiple workers

2015-04-19 Thread Ankit Patel
I am experiencing a problem with Spark Streaming (Spark 1.2.0): the onStart method is never called on CustomReceiver when calling spark-submit against a master node with multiple workers. However, Spark Streaming works fine with no master node set. Has anyone noticed this issue?

RE: SparkStreaming onStart not being invoked on CustomReceiver attached to master with multiple workers

2015-04-20 Thread Ankit Patel
eiver extends Receiver { public TestReceiver() { super(StorageLevel.MEMORY_ONLY()); System.out.println("Ankit: Created TestReceiver"); } @Override public void onStart() { System.out.println("Start TestReceiver"

RE: SparkStreaming onStart not being invoked on CustomReceiver attached to master with multiple workers

2015-04-20 Thread Ankit Patel
when no master is defined, but do not see it when there is. Also, I am running some other simple code with spark-submit with printlns and I do see them in my Spark UI, but not for Spark Streaming. Thanks, Ankit From: t...@databricks.com Date: Mon, 20 Apr 2015 13:29:31 -0700 Subject: Re

Re: Any ideas why a few tasks would stall

2014-12-04 Thread Ankit Soni
I ran into something similar before. 19/20 partitions would complete very quickly, and 1 would take the bulk of time and shuffle reads & writes. This was because the majority of partitions were empty, and 1 had all the data. Perhaps something similar is going on here - I would suggest taking a l
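
One way to check for the kind of skew described above is to count records per partition. The following is a plain-Python sketch, not Spark code: `partition_sizes` is a hypothetical helper, and the CRC32-modulo assignment is only an assumed stand-in for a hash partitioner.

```python
import zlib
from collections import Counter
from typing import Iterable, List


def partition_sizes(keys: Iterable[str], num_partitions: int) -> List[int]:
    """Count how many keys land in each partition under hash partitioning."""
    # CRC32 is used because it is stable across runs, unlike Python's str hash.
    counts = Counter(zlib.crc32(k.encode("utf-8")) % num_partitions for k in keys)
    return [counts.get(i, 0) for i in range(num_partitions)]


# Hypothetical skewed data set: one key accounts for 95 of 100 records.
keys = ["hot"] * 95 + ["a", "b", "c", "d", "e"]
sizes = partition_sizes(keys, 20)
empty = sum(1 for s in sizes if s == 0)
```

With a single dominant key, one partition holds nearly all the records while most of the others stay empty, which matches the pattern of a few slow tasks and many fast ones.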