Hi,
We are newbies learning Spark. We are running a Scala query against our
Parquet table. Whenever we fire the query in Jupyter, only part of the
results is shown in the UI page. So we are trying to store
the results into a table in Parquet format. By default, in Spark all
the ta
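For the store-the-results part of the question, a minimal sketch of writing a query result out as a Parquet-backed table, assuming a SparkSession named `spark` already exists; the query and the `mydb.results` table name are hypothetical:

```scala
// Sketch: persist the full query result as a Parquet table instead of
// relying on the truncated notebook display.
// Assumes `spark` is an existing SparkSession.
val resultDf = spark.sql("SELECT * FROM my_parquet_table")  // hypothetical query

// Save as a managed table in Parquet format.
resultDf.write
  .mode("overwrite")           // replace any previous run's output
  .format("parquet")
  .saveAsTable("mydb.results") // hypothetical database.table name

// Alternatively, write plain Parquet files to a path:
resultDf.write.mode("overwrite").parquet("/warehouse/results")
```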
+1.
We even see performance degradation when comparing Spark SQL with Hive.
We have a table of 260 columns and have executed the same query in Hive and in
Spark. In Hive it takes 66 seconds for 1 GB of data, whereas in Spark it takes 4 minutes.
On 6/9/2016 3:19 PM, Gavin Yue wrote:
Could you print out the s
Hi,
Is there any way to dynamically execute a string of Scala code
against the Spark engine? We are dynamically creating a Scala file and would
like to submit it to Spark, but currently Spark accepts
only a JAR file as input for remote job submission. Is there any other
way to s
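One option worth noting: `spark-shell -i script.scala` will run a plain .scala file without packaging a JAR. For evaluating a Scala code *string* at runtime, the Scala 2 ToolBox (from scala-compiler) can compile and run it; a minimal sketch with no Spark involved — to target Spark, the generated code would additionally need a SparkSession in scope:

```scala
// Sketch: compile and evaluate a string of Scala code at runtime
// using the Scala 2 ToolBox API (requires scala-compiler on the classpath).
import scala.reflect.runtime.universe
import scala.tools.reflect.ToolBox

val toolbox = universe.runtimeMirror(getClass.getClassLoader).mkToolBox()

// This string stands in for the dynamically generated file's contents.
val code = """List(1, 2, 3).map(_ * 2).sum"""
val result = toolbox.eval(toolbox.parse(code)).asInstanceOf[Int]
// result == 12
```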
want to do so?
Ideally there would be a better approach than solving such problems as
mentioned below.
A sample example would help us understand the problem.
Regards,
Kiran
From: Mahender Sarangam
<mailto:mahender.bigd...@outlook.com>
Date: Wednesday, October 26, 2016 at 2:05 PM
To: user <m
Hi,
We are converting our Hive logic, which uses the lateral view and explode
functions. Is there any built-in function in Scala for performing a lateral
view explode?
Below is our query in Hive; temparray is a temp table with c0 and c1 columns.
SELECT id, CONCAT_WS(',', collect_list(LineID)) as Li
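Spark's DataFrame API does have built-in equivalents: `explode` in `org.apache.spark.sql.functions` covers LATERAL VIEW explode, and `collect_list` plus `concat_ws` cover the aggregation in the query above. The semantics can be sketched in plain Scala collections (the `id`/line values are hypothetical stand-ins for the c0/c1 columns):

```scala
// Plain-Scala model of LATERAL VIEW explode followed by
// CONCAT_WS(',', collect_list(...)). The Spark SQL counterparts are
// functions.explode, functions.collect_list and functions.concat_ws.

// One row per id, holding an array column (like temparray's c0, c1).
val rows: Seq[(Int, Seq[String])] = Seq(
  (1, Seq("L1", "L2")),
  (2, Seq("L3"))
)

// LATERAL VIEW explode: one output row per array element.
val exploded: Seq[(Int, String)] =
  rows.flatMap { case (id, lines) => lines.map(id -> _) }

// collect_list + CONCAT_WS(','): group back per id, join with commas.
val concatenated: Map[Int, String] =
  exploded.groupBy(_._1).map { case (id, pairs) =>
    id -> pairs.map(_._2).mkString(",")
  }
// concatenated == Map(1 -> "L1,L2", 2 -> "L3")
```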
Hi All,
Is there any support for theta joins in Spark? We want to identify the
country based on a range of IP addresses (which we have in our DB).
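Spark does accept non-equi (theta) join conditions in `DataFrame.join`, e.g. `events.join(ranges, events("ip") >= ranges("ipStart") && events("ip") <= ranges("ipEnd"))` (column names hypothetical). The lookup logic itself, modeled in plain Scala:

```scala
// Plain-Scala model of a range (theta) join: map a numeric IP to a
// country by finding which [start, end] interval contains it.
// In Spark the same predicate can be used directly as a join condition.
final case class IpRange(start: Long, end: Long, country: String)

// Hypothetical sample ranges (IPs converted to numbers so they compare).
val ranges = Seq(
  IpRange(100L, 199L, "US"),
  IpRange(200L, 299L, "IN")
)

def countryOf(ip: Long): Option[String] =
  ranges.find(r => ip >= r.start && ip <= r.end).map(_.country)
// countryOf(150L) == Some("US"); countryOf(500L) == None
```

Note that a naive non-equi join is evaluated as a broadcast/nested-loop comparison in Spark, so keeping the range table small (broadcastable) matters for performance.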
-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
Hi,
I'm new to Spark and big data. We are doing a PoC and building our
warehouse application using Spark. Can anyone share guidance with me,
like naming conventions for HDFS names, table names, UDFs, and DB names? Any
sample architecture diagram would also help.
-Mahens
Hi,
I'm new to Spark and Scala and need help transforming nested JSON using Scala.
We have an upstream system returning JSON like
{
"id": 100,
"text": "Hello, world."
Users : [ "User1": {
"name": "Brett",
"id": 200,
"Type" : "Employee"
"empid":"2"
},
"Use
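With Spark, `spark.read.json(path)` infers the nested schema, nested fields are addressed with dot notation (e.g. `$"user.name"`), and array columns are flattened with `explode`. The shape of the transformation, modeled with plain Scala case classes — the field names follow the fragment above, and the structure is hypothetical where the JSON is cut off:

```scala
// Plain-Scala model of flattening a nested record that contains an
// array of users into one flat row per user. In Spark this becomes
// spark.read.json(...) plus explode on the array and dot-notation selects.
final case class User(name: String, id: Int, userType: String, empId: String)
final case class Doc(id: Int, text: String, users: Seq[User])

val doc = Doc(100, "Hello, world.",
  Seq(User("Brett", 200, "Employee", "2")))

// One flat (docId, userName, userType) row per element of the users array.
val flat: Seq[(Int, String, String)] =
  doc.users.map(u => (doc.id, u.name, u.userType))
// flat == Seq((100, "Brett", "Employee"))
```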
Hi,
Can anyone share good tutorials on Spark with Scala, like
videos or blogs for beginners, mostly focusing on writing Scala code?
Thanks in advance.
Hi,
Does anyone have a good architecture document or design principles for building
a warehouse application using Spark?
Is it better to create a HiveContext and perform transformations with HQL,
or to load files directly into a DataFrame and perform the data
transformations there?
We need to implement SCD
Hi,
We are storing our final transformed data in a Hive table in JSON format. While
storing data into the table, all the null fields are converted into \\N; while
reading the table, we see \\N instead of NULL. We tried setting
ALTER TABLE sample set SERDEPROPERTIES ('serialization.null.format' = "\
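For reference, the usual complete form of that statement sets the property to an empty string so NULLs are not written as the \N marker. This is a sketch of the common fix; the exact behavior depends on the SerDe backing the table:

```sql
ALTER TABLE sample SET SERDEPROPERTIES ('serialization.null.format' = '');
```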
I’m trying to read multiple .json.gz files from a Blob storage path using the
Scala code below, but I’m unable to read the data from the files or print the
schema. If the files are not compressed as .gz, then we are able to read all the
files into the DataFrame.
I’ve even tried giving *.gz but n
a folder containing
multiple gz files.
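Spark decompresses .gz files transparently based on the file extension, so a wildcard read over the folder should normally work; a minimal sketch, assuming a SparkSession named `spark` (the wasbs:// account, container, and path are hypothetical):

```scala
// Sketch: read every gzipped JSON file under a Blob storage folder.
// Spark picks the gzip codec from the .gz extension automatically; a
// common failure mode is files whose real format differs from their
// extension (e.g. renamed files that are not actually gzip).
val df = spark.read
  .json("wasbs://container@account.blob.core.windows.net/data/*.json.gz") // hypothetical path
df.printSchema()
df.show(5)
```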
From: Mahender Sarangam
<mailto:mahender.bigd...@outlook.com>
Sent: Monday, October 1, 2018 2:00 AM
To: user@spark.apache.org<mailto:user@spark.apache.org>
Subject: Unable to read multiple JSON.Gz File.
I’m trying to read multiple .json.gz
Hi,
We have a daily data pull which brings in almost 50 GB of data from an upstream
system. We are using Spark SQL to process the 50 GB and finally insert it into a
Hive target table. Now we are copying the whole Hive target table to SQL
(a SQL staging table) and implement a merge from staging
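The staging-table copy can be done straight from Spark over JDBC instead of exporting the Hive table separately; a sketch, assuming a SparkSession named `spark`, with hypothetical connection details and table names (the MERGE itself would still run on the SQL side):

```scala
// Sketch: write the transformed Hive data directly into a SQL staging
// table over JDBC, refreshing it on each daily run.
import java.util.Properties

val props = new Properties()
props.setProperty("user", "etl_user")      // hypothetical credentials
props.setProperty("password", "secret")
props.setProperty("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")

val df = spark.table("hive_db.target_table")  // hypothetical Hive table
df.write
  .mode("overwrite")  // replace staging contents before the merge runs
  .jdbc("jdbc:sqlserver://host:1433;databaseName=dw", "dbo.Staging", props)
```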