All,
I am new to Spark 2.2.1. I have a single-node cluster and have also enabled
the Thrift server so that my Tableau application can connect to my persisted table.
I suspect that the Spark cluster metastore is different from the Thrift server
metastore. If this assumption is valid, what do I need to change so that both
point to the same metastore?
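A sketch of one common way to line them up, assuming a Hive metastore configured through hive-site.xml (all paths below are placeholders):

# Put the same hive-site.xml in $SPARK_HOME/conf so the Spark application
# and the Thrift server resolve the same metastore, then start:
./sbin/start-thriftserver.sh \
  --conf spark.sql.warehouse.dir=/path/to/shared/warehouse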
I am running my Spark (1.5.2) instance in a VirtualBox VM, with 10 GB of
memory allocated to it.
I have a fact table extract, with 1 rows
var glbalance_df_select = glbalance_df.select(
  "LEDGER_ID", "CODE_COMBINATION_ID", "CURRENCY_CODE",
  "PERIOD_TYPE", "TEMPLATE_ID",
  "PERIOD_NAME", "ACT
I am using Spark 1.5.2 with the MemSQL database as a persistent repository.
I am trying to update rows (based on the primary key) when a key appears more
than once, basically running the save/load as an upsert operation:
val UpSertConf = SaveToMemSQLConf(msc.memSQLConf,
I want
Thanks for your help
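For reference, a sketch of how the upsert could be expressed with the memsql-spark-connector 1.x API; the onDuplicateKeySQL parameter and the saveToMemSQL implicit follow the connector's documented upsert mechanism, but treat the exact package paths, names, and signatures as assumptions:

// Assumption: import paths follow the memsql-spark-connector 1.x layout
import com.memsql.spark.connector._
import com.memsql.spark.connector.sql.TableIdentifier

// Update the row on a duplicate primary key instead of failing the insert
val upsertConf = SaveToMemSQLConf(msc.memSQLConf,
  params = Map("onDuplicateKeySQL" -> "balance = VALUES(balance)"))
df.saveToMemSQL(TableIdentifier("mydb", "gl_balance"), upsertConf)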
From: Mike Metzger [mailto:m...@flexiblecreations.com]
Sent: Friday, August 26, 2016 2:12 PM
To: Subhajit Purkayastha
Cc: user @spark
Subject: Re: Spark 2.0 - Insert/Update to a DataFrame
Without seeing exactly what you were wanting to accomplish, it
Sent: …:13 PM
To: Subhajit Purkayastha
Cc: user @spark
Subject: Re: Spark 2.0 - Insert/Update to a DataFrame
Without seeing the makeup of the Dataframes nor what your logic is for updating
them, I'd suggest doing a join of the Forecast DF with the appropriate columns
from the SalesOrd
I am using Spark 2.0 and have 2 DataFrames, SalesOrder and Forecast. I need to
update the Forecast DataFrame record(s) based on the SalesOrder DF records.
What is the best way to achieve this functionality?
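Following the join suggestion above, a sketch of the join-based update; the column names item_id and quantity are made up for illustration:

import org.apache.spark.sql.functions._

// Keep the Forecast value unless a matching SalesOrder row overrides it
val updatedForecast = forecast.as("f")
  .join(salesOrder.as("s"), col("f.item_id") === col("s.item_id"), "left_outer")
  .select(
    col("f.item_id"),
    coalesce(col("s.quantity"), col("f.quantity")).as("quantity"))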
Using Spark 2.0 and Scala 2.11.8, I have a DataFrame with a timestamp column:
root
|-- ORG_ID: integer (nullable = true)
|-- HEADER_ID: integer (nullable = true)
|-- ORDER_NUMBER: integer (nullable = true)
|-- LINE_ID: integer (nullable = true)
|-- LINE_NUMBER: integer (nullable = true)
|--
All,
I have the following DataFrames and the temp table.
I am trying to create a new DF, but the following statement does not compile:
val df = sales_demand.join(product_master,
  (sales_demand.INVENTORY_ITEM_ID == product_master.INVENTORY_ITEM_ID),
  joinType = "inner")
What am I doing wrong?
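The statement mixes in PySpark syntax; in Scala, columns are selected with df("col"), compared with ===, and the join type is passed positionally. A sketch that should compile (same column names as above):

val df = sales_demand.join(
  product_master,
  sales_demand("INVENTORY_ITEM_ID") === product_master("INVENTORY_ITEM_ID"),
  "inner")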
I am getting the error below in the spark-shell when I call df.show(). Which
jar file do I need to download to fix this error?
scala> val df = msc.sql(query)
df: org.apache.spark.sql.DataFrame = [id: int, name: string]
scala> df.show()
java.lang.NoClassDefFoundError: spray/json/JsonR
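The truncated class name is presumably spray/json/JsonReader, which suggests spray-json is missing from the classpath. A possible fix, assuming Scala 2.10 and spray-json 1.3.2 (match whatever version your connector expects):

spark-shell --packages io.spray:spray-json_2.10:1.3.2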
All,
Is it possible to integrate Spark 1.6.1 with a MemSQL cluster? Any pointers on
how to start with the project would be appreciated.
Thx,
Subhajit
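As a starting point, a sketch of wiring the two together through the memsql-spark-connector; the artifact coordinates and package path below are assumptions, so check the connector's README for the release matching Spark 1.6.1:

spark-shell --packages com.memsql:memsql-connector_2.10:1.3.3

// MemSQLContext reads memsql.host / memsql.port from the Spark conf
import com.memsql.spark.connector.MemSQLContext
val msc = new MemSQLContext(sc)
val df = msc.sql("SELECT * FROM mydb.mytable")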
Can I join 3 different RDDs together in a Spark SQL DF? I can find examples
for 2 RDDs but not 3.
Thanks
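Joins chain, so three (or more) work the same way as two; a sketch with three hypothetical RDDs of tuples and a made-up key column "id":

import sqlContext.implicits._

val df1 = rdd1.toDF("id", "a")
val df2 = rdd2.toDF("id", "b")
val df3 = rdd3.toDF("id", "c")

// Chain pairwise joins on the shared key
val joined = df1
  .join(df2, df1("id") === df2("id"))
  .join(df3, df1("id") === df3("id"))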
I am on Spark 1.3.1.
When I run the following with spark-shell, it works:
spark-shell --packages com.databricks:spark-csv_2.10:1.0.3
Then I can create a DF using the spark-csv package
import sqlContext.implicits._
import org.apache.spark.sql._
// Return the dataset specified by d
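For completeness, a sketch of loading a CSV through the spark-csv data source with the 1.3-era load API; the file name and header option are placeholders:

// Load a CSV into a DataFrame via the spark-csv data source
val df = sqlContext.load(
  "com.databricks.spark.csv",
  Map("path" -> "cars.csv", "header" -> "true"))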