But how do I take 100 partitions at a time from the staging table?

On Sun, May 22, 2016 at 11:26 AM, Mich Talebzadeh <[email protected]> wrote:
> OK, so you still keep the data as ORC in Hive for further analysis.
>
> What I have in mind is to have an external table as a staging table and do
> an insert into an ORC internal table which is bucketed and partitioned.
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> On 22 May 2016 at 19:11, swetha kasireddy <[email protected]> wrote:
>
>> I am looking at ORC. I insert the data using the following query.
>>
>> sqlContext.sql(" CREATE EXTERNAL TABLE IF NOT EXISTS records (id STRING,
>> record STRING) PARTITIONED BY (datePartition STRING, idPartition STRING)
>> stored as ORC LOCATION '/user/users' ")
>> sqlContext.sql(" orc.compress= SNAPPY")
>> sqlContext.sql(
>> """ from recordsTemp ps insert overwrite table users
>> partition(datePartition , idPartition ) select ps.id, ps.record ,
>> ps.datePartition, ps.idPartition """.stripMargin)
>>
>> On Sun, May 22, 2016 at 12:37 AM, Mich Talebzadeh <[email protected]> wrote:
>>
>>> Where is your base table, and what format is it (Parquet, ORC, etc.)?
>>>
>>> On 22 May 2016 at 08:34, SRK <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> In my Spark SQL query to insert data, I have around 14,000 partitions of
>>>> data, which seems to be causing memory issues. How can I insert the data
>>>> for 100 partitions at a time to avoid any memory issues?
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-insert-data-for-100-partitions-at-a-time-using-Spark-SQL-tp26997.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
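
For what it's worth, here is a minimal sketch of one way the "100 partitions at a time" batching asked about at the top could look. It is not something confirmed in this thread. It assumes the staging data is queryable as recordsTemp with datePartition and idPartition columns, and that the distinct (datePartition, idPartition) pairs are few enough to collect to the driver. The object name BatchedInsert, the helper names, and the batchSize parameter are made up for illustration. The target table is passed in as a parameter because the query quoted above creates "records" but inserts into "users". The sketch only builds the SQL strings, so it runs without Spark; the commented lines at the end show how the statements might be executed.

```scala
// Sketch only: builds one dynamic-partition INSERT per batch of partition pairs.
// BatchedInsert, predicate, insertSql, statements and batchSize are hypothetical names.
object BatchedInsert {
  // Quote a value as a SQL string literal, doubling any embedded single quotes.
  private def lit(v: String): String = "'" + v.replace("'", "''") + "'"

  // Predicate matching exactly one (datePartition, idPartition) pair.
  def predicate(pair: (String, String)): String =
    s"(ps.datePartition = ${lit(pair._1)} AND ps.idPartition = ${lit(pair._2)})"

  // INSERT statement restricted to the partitions in a single batch.
  def insertSql(target: String, batch: Seq[(String, String)]): String = {
    val where = batch.map(predicate).mkString(" OR ")
    s"""INSERT OVERWRITE TABLE $target PARTITION (datePartition, idPartition)
       |SELECT ps.id, ps.record, ps.datePartition, ps.idPartition
       |FROM recordsTemp ps
       |WHERE $where""".stripMargin
  }

  // One statement per group of at most batchSize partition pairs.
  def statements(target: String,
                 pairs: Seq[(String, String)],
                 batchSize: Int): Seq[String] =
    pairs.grouped(batchSize).map(b => insertSql(target, b)).toSeq
}

// Possible usage with the sqlContext from the thread (untested assumption):
//   val pairs = sqlContext
//     .sql("SELECT DISTINCT datePartition, idPartition FROM recordsTemp")
//     .collect().map(r => (r.getString(0), r.getString(1))).toSeq
//   BatchedInsert.statements("records", pairs, 100)
//     .foreach(stmt => sqlContext.sql(stmt))
```

Each statement touches at most 100 partitions, so fewer partition writers should be open at once. This relies on INSERT OVERWRITE with dynamic partitions replacing only the partitions actually written, which is the Hive behaviour; if that is in doubt for a given setup, INSERT INTO may be the safer choice.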
