Re: BigDecimal problem in parquet file

2015-06-18 Thread Bipin Nag
wide tables.
>
> Cheng
>
> On 6/15/15 5:48 AM, Bipin Nag wrote:
>
> Hi Davies,
>
> I have tried recent 1.4 and 1.5-snapshot to 1) open the parquet and save
> it again, or 2) apply the schema to the RDD and save the dataframe as
> parquet, but now I get this error (right in t
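A minimal sketch of the two approaches described above, against the Spark 1.4 DataFrame API; the paths, the rowRdd (an RDD[Row]), and the decimal schema are placeholders rather than code from the thread:

import org.apache.spark.sql.types._

// 1) Open an existing parquet file and save it again.
val df = sqlContext.read.parquet("/path/to/input.parquet")
df.write.parquet("/path/to/copy.parquet")

// 2) Apply an explicit schema to an RDD[Row] and save the resulting
//    dataframe as parquet. DecimalType carries the precision and scale
//    that a BigDecimal column maps to.
val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = false),
  StructField("amount", DecimalType(18, 2), nullable = true)))
val withSchema = sqlContext.createDataFrame(rowRdd, schema)
withSchema.write.parquet("/path/to/output.parquet")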

Re: BigDecimal problem in parquet file

2015-06-15 Thread Bipin Nag
bug. My error doesn't show up in newer versions, so this is the problem to fix now. Thanks

On 13 June 2015 at 06:31, Davies Liu wrote:
> Maybe it's related to a bug, which is fixed by
> https://github.com/apache/spark/pull/6558 recently.
>
> On Fri, Jun 12, 2015 at 5:3

Re: BigDecimal problem in parquet file

2015-06-12 Thread Bipin Nag
have to change it properly. Thanks for helping out.

Bipin

On 12 June 2015 at 14:57, Cheng Lian wrote:
> On 6/10/15 8:53 PM, Bipin Nag wrote:
>
> Hi Cheng,
>
> I am using Spark 1.3.1 binary available for Hadoop 2.6. I am loading an
> existing parquet file, then repartitioni

Re: BigDecimal problem in parquet file

2015-06-10 Thread Bipin Nag
Hi Cheng,

I am using the Spark 1.3.1 binary available for Hadoop 2.6. I am loading an existing parquet file, then repartitioning and saving it. Doing this gives the error. The code for this doesn't look like it is causing the problem. I have a feeling the source, the existing parquet file, is the culprit. I create
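A minimal sketch of the load-repartition-save sequence described here, using the Spark 1.3-era API names; the paths and partition count are placeholders:

// parquetFile/saveAsParquetFile were the pre-1.4 equivalents of
// read.parquet/write.parquet.
val existing = sqlContext.parquetFile("/path/to/existing.parquet")
val repartitioned = existing.repartition(8)
repartitioned.saveAsParquetFile("/path/to/repartitioned.parquet")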

Re: Error in using saveAsParquetFile

2015-06-08 Thread Bipin Nag
wrote:
> I suspect that Bookings and Customerdetails both have a PolicyType field;
> one is a string and the other is an int.
>
> Cheng
>
> On 6/8/15 9:15 PM, Bipin Nag wrote:
>
> Hi Jeetendra, Cheng
>
> I am using the following code for joining
>
> val
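If the two PolicyType columns really do disagree, one way to reconcile them before the join is to check both schemas and cast one side. A sketch, assuming Spark 1.4+ where withColumn replaces an existing column of the same name:

// Confirm the conflicting PolicyType types on both sides.
Bookings.printSchema()
Customerdetails.printSchema()

// Cast the string side to int so the schemas agree before joining.
val bookingsFixed =
  Bookings.withColumn("PolicyType", Bookings("PolicyType").cast("int"))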

Re: Error in using saveAsParquetFile

2015-06-08 Thread Bipin Nag
Hi Jeetendra, Cheng

I am using the following code for joining:

val Bookings =
  sqlContext.load("/home/administrator/stageddata/Bookings")
val Customerdetails =
  sqlContext.load("/home/administrator/stageddata/Customerdetails")

val CD = Customerdetails.
  where($"CreatedOn" > "2015-04-01 00:00:00.0").
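The snippet is cut off in the archive. Purely as an illustration, a join-and-save continuation in the same style might look like the following; the CustomerId join key and the output path are hypothetical, not taken from the thread:

// Hypothetical continuation: join on an assumed shared key and save.
val joined = CD.join(Bookings, CD("CustomerId") === Bookings("CustomerId"))
joined.saveAsParquetFile("/home/administrator/stageddata/Joined")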

Re: How to group multiple row data ?

2015-04-30 Thread Bipin Nag
OK, consider the case where there are multiple event triggers for a given customer/vendor/product, like 1,1,2,2,3, arranged in order of *event occurrence* (timestamp). The output should then be two groups, (1,2) and (1,2,3). The doublet would be the first occurrences of 1 and 2, and the triplet the later occurrences
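One possible reading of this rule, sketched in plain Scala: the k-th occurrence of each value goes into group k, and a value that occurs only once (the 3 here) is placed in the last group, since its event falls latest in timestamp order. This reproduces the example above but is only a guess at the general rule:

def groupOccurrences(events: Seq[Int]): Seq[Seq[Int]] =
  if (events.isEmpty) Seq.empty
  else {
    // The most frequent value determines the number of groups.
    val counts = events.groupBy(identity).map { case (v, occ) => v -> occ.size }
    val numGroups = counts.values.max
    val seen = scala.collection.mutable.Map.empty[Int, Int].withDefaultValue(0)
    val buckets = Array.fill(numGroups)(List.empty[Int])
    for (e <- events) {
      // k-th occurrence goes to group k; singletons go to the last group.
      val k = if (counts(e) == 1) numGroups - 1 else seen(e)
      seen(e) += 1
      buckets(k) = e :: buckets(k)
    }
    buckets.map(_.reverse.toSeq).toSeq
  }

// groupOccurrences(Seq(1, 1, 2, 2, 3)) == Seq(Seq(1, 2), Seq(1, 2, 3))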

Re: Microsoft SQL jdbc support from spark sql

2015-04-06 Thread Bipin Nag
Thanks for the information. Hopefully this will happen in the near future. For now my best bet would be to export the data and import it into Spark SQL.

On 7 April 2015 at 11:28, Denny Lee wrote:
> At this time, the JDBC Data source is not extensible so it cannot support
> SQL Server. There was some tho