Re: REG: Using Sequences in Phoenix Data Frame

2015-08-18 Thread James Taylor
See PhoenixHBaseLoaderIT.testDataForSQLQueryWithSequences() (a phoenix-pig integration test).

Re: REG: Using Sequences in Phoenix Data Frame

2015-08-17 Thread Ns G
It would be really helpful if links were provided to resources where sequences are used in MapReduce, which I will try to replicate in Spark. Thank you, James and Josh, for your answers.

Re: REG: Using Sequences in Phoenix Data Frame

2015-08-17 Thread Josh Mahonin
Oh, neat! I was looking for some references to it in code, unit tests and docs and didn't see anything relevant. It's possible they might "just work" then, although it's definitely an untested scenario.

Re: REG: Using Sequences in Phoenix Data Frame

2015-08-17 Thread James Taylor
Sequences are supported by the MR integration, but I'm not sure whether their use through the Spark integration would cause any issues.
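
One plausible reason the MapReduce integration handles sequences is that it accepts a full Phoenix query as input, so a NEXT VALUE FOR expression can appear there. A minimal sketch of that pattern, assuming the Phoenix 4.x PhoenixMapReduceUtil API; the table, sequence, and writable class are hypothetical:

    import java.sql.{PreparedStatement, ResultSet}
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.mapreduce.Job
    import org.apache.hadoop.mapreduce.lib.db.DBWritable
    import org.apache.phoenix.mapreduce.util.PhoenixMapReduceUtil

    // A DBWritable mapping one row of the query below (hypothetical shape).
    class SeqRowWritable extends DBWritable {
      var id: Long = _
      var name: String = _
      override def readFields(rs: ResultSet): Unit = {
        id = rs.getLong(1)
        name = rs.getString(2)
      }
      override def write(ps: PreparedStatement): Unit = {
        ps.setLong(1, id)
        ps.setString(2, name)
      }
    }

    val job = Job.getInstance(new Configuration(), "phoenix-sequence-input")
    // The input is full Phoenix SQL, so a sequence expression is allowed here.
    PhoenixMapReduceUtil.setInput(
      job,
      classOf[SeqRowWritable],
      "MY_TABLE", // hypothetical table
      "SELECT NEXT VALUE FOR my_seq, NAME FROM MY_TABLE")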

Re: REG: Using Sequences in Phoenix Data Frame

2015-08-17 Thread Josh Mahonin
Hi Satya, I don't believe sequences are supported by the broader Phoenix map-reduce integration, which the phoenix-spark module uses under the hood. One workaround that would give you sequential IDs is to use the 'zipWithIndex' method on the underlying Spark RDD, with a small 'map()' operation …
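
A minimal sketch of the zipWithIndex workaround described above, assuming the phoenix-spark saveToPhoenix API for RDDs of tuples; the table, columns, and ZooKeeper URL are hypothetical:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.phoenix.spark._ // adds saveToPhoenix to RDDs of tuples

    case class Record(name: String, amount: Double)

    val sc = new SparkContext(new SparkConf().setAppName("seq-workaround"))
    val records = sc.parallelize(Seq(Record("a", 1.0), Record("b", 2.0)))

    // zipWithIndex assigns each element a unique, contiguous 0-based Long
    // index; the map() folds that index into the row as a 1-based key.
    val withIds = records.zipWithIndex.map { case (r, idx) =>
      (idx + 1, r.name, r.amount)
    }

    withIds.saveToPhoenix(
      "MY_TABLE",                    // hypothetical target table
      Seq("ID", "NAME", "AMOUNT"),   // columns matching the tuple arity
      zkUrl = Some("localhost:2181") // hypothetical ZooKeeper quorum
    )

Since zipWithIndex numbers elements from zero on every run, this stands in for a server-side sequence only for a one-off load; an incremental load would need an offset added to idx.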

REG: Using Sequences in Phoenix Data Frame

2015-08-15 Thread Ns G
Hi All, I hope that someone will reply to this email, as all my previous emails have gone unanswered. I have 10-20 million records in a file and I want to insert them through Phoenix-Spark. The table's primary ID is generated by a sequence, so every time an upsert is done, the sequence ID gets generated …
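
For context on the pattern Satya describes, a Phoenix sequence supplies the primary key through NEXT VALUE FOR at upsert time. A minimal sketch over plain JDBC; the sequence, table, and connection URL are hypothetical:

    import java.sql.DriverManager

    val conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181")
    val stmt = conn.createStatement()
    stmt.execute("CREATE SEQUENCE IF NOT EXISTS my_seq")
    stmt.execute(
      "CREATE TABLE IF NOT EXISTS my_table (id BIGINT PRIMARY KEY, name VARCHAR)")
    // NEXT VALUE FOR draws a fresh value from the sequence for each row.
    stmt.executeUpdate("UPSERT INTO my_table VALUES (NEXT VALUE FOR my_seq, 'row-1')")
    conn.commit() // Phoenix connections are not auto-commit by default
    conn.close()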