This really is not a Spark thing, but a Hadoop input format
discussion.
HTH
On Wed, Nov 23, 2016 at 10:00 AM, yeshwanth kumar wrote:
> Hi Ayan,
>
> we have default rack topology.
>
>
>
> -Yeshwanth
> Can you Imagine what I would do if I could do all I can - A
> 5 is in a different rack than 227 or
> 228? What does your topology file say?
> On 22 Nov 2016 10:14, "yeshwanth kumar" wrote:
>
>> Thanks for your reply,
>>
>> I can definitely change the underlying compression format,
>> but I am trying to understand the Loc
> Another alternative would be bzip2 (but slower in general) or
> LZO (usually not included by default in many distributions).
>
> On 21 Nov 2016, at 23:17, yeshwanth kumar wrote:
>
> Hi,
>
> we are running Hive on Spark. We have an external table over a snappy-compressed
> csv file of size 917.4 MB
Hi,
we are running Hive on Spark. We have an external table over a snappy-compressed
csv file of size 917.4 MB.
The HDFS block size is set to 256 MB.
As per my understanding, if I run a query over that external table, it
should launch 4 tasks, one for each block,
but I am seeing one executor and one task.
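
A quick way to convince yourself that this is the input format's doing (and not Hive or Spark) is to check how many partitions Spark derives from the file itself. A minimal sketch, assuming hypothetical HDFS paths: snappy-compressed text is not splittable, so the whole file becomes a single split, while a splittable codec such as bzip2 is cut on HDFS block boundaries.

import org.apache.spark.{SparkConf, SparkContext}

object SplitCheck {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("split-check"))

    // A .snappy text file uses a non-splittable codec, so the whole ~917 MB file
    // becomes one input split -> one partition -> one task.
    val snappyRdd = sc.textFile("hdfs:///data/events/events.csv.snappy")
    println(s"snappy partitions: ${snappyRdd.partitions.length}")   // expect 1

    // The same data recompressed with a splittable codec (e.g. bzip2) is cut on
    // HDFS block boundaries: ~917 MB / 256 MB block size -> about 4 partitions.
    val bzip2Rdd = sc.textFile("hdfs:///data/events/events.csv.bz2")
    println(s"bzip2 partitions: ${bzip2Rdd.partitions.length}")     // expect ~4

    sc.stop()
  }
}

If the data has to stay snappy-compressed, a container format with block-level compression (SequenceFile, Avro, Parquet, ORC) keeps the input splittable, which is what the compression suggestions above are getting at.
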
the record?
>
> > On Jul 23, 2016, at 7:53 PM, yeshwanth kumar wrote:
> >
> > Hi,
> >
> > I am doing a bulk load to HBase using Spark,
> > in which I need to generate a sequential key for each record.
> > The key should be sequential across all the executors.
Hi,
I am doing a bulk load to HBase using Spark,
in which I need to generate a sequential key for each record.
The key should be sequential across all the executors.
I tried zipWithIndex; it didn't work because it gave me an index per
executor, not across all executors.
Looking for some suggestions.
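
One way to get keys that are strictly sequential across all executors is to count each partition first and then give every partition a starting offset equal to the number of records in the partitions before it. A minimal sketch, assuming a generic RDD[T]; the names SequentialKeys and withSequentialKeys are made up for illustration.

import scala.reflect.ClassTag
import org.apache.spark.rdd.RDD

object SequentialKeys {
  // Assigns a globally sequential Long key to every record: a small first job
  // counts the records in each partition, then every partition starts numbering
  // at the sum of the counts of all partitions before it.
  def withSequentialKeys[T: ClassTag](rdd: RDD[T]): RDD[(Long, T)] = {
    val counts = rdd
      .mapPartitionsWithIndex { (pid, it) => Iterator((pid, it.size.toLong)) }
      .collect()
      .sortBy(_._1)
      .map(_._2)

    // Cumulative offsets: partition 0 starts at 0, partition 1 at counts(0), ...
    val offsets = counts.scanLeft(0L)(_ + _)

    rdd.mapPartitionsWithIndex { (pid, it) =>
      var next = offsets(pid)
      it.map { record =>
        val key = next
        next += 1
        (key, record)
      }
    }
  }
}

This costs one extra job for the counts, and the order it produces is the RDD's partition order, which is also what RDD.zipWithIndex is documented to return; the explicit version at least makes the per-partition offsets visible.
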
Hi,
I am doing a bulk load into HBase as HFileFormat,
using saveAsNewAPIHadoopFile.
When I try to write, I am getting an exception:
java.io.IOException: Added a key not lexically larger than previous.
The following is the code snippet:
case class HBaseRow(rowKey: ImmutableBytesWritable, kv: KeyValue)
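
HFiles have to be written in strictly increasing key order, so the usual fix is to totally sort the RDD by row key before calling saveAsNewAPIHadoopFile. A minimal sketch under that assumption, reusing the HBaseRow case class above; the names rows, job, and outputPath are hypothetical stand-ins for objects the job would already have.

import org.apache.hadoop.hbase.KeyValue
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.rdd.RDD

object BulkLoadSketch {
  // HBase orders row keys by unsigned lexicographic byte comparison,
  // which Bytes.compareTo provides.
  implicit val byteArrayOrdering: Ordering[Array[Byte]] =
    new Ordering[Array[Byte]] {
      def compare(a: Array[Byte], b: Array[Byte]): Int = Bytes.compareTo(a, b)
    }

  // `rows`, `job` and `outputPath` are assumed to be built elsewhere in the job.
  def writeHFiles(rows: RDD[HBaseRow], job: Job, outputPath: String): Unit = {
    val sorted = rows
      .map(r => (r.rowKey.copyBytes(), r.kv))   // plain byte[] keys for the shuffle
      .sortByKey()                              // total sort across all partitions
      .map { case (key, kv) => (new ImmutableBytesWritable(key), kv) }

    sorted.saveAsNewAPIHadoopFile(
      outputPath,
      classOf[ImmutableBytesWritable],
      classOf[KeyValue],
      classOf[HFileOutputFormat2],
      job.getConfiguration)
  }
}

Note the shuffle still moves KeyValue objects, which in practice usually means running with the Kryo serializer (or carrying plain byte arrays through the sort and building the KeyValue afterwards), and for an efficient bulk load the sorted partitions are usually also aligned with the table's region boundaries.
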