Thanks for the quick reply, Bin. Phoenix is something I'm definitely going to
try, but it seems somewhat redundant if I can use Spark.
As you said, Phoenix probably makes more effective use of memory, since it
uses a dedicated data structure within each HBase table. But if I need to
deserialize data stored in an HBase cell, I still have to read that object
into memory, and for that I need Spark. From what I understood, Phoenix is
good if I have to query a simple column of HBase, but things get really
complicated if I have to add an index for each column in my table and I
store complex objects within the cells. Is that correct?
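For what it's worth, the cost I mean can be sketched in a few lines (a toy
example, not Phoenix or HBase API code; the record layout is made up): a cell
holding a serialized complex object is just an opaque byte blob, so a SQL
layer like Phoenix cannot filter on its inner fields, and any query over them
forces a full read-and-deserialize, which is the part a Spark job would do.

```python
import pickle

# Hypothetical "complex object" that an application might serialize
# into a single HBase cell.
record = {"id": 42, "tags": ["a", "b"], "nested": {"score": 0.9}}

# What the write path would actually store in the cell: opaque bytes.
cell_bytes = pickle.dumps(record)

# A SQL-on-HBase layer can index and filter plain column values, but an
# opaque blob like this must first be read back into memory and
# deserialized before any field inside it can be inspected.
restored = pickle.loads(cell_bytes)
print(restored["nested"]["score"])
```
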

Best,
Flavio



On Tue, Apr 8, 2014 at 6:05 PM, Bin Wang <binwang...@gmail.com> wrote:

> Hi Flavio,
>
> I happened to attend (actually, I am attending) the 2014 Apache Conf, where
> I heard about a project called "Apache Phoenix". It fully leverages HBase
> and is supposed to be 1000x faster than Hive. It is also not memory-bound,
> which is a limitation for Spark. It is still in the incubator, and the
> "stats" functions Spark has already implemented are still on its roadmap.
> I am not sure whether it will be good, but it might be something
> interesting to check out.
>
> /usr/bin
>
>
> On Tue, Apr 8, 2014 at 9:57 AM, Flavio Pompermaier
> <pomperma...@okkam.it> wrote:
>
>> Hi to everybody,
>>
>> In recent days I have looked a bit at the evolution of the big data
>> stacks, and it seems that HBase is somehow fading away in favour of
>> Spark+HDFS. Am I correct?
>> Do you think that Spark and HBase should work together or not?
>>
>> Best regards,
>> Flavio
>>
>
