Hello.
Thank you for your information and tips.
I will try a UDF with inspiration from get_json_object().
Thanks,
Kjell Tore
On 22 Apr 2015 22:00, "Gopal Vijayaraghavan" wrote:
>
> > In production we run HDP 2.2.4. Any thought when crazy stuff like bloom
> >filters might move to GA?
>
> I'd say
> In production we run HDP 2.2.4. Any thought when crazy stuff like bloom
>filters might move to GA?
I'd say that it will be in the next release, considering it is already
checked into hive-trunk.
Bloom filters aren't too crazy today. They are written within the ORC file
right next to the row-in
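For readers unfamiliar with the idea, a generic Bloom filter can be sketched in a few lines of Python. This is not ORC's implementation: the class name, the sizes, and the hashing scheme (double hashing over a SHA-256 digest) are assumptions chosen purely for illustration, as are the sample values. It only shows the property that matters here: a lookup answers either "definitely absent" or "possibly present", which is what lets a reader skip data that cannot match a predicate.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; illustrative only, not ORC's on-disk format."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, value):
        # Derive num_hashes bit positions from one digest via double hashing.
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value):
        # False means definitely absent; True means possibly present.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(value)
        )


bf = BloomFilter()
for user_id in ("alice", "bob", "carol"):  # hypothetical sample values
    bf.add(user_id)

print(bf.might_contain("alice"))    # True: added values are never reported absent
print(bf.might_contain("mallory"))  # usually False; a false positive is possible
```

The trade-off is between filter size and false-positive rate: more bits per stored value means fewer unnecessary reads, at the cost of a larger filter.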
Hey Gopal.
Thanks for your answers. I have some follow-ups:
On Wed, Apr 22, 2015 at 3:46 PM, Gopal Vijayaraghavan
wrote:
>
> > I have about 100 TB of data, approximately 180 billion events, in my
> >HDFS cluster. It is my raw data stored as GZIP files. At the time of
> >setup this was due to "sav
> I have about 100 TB of data, approximately 180 billion events, in my
>HDFS cluster. It is my raw data stored as GZIP files. At the time of
>setup this was due to "saving the data" until we figured out what to do
>with it.
>
> After attending @t3rmin4t0r's ORC 2015 session @hadoopsummit in Brusse
It is worth mentioning that this is 100 TB raw size, approximately 19 TB with gzip
-9 (best/slowest compression).
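For a rough sense of scale, the figures quoted in this thread (100 TB raw, about 19 TB after gzip -9, about 180 billion events) can be turned into a compression ratio and an average event size. This is only a back-of-the-envelope sketch: it assumes decimal terabytes (10^12 bytes), and the variable names are illustrative.

```python
# Figures as quoted in the thread; TB is assumed to mean 10**12 bytes.
TB = 10**12
raw_bytes = 100 * TB
gzip_bytes = 19 * TB
events = 180 * 10**9

compression_ratio = raw_bytes / gzip_bytes
avg_raw_event = raw_bytes / events
avg_gzip_event = gzip_bytes / events

print(f"compression ratio: {compression_ratio:.2f}x")
print(f"average raw event: {avg_raw_event:.0f} bytes")
print(f"average gzipped event: {avg_gzip_event:.0f} bytes")
```

Under these assumptions gzip -9 shrinks the data a little over five times, and an average event is a few hundred bytes before compression.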
On Wed, Apr 22, 2015 at 2:50 PM, Kjell Tore Fossbakk
wrote:
> Hello user@hive.apache.org
>
> I have about 100 TB of data, approximately 180 billion events, in my HDFS
> cluster. It is my raw da
Hello user@hive.apache.org
I have about 100 TB of data, approximately 180 billion events, in my HDFS
cluster. It is my raw data stored as GZIP files. At the time of setup this
was due to "saving the data" until we figured out what to do with it.
After attending @t3rmin4t0r's ORC 2015 session @had