the
> expressions I will be "inserting".
>
>
> On Fri, May 11, 2012 at 5:07 PM, David Kulp wrote:
Here is the default textfile. Substitute delimiters as necessary.
CREATE TABLE ...
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\001'
COLLECTION ITEMS TERMINATED BY '\002'
MAP KEYS TERMINATED BY '\003'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
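
To make the delimiter layout concrete, here is a minimal Python sketch of how one row might look on disk under these defaults. The column layout (an int id, an array of tags, a map of attributes) and the helper names encode_row/decode_row are hypothetical, chosen only for illustration. This is not Hive's SerDe code, and it ignores escaping and NULL handling.

```python
# Default text delimiters, matching the DDL above.
FIELD_SEP = "\x01"  # '\001' separates columns
ITEM_SEP = "\x02"   # '\002' separates array items and map entries
KEY_SEP = "\x03"    # '\003' separates a map key from its value
LINE_SEP = "\n"     # one row per line


def encode_row(row_id, tags, attrs):
    """Serialize a hypothetical (int, array<string>, map<string,string>) row."""
    tag_field = ITEM_SEP.join(tags)
    attr_field = ITEM_SEP.join(f"{k}{KEY_SEP}{v}" for k, v in attrs.items())
    return FIELD_SEP.join([str(row_id), tag_field, attr_field]) + LINE_SEP


def decode_row(line):
    """Inverse of encode_row; assumes no delimiter appears inside a value."""
    row_id, tag_field, attr_field = line.rstrip(LINE_SEP).split(FIELD_SEP)
    tags = tag_field.split(ITEM_SEP) if tag_field else []
    attrs = {}
    if attr_field:
        for entry in attr_field.split(ITEM_SEP):
            key, value = entry.split(KEY_SEP, 1)
            attrs[key] = value
    return int(row_id), tags, attrs


line = encode_row(7, ["a", "b"], {"k1": "v1", "k2": "v2"})
print(repr(line))
print(decode_row(line))
```

If you substitute other delimiters in the DDL, the same structure applies; only the three separator constants change.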
On May 11, 2012, at 5:58 PM, Igor Tatarinov wrote:
It's simpler than this. All files look the same -- and are often very simple
delimited text -- whether managed or external. The only difference is that the
files associated with a managed table are dropped when the table is dropped and
files that are loaded into a managed table are moved into
"STORED AS SEQUENCEFILE" and you should be
golden.
You can presumably use one of the alternative serializers in your MR program,
but I haven't tried it yet.
-d
On Apr 19, 2012, at 8:52 AM, David Kulp wrote:
> But I'm not clear on how to write a single row of multiple va
ses the value part of it, other than that you won’t
> notice the difference between sequence or plain text file
>
> From: David Kulp [mailto:dk...@fiksu.com]
> Sent: Thursday, April 19, 2012 2:13 PM
> To: user@hive.apache.org
> Subject: Re: using the key from a SequenceFile
>
I'm trying to achieve something very similar. I want to write an MR program
that writes results in a record-based sequencefile that would be directly
readable from hive as though it were created using "STORED AS SEQUENCEFILE"
with, say, BinarySortableSerDe.
From this discussion it seems that H
FROM mytable t1
> JOIN mytable t2 ON (t1.rownum = t2.rownum + 1 AND t2.partition=bar)
> WHERE t1.partition=foo;
>
> This should be faster as partition selection will happen earlier.
>
> This is still going to involve an awful lot of I/O, and not going to be fast.
>
> Phil.
>
"DESCRIBE FORMATTED tablename".
On Apr 10, 2012, at 10:51 AM,
wrote:
> Thanks - I will check this out.
>
> Meanwhile, would default clustering happen using rownum? How can I check
> how clustering is happening in our environment?
>
> Rgds
>
> ----- Original
New here. Hello all.
Could you try a self-join, possibly also restricted to partitions?
E.g. SELECT t2.value - t1.value FROM mytable t1, mytable t2 WHERE t1.rownum =
t2.rownum+1 AND t1.partition=foo AND t2.partition=bar
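
As a rough illustration of what this self-join computes, here is a small Python sketch that runs the same join logic on an in-memory SQLite table. SQLite only stands in for Hive here. The column name part is a hypothetical plain column replacing the Hive partition column, and the sample rows are made up. It shows the pairing of each row with its predecessor and says nothing about Hive performance or partition pruning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "part" is an ordinary column standing in for a Hive partition column.
conn.execute("CREATE TABLE mytable (rownum INTEGER, part TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "bar", 10),
        (2, "bar", 13),
        (2, "foo", 14),
        (3, "foo", 20),
    ],
)

# t1 is the "current" row (partition foo); t2 is the row whose rownum
# is one less (partition bar), so each result pairs a row with its predecessor.
diffs = conn.execute(
    """
    SELECT t1.rownum, t2.value - t1.value
    FROM mytable t1
    JOIN mytable t2
      ON t1.rownum = t2.rownum + 1 AND t2.part = 'bar'
    WHERE t1.part = 'foo'
    ORDER BY t1.rownum
    """
).fetchall()

for rownum, diff in diffs:
    print(rownum, diff)
```

Note that with this join condition t2 is the earlier row, so t2.value - t1.value is the previous value minus the current one; swap the operands if you want the opposite sign.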
If your data is clustered by rownum, then this join should, in theory, be