Try with

set hive.default.fileformat=SequenceFile;

Thanks,
Navis

2014-10-06 20:51 GMT+09:00 Maciek <mac...@sonra.io>:

> Hello,
>
> I've encountered a situation where printing newlines corrupts (multiplies)
> the returned dataset.
> This seems to be similar to HIVE-3012
> <https://issues.apache.org/jira/browse/HIVE-3012> (fixed in 0.11), but
> I'm on Hive 0.13 and it is still the case.
> Here are the steps to illustrate/reproduce:
>
> 1. First, let's create a table with one row and one column by selecting from
> any existing table (substitute ANYTABLE accordingly):
>
> CREATE TABLE singlerow AS SELECT 'worldofhostels' wordsmerged FROM
> ANYTABLE LIMIT 1;
>
> and verify:
>
> SELECT * FROM singlerow;
>
> OK-----------
> worldofhostels
>
> Time taken: 0.028 seconds, Fetched: 1 row(s)
>
> All good so far.
> 2. Now let's introduce newlines:
>
> SELECT regexp_replace(wordsmerged,'of',"\nof\n") wordsseparate FROM
> singlerow;
>
> OK----------
>
> world
> of
> hostels
>
> Time taken: 6.404 seconds, Fetched: 3 row(s)
> and I'm suddenly getting 3 rows.
> 3. This is not just a CLI output issue: when submitting a CTAS, it
> materializes the same corrupted result set:
>
> CREATE TABLE corrupted AS
> SELECT regexp_replace(wordsmerged,'of',"\nof\n") wordsseparate,
> wordsmerged FROM singlerow;
>
> hive> select * from corrupted;
>
> OK
>
> world NULL
> of NULL
> hostels worldofhostels
>
> Time taken: 0.029 seconds, Fetched: 3 row(s)
> Apparently the same thing happens: the new table is split into multiple
> rows, and the columns following the one in question (like wordsmerged)
> become NULL.
> Am I doing something wrong here?
>
> Regards,
> Maciek
>
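
The row splitting described in the quoted message is consistent with how a newline-delimited text table is read back. The following is a minimal Python sketch of that mechanism, not Hive code: it assumes the default text-table delimiters (Ctrl-A between fields, newline between rows) and no escaping of embedded newlines. The helper names `write_text_rows` and `read_text_rows` are hypothetical and exist only for this illustration.

```python
# Illustrative simulation only; this is not Hive code.
FIELD_SEP = "\x01"  # assumed default field delimiter for text tables (Ctrl-A)
ROW_SEP = "\n"      # assumed default row delimiter for text tables


def write_text_rows(rows):
    """Serialize rows as delimited text, with no escaping of embedded newlines."""
    return "".join(FIELD_SEP.join(row) + ROW_SEP for row in rows)


def read_text_rows(data, num_columns):
    """Split on the row delimiter, then the field delimiter; pad short rows with None."""
    rows = []
    # The final split piece is the empty string after the last row delimiter.
    for line in data.split(ROW_SEP)[:-1]:
        fields = line.split(FIELD_SEP)
        fields += [None] * (num_columns - len(fields))  # missing columns read as NULL
        rows.append(tuple(fields[:num_columns]))
    return rows


wordsmerged = "worldofhostels"
wordsseparate = wordsmerged.replace("of", "\nof\n")

# One logical row is written, but three physical lines are read back.
stored = write_text_rows([(wordsseparate, wordsmerged)])
for row in read_text_rows(stored, num_columns=2):
    print(row)
```

Under these assumptions the single row comes back as three, with the second column empty on the first two, which matches the `corrupted` table in the report. A container format such as SequenceFile stores each row as a length-delimited record instead of relying on a newline terminator, which is presumably why the reply suggests it.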