Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Gopal Vijayaraghavan
> Sounds like VARCHAR and CHAR types were created for Hive to have ANSI SQL
> Compliance. Otherwise they seem to be practically the same as String types.

They are relatively identical in storage, except both are slower on the CPU in actual use (CHAR has additional padding code in the hot-path).
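
To make the padding remark concrete, here is a minimal Python sketch of the semantics the Hive documentation describes for these types: VARCHAR(n) silently truncates longer values, while CHAR(n) is fixed-length, pads shorter values with spaces, and ignores trailing spaces in comparisons. The helper names (to_varchar, to_char, char_equals) are made up for illustration; this is not Hive's implementation, only a model of the extra per-value work.

```python
# Hypothetical helpers that mimic the documented CHAR/VARCHAR semantics
# in plain Python. None of these names exist in Hive.

def to_varchar(value: str, n: int) -> str:
    """VARCHAR(n): keep at most n characters, never pad."""
    return value[:n]


def to_char(value: str, n: int) -> str:
    """CHAR(n): cut to n characters (assumed to mirror VARCHAR),
    then pad with spaces so the result is exactly n characters long."""
    return value[:n].ljust(n)


def char_equals(a: str, b: str) -> bool:
    """CHAR comparison: trailing spaces are not significant."""
    return a.rstrip(" ") == b.rstrip(" ")


if __name__ == "__main__":
    print(repr(to_varchar("hive", 10)))
    print(repr(to_char("hive", 10)))
    print(char_equals(to_char("hive", 10), "hive"))
```

A plain STRING value needs neither step, which is consistent with the point that the bounded types cost some extra CPU without changing storage much.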

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Mich Talebzadeh
Sounds like VARCHAR and CHAR types were created for Hive to have ANSI SQL Compliance. Otherwise they seem to be practically the same as String types.

HTH
Dr Mich Talebzadeh

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Mich Talebzadeh
Thanks, Elliot, for the insight. Another issue is that Spark does not support CHAR types; it supports VARCHAR. Often one uses Spark as well on these tables. This should not really matter. I tend to define CHAR(N) to be VARCHAR(N) as the assumption is that the table ingested into Parquet say is alread

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Elliot West
Internally it looks as though Hive simply represents CHAR/VARCHAR values using a Java String and so I would not expect a significant change in execution performance. The Hive JIRA suggests that these types were added to 'support for more SQL-compliant behavior, such as SQL string comparison semanti

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Mich Talebzadeh
Thanks, both. String has a max length of 2GB, so in a MapReduce job with a 128MB block size we are talking about 16 blocks. With VARCHAR(30) we are talking about 1 block. I have not really experimented with this; however, I assume a table of 100k rows with VARCHAR columns will have a smaller footprint i
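
The block arithmetic in this message can be reproduced with a short Python sketch. It assumes binary units (1 MB = 1024 * 1024 bytes), one byte per character, and no file-format overhead or compression; blocks_needed is a hypothetical helper, not an HDFS or Hive API. Note that 2GB is the maximum a STRING value may reach, not what a typical value occupies.

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3


def blocks_needed(num_bytes: int, block_size: int = 128 * MB) -> int:
    """Number of fixed-size blocks needed to hold num_bytes (at least one)."""
    return max(1, math.ceil(num_bytes / block_size))


# A single STRING value at its 2GB maximum length.
max_string_blocks = blocks_needed(2 * GB)

# 100k rows of one VARCHAR(30) column, every value at full length,
# assuming one byte per character.
varchar_table_bytes = 100_000 * 30
varchar_table_blocks = blocks_needed(varchar_table_bytes)

if __name__ == "__main__":
    print(max_string_blocks)      # 16
    print(varchar_table_bytes)    # 3000000
    print(varchar_table_blocks)   # 1
```

The same 100k values stored as STRING would occupy the same 3,000,000 bytes under these assumptions, which matches the reply in this thread that the types are relatively identical in storage.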

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread sreebalineni .
How is that efficient storage-wise? As far as I can see, the data is in HDFS and storage is based on your block size. Am I missing something here?

On Jan 16, 2017 9:07 PM, "Mich Talebzadeh" wrote:
> Coming from DBMS background I tend to treat the columns in Hive similar to an RDBMS table. For examp

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Devopam Mittra
Few things that might have an effect:
1. Compression (better if you are in VARCHAR with finite length, instead of a STRING)
2. Multicharset support (like NVARCHAR)
3. LOBs from RDBMS world are more suitable to be typecast to STRING for pure text data (not images e.g.)

regards
Devopam

On Mon, Jan

Re: varchar

2014-08-26 Thread Jason Dere
What version of Hive? Do you have some sample SQL?

On Aug 26, 2014, at 1:20 PM, upd r wrote:
> Hi,
>
> I have a created a table with fields defined as varchar(length). Is it
> correct to insert data in to the table casting the fields as VARCHAR(length).
>
> I am getting this error.
> Error o

Re: varchar

2014-08-26 Thread upd r
Hi,

I have created a table with fields defined as varchar(length). Is it correct to insert data into the table by casting the fields as VARCHAR(length)?

I am getting this error.
Error occurred executing hive query: OK FAILED: SemanticException Generate Map Join Task Error: Class cannot be created