Well, the link provided isn't really about what I originally asked about. I
have not come across a SQL implementation (Postgres, MySQL, or MSSQL are
the ones I have experience with) where LIKE was case-sensitive "by default"
with wildcards. That being said, I'm not the type to base my assertions
on
Thanks for the follow-up.
By saying "OTHER DATABASES" you are implying that all other databases
agree on the implementation.
I could not find the official SQL spec, but anecdotally it seems this
is not the case.
http://stackoverflow.com/questions/153944/is-sql-syntax-case-sensitive
With Hive you have LIKE and RLIKE to choose from.
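For illustration, a minimal sketch of the two operators on a hypothetical
table t with a STRING column name (not from the original thread):

    -- LIKE in Hive is case-sensitive; lower() is a portable workaround:
    SELECT * FROM t WHERE lower(name) LIKE '%foo%';

    -- RLIKE uses Java regular expressions; the (?i) flag turns on
    -- case-insensitive matching:
    SELECT * FROM t WHERE name RLIKE '(?i)foo';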
On Apr 4, 2012, at 06:40 , John Omernik wrote:
> I think the LIKE statement should be changed to be case-insensitive to match
> its function in other DBMSs. Thoughts?
Out of curiosity, was there any activity on this issue? I see John's original
post in the archives (~5 weeks ago) with no follow-up.
Hadoop in general does better with fewer large data files than with many
smaller ones. RDBMS-style indexing and run-time optimization is not
exactly available in Hadoop/Hive yet. So one suggestion is to combine some
of this data, if you can, into fewer tables as you Sqoop it in. Even if
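As one illustration of that consolidation idea, a sketch using CREATE TABLE
AS SELECT; the table and column names here are made up, not from the
original post:

    -- Merge two narrow imported tables into a single wider table:
    CREATE TABLE combined_orders AS
    SELECT o.id, o.ts, c.name
    FROM orders o
    JOIN customers c ON (o.cust_id = c.id);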
Are NaNs and/or Infinity supported in Hive? If yes, I want to know
how NaN and Infinity values are represented in HDFS files so that they
are interpreted correctly by Hive.
When I do 'select 1/0 from tab', I get a text value, "Infinity".
However, when I enter "Infinity" v in my HDFS file represented by th
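For reference, a quick way to observe the behavior from the Hive CLI,
assuming any one-row table tab (the -1/0 and 0/0 cases are extrapolations
from the 1/0 result described above):

    SELECT 1/0 FROM tab;    -- division is done as double; yields Infinity
    SELECT -1/0 FROM tab;   -- yields -Infinity
    SELECT 0/0 FROM tab;    -- yields NaN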
Ranjith,
If the schema of the data changes and you are using external tables, you can
drop the table and re-create it over the same dataset, taking care of the
schema changes (hopefully maintaining backwards compatibility).
I think you can still achieve that using ALTER TABLE commands with managed
tables.
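A minimal sketch of both approaches, with a hypothetical table and location:

    -- External table: drop only the metadata, then re-create over the data
    DROP TABLE events;  -- the files under /data/events are left in place
    CREATE EXTERNAL TABLE events (id INT, msg STRING, extra STRING)
    LOCATION '/data/events';

    -- Managed table: evolve the schema in place instead
    ALTER TABLE events_managed ADD COLUMNS (extra STRING);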
Hi Shin,
If you could list the query that failed and the query used to create the tables
in question, that would be very helpful.
Mark
- Original Message -
From: "Shin Chan"
To: "HIVE User"
Sent: Monday, May 14, 2012 2:28:06 AM
Subject: Order by Sort by partitioned columns
Hi All
Ju
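For reference, the usual distinction between the two clauses (t, key, and
col are placeholders, since the failing query isn't shown):

    SELECT * FROM t ORDER BY col;   -- total order; forces a single reducer
    SELECT * FROM t SORT BY col;    -- sorted within each reducer only
    -- DISTRIBUTE BY routes rows to reducers before the per-reducer sort:
    SELECT * FROM t DISTRIBUTE BY key SORT BY key, col;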
Thanks, Nitin, for your continuous support.
*Here is my data layout; please adjust the queries as needed*:
1) Initially, after importing the tables from MS SQL Server, the first basic
task I do is *PIVOTING*, since the SQL Server source stores the data as
name-value pairs.
2) Pivoting results in a subset of the data. Using th
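For what it's worth, the usual HiveQL idiom for pivoting name-value pairs is
conditional aggregation; a sketch with made-up table and attribute names:

    SELECT id,
           MAX(CASE WHEN name = 'age'  THEN value END) AS age,
           MAX(CASE WHEN name = 'city' THEN value END) AS city
    FROM name_value_pairs
    GROUP BY id;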
Partitioning is mainly used when you want to access the table based on the
value of a particular column and don't want to go through the entire table
for the same operation. This actually means that if there are a few columns
whose values are repeated across all the records, then you can consider
partitioning on them. Oth
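A minimal partitioning sketch (table and column names are hypothetical):

    CREATE TABLE logs (msg STRING)
    PARTITIONED BY (dt STRING);

    -- Only the matching partition directory is read; the rest is pruned:
    SELECT count(*) FROM logs WHERE dt = '2012-05-14';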
Hello Nitin,
Thanks for the suggestion about partitioning.
But one thing I forgot to mention before is this: *I am using indexes on all
the tables which are used again and again.*
The problem is that after execution I didn't see any difference in
performance (before app
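For context, Hive's index DDL of that era looked like the sketch below (idx,
t, and col are placeholders); note that an index does nothing until it is
rebuilt:

    CREATE INDEX idx ON TABLE t (col)
    AS 'COMPACT' WITH DEFERRED REBUILD;
    ALTER INDEX idx ON t REBUILD;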
You can also have a reduce-side bottleneck if, for example, you are doing
distinct counts or have skewed group sizes (i.e., one aggregation group is
much larger than the others).
But to know this you really need to look at the stats of your jobs via the
jobtracker and even the progress counter output of
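One knob worth checking for skewed aggregations is Hive's two-stage group-by;
the setting is real, though whether it helps depends on the query (the table
and column names below are invented):

    -- Spreads a skewed GROUP BY across two MapReduce jobs:
    SET hive.groupby.skewindata = true;
    SELECT key, count(DISTINCT user_id) FROM clicks GROUP BY key;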
It is definitely possible to increase your performance.
I have run queries where more than 10 billion records were involved.
If you are doing joins in your queries, you may have a look at the different
kinds of joins supported by Hive.
If one of your tables is very small in size compared to another table
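For example, the classic map-join hint for exactly that case, with made-up
table names:

    -- small_t is loaded into memory on each mapper, avoiding a
    -- reduce-side shuffle of the big table:
    SELECT /*+ MAPJOIN(small_t) */ b.id, s.label
    FROM big_t b JOIN small_t s ON (b.key = s.key);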
I don't know how many maps and reducers there are, because for some reason
my instance got terminated :(
One thing I want to know: if we use multiple nodes, then what should the
count of maps and reducers be?
Actually I am confused about that. How do I decide it?
Also I want to try
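For reference, the map count is driven by the number of input splits, while
the reducer count can be set or tuned with settings like the following (the
values are only examples):

    SET mapred.reduce.tasks = 8;                           -- fixed count
    SET hive.exec.reducers.bytes.per.reducer = 256000000;  -- size-based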
With a 10-node cluster the performance should improve.
How many maps and reducers are being launched?
On Mon, May 14, 2012 at 1:18 PM, Bhavesh Shah wrote:
> I have nearly 1 billion records in my relational database.
> Currently I am using just one local cluster. But I also tried this on
>
I have nearly 1 billion records in my relational database.
Currently I am using just one local cluster. But I also tried this on
Amazon Elastic MapReduce with 10 nodes. The time taken to execute the
complete program was the same as on my single local machine.
On Mon, May 14, 2012 at 1:1
How many records?
What is your Hadoop cluster setup? How many nodes?
If you are running Hadoop in a single-node setup on a normal desktop, I
doubt it will be of any help.
You need a stronger cluster setup for better query runtimes, and of course
query optimization, which I guess you would have alr
Hello all,
My use case is:
1) I have a relational database (MS SQL Server) which holds a very large
amount of data.
2) I want to do analysis on this huge dataset and generate reports on it
after the analysis.
In this way I have to generate various reports based on different analyses.
I tried to implement thi