Adithya,
Here are two short articles about:
MS SQL => Sqoop
http://mapredit.blogspot.com/2011/10/sqoop-and-microsoft-sql-server.html
And how to use the self-generated classes:
http://mapredit.blogspot.com/2011/10/speedup-sqoop.html
- Alex
On Fri, Dec 9, 2011 at 5:51 AM, wrote:
> Adithya
> The answer is yes.
Adithya
The answer is yes. Sqoop is the tool you are looking for. It has an
import option to load data from any JDBC-compliant database into Hive. It
even creates the Hive table for you by referring to the source DB table.
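As a rough sketch, the import can be done in one step; the connection string, credentials,
and table names below are placeholders, not from this thread:
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl -P \
  --table orders \
  --hive-import --create-hive-table \
  --hive-table orders
The --create-hive-table flag is what derives the Hive table definition from the source DB table.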
Hope it helps!
Regards
Bejoy K S
-Original Message-
Hi,
I want to know if there is any way to load data directly from
some other DB, say Oracle/MySQL etc., into Hive tables, without first extracting the
data from the DB into a text/RCFile/sequence file in a specific format and then
loading the data from that file into the Hive table.
Regards,
Adi
It is a Hadoop limitation. The HDFS move operation is inexpensive. I am
assuming that is not an option for you because you want to preserve the path
structure (for backward compatibility's sake).
Something like symbolic links (I think they are not supported in 0.20, not sure)
or a path filter might help. But,
Hi Vince,
Hive partitioning only works by creating new directories in HDFS. There
is no way to partition the data in a Hive table without adding extra
file paths/directories in HDFS.
For an external table you have to redistribute the data yourself into the
corresponding file paths and add the new partition.
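At the HDFS level that redistribution is just directory creation plus cheap moves; a rough
sketch, with the directory names purely illustrative:
hadoop fs -mkdir /logs/cal_date=2011-09-01
hadoop fs -mv /logs/log-2011-09-01 /logs/cal_date=2011-09-01/
followed by an ALTER TABLE ... ADD PARTITION for the new directory (see the example further
down in this thread).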
Hi Matt
Thanks for the response. We tried the example you provided without success.
When we tried to add a partition by specifying the location as a file
(log-2011-09-01.log), Hive complained with "Parent path is not a directory". I
think Hive expects a directory.
Our directory structure, a
Hi Vince,
External tables won't issue copy or move commands against your data files. You
should set the base table location to '/logs', and issue ALTER TABLE
commands to add partitions for each date.
Example:
CREATE EXTERNAL TABLE logs (
  Data STRING
) PARTITIONED BY (cal_date STRING)
ROW FORMAT DELIMITED
LOCATION '/logs';
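Each day's partition would then be added with something along these lines; the location value
assumes the day's file has been moved into its own directory, as discussed elsewhere in the
thread:
ALTER TABLE logs ADD PARTITION (cal_date='2011-09-01')
LOCATION '/logs/cal_date=2011-09-01';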
https://cwiki.apache.org/confluence/display/Hive/ContributorMinutes20111205
I created an INFRA ticket to take Hive out of Review Board:
https://issues.apache.org/jira/browse/INFRA-4200
Please use Phabricator for all new review requests:
https://cwiki.apache.org/confluence/display/Hive/Phabricat
Hi,
I am running Hive 0.7.0 with Hadoop 0.20.2. I have one HDFS folder full of web
server logs dating back several months.
Is it possible to partition an EXTERNAL TABLE without copying/moving files or
altering the layout of the directory?
For example, in HDFS, I have:
> /logs/log-2011-09-01
> /l
On Dec 8, 2011, at 12:20 PM, Sam William wrote:
> I have a bunch of custom UDFs and I'd like the others in the company to
> make use of them in an easy way. I'm not very happy with the 'CREATE
> TEMPORARY FUNCTION' arrangement for each session. It'd be great if our
> site-specific
Hi,
I have a bunch of custom UDFs and I'd like the others in the company to make
use of them in an easy way. I'm not very happy with the 'CREATE TEMPORARY
FUNCTION' arrangement for each session. It'd be great if our site-specific
functions worked the same way as the inbuilt functions.
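For reference, the per-session arrangement being complained about looks like this; the jar
path and class name here are hypothetical:
ADD JAR /path/to/our-udfs.jar;
CREATE TEMPORARY FUNCTION normalize_url AS 'com.example.hive.udf.NormalizeUrl';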
Hi Keshav
Adding on to others' comments: you can install Hive anywhere, not
necessarily on the NameNode. You can install it on a data node or on a
utility server other than the NameNode as well; I know a few large clusters that
operate that way. The same applies to Pig and other libraries.
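In other words, a Hive client box only needs a Hadoop client configuration that points at the
cluster; a minimal sketch, with hypothetical paths:
export HADOOP_HOME=/opt/hadoop   # its conf/ points at the cluster's NameNode and JobTracker
export HIVE_HOME=/opt/hive
$HIVE_HOME/bin/hive -e 'SHOW TABLES;'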
Using CombineFileInputFormat might help, but it still creates overhead
when you hold many small files in HDFS.
I don't know the details of your requirements, but option 2 seems to be
better; make sure that X is at least the size of a few blocks in HDFS.
You could also merge files incrementally, lik
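On the CombineFileInputFormat point, in Hive this is typically enabled per session with
settings along these lines (268435456 bytes is roughly 256 MB, i.e. a few blocks, and is only
an illustrative value):
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
set mapred.max.split.size=268435456;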
Hi Keshav,
What you want is not possible, I guess. You can't submit anything into HDFS
without the NameNode. DataNodes report their local blocks to the
NameNode. If the NameNode does not know about them, it will instruct the DataNode
to delete them.
But what's the point? If you submit local files to HDFS
Hi Vikas,
I think there is some problem in understanding. I have my cluster set up
with Hive installed on the NameNode, and I can insert data into HDFS
using Hive.
My question is: can I install Hive on one of the DataNodes (instead of the
NameNode) and load data from there on the DataNode directly?
You can also take a look at:
https://issues.apache.org/jira/browse/HIVE-74
On Wed, Dec 7, 2011 at 9:05 PM, Savant, Keshav <
keshav.c.sav...@fisglobal.com> wrote:
> You are right Wojciech Langiewicz, we did the same thing and posted the
> results yesterday. Now we are planning to do this using a sh