https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
2011/8/12 Daniel,Wu
> suppose the table is partitioned by period_key, and the csv file also has
> a column named period_key. The csv file contains multiple days of data;
> how can we load it into the table?
>
> I think of
suppose the table is partitioned by period_key, and the csv file also has a
column named period_key. The csv file contains multiple days of data; how
can we load it into the table?
I think of a workaround: first load the data into a non-partitioned table, and
then insert the data from n
drop table store_sales;
CREATE TABLE store_sales (
  SUBVENDOR_ID_KEY int,
  VENDOR_KEY int,
  RETAILER_KEY int,
  ITEM_KEY int,
  STORE_KEY int,
  SubvendorId string,
  OOS_REASON_KEY int,
  Total_Sales_Amount float,
  Total_Sales_Volume_Units float,
  Store_On_Hand_Volume_Units float,
  Promoted_Sal
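The staging-table workaround described above can be sketched with Hive's dynamic partitioning. This is a sketch, not the poster's actual statements: the staging table name, the CSV path, and the truncated column list are assumptions for illustration, and the SELECT must list every non-partition column of store_sales in order, with the dynamic partition column last.

```sql
-- Assumed: staging_store_sales has the same columns as store_sales,
-- plus period_key as an ordinary (non-partition) column.
LOAD DATA LOCAL INPATH '/tmp/store_sales.csv'
  INTO TABLE staging_store_sales;

-- Let Hive route each row into a partition based on its period_key value:
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The dynamic partition column must be selected last; only the first few
-- data columns are shown here.
INSERT OVERWRITE TABLE store_sales PARTITION (period_key)
SELECT subvendor_id_key, vendor_key, retailer_key, item_key, store_key,
       period_key
FROM staging_store_sales;
```

With a single insert like this, each distinct period_key value in the file becomes its own partition, so multiple days of data load in one pass.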
Hi,
See in the trace that the log4j props file is not found. I added the Hive conf
dir to the classpath while running, and now I get this trace:
http://pastebin.com/vXs98aZ5
I am completely clueless!
Thanks
JS
On Fri, Aug 12, 2011 at 9:54 AM, john smith wrote:
> Hi Carl,
>
> This is the stack tra
Hi Carl,
This is the stack trace I get .. http://pastebin.com/3pASqvDq
I configured mysql as my metastore and it is getting updated correctly
whenever I add tables via the command line.
One more thing: I am not getting any log statements while using the
command line. I haven't messed up wi
We have a table named sales, which is partitioned by period (YYYYMMDD),
and we also need a table ly_sales (last year sales). To speed up the query, we
don't use a view to join sales with a last-year mapping table (e.g. 20110603
maps to 20100603), for performance reasons. However we used the
The Hive table is just a directory in HDFS, so you can recursively set the
replication factor on it as you like. You can set it to the number of datanodes
you have. If you have 100 nodes, then run this after you create your table:
hadoop fs -setrep -R -w 100
/path/to/hive/warehouse/smal
if we have a very small table to be joined, we can use a map-side join, which
needs the small table to be available on the map task. Is it possible to
replicate the small table to ALL nodes when creating it, to cut the time to
distribute the small table?
The MAPJOIN hint helps optimize the query by loading the smaller tables
specified in the hint into memory, so every small table is in the memory of
each mapper.
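A minimal sketch of the hint; the table and column names here are assumptions for illustration, not from the thread:

```sql
-- /*+ MAPJOIN(stores) */ asks Hive to load `stores` into each mapper's
-- memory and perform the join map-side, with no shuffle or reduce phase:
SELECT /*+ MAPJOIN(stores) */ f.store_key, SUM(f.total_sales_amount)
FROM sale_fact f
JOIN stores s ON f.store_key = s.store_key
GROUP BY f.store_key;
```

The hint only pays off when the hinted table is small enough to fit in each mapper's heap; a table that is too large will cause the mappers to run out of memory.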
-Ayon
See My Photos on Flickr
Also check out my Blog for answers to commonly asked questions.
Fr
if the retailer fact table is sale_fact with 10B rows, joined with 3 small
tables: stores (10K), products (10K), period (1K), what's the best join
solution? In Oracle, it can first build a hash for stores, a hash for products,
and a hash for period, then probe using the fact table; if the row mat
Hi John,
Can you please include the error messages/exceptions that you're
encountering?
Thanks.
Carl
On Thu, Aug 11, 2011 at 1:40 PM, john smith wrote:
> Hi folks,
>
> I am trying to run Hive from eclipse. I've set it up correctly and it is
> building the jars and stuff. However I face execep
Hi folks,
I am trying to run Hive from Eclipse. I've set it up correctly and it is
building the jars and everything. However, I face exceptions when I try to run
hive queries like "show tables". There was a discussion about this on the
mailing list previously, but no solution was provided.
Are you using a custom scheduler?
I have seen issues with jobs having 0 mappers and 1 reducer with the Fair Scheduler.
From: hadoop n00b [mailto:new2h...@gmail.com]
Sent: Thursday, August 11, 2011 9:32 AM
To: user@hive.apache.org
Subject: Reducer Issue in New Setup
Hello,
We have just setup Hive on
Can you run normal MR jobs, like the example Pi calculation? Sometimes a
no-reducer problem stems from DNS issues: reducers use node names, not IP
addresses, so you need to make sure each machine knows how to resolve the
names of all the other machines in the cluster.
If it's a new cluster, you may
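A quick sanity check along these lines is to run the bundled Pi estimator; the jar path below is an assumption (it varies by distribution, but this location is typical for CDH3):

```shell
# Estimate Pi with 10 maps and 100 samples per map. This exercises both the
# map and reduce phases, so a hang here points at the cluster, not at Hive.
hadoop jar /usr/lib/hadoop/hadoop-examples.jar pi 10 100
```

If this job also stalls at the reduce phase, the problem is in the Hadoop setup (DNS, scheduler, or TaskTracker config) rather than anything Hive-specific.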
Have you checked your logs? These are often the best places to start.
Look at the running job and click on the running count, the current
task, then the task logs.
Sometimes they're helpful, sometimes they're not.
http://hadoop-master:50030/jobtracker.jsp
Travis Powell / tpow...@tealea
Hello,
We have just setup Hive on a new Hadoop cluster.
When I run a select * on a table, it works fine, but when I run any query
which needs a reducer, like count(1) or a where condition, the query just
sits there doing nothing (map 0%). I see a message like "no reducers to
run". How do I fix th
Vikas,
This question belongs to Hadoop's lists. I'm moving it to
hdfs-u...@hadoop.apache.org.
To answer your question:
DN hostnames must exist in the file pointed to by dfs.hosts if you want
selective inclusion. Otherwise you just have to start the DN with the right
config and network access to the NN, and
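For reference, selective inclusion is configured in hdfs-site.xml on the NameNode; the include-file path below is an assumption for illustration:

```xml
<!-- hdfs-site.xml: only datanodes whose hostnames appear in this file
     are allowed to register with the NameNode -->
<property>
  <name>dfs.hosts</name>
  <value>/etc/hadoop/conf/dfs.hosts.include</value>
</property>
```

The include file itself is plain text, one datanode hostname per line; if dfs.hosts is left unset, any datanode with the right config and network access may join.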
Hey All,
Please tell me where to enter the datanode IPs in CDH3u2. I installed
all the components on the namenode and datanodes, but I am confused about
where to put the datanode IPs on the namenode so that they get connected.
--
With Regards
Vikas Srivastava
DWH & Analytics Team
Mob:+91 9560885900
One97 | Let's
I did some tests and found that it is not Hive's issue; when I submit a job
using hadoop jar it has the same problem, so I need to find the root cause
in the hadoop cluster!
2011/8/11 air
> hi Aggarwal, I am using the newest version (CDH3 Update1 Hive 0.7), after
> submitting several jobs us