date:20110815

Setting up stats database

2011-08-15 Thread wd

Hi, I'm trying to use Postgres as the stats database, and made the following settings in hive-site.xml: hive.stats.dbclass jdbc:postgresql The default database that stores temporary hive statistics. hive.stats.autogather true A flag to gather statistics automatically during the INSERT OVERWR

Re: Setting up stats database

2011-08-15 Thread wd

Oh, I found that Hive only supports MySQL and HBase. I'll try HBase. On Mon, Aug 15, 2011 at 3:09 PM, wd wrote: > hi, > > I'm try to use postgres as stats database. And made following settings > in hive-site.xml > > > > hive.stats.dbclass > jdbc:postgresql > The default database that stores temporary

Re: Setting up stats database

2011-08-15 Thread wd

HBase Publisher/Aggregator classes cannot be loaded. A publisher/aggregator needs to be configured for HBase... so the only remaining option is to use MySQL. Will a stats database optimize Hive queries? I'm considering whether or not to set up MySQL for this. On Mon, Aug 15, 2011 at 3:17 PM, wd wrote: > oh, foun

slow performance when using udf

2011-08-15 Thread wd

Hi, I created a UDF to decode URL-encoded values, but the MapReduce job now takes about 3 times as long as before (73 sec -> 213 sec). How can I optimize it? package com.test.hive.udf; import org.apache.hadoop.hive.ql.exec.UDF; import java.net.URLDecoder; public final class urldecode extends UDF { public Str

Re: slow performance when using udf

2011-08-15 Thread Carl Steinbach

Converting it to a GenericUDF (i.e. extending GenericUDF instead of UDF) should help some with performance. On Mon, Aug 15, 2011 at 1:49 AM, wd wrote: > hi, > > I create a udf to decode urlencoded things, but found the speed for > mapred is 3 times(73sec -> 213 sec) as before. How to optimize it

Re: slow performance when using udf

2011-08-15 Thread Edward Capriolo

On Monday, August 15, 2011, Carl Steinbach wrote: > Converting it to a GenericUDF (i.e. extending GenericUDF instead of UDF) should help some with performance. > On Mon, Aug 15, 2011 at 1:49 AM, wd wrote: >> >> hi, >> >> I create a udf to decode urlencoded things, but found the speed for >> mapre

copy table, change serde

2011-08-15 Thread Jonathan Grimm

Hi, I'm trying to do what I think should be a simple task, but I'm running into some issues with carrying through column names. All I want to do is essentially copy an existing table but change the serialization format (if you're curious, this is to help integrate with some existing map reduce

Single Map task for Hive queries

2011-08-15 Thread Jon Bender

Hello, I have external tables in Hive stored in a single flat text file. When I execute queries against it, all of my jobs are run as a single map task, even on very large tables. What steps do I need to take to ensure that these queries are split up and pushed out to multiple TTs? Do I need to

Re: Single Map task for Hive queries

2011-08-15 Thread Loren Siebert

Is your external file compressed with GZip or BZip? Those file formats aren’t splittable, so they get assigned to one mapper. On Aug 15, 2011, at 10:23 AM, Jon Bender wrote: > Hello, > > I have external tables in Hive stored in a single flat text file. When I > execute queries against it, al
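The point about GZip not being splittable can be illustrated outside Hadoop with a minimal sketch in plain Java. A gzip stream can be decoded from its first byte, but a reader that starts at an arbitrary offset in the middle (as a second mapper would have to) has no header to work from and fails. The class name `GzipSplitSketch`, the helper names, and the sample data are assumptions made for this illustration; none of this is Hive or Hadoop code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration class; not part of Hive or Hadoop.
public final class GzipSplitSketch {
    private GzipSplitSketch() {}

    // Builds some line-oriented sample text, similar in shape to a flat text table.
    static String sampleText(int rows) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < rows; i++) {
            sb.append("row").append(i).append('\t').append(i * 31).append('\n');
        }
        return sb.toString();
    }

    // Compresses the text into a single gzip stream held in memory.
    static byte[] gzip(String text) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                gz.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true only if the bytes from 'offset' onward decode as a complete gzip stream.
    static boolean readableFrom(byte[] compressed, int offset) {
        try (GZIPInputStream in = new GZIPInputStream(
                new ByteArrayInputStream(compressed, offset, compressed.length - offset))) {
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
                // Drain the stream; only success or failure matters here.
            }
            return true;
        } catch (IOException e) {
            // Missing header or corrupt data: a reader cannot start at this offset.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] gz = gzip(sampleText(10000));
        System.out.println("from start:  " + readableFrom(gz, 0));
        System.out.println("from middle: " + readableFrom(gz, gz.length / 2));
    }
}
```

An uncompressed text file has no such constraint, since a reader can seek to any offset and resume at the next line boundary.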

Re: Single Map task for Hive queries

2011-08-15 Thread Jon Bender

It's actually just an uncompressed UTF-8 text file. This was essentially the create table clause: CREATE EXTERNAL TABLE foo ROW FORMAT DELIMITED STORED AS TEXTFILE LOCATION '/data/foo' Using Hive 0.7. On Mon, Aug 15, 2011 at 10:37 AM, Loren Siebert wrote: > Is your external file compressed wit

Re: Single Map task for Hive queries

2011-08-15 Thread Ayon Sinha

Can you try to recreate the external table with fields terminated by and lines terminated by clauses? -Ayon See My Photos on Flickr Also check out my Blog for answers to commonly asked questions. From: Jon Bender To: user@hive.apache.org Sent: Monday, Augus

Re: Single Map task for Hive queries

2011-08-15 Thread Loren Siebert

You should not have to do anything special to Hive to make it use all of your TT’s. The actual MR job should be governed by your mapred-site.xml file. When you run sample MR jobs (like the Pi example) and look at the job tracker, are you seeing all your TT’s getting used? On Aug 15, 2011, at 10

Re: Single Map task for Hive queries

2011-08-15 Thread Jon Bender

Yeah MapReduce itself is set up to use all of my task trackers--only one Map Task gets created on the external table queries. I tried querying another external table (composed of some 20 files) and it created 20 map tasks in turn during the query. I will try the LINES TERMINATED BY clause next t

Wiki write access, please

2011-08-15 Thread Jakob Homan

The current DDL page doesn't have documentation about the describe database command. I'd like to add that. I'm listed under my apache addr: jgho...@apache.org Thanks, Jakob

Re: Wiki write access, please

2011-08-15 Thread John Sichi

Granted! JVS On Aug 15, 2011, at 4:35 PM, Jakob Homan wrote: > The current DDL page doesn't have documentation about the describe > database command. I'd like to add that. I'm listed under my apache > addr: jgho...@apache.org > > Thanks, > Jakob

Re: slow performance when using udf

2011-08-15 Thread wd

Thanks for all your advice, I'll try it out. On Mon, Aug 15, 2011 at 9:02 PM, Edward Capriolo wrote: > > > On Monday, August 15, 2011, Carl Steinbach wrote: >> Converting it to a GenericUDF (i.e. extending GenericUDF instead of UDF) >> should help some with performance. >> On Mon, Aug 15, 2011 a

Re: slow performance when using udf

2011-08-15 Thread wd

Finally, the following code shows no performance loss. I think the point is to avoid using the getString method. Thanks again, everyone. //import org.apache.hadoop.hive.ql.udf.generic.GenericUDF; import org.apache.hadoop.hive.ql.exec.UDF; import org.apache.hadoop.io.Text; import java.net.URLDecoder;
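The message above is cut off before the body of the UDF, so the final version is not reproduced here. As a rough sketch only, the decoding step that such a UDF wraps can be written in plain Java with `java.net.URLDecoder`. The class name `UrlDecodeSketch` and the choice to return malformed input unchanged are assumptions for this illustration, not the poster's code; in a Hive UDF this logic would sit inside the `evaluate` method.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical standalone helper; the poster's actual UDF class is not shown in the archive.
public final class UrlDecodeSketch {
    private UrlDecodeSketch() {}

    // Decodes a URL-encoded string as UTF-8.
    // Assumption: null stays null, and malformed input is returned unchanged
    // instead of failing the whole job on one bad row.
    public static String decode(String encoded) {
        if (encoded == null) {
            return null;
        }
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (IllegalArgumentException e) {
            // Thrown for broken escapes such as "%zz" or a trailing "%".
            return encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("a%20b%2Fc"));
    }
}
```

Returning the input unchanged on a bad escape is one possible policy; returning null for such rows would be an equally reasonable choice, depending on how downstream queries treat them.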


17 matches
