hi,

I'm trying to use postgres as the stats database, and made the following
settings in hive-site.xml:

<property>
  <name>hive.stats.dbclass</name>
  <value>jdbc:postgresql</value>
  <description>The default database that stores temporary hive
statistics.</description>
</property>

<property>
  <name>hive.stats.autogather</name>
  <value>true</value>
  <description>A flag to gather statistics automatically during the
INSERT OVERWRITE command.</description>
</property>
oh, I found Hive only supports MySQL and HBase as the stats database. I'll try HBase.
On Mon, Aug 15, 2011 at 3:09 PM, wd wrote:
> hi,
>
> I'm trying to use postgres as the stats database, and made the following
> settings in hive-site.xml:
>
> <property>
>   <name>hive.stats.dbclass</name>
>   <value>jdbc:postgresql</value>
>   <description>The default database that stores temporary ...
"HBase Publisher/Aggregator classes cannot be loaded."

I need to configure a publisher/aggregator for HBase... so there is only
one way left, which is to use MySQL.

Will a stats database actually optimize Hive queries? I'm considering
whether or not to set up a MySQL instance just for this.
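If I do set it up, I think the settings would look something like this
(a sketch only; the connection string, database name, and credentials
below are placeholders, not tested values):

<property>
  <name>hive.stats.dbclass</name>
  <value>jdbc:mysql</value>
</property>
<property>
  <name>hive.stats.jdbcdriver</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>hive.stats.dbconnectionstring</name>
  <!-- placeholder host/db/user; createDatabaseIfNotExist lets MySQL create the stats db -->
  <value>jdbc:mysql://localhost/hive_stats?user=hive&amp;password=hive&amp;createDatabaseIfNotExist=true</value>
</property>

The MySQL JDBC driver jar would also have to be on Hive's classpath.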
On Mon, Aug 15, 2011 at 3:17 PM, wd wrote:
> oh, I found Hive only supports MySQL and HBase as the stats database. ...
hi,

I created a UDF to decode urlencoded strings, but found that the mapred
time is now three times what it was before (73 sec -> 213 sec). How can I
optimize it?

package com.test.hive.udf;
import org.apache.hadoop.hive.ql.exec.UDF;
import java.net.URLDecoder;
public final class urldecode extends UDF {
  public String evaluate(String s) {
    try { return s == null ? null : URLDecoder.decode(s, "UTF-8"); }
    catch (Exception e) { return null; }
  }
}
Converting it to a GenericUDF (i.e. extending GenericUDF instead of UDF)
should help some with performance.
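Untested, but the skeleton would look roughly like this (the class name
and the null-on-bad-input behavior are my own choices, not from your code):

package com.test.hive.udf;

import java.net.URLDecoder;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector;
import org.apache.hadoop.io.Text;

public class UrlDecode extends GenericUDF {
  private StringObjectInspector inputOI;
  private final Text result = new Text();  // reused across rows

  @Override
  public ObjectInspector initialize(ObjectInspector[] arguments)
      throws UDFArgumentException {
    // resolve the argument type once, up front, instead of per row
    if (arguments.length != 1 || !(arguments[0] instanceof StringObjectInspector)) {
      throw new UDFArgumentException("urldecode takes a single string argument");
    }
    inputOI = (StringObjectInspector) arguments[0];
    return PrimitiveObjectInspectorFactory.writableStringObjectInspector;
  }

  @Override
  public Object evaluate(DeferredObject[] arguments) throws HiveException {
    Object arg = arguments[0].get();
    if (arg == null) {
      return null;
    }
    try {
      result.set(URLDecoder.decode(inputOI.getPrimitiveJavaObject(arg), "UTF-8"));
      return result;
    } catch (Exception e) {
      return null;  // treat undecodable input as NULL
    }
  }

  @Override
  public String getDisplayString(String[] children) {
    return "urldecode(" + children[0] + ")";
  }
}

The main savings come from doing the type resolution once in initialize()
and reusing a single Text object instead of allocating per row.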
On Mon, Aug 15, 2011 at 1:49 AM, wd wrote:
> hi,
>
> I created a UDF to decode urlencoded strings, but found that the mapred
> time is now three times what it was before (73 sec -> 213 sec). How can I
> optimize it?
On Monday, August 15, 2011, Carl Steinbach wrote:
> Converting it to a GenericUDF (i.e. extending GenericUDF instead of UDF)
> should help some with performance.
>
> On Mon, Aug 15, 2011 at 1:49 AM, wd wrote:
>>
>> hi,
>>
>> I created a UDF to decode urlencoded strings, but found that the mapred
>> time is ...
Hi,
I'm trying to do what I think should be a simple task, but I'm running
into some issues with carrying through column names. All I want to do
is essentially copy an existing table but change the serialization
format (if you're curious, this is to help integrate with some existing
map reduce jobs).
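Something like a CTAS is what I have in mind (the table name and target
format here are just for illustration):

CREATE TABLE foo_copy
  STORED AS SEQUENCEFILE   -- the serialization format I want to switch to
AS SELECT * FROM foo;      -- CTAS should carry the column names through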
Hello,
I have external tables in Hive stored in a single flat text file. When I
execute queries against it, all of my jobs are run as a single map task,
even on very large tables.
What steps do I need to take to ensure that these queries are split up and
pushed out to multiple TTs? Do I need to ...
Is your external file compressed with GZip or BZip? Those file formats aren’t
splittable, so they get assigned to one mapper.
On Aug 15, 2011, at 10:23 AM, Jon Bender wrote:
> Hello,
>
> I have external tables in Hive stored in a single flat text file. When I
> execute queries against it, all of my jobs are run as a single map task, ...
It's actually just an uncompressed UTF-8 text file.
This was essentially the create table clause:
CREATE EXTERNAL TABLE foo
ROW FORMAT DELIMITED
STORED AS TEXTFILE
LOCATION '/data/foo'
Using Hive 0.7.
On Mon, Aug 15, 2011 at 10:37 AM, Loren Siebert wrote:
> Is your external file compressed with GZip or BZip? ...
Can you try recreating the external table with FIELDS TERMINATED BY and
LINES TERMINATED BY clauses?
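Something like this (the delimiters are just examples; use whatever
matches your data):

CREATE EXTERNAL TABLE foo (...)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/data/foo';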
-Ayon
From: Jon Bender
To: user@hive.apache.org
Sent: Monday, August 15, 2011
You should not have to do anything special to Hive to make it use all of your
TT’s. The actual MR job should be governed by your mapred-site.xml file.
When you run sample MR jobs (like the Pi example) and look at the job tracker,
are you seeing all your TT’s getting used?
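If the job really is producing only one split for that one big file, the
split-size knobs are the first thing I'd check (values below are only
illustrative, and which ones apply depends on your input format):

set mapred.map.tasks=20;             -- split-count hint used by the old-API FileInputFormat
set mapred.max.split.size=67108864;  -- ~64 MB cap per split when CombineHiveInputFormat is in use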
On Aug 15, 2011, Jon Bender wrote: ...
Yeah, MapReduce itself is set up to use all of my task trackers -- only one
map task gets created on the external table queries.

I tried querying another external table (composed of some 20 files) and it
created 20 map tasks in turn during the query. I will try the LINES
TERMINATED BY clause next time.
The current DDL page doesn't have documentation about the describe
database command. I'd like to add that. I'm listed under my apache
addr: jgho...@apache.org
Thanks,
Jakob
Granted!
JVS
On Aug 15, 2011, at 4:35 PM, Jakob Homan wrote:
> The current DDL page doesn't have documentation about the describe
> database command. I'd like to add that. I'm listed under my apache
> addr: jgho...@apache.org
>
> Thanks,
> Jakob
Thanks for all your advice, I'll try it out.
On Mon, Aug 15, 2011 at 9:02 PM, Edward Capriolo wrote:
>
>
> On Monday, August 15, 2011, Carl Steinbach wrote:
>> Converting it to a GenericUDF (i.e. extending GenericUDF instead of UDF)
>> should help some with performance.
>> On Mon, Aug 15, 2011 at 1:49 AM, wd wrote: ...
Finally, the following code has no performance loss. I think the point
is to avoid using the getString method. Thanks again, everyone.
//import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;
import java.net.URLDecoder;
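// (Reconstruction -- the archive truncates wd's message after the imports.
// A minimal sketch of the Text-based evaluate described above, assuming the
// gain comes from reusing one Text object instead of creating a String per
// row; this is not wd's exact code.)
public final class urldecode extends UDF {
  private final Text result = new Text();  // reused across rows
  public Text evaluate(Text s) {
    if (s == null) return null;
    try {
      result.set(URLDecoder.decode(s.toString(), "UTF-8"));
      return result;
    } catch (Exception e) {
      return null;  // undecodable input becomes NULL
    }
  }
}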