Hi,
I have data on HDFS that is already stored in directories by date.
For example: /abc/xyz/-mm-d1, /abc/xyz/-mm-d2. How do I create an
external table with a date partition key that points to the data in these
directories?
Please advise.
Thanks,
Aniket
7; or 'hdfs:///abc/xyz/'
>
> - Prashanth
>
>
>
>
> On Fri, Jul 1, 2011 at 3:57 PM, Aniket Mokashi wrote:
>
>> Hi,
>>
>> I have a data on HDFS that is already stored into directories as per date.
>> for example- /abc/xyz/-mm-d1, /abc/xyz
mmand every time I get some new
partition.
Can I have-
CREATE EXTERNAL TABLE IF NOT EXISTS tablename (...) partitioned by
(insertdate string) Location '/abc/xyz';
and would Hive then start scanning through all available partitions
(sub-directories inside /abc/xyz)?
Thanks,
Aniket
On
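For reference, a sketch of how date partitions are usually registered on an existing directory layout (table and path names are the ones from the question; Hive will not discover sub-directories by itself, so each one has to be added):

```sql
-- Mount the external table on the parent directory.
CREATE EXTERNAL TABLE IF NOT EXISTS tablename (...)
PARTITIONED BY (insertdate STRING)
LOCATION '/abc/xyz';

-- Register each date sub-directory explicitly (example date is hypothetical):
ALTER TABLE tablename ADD IF NOT EXISTS PARTITION (insertdate='2011-07-01')
LOCATION '/abc/xyz/2011-07-01';
```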
d. However you could have a cron that invokes a script
> that keeps changing the insertdate and you could point it to the directory
> where it has nothing but only the files (that has data) which will be loaded
> on to hive.
>
> Let me know.
>
> - Prashanth
>
>
> On T
Most likely it's: HIVE-3226 (HIVE-1901)
Workaround: set hive.optimize.cp=false
On Sat, Jan 26, 2013 at 10:40 AM, Mark Grover
wrote:
> Hi John,
> Thanks for reporting this.
>
> Can you please take a look at the Lateral View issues here:
>
> https://issues.apache.org/jira/issues/?jql=project%20%3D
Have you specified a map-join hint in your query?
On Thu, Feb 7, 2013 at 11:39 AM, Mayuresh Kunjir
wrote:
>
> Hello all,
>
>
> I am trying to join two tables, the smaller being of size 4GB. When I set
> hive.mapjoin.smalltable.filesize parameter above 500MB, Hive tries to
> perform a local task to
I think hive.mapjoin.smalltable.filesize parameter will be disregarded in
that case.
On Thu, Feb 14, 2013 at 7:25 AM, Mayuresh Kunjir
wrote:
> Yes, the hint was specified.
> On Feb 14, 2013 3:11 AM, "Aniket Mokashi" wrote:
>
>> have you specified map-join hint in yo
yTo: * user@hive.apache.org
> *Subject: *Re: Map join optimization issue
>
> Thanks Aniket. I actually had not specified the map-join hint though.
> Sorry for providing the wrong information earlier. I had only
> set hive.auto.convert.join=true before firing my join query.
>
> ~Mayur
AFAIK, 0.8 and 0.9 have the same schema. Did you upgrade to 0.10 accidentally?
On Sun, Feb 17, 2013 at 9:31 PM, FangKun Cao wrote:
> Hi Sam William:
>
> Check this issue: https://issues.apache.org/jira/browse/HIVE-3649
>
> and
>
> http://svn.apache.org/repos/asf/hive/tags/release-0.10.0/metastore/
You need
datanucleus.connectionPool.maxActive=1
in your hive-site.xml. Even with that, Hive will open two connections to the
metastore.
Thanks,
Aniket
On Mon, Mar 18, 2013 at 7:33 PM, Navis류승우 wrote:
> Hive uses DBCP as a connection pool.
>
> Ref http://www.datanucleus.org/products/accessplatform_2
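The property mentioned above, as it would appear in hive-site.xml (a sketch; the setting caps the DataNucleus/DBCP pool at a single connection):

```xml
<property>
  <name>datanucleus.connectionPool.maxActive</name>
  <value>1</value>
</property>
```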
't ideal long term but I am setting up the environment and
> would rather start with all tables being external then switch them to
> managed if necessary.
>
> Thanks,
> George
>
> -Original Message-
>
> From: Aniket Mokashi
> Sent: 21 Mar 2013 18:31:
In your hive-site.xml, change the value to "lib/hive-hwi-0.9.0.war" from
"/lib/hive-hwi-0.9.0.war". I guess it's a known issue with hwi.
~Aniket
On Thu, May 16, 2013 at 8:58 AM, Stephen Sprague wrote:
> ok. i'll bite. you've cut 'n pasted the stderr to us -- but have you any
> further comment on w
Again, you need to set value to "lib/hive-hwi-0.9.0.war". value =
'/path/to/lib/hive-hwi-0.9.0.war' will not work.
~Aniket
On Thu, May 16, 2013 at 9:57 AM, Sanjay Subramanian <
sanjay.subraman...@wizecommerce.com> wrote:
> 1. You will need to set this in the hive-site.xml
>
>
> hive.hwi.war.f
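A sketch of the hwi setting being discussed, in hive-site.xml; note the relative path with no leading slash:

```xml
<property>
  <name>hive.hwi.war.file</name>
  <value>lib/hive-hwi-0.9.0.war</value>
</property>
```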
Following should help:-
http://hive.apache.org/docs/r0.10.0/api/org/apache/hadoop/hive/metastore/api/Table.html
http://hive.apache.org/docs/r0.10.0/api/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.html
~Aniket
On Thu, Jun 27, 2013 at 7:21 AM, Gelesh G Omathil
wrote:
> Hi,
>
> I would
Hi,
I am trying to use uniquejoin to join multiple tables with same key in one
mapreduce job.
It works well if I stage individual partition data in temporary staging
tables.
But, if I make it work on top of partitioned tables, I do not get any
output.
Does anyone know how to fix this?
I am using
You can take a look at --
https://issues.apache.org/jira/browse/HCATALOG-3
and
https://issues.apache.org/jira/browse/HIVE-2038
Thanks,
Aniket
On Sun, Nov 27, 2011 at 11:41 PM, Ibrahim Acet wrote:
> Hi,
>
> I was wondering if there is an option to trigger queries. I use cloudera
> CDH3.
>
Pig has a Log loader in Piggybank. You can use that to generate the columns
of that table and make the table point to it.
Take a look--
https://github.com/apache/pig/tree/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/apachelog
Thanks,
Aniket
On Tue, Dec 6, 2011 at 1
You can also take a look at--
https://issues.apache.org/jira/browse/HIVE-74
On Wed, Dec 7, 2011 at 9:05 PM, Savant, Keshav <
keshav.c.sav...@fisglobal.com> wrote:
> You are right Wojciech Langiewicz, we did the same thing and posted my
> result yesterday. Now we are planning to do this using a sh
It is a Hadoop limitation. An HDFS move operation is inexpensive. I am
assuming that is not an option for you because you want to preserve the path
structure (for some backward-compatibility sake).
Something like symbolic links (I think they are not supported in 0.20, not
sure) or a path filter might help. But,
Hi,
You have a couple of options to save your intermediate state- 1. If your
metastore is HA, you can save your state in the metastore (eg- alter table
TBLPROPERTIES ("job.state", "DoneTill:121122")). 2. You can
periodically save your state in EMR-local drives and upload it to s3. You
can use any cust
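A sketch of option 1, using hypothetical table and property names:

```sql
-- Record progress in table metadata (survives as long as the metastore does).
ALTER TABLE job_state_tbl SET TBLPROPERTIES ('job.state' = 'DoneTill:121122');

-- Read it back later:
SHOW TBLPROPERTIES job_state_tbl;
```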
Hi,
I think you are missing the word "table" after "overwrite".
Syntax-
insert overwrite table month_tbl select
https://cwiki.apache.org/confluence/display/Hive/GettingStarted
Thanks,
Aniket
On Fri, Dec 23, 2011 at 9:10 AM, Periya.Data wrote:
> Hi,
> I am trying to run a simple quer
I think I saw this error a few days back. I think this is due to
HADOOP_CLASSPATH (have you placed any other jars in classpath? hive-0.8?).
Check and compare the classpath with the other installation that you have.
Thanks,
Aniket
On Fri, Dec 30, 2011 at 2:01 AM, alo alt wrote:
> Try to disable
>> On Fri, Dec 30, 2011 at 11:09 AM, praveenesh kumar
>> wrote:
>> > I recently added hbase 0.90.5 jars in HADOOP_CLASSPATH ?
>> > Previously I was running hbase 0.90.4, Is it the cause of trouble ?
>> >
>> > Thanks,
>> > Praveenesh
>>
Looks like asf jira is down. Is this a scheduled downtime? Where should I
subscribe to get updates about it?
https://issues.apache.org/jira
Thanks,
Aniket
Thanks Steven!
On Mon, Jan 2, 2012 at 9:50 PM, Steven Wong wrote:
> http://monitoring.apache.org/status/
>
> *From:* Aniket Mokashi [mailto:aniket...@gmail.com]
> *Sent:* Monday, January 02, 2012 4:04 PM
> *To:* user@hive.apache.or
Hi Bhavesh,
[moving discussion to hive user list]
I would suggest sending your question to the Hive user list in order to
reach a broader audience.
As per my understanding, in the query- map_script and reduce_script are
custom scripts that run as a streaming jobs. You are asking hive to run
ma
If you do not want compression-
set hive.exec.compress.output=false
If you want to compress-
set hive.exec.compress.output=true
and specify
mapred.map.output.compression.codec and mapred.output.compression.codec
depending on whether the query is map-only or map-reduce.
If your question is about changing
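Putting the settings above together (a sketch; the codec class names are the stock Hadoop ones and may differ in your install):

```sql
set hive.exec.compress.output=true;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
set mapred.map.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
```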
[moving to user@hive]
Can you send us the details from task logs?
Thanks,
Aniket
On Fri, Jan 6, 2012 at 2:53 AM, Bhavesh Shah wrote:
> Hello,
>
> hive> FROM (
>> FROM subset
>> MAP subset.patient_mrn, subset.encounter_date
>> USING 'q1.txt'
>> AS mp1, mp2
>> CLUSTER BY mp1)
Programmatically,
listPartitionsByFilter on
HiveMetaStoreClient returns the list of partitions for a given filter criteria
(which supports only string-based partition keys; we need to open a JIRA for
that).
For each of these partitions, you can call
ptn.getSd().getLocation() to get its location.
Or you can use
select quarter, COUNT(*) from table group by quarter?
On Mon, Jan 9, 2012 at 10:06 PM, Bhavesh Shah wrote:
> Hello,
> I want to calculate count of quarter wise record in Hive.
> (e.g.: In 1st Quarter - 72 (counts) likewise for other quarter)
>
> How can we calculate it through query or UDF in Hiv
A better way would be to mount a table on top of RCFiles and use
http://incubator.apache.org/hcatalog/docs/r0.2.0/inputoutput.html#HCatInputFormat
But, you will have to install and run hcatalog server for it.
(Note: By default, hcatalog assumes underlying storage is RCFile, so you do
not need to p
It means if value associated with key FIELD_DELIM is absent, then use value
associated with SERIALIZATION_FORMAT.
Thanks,
Aniket
On Mon, Jan 9, 2012 at 10:52 PM, Lu, Wei wrote:
> Hi there,
>
> Codes highlighted below may have some problem. SERIALIZATION_FORMAT should
> be *FIELD
Created https://issues.apache.org/jira/browse/HIVE-2702 for the same.
On Mon, Jan 9, 2012 at 10:33 PM, Aniket Mokashi wrote:
> Programmatically,
> listPartitionsByFilter on
> HiveMetaStoreClient returns list of partitions for a given filter criteria
> (which supports only
elect quarter, count(*) from subset group by quarter;
> FAILED: Error in semantic analysis: Line 1:46 Invalid table alias or
> column reference quarter
>
> Is there any mistake in query.
>
>
> On Tue, Jan 10, 2012 at 12:04 PM, Aniket Mokashi wrote:
>
>> select quarter,
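The semantic error above usually means `quarter` is not an actual column of the table; grouping by a derived expression is the workaround. A sketch, assuming a date column named `encounter_date` (Hive requires repeating the expression in GROUP BY rather than using the alias):

```sql
SELECT ceil(month(encounter_date) / 3.0) AS quarter, COUNT(*)
FROM subset
GROUP BY ceil(month(encounter_date) / 3.0);
```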
Hi,
Can this be because of the scheduler you are using? I see there are 3
queues, check the scheduler configuration, that might give you some hints.
Thanks,
Aniket
On Thu, Jan 12, 2012 at 12:24 AM, hadoop hive wrote:
> thanks for your reply bejoy
>
> Actually there is nothing like rack awarene
You need https://issues.apache.org/jira/browse/HIVE-2355
Thanks,
Aniket
On Thu, Jan 12, 2012 at 5:26 PM, Roberto Congiu wrote:
> Hey guys,
>
> I ran into a quite annoying issue. I have some jars in
> HIVE_AUX_JARS_PATH that include some serdes and UDF.
> They work fine from CLI, but when I run t
and I think https://issues.apache.org/jira/browse/HIVE-2139
to make 2355 work.
Thanks,
Aniket
On Thu, Jan 12, 2012 at 5:35 PM, Aniket Mokashi wrote:
> You need https://issues.apache.org/jira/browse/HIVE-2355
>
> Thanks,
> Aniket
>
>
> On Thu, Jan 12, 2012 at 5:26 PM, Ro
Everything in auxlib is added to HADOOP_CLASSPATH. But, the paths in
HADOOP_CLASSPATH are added to the class path of the Job Client, but they
are not added to the class path of the Task Trackers. Therefore if you put
a JAR called MyJar.jar on the HADOOP_CLASSPATH and don't do anything to
make it av
How about LOAD DATA INPATH?
On Tue, Jan 17, 2012 at 9:00 PM, Bhavesh Shah wrote:
> Hello,
> I am using Hive-0.7.1. I want to append the data in table.
> Is hive-0.7.1 support appending feature or just support OVERWRITE feature?
> When I tried for the appending, the query is not working.
>
> What
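A sketch of the append-versus-replace distinction (LOAD DATA without the OVERWRITE keyword adds files to the table's directory; paths and table name are hypothetical):

```sql
-- Append new files to the table:
LOAD DATA INPATH '/user/hive/staging/newdata' INTO TABLE mytable;

-- Replace the table's existing contents:
LOAD DATA INPATH '/user/hive/staging/newdata' OVERWRITE INTO TABLE mytable;
```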
Add the jar to HADOOP_CLASSPATH when you launch hive. That should help.
Thanks,
Aniket
On Sun, Jan 22, 2012 at 9:25 AM, Tim Havens wrote:
> I have a similar UDF to this one which creates just fine.
>
> I can't seem to resolve what 'return code -101' means, however, with this
> one.
>
> Can anyone
The simplest way would be to put the jar in the auxlib directory. That does
both for you, I guess. After that you can directly create a temporary
function in hive.
~Aniket
On Sun, Jan 22, 2012 at 1:24 PM, Aniket Mokashi wrote:
> Add the jar to HADOOP_CLASSPATH when you launch hive. That should h
ore
> FAILED: Execution Error, return code -101 from
> org.apache.hadoop.hive.ql.exec.FunctionTask
>
> On Sun, Jan 22, 2012 at 3:43 PM, Tim Havens wrote:
>
>> Unfortunately the issue appears to be something with the Jar, or my UDF.
>>
>> What I can't seem
Hi Hans,
Can you please elaborate on the use case more? Is your data already in a
binary format readable by LazyBinarySerDe (if you mount a table with that
SerDe in Hive)?
OR
are you trying to write data using mapreduce (java) into a location that
can be further read by a table that is declared to
th hive. Do I need to
> define it as a struct, just normal fields and row format is LazyBinarySerDe?
>
> On Sun, Jan 22, 2012 at 5:41 PM, Aniket Mokashi wrote:
>
>> Hi Hans,
>>
>> Can you please elaborate on the use case more? Is your data already in
>> Binary form
Hi Carl,
It would be helpful for me too.
My wiki username: aniket486.
Thanks,
Aniket
On Tue, Jan 24, 2012 at 11:57 AM, Carl Steinbach wrote:
> Hi Matt,
>
> Great!
>
> Please sign up for a wiki account here:
> https://cwiki.apache.org/confluence/signup.action
>
> Then email me your wiki usernam
You will have to write your own SerDe.
Hive can write it as a SequenceFile, but it will be Text with a NULL
(BytesWritable) key.
Thanks,
Aniket
On Tue, Jan 24, 2012 at 11:41 PM, jingjung Ng wrote:
> Hi,
>
> I have following hive query (pseudo hive query code)
>
> select name, address, phone from t1 join
If you are on hdfs
How about--
use db1;
create table table1 like db2.table1;
and move the data?
Thanks,
Aniket
On Mon, Jan 30, 2012 at 8:09 AM, Sriram Krishnan wrote:
> AFAIK there is no way in HiveQL to do this. We had a similar requirement
> in the past, and we wrote a shell script to upda
Can you cast it into string and compare?
On Thu, Feb 2, 2012 at 9:18 AM, Sunderlin, Mark
wrote:
> I am trying to see of two hive maps have the same data in them. I am
> not looking to see if any single key-value pair in the maps match, I am
> looking to see if
>
> **a)**There is a one
Add your udf jar to HADOOP_CLASSPATH
Thanks,
Aniket
On Mon, Feb 6, 2012 at 10:01 AM, Mark Grover wrote:
> Hi Jean-Charles,
> Please make sure that your jar is built properly and the class
> com.autoscout24.hive.udf.Md5 exists within the jar.
> Also, make sure the jar gets added properly to Hive
AFAIK, Hive uses default delimiters for nested data structures. There is no
workaround for this for now.
// from Hive's SerDe parameters: each deeper nesting level gets the next
// consecutive control character as its separator
for (int i = 3; i < serdeParams.separators.length; i++) {
  serdeParams.separators[i] = (byte) (i + 1);
}
Thanks,
Aniket
On Wed, Feb 8, 2012 at 10:15 PM, Hao Cheng wrote:
> Hi,
>
> My
This means the data key is null and data is (delimited) text. This would
not work for generic sequencefiles.
Thanks,
Aniket
On Thu, Mar 1, 2012 at 4:39 AM, Bejoy Ks wrote:
> Hi Madhu
> You can definitely do the same. Specify the SEQUENCE FILE in 'STORED
> AS' clause in your DDL.
>
> An ex
If you have hive-server running somewhere you can do following-
HiveConf hiveConf = new HiveConf(MyClass.class);
hiveConf.set("hive.metastore.local", "false");
hiveConf.set(ConfVars.METASTOREURIS.varname, url);
HiveMetaStoreClient client = new HiveMetaStoreClient(hiveConf);
and then do-
client
If you add a column at the end of the table, the new field will be NULL for
old files. Is that not what you observe?
Thanks,
Aniket
On Thu, Mar 1, 2012 at 12:06 PM, Anson Abraham wrote:
> If i have a hive table, which is an external table, and have my "log
> files" being read into it, if a new fi
field null. Sorry I should
> mention that I'm on hive .0.7.1.
> does 0.8.0 support this function? of if old files doesn't have column it
> will make it null? again, this is an external table.
>
>
> On Thu, Mar 1, 2012 at 5:02 PM, Aniket Mokashi wrote:
>
>> If
1. create external table B like A;
2. alter table B set location 's3n://'
Thanks,
Aniket
On Mon, Mar 5, 2012 at 4:59 PM, Igor Tatarinov wrote:
> Is there a way to create an external table LIKE another table?
>
> This doesn't work:
>
> CREATE TABLE B LIKE A
> ROW FORMAT DELIMITED FIELDS TERM
The Hive client also needs to be able to list paths on Hadoop (it checks
whether paths exist etc. while creating tables, external ones too).
I think we should fix this.
On Wed, Apr 25, 2012 at 6:07 AM, Ashish Thusoo wrote:
> Hive needs the hadoop jars to talk to hadoop. The machine that it is
> installed on has t
Put the libthrift and libfb303 jars on the classpath.
Thanks,
Aniket
On Wed, Apr 25, 2012 at 11:14 PM, Bhavesh Shah wrote:
> Hello all,
> I have written this small program But I am getting error.
>
> Program:
> -
> import java.io.FileWriter;
> import java.io.InputStream;
> import java.sql.Co
How about passing it in from outside of Hive? (calculating it in bash etc.)
Note, unix_timestamp is a non-deterministic UDF and hence would not help you
with partition pruning.
Alternatively, you can develop your own current_date (trivial) UDF
(deterministic).
Thanks,
Aniket
On Wed, Apr 25, 2012 at 5:54 PM
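Computing the date outside Hive, as suggested above, can be as simple as the following sketch (table and column names are hypothetical; the query is echoed rather than executed):

```shell
# Build the query with a constant date so Hive can prune partitions.
DT=$(date +%Y-%m-%d)
QUERY="SELECT * FROM logs WHERE insertdate = '${DT}';"
echo "$QUERY"
```

The constant date string reaches the planner as a literal, so partition pruning works, unlike a call to unix_timestamp() inside the query.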
I think the right URI scheme is s3n://abc/def. We use that with the EMR
version of Hive in production.
create table test (schema string) location 's3n://abc/def'; should work.
On Tue, May 29, 2012 at 2:35 PM, Balaji Rao wrote:
> To partition on s3, one would create folders like:
> s3://mybucket/path/dt
If this is a UDF, you will need hive-exec.jar to compile it. I am not sure
what the use of this UDF is.
Serde has following interface--
public interface SerDe extends Deserializer, Serializer
~Aniket
On Wed, May 30, 2012 at 9:51 PM, Russell Jurney wrote:
> I tried to make a simple Serde that con
You should look at the Hive log and find the exact exception. That will give
you a hint.
On Thu, May 31, 2012 at 12:33 AM, wd wrote:
> No problem, thanks for your reply.
> I'm very curious why this didn't work, this sql come from hive wiki.
> The metadata is store in postgres, does it matter?
>
> On Thu
;
> On Thu, May 31, 2012 at 3:34 PM, Aniket Mokashi
> wrote:
> > You should look at hive log and find exact exception. That will give you
> a
> > hint.
> >
> >
> > On Thu, May 31, 2012 at 12:33 AM, wd wrote:
> >>
> >> No problem, thanks fo
Put hive-exec*.jar in your eclipse classpath. (project properties-> java
build path -> libraries)
On Tue, Jun 5, 2012 at 8:52 AM, kulkarni.swar...@gmail.com <
kulkarni.swar...@gmail.com> wrote:
> Did you try this[1]? It had got me most of my way through the process.
>
> [1] https://cwiki.apache.o
Put the jar on Hive's classpath (the lib directory etc.) and do a create
temporary function every time you connect from the server.
What version of hive are you on?
~Aniket
On Mon, Jun 11, 2012 at 11:12 PM, Sreenath Menon
wrote:
> I have a jar file : 'twittergen.jar', now how can I add it to hive lib.
> Kind
I mean every time you connect to hive server-
execute-
create temporary function...;
your hive query...;
~Aniket
On Mon, Jun 11, 2012 at 11:27 PM, Aniket Mokashi wrote:
> put jar in hive-classpath (libs directory etc) and do a create temporary
> function every time you connect from
Hi Mark,
Collection items terminated by applies to both maps and arrays. In your
case, you can play with hive's nested complex data structures (so that you
can introduce another separator) to deserialize your data but that would
require some experimentation (digging into code). This would be non-t
This would need changes to hive.
On Wed, Jun 13, 2012 at 8:34 AM, wrote:
> Not sure… but may be through Jython…
>
>
>
> *From:* 王锋 [mailto:wfeng1...@163.com]
> *Sent:* miércoles, 06 de junio de 2012 7:36
> *To:* user@hive.apache.org
> *Subject:* Re:Custom UDF in Python?
>
>
>
> udfs need extend
Can you share the stack trace?
On Tue, Jun 12, 2012 at 2:18 AM, Marcin Cylke wrote:
> Hi,
>
> I'm having problems running current releases of Apache Hive, I get an
> error:
>
> java.lang.NoSuchMethodError:
>
> org.apache.thrift.server.TThreadPoolServer.(Lorg/apache/thrift/server/TThreadPoolServe
Hive also has something called uniquejoin. Maybe you are looking for
that. I cannot find documentation for your reference, but you can do a JIRA
search.
It allows you to join multiple sources with the same key, map-side.
(All sources should have the same key.)
~Aniket
On Wed, Jun 13, 2012 a
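A sketch of the uniquejoin syntax being described (from memory, so treat the exact form as an assumption; table names are hypothetical, and PRESERVE keeps rows whose key is missing from the other sources, similar to a full outer join):

```sql
FROM UNIQUEJOIN
  PRESERVE t1 a (a.key),
  PRESERVE t2 b (b.key),
  PRESERVE t3 c (c.key)
SELECT a.key, a.val, b.val, c.val;
```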
https://cwiki.apache.org/confluence/download/attachments/27362054/HiveServer2HadoopSummit2012BoF.pdf?version=1&modificationDate=1339790767000
On Wed, Jun 20, 2012 at 10:16 AM, Abhishek Pratap Singh wrote:
> Hi All,
>
> Any good pointers for How to make Hiver Server (Thrift Server) connection
> p
Can you do client.getAllTables()?
~Aniket
On Mon, Jun 25, 2012 at 12:43 PM, VanHuy Pham wrote:
> Hi,
>I am trying to use the hive thrift client to connect to hive. Even
> though I have started the hive thrift server (it's running by checking
> netstat -na | grep 1).
>However, the th
Can you share your query and use case?
~Aniket
On Tue, Jul 10, 2012 at 9:39 AM, Harsh J wrote:
> This appears to be a Hive issue (something probably called FS.close()
> too early?). Redirecting to the Hive user lists as they can help
> better with this.
>
> On Tue, Jul 10, 2012 at 9:59 PM, 안의건
Congrats Ashutosh!
~Aniket
On Wed, Jul 18, 2012 at 10:25 PM, Ashutosh Chauhan wrote:
> Thanks, Andes and Bejoy !
>
> Ashutosh
>
> On Tue, Jul 17, 2012 at 12:52 AM, Bejoy KS wrote:
>
>> Well deserved one. Congrats Ashutosh.
>> Regards
>> Bejoy KS
>>
>> Sent from handheld, please excuse typ
I haven't used insert into, but I observed a similar problem earlier-
https://issues.apache.org/jira/browse/HIVE-2617
You can check against the trunk or open a jira.
~Aniket
On Thu, Sep 13, 2012 at 1:36 PM, Kaufman Ng wrote:
> Does anyone know if insert into statement is supposed to work across
(Probably not what you are looking for) Check -
http://www.larsgeorge.com/2009/10/hive-vs-pig.html
~Aniket
On Fri, Sep 14, 2012 at 2:28 PM, Russell Jurney wrote:
> A detailed post comparing Pig/Hive performance from last week:
> http://hortonworks.com/blog/pig-performance-and-optimization-analys
Just a guess- Put your jar on hadoop classpath.
On Mon, Sep 24, 2012 at 5:45 PM, Abhishek Pratap Singh
wrote:
> i m using hive-0.7.1
>
>
> On Mon, Sep 24, 2012 at 5:10 PM, Edward Capriolo wrote:
>
>> I have noticed this as well with hive 0.7.0. Not sure what CDH is
>> based on but newer versions
+1. Great work guys. Congrats!
I just placed an order.
~Aniket
On Sun, Sep 30, 2012 at 11:37 AM, varun kumar wrote:
> Hi Edward,
>
> May i know the password to open the pdf file.
>
> Regards,
> Varun
>
> On Sun, Sep 30, 2012 at 5:21 AM, Edward Capriolo wrote:
>
>> Hello all,
>>
>> I wanted to l
mapreduce.fileoutputcommitter.marksuccessfuljobs=false;
MAPREDUCE-947, I guess..
~Aniket
On Thu, Oct 4, 2012 at 11:06 PM, Balaraman, Anand <
anand_balara...@syntelinc.com> wrote:
> Hi
>
> While using Map reduce programs, the output folder where reducer writes
> out the result cont