find_in_set is documented on this page:
http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF
But Hive 0.6 also supports the IN syntax (HIVE-801).
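A quick sketch of the two forms, using a hypothetical table t with a string column tag:

SELECT * FROM t WHERE find_in_set(tag, 'a,b,c') > 0;  -- position-based membership test (0 means absent)
SELECT * FROM t WHERE tag IN ('a', 'b', 'c');         -- IN syntax, available as of Hive 0.6 (HIVE-801)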
JVS
On Oct 12, 2010, at 10:23 AM, Karthik wrote:
Neil,
Thanks a lot, that works like a charm. Perhaps, that should go into the Hive
Wiki page to
If your query only accesses HBase tables, then yes, Hive does not access any
source data directly from HDFS (although of course it may put intermediate
results in HDFS, e.g. for the result of a join).
However, if your query does something like join a HBase table with a native
Hive table, then i
Up through release 0.5, Hive has been producing tarballs
hive-x.y.z-bin.tar.gz (binaries only)
hive-x.y.z-dev.tar.gz (binaries, source, and doc)
The top-level tar directory structures match the tarball name.
The "dev" is a little confusing since these are releases (not development
builds). I j
The HBase folks took a look at this one just now and updated JIRA with an
explanation of what's going wrong. Sounds like the workaround may be to set
the zookeeper port explicitly, e.g.
-hiveconf hbase.zookeeper.property.clientPort=2181
If that works, we can update JIRA with the workaround.
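A session-level equivalent would be the following (an untested sketch; it assumes a SET in the session reaches the HBase handler the same way -hiveconf does):

set hbase.zookeeper.property.clientPort=2181;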
J
Hive users etc are encouraged to vote too :)
JVS (gotta love cut-and-paste)
On Oct 22, 2010, at 2:51 PM, Ashish Thusoo wrote:
> Hi Folks,
>
> I propose that we adopt the following bylaws for the Apache Hive Project
>
> https://cwiki.apache.org/HIVE/bylaws.html
>
> These are basically a cut-an
I'm starting on the svn move in a little bit. Committers, please hold off on
further commits until you see an update on this.
JVS
On Oct 7, 2010, at 10:45 AM, Edward Capriolo wrote:
> All,
>
> Part of the move to TLP will require us moving our SVN.
> https://issues.apache.org/jira/browse/INFR
If you have outstanding checkouts (including ones with changes) you can update
them using svn switch:
svn switch https://svn.apache.org/repos/asf/hive/trunk
The above assumes you have trunk checked out (with https for committing). If
you instead have a branch checked out, or are using http, th
> I filed a request with ASF INFRA to update the Git mirror:
> https://issues.apache.org/jira/browse/INFRA-3107
>
> Carl
>
> On Tue, Oct 26, 2010 at 1:46 PM, John Sichi wrote:
>
>> If you have outstanding checkouts (including ones with changes) you can
>> up
http://wiki.apache.org/hadoop/Hive/Development/ContributorsMeetings/HiveContributorsMinutes101025
JVS
HIVE-1226 (support for pushing filters down into storage handlers) is only in
trunk (not 0.6).
Separately: as a followup to HIVE-1434, it would be good to plan a refactoring
across the handlers for HBase/Hypertable/Cassandra since there is a lot of
duplicated code.
JVS
On Nov 4, 2010, at 1:3
see the last commit for HIVE-1226 was almost a
> month before the 0.6.0 release and I'm wondering why it didn't get into the
> release.
>
> Also, I agree there's a lot of duplicated code and will be happy to help out
> in a refactoring effort.
>
> -Sanjit
>
>
They are only present in trunk, so they won't be in a release until 0.7.
Since they are UDAF's, you might be able to compile them by building trunk and
then plug them into 0.6 via CREATE TEMPORARY FUNCTION.
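A sketch of that plug-in step (the jar path is a placeholder, and the class name should be verified against your trunk build before relying on it):

add jar /path/to/trunk-built-hive-exec.jar;
CREATE TEMPORARY FUNCTION my_ngrams AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDAFnGrams';
-- hypothetical usage: top 20 bigrams over a text column named body in a table named docs
SELECT my_ngrams(sentences(lower(body)), 2, 20) FROM docs;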
JVS
On Nov 5, 2010, at 5:20 PM, Gokul Pillai wrote:
> According to the Hive wiki, ngram
+1 for shorter release cycles.
>
> Currently, is it time based or feature based? If time based, what's
> the timeline for 0.7 and if its feature based, which features are
> "must" for 0.7 release?
>
> Ashutosh
> On Thu, Nov 4, 2010 at 11:42, John Sichi wrote:
>> We
I didn't see any camera, so I'm guessing not.
JVS
On Nov 8, 2010, at 10:17 PM, איל (Eyal) wrote:
> Will there be a video of this talk ?
>
> On Tue, Nov 9, 2010 at 1:09 AM, John Sichi wrote:
>> http://www.slideshare.net/jsichi/hive-evolution-apachecon-2010
>>
>> JVS
>>
>>
This is unrelated to Hive/HBase integration; it looks like a Hadoop version
issue.
JVS
On Nov 17, 2010, at 9:56 PM, Vivek Mishra wrote:
> Hi,
> Currently I am facing an issue with Hive/HBase integration.
>
> Exception in thread "main" java.lang.NoSuchMethodError:
> org.apache.hadoop.util.She
As noted here, when writing to HBase, existing rows are overwritten, but old
rows are not deleted.
http://wiki.apache.org/hadoop/Hive/HBaseIntegration#Overwrite
There is not yet any deletion support.
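To illustrate the overwrite-without-delete behavior (table names are hypothetical):

-- keys present in staging overwrite the matching HBase rows;
-- keys absent from staging are left untouched (no deletion)
INSERT OVERWRITE TABLE hbase_backed_table
SELECT key, value FROM staging;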
JVS
On Nov 18, 2010, at 1:00 AM, afancy wrote:
> Hi,
>
> Does the INSERT clause have to in
It looks a bit like this one where ISCOMPRESSED was used instead of
IS_COMPRESSED:
https://issues.apache.org/jira/browse/HIVE-1435
Maybe your datanucleus.identifierFactory is somehow misconfigured?
JVS
On Nov 23, 2010, at 4:16 PM, Xavier Stevens wrote:
> I'm trying to create an external table
Try
set hbase.client.scanner.caching=5000;
Also, check to make sure that you are getting the expected locality so that
mappers are running on the same nodes as the region servers they are scanning
(assuming that you are running HBase and mapreduce on the same cluster). When
I was testing this
ask matches the input split
location.
JVS
On Dec 10, 2010, at 10:10 AM, vlisovsky wrote:
> Thanks for the info. Moreover, how can we make sure that our region servers are
> running on the same datanodes (locality)? Is there a way we can make sure?
>
> On Thu, Dec 9, 2010 at 11:09 P
Enclose them in backticks.
alter table fb_images1 change `_c5` ref_array array;
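For reference, a fully typed form of the same statement (the array element type here is only a guess; use whatever type describe reports for the column):

alter table fb_images1 change `_c5` ref_array array<string>;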
JVS
On Dec 20, 2010, at 3:23 PM, Leo Alekseyev wrote:
> Often I forget to name a column that results from running an
> aggregation. Then, I'm stuck: describe table lists those columns by
> their default names, i.e.
It runs the same as a nested select. Currently, since Hive doesn't do any
relational common subexpression elimination, it will be executed twice. In the
example below, this can be a good thing, since cond1 and cond2 can be pushed
down separately.
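A sketch of the kind of query in question (table and column names are hypothetical): the shared source is scanned once per branch, but each branch's filter is pushed into its own scan.

SELECT a.k, a.v, b.v
FROM (SELECT k, v FROM src WHERE cond1 = 1) a
JOIN (SELECT k, v FROM src WHERE cond2 = 1) b ON a.k = b.k;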
JVS
On Dec 28, 2010, at 12:18 AM, Neil Xu wro
Since the exception below is from JDO, it has to do with the configuration of
Hive's metastore (not HBase/Zookeeper).
JVS
On Jan 5, 2011, at 2:14 AM, Adarsh Sharma wrote:
>
>
>
>
> Dear all,
>
> I am trying Hive/Hbase Integration from the past 2 days. I am facing the
> below issue while c
Here is what you need to do:
1) Use svn to check out the source for Hive 0.6
2) In your checkout, replace the HBase 0.20.3 jars with the ones from 0.20.6
3) Build Hive 0.6 from source
4) Use your new Hive build
JVS
On Jan 6, 2011, at 2:34 AM, Adarsh Sharma wrote:
> Dear all,
>
> I am sorry
On Jan 6, 2011, at 9:53 PM, Adarsh Sharma wrote:
> I want to know why it occurs in hive.log
>
> 2011-01-05 15:19:36,783 ERROR DataNucleus.Plugin
> (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
> "org.eclipse.core.resources" but it cannot be resolved.
>
That is a bogus
SELECT bar.coord.latitude, bar.coord.longitude FROM
(SELECT parseCoordinates(latitude, longitude) as coord FROM foo) bar;
JVS
> SELECT parseCoordinates(latitude, longitude) AS lat, lng FROM foo;
On Jan 13, 2011, at 10:10 AM, Lars Francke wrote:
> Hello,
>
> I'm looking for help with UDFs.
>
> I need a UD
On Jan 26, 2011, at 10:52 AM, Jay Ramadorai wrote:
> - Create views on temporary tables named by day. Have jobs go against the
> views. When we are ready to rename, basically replace the view, pointing it
> now to the new table of today. The key question here is: is the View metadata
> consulted
Besides the fact that the refactoring required is significant, I don't think
this is possible to do quickly since:
1) Hive (unlike Pig) requires a metastore
2) Hive releases can't depend on an incubator project
It's worth pointing out that Howl is already using Hive's CLI+DDL (not just
metasto
But Howl does layer on some additional code, right?
https://github.com/yahoo/howl/tree/howl/howl
JVS
On Feb 3, 2011, at 1:49 PM, Ashutosh Chauhan wrote:
> There are none as of today. In the past, whenever we had to have
> changes, we do it in a separate branch in Howl and once those get
> commi
I was going off of what I read in HADOOP-3676 (which lacks a reference as
well). But I guess if a release can be made from the incubator, then it's not
a blocker.
JVS
On Feb 3, 2011, at 3:29 PM, Alex Boisvert wrote:
> On Thu, Feb 3, 2011 at 11:38 AM, John Sichi wrote:
> Besid
it we expect it will continue to add more
> additional layers.
>
> Alan.
>
> On Feb 3, 2011, at 2:49 PM, John Sichi wrote:
>
>> But Howl does layer on some additional code, right?
>>
>> https://github.com/yahoo/howl/tree/howl/howl
>>
>> JVS
>
On Feb 3, 2011, at 5:09 PM, Alan Gates wrote:
> Are you referring to the serde jar or any particular serde's we are making
> use of?
Both (see below).
JVS
[jsichi@dev1066 ~/open/howl/howl/howl/src/java/org/apache/hadoop/hive/howl] ls
cli/ common/ data/ mapreduce/ pig/ rcfile/
[jsic
doing that.
>
> Now, whether the project choses to use and release with an incubator
> dependency is a matter of judgment (and ultimately a vote by committers if
> there is no consensus). I just wanted to make sure there were no incorrect
> assumptions made.
>
> alex
>
I think I forgot to put
add jar /path/to/hive_contrib.jar;
in the instructions. Can you try that?
Also, some things may have changed since those instructions were written; I
recently had to update the way the corresponding unit test works.
Also, since then, HBase has added an API for bulk loa
As I mentioned in my previous message, some changes in Hive required an update
to the unit test, meaning the wiki is out of date as well.
This is the latest unit test script; the main change is that instead of using
an external table for generating the partition list, it now uses a managed
tabl
One of the impediments to uptake of the CREATE VIEW feature in Hive has been
the lack of partition awareness. This has made it impossible to transparently replace a
table with a view, e.g. for renaming purposes. To address this as well as some
other use cases, I'm proposing the first steps towards view pa
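A sketch of the table-renaming use case mentioned above (all names hypothetical, using plain non-partitioned views as they exist today):

CREATE VIEW events AS SELECT * FROM events_20110126;
-- when the next day's table is ready, repoint the view:
DROP VIEW events;
CREATE VIEW events AS SELECT * FROM events_20110127;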
There's no explicit way to enforce it, but in practice you can get it to work
by using the UDF invocation in an outer select, typically with an ORDER or SORT
BY on the inner select, as in this example:
http://wiki.apache.org/hadoop/Hive/HBaseBulkLoad#Prepare_Range_Partitioning
Note also this se
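A minimal sketch of the inner-sort pattern described above (the table, column, and UDF names here are hypothetical):

SELECT running_total(ordered.amount)
FROM (SELECT amount, event_time FROM payments SORT BY event_time) ordered;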
Please see the discussion in this JIRA issue:
https://issues.apache.org/jira/browse/HIVE-1994
JVS
On Feb 21, 2011, at 10:45 PM, Igor Tatarinov wrote:
> I would like to implement the moving average as a UDF (instead of a streaming
> reducer). Here is what I am thinking. Please let me know if I
Could you elaborate?
>
> Also, there is no mention of UDF object sharing (between mappers) in the
> current implementation. Is this a problem? Do I need to use ThreadLocal or
> something like that?
>
> On Tue, Feb 22, 2011 at 11:42 AM, John Sichi wrote:
> Please see th
You should try the latest Hive (either trunk or the 0.7 release branch) instead.
JVS
On Feb 28, 2011, at 5:41 AM, Vivek Krishna wrote:
> In short, I am trying to make the hbase_handler work with hive-0.6 and
> hbase-0.90.1.
>
> I am trying to integrate Hbase and Hive. There is a pretty good
>
Yes.
JVS
On Mar 7, 2011, at 9:59 PM, Biju Kaimal wrote:
> Hi,
>
> I loaded a data set which has 1 million rows into both Hive and HBase tables.
> For the HBase table, I created a corresponding Hive table so that the data in
> HBase can be queried from Hive QL. Both tables have a key column an
at query time.
On the other hand, with native Hive tables, there's latency in loading new
batches of data.
JVS
On Mar 7, 2011, at 10:13 PM, Biju Kaimal wrote:
> Hi,
>
> Could you please explain the reason for the behavior?
>
> Regards,
> Biju
>
> On Tue, Mar 8,
d watch?
>
> Thanks,
> Otis
>
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message
>> From: John Sichi
>> To: ""
>> Sent: Tue, March 8, 2011 1
actor varies? Is it often closer to 1 or is it
> more often close to 10?
> Just trying to get a better feel for this...
>
> Thanks,
> Otis
>
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
https://issues.apache.org/jira/browse/HIVE-1016
https://issues.apache.org/jira/browse/HIVE-1360
JVS
On Apr 5, 2011, at 11:20 AM, Larry Ogrodnek wrote:
> For some UDFs I'm working on now it feels like it would be handy to be
> able to pass in parameters during construction. It's an integration
>
Until HBase has a well-defined separation between client and server, including
protocol compatibility across versions, the situation is going to remain sticky.
I think I heard that 0.89 and 0.90 should be protocol compatible, but I haven't
confirmed that. If it's true, then you should be able t
Automatic usage of indexes is still under development (HIVE-1644).
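For reference, a sketch of building an index manually in 0.7 (column names taken from the schema quoted below; what is still under development is the automatic rewrite of queries to use the index):

CREATE INDEX idx_type ON TABLE testforindex (type)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD;
ALTER INDEX idx_type ON testforindex REBUILD;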
JVS
On Apr 15, 2011, at 1:31 AM, Erix Yao wrote:
> hi,
> I installed the hive-0.7 release for the index feature.
> Here's my test table schema:
>
> create table testforindex (id bigint, type int) row format delimited fields
We had a good meetup yesterday, with a lot of discussion topics; here are my
notes:
http://wiki.apache.org/hadoop/Hive/Development/ContributorsMeetings/HiveContributorsMinutes110425
JVS
Try one of these suggestions:
(1) run HBase and Hive in separate clusters (downside is that map/reduce tasks
will have to issue remote requests to region servers, whereas normally they could
run on the same nodes)
(2) debug the shim exception and see if you can contribute a patch that makes
Hive
Apparently our roadmap includes hard drive recovery, wedding reception flowers,
and developing muscles.
http://wiki.apache.org/hadoop/Hive/Roadmap
Anyone want to take a crack at migrating the Hive wiki content from Hadoop's
MoinMoin over to the Hive-specific Confluence space we have set up? In
The Apache Software Foundation (ASF)'s Travel Assistance Committee (TAC) is
now accepting applications for ApacheCon North America 2011, 7-11 November
in Vancouver BC, Canada.
The TAC is seeking individuals from the Apache community at-large --users,
developers, educators, students, Committers, an
Hey, signups are now open for this event:
http://hivecontribday2011.eventbrite.com/
It's free, but please don't sign up unless you're sure you can make it, because
seating is limited.
Contributions come in many shapes and forms (not just code), so anyone
interested in helping to advance the pr
Hmmm, I think this might be a bug which is only exposed when one of the mappers
gets zero rows of input.
If you have a Hive build, can you try adding this before line 238 of
GenericUDAFnGrams.java?
if (n == 0) {
  return;
}
Just before this line:
if(myagg.n > 0 && n > 0 && myagg.n != n)
Hey there,
With some wiki migration magic from Brock Noland (assisted by Gavin from
INFRA), we've moved all of the content from MoinMoin to Confluence.
The new location is here:
https://cwiki.apache.org/confluence/display/Hive
All of the MoinMoin pages have been deleted; this is to make sure p
The location details are on the event page:
http://hivecontribday2011.eventbrite.com
There are still plenty of seats left for anyone interested.
JVS
It's not empty, but the links on it were broken; I just fixed them.
JVS
On Jun 27, 2011, at 2:31 PM, Ayon Sinha wrote:
> https://cwiki.apache.org/confluence/display/Hive/AdminManual+SettingUpHiveServer
> is empty
> and the old link is gone.
>
> -Ayon
> See My Photos on Flickr
> Also check out
On Jun 27, 2011, at 4:37 PM, Time Less wrote:
> Might as well add me as editor. I've found tons of errors and problems. Not
> the least of which the regexserde is now completely borked and nonsensical.
> Compare "([^]*) ([^]*) ..." against "([^ ]*) ([^ ]*) ..." - I thought I was
> going insane.
On Jun 27, 2011, at 5:16 PM, wrote:
> I don't have control over the MoinMoin server; if someone has something
> specific they can create an INFRA request, but the page name translation is
> not 1-to-1, so it's probably not worth the effort; the old stuff should age
> out, and the new stuff will
As the comments in HIVE-1228 mention, we decided not to address the :timestamp
requirement. So if you need that, you can work on enhancing the HBase storage
handler by opening a JIRA issue, proposing an approach, and submitting a patch.
JVS
On Jul 24, 2011, at 12:42 PM, 张建轶 wrote:
> Hello!
>
https://cwiki.apache.org/confluence/display/Hive/ContributorsMinutes110726
If you were there and spot anything incorrect or left out, please let me know
or edit the wiki.
JVS
I often get asked questions about this topic, so I've put together a wiki page
which expresses some of my thoughts on it:
https://cwiki.apache.org/confluence/display/Hive/BecomingACommitter
Let me know if there are points you'd like to add, or where you see it
differently.
JVS
I've granted you write access...thanks for helping to fix the wiki!
JVS
On Aug 8, 2011, at 11:08 AM, Travis Powell wrote:
> Hello,
>
> The Wiki has been one of the most important resources to me for learning
> Hive. There are a lot of broken links that make it hard to flip between
> topics.
The wiki docs are incorrect here. CREATE INDEX does not yet support a
PARTITIONED BY clause; that was added to the spec to support HIVE-1499, which
hasn't been implemented yet.
For now, the index partitioning always follows the table partitioning exactly.
JVS
On Aug 14, 2011, at 3:22 AM, Da
Granted!
JVS
On Aug 15, 2011, at 4:35 PM, Jakob Homan wrote:
> The current DDL page doesn't have documentation about the describe
> database command. I'd like to add that. I'm listed under my apache
> addr: jgho...@apache.org
>
> Thanks,
> Jakob
Hey, the Apache Hive project is responsible for coming into compliance with
these:
http://www.apache.org/foundation/marks/pmcs.html
I've created a JIRA issue for tracking this, with sub-tasks for the various
work items:
https://issues.apache.org/jira/browse/HIVE-2432
Our quarterly reports fro
Thanks a lot Alan!
JVS
On Sep 9, 2011, at 12:53 PM, Alan Gates wrote:
> Posted at
> https://cwiki.apache.org/confluence/display/Hive/ContributorMinutes20110907
>
> Alan.
mirror.facebook.net is currently down and won't be back up for at least a few
days. There's a fallback at
http://archive.cloudera.com/hive-deps
If it's not kicking in for you automatically, you'll need to edit
ivy/ivysettings.xml.
JVS
On Sep 28, 2011, at 11:22 PM, Ramya Sunil wrote:
> Hi,
>
Hi Avrilia,
These are (some of) the patches you are looking for:
HIVE-1644
HIVE-2128
HIVE-2138
I'm not sure what went into 0.7.1 but they will all be in the upcoming 0.8
release.
JIRA is your friend:
https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+HIV
Hey all,
Earlier this year, Facebook released a bunch of its code browsing/review tools
as a new (and independent) open source project called Phabricator:
http://phabricator.org/
We're currently experimenting with using it for improving the developer
experience when contributing and reviewing
> Thanks,
> Ashutosh
>
> On Thu, Oct 20, 2011 at 14:00, John Sichi wrote:
> Hey all,
>
> Earlier this year, Facebook released a bunch of its code browsing/review
> tools as a new (and independent) open source project called Phabricator:
>
> http://phabricator.o
On Oct 21, 2011, at 10:07 AM, Mayuresh wrote:
> Hi,
>
> I am trying to understand the exact code flow of how the percentile_approx
> function works: what happens step by step. Is there some write-up which
> would help understanding the architecture? I am looking to understand how to
> add n
I've put up instructions for how anyone can start using Phabricator for code
review:
https://cwiki.apache.org/confluence/display/Hive/PhabricatorCodeReview
We've tested out the git workflows; still working on svn.
Feedback on how it works for you, anything you noticed missing, etc is
appreciat
Marek added support for svn, so that is working now too...give it a try!
Instructions updated at
https://cwiki.apache.org/confluence/display/Hive/PhabricatorCodeReview
JVS
On Oct 26, 2011, at 10:49 PM, wrote:
> I've put up instructions for how anyone can start using Phabricator for code
> r
It has been quite a while since those instructions were written, so maybe
something has broken. There is a unit test for it
(hbase-handler/src/test/queries/hbase_bulk.m) which is still passing.
If you're running via CLI, logs by default go in /tmp/
Long-term, energy best expended on this wo
On Nov 29, 2011, at 3:24 PM, Jakob Homan wrote:
> I'm trying to find documentation as to what changes in the metastore
> structure are necessary going from 0.7 to the 0.8RCs, and am failing.
> Does that mean there is none, or I'm just not very good at finding it?
README.txt, section "Upgrading f
As you can guess from the 0.89 dependency, there has been a lot of water under
the bridge since this integration was developed. If someone would like to take
on bringing it up to date, that would be great.
Note that auxpath is to make the jars available in map/reduce task VM's (we
don't put ev
The queries go through the region servers, not directly to HDFS.
JVS
On Dec 2, 2011, at 10:53 AM, Gabriel Eisbruch wrote:
> Hi everybody,
> I have a question about hbase hive integration that, I have not can
> found in any where: if I run a hive query, this query will read the HFiles
>
Yes, everything goes through the HBase API.
JVS
On Dec 2, 2011, at 2:09 PM, Gabriel Eisbruch wrote:
> OK, so the map/reduce jobs are connected to the region servers?
>
> Gabriel.
>
> El dic 2, 2011 6:52 p.m., "John Sichi" escribió:
> The queries go through the reg
On Dec 8, 2011, at 12:20 PM, Sam William wrote:
> I have a bunch of custom UDFs and I d like the others in the company to
> make use of them in an easy way. I'm not very happy with the 'CREATE
> TEMPORARY FUNCTION' arrangement for each session. It'd be great if our
> site-specific
https://cwiki.apache.org/confluence/display/Hive/ContributorMinutes20111205
I created an INFRA ticket to take Hive out of Review Board:
https://issues.apache.org/jira/browse/INFRA-4200
Please use Phabricator for all new review requests:
https://cwiki.apache.org/confluence/display/Hive/Phabricat
Yo if you have patches outstanding, sorry about the conflicts from the massive
one-time diff update from HIVE-1040 I just committed :)
JVS
Hey all,
Marek Sapota has put together a doc on the new scripts for spreading Hive unit
test execution across a cluster:
https://cwiki.apache.org/confluence/display/Hive/Unit+Test+Parallel+Execution
Whether you are a committer or someone contributing patches, if you are
currently frustrated by
Carl, big thanks to you as release manager for pushing this one through!
JVS
On Dec 19, 2011, at 10:29 AM, Carl Steinbach wrote:
> The Apache Hive team is proud to announce the release of Apache
> Hive version 0.8.0.
>
> The Apache Hive (TM) data warehouse software facilitates querying and
I am happy to announce that the Apache Hive PMC has voted to add existing
committer Ashutosh Chauhan as a new PMC member. Thanks Ashutosh for all of
your work on the project!
JVS
The Apache Hive PMC has passed a vote to make Navis Ryu a new
committer on the project.
JIRA is currently down, so I can't send out a link with his
contribution list at the moment, but if you have an account at
reviews.facebook.net, you can see his activity here:
https://reviews.facebook.net/p/na