AbstractSerDe, unless the API has changed such that
>> such a mapping cannot be done.
>>
>> Regards,
>> Matt
>>
>>
>>
>> On Oct 25, 2017, at 7:31 PM, Owen O'Malley
>> wrote:
>>
>>
>> On Wed, Oct 25, 2017 at 3:20 PM, Stephen S
should use AbstractSerDe instead.
>
> .. Owen
>
> On Oct 25, 2017, at 2:18 PM, Stephen Sprague wrote:
>
> hey guys,
>
> could be a dumb question but not being a java type of guy i'm not quite
> sure about it. I'm upgrading from 2.1.0 to 2.3.0 and encountering this
hey guys,
could be a dumb question but not being a java type of guy i'm not quite
sure about it. I'm upgrading from 2.1.0 to 2.3.0 and encountering this
error:
class not found: org/apache/hadoop/hive/serde2/SerDe
so in hive 2.1.0 i see it in this jar:
* hive-serde-2.1.0.jar
org/apache/hadoop/hi
Now its a matter of comparing the performance with Tez.
Cheers,
Stephen.
On Wed, Sep 27, 2017 at 9:37 PM, Stephen Sprague wrote:
> ok.. getting further. seems now i have to deploy hive to all nodes in the
> cluster - don't think i had to do that before but not a big deal to do it
> now.
probably a compatibility issue.
i know. i know. no surprise here.
so i guess i just got to the point where everybody else is... build spark
w/o hive.
lemme see what happens next.
On Wed, Sep 27, 2017 at 7:41 PM, Stephen Sprague wrote:
> thanks. I haven't had a chance to dig into this agai
look at the HoS Remote Driver logs. The driver
> gets launched in a YARN container (assuming you are running Spark in
> yarn-client mode), so you just have to find the logs for that container.
>
> --Sahil
>
> On Tue, Sep 26, 2017 at 9:17 PM, Stephen Sprague
> wrote:
>
>>
lelism.getSparkMemoryAndCores(SetSparkReducerParallelism.java:236)
[hive-exec-2.3.0.jar:2.3.0]
i'll dig some more tomorrow.
On Tue, Sep 26, 2017 at 8:23 PM, Stephen Sprague wrote:
> oh. i missed Gopal's reply. oy... that sounds forebodin
oh. i missed Gopal's reply. oy... that sounds foreboding. I'll keep you
posted on my progress.
On Tue, Sep 26, 2017 at 4:40 PM, Gopal Vijayaraghavan
wrote:
> Hi,
>
> > org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get a
> spark session: org.apache.hadoop.hive.ql.metadata.HiveExc
u
> have older versions of Spark installed locally?
>
> --Sahil
>
> On Tue, Sep 26, 2017 at 3:33 PM, Stephen Sprague
> wrote:
>
>> thanks Sahil. here it is.
>>
>> Exception in thread "main" java.lang.NoClassDefFoundError:
>> org/apache/spark/sc
h more recent versions
> of Spark, but we only test with Spark 2.0.0.
>
> --Sahil
>
> On Tue, Sep 26, 2017 at 2:35 PM, Stephen Sprague
> wrote:
>
>> * i've installed hive 2.3 and spark 2.2
>>
>> * i've read this doc plenty of times -> https://cwi
* i've installed hive 2.3 and spark 2.2
* i've read this doc plenty of times ->
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
* i run this query:
hive --hiveconf hive.root.logger=DEBUG,console -e 'set
hive.execution.engine=spark; select date_key, count(*) f
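(for anyone following along, the session-level settings that Getting Started page walks
you through look roughly like the below - the values are illustrative, not taken from my
setup:)
  set hive.execution.engine=spark;
  set spark.master=yarn;
  set spark.eventLog.enabled=true;
  set spark.executor.memory=2g;
  set spark.executor.cores=2;
  set spark.serializer=org.apache.spark.serializer.KryoSerializer;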
i'm running hive version 2.1.0 and found this interesting. i've broken it
down into a trivial test case below.
i run this:
select a.date_key,
a.property_id,
cast(NULL as bigint) as malone_id,
cast(NULL as bigint) as zpid,
su
e quickly followed up by Hive 2.3, which will
> be more aggressive with features, but less stable.
>
> .. Owen
>
> On Mon, Jun 19, 2017 at 7:53 PM, Stephen Sprague
> wrote:
>
>> Hey guys,
>> Is there any word out on the street about a timeframe for the next
Hey guys,
Is there any word out on the street about a timeframe for the next 2.x hive
release? Looks like Dec 2016 was the last one.
The natives are getting restless i think. :)
thanks,
Stephen.
have you researched the yarn schedulers? namely the capacity and fair
schedulers? those are the places where resource limits can be easily
defined.
On Mon, Jun 5, 2017 at 9:25 PM, Chang.Wu <583424...@qq.com> wrote:
> My Hive engine is MapReduce and Yarn. What my urgent need is to limit the
> mem
jgaonkar
> wrote:
>
>> This is interesting and possibly a bug. Did you try changing them to
>> managed tables and then dropping or truncating them? How do we reproduce
>> this on our setup?
>>
>> On Tue, May 16, 2017 at 6:38 PM, Stephen Sprague
>> wrote:
nd then dropping or truncating them? How do we reproduce
> this on our setup?
>
> On Tue, May 16, 2017 at 6:38 PM, Stephen Sprague
> wrote:
>
>> fwiw. i ended up re-creating the ec2 cluster with that same host name
>> just so i could drop those tables from the metastore
at 6:38 AM, Stephen Sprague wrote:
> hey guys,
> here's something bizarre. i created about 200 external tables with a
> location something like this 'hdfs:///path'. this was three
> months ago and now i'm revisiting and want to drop these tables.
>
> ha! no can
hey guys,
here's something bizarre. i created about 200 external tables with a
location something like this 'hdfs:///path'. this was three
months ago and now i'm revisiting and want to drop these tables.
ha! no can do!
that is long gone.
Upon issuing the drop table command i get this:
Error
ge developers. If you make more $$s it makes sense
> learning this stuff is supposed to be harder.
>
> Conclusion, don't try it. Or try using Tez/Hive instead of Spark/Hive if
> you are querying large files.
>
>
>
> On Friday, March 17, 2017 11:33 AM, Stephen Sprague
it based on previous
>> experiences)
>>
>> But in hindsight, people who work on this kinds of things typically make
>> more money that the average developers. If you make more $$s it makes sense
>> learning this stuff is supposed to be harder.
>>
>> Conclu
:( gettin' no love on this one. any SME's know if Spark 2.1.0 will work
with Hive 2.1.0 ? That JavaSparkListener class looks like a deal breaker
to me, alas.
thanks in advance.
Cheers,
Stephen.
On Mon, Mar 13, 2017 at 10:32 PM, Stephen Sprague
wrote:
> hi guys,
> wonderin
hi guys,
wondering where we stand with Hive On Spark these days?
i'm trying to run Spark 2.1.0 with Hive 2.1.0 (purely coincidental
versions) and running up against this class not found:
java.lang.NoClassDefFoundError: org/apache/spark/JavaSparkListener
searching the Cyber i find this:
1.
h
hey guys,
I have a question on why Hiveserver2 would issue a "killjob" signal.
We run Yarn on Hadoop 5.6 with the HiveServer2 process. It uses the
fair-scheduler. Pre-emption is turned off. At least twice a day we have
jobs that are randomly killed. they can be big jobs, they can be small
ones. t
s you have not specified the
>> location).
>>
>> Adding user@hive.apache.org as this is hive related.
>>
>>
>> ~Rajesh.B
>>
>> On Sun, Dec 25, 2016 at 12:08 AM, Stephen Sprague
>> wrote:
>>
>> all,
>>
>> i'm running tez
my 2 cents. :)
as soon as you say "complex query" i would submit you've lost the upper hand
and you're behind the eight-ball right off the bat. And you know this too
otherwise you wouldn't have posted here. ha!
i use cascading CTAS statements so that i can examine the intermediate
tables. Anothe
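by cascading CTAS i mean something along these lines - table and column names are made up:
  -- step 1: boil the raw data down and keep the intermediate table around
  create table tmp_step1 as
  select user_id, event_date, count(*) as events
  from   raw_events
  group by user_id, event_date;
  -- step 2: join the intermediate result to the next piece
  create table tmp_step2 as
  select a.user_id, a.events, b.segment
  from   tmp_step1 a
  join   dim_users b on a.user_id = b.user_id;
  -- eyeball tmp_step1 and tmp_step2 before trusting the final answer
  select segment, sum(events) from tmp_step2 group by segment;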
Ahh. thank you.
On Thu, Dec 8, 2016 at 3:19 PM, Alan Gates wrote:
> Apache keeps just the latest version of each release on the mirrors. You
> can find all Hive releases at https://archive.apache.org/dist/hive/ if
> you need 2.1.0.
>
> Alan.
>
> > On Dec 8, 2016, a
out of curiosity any reason why release 2.1.0 disappeared from
apache.claz.org/hive ? apologies if i missed the conversation about it.
thanks.
On Thu, Dec 8, 2016 at 9:58 AM, Jesus Camacho Rodriguez wrote:
> The Apache Hive team is proud to announce the release of Apa
Anyway, I reset that back to hdfs and was inserting into an external table
located in s3 and *still* got that error above much to my consternation.
however, by playing with "hive.exec.stagingdir" (and reading that
stackoverflow) i was able to overcome the error.
YMMV.
Cheers,
Stephen.
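a sketch of the kind of thing i mean - the exact value and the table names here are made
up, so treat it purely as an illustration:
  -- point the staging dir somewhere your user can write to
  set hive.exec.stagingdir=/tmp/hive-staging;
  insert overwrite table s3_ext_table
  select * from staging_table;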
a/browse/HADOOP-13345>
>- Use Hive on EMR with Amazon's S3 filesystem implementation and
>EMRFS. Note that this confusingly requires and overloads the 's3://'
> scheme.
>
> Hope this helps, and please report back with any findings as we are doing
> quite a bit
th success?
seems to me hive 2.2.0 and perhaps hadoop 2.7 or 2.8 are the only chances
of success but i'm happy to be told i'm wrong.
thanks,
Stephen.
On Mon, Nov 14, 2016 at 10:25 PM, Jörn Franke wrote:
> Is it a permission issue on the folder?
>
> On 15 Nov 2016, at 06:28,
so i figured i try and set hive.metastore.warehouse.dir=s3a://bucket/hive
and see what would happen.
running this query:
insert overwrite table omniture.hit_data_aws partition
(date_key=20161113) select * from staging.hit_data_aws_ext_20161113 limit 1;
yields this error:
Failed with exce
ha! kinda shows how the tech stack boundaries now are getting blurred,
eh? well at least for us amateurs! :o
On Thu, Nov 3, 2016 at 5:00 AM, Donald Matthews wrote:
> |Spark calls its SQL part HiveContext, but it is not related to this
> list
>
> Oof, I didn't realize that. Thanks for letti
ok. i'll bite.
lets see the output of this command where Hiveserver2 is running.
$ ps -ef | grep -i hiveserver2
this'll show us all the command line parameters HS2 was (ultimately)
invoked with.
Cheers,
Stephen
On Sun, Oct 23, 2016 at 6:46 AM, patcharee
wrote:
> Hi,
>
> I use beeline to conn
hey guys,
this is a long shot but i'll ask anyway. We're running YARN and
HiveServer2 (v2.1.0) and noticing "random" kills - what looks to me - being
issued by HiveServer2.
we've turned DEBUG log level on for the Application Master container and
see the following in the logs:
2016-10-05 02:06:1
you might just end up using your own heuristics. if the port is "alive"
(ie. you can list it via netstat or telnet to it) but you can't connect...
then you got yourself a problem.
kinda like a bootstrapping problem, eh? you need to connect to get the
version but you can't connect if you don't hav
gotta start by looking at the logs and run the local client to eliminate
HS2. perhaps running hive as such:
$ hive -hiveconf hive.root.logger=DEBUG,console
do you see any smoking gun?
On Wed, Sep 28, 2016 at 7:34 AM, Jose Rozanec wrote:
> Hi,
>
> We have a Hive cluster (Hive 2.1.0+Tez 0.8.4)
> * Are you using Hive-2.x at your org and at what scale?
yes. we're using 2.1.0. 1.5PB. 30 node cluster. ~1000 jobs a day. And
yeah hive 2.1.0 has some issues and can require some finesse wrt the
hive-site.xml settings.
> * Is the release stable enough? Did you notice any correctness issue
>at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.
validateInput(OrcInputFormat.java:508)
would it be safe to assume that you are trying to load a text file into a
table stored as ORC?
your create table doesn't specify that explicitly so that means you have a
setting in your configs that says
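if that is the case, the usual pattern is to land the text file in an explicit TEXTFILE
staging table and then convert - all the names here are made up:
  create table staging_events (id bigint, payload string)
  row format delimited fields terminated by '\t'
  stored as textfile;
  load data inpath '/landing/events.txt' into table staging_events;
  -- rewrite into the ORC table
  create table events stored as orc as
  select * from staging_events;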
for the query to hang.
* so empty result expected.
as Gopal mentioned previously this does indeed fix it:
* set hive.fetch.task.conversion=none;
but not sure it's the right thing to set globally just yet.
Anyhoo users beware.
Regards,
Stephen
On Wed, Aug 31, 2016 at 7:01 AM, Stephen Spragu
hmmm. so beeline blew up *before* the query was even submitted to the
execution engine? one would think 16G would be plenty for an 8M row sql
statement.
some suggestions if you feel like going further down the rabbit hole.
1. confirm your beeline java process is indeed running with expanded
memory (
lemme guess. your query contains an 'in' clause with 1 million static
values? :)
* brute force solution is to set:
HADOOP_CLIENT_OPTS=-Xmx8G (or whatever)
before you run beeline to force a larger memory size
(i'm pretty sure beeline uses that env var though i didn't actually check
the script)
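another way around a giant in-list (just an option; the names and file path are made up):
stage the values in a small table and semi-join against it instead of inlining them in
the sql:
  create table id_filter (id bigint);
  load data local inpath '/tmp/ids.txt' into table id_filter;
  select t.*
  from   big_table t
  left semi join id_filter f on (t.id = f.id);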
> rogue queries
so this really isn't limited to just hive is it? any dbms system perhaps
has to contend with this. even malicious rogue queries as a matter of fact.
timeouts are a cheap way systems handle this - assuming time is related to
resource. i'm sure beeline or whatever client you use has
ons.
>
> Cheers,
> Vlad
>
> ---
> From: Stephen Sprague
> To: "user@hive.apache.org"
> Cc:
> Date: Tue, 30 Aug 2016 20:28:50 -0700
> Subject: hive.root.logger influencing query plan?? so it's not so
> Hi guys,
> I've banged my head
Hi guys,
I've banged my head on this one all day and i need to surrender. I have a
query that hangs (never returns). However, when i turn on logging to DEBUG
level it works. I'm stumped. I include here the query, the different
query plans (with the only thing different being the log level) and
t;
>> http://talebzadehmich.wordpress.com
>>
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is expl
ich may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 26 August 2016 at 20:32, Stephen Sprague wrote
arying
VIEW_EXPANDED_TEXT | text |
VIEW_ORIGINAL_TEXT | text |
{quote}
wonder if i can perform some surgery here. :o do i feel lucky?
On Fri, Aug 26, 2016 at 12:28 PM, Stephen Sprague
wrote:
> well that doesn't bode well. :(
>
> we definitely
ion of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 26 August 2016 at 16:43, Stephe
thanks Gopal. you're right our metastore is using Postgres. very
interesting you were able to intuit that!
lemme give your suggestions a try and i'll post back.
thanks!
Stephen
On Fri, Aug 26, 2016 at 8:32 AM, Gopal Vijayaraghavan
wrote:
> > NULL::character%20varying)
> ...
> > i want to say
hey guys,
this ones a little more strange.
hive> create view foo_vw as select * from foo;
OK
Time taken: 0.376 seconds
hive> drop view foo_vw;
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:java.lang.IllegalArgumentException:
java.net.URI
s on master. I’ll file a bug...
>
> From: Stephen Sprague
> Reply-To: "user@hive.apache.org"
> Date: Thursday, August 25, 2016 at 13:34
> To: "user@hive.apache.org"
> Subject: Re: hive 2.1.0 and "NOT IN ( list )" and column is a
> partition_key
&
Hi Gopal,
Thank you for this insight. good stuff. The thing is there is no 'foo'
for etl_database_source so that filter if anything should be
short-circuited to 'true'. ie. double nots. 1. not in 2. and foo not
present.
it doesn't matter what i put in that "not in" clause the filter al
anybody run up against this one? hive 2.1.0 + using a "not in" on a list
+ the column is a partition key participant.
* using not
query:
explain
SELECT count(*)
FROM bi.fact_email_funnel
WHERE
event_date_key = 20160824
AND etl_source_database *not* in ('foo')
output frag:
Map Opera
indeed +1 to Gopal on that explanation! That was huge.
On Wed, Aug 17, 2016 at 12:58 AM, 明浩 冯 wrote:
> Hi Gopal,
>
>
> It works when I disabled the dfs.namenode.acls.
>
> For the data loss, it doesn't affect me too much currently. But I will
> track the issue in Kylin.
>
> Thank you very much fo
stackoverflow is your friend.
that said have a peek at the doc even :) cf.
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Keywords,Non-reservedKeywordsandReservedKeywords
paying close attention to this paragraph:
{quote}
Reserved keywords are permitted a
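in practice that boils down to one of these (column names made up):
  -- quote a reserved word used as an identifier with backticks
  select `date`, `user` from events;
  -- or, per that same wiki paragraph, fall back to the old behaviour
  set hive.support.sql11.reserved.keywords=false;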
this error messages says everything you need to know:
>Likely cause: new client talking to old server. Continuing without it.
when you upgrade hive you also need to upgrade the metastore schema.
failing to do that can trigger the message you're getting.
On Wed, Aug 10, 2016 at 6:41 AM, Mich Tale
Hi Gopal,
Aha! thank you for background behind this. that makes things much more
understandable.
and ~3000 queries across 10 HS2 servers. sweet. now that's what i call
pushing the edge. I like it!
Thanks again,
Stephen.
On Tue, Aug 9, 2016 at 10:29 PM, Gopal Vijayaraghavan
wrote:
> > not get
g seek out that operation_log dir & associated
file.
Thanks,
Stephen.
On Tue, Aug 9, 2016 at 6:44 PM, Stephen Sprague wrote:
> well, well. i just found this: https://issues.apache.org/
> jira/browse/HIVE-14183 seems something changed between 1.2.1 and
> 2.1.0.
>
> i'll see
well, well. i just found this:
https://issues.apache.org/jira/browse/HIVE-14183 seems something changed
between 1.2.1 and 2.1.0.
i'll see if the Rx as prescribed in that ticket does indeed work for me.
Thanks,
Stephen.
On Tue, Aug 9, 2016 at 5:12 PM, Stephen Sprague wrote:
> hey guy
hey guys,
try as i might i cannot seem to get beeline (via jdbc) to log information
back from hiveserver2 like job_id, progress and that kind of information
(similiar to what the local beeline or hive clients do.)
i see this ticket that is closed:
https://issues.apache.org/jira/browse/HIVE-7615 wh
ad=0"? You mentioned that your tables tables are in s3,
> but the external table created was pointing to HDFS. Was that intentional?
>
> ~Rajesh.B
>
> On Fri, Jul 15, 2016 at 6:58 AM, Stephen Sprague
> wrote:
>
>> in the meantime given my tables are in s3 i've
t msck repair tables does but in a non-portable
way. oh well. gotta do what ya gotta do.
On Wed, Jul 13, 2016 at 9:29 PM, Stephen Sprague wrote:
> hey guys,
> i'm using hive version 2.1.0 and i can't seem to get msck repair table to
> work. no matter what i try i get the '
hey guys,
i'm using hive version 2.1.0 and i can't seem to get msck repair table to
work. no matter what i try i get the ol' NPE. I've set the log level to
'DEBUG' but yet i still am not seeing any smoking gun.
would anyone here have any pointers or suggestions to figure out what's
going wrong?
Hi guys,
it was suggested i post to the user@hive group rather than the user@tez
group for this one. Here's my issue. My query hangs when using beeline via
HS2 (but works with the local beeline client). I'd like to overcome that.
This is my query:
beeline -u 'jdbc:hive2://
dwrdevnn1.sv2.trui
i refuse to take anybody seriously who has a sig file longer than one line
and that there is just plain repugnant.
On Wed, Feb 3, 2016 at 1:47 PM, Mich Talebzadeh wrote:
> I just did some further tests joining a 5 million rows FACT tables with 2
> DIMENSION tables.
>
>
>
> SELECT t.calendar_mon
Sure setting
mapreduce.job.name explicitly is a workaround but... that's a boat load of
code changes!
Would not there be a "fix" to roll this back to how it got the job.name
before?
Thanks,
Stephen Sprague
On Wed, Mar 11, 2015 at 1:38 PM, Viral Bajaria
wrote:
> I haven't used
great policy. install open source software that's not even version 1.0 into
production and then not allow the ability to improve it (but of course reap
all the rewards of its benefits.) so instead of actually fixing the
problem the right way introduce a super-hack work-around cuz, you know,
that's
og and save it under hive warehouse
> as table and query from there.
>
>
>
> Regards, Muthupandi.K
>
>
>
>
> On Sat, Sep 6, 2014 at 4:47 AM, Stephen Sprague
> wrote:
>
>> great find, Muthu. I would be interes
great find, Muthu. I would be interested in hearing about any successes
or failures using this adapter. almost sounds too good to be true.
After reading the blog (
http://innovating-technology.blogspot.com/2013/04/mysql-hadoop-applier-part-2.html)
about it i see it comes with caveats and it loo
what container are you using for your metastore? Derby, mysql or postgres?
for a large set of tables don't use Derby.
So you've confirmed its the ODBC driver and not the metastore itself?
On Fri, Aug 15, 2014 at 8:54 AM, Bradley Wright wrote:
> Try an eval of our commercial ODBC driver for Hiv
i'll take a stab at this.
- probably no reason.
- if you can. is there a derby client s/t you can issue the command: "alter
table COLUMNS_V2 modify TYPE_NAME varchar(32672)". otherwise maybe use the
mysql or postgres metastores (instead of derby) and run that alter command
after the install.
- t
searching this list will in fact show you're not alone. what is being
done about it is another matter.
On Wed, Jun 11, 2014 at 10:42 AM, Benjamin Bowman
wrote:
> All,
>
> I am running Hadoop 2.4 and Hive 0.13. I consistently run out of Hive
> heap space when running for a long period of time
wow. good find. i hope these config settings are well documented and that
you didn't have to spend a lot of time searching for that. Interesting that
the default isn't true for this one.
On Wed, Apr 2, 2014 at 11:00 PM, Abhay Bansal wrote:
> I was able to resolve the issue by setting "hive.optimize
fwiw. i would not have the repair table statement as part of a production
job stream. That's kinda a poor man's way to employ dynamic partitioning
off the back end.
Why not either use hive's dynamic partitioning features or pre-declare your
partitions? that way you are explicitly coding for your
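by dynamic partitioning i mean something along these lines (table and column names made
up):
  set hive.exec.dynamic.partition=true;
  set hive.exec.dynamic.partition.mode=nonstrict;
  -- the partition column goes last in the select list
  insert overwrite table fact_events partition (date_key)
  select col_a, col_b, date_key
  from   staging_events;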
the error message is correct. remember the partition columns are not
stored with the data and by doing a "select *" that's what you're doing. And this
has nothing to do with ORC either; it's a Hive thing. :)
so your second approach was close. just omit the partition columns yr, mo,
day.
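i.e. something like this, with the schema assumed:
  -- target is partitioned by (yr, mo, day); select only the data columns
  insert overwrite table orc_events partition (yr=2014, mo=3, day=26)
  select id, url, referrer
  from   text_events
  where  yr=2014 and mo=3 and day=26;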
On Wed, Mar 26
g
>
> select key from (query result that doesn't contain the key field) ...
>
>
> On Thu, Mar 20, 2014 at 1:28 PM, Stephen Sprague wrote:
>
>> I agree with your assessment of the inner query. why stop there though?
>> Doesn't the outer query fetch the ids of the tag
e a
> list of duplicate elements and their counts, but it loses the information
> as to what id had these elements.
>
> I'm trying to find which pairs of ids have any duplicate tags.
>
>
> On Thu, Mar 20, 2014 at 11:57 AM, Stephen Sprague wrote:
>
>> hmm.
hmm. would this not fall under the general problem of identifying
duplicates?
Would something like this meet your needs? (untested)
select -- outer query finds the ids for the duplicates
key
from ( -- inner query lists duplicate values
select
count(*) as cnt,
value
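fleshed out, the same idea looks roughly like this - table and column names assumed, and
equally untested:
  -- outer query: ids that carry a duplicated tag
  select t.id, t.tag
  from   tags t
  join ( -- inner query: tag values that appear more than once
         select tag
         from   tags
         group by tag
         having count(*) > 1 ) dup
    on   t.tag = dup.tag;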
t; maintains the count, how can Hive be used to derive the percentile?
>
> Value Count
> 100 2
> 200 4
> 300 1
>
> Thanks,
> Seema
>
> From: Stephen Sprague
> Reply-To: "user@hive.apache.org"
> Date: Thursday
not a hive question is it? it's more like a math question.
On Wed, Mar 19, 2014 at 1:30 PM, Seema Datar wrote:
>
>
> I understand the percentile function is supported in Hive in the latest
> versions. However, how does once calculate percentiles when the data is
> across two columns. So say
but why go through all this and make it so long-winded, verbose and
non-standard? That's a pain to maintain!
just use tabs as your transform in/out separator and go easy on the next
guy who has to maintain your code. :)
On Tue, Mar 18, 2014 at 4:59 PM, Nurdin Premji <
nurdin.pre...@casalemedia.
ver2 is running on. Basically, there is no way to reach those files
> from our boxes. That's why I was asking about writing it locally.
> I'll check this list for import/export like you mentioned.
>
> Thanks.
>
>
> On Friday, March 14, 2014 12:23 PM, Stephen Spra
re: HiveServer2
this is not natively possible (this falls under the export rubric.)
similarly, you can't load a file directly from your client using native
syntax (import.)
Believe me, you're not the only one who'd like both of these
functions. :)
I'd search this list for import or export ut
luck!
On Fri, Mar 14, 2014 at 4:21 AM, Nitin Pawar wrote:
> Can you first try updating hive to atleast 0.11 if you can not move to
> 0.12 ?
>
>
> On Fri, Mar 14, 2014 at 4:49 PM, Arafat, Moiz wrote:
>
>> My comments inline
>>
>>
>>
>> *From:* Ste
/partition_hr=1
>
> $ hadoop fs -copyFromLocal test.dat
> /user/moiztcs/moiz_partition_test/partition_hr=10
>
> $ hadoop fs -copyFromLocal test.dat
> /user/moiztcs/moiz_partition_test/partition_hr=2
>
>
>
> 5) hive> select distinct partition_hr from moiz_partition_test ord
just a public service announcement.
I had a case where i had a nested json array in a string and i needed that
to act like a first class array in hive. natively, you can pull it out but
it'll just be a string. woe is me.
I searched around the web and found this:
http://stackoverflow.com/questions/1
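for what it's worth, the common regexp_replace/split/explode approach - only good for a
flat array of scalars with no embedded commas, and the json path and names here are made
up - looks like this:
  select ex.item
  from   json_events j
  lateral view explode(
           split(regexp_replace(get_json_object(j.payload, '$.tags'),
                                '\\[|\\]|"', ''),
                 ',')
         ) ex as item;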
2, 2014 at 9:36 AM, Stephen Sprague wrote:
> interesting.don't know the answer but could you change the UNION in
> the Postgres to UNION ALL? I'd be curious if the default is UNION DISTINCT
> on that platform. That would at least partially explain postgres behaviour
> lea
interesting. don't know the answer but could you change the UNION in the
Postgres to UNION ALL? I'd be curious if the default is UNION DISTINCT on
that platform. That would at least partially explain postgres behaviour
leaving hive the odd man out.
On Wed, Mar 12, 2014 at 6:47 AM, Martin Kud
est.dat /user/moiztcs/moiz_partition_test/02
>
> hadoop fs -copyFromLocal test.dat /user/moiztcs/moiz_partition_test/10
>
>
>
> 4) Ran the sql
>
> hive> select distinct partition_hr from moiz_partition_test order by
> partition_hr;
>
> Ended Job
>
> OK
that makes no sense. if the column is an int it isn't going to sort like a
string. I smell a user error somewhere.
On Tue, Mar 11, 2014 at 6:21 AM, Arafat, Moiz wrote:
> Hi ,
>
> I have a table that has a partition column partition_hr . Data Type is int
> (partition_hrint) . When i run
short answer: it's by position.
yeah. that's not right.
1. lets see the output of "show create table foo"
2. what version of hive are you using.
On Fri, Mar 7, 2014 at 11:46 AM, Keith Wiley wrote:
> I want to convert a table to a bucketed table, so I made a new table with
> the same schema as the old table and specified a c
0') )
-- no where clause needed on 'AGE' since its part of the where clause in the
-- derived table.
{code}
i switched your ON clause and WHERE clause so be sure to take that under
consideration. And finally it's not tested.
Best of luck.
Cheers,
Stephen
On Tue, Mar 4, 2014 a
Let's just say this. Coercing hive into doing something it's not meant to
do is kinda a waste of time. Sure you can rewrite any update as a
delete/insert but that's not the point of Hive.
Seems like you're going down a path here that's not optimal for your
situation.
You know, I could buy a Tesla a
that advice is way over complicating something that is very easy. instead,
please take this approach.
1. run the ddl to create the table on the new cluster
2. distcp the hdfs data into the appropriate hdfs directory.
3. run "msck repair table " in hive to discover the partitions and
populate the metastore (a sketch of steps 1 and 3 follows below).
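steps 1 and 3 in hive (step 2, the distcp, is a shell command; the table name and schema
are made up):
  -- step 1: same ddl on the new cluster
  create table mydb.events (id bigint, payload string)
  partitioned by (date_key int)
  stored as orc;
  -- step 3: after the data has landed under the table's directory
  msck repair table mydb.events;
  show partitions mydb.events;   -- sanity check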
if you can configure flume to create temporary files that start with an
underscore (_) i believe hive will safely ignore them. otherwise you have to
write a script to move them out.
On Fri, Feb 28, 2014 at 11:09 AM, P lva wrote:
> Hi,
>
> I'm have a flume stream that stores data in a directory whi
this is a FAQ. see doc on: msck repair table
this will scan hdfs and create the corresponding partitions in the
metastore.
On Fri, Feb 28, 2014 at 12:59 AM, shashwat shriparv <
dwivedishash...@gmail.com> wrote:
> Where was your meta data in derby or MySql?
>
>
> Warm Regards,
> * Shashw
yeah. That traceback pretty much spells it out - it's metastore related and
that's where the partitions are stored.
I'm with the others on this. HiveServer2 is still a little jankey on memory
management. I bounce mine once a day at midnight just to play it safe (and
because i can.)
Again, for me,
Hi Jone,
um. i can say for sure something is wrong. :)
i would _start_ by going to the tasktracker. this is your friend. find
your job and look for failed reducers. That's the starting point anyway,
IMHO.
On Fri, Feb 21, 2014 at 11:35 AM, Jone Lura wrote:
> Hi,
>
> I have tried some variat