I think that's right. My testing (not very scientific) puts it on par with
Redshift for the datasets I use.
On Sunday, August 7, 2016, Edward Capriolo wrote:
> A few entities have set out to "kill/take out/do better than" Hive.
> I seem to remember HadoopDB, Impala, Redshift, VoltDB...
>
> But apparent hiv
> On 7 August 2016 at 13:17, Marcin
>
> Dudu
>
>
>
> *From:* Marcin Tustin [mailto:mtus...@handybook.com]
> *Sent:* Sunday, August 07, 2016 3:17 PM
> *To:* user@hive.apache.org
> *Subject:* Re: Create Non-partitioned table from partitioned table using
> CREATE TABLE .. LIKE
>
>
>
Will CREATE TABLE sales5 AS SELECT * FROM SALES; not work for you?
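If you don't need any partitioning on the target, a plain CTAS sketch like the
following should do it (sales5 and the storage format are just taken from your
example / picked arbitrarily):

-- CTAS: the new table takes its schema from the SELECT and is not partitioned
CREATE TABLE sales5
STORED AS ORC          -- optional; choose whatever format you need
AS
SELECT * FROM sales;   -- the old partition columns come back as ordinary columns

As far as I remember CTAS can't create a partitioned table anyway, so the result
is automatically unpartitioned.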
On Thu, Aug 4, 2016 at 5:05 PM, Nagabhushanam Bheemisetty <
nbheemise...@gmail.com> wrote:
> Hi I've a scenario where I need to create a table from partitioned table
> but my destination table should not be partitioned. I won't be
) to reader type
> struct (1) (state=,code=0)
>
>
>
> So what is wrong with the above?
>
>
> I should mention that I created the ORC files using the latest
> orc-core lib (1.1.2). That seems not to be the same implementation for ORC
> file access as being used
Yes. Create an external table whose location contains only the orc file(s)
you want to include in the table.
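Roughly like this (a sketch; the path and columns are invented, so adjust to your
schema):

-- the directory should hold only the ORC files you want the table to expose
CREATE EXTERNAL TABLE my_orc_data (
  id      BIGINT,
  payload STRING
)
STORED AS ORC
LOCATION 'hdfs:///data/orc/my_orc_data/';
-- being external, dropping the table later leaves the files where they are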
On Wed, Aug 3, 2016 at 7:53 AM, Johannes Stamminger <
johannes.stammin...@airbus.com> wrote:
> Hi,
>
>
> is it possible to write data to an orc file(s) using the hive-orc api and
> to
> us
> On 14 July 2016 at 23:29, Marcin Tustin
What do you want it to do? There are at least two web interfaces I can
think of.
On Thu, Jul 14, 2016 at 6:04 PM, Mich Talebzadeh
wrote:
> Hi Gopal,
>
> If I recall you were working on a UI support for Hive. Currently the one
> available is the standard Hadoop one on port 8088.
>
> Do you have a
Quick note - my experience (no benchmarks) is that Tez without LLAP (we're
still not on hive 2) is faster than MR by some way. I haven't dug into why
that might be.
On Tue, Jul 12, 2016 at 9:19 AM, Mich Talebzadeh
wrote:
> sorry, I completely missed your points
>
> I was NOT talking about Exadata.
This is because a GZ file is not splittable at all. Basically, try creating
this from an uncompressed file, or even better split up the file and put
the files in a directory in hdfs/s3/whatever.
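For example, something like this (a sketch; the names, delimiter and path are
invented): split the file, upload the uncompressed parts into one directory, and
point the table at the directory rather than at the single .gz file:

-- every file in the directory becomes input that can be split across mappers
CREATE EXTERNAL TABLE my_big_data (
  col1 STRING,
  col2 STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 'hdfs:///data/my_big_data/';   -- directory of split, uncompressed files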
On Tue, Jun 21, 2016 at 7:45 PM, @Sanjiv Singh
wrote:
> Hi ,
>
> I have big compressed data file *my_
Hi All,
I just added local jars to my hive session, created permanent functions,
and find that they are available across sessions and machines. This is of
course excellent, but I'm wondering where those jars are being stored. What
setting or default directory would I find them in?
My session
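For reference, the commands I mean are roughly these (the jar path and class name
below are placeholders, not my real ones):

ADD JAR /home/me/udfs/my-udfs.jar;                    -- local jar added to the session
CREATE FUNCTION my_udf AS 'com.example.hive.MyUDF';   -- permanent function, visible afterwards from other sessions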
Mich - it sounds like maybe you should try these benchmarks with alluxio
abstracting the storage layer, and see how much of a difference it makes.
Alluxio should (if I understand it right) provide a lot of the optimisation
you're looking for with in memory work.
I've never used it, but I would love t
Hi All,
I have a database backed by an s3 bucket. When I try to drop that database,
I get a NullPointerException:
hive> drop database services_csvs cascade;
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:java.lang.NullPointerException)
They're not simply interchangeable. sqoop is written to use mapreduce.
I actually implemented my own replacement for sqoop-export in spark, which
was extremely simple. It wasn't any faster, because the bottleneck was the
receiving database.
Is your motivation here speed? Or correctness?
On Sat,
HBase has a different use case - it's for low-latency querying of big
tables. If you combined it with Hive, you might have something nice for
certain queries, but I wouldn't think of them as direct competitors.
On Mon, Apr 18, 2016 at 6:34 PM, Mich Talebzadeh
wrote:
> Hi,
>
> I notice that Impal
This is a classic transform-load problem. You'll want to anonymise it once
before making it available for analysis.
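In other words: land the raw file in a staging table, write a masked copy once,
and only ever expose the masked table to analysts. A sketch (table and column
names are invented, and hashing is just one way to anonymise):

-- staging table sits over the raw csv
CREATE EXTERNAL TABLE staging_people (name STRING, ssn STRING, amount DOUBLE)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 'hdfs:///staging/people/';

-- anonymise once during the load; analysts only query people_clean
CREATE TABLE people_clean STORED AS ORC AS
SELECT name,
       hash(ssn) AS ssn_token,   -- or md5()/sha2() on newer Hive, or drop the column entirely
       amount
FROM   staging_people;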
On Thursday, March 17, 2016, Ajay Chander wrote:
> Hi Everyone,
>
> I have a csv.file which has some sensitive data in a particular column
> in it. Now I have to create a table in
issue. I
> can verify it and provide a fix in case of bug.
>
> Thanks
> Prasanth
>
> On Mar 8, 2016, at 5:52 AM, Marcin Tustin wrote:
>
> Hi Mich,
>
> ddl as below.
>
> Hi Prasanth,
>
> Hive version as reported by Hortonworks is 1.2.1.2.3.
>
>
If you wish to keep it in its current location, consider creating an external
table.
On Saturday, March 12, 2016, Rex X wrote:
> Hi Mich,
>
> I am doing this, because I need to update an existing big hive table,
> which can be stored in any arbitrary customized location on hdfs. But when
> we do A
> Hi
>
> can you please provide DDL for this table "show create table "
I believe updates and deletes have always had this constraint. It's at
least hinted at by:
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-ConfigurationValuestoSetforINSERT,UPDATE,DELETE
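For what it's worth, the settings that page lists look roughly like this (from
memory, so check the wiki for your version; the table itself also has to be
bucketed, stored as ORC and created with "transactional"="true"):

-- client side
SET hive.support.concurrency = true;
SET hive.enforce.bucketing = true;                 -- no longer needed on Hive 2.x
SET hive.exec.dynamic.partition.mode = nonstrict;
SET hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- metastore side
SET hive.compactor.initiator.on = true;
SET hive.compactor.worker.threads = 1;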
On Mon, Mar 7, 2016 at 7:46 PM, Mich Talebzadeh
wrote:
> Hi,
>
> I notice
Hi All,
Following on from our parquet vs orc discussion, today I observed
hive's alter table ... concatenate command remove rows from an ORC
formatted table.
1. Has anyone else observed this (fuller description below)? And
2. How do Parquet users handle the file fragmentation issue?
Desc
Don't bucket on columns you expect to update.
Potentially you could delete the whole row and reinsert it.
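Something along these lines, assuming the table is transactional (a sketch; the
table, columns and values are invented):

-- remove the old row, then put back a corrected copy with the new invoice number
DELETE FROM invoices WHERE invoicenumber = 123;
INSERT INTO invoices VALUES (456, 'rest of the corrected row goes here');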
On Sunday, March 6, 2016, Ashok Kumar wrote:
> Hi gurus,
>
> I have an ORC table bucketed on invoicenumber with "transactional"="true"
>
> I am trying to update invoicenumber column used fo
If you google, you'll find benchmarks showing each to be faster than the
other. In so far as there's any reality to which is faster in any given
comparison, it seems to be a result of each incorporating ideas from the
other, or at least going through development cycles to beat each other.
ORC is v
Hi All,
I'm seeing some data loss/corruption in hive. This isn't HDFS-level
corruption - hdfs reports that the files and blocks are healthy.
I'm using managed ORC tables. Normally we write once an hour to each table,
with occasional concatenations through hive. We perform the writing using
spark
That is the expected behaviour. Managed tables are created within the
directory of their host database.
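That is, the table's files end up under whatever location the database was created
with. A sketch (bucket and names are invented) of why a managed table can land in
S3 rather than on HDFS:

CREATE DATABASE s3_db LOCATION 's3a://my-bucket/warehouse/s3_db.db';
USE s3_db;
CREATE TABLE events (id BIGINT, payload STRING);
-- data for events is written under s3a://my-bucket/warehouse/s3_db.db/events/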
On Tuesday, 19 January 2016, 董亚军 wrote:
> hi list,
>
> we use the HDFS and S3 as the Hive Filesystem at the same time. here has
> an issue:
>
>
> *scenario* 1:
>
> hive command:
>
> use defa
See this:
http://stackoverflow.com/questions/23082763/need-to-add-auto-increment-column-in-a-table-using-hive
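The usual workaround discussed there is to generate the number at load time rather
than declaring it on the column, since Hive has no IDENTITY/auto-increment type.
A sketch (tables, columns and the ordering choice are invented):

INSERT INTO TABLE customers_with_id
SELECT ROW_NUMBER() OVER (ORDER BY name) AS id,   -- surrogate id assigned at load time
       name,
       city
FROM   customers_staging;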
On Sat, Jan 16, 2016 at 11:52 AM, Ashok Kumar wrote:
> Hi,
>
> Is there an equivalent to Microsoft IDENTITY column in Hive please.
>
> Thanks and regards
>
I second this. I've generally found anything else to be disappointing when
working with data which is at all funky.
On Wed, Jan 13, 2016 at 8:13 PM, Alexander Pivovarov
wrote:
> Time to use Spark and Spark-Sql in addition to Hive?
> It's probably going to happen sooner or later anyway.
>
> I sen
You can join on any equality criterion, just like in any other relational
database. Foreign keys in "standard" relational databases are primarily an
integrity constraint. Hive in general lacks integrity constraints.
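So the "foreign key" ends up being nothing more than a naming convention; the join
itself is the usual equality join. A sketch with invented tables and columns:

-- nothing enforces that orders.customer_id exists in customers; the join just matches values
SELECT o.order_id, c.name
FROM   orders o
JOIN   customers c
  ON   o.customer_id = c.customer_id;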
On Sun, Jan 10, 2016 at 9:45 AM, Ashok Kumar wrote:
> hi,
>
> what is the equiva
Yes, that's why I haven't had to compile anything.
On Wed, Dec 30, 2015 at 4:16 PM, Jörn Franke wrote:
> Hdp Should have TEZ already on-Board bye default.
>
> On 30 Dec 2015, at 21:42, Marcin Tustin wrote:
>
> I'm afraid I use the HDP distribution so I haven
I'm using TEZ 0.7.0.2.3 with hive 1.2.1.2.3. I can confirm that TEZ is much
faster than MR in pretty much all cases. Also, with Hive, you'll want to make sure
you've performed optimizations like aligning ORC stripe sizes with HDFS
block sizes and concatenating your tables (not so much an optimization as a
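To make that concrete, the kind of thing I mean is roughly this (a sketch; the
sizes are illustrative, so match them to your own HDFS block size):

-- choose the ORC stripe size at table creation, e.g. 256 MB stripes for 256 MB blocks
CREATE TABLE events_orc (id BIGINT, payload STRING)
STORED AS ORC
TBLPROPERTIES ('orc.stripe.size' = '268435456');

-- merge the many small files produced by frequent inserts
ALTER TABLE events_orc CONCATENATE;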
Hi All,
We import our production database into hive on a schedule using sqoop.
Unfortunately, sqoop won't update the table schema in hive when the table
schema has changed in the source database.
Accordingly, to get updates to the table schema we drop the hive table
first.
Unfortunately, this ca