> I want to override the partitionByHash function on Flink the same way
>as DBY works on Hive.
> I am working on implementing some benchmark system for these two systems,
>which could be a contribution to Hive as well.
I would be very disappointed if Flink fails to outperform Hive with a
Distribute BY,
Hello, the same question about DISTRIBUTE BY on Hive.
According to you, you do not use the hashCode of the Object class for DBY,
Distribute By.
I tried to understand how ObjectInspectorUtils works for distribution, but
it seems to involve a lot of Hive API, and I do not understand it well.
I want to override pa
> so do you think, if we want the same result from Hive and Spark or
>another framework, how could we try this?
There's a special backwards compat slow codepath that gets triggered if
you do
set mapred.reduce.tasks=199; (or any number)
This will produce the exact same hash-code as the jav
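(For illustration, a minimal Java sketch of the plain hashCode-based bucketing that an explicit reducer count falls back to, assuming the usual (hash & Integer.MAX_VALUE) % reducers convention; the class and sample values are made up, not taken from Hive's source.)

// Illustrative only: bucket selection driven by Object.hashCode(), masking
// the sign bit so negative hash codes still map to a valid reducer.
public class JavaStyleBucketing {

    static int bucketFor(Object key, int numReducers) {
        int hash = (key == null) ? 0 : key.hashCode();
        return (hash & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        int reducers = 199; // e.g. after: set mapred.reduce.tasks=199;
        System.out.println(bucketFor("philip", reducers));
        System.out.println(bucketFor(42, reducers));
    }
}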
Thanks for your help.
So do you think, if we want the same result from Hive and Spark or another
framework, how could we try this?
Could you tell me in detail?
Regards,
Philip
On Thu, Oct 22, 2015 at 6:25 PM, Gopal Vijayaraghavan wrote:
>
> When applying [Distribute By] on Hive to the framework, the function
>should be partitionByHash on Flink. This is to spread out all the rows,
>distributed by a hash key from the Object class in Java.
Hive does not use the Object hashCode - the identityHashCode is
inconsistent across JVMs, so Object.hashCode() is avoided.
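(If the goal is to make Flink reproduce a Hive-style DISTRIBUTE BY spread, one option is to bypass partitionByHash and plug your own partitioner in via partitionCustom on the DataSet API. A minimal sketch follows; hiveStyleHash is only a placeholder for whatever Hive-compatible hash you decide to implement, not Hive's actual function.)

import org.apache.flink.api.common.functions.Partitioner;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;

public class HiveLikeDistribute {

    // Placeholder hash: NOT Hive's ObjectInspectorUtils logic, just a stand-in
    // for whatever Hive-compatible hash function you implement.
    static int hiveStyleHash(String key) {
        int h = 0;
        for (int i = 0; i < key.length(); i++) {
            h = 31 * h + key.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<String, Integer>> rows = env.fromElements(
                Tuple2.of("a", 1), Tuple2.of("b", 2), Tuple2.of("c", 3));

        // partitionCustom gives explicit control over the bucket, unlike
        // partitionByHash, which relies on the key's own hashCode().
        DataSet<Tuple2<String, Integer>> distributed = rows.partitionCustom(
                new Partitioner<String>() {
                    @Override
                    public int partition(String key, int numPartitions) {
                        return (hiveStyleHash(key) & Integer.MAX_VALUE) % numPartitions;
                    }
                },
                new KeySelector<Tuple2<String, Integer>, String>() {
                    @Override
                    public String getKey(Tuple2<String, Integer> row) {
                        return row.f0;
                    }
                });

        distributed.print();
    }
}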
You can create an external table to make your data visible in Hive.
Sent from my iPhone
On Jul 11, 2012, at 7:39 AM, shaik ahamed wrote:
> Hi All,
>
> I have 100 GB of data in HDFS, and I want to move or copy this 100 GB
> file to the Hive directory or path. How can I achieve this?
Hi Shaik
If you already have the data in HDFS, then just create an external table with
that HDFS location. You'll have the data in your Hive table.
Or if you want a managed table, it is also good to use a LOAD DATA statement.
It'd be faster as well, since it is an HDFS move operation under the hood
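(A rough sketch of both options just described, issued through the Hive JDBC driver; the HiveServer2 URL, table names, columns, and HDFS path are assumptions to adjust for your own cluster.)

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class LoadIntoHive {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver"); // HiveServer2 JDBC driver

        // Hypothetical endpoint, path and table names -- adjust for your cluster.
        try (Connection con = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "hive", "");
             Statement stmt = con.createStatement()) {

            // Option 1: external table over the files already in HDFS
            // (no copy; Hive simply reads them where they are).
            stmt.execute("CREATE EXTERNAL TABLE IF NOT EXISTS sales_ext "
                    + "(id INT, amount DOUBLE) "
                    + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
                    + "LOCATION '/data/sales'");

            // Option 2: managed table plus LOAD DATA, which is an HDFS move
            // under the hood, so it is quick even for ~100 GB. Pick one of
            // the two options; LOAD DATA INPATH moves the source files away.
            stmt.execute("CREATE TABLE IF NOT EXISTS sales_managed "
                    + "(id INT, amount DOUBLE) "
                    + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','");
            stmt.execute("LOAD DATA INPATH '/data/sales' INTO TABLE sales_managed");
        }
    }
}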
Try it out using the "distcp" command.
Regards,
Mohammad Tariq
On Wed, Jul 11, 2012 at 8:09 PM, shaik ahamed wrote:
> Hi All,
>
> I have 100 GB of data in HDFS, and I want to move or copy this 100 GB
> file to the Hive directory or path. How can I achieve this.
>
> Is there any cm
"No space left on device" this mean your HDD is full.
On Fri, Jul 6, 2012 at 10:25 PM, shaik ahamed wrote:
> Hi all,
>
> While trying to insert the data into the Hive table, I'm
> getting the below error
>
> Total MapReduce jobs = 2
> Launching Job 1 out of 2
> Number of reduce tasks
Can you tell us:
1) how many nodes are there in the cluster?
2) are there any connectivity problems if the # nodes > 3?
3) if you have just one slave, do you have a higher replication factor?
4) what compression are you using for the tables?
5) if you have a DHCP-based network, did your slave ma
Hi,
Below is the error I found in the Job Tracker log file:
*Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out*
Please help me in this ...
*Thanks in Advance*
*Shaik.*
On Fri, Jul 6, 2012 at 5:22 PM, Bejoy KS wrote:
>
> Hi Shaik
>
> There is some error while MR jobs
Hi Shaik
There is some error while the MR jobs are running. To get the root cause, please
post the error log from the failed task.
You can browse the Job Tracker web UI and choose the right job Id and drill
down to the failed tasks to get the error logs.
Regards
Bejoy KS
Sent from handheld, pl
Increase your replication factor as suggested by Nitin. A replication
factor > 1 is a must to avoid data loss. And you can use the "fsck"
command to check HDFS. Also verify whether you are able to see all
the datanodes on the namenode web UI page.
Regards,
Mohammad Tariq
On Thu, Jul 5, 2012 at
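(The replication and datanode checks above can also be scripted against the HDFS Java API; a hedged sketch, with a hypothetical warehouse path. fsck itself stays a command-line tool, e.g. hadoop fsck /.)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class HdfsHealthCheck {
    public static void main(String[] args) throws Exception {
        // Reads core-site.xml / hdfs-site.xml from the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // List the datanodes the namenode currently sees
        // (the same information as the namenode web UI page).
        if (fs instanceof DistributedFileSystem) {
            for (DatanodeInfo dn : ((DistributedFileSystem) fs).getDataNodeStats()) {
                System.out.println("live datanode: " + dn.getHostName());
            }
        }

        // Replication is a per-file setting, so raise it on the existing files;
        // files written later follow dfs.replication from the config.
        Path tableDir = new Path("/user/hive/warehouse/vender"); // hypothetical path
        for (FileStatus f : fs.listStatus(tableDir)) {
            if (f.isFile()) {
                fs.setReplication(f.getPath(), (short) 2);
            }
        }
    }
}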
Thanks for your reply, Nitin.
On Thu, Jul 5, 2012 at 12:30 PM, Nitin Pawar wrote:
> Read about the hadoop fsck command
>
>
> On Thu, Jul 5, 2012 at 12:29 PM, shaik ahamed wrote:
>
>> Hi Nitin,
>>
>> How can I check the DFS health? Could you please guide me through the steps...
>>
>> On Thu, Jul 5, 2012 at
Read about the hadoop fsck command.
On Thu, Jul 5, 2012 at 12:29 PM, shaik ahamed wrote:
> Hi Nitin,
>
> How can I check the DFS health? Could you please guide me through the steps...
>
> On Thu, Jul 5, 2012 at 12:23 PM, Nitin Pawar wrote:
>
>> can you check dfs health?
>>
>> I think few of your nodes are
If you have 2 nodes and the replication factor is 1, then this is a problem.
I would suggest a minimum replication factor of 2.
This would make sure that even if 1 node is down, data is served from the
replicated blocks.
On Thu, Jul 5, 2012 at 12:27 PM, shaik ahamed wrote:
> Thanks for the reply guy
Hi Nitin,
How can I check the DFS health? Could you please guide me through the steps...
On Thu, Jul 5, 2012 at 12:23 PM, Nitin Pawar wrote:
> can you check dfs health?
>
> I think few of your nodes are down
>
>
> On Thu, Jul 5, 2012 at 12:17 PM, shaik ahamed wrote:
>
>> Hi All,
>>
>>
>>
Thanks for the reply, guys.
Yesterday night I was able to fetch the data. And my second node is down, in
the sense that I am not able to connect to the second machine; I have 3 machines,
1 master and 2 slaves. As for the second one, I am not able to connect. Is this
the problem for not retrieving the data, or other than t
Hello Shaik,
Were you able to fetch the data earlier? I mean, is it
happening for the first time, or were you not able to fetch the data
even once?
Regards,
Mohammad Tariq
On Thu, Jul 5, 2012 at 12:17 PM, shaik ahamed wrote:
> Hi All,
>
>
>I'm not able to fetch the
can you check dfs health?
I think few of your nodes are down
On Thu, Jul 5, 2012 at 12:17 PM, shaik ahamed wrote:
> Hi All,
>
>I'm not able to fetch the data from the Hive table; I'm getting
> the below error
> the below error
>
>FAILED: Error in semantic analysis:
>
> hive> select * from vender
Hi Shaik
Updates are not supported in Hive. Still, you can accomplish updates by
overwriting either a whole table or a partition.
In short, updates are not directly supported in Hive, and doing them
indirectly is really expensive as well.
Regards
Bejoy KS
Sent from handheld, please excuse typos.
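(A small sketch of that overwrite-as-update pattern, issued over JDBC: rewrite one partition from itself and change only the matching rows. The endpoint, partition value, and predicate are made up for illustration.)

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class OverwriteAsUpdate {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver"); // HiveServer2 JDBC driver

        try (Connection con = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "hive", "");
             Statement stmt = con.createStatement()) {
            // "Update" a single partition by rewriting it from itself,
            // changing only the rows that match the predicate.
            stmt.execute(
                "INSERT OVERWRITE TABLE vender_part PARTITION (order_date='2012-07-05') "
                + "SELECT vender, supplier, "
                + "CASE WHEN vender = 'acme' THEN 0 ELSE quantity END AS quantity "
                + "FROM vender_part WHERE order_date = '2012-07-05'");
        }
    }
}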
Hi Shaik
On a first look, since you are using a dynamic partition insert, the partition
column should be the last column of the SELECT query used in the INSERT OVERWRITE.
Modify your insert as
INSERT OVERWRITE TABLE vender_part PARTITION (order_date) SELECT
vender,supplier,quantity,order_date FROM vender
That configuration was removed from HBase 2 years ago, so no. You need
to point to the ZooKeeper ensemble. You can copy your hbase-site.xml file
into Hive's conf directory so that it picks up all those configurations you
have on your cluster.
J-D
On Wed, Oct 12, 2011 at 12:28 AM, liming liu wrot
Hi,
2011/9/7 Harold(陳春宏)
> Hello:
>
> I have been analyzing Apache logs with Hive, but there is a problem.
>
> When I put the Hive commands in a script file and schedule it with crontab,
> the result is different from running it in the Hive console.
>
> The attachment file shows the two ways of processing.
Wow, is there any other way to do that?
2011/3/21 Ted Yu
> I don't think so:
> Object deserialize(Writable blob) throws SerDeException;
>
>
>
> On Mon, Mar 21, 2011 at 4:55 AM, 幻 wrote:
>
>> Hi all, is it possible to generate multiple records in one SerDe? I mean, can
>> I return more than one row in deserialize?
I don't think so:
Object deserialize(Writable blob) throws SerDeException;
On Mon, Mar 21, 2011 at 4:55 AM, 幻 wrote:
> Hi all, is it possible to generate multiple records in one SerDe? I mean, can I
> return more than one row in deserialize?
>
> Thanks!
>
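(To make that concrete, a minimal sketch against the classic Deserializer interface: each deserialize() call turns one Writable into exactly one row object, so several rows cannot be emitted from it directly; a common workaround is to pack the values into an array column and explode them later in the query. Column names and the tab-delimited format below are illustrative.)

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.serde2.Deserializer;
import org.apache.hadoop.hive.serde2.SerDeException;
import org.apache.hadoop.hive.serde2.SerDeStats;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.Writable;

// Sketch of the Deserializer contract: one Writable in, exactly one row out.
public class OneRowPerBlobSerDe implements Deserializer {

    private final List<Object> row = new ArrayList<Object>(2);

    @Override
    public void initialize(Configuration conf, Properties tbl) throws SerDeException {
        // Column names/types would normally come from 'tbl'; fixed here for brevity.
    }

    @Override
    public Object deserialize(Writable blob) throws SerDeException {
        // Assumes a Text record (e.g. from TextInputFormat) with two tab-separated fields.
        String[] parts = blob.toString().split("\t", 2);
        row.clear();
        row.add(parts[0]);
        row.add(parts.length > 1 ? parts[1] : null);
        return row; // one row per call -- several rows cannot be returned here
    }

    @Override
    public ObjectInspector getObjectInspector() throws SerDeException {
        return ObjectInspectorFactory.getStandardStructObjectInspector(
                Arrays.asList("key", "value"),
                Arrays.<ObjectInspector>asList(
                        PrimitiveObjectInspectorFactory.javaStringObjectInspector,
                        PrimitiveObjectInspectorFactory.javaStringObjectInspector));
    }

    @Override
    public SerDeStats getSerDeStats() {
        return null; // stats not tracked in this sketch
    }
}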