Subject: RE: Resources/Distributed Cache on Spark
Without using add files, we’d have to make sure these resources exist on every
node, and would configure a hive session like this:
set myCustomProperty=/path/to/directory/someSubDir/;
select myCustomUDF('param1', 'param2');
With the shared resources
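For contrast, the resource-shipping alternative under discussion would look roughly like this (a sketch only; the file name is invented, and whether myCustomProperty can point at a task-local name is an assumption):

  -- ship the resource with the job instead of pre-installing it on every node
  ADD FILE /path/to/directory/someSubDir/lookup.dat;   -- lookup.dat is illustrative
  set myCustomProperty=lookup.dat;                      -- assumes the UDF accepts a task-local name
  select myCustomUDF('param1', 'param2');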
From: [mailto:...@gmail.com]
Sent: Thursday, February 8, 2018 12:45 PM
To: user@hive.apache.org
Subject: Re: Resources/Distributed Cache on Spark
It should work. We have tests such as groupby_bigdata.q that run on HoS and
work. They use the "add file" command. What are the exact commands you are
running? What error are you seeing?
On Thu, Feb 8, 2018 at 6:28 AM, Ray Navarette wrote:
Hello,
I'm hoping to find some information about using "ADD FILES" when using
the spark execution engine. I've seen some jira tickets reference this
functionality, but little else. We have written some custom UDFs which require
some external resources. When using the MR execution engine, we
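Presumably the MR-engine flow being described is the usual ADD JAR / ADD FILE pattern; a minimal sketch, with the jar name, class name, and resource path invented for illustration:

  -- ship the UDF jar and its external resource with the job
  ADD JAR /local/path/my-udfs.jar;
  ADD FILE /local/path/lookup.dat;
  CREATE TEMPORARY FUNCTION my_udf AS 'com.example.hive.MyUDF';
  SELECT my_udf(col1) FROM some_table;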
You might be using the wrong path to reference the distributed cache - I was
under the impression that the distributed cache files would be accessible using a
local path, not something starting with '/'.
I suspect query 1 is working because fetch task conversion is running the
select locally, where that absolute path happens to exist.
What if you extend GenericUDF
Thanks,
Dayong
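A quick way to test the fetch-task theory (a sketch; hive.fetch.task.conversion is a standard Hive setting, and the query is Abhishek's Query 1 from below):

  -- force the select to run as a real job instead of a client-side fetch,
  -- so the UDF executes in a task rather than in the Hive client
  set hive.fetch.task.conversion=none;   -- on older releases, "minimal" is the nearest value
  select country_key, MyFunction(country_key, "/data/MyData.txt") as capital from tablename;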
> On Apr 5, 2016, at 2:11 PM, Abhishek Dubey wrote:
Hi,
We have written a Hive UDF in Java to fetch a value from a file added to the
distributed cache, which works perfectly from a select query like:
Query 1.
select country_key, MyFunction(country_key,"/data/MyData.txt") as capital from
tablename;
But it is not working when trying to create
The query is probably being executed without the need for a map/reduce job, in
which case the working directory for the query is probably the local working
directory from when Hive was invoked. I don't think the Distributed Cache will
be working correctly in this case, because the UDF is not running in a
map/reduce task.
If a map-reduce job is kicked off for the query and the UDF is running in
this m/r task
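A sketch of the pattern this advice points toward (file name and UDF are from the example query; passing the bare name is the suggested change, not something confirmed in the thread):

  ADD FILE /data/MyData.txt;
  -- inside a map/reduce task the cached copy is localized into the task's
  -- working directory, so the UDF can be given the bare file name
  select country_key, MyFunction(country_key, "MyData.txt") as capital from tablename;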
Does this error occur for anyone else? It might be a serious issue.
2015-05-05 13:59 GMT+02:00 Zsolt Tóth :
Hi,
I've just upgraded to Hive 1.1.0 and it looks like there is a problem with
the distributed cache.
I use ADD FILE, then a UDF that wants to read the file. The following
syntax works in Hive 1.0.0 but Hive can't find the file in 1.1.0 (testfile
exists on hdfs, the built-in udf in_file
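The snippet is cut off, but the test presumably has this shape (paths are illustrative; in_file is Hive's built-in UDF that checks whether a string appears as a line in a file):

  ADD FILE hdfs:///tmp/testfile;
  -- reportedly finds the file on Hive 1.0.0 but not on 1.1.0
  select in_file(col1, './testfile') from some_table;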
astle)
into the distributed cache.
So I get an error like:
Caused by: java.lang.SecurityException: JCE cannot authenticate the provider BC
at javax.crypto.Cipher.getInstance(DashoA13*..)
at javax.crypto.Cipher.getInstance(DashoA13*..)
Hi,
All my hive jobs require the "mongodb-java-driver, mongo-hadoop-core,
mongo-hadoop-hive" jars to execute successfully. I don't have cluster access to
copy these jars, so I use the distributed cache (add jar) for every job
to make these jars available to the M/R tasks. Because of t
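One way to avoid retyping the jars for every job is to put the ADD JAR lines in a ~/.hiverc so the CLI runs them at the start of each session; the paths and versions below are invented for illustration:

  ADD JAR /home/me/jars/mongo-java-driver-3.2.2.jar;
  ADD JAR /home/me/jars/mongo-hadoop-core-1.5.2.jar;
  ADD JAR /home/me/jars/mongo-hadoop-hive-1.5.2.jar;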
The functionality involves reading 7 different text files and creating lookup
structures such as Map, Set, List, Map of String and List, etc., to be used in
the logic.
These files are small, around 15 MB on average.
I can add these files to the distributed cache and access them in the UDTF,
read the files, and create the necessary lookup data structures, but this would
mean that the files will be opened, read and closed every time the UDTF is
called.
Is it still reasonable to add these files to the distributed cache and access
them from the UDTF?
I don't think creating Hive tables from these files and doing a map side
join is possible, as the functionality that I want to implement is fairly
complex and I am not sure if it can be done just using a Hive query and join
without using a UDTF.
Thanks
This table needs a custom SerDe, which is added every time using add jar
(the jar is about 25KB) in Hive.
The problem is that when Hive performs a map side join, the classes in that
SerDe jar are not loaded and a class not found exception is thrown. But if I
disable map side join, it works perfectly fine, which shows that the
distributed cache is working and the SerDe jar is available.
Any idea what is happening in map side join? I am using CDH3.
Regards,
Abhishek
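For reference, the workaround described (disabling map side join) comes down to one setting; hive.auto.convert.join is a standard Hive property, though behaviour on the CDH3-era Hive version is an assumption:

  -- keep the join as a common join so the SerDe jar added via the
  -- distributed cache stays on the task classpath
  set hive.auto.convert.join=false;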
'd common-user@ and CC'd you.
On Mon, Jun 11, 2012 at 9:46 PM, abhishek dodda
wrote:
> hi all,
>
> Map side join with distributed cache how to do this? can any one help
> me on this.
>
> Regards
> Abhishek.
--
Harsh J
> -----Original Message-----
> From: abhishek dodda
> Date: Mon, 11 Jun 2012 08:59:49
> To:
> Reply-To: user@hive.apache.org
> Subject: Hive Mapside join with Distributed cache
>
> hi all,
>
> Map side join with distributed cache how to do this? can any one help
> me on this.
>
> Regards
> Abhishek
>
>
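As a starting point for the question, the classic pattern is a MAPJOIN hint (or hive.auto.convert.join=true); the small table is what gets shipped to the mappers via the distributed cache. Table and column names below are made up:

  -- replicate the small table to every mapper
  select /*+ MAPJOIN(s) */ b.id, s.val
  from big_table b
  join small_table s on (b.id = s.id);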
fopen("filename","r") works.
Thanks
Mohit
On Wed, Oct 19, 2011 at 7:16 PM, Mark Grover wrote:
Mohit,
I use Hive 0.7.1 and am able to access the file from distributed cache just by
filename. Did you try that?
Mark
- Original Message -
From: "Chinna Rao Lalam 72745"
To: user@hive.apache.org
Sent: Wednesday, October 19, 2011 6:56:38 AM
Subject: Re: Accessing distributed cache in transform scripts
Hi,
Can you post some more details, like which command you used for the
"list file" step?
- Original Message -
From: Mohit Gupta
Date: Wednesday, October 19, 2011 3:16 pm
Subject: Re: Accessing distributed cache in transform scripts
To: user@hive.apache.org
> Please help..
Please help... Any pointers!
On 10/19/11, Mohit Gupta wrote:
Hi All,
I want some read-only data to be available at the reducers / transform
scripts. I am trying to use distributed cache to achieve this using
the following steps:
1. add file s3://bucket_name/prefix/testfile
then
2. "list file" to find out the location of local copy of testfile