Thank you very much, but I would like to do the integration of these
components myself rather than using a packaged distribution. I think I have
come to the right place. Could you please tell me the configuration steps to
run Hive on Spark?

At the very least, could someone please elaborate on the steps in this guide:
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started

In the latter part of that guide, configurations are set in the Hive runtime
shell, which, to my knowledge, is not permanent.
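
For what it's worth, my current understanding is that those settings can be
made permanent by adding them to $HIVE_HOME/conf/hive-site.xml instead of
typing them in the Hive shell. A minimal sketch, using property names from
the Getting Started guide (the master URL and memory value are placeholders,
and please correct me if hive-site.xml is not the right place):

  <!-- hive-site.xml: persist what the guide sets via 'set ...' in the shell -->
  <property>
    <name>hive.execution.engine</name>
    <value>spark</value>
  </property>
  <property>
    <name>spark.master</name>
    <value>yarn-cluster</value> <!-- placeholder: your cluster's master URL -->
  </property>
  <property>
    <name>spark.executor.memory</name>
    <value>512m</value> <!-- placeholder -->
  </property>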

Please help me get this done. I'm also planning to write a detailed guide
with the configuration steps to run Hive on Spark, so that others can benefit
from it and not be troubled like me.

Again, can someone please tell me the configuration steps to run Hive on Spark?


On Sat, Nov 21, 2015 at 12:28 PM, Sai Gopalakrishnan <
sai.gopalakrish...@aspiresys.com> wrote:

> Hi everyone,
>
>
> Thank you for your responses. I think Mich's suggestion is a great one and
> I will go with it. As Alan suggested, using the compactor in Hive should
> help with managing the delta files.
>
>
> @Dasun, pardon me for deviating from the topic. Regarding configuration,
> you could try a packaged distribution (Hortonworks, Cloudera, or MapR) as
> Jörn Franke said. I use Hortonworks; it's open-source, compatible with
> Linux and Windows, provides detailed documentation for installation, and
> can be installed in less than a day provided you're all set with the
> hardware. http://hortonworks.com/hdp/downloads/
>
>
> Regards,
>
> Sai
>
> ------------------------------
> *From:* Dasun Hegoda <dasunheg...@gmail.com>
> *Sent:* Saturday, November 21, 2015 8:00 AM
> *To:* user@hive.apache.org
> *Subject:* Re: Hive on Spark - Hadoop 2 - Installation - Ubuntu
>
> Hi Mich, Sai, and Jörn,
>
> Thank you very much for the information. I think we are deviating from the
> original question: Hive on Spark on Ubuntu. Could you please tell me the
> configuration steps?
>
>
>
> On Fri, Nov 20, 2015 at 11:10 PM, Jörn Franke <jornfra...@gmail.com>
> wrote:
>
>> I think the most recent versions of Cloudera or Hortonworks should
>> include all these components; try their sandboxes.
>>
>> On 20 Nov 2015, at 12:54, Dasun Hegoda <dasunheg...@gmail.com> wrote:
>>
>> Where can I get a Hadoop distribution containing these technologies?
>> Link?
>>
>> On Fri, Nov 20, 2015 at 5:22 PM, Jörn Franke <jornfra...@gmail.com>
>> wrote:
>>
>>> I recommend using a Hadoop distribution containing these technologies.
>>> You also get other useful tools for your scenario, such as auditing
>>> using Sentry or Ranger.
>>>
>>> On 20 Nov 2015, at 10:48, Mich Talebzadeh <m...@peridale.co.uk> wrote:
>>>
>>> Well
>>>
>>>
>>>
>>> “I'm planning to deploy Hive on Spark but I can't find the installation
>>> steps. I tried to read the official '[Hive on Spark][1]' guide, but it has
>>> problems. For example, under 'Configuring Yarn' it says
>>> `yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`
>>> but does not say where this should be done. Also, as per the guide,
>>> configurations are set in the Hive runtime shell, which to my knowledge is
>>> not permanent.”
>>>
>>>
>>>
>>> You can set that in the yarn-site.xml file, which is normally under
>>> $HADOOP_HOME/etc/hadoop.
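>>>
>>> For example, a minimal sketch of the yarn-site.xml entry (the property
>>> name is the one from the Getting Started guide; the path assumes a
>>> standard Apache Hadoop layout):
>>>
>>>   <!-- $HADOOP_HOME/etc/hadoop/yarn-site.xml: use the fair scheduler -->
>>>   <property>
>>>     <name>yarn.resourcemanager.scheduler.class</name>
>>>     <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
>>>   </property>
>>>
>>> Restart the ResourceManager after editing the file so the change takes
>>> effect.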
>>>
>>>
>>>
>>>
>>>
>>> HTH
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Mich Talebzadeh
>>>
>>>
>>>
>>>
>>>
>>>
>>> http://talebzadehmich.wordpress.com
>>>
>>>
>>>
>>>
>>>
>>>
>>> *From:* Dasun Hegoda [mailto:dasunheg...@gmail.com]
>>> *Sent:* 20 November 2015 09:36
>>> *To:* user@hive.apache.org
>>> *Subject:* Hive on Spark - Hadoop 2 - Installation - Ubuntu
>>>
>>>
>>>
>>> Hi,
>>>
>>>
>>>
>>> What I'm planning to do is develop a reporting platform using existing
>>> data. I have an existing RDBMS with a large number of records, so I'm
>>> using the following stack (see
>>> http://stackoverflow.com/questions/33635234/hadoop-2-7-spark-hive-jasperreports-scoop-architecuture
>>> ):
>>>
>>>
>>>
>>>  - Sqoop - Extract data from the RDBMS into Hadoop (a sample import is
>>> sketched after this list)
>>>
>>>  - Hadoop - Storage platform -> *Deployment Completed*
>>>
>>>  - Hive - Data warehouse
>>>
>>>  - Spark - Real-time processing -> *Deployment Completed*
>>>
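>>> For illustration, a hypothetical Sqoop import for the first step,
>>> assuming a MySQL source; the host, database, table, and user names are
>>> placeholders:
>>>
>>>   # pull one table from the RDBMS directly into a Hive table
>>>   sqoop import \
>>>     --connect jdbc:mysql://rdbms-host:3306/reports_db \
>>>     --username etl_user -P \
>>>     --table transactions \
>>>     --hive-import --hive-table transactions
>>>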
>>>
>>>
>>> I'm planning to deploy Hive on Spark but I can't find the installation
>>> steps. I tried to read the official '[Hive on Spark][1]' guide, but it has
>>> problems. For example, under 'Configuring Yarn' it says
>>> `yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`
>>> but does not say where this should be done. Also, as per the guide,
>>> configurations are set in the Hive runtime shell, which to my knowledge is
>>> not permanent.
>>>
>>>
>>>
>>> Given that, I read [this][2] as well, but it does not have any steps.
>>>
>>>
>>>
>>> Could you please provide the steps to run Hive on Spark on Ubuntu as a
>>> production system?
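>>>
>>> For reference, the rough sequence I have pieced together so far from the
>>> guide (assuming a Spark 1.x build compiled without the Hive profile, as
>>> the guide requires; paths are placeholders and I may well be missing
>>> steps):
>>>
>>>   # 1. Install Spark and start the Spark cluster (or use YARN)
>>>   # 2. Make the Spark assembly jar visible to Hive
>>>   ln -s $SPARK_HOME/lib/spark-assembly-*.jar $HIVE_HOME/lib/
>>>   # 3. Switch the execution engine, per session in the Hive shell:
>>>   #      set hive.execution.engine=spark;
>>>   #    or permanently in $HIVE_HOME/conf/hive-site.xml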
>>>
>>>
>>>
>>>
>>>
>>>   [1]:
>>> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
>>>
>>>   [2]:
>>> http://stackoverflow.com/questions/26018306/how-to-configure-hive-to-use-spark
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>



-- 
Regards,
Dasun Hegoda, Software Engineer
www.dasunhegoda.com | dasunheg...@gmail.com
