Hi Valentin,

1) First of all, note that the author of the question is not using the latest doc page, namely http://apacheignite.gridgain.org/v1.0/docs/igfs-secondary-file-system . This is version 1.0, while the latest is 1.5: https://apacheignite.readme.io/docs/hadoop-accelerator. Besides, it turned out that some links in the latest doc version point to the 1.0 doc version; I fixed that in the several places where I found it. Do we really need the old doc versions (1.0 through 1.4)?
2) Our documentation ( http://apacheignite.gridgain.org/docs/secondary-file-system ) does not provide any special setup instructions for configuring HDFS as the secondary file system in Ignite. Our docs assume that a user who wants to integrate with Hadoop follows the generic Hadoop integration instructions (e.g. http://apacheignite.gridgain.org/docs/installing-on-apache-hadoop ). The page http://apacheignite.gridgain.org/docs/secondary-file-system should be clearer about the required configuration steps (in fact, setting the HADOOP_HOME variable for the Ignite node process).

3) Hadoop jars are correctly found by Ignite if the following conditions are met:
(a) The "Hadoop Edition" distribution is used (not the "Fabric" edition).
(b) Either the HADOOP_HOME environment variable is set (for the Apache Hadoop distribution), or the file "/etc/default/hadoop" exists and matches the Hadoop distribution used (BigTop, Cloudera, HDP, etc.).

The exact mechanism of Hadoop classpath composition can be found in the files
IGNITE_HOME/bin/include/hadoop-classpath.sh
IGNITE_HOME/bin/include/setenv.sh

The issue is discussed in https://issues.apache.org/jira/browse/IGNITE-372 and https://issues.apache.org/jira/browse/IGNITE-483 .

On Sat, Dec 12, 2015 at 3:45 AM, Valentin Kulichenko <
valentin.kuliche...@gmail.com> wrote:

> Igniters,
>
> I'm looking at the question on SO [1] and I'm a bit confused.
>
> We ship the ignite-hadoop module only in the Hadoop Accelerator and without
> Hadoop JARs, assuming that the user will include them from the Hadoop
> distribution he uses. That seems OK to me when the accelerator is plugged
> into Hadoop to run MapReduce jobs, but I can't figure out the steps required
> to configure HDFS as a secondary FS for IGFS. Which Hadoop JARs should be on
> the classpath? Is the user supposed to add them manually?
>
> Can someone with more expertise in our Hadoop integration clarify this? I
> believe there is not enough documentation on this topic.
>
> BTW, any ideas why the user gets an exception for the JobConf class, which
> is in the 'mapred' package? Why is a map-reduce class being used at all?
>
> [1]
> http://stackoverflow.com/questions/34221355/apache-ignite-what-are-the-dependencies-of-ignitehadoopigfssecondaryfilesystem
>
> -Val
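P.S. For context, once HADOOP_HOME (or /etc/default/hadoop) is in place and the Hadoop jars are on the node classpath, plugging HDFS in as the IGFS secondary file system is just a matter of Spring configuration. A minimal sketch (the IGFS name "igfs" and the NameNode URI hdfs://localhost:9000 are placeholder values, not taken from the thread):

```xml
<!-- Fragment of IgniteConfiguration: IGFS backed by HDFS as secondary FS. -->
<property name="fileSystemConfiguration">
    <list>
        <bean class="org.apache.ignite.configuration.FileSystemConfiguration">
            <!-- Name under which this IGFS instance is registered. -->
            <property name="name" value="igfs"/>

            <!-- Delegate misses and write-through to HDFS. -->
            <property name="secondaryFileSystem">
                <bean class="org.apache.ignite.hadoop.fs.IgniteHadoopIgfsSecondaryFileSystem">
                    <!-- URI of the HDFS NameNode (placeholder). -->
                    <constructor-arg value="hdfs://localhost:9000"/>
                </bean>
            </property>
        </bean>
    </list>
</property>
```

IgniteHadoopIgfsSecondaryFileSystem lives in the ignite-hadoop module, which is exactly why the Hadoop Edition / classpath conditions from point 3 matter: the bean cannot even be instantiated unless the Hadoop client jars are visible to the Ignite node process.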