On Mon, Dec 14, 2015 at 7:28 AM, Denis Magda <dma...@gridgain.com> wrote:
> Yes, this will be documented tomorrow. I want to go through all the
> steps myself, checking for any other obstacles the user may face.

Thanks, Denis!

> —
> Denis
>
> On 14 Dec 2015, at 18:11, Dmitriy Setrakyan <dsetrak...@apache.org>
> wrote:
>
> > Ivan, I think this should be documented, no?
> >
> > On Mon, Dec 14, 2015 at 2:25 AM, Ivan V. <iveselovs...@gridgain.com>
> > wrote:
> >
> >> To enable just IGFS persistence there is no need to use HDFS (this
> >> requires a Hadoop dependency, a configured HDFS cluster, etc.). We
> >> have requests https://issues.apache.org/jira/browse/IGNITE-1120 and
> >> https://issues.apache.org/jira/browse/IGNITE-1926 to implement
> >> persistence on top of the local file system, and we are already
> >> close to a solution.
> >>
> >> Regarding the secondary FS doc page
> >> (http://apacheignite.gridgain.org/docs/secondary-file-system), I
> >> would suggest adding the following text there:
> >> ------------------------
> >> If an Ignite node with a secondary file system is configured on a
> >> machine with a Hadoop distribution, make sure Ignite is able to find
> >> the appropriate Hadoop libraries: set the HADOOP_HOME environment
> >> variable for the Ignite process if you're using the Apache Hadoop
> >> distribution, or, if you use another distribution (HDP, Cloudera,
> >> BigTop, etc.), make sure the /etc/default/hadoop file exists and has
> >> appropriate contents.
> >>
> >> If an Ignite node with a secondary file system is configured on a
> >> machine without a Hadoop distribution, you can manually add the
> >> necessary Hadoop dependencies to the Ignite node classpath: these
> >> are the dependencies of groupId "org.apache.hadoop" listed in
> >> modules/hadoop/pom.xml. Currently they are:
> >>
> >> 1. hadoop-annotations
> >> 2. hadoop-auth
> >> 3. hadoop-common
> >> 4. hadoop-hdfs
> >> 5. hadoop-mapreduce-client-common
> >> 6. hadoop-mapreduce-client-core
> >> ------------------------
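For illustration, here is a sketch of how the six dependencies above
might be declared in Maven form. The version shown (2.4.1) is only a
placeholder assumption; modules/hadoop/pom.xml remains the authoritative
source for the exact artifacts and versions:

    <!-- Hypothetical pom.xml fragment; version 2.4.1 is a placeholder. -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-annotations</artifactId>
        <version>2.4.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-auth</artifactId>
        <version>2.4.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.4.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>2.4.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-common</artifactId>
        <version>2.4.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>2.4.1</version>
    </dependency>

Copying the corresponding jars into IGNITE_HOME/libs achieves the same
effect for a standalone node, since the startup scripts add everything
under libs/ to the node classpath.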
> >> On Mon, Dec 14, 2015 at 11:21 AM, Valentin Kulichenko
> >> <valentin.kuliche...@gmail.com> wrote:
> >>
> >>> Guys,
> >>>
> >>> Why don't we include the ignite-hadoop module in Fabric? This user
> >>> simply wants to configure HDFS as a secondary file system to ensure
> >>> persistence. Not having the opportunity to do this in Fabric looks
> >>> weird to me. And actually I don't think this is a use case for the
> >>> Hadoop Accelerator.
> >>>
> >>> -Val
> >>>
> >>> On Mon, Dec 14, 2015 at 12:11 AM, Denis Magda <dma...@gridgain.com>
> >>> wrote:
> >>>
> >>>> Hi Ivan,
> >>>>
> >>>> 1) Yes, I think it makes sense to keep the old versions of the
> >>>> docs while an old version is still considered to be in use by
> >>>> someone.
> >>>>
> >>>> 2) Absolutely, the time has come to add a corresponding article on
> >>>> readme.io. It's not the first time I've seen a question related to
> >>>> HDFS as a secondary FS. Neither before nor now has it been clear
> >>>> to me what exact steps I should follow to enable such a
> >>>> configuration. Our current suggestions look like a puzzle. I'll
> >>>> assemble the puzzle on my side and prepare the article. Ivan, if
> >>>> you don't mind, I'll reach out to you directly for any technical
> >>>> assistance if needed.
> >>>>
> >>>> Regards,
> >>>> Denis
> >>>>
> >>>> On 12/14/2015 10:25 AM, Ivan V. wrote:
> >>>>
> >>>>> Hi, Valentin,
> >>>>>
> >>>>> 1) First of all, note that the author of the question is not
> >>>>> using the latest doc page, namely
> >>>>> http://apacheignite.gridgain.org/v1.0/docs/igfs-secondary-file-system .
> >>>>> This is version 1.0, while the latest is 1.5:
> >>>>> https://apacheignite.readme.io/docs/hadoop-accelerator. Besides,
> >>>>> it turned out that some links in the latest doc version point to
> >>>>> the 1.0 docs; I fixed that in the several places where I found
> >>>>> it. Do we really need the old doc versions (1.0-1.4)?
> >>>>>
> >>>>> 2) Our documentation
> >>>>> (http://apacheignite.gridgain.org/docs/secondary-file-system)
> >>>>> does not provide any special setup instructions for configuring
> >>>>> HDFS as a secondary file system in Ignite. Our docs assume that
> >>>>> if a user wants to integrate with Hadoop, (s)he follows the
> >>>>> generic Hadoop integration instructions (e.g.
> >>>>> http://apacheignite.gridgain.org/docs/installing-on-apache-hadoop).
> >>>>> It looks like the page
> >>>>> http://apacheignite.gridgain.org/docs/secondary-file-system
> >>>>> should be clearer about the required configuration steps (in
> >>>>> essence, setting the HADOOP_HOME variable for the Ignite node
> >>>>> process).
> >>>>>
> >>>>> 3) Hadoop jars are correctly found by Ignite if the following
> >>>>> conditions are met:
> >>>>> (a) The "Hadoop Edition" distribution is used (not the "Fabric"
> >>>>> edition).
> >>>>> (b) Either the HADOOP_HOME environment variable is set (for the
> >>>>> Apache Hadoop distribution), or the file "/etc/default/hadoop"
> >>>>> exists and matches the Hadoop distribution used (BigTop,
> >>>>> Cloudera, HDP, etc.).
> >>>>>
> >>>>> The exact mechanism of Hadoop classpath composition can be found
> >>>>> in the files
> >>>>> IGNITE_HOME/bin/include/hadoop-classpath.sh
> >>>>> IGNITE_HOME/bin/include/setenv.sh .
> >>>>>
> >>>>> The issue is discussed in
> >>>>> https://issues.apache.org/jira/browse/IGNITE-372 and
> >>>>> https://issues.apache.org/jira/browse/IGNITE-483 .
> >>>>>
> >>>>> On Sat, Dec 12, 2015 at 3:45 AM, Valentin Kulichenko
> >>>>> <valentin.kuliche...@gmail.com> wrote:
> >>>>>
> >>>>>> Igniters,
> >>>>>>
> >>>>>> I'm looking at the question on SO [1] and I'm a bit confused.
> >>>>>>
> >>>>>> We ship the ignite-hadoop module only in the Hadoop Accelerator
> >>>>>> and without Hadoop JARs, assuming that the user will include
> >>>>>> them from the Hadoop distribution he uses. That seems OK to me
> >>>>>> when the accelerator is plugged into Hadoop to run MapReduce
> >>>>>> jobs, but I can't figure out the steps required to configure
> >>>>>> HDFS as a secondary FS for IGFS. Which Hadoop JARs should be on
> >>>>>> the classpath? Is the user supposed to add them manually?
> >>>>>>
> >>>>>> Can someone with more expertise in our Hadoop integration
> >>>>>> clarify this? I believe there is not enough documentation on
> >>>>>> this topic.
> >>>>>>
> >>>>>> BTW, any ideas why the user gets an exception for the JobConf
> >>>>>> class, which is in the 'mapred' package? Why is a MapReduce
> >>>>>> class being used?
> >>>>>>
> >>>>>> [1]
> >>>>>> http://stackoverflow.com/questions/34221355/apache-ignite-what-are-the-dependencies-of-ignitehadoopigfssecondaryfilesystem
> >>>>>>
> >>>>>> -Val
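To make the configuration under discussion concrete, here is a minimal
sketch of a Spring XML file that exposes IGFS with HDFS as the secondary
file system. It assumes the Ignite 1.5-era API; the HDFS URI, IGFS name,
and cache settings are placeholder assumptions, and the ignite-hadoop
module plus the Hadoop jars discussed above must be on the node
classpath:

    <!-- Minimal sketch, assuming the Ignite 1.5-era API. The HDFS URI
         and cache settings are placeholders, not prescribed values. -->
    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="fileSystemConfiguration">
            <list>
                <bean class="org.apache.ignite.configuration.FileSystemConfiguration">
                    <property name="name" value="igfs"/>
                    <property name="metaCacheName" value="igfs-meta"/>
                    <property name="dataCacheName" value="igfs-data"/>
                    <!-- DUAL_SYNC reads/writes through to the secondary
                         file system synchronously. -->
                    <property name="defaultMode" value="DUAL_SYNC"/>
                    <property name="secondaryFileSystem">
                        <bean class="org.apache.ignite.hadoop.fs.IgniteHadoopIgfsSecondaryFileSystem">
                            <!-- HDFS namenode URI; a placeholder. -->
                            <constructor-arg value="hdfs://localhost:9000/"/>
                        </bean>
                    </property>
                </bean>
            </list>
        </property>
        <!-- IGFS needs its meta and data caches configured on the node. -->
        <property name="cacheConfiguration">
            <list>
                <bean class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="igfs-meta"/>
                    <property name="cacheMode" value="REPLICATED"/>
                    <property name="atomicityMode" value="TRANSACTIONAL"/>
                </bean>
                <bean class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="igfs-data"/>
                    <property name="cacheMode" value="PARTITIONED"/>
                    <property name="atomicityMode" value="TRANSACTIONAL"/>
                </bean>
            </list>
        </property>
    </bean>

A node started with such a file (e.g. bin/ignite.sh path/to/config.xml)
then serves IGFS as a caching layer in front of the HDFS cluster, which
also gives the persistence the Stack Overflow question is after.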