Hi,

I'm trying to manage our Hadoop cluster with Puppet and have run into
a few challenges. The one I'm facing now is the per-disk directory
layout described below.

I've got an array variable whose contents depend on the type of server:
$hadoop_disks = ['/mnt/disk1', '/mnt/disk2', ...]

Depending on the classes I include for each role there needs to be a
different directory structure on all those disks.

Namenode + Datanode = /mnt/diskX/hadoop/dfs
Jobtracker + Tasktracker = /mnt/diskX/hadoop/mapred

Each directory (/hadoop, /hadoop/dfs, /hadoop/mapred) has different
permissions and both roles can be on the same server (Namenode +
Datanode).

I've tried multiple different things but I wasn't able to find a
solution that works. This is what I thought about doing:

base class:

define hadoop_main_directory() {
  file { "${name}/hadoop":
    ensure  => directory,
    owner   => "root",
    group   => "hadoop",
  }
}

define hadoop_sub_directory($path, $user) {
  file { "${name}/hadoop/${path}":
    ensure  => directory,
    owner   => $user,
    group   => "hadoop",
    require => Hadoop_main_directory[$name],
  }
}

And in each of the four classes a declaration like

hadoop_sub_directory { $hadoop_disks:
  path    => "dfs",
  user    => "hdfs",
}

But I guess that doesn't work, because the same resource would then be
declared more than once when two of those classes end up on the same server.
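
To make the conflict concrete, here is a minimal sketch of what I mean.
The class names hadoop::namenode and hadoop::datanode and the node name
are made up for this example (they are not necessarily what is in our
repository), and I'm assuming $hadoop_disks is visible in both classes.
As far as I understand it, the second declaration fails at compile time
with a duplicate declaration error, because both use the disk paths as
resource titles:

```puppet
# Hypothetical class names, for illustration only.
class hadoop::namenode {
  # Declares Hadoop_sub_directory['/mnt/disk1'], ['/mnt/disk2'], ...
  hadoop_sub_directory { $hadoop_disks:
    path => 'dfs',
    user => 'hdfs',
  }
}

class hadoop::datanode {
  # Same titles as above, so this clashes when both classes are included.
  hadoop_sub_directory { $hadoop_disks:
    path => 'dfs',
    user => 'hdfs',
  }
}

# Hypothetical node that carries both roles.
node 'example-host' {
  include hadoop::namenode
  include hadoop::datanode
}
```

I think the same clash would happen between the dfs and mapred
directories, since the title is only the disk path and the
sub-directory is just a parameter.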

Any ideas how to solve this? I can provide more details. Our
configuration is also on GitHub [1], but it's not working right now and
is probably not very pretty. This is the first time I've used Puppet
and I'm learning as I go.

Cheers,
Lars

[1] https://github.com/lfrancke/gbif-puppet
