On Monday, April 14, 2014 1:41:42 PM UTC-5, Jim Donnellan wrote:
>
> Puppeteers,
>
> I'm trying to get something done in puppet/hiera, and I'm curious if it's 
> possible. A bit of background:
>
> We're using puppet and hiera to build out and maintain Apache Solr, and 
> we're using Solr in a cloud structure. What this ends up meaning 
> configuration-wise is that we have our data broken off into shards, and 
> servers that are hosting different replicas of the shards for redundancy. 
> For a simplified sense, say it looks like this...
>
> Server1 hosts:
>   shard1, replica 1
>   shard2, replica 1
>
> Server2 hosts:
>   shard1, replica 2
>   shard2, replica 2
>
> Server3 hosts:
>   shard3, replica 1
>   shard4, replica 1
>
> ...and so on. So each shard exists on multiple hosts, each host has 
> multiple shards, but not every shard is on every host. 
>
> We were able to handle this just fine by using a hiera array to list the 
> shards at the _host_.yaml level. No big deal. Done and done, works great in 
> production.
>
>
> The issue that has come up is that some of the shards (which are JVMs) 
> have grown a bit and require a heap size greater than the default. 
> Obviously this would be something we'd want to wrangle with puppet and 
> hiera. We've come up with some initial attempts that seem to work, which 
> involve just enhancing the hiera data when we declare the shards at a host 
> level. So instead of:
>
> Server1.yaml::
> shards:
> - shard1
> - shard2
>
> We have something like this:
>
> Server1.yaml::
> shards:
> - shard1: 5G
> - shard2: 7G
>
> ...which is workable. The thing I don't like about it is that I'm defining 
> the heap size at the host level, even though they should be consistent for 
> any given shard across servers. This is redundant at best, and leaves 
> things open for inconsistency across servers at worst. I kind of want to 
> raise the heap size declarations up above the host level, up to the 
> application or environment level I guess. But I would still need to declare 
> which shards are where at the host level. In short, I guess I need the 
> deployment to look for what shards should be on a host at the host level, 
> and then look up the chain a bit to see what heap size that shard should 
> have. 
>
> Does this sound doable?
>
>

At that level of abstraction, yes, it sounds doable, but the devil is in 
the details.  The per-host shard data probably need to be references (by 
name) to shard details defined at some more general level of your 
hierarchy.  You can then use a defined type to declare all the shards for 
each server, looking up the shared details for each shard by its name.  
From the data structure you describe, I suppose you already have something 
going in this direction.  So your data might look more like this:

Server1.yaml:
shards:
- shard1
- shard2

Server5.yaml:
shards:
- shard1
- shard5

common.yaml:
shard_details:
  shard1:
    max_heap: 5G
  shard2:
    max_heap: 7G
  shard5:
    max_heap: 4G


There are any number of ways you could tweak the data structure, but that 
general approach seems sound to me.
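
For illustration, here is a minimal sketch of how a defined type could 
consume that data.  The names solr::shard and solr::shards are hypothetical, 
as is the notify resource standing in for whatever per-shard resources you 
really manage.  It assumes every shard listed for a host also has an entry 
under shard_details, and it uses the plain hiera() function available in 
Puppet 3.

```puppet
# Hypothetical defined type: one instance per shard hosted on the node.
define solr::shard ($details) {
  # $name is the shard name taken from the host-level 'shards' array;
  # use it as the key into the shared per-shard details.
  $max_heap = $details[$name]['max_heap']

  # Placeholder for the real per-shard resources (JVM options, service, ...).
  notify { "solr shard ${name}: max heap ${max_heap}": }
}

# Hypothetical class that ties the two levels of data together.
class solr::shards {
  $shards        = hiera('shards')         # array from the host-level file
  $shard_details = hiera('shard_details')  # hash from common.yaml

  # An array title declares one solr::shard instance per element.
  solr::shard { $shards:
    details => $shard_details,
  }
}
```

With the data above, Server1 would get solr::shard instances for shard1 
(5G) and shard2 (7G), and Server5 for shard1 (5G) and shard5 (4G), so each 
heap size is defined in exactly one place.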


John
