Puppet configuration variables and Hiera.
Managing configuration variables within Puppet has always given me a bit of a headache, and I’ve never found a way to do it that I’m altogether happy with, particularly when deploying complex applications that need a lot of configuration variables, sometimes hundreds, across multiple environments. I started thinking a while ago that Puppet wasn’t the best place to keep these variables in the first place. For starters, this is really valuable data: plenty of other applications could benefit from knowing how your software is configured, so why should Puppet keep all of this information to itself? The original extlookup() function in Puppet provides some decoupling of configuration data from Puppet manifests, but I found it limiting, and maintaining a bunch of CSV files was not very elegant. I’ve been interested in R.I. Pienaar’s Hiera for a while, so I thought I’d give it a proper spin and see whether it meets my needs.
Hiera itself is a standalone configuration data store that supports multiple back ends including YAML, JSON and Puppet itself, and adding more back ends is fairly straightforward for anyone competent with Ruby. Thanks to hiera-puppet it plugs nicely into Puppet.
Configuring a basic Hiera setup
After installing hiera (gem install hiera), I want to test it by setting up a pretty basic configuration store that will override my configuration variables based on environment settings of dev, stage or live. Let’s take a variable called $webname. I want to set it correctly in each of my three environments, or default it to localhost.
Firstly, I create four YAML files in /etc/puppet/hieradata:

    [root@localhost hieradata]# cat dev.yaml
    ---
    webname: dev.app.local

    [root@localhost hieradata]# cat stage.yaml
    ---
    webname: stage.app.mydomain.com

    [root@localhost hieradata]# cat live.yaml
    ---
    webname: www.app.mydomain.com

    [root@localhost hieradata]# cat common.yaml
    ---
    webname: localhost
Now that I have a YAML file representing each environment, I create a simple config in /etc/puppet/hiera.yaml that tells Hiera to search my environment YAML file first, followed by common.yaml. The %{env} token is quoted because stricter YAML parsers reject a plain scalar that begins with %.

    ---
    :backends:
      - yaml
    :logger: console
    :hierarchy:
      - "%{env}"
      - common
    :yaml:
      :datadir: /etc/puppet/hieradata
Now using hiera from the command line, I can look up the default value of $webname with the following command

    [root@localhost puppet]# hiera -c /etc/puppet/hiera.yaml webname
    localhost
But if I want to know the value for the live and dev environments, I can pass env to Hiera as a key=value scope variable after the key:

    [root@localhost puppet]# hiera -c /etc/puppet/hiera.yaml webname env=dev
    dev.app.local
    [root@localhost puppet]# hiera -c /etc/puppet/hiera.yaml webname env=live
    www.app.mydomain.com
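To make the search order concrete, here is a rough Ruby sketch of what a priority lookup like this boils down to. It is my own illustration, not Hiera’s actual code: the DATA hash stands in for the four YAML files, and the interpolate and lookup helpers are hypothetical names. I am also assuming that a hierarchy level whose variable is unset is simply skipped, which matches the behaviour seen on the command line above.

```ruby
# Conceptual sketch of a priority lookup; this is NOT Hiera's own code.
# DATA stands in for the YAML files under /etc/puppet/hieradata.
DATA = {
  "dev"    => { "webname" => "dev.app.local" },
  "stage"  => { "webname" => "stage.app.mydomain.com" },
  "live"   => { "webname" => "www.app.mydomain.com" },
  "common" => { "webname" => "localhost" }
}.freeze

# Same order as the :hierarchy: list in hiera.yaml.
HIERARCHY = ["%{env}", "common"].freeze

# Replace each %{name} token with its value from scope ("" when unset).
def interpolate(template, scope)
  template.gsub(/%\{([^}]+)\}/) { scope.fetch(Regexp.last_match(1), "") }
end

# Walk the hierarchy top to bottom and return the first value found.
def lookup(key, scope = {})
  HIERARCHY.each do |level|
    source = interpolate(level, scope)
    next if source.empty? # assumption: an unset variable skips the level
    values = DATA[source]
    return values[key] if values && values.key?(key)
  end
  nil
end

puts lookup("webname")                      # no env given, falls to common
puts lookup("webname", { "env" => "dev" })
puts lookup("webname", { "env" => "live" })
```

The first match wins, so anything set in an environment file shadows the value in common.yaml, and common.yaml only answers when nothing more specific does.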
Accessing this from Puppet
I can now access these variables directly from my Puppet modules using the hiera() function provided by hiera-puppet. In this example, I already have a fact called ${::env} that is set to dev, stage or live (in my particular setup we use the Puppet environment variable for other things).

    class myapplication ( $webname = hiera("webname") ) { ... }
Adding more scoping
OK, that’s a fairly simple setup, but it demonstrates how easy it is to get up and running with Hiera. The requirements I had were a little more complex. Firstly, our hierarchy is broken down by both environment (live, stage, dev, etc.) and location. I have multiple environments in multiple locations, and a particular location will be either a live, stage or dev environment. So some variables I want to override at the environment level, and some at the more granular location level.
Secondly, I don’t like the idea of asking Hiera for $webname. That doesn’t tell me anything: what is $webname, and what uses it? Consider a more generic variable called $port; that’s going to be confusing. So I started thinking about ways of grouping and scoping my variables. I solved this by introducing a module parameter in Hiera, alongside environment and location, and placing the variables for a particular module in its own YAML file, using a filesystem layout to determine the hierarchy.
My new hieradata file system looks a little like this:

    |-- dev
    |   |-- glasgow
    |   `-- london
    |-- live
    |   |-- birmingham
    |   |-- london
    |   `-- dublin
    `-- stage
        |-- birmingham
        |-- london
        `-- dublin
Now for each of my modules, I create a YAML file at the folder level where I want to override values for that module. Taking the previous example, let’s say that I want $webname to be www.myapp.mycorp.com for all live environments, except for Dublin, which I want to be a special case of www.myapp.mycorp.ie. To accomplish this I create the following two files:

    [root@localhost hieradata]# cat live/myapplication.yaml
    ---
    webname: www.myapp.mycorp.com

    [root@localhost hieradata]# cat live/dublin/myapplication.yaml
    ---
    webname: www.myapp.mycorp.ie
Hiera-puppet will pass the value of $calling_module from Puppet to Hiera, and we can use this in our hierarchy in hiera.yaml. NOTE: Currently you will need this patch to hiera-puppet in order for this to work!
So our new /etc/puppet/hiera.yaml file looks like:

    ---
    :backends:
      - yaml
    :logger: console
    :hierarchy:
      - "%{env}/%{location}/%{calling_module}"
      - "%{env}/%{calling_module}"
      - common
    :yaml:
      :datadir: /etc/puppet/hieradata
On the command line, we can see that environment, location and calling module are all used when looking up a configuration variable:

    [root@localhost hieradata]# hiera -c /etc/puppet/hiera.yaml webname env=live calling_module=myapplication
    www.myapp.mycorp.com
    [root@localhost hieradata]# hiera -c /etc/puppet/hiera.yaml webname env=live calling_module=myapplication location=london
    www.myapp.mycorp.com
    [root@localhost hieradata]# hiera -c /etc/puppet/hiera.yaml webname env=live calling_module=myapplication location=dublin
    www.myapp.mycorp.ie
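For anyone who wants to see how the directory layout drives those three results, the following Ruby sketch rebuilds the two myapplication.yaml files in a temporary directory and walks the same three-level hierarchy. It is only an approximation of the idea, not Hiera’s real YAML backend, and the helper names (interpolate, lookup, with_sample_datadir) are made up for the illustration.

```ruby
require "yaml"
require "tmpdir"
require "fileutils"

# Conceptual sketch only; Hiera's real YAML backend does more than this.
HIERARCHY = [
  "%{env}/%{location}/%{calling_module}",
  "%{env}/%{calling_module}",
  "common"
].freeze

# Replace each %{name} token with its value from scope ("" when unset).
def interpolate(template, scope)
  template.gsub(/%\{([^}]+)\}/) { scope.fetch(Regexp.last_match(1), "") }
end

# Return the first value for key found while walking the hierarchy.
def lookup(datadir, key, scope)
  HIERARCHY.each do |level|
    file = File.join(datadir, "#{interpolate(level, scope)}.yaml")
    next unless File.file?(file) # no file at this level, try the next one
    data = YAML.load_file(file)
    return data[key] if data.is_a?(Hash) && data.key?(key)
  end
  nil
end

# Build a throwaway datadir holding the two files from the example above.
def with_sample_datadir
  Dir.mktmpdir do |dir|
    FileUtils.mkdir_p(File.join(dir, "live", "dublin"))
    File.write(File.join(dir, "live", "myapplication.yaml"),
               "---\nwebname: www.myapp.mycorp.com\n")
    File.write(File.join(dir, "live", "dublin", "myapplication.yaml"),
               "---\nwebname: www.myapp.mycorp.ie\n")
    yield dir
  end
end

with_sample_datadir do |dir|
  base = { "env" => "live", "calling_module" => "myapplication" }
  puts lookup(dir, "webname", base.merge({ "location" => "london" }))
  puts lookup(dir, "webname", base.merge({ "location" => "dublin" }))
end
```

London has no file of its own, so the lookup falls through to live/myapplication.yaml, while Dublin is answered by the more specific live/dublin/myapplication.yaml.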
In Puppet, I have ${::env} and ${::location} already set as facts, and since $calling_module will get automatically passed to Hiera from Puppet, my myapplication class looks no different…

    class myapplication ( $webname = hiera("webname") ) { ... }
but knowing the module name means I can easily find where this value is set, and I can easily see what configuration a module requires by examining its YAML files under /etc/puppet/hieradata.
Conclusion
In conclusion, I’m now convinced that moving configuration variable data out of Puppet is a very good thing. Other systems can now easily query this valuable information, either on the command line or directly with Ruby. By forcing the use of $calling_module I’ve introduced a sort of pseudo-scoping for my variables: for example, “port” now becomes “port calling_module=apache”, which carries a lot more meaning.
Many thanks to R.I. Pienaar for help in setting this up, as well as for providing the patch to scope.rb that enabled me to get this working.