Puppet data from CouchDB using hiera-http

Introducing hiera-http

I started looking at various places people store data, and ways to fetch it and realized that a lot of data storage applications are RESTful, yet there doesn’t seem to be any support in Hiera to query these things, so I whipped up hiera-http, a Hiera back end to connect to any HTTP RESTful API and return data based on a lookup. It’s very new and support for other stuff like SSL and Auth is coming, but what it does support is a variety of handlers to parse the returned output of the HTTP doc, at the moment these are limited to YAML and JSON (or just ‘plain’ to simply return the whole response of the request). The following is a quick demonstration of how to plug CouchDB into Puppet using Hiera and the hiera-http backend.

Hiera-http is available as a rubygem, or from GitHub: http://github.com/crayfishx/hiera-http


Apache CouchDB is a scalable database that uses no set schema and is ideal for storing configuration data as everything is stored and retrieved as JSON documents. For the purposes of this demo I’m just going to do a very simple database installation with three documents and a few configuration parameters to demonstrate how to integrate this in with Puppet.

After installing couchdb and starting the service I’m able to access Futon, the web GUI front-end for my couchdb service – using this I create three documents, “dev”, “common” and “puppet.puppetlabs.lan”

CouchDB documents
CouchDB documents

Next I populate my common and dev documents with some variables.

Common document populated with data

Now CouchDB is configured I should be able to query the data over HTTP

Query with Hiera

After installing hiera-http I can query this data directly from Hiera…

First I need to configure Hiera with the HTTP back end. The search hierarchy is determined by the :paths: configuration parameter and since CouchDB returns JSON I set that as the output handler.

I can now query this directly from Hiera on the command line

And of course, that means that this data is now available from Puppet and if I add some overriding configuration variables to my dev document in CouchDB, my lookup will resolve based on my environment setting in Puppet

Hiera-http is fully featured and supports all standard Hiera back end functions such as hiera_hash, hiera_array order overrides.

Future stuff

I’m going to carry on working on new features for hiera-http – including basic auth, HTTPS/SSL, proxys and a wider variety of output handlers – I would like for this back end to be flexible enough to allow users to configure Hiera with any network service that uses a RESTful API to perform data lookups. Keep watching.

Follow and share if you liked this

Introducing hiera-mysql MySQL Backend for Hiera


Some time ago I started looking at Hiera, a configuration datastore with pluggable back ends that also plugs seamlessly into Puppet for managing variables. When I wrote hiera-gpg a few months ago I realised how easy extending Hiera was and the potential for really useful backends that can consolidate all your configuration options from a variety of systems and locations into one streamlined process that systems like Puppet and other tools can hook into. This, fuelled by a desire to learn more Ruby, lead to hiera-mysql, a MySQL Backend for Hiera.


hiera-mysql is available as a ruby gem and can be installed with:

Note: this depends on the Ruby mysql gem, so you’ll need gcc, ruby-devel and mysql-devel packages installed. Alternativley the source can be Downloaded here

MySQL database

To demonstrate hiera-mysql, here’s a simple MySQL database some sample data;

Configuring Hiera

In this example we’re going to pass the variable “env” in the scope. hiera-mysql will interpret any scope variables defined in the query option, and also has a special case for %{key}. Example:

Running Hiera

With the above example, I want to find the value of the variable colour in the scope of live

If I add more rows to the database that match the criteria, and use Hiera’s array search function by passing -a I can make Hiera return all the rows

Hiera’s pluggable nature means that you can use this back end alongside other back ends such as YAML or JSON and configure your search order accordingly.


Currently hiera-mysql will only return the first element of a row, or an array of first elements, so you can’t do things like SELECT foo,bar FROM table. I intend to introduce this feature by implementing Hiera’s hash search in a future release. Also, the module could do with slightly better exception handling around the mysql stuff. Please let me know if theres anything else that would improve it.


And of course, because Hiera is completely transparent, accessing these variables from Puppet couldn’t be easier!


  • Github homepage for hiera-mysql
  • Official Hiera Project Homepage
  • Hiera – A pluggable hierarchical data store
    Follow and share if you liked this
  • Secret variables in Puppet with Hiera and GPG

    Last week I wrote an article on Puppet configuration variables and Hiera. This almost sorted out all my configuration variable requirements, bar one; what do I do with sensitive data like database passwords, hashed user passwords…etc that I don’t want to store in my VCS repo as plaintext.

    Hiera allows you to quite easily add new backends, so I came up with hiera-gpg, a backend plugin for Hiera that will GPG decrypt a YAML file on the fly. It’s quite minimal and there is some stuff I’d like to do better – for instance it currently shells out to the GPG command, hopefully someone has some code they can contribute that’ll use the GPGME gem instead to do the encryption bit.

    Once you’re up and running with Hiera, you can get the hiera-gpg backend from Rubygems…

    We run several Puppetmasters, so for each one I create a GPG key and add the public key to a public keyring that’s kept in my VCS repo. For security reasons I maintain a dev and a live keyring so only live Puppetmasters can see live data.

    Currently hiera-gpg doesn’t support key passwords, I’ll probably add this feature in soon but it would mean having the password stored in /etc/puppet/hiera.yaml as plaintext anyway, so I don’t see that as adding much in the way of security.

    So I have my GPG secret key set up in roots homedir:

    Next I add my GPG public key to the keyring for live puppetmasters (in my set up, /etc/puppet/keyrings/live is a subversion checkout)

    Now I can create a YAML file in my hieradata folder and encrypt it for the servers in my live keyring.

    If like me you have more than one puppetmaster in your live keyrings, multiple -r entries can be specified on the command line for gpg, you should encrypt your file for all the puppet master keys that are allowed to decrypt it.

    Now you just need to tell Hiera about the GPG backend, My previous Hiera configuration now becomes:

    Here we’re telling Hiera to behave exactly as it used to when we just had the YAML back end, and if it doesn’t find the value you are requesting from YAML it will query the GPG back end which will pick up on your %{calling_module}.gpg.

    Now I can query Hiera on the command line to find my live MySQL root password with:

    In Puppet, I reference my variables in exactly the same way as any other variable

    Theres probably lots of stuff I can improve here, but I have the basics of what I need, a transparent method of data storage using GPG encryption and no sensitive data stored in my VCS repo as plain text.

    Follow and share if you liked this

    Puppet configuration variables and Hiera.

    Managing configuration variables within Puppet has always given me a bit of a headache, and I’ve never really found a way to do it that I’m all together happy with, particularly when dealing with the deployment of complex applications that require a lot, sometimes hundreds, of different configuration variables and multiple environments. I started thinking a while ago that Puppet wasn’t the best place to be keeping these variables in the first place. For starters, this is really valuable data we’re talking about, there may be lots of other applications that may benefit from having access to the way your software is configured, so why should Puppet retain all of this information exclusively for itself? The original extlookup() function in Puppet provides some decoupling of configuration data from Puppet manifests, but I found it a bit limiting and not very elegant having to maintain a bunch of CSV files. I’ve been interested in R.I.Pienaar’s Hiera for a while and thought I’d give it a proper spin and see if it meets my needs.

    Hiera itself is a standalone configuration data store that supports multiple back ends including YAML, JSON and Puppet itself, and adding more back ends to it is a fairly non-challenging task for anyone competent with Ruby. Thanks to hiera-puppet it plugs nicely into Puppet.

    Configuring a basic Hiera setup

    After installing hiera (gem install hiera), I want to test it by setting up a pretty basic configuration store that will override my configuration variables based on environment settings of dev, stage or live. Let’s take a variable called $webname. I want to set it correctly in each of my three environments, or default it to localhost.

    Firstly, I create four YAML files in /etc/puppet/hieradata

    Now I have a YAML file representitive of each environment, I create a simple config in /etc/puppet/hiera.yaml that tells Hiera to search for my environment YAML file followed by common.yaml.

    Now using hiera from the command line, I can look up the default value of $webname with the following command

    But now if I want to know the value for the live and dev environments I can pass an env flag to Hiera

    Accessing this from Puppet

    I can now access these variables directly from my Puppet modules using the hiera() function provided by hiera-puppet. In this example, I already have a fact called ${::env} that is set to dev, stage or live (in my particular set up we use the puppet environment variable for other things)

    Adding more scoping

    OK, thats a fairly simple set up but demonstrates how easy it is to get up and running with Hiera. The requirements I had were a little more complex. Firstly, our hierarchy is broken down into both environment (live, stage, dev..etc) and location. I have multiple environments in multiple locations, a particular location will either be a live, stage or dev environment. So some variables I want to override on the environment level, and some at the more granular location level.

    Secondly, I don’t like the idea of asking Hiera for $webname. That doesn’t tell me anything; what is $webname, what uses it? Consider a more generic variable called $port – that’s going to be confusing. So I started thinking about ways of grouping and scoping my variables. The way I solved this was to introduce a module parameter as well as environment and location in Hiera and place variables for a particular module in it’s own YAML file, using a filesystem layout to determine the hierarchy.

    My new hierdata file system looks a little like this

    Now for each of my modules, I create a YAML file in the folder level that I want to override with the values for my module. Taking the previous example, lets say that I want $webname to be www.myapp.mycorp.com for all live environments, except for Dublin, which I want to be a special case of www.myapp.mycorp.ie. To accomplish this I create the following two files:

    Hiera-puppet will pass the value of $calling_module from Puppet to Hiera, and we can use this in our hierarchy in hiera.yaml. NOTE: Currently you will need this patch to hiera-puppet in order for this to work!

    So our new /etc/puppet/hiera.yaml file looks like:

    On the command line, we can now see that environment, location and calling module are now used when looking up a configuration variable

    In Puppet, I have ${::env} and ${::location} already set as facts, and since $calling_module will get automatically passed to Hiera from Puppet, my myapplication class looks no different…

    but knowing the module name means I can easily find where this value is set, and I can easily see what configuration a module requires by examining its YAML files under /etc/puppet/hierdata


    In conclusion, I’m now convinced that moving configuration variable data out of Puppet is a very good thing. Now other systems can easily query this valuable information either on the command line or directly with Ruby. By forcing the use of $calling_module I’ve introduced a sort of pseudo scoping for my variables, so, for example… “port” now becomes “port calling_module=apache” and gives me a lot more meaning.

    Many thanks to R.I.Pienaar for help in setting this up, as well as providing the patch to scope.rb that enabled me to get this working.

    Follow and share if you liked this