Hierarchical Data Lookups with Ansible

A new lookup plugin for Ansible that uses Jerakia to do hierarchical data lookups right from your playbook.

Jerakia is an open source, highly flexible data lookup tool for performing hierarchical data lookups from numerous data sources. It’s aim is to be a stand alone tool that can integrate with a wide variety of tools. In this post I will examine how to integrate it with Ansible to be able to perform powerful hierarchical lookups right from your playbook.

Hierarchical lookups

I recently wrote an in-depth look at Hierarchical data lookups. What we mean by hierarchical lookups is essentially a key/value data lookup that traverses a hierarchy of queries until it finds an answer. This enables us to define data at a global level and then override the values at different points of our hierarchy depending on the scope (ansible facts) returned from the node, this model of lookup is particularly suited to infrastructure configuration. The hierarchy is configurable based on whatever is appropriate for your environment, you may for example define variables at the global level and then override them depending on the operational environment, location or role of the requestor.

The difference between this and traditional lookups, or dynamic inventories, is that you do not have to collate and organise the data itself against each host prior to using it. For example if you are overriding a setting based on environment and location you do not have to build a data set that defines which hosts are in which environment or location, that data is already available in the facts that Ansible gathers at runtime, and is used dynamically when performing a lookup against the hierarchy tree.

A simple Ansible playbook

Let’s start with a simple Ansible playbook to manage NTP. I’m not sure why everyone uses NTP as the go-to configuration management example, but who am I to argue with such a well established convention. In this example we’re going create a playbook that manages the servers defined in /etc/ntp.conf.

Here is our playbook:

And our corresponding template:

If I run this, all my hosts get an identical ntp.conf:

Overriding the behaviour

Depending on certain characteristics of the node we are configuring, we may want to configure different NTP servers. In this example I want to retain these defaults but easily override these values depending on the operating environment that a host is in, and further more to be able to further override that for a specific host if I counter a one off edge case.

Ansible facts

For this example, we are going to work with the ansible fact ansible_nodename and a custom fact called environment that is available on the nodes in facts.d;

Ansible lookups

Ansible has several sources of runtime data, the integration with Jerakia for performing hierarchical lookups is currently possible by a lookup plugin (ansible-jerakia). Lookup plugins can be used within Ansible playbooks to call out to an external data source, in this case Jerakia, to populate variables with data. With Jerakia, we pass on information about the node in the lookup request, this information is gathered dynamically from the nodes facts. We refer to this collection of information as the scope. Rather than saying lookup the value of this key, we are now saying Lookup the value of this key in the context of this scope

Configuring Jerakia

I’m going to assume you already have a Jerakia instance running, and for simplicity it is on localhost and the port is default.

When a lookup request for a key is received by Jerakia, it uses one or more lookups contained in a policy to search for the data, policy files are Ruby based and live in the configured policy.d folder. Let’s start with a fairly basic Jerakia policy that defines a global layer and then overrides on environment and the specific hostname of the machine. Something like

Using this configuration, when Jerakia receives a lookup request, it will first check to see if there is any matching data at the hostname level specific to that node, if not it will fall down to the next level and check if there is any matching data in the corresponding environment that the node belongs to before finally falling down to the “global” layer.

Finally, you should create a token so that Ansible can authenticate against the Jerakia API

Integrating Ansible with Jerakia

Once you’ve installed the lookup plugin into the correct location, you should create a jerakia.yaml file at the root of your playbook. In this file we specify the Jerakia server with authentication details, and also pass on the scope values we are going to need to perform lookups.

If you refer back to the Jerakia policy file above, you’ll see that in the scope section of our configuration here we are mapping Ansible facts to keys that will be available in the scope from within the Jerakia policy. Nested fact values can be referenced using the dot-notation as in the above example

Add some data to Jerakia

We’ll start off by adding our default NTP servers to Jerakia, at the global level so it applies to everything. We will create the key servers in the namespace ntp so under the data directory we need a directory called global and a YAML document inside that directory containing the keys within the ntp namespace.

We can now validate that this is correct by performing a Jerakia lookup on the command line

Look up data from the Ansible playbook

Now we have Jerakia running and the lookup plugin in place we can modify our playbook to lookup the value for the NTP servers array from Jerakia rather than hard coding it in the playbook. The Jerakia lookup plugin takes an argument of the namespace and the lookup key separated by a ‘/’.

If we run this playbook now, things should stay as they are, but the value of the NTP servers array is now coming from Jerakia.

Overriding data in the hierarchy

Now if we wish to keep these defaults for NTP servers across our estate, but we want to override these in the production environment we can do this simply by adding more data to the Jerakia hierarchy in the environment level. Jerakia will search for data within the environments directory under the sub-directory corresponding to the environment of the node. It will do this before hitting the global level, so any data defined here will win.

Now we can see on the command line that we’re able to get different results from Jerakia by feeding it a different environment

Now if I re-run my Ansible playbook from before, I should see that servers in the production environment (in this case ansible3) should be configured differently.


In our example we override based on environment and then hostname, but in your organisation this could be completely different, maybe it makes more sense to override on role, location, datacenter..etc. Building the right hierarchy to suit your infrastructure needs is crucially important.

This post touches the surface of Jerakia. It has many more advanced features, such as cascading lookups that traverse through a hierarchy and return a consolidated set of data, Vault integration for secrets management and a very powerful flexible configuration. Also, it’s pluggable so you can use the same interface to retrieve data from a variety of sources, including file based hierarchies, HTTP APIs and databases.

Jerakia is pretty established already but the integration with Ansible is still fairly new, any feedback on how to improve the integration would be greatly appreciated. More information on Jerakia can be found here and the Ansible lookup plugin can be downloaded from GitHub

Follow and share if you liked this