Extending Jerakia with lookup plugins


In my last post, I introduced Jerakia as a data lookup engine for Puppet and other tools. We looked at how to use lookup policies to get around complex requirements and edge cases, including providing different lookup hierarchies to different parts of an organisation. In this post we are going to look at extending the core functionality of Jerakia. Jerakia is very pluggable, from data sources to output filters, and I’ll cover all of them in the coming days, but today we are going to cover lookup plugins.

Lookup plugins

Last week we looked at Jerakia policies, which are containers for lookups. A lookup, at the very least, contains a name and a datasource. A classic lookup would be:

Within a lookup we have access to both the request and all scope data sent by the requestor. Having access to read and modify these gives us a great deal of flexibility. Jerakia policies are written in Ruby DSL so there is nothing stopping you from putting any amount of functionality directly in the lookup. However, that makes for rather long and complex policies and isn’t easy to share or re-use. The recommended way therefore to add extra functionality to a lookup is to use the plugin mechanism.

As an example, let’s look at how Jerakia differs from a standard Hiera installation in terms of data structure and filesystem layout, comparing Hiera’s YAML backend with Jerakia’s file datasource. Puppet data mappings are requested from Hiera as modulename::key and are searched for in the entries of the configured hierarchy. Jerakia has the concept of a namespace and a key, and when requesting data from Puppet, the namespace is mapped to the name of the module initiating the request. Jerakia looks for a file named after the namespace and uses the variable name as the key within it. Take this example:

A standard Hiera filesystem would contain something like:

In Jerakia, by default, this would be something like:

The difference is subtle enough, and if we wanted to use Jerakia against a Hiera-style file layout with keys formatted as module::key, we could manipulate the request to add the first element of request.namespace to the key, separated by ::, and then drop the namespace completely. You could implement this directly in the lookup, but a better way is to use a plugin, keeping the functionality modular and shareable. Jerakia ships with a plugin to do just this. It’s called, unsurprisingly, hiera.
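
To make that rewrite concrete, here is a minimal sketch in plain Ruby. It does not use Jerakia itself: the Request struct and the hiera_style_rewrite method are hypothetical stand-ins for illustration, and the only thing taken from the description above is the rule of folding the first namespace element into the key and then dropping the namespace.

```ruby
# Hypothetical stand-in for a lookup request; this is not Jerakia's own
# request class. The namespace is modelled as an array of elements.
Request = Struct.new(:namespace, :key)

# Fold the first namespace element into the key, separated by "::",
# and return a request with an empty namespace (Hiera-style key).
def hiera_style_rewrite(request)
  return request if request.namespace.empty?

  Request.new([], "#{request.namespace.first}::#{request.key}")
end

original  = Request.new(['apache'], 'port')
rewritten = hiera_style_rewrite(original)

puts rewritten.key                # apache::port
puts rewritten.namespace.inspect  # []
```

The sketch returns a new request rather than modifying the original, which is simply a choice made here to keep the example easy to follow.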

Using lookup plugins

To use a plugin in a lookup, it must be loaded using the :use parameter to the lookup block, e.g.:

If you want to use more than one plugin, the argument to :use can also be an array:

Once a plugin is loaded into the lookup, it exposes its methods in the plugin.name namespace. For example, the hiera plugin has a method called rewrite_lookup which rewrites the lookup key and drops the namespace from the request, as described above. So to implement this functionality we would call the method using the plugin mechanism:

Writing plugins

Lookup plugins are loaded as jerakia/lookup/plugin/pluginname from the Ruby load path, meaning they can be shipped as a rubygem or placed under jerakia/lookup/plugin relative to the plugindir option in the configuration file. The boilerplate template for a plugin is formed by creating a module with a name corresponding to your plugin name in the Jerakia::Lookup::Plugin class… in reality that looks a lot simpler than it sounds:

We can now define methods inside this plugin that will be exposed to our lookups in the plugin.mystuff namespace. For this example we are going to generate a dynamic hierarchy based on a top level variable, role. The variable contains a string delimited by ::, and starting with the deepest level we construct a hierarchy up to the top. For example, if the role variable is set to web::frontend::application_foo we want to generate a search hierarchy of:

To do this, we will write a method in our plugin class called role_hierarchy and then use it in our lookup. First, let’s add the method;
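
As a rough sketch of the logic, here is a standalone version in plain Ruby that can be run without Jerakia. It takes the role as an argument rather than reading it from the scope, and joining the elements with a / to form path entries is an assumption made for illustration.

```ruby
# Standalone sketch: in a real plugin the role would come from the
# request scope rather than from an argument.
# Assumption: hierarchy entries are the role elements joined with "/".
def role_hierarchy(role)
  parts = role.split('::')
  # Build every prefix of the role, starting with the deepest level.
  parts.length.downto(1).map { |depth| parts.first(depth).join('/') }
end

puts role_hierarchy('web::frontend::application_foo')
# web/frontend/application_foo
# web/frontend
# web
```

Because the result is just an array of strings, it could be dropped into a searchpath alongside any fixed entries.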

We can now use this within our lookup by loading the mystuff plugin and calling our method as plugin.mystuff.role_hierarchy. Here is the final lookup policy using our new plugin:


My example here is pretty simple, but it demonstrates the flexibility of Jerakia to create a dynamic search hierarchy. With access to the request object and the requestor’s scope, lookup plugins can be very powerful tools to get around the most complex of edge cases. Being able to write Jerakia policies in a native Ruby DSL is great for flexibility, but runs the risk of having an excessive amount of code complicating your policy files. The plugin mechanism offers a way to keep extended lookup functionality separate, and to make it shareable and reusable.

Up next…

We’re still not done; Jerakia offers numerous extension points. In my next post we will look at output filters, which parse the data returned by the backend data source. We will look first at what I consider the most useful of the output filters, encryption, which uses hiera-eyaml to decrypt data strings in your returned data no matter what datasource is used, and then at how easy it is to write your own output filters. After that we will look at extending Jerakia to add further data sources, so stay tuned!


Solving real world problems with Jerakia


I’ve always been a great admirer of Hiera, and I still remember the pain and turmoil of living in a pre-Hiera world trying to come up with insane code patterns within Puppet to try and organize my data in a sensible way. Hiera was, and still is, the answer to lots of problems.

For me, however, when I moved beyond a small-scale, single-customer orientated Puppet implementation into larger, complex and diverse environments, I started to find that I was spending a lot of time trying to figure out how to model things in Hiera to meet my requirements. It’s a great tool, but it has some limitations in the degree of flexibility it offers around how to store and look up your data.

Some examples of problems I was trying to solve were: how can I…

  • use a different backend for one particular module?
  • give a team a separate hierarchy just for their app?
  • give access to a subset of data to a particular user or team?
  • enjoy the benefits of eyaml encryption without having to use YAML?
  • implement a dynamic hierarchy rather than hard coding it in config?
  • group together application-specific data into separate YAML files?

There are many more examples, and after some time I began exploring some solutions. Initially I started playing around with the idea of a “smart backend” to Hiera that could give me more flexibility in my implementation, and that eventually grew into what is now Jerakia. In fact, you can still use Jerakia as a regular Hiera backend, or you can wire it directly into Puppet as a data binding terminus.

Introducing Jerakia

Jerakia is a lookup tool that has the concept of a policy, which contains a number of lookups to perform. Policies are written in a Ruby DSL, allowing the maximum flexibility to get around those pesky edge cases. In this post we will look at how to deploy a standard lookup policy and then enhance it to solve one of the use cases above.

Define a policy

After installing Jerakia, the first step is to create our default policy in /etc/jerakia/policy.d/default.rb

Jerakia policies are containers for lookups. A policy can have any number of lookups, and they are run in the order they are defined.

Writing our first lookup

A lookup must contain, at the very least, a name and a datasource to use for the lookup. The datasource that currently ships with Jerakia is the file datasource. This takes several options, including format and searchpath, to define how lookups should be processed. Within the lookup we have access to scope[], which contains all the information we need to determine what data should be returned. In Puppetspeak, the scope contains all facts and top-level variables passed from the agent.

We now have a fairly standard lookup policy which should be familiar to Hiera users. A Jerakia lookup request contains two parts, a lookup key and a namespace. This allows us to group together lookup keys such as port, docroot and logroot into a namespace such as apache. When integrating from Hiera/Puppet, the module is used for the namespace, and the variable name for the key. In Puppet we declare:

This will reach Jerakia as a lookup request with the key port in the namespace apache, and with our lookup policy above a typical request would look for the key “port” in the following files, in order:

This is slightly different behaviour from what you would find in Puppet using Hiera. If you are using Jerakia against an existing Hiera filesystem layout which has namespace::key in path, rather than key in path/namespace.yaml, then check out the hiera plugin, which provides a lookup method called plugin.hiera.rewrite_lookup to mimic Hiera behaviour. More on lookup plugins in the next post!
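
To illustrate the path/namespace.yaml layout, here is a small sketch in plain Ruby of how a request in the apache namespace might map onto files. It does not call Jerakia. The docroot, the searchpath entries, the scope values and the candidate_files helper are all hypothetical and chosen only for the example.

```ruby
# Hypothetical helper: one candidate file per searchpath entry, named
# after the namespace. The key is then looked up inside each file.
def candidate_files(docroot, searchpath, namespace, format: 'yaml')
  searchpath.map { |entry| File.join(docroot, entry, "#{namespace}.#{format}") }
end

# Illustrative scope values, standing in for facts sent by the agent.
scope = { fqdn: 'web01.example.com', environment: 'production' }

# Illustrative searchpath, interpolating values from the scope.
searchpath = [
  "hostname/#{scope[:fqdn]}",
  "environment/#{scope[:environment]}",
  'common'
]

puts candidate_files('/var/lib/jerakia', searchpath, 'apache')
# /var/lib/jerakia/hostname/web01.example.com/apache.yaml
# /var/lib/jerakia/environment/production/apache.yaml
# /var/lib/jerakia/common/apache.yaml
```

With a Hiera-style layout the same request would instead look for a key called apache::port in files named after the hierarchy entries themselves.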

Adding some complexity

So far what we have done is not rocket science, and certainly nothing that can’t be easily achieved with Hiera. So let’s mix it up a bit by defining a use case that will change our requirements. This use case is based on a real world scenario.

We have a team based in Ireland. Their servers are identified with the top level variable location. They need to be able to manage PHP and Apache using Puppet, but they need a data lookup hierarchy based on their project, which is something only they use. Furthermore, we wish to give them access to manage data specifically for the modules they are responsible for, without being able to read, override or edit data for other modules (e.g. network, firewall, kernel).

So, the requirements are to provide a different lookup hierarchy for servers that are in the location “ie”, but only when configuring the apache or php modules, and to source the data from a different location, separate from our main data repo. With Jerakia this is easily solvable. Let’s first look at creating the lookup for the Ireland team…

So now we have defined a separate lookup for our Ireland-based friends. The problem here is that every request will first load the lookup ireland and then proceed down to the main lookup. This is no different from just adding new hierarchy entries in Hiera: they are global. This means potentially bad data creeping in if, for example, they accidentally override the firewall rules or network configuration.

To get around this we can use the confine method in the lookup block to restrict this lookup to requests that have “location: ie” in the scope and are requesting keys in the apache or php namespaces, meaning the requesting modules. If the confine criteria are not met then the lookup will be invalidated and skipped, and the next one used. Finally, we do not want to risk dirty configuration from default values that we have in our hierarchy for apache and php, so we need to tell Jerakia that if this lookup is considered valid (i.e. it has met all the criteria of confine) then it should only use this lookup and not proceed down the chain of available lookups. To do this, we use the stop method.
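
The interplay between confine and stop can be sketched as a toy model in plain Ruby. This is not Jerakia’s implementation: the Lookup struct, lookups_to_run and policy_for are hypothetical names, and treating each match as an unanchored regular expression is an assumption made for the example.

```ruby
# Toy model of lookup selection; not Jerakia's internals.
# confines is a list of [value, matches] pairs, stop is a boolean.
Lookup = Struct.new(:name, :confines, :stop) do
  # Valid when every confine has at least one pattern matching its value.
  # Assumption: patterns are treated as unanchored regular expressions.
  def valid?
    confines.all? do |value, matches|
      Array(matches).any? { |m| value.to_s.match?(Regexp.new(m)) }
    end
  end
end

# Walk the lookups in definition order and return the names that run.
def lookups_to_run(lookups)
  selected = []
  lookups.each do |lookup|
    next unless lookup.valid? # confine not met: skip this lookup
    selected << lookup.name
    break if lookup.stop      # stop: do not continue down the chain
  end
  selected
end

# Hypothetical policy: a confined ireland lookup, then an open main lookup.
def policy_for(scope, namespace)
  [
    Lookup.new(:ireland, [[scope[:location], 'ie'], [namespace, %w[apache php]]], true),
    Lookup.new(:main, [], false)
  ]
end

p lookups_to_run(policy_for({ location: 'ie' }, 'apache'))   # [:ireland]
p lookups_to_run(policy_for({ location: 'ie' }, 'firewall')) # [:main]
p lookups_to_run(policy_for({ location: 'uk' }, 'apache'))   # [:main]
```

The three calls at the end cover the cases described above: the Ireland team managing apache sees only its own lookup, while every other request falls through to the main one.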

The confine method takes two arguments, a value and a match. The match is a string that can contain regular expressions. The confine method supports either a single match or an array of matches to compare. So in order to confine this lookup to the location ie, we can confine it as follows:

By confining in this way we tell Jerakia to invalidate and skip this lookup unless location is “ie”. Similarly, we can add another confine statement to ensure that only lookups for the apache and php namespaces are handled by this lookup. Our final policy would look like this:


This example demonstrates that, using Jerakia lookup policies, you can tailor your data lookups quite extensively, giving a high degree of flexibility. This is especially useful in larger organisations with many customers using one central Puppet infrastructure.

This is just one example of using Jerakia to solve a use case. I hope to blog a mini-series on other use cases and solutions, and I welcome any suggestions that come from the real world!

Next up…

Jerakia is still fairly experimental at the time of writing (0.1.6) and there is still a lot of room for improvement, both in exposed functionality and in the underlying code. I’d like to see it mature, and there are still plenty of features to add and code to be tidied up. There is some excellent work being done in Puppet 4.0 with regard to the internal handling of data lookups that I think would complement our aims very well (currently all work has been done against 3.x), and the next phase of major development will be exploring these options.

Also, I talk about Puppet a lot because I am a Puppet user and the problems that I was trying to solve were Puppet/Hiera related, but that doesn’t mean that Jerakia is exclusively a Puppet tool. The plan is to integrate it with other tools in the devops space, which, given the policy-driven model, should be fairly straightforward.

My next post will focus on extending Jerakia and will cover writing and using lookup plugins to enhance the power of lookups, and output filters to provide features like eyaml-style decryption of data regardless of the data source. I will also cover Jerakia’s pluggable datastore model that encourages community development.