Managing Puppet Secrets with Jerakia and Vault

A new approach to managing encrypted secrets in Puppet using Vault and Jerakia

Introduction and History

Over the past couple of years I’ve talked a lot about a project called Jerakia. Jerakia is a data lookup system inspired by Hiera, but built to be a stand-alone solution that is decoupled from Puppet, or any particular configuration management system that not only offers opportunities to integrate with other tools aside from Puppet thanks to it’s REST API server architecture but also offers a solution to people with far reaching edge cases around data complexity that are hard or impossible to solve in Hiera, it does this largely thanks to being configurable with native Ruby DSL.  If you’ve never heard of Jerakia before, you can read my initial blog post that covers the basics or see the official website.

Being able to deal with secret data, such as passwords and other sensitive data that needs to be served by Puppet has proved to be a very important requirement for Puppet users.  Shortly after Hiera was first released as a third party tool by R.I Pienaar in 2012 I developed one of the first pluggable backends for it, the now deprecated hiera-gpg.  hiera-gpg became hugely popular very quickly as people finally had a way to store production sensitive data along side other non-production data (eg: in the same Git repo) without compromising the details since anyone browsing the Git repo could only see the encrypted form of the key values.

As hiera-gpg grew in popularity as the first plugin of it’s kind to be able to solve the problem, it also suffered from a few design limitations and eventually hiera-eyaml was developed and became the next evolutionary step for handling sensitive data from Hiera.  hiera-eyaml had a better and more modern design than hiera-gpg and has served many users well over the years, but it re-implements a lot of what the yaml backend does with added capabilities to handle encryption, Hiera has always had the ability to support pluggable backends so you can source your data from a variety of different systems, whether they be files, databases or REST API services, but to be able to support encryption within a Hiera lookup you are tied to using a file based YAML back end.

Jerakia initially released with the ability to handle encrypted values from any data source, and up until now it’s done that using a mish-mash of the hiera-eyaml library to provide the decryption mechanism.  I’ve always felt this level of integration wasn’t ideal, hiera-eyaml was never designed to be a standalone solution to be used outside of Puppet and Hiera and the role of providing reliable and secure encryption for your sensitive data is an important one, and so I started looking at platforms that were built specifically for encryption, and more importantly, a shared encryption solution that I could use throughout my toolchain and still maintain the flexibility to store data where and how I want.  I’ve settled on Vault (but you don’t  have to!).

Vault

Vault is an open source encryption platform by Hashicorp, the makers of many great software platforms such as Vagrant and Terraform.  Vault is a highly feature rich system for handling all of your encryption and cryptography needs, most of the features of Vault I won’t even touch on in this post, since there are so many.  To take a quote directly from the website; [source]

Vault secures, stores, and tightly controls access to tokens, passwords, certificates, API keys, and other secrets in modern computing. Vault handles leasing, key revocation, key rolling, and auditing. Through a unified API, users can access an encrypted Key/Value store and network encryption-as-a-service, or generate AWS IAM/STS credentials, SQL/NoSQL databases, X.509 certificates, SSH credentials, and more.

Many people use Vault as a place to store their secret data, like an encrypted database, and can use either the command line or the HTTP API to authenticate and retrieve the encrypted values.  Jerakia strives not to be a database but to offer users flexibility in where and how they want to store their data but perform hierarchical lookups in a uniformed fashion regardless of the source of the data, so I was particularly interested in Vault when I read about the introduction of the Transit Backend that is now available.

In a nutshell, the transit backend turns vault from an encrypted database into “cryptography as a service”.  You create an encryption key that authenticated clients can use to encrypt and decrypt data on-the-fly using Vaults’ API but Vault itself never stores the encrypted or decrypted data, but as a dedicated encryption platform it offers a an excellent level of protection around authentication and key storage protection.

This immediately seemed like a great idea for providing the encryption functions to support sensitive data in Puppet and other tools and for a tool like Jerakia.  So the 2.0.0 release of Jerakia now has native support for Vault integration using the Transit secret backend.

Jerakia Encryption

Jerakia has always had the concept of output filters. These are pluggable, data source agnostic and give you the ability to pass the results from any data lookup from any source to a filter that can perform modifications to the results before it’s sent back to the requestor.  In Jerakia 1.x there was an output filter called encryption which was a filter that tried to pick out hiera-eyaml style encoded strings and decrypt them using the slightly hacky integration I touched on earlier.

In Jerakia 2.0 the concept of encryption has become a bit more of a first class citizen, and it’s also pluggable.  In Jerakia 2.0 you can enable encryption and specify a provider for the encryption mechanism you want to use – the shipped provider for encryption in 2.0 is vault, but the API allows you to extend Jerakia with your own providers if you wish, even hiera-eyaml.

The output filter encryption will now use whichever encryption provider has been configured to provide decryption of data regardless of the source, based on a signature (a regular expression) that is advertised by the provider. So if a returned value matches the regular expression advertised by the providers signature, the encryption output filter will flag that as an encrypted value and attempt to decrypt it, otherwise the value will be returned unaltered.  We’re also not tied to a particular data source, Jerakia will detect an decrypt the data no matter which data source is used for the lookup.

Furthermore, an encryption provider can advertise an encrypt capability which allows you to encrypt and decrypt values right on the command line using the Jerakia CLI.   You can use the Jerakia CLI to encrypt a secret string, copy that into your data, whether that be YAML files, or a database or some other data source, and that’s it.

Jerakia + Vault

So I’ve covered Vault and the encryption capabilities provided by Jerakia – now let’s look at how to tie the two together, and use Vaults’ transit backend as an encryption provider for Jerakia, and therefore handle secrets in Puppet / Hiera.

To start with, you’ll need to install and configure Vault and unseal it. Once unsealed, theres a few steps we need to cover in Vault before we integrate Jerakia.  The following steps assume you have an unsealed Vault installation that you can run root commands against.

Configuring Vault

The first thing we need to do is enable the transit backend in Vault, this can be achieved by mounting it with the following command

Once the backend is mounted, we need to create a key for Jerakia to use for encryption and decryption of values.  The name of the key is configurable, but by default Jerakia will try and use a key called jerakia.

Now we have a dedicated key that Jerakia will use for encrypting and decrypting data.  The second step is to create a policy to restrict activity to just the endpoints used for encrypting and decrypting, we’ll also call that policy jerakia. To create a policy, we’re going to create a new file called jerakia_policy.hcl and then import that policy into Vault.

The file should contain the following rules;

Once created and saved, we need to import the policy into vault.

We can do a quick test now to make sure everything is ok by trying to encrypt a value on the command line using the Jerakia transit key and the policy that we’ve just created

If you had a similar output, then we’re good to go! But before we can plug Jerakia into this mix we need to give Jerakia itself something to authenticate with against Vault.  We could use a simple Vault token, which is supported, but this raises issues of expiry and renewal which we probably don’t want to be dealing with every 30 days (or whatever you set your token TTL to).  The recommended way of authenticating to Vault is to use the AppRole authentication backend of Vault.  When using this method of authentication, we configure Jerakia with a role_id and a secret_id and Jerakia uses this to obtain a limited lifetime token from the Vault server to use for interacting with the transit backend API.  When that token expires, Jerakia will automatically request a new token using it’s role_id and secret_id to get a new token.

First we need to create a new AppRole for Jerakia, we’ll give it a token TTL of 10 minutes (optional) but it’s important that we tie this role to the access policy that we created earlier;

Now we can read the AppRole jerakia and determine the role_id

The role_id that we see here we are going to use later on.  But we’re not quite done yet, we need to create a secret_id along with our role_id and the combination of these two values will give Jerakia the authentication it needs to request tokens.  So let’s create that;

Now we have the two crucial pieces of information that we need to integrate Jerakia with Vault, the role_id and secret_id.  Now Vault is ready to be a cryptography provider to Jerakia, we just now need to add some simple configuration to Jerakia to glue all this together.

Configuring Jerakia

With an out-of-the-box installation of Jerakia, we don’t have encryption configured by default, it must be enabled.  If we look at the options on the command line for the sub-command secret we’ll see that there are no sub-commands available.

So the first thing we need to do is enable an encryption provider and give the configuration that it needs. We can do that in jerakia.yaml.  In the configuration file we configure the encryption option with a provider of vault and the specific configuration that our provider requires.  In this example I’m using a Vault instance that is over HTTP, not HTTPS so I need to set vault_use_ssl to false, see the documentation for options to enable SSL.  Because I’m not using SSL I need to set the vault_addr option as well as the secret_id and role_id.

Now I’ve configured an encryption provider in jerakia.yaml, that provider should advertise it’s capabilities to Jerakia and on the CLI I now see some new options available when running jerakia help secret….

Now we should be all set to test encrypting and decrypting data using the Vault encryption provider in Jerakia, we can use the CLI commands to encrypt and decrypt data… Let’s give that a spin;

Now Jerakia can use Vault as a provider of “cryptography as a service”, in a secure, authenticated way.  The only thing left now is to enable our Jerakia  data lookups to be encryption enabled, and we do that by calling the encryption output filter our lookup written in our policy.  We use the output_filter method to add a filter to our lookup, like this;

The inclusion of output_filter :encryption in this lookup tells Jerakia to pass all results to the encryption filter, which will match all returned data values against the signature provided by the encryption provider and if it matches, it will use the encryption provider to handle the decryption of the value before it it passed to the requestor.

Looking up secrets

So let’s add our encrypted value from earlier to this lookup…

This encrypted value can then be imported into any type of data source that you are using with Jerakia, here we’re using the default file data source so we’ll add it to test.yaml in our common path.

Because this string matches the regular expression provided by the vault encryption providers signature, and we’ve enabled the encryption output filter, if we try and look up the key password from the namespace test (test::password in Hiera speak), Jerakia automatically decrypts the data using the Vault transit backend.

Tying it up all into Puppet

Of course, this also means when Puppet / Hiera is integrated with Jerakia this now becomes transparent to Puppet and now we have our Puppet secrets, stored encrypted in any data source with decryption provided by Vault.

Summary

Theres a whole lot of really awesome functionality about Vault that I haven’t even touched on in this post, it’s extensive.  Having one tool for your cryptography needs across your infrastructure rather than a variety of smaller less dedicated tools doing their own thing simplifies things a lot.  If you don’t want to use Vault, the encryption feature of Jerakia is entirely pluggable and could be stripped out and replaced with whatever platform you want to use.

The subject of handling sensitive data in Puppet, and other tools, is an ongoing challenge, I’d certainly welcome any feedback over the approach used here.

 

 

Follow and share if you liked this

Solving real world problems with Jerakia

Background

I’ve always been a great admirer of Hiera, and I still remember the pain and turmoil of living in a pre-Hiera world trying to come up with insane code patterns within Puppet to try and organize my data in a sensible way. Hiera was, and still is, the answer to lots of problems.

For me however, when I moved beyond a small-scale, single-customer orientated Puppet implementation into larger, complex and diverse environments I started to find that I was spending a lot of time trying to figure out how to model things in Hiera to meet my use requirements. It’s a great tool but it has some limitations in the degree of flexibiity it offers around how to store and look up your data.

Some examples of problems I was trying to solve were; How can I…

  • use a different backend for one particular module?
  • give a team a separate hierarchy just for their app?
  • give access to a subset of data to a particular user or team?
  • enjoy the benefits of eyaml encryption without having to use YAML?
  • implemenet a dynamic hierarchy rather than hard coding it in config?
  • group together applciation specific data into separate YAML files?
  • There are many more examples, and after some time I began exploring some solutions. Initially I started playing around with the idea of a “smart backend” to hiera that could give me more flexibility in my implementation and that eventually grew into what is now Jerakia. In fact, you can still use Jerakia as a regular Hiera backend, or you can wire it directly into Puppet as a data binding terminus.

    Introducing Jerakia

    Jerakia is a lookup tool that has the concept of a policy, which contains a number of lookups to perform. Policies are written in Ruby DSL allowing the maximum flexibility to get around those pesky edge cases. In this post we will look at how to deploy a standard lookup policy and then enhance it to solve one of the use cases above.

    Define a policy

    After installing Jerakia the first setup is to create our default policy in /etc/jerakia/policy.d/default.rb

    Jerakia policies are containers for lookups. A policy can have any number of lookups and they are run in the order they are defined

    Writing our first lookup

    A lookup must contain, at the very least, a name and a datasource to use for the lookup. The current datasource that ships with Jerakia is the file datasource. This takes several options, including format and searchpath to define how lookups should be processed. Within the lookup we have access to scope[] which contains all the information we need to determine what data should be returned. In Puppetspeak, the scope contains all facts and top-level variables passed from the agent

    We now have a fairly standard lookup policy which should be fairly familar to Hiera users. A Jerakia lookup request contains two parts, a lookup key and a namespace. This allows us to group together lookup keys such as port, docroot and logroot into a namespace such as apache. When integrating from Hiera/Puppet, the module is used for the namespace, and the variable name for the key. In Puppet we declare;

    This will reach Jerakia as a lookup request with the key port in the namespace apache, and with our lookup policy above a typical request would look for the key “port” in the following files, in order

    This is slightly different behaviour than you would find in Puppet using Hiera, if you are using Jerakia against an existing Hiera filesystem layout which has namespace::key in path, rather than key in path/namespace.yaml then check out the hiera plugin which provides a lookup method called plugin.hiera.rewrite_lookup to mimic hiera behaviour. More on lookup plugins in the next post!

    Adding some complexity

    So far what we have done is not rocket science, and certainly nothing that can’t be easily achieved with Hiera. So let’s mix it up a bit by defining a use case that will change our requirements. This use case is based on a real world scenario.

    We have a team based in Ireland. Their servers are identified with the top level variable location. They need to be able to manage PHP and Apache using Puppet, but they need a data lookup hierarchy based on their project, which is something only they use. Furthermore, we wish to give them access to manage data specifically for the modules they are responsible for, without being able to read, override or edit data for other modules (eg: network, firewall, kernel).

    So, the requirements are to provide a different lookup hierarchy for servers that are in the location “ie”, but only when configuring the apache or php modules, and to source the data from a different location separate from our main data repo. With Jerakia this is easily solvable, lets first look at creating the lookup for the Ireland team…

    So now we have defined a separate lookup for our Ireland based friends. The problem here is that every request will first load the lookup ireland and then proceed down to the main lookup. This is no different than just adding new hierarchy entries in hiera, they are global. This means potentially bad data creeping in, if for example they accidentally override the firewall rules or network configuration.

    To get around this we can use the confine method in the lookup block to restrict this lookup to requests that have “location: ie” in the scope, and are requesting keys in the apache or php namespaces, meaning the requesting modules. If the confine criteria is not met then the lookup will be invalidated and skipped, and the next one used. Finally, we do not want to risk dirty configuration from default values that we have in our hierarchy for apache and php, so we need to tell Jerakia that if this lookup is considered valid (eg: it has met all the criteria of confine) then only use this lookup and don’t proceed down the chain of available lookups. To do this, we use the stop method.

    The confine takes two arguments, a value and a match. The match is a string that can contain regular expressions. The confine method supports either a single match, or an array of matches to compare. So in order to confine this lookup to the location ie we can confine it as follows

    By confining in this way we tell Jerakia to invalidate and skip this lookup unless location is “ie”. Similarly we can add another confine statement to ensure that only lookups for the apache and php namespaces are handled by this lookup. Our final policy would look like this:

    Conclusion

    This example demonstrates that using Jerakia lookup policies you can tailor your data lookups quite extensivly giving a high amount of flexibility. This is especially useful in larger organisations with many customers using one central Puppet infrastructure.

    This is just one example of using Jerakia to solve a use case, I hope to blog a small mini-series on other use cases and solutions, and welcome any suggestions that come from the real-world!

    Next up…

    Jerakia is still fairly experimental at the time of writing (0.1.6) and there is still a lot of room for improvement both in exposed functionality and in the underlying code. I’d like to see it mature and there are still plenty of features to add, and code to be tidied up. There is some excellent work being done in Puppet 4.0 with regards to internal handling of data lookups that I think would complement our aims very well (currently all work has been done against 3.x) and the next phase of major development will be exploring these options.

    Also, I talk about Puppet a lot because I am a Puppet user and the problems that I were trying to solve were Puppet/Hiera related, that doesn’t mean that Jerakia is exclusively a Puppet tool. The plan is to integrate it with other tools in the devops space, which given the policy driven model should be fairly straightforward.

    My next post will focus on extending Jerakia and will cover writing and using lookup plugins to enhance the power of lookups and output filters to provide features like eyaml style decryption of data regardless of the data source. I will also cover Jerakia’s pluggable datastore model that encourages community development.

    Follow and share if you liked this

    Puppet data from CouchDB using hiera-http

    Introducing hiera-http

    I started looking at various places people store data, and ways to fetch it and realized that a lot of data storage applications are RESTful, yet there doesn’t seem to be any support in Hiera to query these things, so I whipped up hiera-http, a Hiera back end to connect to any HTTP RESTful API and return data based on a lookup. It’s very new and support for other stuff like SSL and Auth is coming, but what it does support is a variety of handlers to parse the returned output of the HTTP doc, at the moment these are limited to YAML and JSON (or just ‘plain’ to simply return the whole response of the request). The following is a quick demonstration of how to plug CouchDB into Puppet using Hiera and the hiera-http backend.

    Hiera-http is available as a rubygem, or from GitHub: http://github.com/crayfishx/hiera-http

    CouchDB

    Apache CouchDB is a scalable database that uses no set schema and is ideal for storing configuration data as everything is stored and retrieved as JSON documents. For the purposes of this demo I’m just going to do a very simple database installation with three documents and a few configuration parameters to demonstrate how to integrate this in with Puppet.

    After installing couchdb and starting the service I’m able to access Futon, the web GUI front-end for my couchdb service – using this I create three documents, “dev”, “common” and “puppet.puppetlabs.lan”

    CouchDB documents
    CouchDB documents

    Next I populate my common and dev documents with some variables.

    Common document populated with data

    Now CouchDB is configured I should be able to query the data over HTTP

    Query with Hiera

    After installing hiera-http I can query this data directly from Hiera…

    First I need to configure Hiera with the HTTP back end. The search hierarchy is determined by the :paths: configuration parameter and since CouchDB returns JSON I set that as the output handler.

    I can now query this directly from Hiera on the command line

    And of course, that means that this data is now available from Puppet and if I add some overriding configuration variables to my dev document in CouchDB, my lookup will resolve based on my environment setting in Puppet

    Hiera-http is fully featured and supports all standard Hiera back end functions such as hiera_hash, hiera_array order overrides.

    Future stuff

    I’m going to carry on working on new features for hiera-http – including basic auth, HTTPS/SSL, proxys and a wider variety of output handlers – I would like for this back end to be flexible enough to allow users to configure Hiera with any network service that uses a RESTful API to perform data lookups. Keep watching.

    Follow and share if you liked this

    Introducing hiera-mysql MySQL Backend for Hiera

    Introduction

    Some time ago I started looking at Hiera, a configuration datastore with pluggable back ends that also plugs seamlessly into Puppet for managing variables. When I wrote hiera-gpg a few months ago I realised how easy extending Hiera was and the potential for really useful backends that can consolidate all your configuration options from a variety of systems and locations into one streamlined process that systems like Puppet and other tools can hook into. This, fuelled by a desire to learn more Ruby, lead to hiera-mysql, a MySQL Backend for Hiera.

    Installing

    hiera-mysql is available as a ruby gem and can be installed with:

    Note: this depends on the Ruby mysql gem, so you’ll need gcc, ruby-devel and mysql-devel packages installed. Alternativley the source can be Downloaded here

    MySQL database

    To demonstrate hiera-mysql, here’s a simple MySQL database some sample data;

    Configuring Hiera

    In this example we’re going to pass the variable “env” in the scope. hiera-mysql will interpret any scope variables defined in the query option, and also has a special case for %{key}. Example:

    Running Hiera

    With the above example, I want to find the value of the variable colour in the scope of live

    If I add more rows to the database that match the criteria, and use Hiera’s array search function by passing -a I can make Hiera return all the rows

    Hiera’s pluggable nature means that you can use this back end alongside other back ends such as YAML or JSON and configure your search order accordingly.

    Limitations

    Currently hiera-mysql will only return the first element of a row, or an array of first elements, so you can’t do things like SELECT foo,bar FROM table. I intend to introduce this feature by implementing Hiera’s hash search in a future release. Also, the module could do with slightly better exception handling around the mysql stuff. Please let me know if theres anything else that would improve it.

    Puppet

    And of course, because Hiera is completely transparent, accessing these variables from Puppet couldn’t be easier!

    References

  • Github homepage for hiera-mysql
  • Official Hiera Project Homepage
  • Hiera – A pluggable hierarchical data store
    Follow and share if you liked this