puppetverse, vagrant, jekyllfication

Hey, my colleague Max Martin at Puppet Labs had a piece about standing up a Linux dev environment in Puppet and Vagrant published at Linux.com today. It’s the first of a two-parter that provides a nice, practical Vagrant/Puppet walkthrough. I believe the plan for the next part is to include a github repo you can clone to get all the bits he covers in the tutorial. Once both parts are published, you won’t have any excuse not to give Vagrant a try. That’s another way in which 2013 is totally kicking 1985’s ass: No typing in those Compute! Gazette hex listings with the checksums at the end of each line.

Speaking of Vagrant, I cleaned up puppetverse a little today. There’s still some room for improvement, but my main goal was to get rid of the parts that were particular to my immediate needs (documenting hiera) and organize things a little better so that it’s clear which modules, manifests, and files are part of the Vagrant boot process, and which will be included by the puppet master once it’s up and running independent of Vagrant’s provisioning phase.


I’ve taken a stab at getting this blog moved over to Jekyll from WordPress three or four times in the past couple of years. Each time, something’s been not quite right.

Verdict: We live in a world of tradeoffs. I’ve tried three basic “export to Jekyll” approaches, and they’ve all had issues:

  1. The Jekyll + MySQL method pulls in the content pretty well, and posts keep their category. They lose their tags, though. I don’t think that’s necessarily a problem: My current template doesn’t even expose tags, and I can’t see much evidence Google has ever found many of the tag pages.

  2. The exitwp method ate some of my markup alive. Whole tables and spans vanished in just the first six or seven posts I reviewed after a migration. It’s a Python-based migration method, and I am not looking for a “get to know Python” project.

  3. The WordPress export to Jekyll plugin method never got past parse errors.

Do I blame myself for the problems I had with approaches 2 and 3? I might as well. I settled on Markdown some time after I started dot unplanned, and I didn’t always deal with Markdown’s limitations in a constructive fashion. Any parser written by a conscientious person unwilling to take responsibility for the universe of awful HTML people can come up with is within its rights to give up and sit down.

That said, all my recent posts look pretty close to correct when I use the Jekyll + MySQL migration method, so if I can bring myself to shed the tag metadata I wasn’t using anyhow …

… I’ll just need to decide if I really need to do this, or if it’s something I talked myself into wanting to do because it didn’t work very well the first time I tried it, triggering some weird nerd combat reflex that wouldn’t allow me to put the idea behind me until I knew I could make it work.

I mean … If you have a WordPress blog you know two things:

  1. It’s very popular, and so it is well supported for standard blogging purposes.

  2. It’s so popular that whatever comes along to replace it in the next several years, should something that great come along, will have to have very good WordPress migration support.

Jekyll, though? It’s critical to the docs toolchain I work with daily, but a lot of the personal use cases I’ve read about seem to come down to git maximalism, WordPress performance concerns that are completely addressable by well-established caching plugins, and security.

I totally hear the security concerns. I also hear an undercurrent of fascination with the deathless novelty of static site generators, and that’s fine, too. The question I’ve got to ask myself is what happens after I move to Jekyll. At that point, my content will live in a bunch of Markdown files. If I decide I’m tired of a git-based publishing workflow or whatever, it’ll probably be on me to write the scripts needed to get my stuff out. That doesn’t sound like a ton of fun now, and I wouldn’t trust the Me of Now to decide what’s going to be fun for the Me of Later Down the Road. After all, the Me of the Past played a shit-ton of World of Warcraft, and the Me of Now does not think that sounds interesting or fun at all.

Like Being Back

Hey, here’s something that made me pretty happy:

My name turned up as a contributor in some release notes. It stands to reason that it would, because I work there and I did some work that shows up in that release, but I think that’s a first for me at Puppet. The actual ticket I worked on was for the documentation on the newest version of the Puppet Ruby DSL. It was kind of a heavy thing to have a hand in, and I wasn’t sure I liked having to deal with it at the time, but there it is. If you’re the type to maybe feel a little stifled by the vanilla Puppet DSL, maybe you’ll enjoy giving the Puppet Ruby DSL a spin.

Something else I worked on related to the Puppet 3.1-rc1 release, but not as anything that was on a ticket anywhere, is a small enhancement to the documentation for installing Puppet on OS X. Installation itself is pretty simple: You just download a few DMGs and run the installers, but you’re not left with much to work with from there if you want to run a puppet master or background agent. So we rescued a few launchd plists from the aging wiki, spruced them up a little bit, and gave the information a home on the docs site, where it will seem less like some vaguely illicit thing you can try but aren’t encouraged to count on and more like a thing you ought to be doing. So here’s the basic installation guide for Puppet on OS X and here’s the bit about launchd plists for Puppet on OS X with links to a pair of plists that will get you going.

I used them to help finish the setup on a puppet master for a home Mac that isn’t seeing much use now that I’m not working at home as much. I’m going to use it to puppetize a few configurations across my several machines, and also to experiment with Kelsey Hightower’s Homebrew provider.

One more note on those: They’re very simple. If you’ve been curious about launchd and thought about replacing cron with it, those links above will give you a minimal working example. All they do is kick Puppet off at system launch, at which point Puppet’s own configuration handles how often a run happens. If you want to set up a launchd job that runs on a given interval, you have to add a StartInterval key. Here’s a reasonable, minimal guide to launchd that points to some tools that’ll keep you out of editing XML. Oh, and here’s a coherent case for launchd from back during Tiger’s launch. The word “barbaric” is used, so if you’re really sensitive about cron and Our Sacred Unix Heritage, you should maybe just let that link be. Alternately, use it as a test to see how you do when exposed to higher quality Apple advocacy.
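To make the StartInterval point concrete, here’s a hedged sketch of what such a job looks like. This is an illustrative example, not one of the Puppet plists linked above; the label and command are made up:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <!-- Label must be unique among loaded jobs -->
    <key>Label</key>
    <string>com.example.hello</string>
    <!-- The program and its arguments, one array element per word -->
    <key>ProgramArguments</key>
    <array>
        <string>/bin/echo</string>
        <string>hello from launchd</string>
    </array>
    <!-- Run every 300 seconds; omit this key to run only at load -->
    <key>StartInterval</key>
    <integer>300</integer>
</dict>
</plist>
```

Drop something like that in ~/Library/LaunchAgents, load it with launchctl, and launchd handles the scheduling from there.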

Oh. Why so happy? Because I remember back when I was kind of this paid Linux and open source guy, and I let some things that were not awesome about that get in the way of enjoying the parts that were awesome. I stopped being that guy and started doing other stuff. In the process of doing that other stuff, I felt pretty cut off from the open source world.

Moving to Puppet Labs, I feel reconnected with that world and understand the ethical language people are speaking around me. I’m really glad I get to make a living contributing to free software people love.


In the spirit of getting things out in front of people and maybe attracting useful feedback, it would please me very much to offer to you a little Puppet ecosystem I call puppetverse.

I’m putting it out there less to encourage you to use it, specifically, than to note the existence of this thing called Vagrant that I find very promising and am now fiddling with quite a bit. Before leaving the matter of puppetverse, I’ll just offer that if you choose to visit that link and follow the README, you’ll end up with a small virtual laboratory that will allow you to explore the open source release of Puppet with relative ease.

Some background:

Have you ever managed a website with a few outside web developers, a testing server, a staging server and a production server? How did you do it?

I had to for a while. Part of that time was spent dealing with a staging server and production server using different Linux distributions, different versions of PHP, and different versions of MySQL. Worse, I was locked out of the staging server (it belonged to and was managed by a small support shop) so there was no way to harmonize it with production: You just did an svn commit, waited 10 minutes for the staging server to run a cron job to svn up the codebase, and checked your work.

At the development work level, it was even worse. When I arrived, the official development setup involved a misconfigured plug-n-play desktop LAMP bundle that wasn’t even up to importing our application’s database reliably. Outside developers were invited to use that method or cobble together whatever they could on their own. Sometimes, things just didn’t work when delivered and introduced to our environment. Other times they worked in the staging environment but failed in production.

Eventually I set up a virtual environment for myself that I could keep in tight harmony with production systems. We moved from our little third-party support contractor to hosting with Acquia, which I would recommend to just about anybody running Drupal who’s feeling maybe in a little over her head. Acquia’s cloud hosting stuff is very slick. That handled our testing, staging, and production discontinuities overnight.

It still left us with how to help outside developers, and I never had a satisfactory answer beyond “here’s a cloud server we keep spun up that you can bang on, we’ll periodically update the codebase there so you can tell if your code will blow up on production, probably.”

So, Vagrant

Vagrant would have solved a lot of problems for me.

Vagrant provides a way to create a minimal virtual machine that makes just enough assumptions about its own configuration to be reusable across many contexts with very different requirements.

Prior to learning about Vagrant, I dealt with VMs this way:

I’d set up a minimal virtual machine and configure it to a certain point, then I’d save it somewhere. Each time I needed the base configuration, I’d copy the VM to a new file and start using it. So far, so good.

Then I’d screw something up. Maybe I’d mistakenly use my clean base VM, or I’d explore in a certain direction and discover it wasn’t a good direction, and I’d be caught with a VM that was pretty messed up and probably not worth the time it would take to get back to baseline.

Vagrant offers a way to create a base VM just once, then freeze it in that state, then reuse it over and over with custom configurations each time. If you take a VM based on one of those custom configurations in a direction you don’t like, that’s o.k.: You just tear it down and it goes back to baseline.
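That whole workflow boils down to a one-file configuration. A minimal sketch, where the box name is an assumption standing in for whatever base image you’ve added:

```ruby
# Vagrantfile -- minimal sketch. "precise64" is a stand-in for
# whatever base box you've already added with `vagrant box add`.
Vagrant.configure("2") do |config|
  config.vm.box = "precise64"
end
```

From there, vagrant up builds a fresh VM from the frozen base box and vagrant destroy throws it away; the base box itself is never touched.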

For a Puppet or Chef user, Vagrant is pretty nice because it uses either of those tools to provision freshly powered up VMs.

While we were preparing for the Puppet 3.1.0-rc1 release, I used puppetverse to help test the Ubuntu packages:

I’d edit the base Vagrantfile and point it to a different Ubuntu version. Then I’d have Vagrant power up the two VMs I’d configured (one puppet master, one puppet agent), watch them provision themselves with a simple Puppet manifest, then confirm that Puppet was working properly.

All I had to change for each Ubuntu release was the name of the base virtual machine Vagrant was to load. Once I’d done my testing, I powered down the virtual machines with Vagrant’s destroy command, which also deleted their files.

Meanwhile, the base VM images were in use on a few other projects: One for my documentation work on Hiera and one for a side project to help stand up a quick development environment. Each had its own configuration saved in a few text files, and as I stopped needing to work on each, I could discard them without worrying about saving my work.

Vagrant offers a few other nice features:

ssh into the VM “just works,” meaning there’s no cumbersome ssh key setup or remembering passwords. Just issue the command vagrant ssh and you’re logged into the VM.

Creating mount points in the guest operating system from the host operating system is pretty simple, and part of Vagrant’s big appeal to Web developers: You check out your website’s codebase into a directory on the host machine and mount it in the guest machine, so a web developer doesn’t need a Linux toolchain to work on code running on a Linux VM.

Port forwarding from guest to host is also simple to set up, so the web server running inside the guest VM is readily available to a browser running out in the host system.

All of that combines to make it incredibly simple to set up a non-admin developer with a virtual environment that closely maps to production conditions, and to allow admins to keep that environment up-to-date as conditions in production change. If the web developer can type git pull and vagrant provision on the command line, she can keep her testing environment up to date.
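In Vagrantfile terms, the synced folder and the forwarded port from the last two paragraphs are each a single line. A sketch, with hypothetical paths and port numbers:

```ruby
# Vagrantfile -- sketch of synced-folder and port-forwarding config.
Vagrant.configure("2") do |config|
  config.vm.box = "precise64"
  # Mount ./site from the host at /var/www/site in the guest, so the
  # host-side editor and the guest-side web server share one checkout.
  config.vm.synced_folder "site/", "/var/www/site"
  # Expose the guest's port 80 as localhost:8080 on the host.
  config.vm.network "forwarded_port", guest: 80, host: 8080
end
```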

So, puppetverse

Which brings me back around to puppetverse, which leverages Vagrant to help me with my technical writing at Puppet Labs.

I have to be able to do a few things as I work on Puppet documentation:

  1. Test and verify any assertions I make about how Puppet works.

  2. Provide working example code that I’ve tested and verified in a current Puppet environment.

Puppetverse allows me to bring up a puppet master and a pair of agents in a virtual environment so I can test what I’m saying in my documentation or write example configuration code in virtual machines running Ubuntu (or Debian). By configuring the basics for such an ecosystem in my Vagrantfile and Puppet manifests, I don’t have to step through the tedious part of getting Puppet up and running on three machines: I did it once and I can reuse it as many times as I need. If I mess up something inside one of my virtual machines, or need to know that all the systems are back to baseline, I can use the vagrant destroy command to wipe them all out and bring them back up to baseline: No need to manually uninstall packages, edit files or reconfigure the agents.

Thanks to the ability to easily mount directories in the host filesystem inside the Vagrant virtual machines, I can store my example Puppet configuration in my puppetverse repository. That allows me to test the same configuration across multiple versions of Puppet or different operating systems, depending on the combination of base box and provisioning puppetverse happens to be running. I’ve started storing different tasks in different branches to make puppetverse more reusable: Each branch is checked out into its own Vagrant directory with a different set of VMs running in it. Switching from work on Hiera to testing release packages is as simple as changing directories in the shell or opening a different directory as a BBEdit project.
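For a sense of what the multi-machine setup looks like, here’s a hedged sketch of a Vagrantfile defining a master and an agent with Puppet provisioning. The hostnames and manifest names are illustrative, not puppetverse’s actual layout:

```ruby
# Vagrantfile -- sketch of a two-machine master/agent lab.
Vagrant.configure("2") do |config|
  config.vm.box = "precise64"

  config.vm.define "master" do |master|
    master.vm.hostname = "puppet"
    # Provision with Puppet itself at first boot.
    master.vm.provision "puppet" do |puppet|
      puppet.manifests_path = "manifests"
      puppet.manifest_file  = "master.pp"
    end
  end

  config.vm.define "agent" do |agent|
    agent.vm.hostname = "agent1"
    agent.vm.provision "puppet" do |puppet|
      puppet.manifests_path = "manifests"
      puppet.manifest_file  = "agent.pp"
    end
  end
end
```

vagrant up brings up both machines in order; vagrant destroy returns the whole lab to baseline.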

Sources of Truth

I probably should have had the hell frustrated out of me by this, but I think I learned something on Wednesday. This post is going to be kind of a work post and kind of a process post. Things could get a little tangled up.

I’ve been working on the documentation for this thing at work called Hiera. What’s there right now is mostly not mine (I inherited the work from coworker Nick), but I did write the bits about using Hiera on the command line, and I wrote the JSON and YAML examples.

I’m not going to get too much into what Hiera is, except to say it has the potential to be a really awesome tool if used properly, and that it was a good pick to hand me for my first thing to work on because you can’t really get Hiera unless you get Puppet, and then — once you’re pretty sure you get them both — there’s room for even more clouds to part.

There are good ways to get Hiera and there are bad ways to get Hiera. I’m pretty sure, based on my own experience after thinking about Hiera for a number of weeks and coming at it from the perspective of someone who was learning all about Hiera at the same time he was learning all about Puppet, that newer Puppet users following a relatively normal path of learning about Puppet are at fairly high risk of getting Hiera one of the bad ways. I know I spent a number of weeks getting Hiera one of the bad ways. If not “bad,” anyhow, “skewed.”

An Aside on Hiera and Truth

If you’re actually here because you’re interested in Hiera and are suddenly wondering what’s “bad” and what’s “good” and who the hell am I to say, and are you Doing It Wrong, I’ll offer this:

There are a few kinds of truth in the world. Some truth is local (or organizational) and some truth is universal:

| Universal Truth | Organizational Truth |
| --- | --- |
| What Debian calls its MySQL packages | The password for your website’s database |
| What the name of the postfix service is on CentOS | The name of the host your Postfix service is running on |
| The default NTP servers for Ubuntu | The NTP servers your East Coast datacenter should be using |

Just, you know, go with this. For our purposes, your special in-house MySQL package doesn’t count and doesn’t really change my point:

Hiera is sometimes sold as a way to remove a lot of conditional logic from your Puppet code, and it’s true that’s a good use. If, however, you’re removing conditional code that describes universal truths from your modules and classes, then moving it over into Hiera, you’re creating a pretty bad situation from a management point of view (you now have logic living in two places: your Hiera data sources and your module code) and following a pretty bad pattern from a community point of view, because your modules will be difficult to use in places like the awesome Puppet Forge: A chunk of their logic will live outside the confines of a standard Puppet module, making them largely unusable unless consumers create or modify a Hiera hierarchy. That’s a drag.

Instead, you want to put organizational truth into Hiera and call it into your classes and modules via parameters. That means you can put that organizational truth in one place, potentially reuse it in a number of classes and modules, and you can share your modules without the pain of sanitizing them of organizational data every time you share them: That data is coming from Hiera data sources, which you aren’t sharing at all.
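A small sketch of that division of labor, with made-up names: the class keeps a universal default in code, and Hiera supplies the organizational value through the class parameter (under Puppet 3’s automatic parameter lookup, a value for ntp::servers in a Hiera data source overrides the in-code default):

```puppet
# In a Hiera data source such as common.yaml (organizational truth):
#   ---
#   ntp::servers:
#     - 'ntp1.example.com'
#     - 'ntp2.example.com'

# In the module, universal truth stays in code as the default:
class ntp (
  $servers = ['0.pool.ntp.org', '1.pool.ntp.org']
) {
  file { '/etc/ntp.conf':
    content => template('ntp/ntp.conf.erb'),  # template renders $servers
  }
}

# A consumer just declares the class; no organizational data in the module:
#   include ntp
```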

Now that I’m at the point in the process where I’m about ready to move on to writing a complete example, that’s what I’m thinking.

How We Got There

It took a while to get to where I could type that up. Time spent getting Puppet up and running in such a way that I could make sure I understood what constitutes normal Puppet behavior, time spent making my Puppet setup work reliably, time spent learning how to provide data to Hiera in YAML and JSON, time spent learning how the Hiera command line tool works to make sure I was testing my assumptions correctly, etc. etc. etc. Most recently, it took me on a detour to learn Vagrant so I could build an easily maintained and reproducible Puppet environment. It also took me time to learn about Puppet modules so I could write a few to work with Hiera.

After months, I was able to push up that page about how to use Hiera on the command line and drop a quick note to the Puppet Users group to let them know we’d made a little more progress. I’m pretty happy with that, because the command line tool is really useful for learning about Hiera even outside a complete Puppet environment, and there have been a number of requests for even a bare hint of how to get going with Hiera.

The act of pushing that one page and letting people know about it caused another coworker to write and ask if I needed more help with the documentation. I’d started the day I got that mail thinking I was going to be working on one thing, but his mail got me to thinking about how Hiera interacted with his particular concerns and how I’d had the beginnings of a discussion with another coworker about those issues, so I spent the next 2.5 hours trying to crystallize all my thinking and write up some notes not only on what I perceived the state of affairs to be, but how we might make it better.

After doing all that writing, which had started as a 1:1 response and ended up cc’ing somebody else, I paused for a moment to eat lunch, then opened my draft back up, re-read it, saw some things I felt less confident about, and did what I always do before sending a mail of more than a graf or two: I went back to the documentation and my notes and asked myself if that was really, really what I wanted to say. I discovered it really wasn’t, because I had things backwards and didn’t completely understand part of how Hiera behaves differently across a few different versions of Puppet.

So I deleted 90 percent of the email.

Time on a Discarded Mail Well Spent

We can’t always count on someone asking us a simple or friendly question at just the right time to trigger a response that ultimately helps us understand something we were kind of stuck or misguided about, right? So as I sat there staring at the email I’d just pared down to practically nothing, after spending hours thinking about it and staring at it and consulting manuals and looking through at least three git repos to write bits of it, I was tempted to think “why the fuck did I even write this fucking thing?” It felt like a waste of time, because I’d just spent a ton of time documenting questions and concerns that weren’t so much stupid as they were, perhaps, misguided.

Then I realized that in the process of figuring out how much I’d not gotten things right to start with, I’d learned a few new answers, understood a concept I had listed in my todos as a thing I needed to figure out, and I’d been spared writing an example that would have been teaching people A Bad Way to Do Things, provided it even got past the stage where it wasted someone else’s time reviewing it.

It didn’t take me long to think back to when I was blogging a lot more often about a lot of things, and how it would sometimes take me days from starting an entry to actually publishing it, and how sometimes entries were never, ever published in any form because a premise turned out to be too flawed for public consumption, or I decided I was just wrong.

I started out behaving very differently. Writing came naturally enough to me that I won praise and a few small awards without ever having to revise or reconsider what I was writing. I used to think that merely meant that the products I was best suited to create were written ones. That makes sense, even though I’ll confess I’ve never really thrown myself into the craft very deeply.

But I think it may also mean that the best thoughts I have are the ones produced because they were written down, thought through and reconsidered once or twice. I don’t trust many of my thoughts until I’ve taken the time to write them down and think about them. The thoughts I trust the very most are the ones I’ve prepared myself to stand by in public in written form.

So there’s the claim “I’m a writer” you can make from the standpoint of a vocation, profession or hobby; and there’s the claim “I’m a writer” you can make from the standpoint that writing is key to your good internal working order.

I can check both boxes at the moment, but the former can and has come and gone. The latter’s just a matter of truth. It took a few hours worth of email that never saw the light of day to remember it.

© Michael Hall, licensed under a Creative Commons Attribution-ShareAlike 3.0 United States license.