Jenkins Configuration as Code

A recent problem I had to solve was how to mirror a Jenkins instance with pretty restrictive permissions so that our team could duplicate jobs driven by Jenkins Job Builder. I’ve installed and configured Jenkins more times than I can count, manually and through configuration management. But for most software developers this task is more intimidating, and even more uninteresting, than I find it. I took this opportunity to learn about Jenkins Configuration as Code (JCasC) as a possible solution and am more than pleased with the results. With JCasC (and JJB or Pipelines), you can go from a fresh Jenkins install to running jobs without ever logging into the Jenkins UI.

The problem

In order to modify, improve, or add Jenkins jobs to our existing Jenkins Job Builder templates, we needed a way to mirror our production Jenkins instance locally, to ensure that what we thought we were doing was what was actually going to happen. I’m not going to go into details, but pre-baking VM images raised the problem of needing a repository to hold the images, and Docker images run into issues when using a VPN, along with just enough cross-platform difficulties.

How it works

Jenkins Configuration as Code is (as per their documentation) “an opinionated way to configure jenkins based on human-readable declarative files.” This provides a way to declare Jenkins configuration via a yaml file without ever clicking in the UI. JCasC uses plugins to configure Jenkins based on a yaml file pointed to by an environment variable.

Plugins

The plugins configuration-as-code and configuration-as-code-support are required to read and configure Jenkins based on a yaml file.

Yaml file

See the configuration-as-code plugin documentation for examples and more information on viewing the yaml schema.
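
To give a feel for the format, here is a minimal sketch of a JCasC yaml file. The keys follow the plugin’s demo configurations, but treat the exact values (the admin user, the ${ADMIN_PASSWORD} variable) as illustrative:

jenkins:
  systemMessage: "Configured by JCasC, no UI clicking required"
  numExecutors: 2
  securityRealm:
    local:
      allowsSignup: false
      users:
        - id: admin
          password: ${ADMIN_PASSWORD}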

Solution Overview

Anyone familiar with Jenkins can see the chicken-and-egg problem here. In order to install plugins, you first need to configure Jenkins. In order to configure Jenkins, you need to log in to the UI and install plugins. While you can do this and already see huge benefits with JCasC (install Jenkins, install the JCasC plugin, point it at a yaml file), there is an even easier way. Here’s how I got around this.

  1. Install Jenkins
  2. Move the JCasC yaml file to a place readable by the jenkins user
  3. Install generic config.xml in /var/lib/jenkins/config.xml
  4. Add CASC_JENKINS_CONFIG to /etc/sysconfig/jenkins to point to the yaml file
  5. Start Jenkins
  6. Using server configuration management (such as Ansible’s jenkins_plugin module), install the required plugins (see the sketch below this list)
  7. Restart Jenkins
  8. Profit
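
As a rough sketch of steps 4 and 6 in Ansible (the task names and the jenkins_admin_* variables are hypothetical, and a RHEL-style install is assumed):

- name: Point Jenkins at the JCasC yaml (step 4)
  lineinfile:
    path: /etc/sysconfig/jenkins
    line: 'CASC_JENKINS_CONFIG="/var/lib/jenkins/jenkins.yaml"'
  become: "yes"

- name: Install the plugins JCasC requires (step 6)
  jenkins_plugin:
    # jenkins_admin_user/jenkins_admin_password are illustrative variables
    name: "{{ item }}"
    url_username: "{{ jenkins_admin_user }}"
    url_password: "{{ jenkins_admin_password }}"
  loop:
    - configuration-as-code
    - configuration-as-code-support
  become: "yes"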

More details

JCasC can use Kubernetes secrets, Docker secrets, or Vault for secrets management. There’s a bit more configuration there, and I haven’t used it. But using templating and environment variables with Ansible means the configuration file containing secrets is only ever written inside the VM and on your local workstation (you are using an encrypted filesystem or ${HOME} directory, right?).
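
A sketch of that idea, with hypothetical file and variable names: a Jinja2 template renders the JCasC yaml, and the secret comes from an environment variable rather than living in the repository.

# jenkins.yaml.j2 and JENKINS_ADMIN_PASSWORD are illustrative names
- name: Write JCasC configuration with secrets
  template:
    src: jenkins.yaml.j2
    dest: /var/lib/jenkins/jenkins.yaml
    owner: jenkins
    mode: "0600"
  vars:
    jenkins_admin_password: "{{ lookup('env', 'JENKINS_ADMIN_PASSWORD') }}"
  become: "yes"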

Mocking Raspberry Pi hardware

I’ve got a ton of Raspberry Pi projects, all with some degree of completion (usually closer to proof of concept than complete). Raspberry Pis are great, but it can be a bit of a pain to test code for them when it relies on hardware and hardware libraries. Python has a great mock library that can be used to handle the hardware requirements, allowing tests to be written and run anywhere.

Mocking

Mocking is a way to fake an interaction that we want to make. This is very helpful when testing something that integrates with a third-party service, such as another API. By mocking the third-party service, we can validate that our code will operate as we expect without actually exercising that third party (or, better yet, being charged for using it in testing).
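
As a tiny sketch of the idea (the billing API here is made up), the fake object stands in for the third-party client, we control its response, and we can still verify how our code called it:

from unittest import mock


def charge_card(api_client, amount):
    # Code under test: calls out to a (hypothetical) third-party billing API
    return api_client.charge(amount)["status"]


fake_api = mock.Mock()
fake_api.charge.return_value = {"status": "ok"}  # control the fake's response

assert charge_card(fake_api, 100) == "ok"        # our logic behaves as expected
fake_api.charge.assert_called_once_with(100)     # and called the API correctly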

Example

For a simple example, we’ll use a class simply called MotorRunner. The MotorRunner class relies on the RPi.GPIO library to control the Pulse Width Modulation (PWM) of a motor from a GPIO pin on the Raspberry Pi. We could SSH in to the Raspberry Pi and write our code in vim/nano/emacs, but I really do prefer to use my already set up development environment. The problem is that the library will only import successfully on Raspberry Pi hardware.

$ python
Python 3.6.5 (default, Apr  4 2018, 15:09:05) 
[GCC 7.3.1 20180130 (Red Hat 7.3.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import RPi.GPIO as GPIO
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dan/Projects/mockhardware/.venv/lib64/python3.6/site-packages/RPi/GPIO/__init__.py", line 23, in <module>
    from RPi._GPIO import *
RuntimeError: This module can only be run on a Raspberry Pi!

Really, that is fine. We could code on a non-Raspberry Pi machine and transfer the files over via rsync, scp, thumbdrive, etc., or, even better, have unit tests handle the testing as we’re making changes.

The MotorRunner class

The MotorRunner class is pretty simple. At initialization we set some basic parameters. When we want to run the motor, we call the spin_motor method; if it fails, it writes to stderr. The parameters used can be reviewed in the API documentation of the RPi.GPIO library; in the interest of brevity I won’t be going over them here.

import sys
import time

import RPi.GPIO as GPIO

class MotorRunner:
    def __init__(self, spin_time=1.65, gpio_pin=18, frequency=50):
        self.spin_time = spin_time
        self.gpio_pin = gpio_pin
        self.freq = frequency
        self.p = None

    def _init_gpio(self):
        GPIO.setwarnings(False)
        GPIO.setmode(GPIO.BCM)
        GPIO.setup(self.gpio_pin, GPIO.OUT)
        self.p = GPIO.PWM(self.gpio_pin, self.freq)

    def spin_motor(self):
        try:
            self._init_gpio()
            self.p.start(11)
            time.sleep(self.spin_time)
            self.p.stop()
        except Exception as e:
            sys.stderr.write("Error in running\n {}".format(e))
        finally:
            GPIO.cleanup()

Testing the library

This library is sufficient for running a motor connected to a Raspberry Pi, but the point of this post is to figure out how to write and test it on a non-Raspberry Pi device. Python’s unittest to the rescue.

I’ve decided to do just a couple simple test cases to make sure this actually works. These are:

  • Ensure the class can be imported and created
  • Ensure that the motor PWM method is called when calling spin_motor
  • Ensure that non-default parameters are successfully handled

Mock patching

Mock has a great function to patch a library so that, rather than the library specified being used, it’s a Mock object instead. We can then control that Mock object to have specific return values, side effects, or just about any behavior we want, and we can inspect attributes such as whether (or how many times) that object was called, and with what parameters.
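
For instance, a minimal illustration patching a standard-library function (nothing Raspberry Pi specific here):

import time
from unittest import mock

with mock.patch("time.sleep") as fake_sleep:
    time.sleep(5)                          # returns instantly; the real sleep is replaced
    fake_sleep.assert_called_once_with(5)  # and we can inspect how it was called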

Changes required

The best way I found to actually mock the hardware requires a few changes in our code. This isn’t a bad thing, and it isn’t only related to testing: it also more gracefully handles errors if we run our class outside of unit tests.

First, we’re going to create a global variable to determine whether or not our system can run RPi.GPIO just based on the import.

GPIO_ENABLED = False

try:
    import RPi.GPIO as GPIO
    GPIO_ENABLED = True
except RuntimeError:
    # can only be run on RPi
    import RPi as GPIO

Next, in the library we’re going to set this variable as a class attribute, and only try to use the GPIO library if it is available:

...
class MotorRunner:
    def __init__(self, spin_time=1.65, gpio_pin=18, frequency=50):
        self._GPIO_ENABLED = GPIO_ENABLED
        self.spin_time = spin_time
...

Finally, we add a check for that variable when the call to spin the motor actually occurs.

...
        try:
            if self._GPIO_ENABLED:
                self._init_gpio()
...

That results in our class now looking like:

import sys
import time

GPIO_ENABLED = False

try:
    import RPi.GPIO as GPIO
    GPIO_ENABLED = True
except RuntimeError:
    # can only be run on RPi
    import RPi as GPIO


class MotorRunner:
    def __init__(self, spin_time=1.65, gpio_pin=18, frequency=50):
        self._GPIO_ENABLED = GPIO_ENABLED
        self.spin_time = spin_time
        self.gpio_pin = gpio_pin
        self.freq = frequency
        self.p = None

    def _init_gpio(self):
        GPIO.setwarnings(False)
        GPIO.setmode(GPIO.BCM)
        GPIO.setup(self.gpio_pin, GPIO.OUT)
        self.p = GPIO.PWM(self.gpio_pin, self.freq)

    def spin_motor(self):
        try:
            if self._GPIO_ENABLED:
                self._init_gpio()
                self.p.start(11)
                time.sleep(self.spin_time)
                self.p.stop()
        except Exception as e:
            sys.stderr.write("Error in running\n {}".format(e))
        finally:
            if self._GPIO_ENABLED:
                GPIO.cleanup()


def main():
    ex = MotorRunner()
    ex.spin_motor()


if __name__ == '__main__':
    main()

So that’s good. The changes aren’t terribly large, and they’re focused more on handling an import error gracefully than purely on enabling testing.

Test cases

Next we have our test cases. Unittest has setUp and tearDown methods that are called before and after each test method, respectively. This is where we’ll set up our Mock patching to override the GPIO and GPIO_ENABLED variables to fake a successful import and “call” the motor.

import time
import unittest
from unittest import mock, TestCase

from hardwarelib import MotorRunner


class TestExample(TestCase):
    def setUp(self):
        self.rpi_gpio_patcher = mock.patch('hardwarelib.GPIO')
        self.mocked_rpi = self.rpi_gpio_patcher.start()

        self.mocked_gpio_enabled_patcher = mock.patch('hardwarelib.GPIO_ENABLED', True)
        self.mocked_gpio_enabled = self.mocked_gpio_enabled_patcher.start()

    def tearDown(self):
        self.rpi_gpio_patcher.stop()
        self.mocked_gpio_enabled_patcher.stop()

As you can see here, the MotorRunner class is in the hardwarelib python file, so what we’re actually patching are the hardwarelib.GPIO and hardwarelib.GPIO_ENABLED attributes. We patch GPIO because its import is where we get our error on a non-Raspberry Pi system, and we patch GPIO_ENABLED so that the conditionals guarding our motor functions actually run them.

Once this is set, our first test method, making sure the class can be initialized, is pretty easy. We just test that an instance of the class can be created without error.

    def test_hardware_initialized(self):
        """
        Assert object created
        """
        test_example = MotorRunner()
        self.assertIsInstance(test_example, MotorRunner)

Next we use a feature of mock patching. We create an instance of the class, and call the spin_motor function. We can then make sure that the PWM method (of the RPi.GPIO that actually spins the motor) is called.

    def test_hardware_called(self):
        """
        Ensure PWM called
        """
        test_hardware = MotorRunner()
        test_hardware.spin_motor()
        self.assertTrue(self.mocked_rpi.PWM.called)

Assuming those succeed, we’re all good. But we might as well make sure that when we specify parameters, they actually are used as we expected. This gets into using mock patcher’s assert_called_with which verifies a method was called, and that specific parameters were used.

    def test_hardware_parameters_used(self):
        """
        Ensure PWM called with parameters
        """
        spin = 1
        freq = 25
        gpio_pin = 15
        test_hardware = MotorRunner(spin_time=spin, gpio_pin=gpio_pin, frequency=freq)
        pre_time = time.time()
        test_hardware.spin_motor()
        end_time = time.time()
        run_time = end_time - pre_time
        self.assertEqual("{:1.1f}".format(run_time), str(float(spin)))
        self.mocked_rpi.PWM.assert_called_with(gpio_pin, freq)

Because we’re using time to determine how long to run our motor, the statement self.assertEqual("{:1.1f}".format(run_time), str(float(spin))) checks how long the spin_motor call took to return. We format that duration as a float with one decimal place and compare it to how long we wanted the motor to spin. This is pretty simplistic and would fail without modification if we set spin to two decimal places, but this example is testing that our parameters are used successfully, not testing the parameters more deeply.
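
To make that comparison concrete with plausible numbers:

spin = 1
run_time = 1.002                    # a realistic measured duration
print("{:1.1f}".format(run_time))   # '1.0'
print(str(float(spin)))             # '1.0', so the assertion passes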

Running the tests

For small tests like this, I typically just invoke unittest directly rather than using a larger test runner. A larger test suite or library could very well incorporate flake8 for linting, tox for testing multiple python versions, and possibly a larger test runner such as nose. We can also call unittest.main() to handle this for us in our test file.

if __name__ == '__main__':
    unittest.main()
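
For reference, a minimal tox.ini along those lines might look like the sketch below (the environment names depend on which interpreters you actually target):

[tox]
envlist = py27, py36

[testenv]
deps = flake8
commands =
    flake8 .
    python -m unittest discover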

Altogether, our test file looks like:

import time
import unittest
from unittest import mock, TestCase

from hardwarelib import MotorRunner


class TestExample(TestCase):
    def setUp(self):
        self.rpi_gpio_patcher = mock.patch('hardwarelib.GPIO')
        self.mocked_rpi = self.rpi_gpio_patcher.start()

        self.mocked_gpio_enabled_patcher = mock.patch('hardwarelib.GPIO_ENABLED', True)
        self.mocked_gpio_enabled = self.mocked_gpio_enabled_patcher.start()

    def tearDown(self):
        self.rpi_gpio_patcher.stop()
        self.mocked_gpio_enabled_patcher.stop()

    def test_hardware_initialized(self):
        """
        Assert object created
        """
        test_example = MotorRunner()
        self.assertIsInstance(test_example, MotorRunner)

    def test_hardware_called(self):
        """
        Ensure PWM called
        """
        test_hardware = MotorRunner()
        test_hardware.spin_motor()
        self.assertTrue(self.mocked_rpi.PWM.called)

    def test_hardware_parameters_used(self):
        """
        Ensure PWM called with parameters
        """
        spin = 1
        freq = 25
        gpio_pin = 15
        test_hardware = MotorRunner(spin_time=spin, gpio_pin=gpio_pin, frequency=freq)
        pre_time = time.time()
        test_hardware.spin_motor()
        end_time = time.time()
        run_time = end_time - pre_time
        self.assertEqual("{:1.1f}".format(run_time), str(float(spin)))
        self.mocked_rpi.PWM.assert_called_with(gpio_pin, freq)


if __name__ == '__main__':
    unittest.main()

Requirements

A quick note: we just need a couple of requirements installed via pip to be able to run these tests (on Python 3.3+, mock also ships in the standard library as unittest.mock, which is what the imports above use):

rpi.GPIO
mock

Now we can run the tests through unittest, or by calling the file directly. The default output is a dot per successful test, and an F per failure. I’ve got plenty of screen real estate, so I almost always tack on some number of -v flags for verbosity.

Calling the unittest module:

$ python -m unittest -vv test_example.py 
test_hardware_called (test_example.TestExample) ... ok
test_hardware_initialized (test_example.TestExample) ... ok
test_hardware_parameters_used (test_example.TestExample) ... ok

----------------------------------------------------------------------
Ran 3 tests in 2.694s

OK

Calling the file directly.

$ python test_example.py  -vv
test_hardware_called (__main__.TestExample) ... ok
test_hardware_initialized (__main__.TestExample) ... ok
test_hardware_parameters_used (__main__.TestExample) ... ok

----------------------------------------------------------------------
Ran 3 tests in 2.690s

OK

Easy! Now we can continue building on our local environment with confidence that our hardware will do what we expect (assuming we wired it correctly)!

Molecule for existing Ansible roles

I previously walked through creating a new Ansible role with Molecule, but it’s just as easy to add Molecule to existing roles. Creating a Molecule scenario to test an existing role allows for easy testing and modification of that role with all the benefits that Molecule provides.

Existing role

For another easy example, we’ll just use a simple role that installs a webserver. To prevent a complete copy/paste, this time we will be using Apache rather than Nginx. To show the existing role in its current state:

~/Projects/example_playbooks/apache_install$ tree
.
└── tasks
    └── main.yml

1 directory, 1 file
~/Projects/example_playbooks/apache_install$ cat tasks/main.yml 
---
# install and start apache    
- name: install apache
  yum:
    name: httpd
    state: present
  become: "yes"

- name: ensure apache running and enabled
  systemd:
    name: httpd
    state: started
    enabled: "yes"
  become: "yes"

The Molecule and Ansible versions used for this example are:

~/Projects/example_playbooks/apache_install$ ansible --version && molecule --version
ansible 2.6.4
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/home/dan/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /home/dan/.local/lib/python2.7/site-packages/ansible
  executable location = /home/dan/.local/bin/ansible
  python version = 2.7.12 (default, Dec  4 2017, 14:50:18) [GCC 5.4.0 20160609]
molecule, version 2.17.0

Init scenario

Because the role already exists, we will only be creating a scenario, rather than a whole new role. The init scenario parameters are almost exactly the same as init role, and result in the same molecule directory structure as if we had created the role with Molecule.

Molecule init scenario usage information:

~/Projects/example_playbooks/apache_install$ molecule init scenario --help
Usage: molecule init scenario [OPTIONS]

  Initialize a new scenario for use with Molecule.

Options:
  --dependency-name [galaxy]      Name of dependency to initialize. (galaxy)
  -d, --driver-name [azure|delegated|docker|ec2|gce|lxc|lxd|openstack|vagrant]
                                  Name of driver to initialize. (docker)
  --lint-name [yamllint]          Name of lint to initialize. (ansible-lint)
  --provisioner-name [ansible]    Name of provisioner to initialize. (ansible)
  -r, --role-name TEXT            Name of the role to create.  [required]
  -s, --scenario-name TEXT        Name of the scenario to create. (default)
                                  [required]
  --verifier-name [goss|inspec|testinfra]
                                  Name of verifier to initialize. (testinfra)
  --help                          Show this message and exit.

We create the scenario using the existing role name and specifying vagrant as the driver. Once initialized, the molecule directory structure will be the same as if we had created the role with Molecule, but without any role directories being created (such as handlers, meta, etc).

~/Projects/example_playbooks/apache_install$ molecule init scenario --role-name apache_install --driver-name vagrant
--> Initializing new scenario default...
Initialized scenario in /home/dan/Projects/example_playbooks/apache_install/molecule/default successfully.
~/Projects/example_playbooks/apache_install$ tree
.
├── molecule
│   └── default
│       ├── INSTALL.rst
│       ├── molecule.yml
│       ├── playbook.yml
│       ├── prepare.yml
│       └── tests
│           └── test_default.py
└── tasks
    └── main.yml

4 directories, 6 files

Configuration

The Molecule configuration starts as the default provided by molecule init. As done previously, I edit it to use CentOS 7 rather than the default Ubuntu 16.04. Additionally, I update the name of the VM to something distinct so the instance is easy to identify if needed.

In this example our tests are very similar to my previous example. The primary (and possibly only) difference from the previous tests is that we’re testing for the httpd service rather than nginx.

~/Projects/example_playbooks/apache_install$ cat molecule/default/molecule.yml 
---
dependency:
  name: galaxy
driver:
  name: vagrant
  provider:
    name: virtualbox
lint:
  name: yamllint
platforms:
  - name: apache
    box: centos/7
provisioner:
  name: ansible
  lint:
    name: ansible-lint
scenario:
  name: default
verifier:
  name: testinfra
  lint:
    name: flake8
~/Projects/example_playbooks/apache_install$ cat molecule/default/tests/test_default.py
import os

import testinfra.utils.ansible_runner

testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
    os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')


def test_apache_installed(host):
    apache = host.package("httpd")
    assert apache.is_installed


def test_apache_config(host):
    apache = host.file('/etc/httpd/conf/httpd.conf')
    assert apache.exists


def test_apache_running_and_enabled(host):
    apache = host.service("httpd")
    assert apache.is_running
    assert apache.is_enabled

Molecule test

Since we’ve updated our Molecule configuration to use the Vagrant box we want, and updated our tests to ensure that our role is doing what we want, we can now run any of the Molecule commands (test, create, converge, etc) just as we would had we started the role with Molecule.

~/Projects/example_playbooks/apache_install$ molecule test
--> Validating schema /home/dan/Projects/example_playbooks/apache_install/molecule/default/molecule.yml.
Validation completed successfully.
--> Test matrix

└── default
    ├── lint
    ├── destroy
    ├── dependency
    ├── syntax
    ├── create
    ├── prepare
    ├── converge
    ├── idempotence
    ├── side_effect
    ├── verify
    └── destroy

--> Scenario: 'default'
--> Action: 'lint'
--> Executing Yamllint on files found in /home/dan/Projects/example_playbooks/apache_install/...
Lint completed successfully.
--> Executing Flake8 on files found in /home/dan/Projects/example_playbooks/apache_install/molecule/default/tests/...
Lint completed successfully.
--> Executing Ansible Lint on /home/dan/Projects/example_playbooks/apache_install/molecule/default/playbook.yml...
Lint completed successfully.
--> Scenario: 'default'
--> Action: 'destroy'

    PLAY [Destroy] *****************************************************************

    TASK [Destroy molecule instance(s)] ********************************************
    changed: [localhost] => (item=None)
    changed: [localhost]

    TASK [Populate instance config] ************************************************
    ok: [localhost]

    TASK [Dump instance config] ****************************************************
    changed: [localhost]

    PLAY RECAP *********************************************************************
    localhost                  : ok=3    changed=2    unreachable=0    failed=0


--> Scenario: 'default'
--> Action: 'dependency'
Skipping, missing the requirements file.
--> Scenario: 'default'
--> Action: 'syntax'

    playbook: /home/dan/Projects/example_playbooks/apache_install/molecule/default/playbook.yml

--> Scenario: 'default'
--> Action: 'create'

    PLAY [Create] ******************************************************************

    TASK [Create molecule instance(s)] *********************************************
    changed: [localhost] => (item=None)
    changed: [localhost]

    TASK [Populate instance config dict] *******************************************
    ok: [localhost] => (item=None)
    ok: [localhost]

    TASK [Convert instance config dict to a list] **********************************
    ok: [localhost]

    TASK [Dump instance config] ****************************************************
    changed: [localhost]

    PLAY RECAP *********************************************************************
    localhost                  : ok=4    changed=2    unreachable=0    failed=0


--> Scenario: 'default'
--> Action: 'prepare'

    PLAY [Prepare] *****************************************************************

    TASK [Install python for Ansible] **********************************************
    ok: [apache]

    PLAY RECAP *********************************************************************
    apache                     : ok=1    changed=0    unreachable=0    failed=0


--> Scenario: 'default'
--> Action: 'converge'

    PLAY [Converge] ****************************************************************

    TASK [Gathering Facts] *********************************************************
    ok: [apache]

    TASK [apache_install : install apache] *****************************************
    changed: [apache]

    TASK [apache_install : ensure apache running and enabled] **********************
    changed: [apache]

    PLAY RECAP *********************************************************************
    apache                     : ok=3    changed=2    unreachable=0    failed=0


--> Scenario: 'default'
--> Action: 'idempotence'
Idempotence completed successfully.
--> Scenario: 'default'
--> Action: 'side_effect'
Skipping, side effect playbook not configured.
--> Scenario: 'default'
--> Action: 'verify'
--> Executing Testinfra tests found in /home/dan/Projects/example_playbooks/apache_install/molecule/default/tests/...
    ============================= test session starts ==============================
    platform linux2 -- Python 2.7.12, pytest-3.3.1, py-1.5.2, pluggy-0.6.0
    rootdir: /home/dan/Projects/example_playbooks/apache_install/molecule/default, inifile:
    plugins: testinfra-1.14.1
collected 3 items                                                              

    tests/test_default.py ...                                                [100%]

    =========================== 3 passed in 5.62 seconds ===========================
Verifier completed successfully.
--> Scenario: 'default'
--> Action: 'destroy'

    PLAY [Destroy] *****************************************************************

    TASK [Destroy molecule instance(s)] ********************************************
    changed: [localhost] => (item=None)
    changed: [localhost]

    TASK [Populate instance config] ************************************************
    ok: [localhost]

    TASK [Dump instance config] ****************************************************
    changed: [localhost]

    PLAY RECAP *********************************************************************
    localhost                  : ok=3    changed=2    unreachable=0    failed=0

Conclusion

Molecule not only provides great defaults and a consistent directory structure when creating a new role, it also makes it easy to add a Molecule workflow to roles that already exist. Adding a scenario to an existing role is a simple, efficient way to test it across operating systems and Ansible versions and improve its reliability.

Ansible role creation with Molecule

Molecule is a way to quickly create and test Ansible roles. It acts as a wrapper around various platforms (GCE, VirtualBox, Docker, LXC, etc) and provides easy commands for linting, running, and testing roles. There’s a bit of a learning curve in figuring out what it’s doing and what it wants, but that time is well made up by the productivity increase of using it effectively.

Installation

Molecule and Ansible can be installed via pip. I typically run on a Fedora system, and have run into issues with libselinux when using a virtual environment. A quick search provides some workarounds, but really it’s easiest to just use the --user flag to install molecule with the user scheme.

pip install --upgrade --user ansible
pip install --upgrade --user molecule

If you don’t already have ansible/molecule installed, that’ll give you some significant output. Pip is good about drawing attention to errors (though the resolution may not always be terribly clear), and the last couple lines of output will list the libraries and versions that were installed.

Getting started

I have a ~/Projects directory that fills up with half-finished projects on my personal computer. Really, this just consolidates things rather than filling up ~/. Wherever you keep your projects, to get started just create a playbooks directory.

~/$ mkdir ~/Projects/example_playbooks
~/$ cd ~/Projects/example_playbooks

Since I’m not including the installation output, below are the software versions used for this example. Both Ansible and Molecule move quickly and do have some significant (but not necessarily breaking) changes between point releases, so these instructions might not work verbatim if your version numbers vary significantly.

~/Projects/example_playbooks$ ansible --version
ansible 2.6.4
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/home/dan/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /home/dan/.local/lib/python2.7/site-packages/ansible
  executable location = /home/dan/.local/bin/ansible
  python version = 2.7.12 (default, Dec  4 2017, 14:50:18) [GCC 5.4.0 20160609]
~/Projects/example_playbooks$ molecule --version
molecule, version 2.17.0
~/Projects/example_playbooks$ tree
.

0 directories, 0 files
~/Projects/example_playbooks$ 

Create a role

Molecule has pretty excellent help output with molecule --help. In this example, we’re going to create a role with molecule and use the vagrant driver. Molecule defaults to Docker for provisioning, but I prefer vagrant with VirtualBox (see the section below for why).

Creating a role, and specifying the name and driver will create a role directory structure.

~/Projects/example_playbooks$ molecule init role --role-name nginx_install --driver-name vagrant
--> Initializing new role nginx_install...
Initialized role in /home/dan/Projects/example_playbooks/nginx_install successfully.
~/Projects/example_playbooks$ tree
.
└── nginx_install
    ├── defaults
    │   └── main.yml
    ├── handlers
    │   └── main.yml
    ├── meta
    │   └── main.yml
    ├── molecule
    │   └── default
    │       ├── INSTALL.rst
    │       ├── molecule.yml
    │       ├── playbook.yml
    │       ├── prepare.yml
    │       └── tests
    │           ├── test_default.py
    │           └── test_default.pyc
    ├── README.md
    ├── tasks
    │   └── main.yml
    └── vars
        └── main.yml

9 directories, 12 files

As you can see, that command creates quite a few directories. Most of these are standard/best-practices for Ansible.

  • defaults – default values to variables for the role
  • handlers – specific handlers to notify based on actions in Ansible
  • meta – Ansible-Galaxy info for the role if you are uploading this to Ansible-Galaxy
  • molecule – molecule specific information (configuration, instance information, playbooks to run with molecule, etc)
  • README.md – Information about the role. Well documented, excellent feature (I’m a big fan of documentation, should be obvious if you’re reading this)
  • tasks – tasks for the role
  • vars – other variables for the role

Why I prefer Vagrant and VirtualBox over Docker

Docker is great, don’t get me wrong, and I’m a huge proponent of Linux. But Docker on Mac, Docker on Linux, and Docker on Windows are all different things; VirtualBox is far more cross-platform than Docker. Something that works on Fedora can behave far differently on the latest iteration of OSX, and differently again on Windows. Also, using a VPN client that dictates IP routes causes serious issues in networking between Docker containers. Finally, systemd with Docker requires specific images and root access to mount a cgroup volume. Since the majority of the work I do does not use Docker for orchestration and instead relies on services running under systemd, a VM is a better solution for my use case (and closer to “production”) than a container. Yes, a container is far lighter than a VM, but that’s not an issue with decently modern hardware (for my use case). Ultimately, while I might be able to make something work with Docker locally on my system, odds are it’s not going to work for anyone not running the same OS/distro.

Modifications from default Molecule

There are a few defaults I always change when using molecule, since it uses Cookiecutter to create a default configuration. First, molecule defaults to Ubuntu, but almost all of the systems I interact with are RHEL-based. Second, I prefer to specify the memory and CPUs rather than relying on the box defaults. Finally, since we’re using nginx in this example, we may as well set up port forwarding to hit the webserver locally.

These changes are made by modifying the molecule/default/molecule.yml file to look something like the below. The molecule.yml is the configuration used by molecule for instances, tests, provisioning, etc.

Heads up: a raw copy/pasta of the below will result in an error. Read on to see why.

~/Projects/example_playbooks/nginx_install$ cat molecule/default/molecule.yml 
---
dependency:
  name: galaxy
driver:
  name: vagrant
  provider:
    name: virtualbox
lint:
  name: yamllint
platforms:
  - name: nginx_install
    box: centos/7
    instance_raw_config_args:
      - "vm.network 'forwarded_port', guest: 80, host: 9000"
    memory: 512
    cpus: 1
provisioner:
  name: ansible
  lint:
    name: ansible-lint
scenario:
  name: default
verifier:
  name: testinfra
  lint:
    name: flake8

Once we’ve got the molecule configuration to our liking, it’s time to start working on the role itself. Ansible role tasks live in the role’s tasks/main.yml. This example is pretty simple: all we’re doing is installing a repository that provides nginx, installing nginx, and starting/enabling nginx. The only Ansible modules we need for this are yum for package installation and systemd to start and enable the service.

~/Projects/example_playbooks/nginx_install$ cat tasks/main.yml 
---
# tasks file for nginx_install
- name: Install epel-release for nginx
  yum:
    name: epel-release
    state: present
  become: "yes"

- name: install nginx
  yum:
    name: nginx
    state: present
  become: "yes"

- name: ensure nginx running and enabled
  systemd:
    name: nginx
    state: started
    enabled: "yes"
  become: "yes"

Molecule does some great things. It handles the orchestration of the virtual environment to test against, lints Ansible syntax, runs a test suite (and even lints that test suite), and destroys the orchestrated environment at the end.

Writing tests for the role

We could manually test the role with some SSHing and curl, but testinfra is included as the default verify step of molecule. Testinfra uses pytest and makes it easy to check the system after the role is run, to ensure the role had the results that we expected.

This role is pretty simple, so our tests are pretty simple. Since we’re just installing and starting nginx, there’s not a whole lot more we’re looking for in our test. Of course molecule provides a good default, and testinfra documentation even uses nginx in their quickstart.

Tests – quantity or quality?

The three tests below are all pretty simple. The overall count of tests really doesn’t matter: below we’ve got three, but we could easily have one, or five. This may vary based on the test developer, but I chose the three below because they follow a pretty logical order.

  1. Make sure nginx is installed
  2. Make sure nginx configuration was installed correctly
  3. Make sure nginx is running and enabled

This is easiest to see looking at it backwards. If we had a single test to check whether nginx is running and it fails, do we have any idea why? Was it installed? Was the configuration wrong? Was it not started? My approach is to first make sure it is installed; if not, the rest of our tests fail and we can see pretty easily why. Once we see it’s installed, we next check that the configuration exists (in a more elaborate example, we’d probably check for some expected text in the configuration file). Finally, we make sure nginx is running and enabled. The tests follow a logical flow of prerequisites to get to our ultimate state, and knock out some troubleshooting steps along the way.

~/Projects/example_playbooks/nginx_install$ cat molecule/default/tests/test_default.py
import os

import testinfra.utils.ansible_runner

testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
    os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')


def test_nginx_installed(host):
    nginx = host.package('nginx')
    assert nginx.is_installed

def test_nginx_config_exists(host):
    nginx_config = host.file('/etc/nginx/nginx.conf')
    assert nginx_config.exists

def test_nginx_running(host):
    nginx_service = host.service('nginx')
    assert nginx_service.is_running
    assert nginx_service.is_enabled

Running Molecule

We’ve got our role written, and our tests. We could just run molecule test and work through all the steps. But I prefer running create, converge, and test separately, and in that order. This separates the various stages and makes any failures easier to track down.

Molecule create

The first step of Molecule is the creation of the VirtualMachine. For the Docker and vagrant drivers, Molecule includes a default create playbook. Running molecule create will create the VirtualMachine for our role based on the molecule.yml configuration.

~/Projects/example_playbooks/nginx_install$ molecule create
--> Validating schema /home/dan/Projects/example_playbooks/nginx_install/molecule/default/molecule.yml.
Validation completed successfully.
--> Test matrix

└── default
    ├── create
    └── prepare

--> Scenario: 'default'
--> Action: 'create'

    PLAY [Create] ******************************************************************

    TASK [Create molecule instance(s)] *********************************************
    failed: [localhost] (item=None) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": false}
    fatal: [localhost]: FAILED! => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": false}

    PLAY RECAP *********************************************************************
    localhost                  : ok=0    changed=0    unreachable=0    failed=1   


ERROR: 

This was entirely unplanned, but as I was gathering output for this command I got an error. Ansible has a no_log property for tasks that is intended to prevent secrets from being output. Obviously, in the create step here we received an error with no usable output to determine its cause. We can set the MOLECULE_DEBUG environment variable to log errors, but the first thing I do (because it’s less typing) is add the --debug flag.

~/Projects/example_playbooks/nginx_install$ molecule --debug create
...
        }, 
        "item": {
            "box": "centos/7", 
            "cpus": 1, 
            "instance_raw_config_args": [
                "vm.network 'forwarded_port', guest: 80, host: 9000"
            ], 
            "memory": 512, 
            "name": "nginx_install"
        }, 
        "msg": "ERROR: See log file '/tmp/molecule/nginx_install/default/vagrant-nginx_install.err'"
    }

    PLAY RECAP *********************************************************************
    localhost                  : ok=0    changed=0    unreachable=0    failed=1   

Reading into the error tells us it was an “error” in Vagrant, and not necessarily one in molecule itself. We can look at the file named in the error output for more clues.

~/Projects/example_playbooks/nginx_install$ cat /tmp/molecule/nginx_install/default/vagrant-nginx_install.err
### 2018-09-07 17:32:59 ###
### 2018-09-07 17:32:59 ###
There are errors in the configuration of this machine. Please fix
the following errors and try again:

vm:
* The hostname set for the VM should only contain letters, numbers,
hyphens or dots. It cannot start with a hyphen or dot.

### 2018-09-07 17:33:20 ###
### 2018-09-07 17:33:20 ###
There are errors in the configuration of this machine. Please fix
the following errors and try again:

vm:
* The hostname set for the VM should only contain letters, numbers,
hyphens or dots. It cannot start with a hyphen or dot.

Well, that’s easy. Our hostname can’t contain _. A quick edit to the molecule.yml fixes this right up.

~/Projects/example_playbooks/nginx_install$ grep -A1 platform molecule/default/molecule.yml 
platforms:
  - name: nginx-install

Try again on the create:

~/Projects/example_playbooks/nginx_install$ molecule create
--> Validating schema /home/dan/Projects/example_playbooks/nginx_install/molecule/default/molecule.yml.
Validation completed successfully.
--> Test matrix

└── default
    ├── create
    └── prepare

--> Scenario: 'default'
--> Action: 'create'

    PLAY [Create] ******************************************************************

    TASK [Create molecule instance(s)] *********************************************
    changed: [localhost] => (item=None)
    changed: [localhost]

    TASK [Populate instance config dict] *******************************************
    ok: [localhost] => (item=None)
    ok: [localhost]

    TASK [Convert instance config dict to a list] **********************************
    ok: [localhost]

    TASK [Dump instance config] ****************************************************
    changed: [localhost]

    PLAY RECAP *********************************************************************
    localhost                  : ok=4    changed=2    unreachable=0    failed=0


--> Scenario: 'default'
--> Action: 'prepare'

    PLAY [Prepare] *****************************************************************

    TASK [Install python for Ansible] **********************************************
    ok: [nginx-install]

    PLAY RECAP *********************************************************************
    nginx-install              : ok=1    changed=0    unreachable=0    failed=0

Molecule converge

Molecule create only acts as orchestration. The converge step is what runs the playbook that calls our role. There’s good reason to do these steps separately: the create step ensures our VirtualMachine is provisioned and started correctly, so once it’s up, we’ve got less troubleshooting to do when actually running the playbook.

When first learning Ansible, or when working on a more complicated role, we could run converge after every task we add to the role (or every few, depending on our confidence) to make sure it does what we intend. Because we only have three simple tasks, we can run converge once to test all tasks at the same time.

~/Projects/example_playbooks/nginx_install$ molecule converge
--> Validating schema /home/dan/Projects/example_playbooks/nginx_install/molecule/default/molecule.yml.
Validation completed successfully.
--> Test matrix

└── default
    ├── dependency
    ├── create
    ├── prepare
    └── converge

--> Scenario: 'default'
--> Action: 'dependency'
Skipping, missing the requirements file.
--> Scenario: 'default'
--> Action: 'create'
Skipping, instances already created.
--> Scenario: 'default'
--> Action: 'prepare'
Skipping, instances already prepared.
--> Scenario: 'default'
--> Action: 'converge'

    PLAY [Converge] ****************************************************************

    TASK [Gathering Facts] *********************************************************
    ok: [nginx-install]

    TASK [nginx_install : Install epel-release for nginx] **************************
    changed: [nginx-install]

    TASK [nginx_install : install nginx] *******************************************
    changed: [nginx-install]

    TASK [nginx_install : ensure nginx running and enabled] ************************
    changed: [nginx-install]

    PLAY RECAP *********************************************************************
    nginx-install              : ok=4    changed=3    unreachable=0    failed=0

Cool. It worked. We think, anyway. Our playbook ran, but running our tests will really make sure that it all worked.

Molecule test

Next we run test. This goes through all the steps and will tell us whether what we think we’re doing is actually working, based on our testinfra tests: destroying any existing VirtualMachine, checking syntax on the role, creating the VirtualMachine, linting and running our tests, etc. If there are any issues, this should let us know.

~/Projects/example_playbooks/nginx_install$ molecule test
--> Validating schema /home/dan/Projects/example_playbooks/nginx_install/molecule/default/molecule.yml.
Validation completed successfully.
--> Test matrix

└── default
    ├── lint
    ├── destroy
    ├── dependency
    ├── syntax
    ├── create
    ├── prepare
    ├── converge
    ├── idempotence
    ├── side_effect
    ├── verify
    └── destroy

--> Scenario: 'default'
--> Action: 'lint'
--> Executing Yamllint on files found in /home/dan/Projects/example_playbooks/nginx_install/...
Lint completed successfully.
--> Executing Flake8 on files found in /home/dan/Projects/example_playbooks/nginx_install/molecule/default/tests/...
    /home/dan/Projects/example_playbooks/nginx_install/molecule/default/tests/test_default.py:13:1: E302 expected 2 blank lines, found 1
    /home/dan/Projects/example_playbooks/nginx_install/molecule/default/tests/test_default.py:17:1: E302 expected 2 blank lines, found 1
    /home/dan/Projects/example_playbooks/nginx_install/molecule/default/tests/test_default.py:21:1: W391 blank line at end of file
An error occurred during the test sequence action: 'lint'. Cleaning up.
--> Scenario: 'default'
--> Action: 'destroy'

    PLAY [Destroy] *****************************************************************

    TASK [Destroy molecule instance(s)] ********************************************
    changed: [localhost] => (item=None)
    changed: [localhost]

    TASK [Populate instance config] ************************************************
    ok: [localhost]

    TASK [Dump instance config] ****************************************************
    changed: [localhost]

    PLAY RECAP *********************************************************************
    localhost                  : ok=3    changed=2    unreachable=0    failed=0

Another unintended failure: lint issues in the python tests. Flake8 provides excellent output for PEP 8 errors, so we know exactly what to fix based on the output.

Addressing those issues and rerunning results in the following:

~/Projects/example_playbooks/nginx_install$ molecule test
--> Validating schema /home/dan/Projects/example_playbooks/nginx_install/molecule/default/molecule.yml.
Validation completed successfully.
--> Test matrix

└── default
    ├── lint
    ├── destroy
    ├── dependency
    ├── syntax
    ├── create
    ├── prepare
    ├── converge
    ├── idempotence
    ├── side_effect
    ├── verify
    └── destroy

--> Scenario: 'default'
--> Action: 'lint'
--> Executing Yamllint on files found in /home/dan/Projects/example_playbooks/nginx_install/...
Lint completed successfully.
--> Executing Flake8 on files found in /home/dan/Projects/example_playbooks/nginx_install/molecule/default/tests/...
Lint completed successfully.
--> Executing Ansible Lint on /home/dan/Projects/example_playbooks/nginx_install/molecule/default/playbook.yml...
Lint completed successfully.
--> Scenario: 'default'
--> Action: 'destroy'

    PLAY [Destroy] *****************************************************************

    TASK [Destroy molecule instance(s)] ********************************************
    ok: [localhost] => (item=None)
    ok: [localhost]

    TASK [Populate instance config] ************************************************
    ok: [localhost]

    TASK [Dump instance config] ****************************************************
    skipping: [localhost]

    PLAY RECAP *********************************************************************
    localhost                  : ok=2    changed=0    unreachable=0    failed=0


--> Scenario: 'default'
--> Action: 'dependency'
Skipping, missing the requirements file.
--> Scenario: 'default'
--> Action: 'syntax'

    playbook: /home/dan/Projects/example_playbooks/nginx_install/molecule/default/playbook.yml

--> Scenario: 'default'
--> Action: 'create'

    PLAY [Create] ******************************************************************

    TASK [Create molecule instance(s)] *********************************************
    changed: [localhost] => (item=None)
    changed: [localhost]

    TASK [Populate instance config dict] *******************************************
    ok: [localhost] => (item=None)
    ok: [localhost]

    TASK [Convert instance config dict to a list] **********************************
    ok: [localhost]

    TASK [Dump instance config] ****************************************************
    changed: [localhost]

    PLAY RECAP *********************************************************************
    localhost                  : ok=4    changed=2    unreachable=0    failed=0


--> Scenario: 'default'
--> Action: 'prepare'

    PLAY [Prepare] *****************************************************************

    TASK [Install python for Ansible] **********************************************
    ok: [nginx-install]

    PLAY RECAP *********************************************************************
    nginx-install              : ok=1    changed=0    unreachable=0    failed=0


--> Scenario: 'default'
--> Action: 'converge'

    PLAY [Converge] ****************************************************************

    TASK [Gathering Facts] *********************************************************
    ok: [nginx-install]

    TASK [nginx_install : Install epel-release for nginx] **************************
    changed: [nginx-install]

    TASK [nginx_install : install nginx] *******************************************
    changed: [nginx-install]

    TASK [nginx_install : ensure nginx running and enabled] ************************
    changed: [nginx-install]

    PLAY RECAP *********************************************************************
    nginx-install              : ok=4    changed=3    unreachable=0    failed=0


--> Scenario: 'default'
--> Action: 'idempotence'
Idempotence completed successfully.
--> Scenario: 'default'
--> Action: 'side_effect'
Skipping, side effect playbook not configured.
--> Scenario: 'default'
--> Action: 'verify'
--> Executing Testinfra tests found in /home/dan/Projects/example_playbooks/nginx_install/molecule/default/tests/...
    ============================= test session starts ==============================
    platform linux2 -- Python 2.7.12, pytest-3.3.1, py-1.5.2, pluggy-0.6.0
    rootdir: /home/dan/Projects/example_playbooks/nginx_install/molecule/default, inifile:
    plugins: testinfra-1.14.1
collected 3 items                                                              

    tests/test_default.py ...                                                [100%]

    =========================== 3 passed in 5.33 seconds ===========================
Verifier completed successfully.
--> Scenario: 'default'
--> Action: 'destroy'

    PLAY [Destroy] *****************************************************************

    TASK [Destroy molecule instance(s)] ********************************************
    changed: [localhost] => (item=None)
    changed: [localhost]

    TASK [Populate instance config] ************************************************
    ok: [localhost]

    TASK [Dump instance config] ****************************************************
    changed: [localhost]

    PLAY RECAP *********************************************************************
    localhost                  : ok=3    changed=2    unreachable=0    failed=0

Great! With all that, we now know that our Ansible and python tests are linted, and our tests pass, meaning our role does what we intend for it to do.

Molecule verify

I kind of skipped a step here. I’ve described the steps of:

  • molecule create – create the VirtualMachine to make sure molecule is configured correctly.
  • molecule converge – run multiple times as we add tasks to our role.
  • molecule test – once we’re happy, run all the steps of Molecule.

Really though, since molecule test runs through all the steps (creation, linting, testing, deletion…), and earlier I laid out a workflow of running converge to manually test each change, test doesn’t quite fit the workflow I mentioned. We can separate out the molecule steps a little further.

Rather than running molecule test, we can instead run molecule verify and molecule lint separately:

~/Projects/example_playbooks/nginx_install$ molecule verify
--> Validating schema /home/dan/Projects/example_playbooks/nginx_install/molecule/default/molecule.yml.
Validation completed successfully.
--> Test matrix

└── default
    └── verify

--> Scenario: 'default'
--> Action: 'verify'
--> Executing Testinfra tests found in /home/dan/Projects/example_playbooks/nginx_install/molecule/default/tests/...
    ============================= test session starts ==============================
    platform linux2 -- Python 2.7.12, pytest-3.3.1, py-1.5.2, pluggy-0.6.0
    rootdir: /home/dan/Projects/example_playbooks/nginx_install/molecule/default, inifile:
    plugins: testinfra-1.14.1
collected 3 items                                                              

    tests/test_default.py ...                                                [100%]

    =========================== 3 passed in 5.25 seconds ===========================
Verifier completed successfully.
~/Projects/example_playbooks/nginx_install$ molecule lint
--> Validating schema /home/dan/Projects/example_playbooks/nginx_install/molecule/default/molecule.yml.
Validation completed successfully.
--> Test matrix

└── default
    └── lint

--> Scenario: 'default'
--> Action: 'lint'
--> Executing Yamllint on files found in /home/dan/Projects/example_playbooks/nginx_install/...
Lint completed successfully.
--> Executing Flake8 on files found in /home/dan/Projects/example_playbooks/nginx_install/molecule/default/tests/...
Lint completed successfully.
--> Executing Ansible Lint on /home/dan/Projects/example_playbooks/nginx_install/molecule/default/playbook.yml...
Lint completed successfully.

Conclusion

Molecule is a great abstraction over the create/test/clean-up steps of developing an Ansible role. Not only does it create and provide sane defaults for the directory structure of a new role, it makes it easy to create and test a role during development. While there may be a bit of a learning curve, the increased productivity of testing during development makes it an absolutely worthwhile investment.

Mailx for cron email

Cron defaults to sending job output to the owner’s mail, or to the address set in a MAILTO variable, or directly to syslog when sendmail is not installed. This can cause a problem if the server does not have a mail server running, is on a network (or configured) specifically to not send email, or is unable to send email to a particular server or service. To get around the issue of mail not being accepted by some third parties, emails sent by cron can instead use a Simple Mail Transfer Protocol (SMTP) client to send through an external Mail Transfer Agent (MTA).

mailx

After a bit of searching, I found that mailx provides a method for connecting to an external SMTP server with simple configuration. According to the man page, mailx is an intelligent mail processing system [...] intended to provide the functionality of the POSIX mailx command, and offers extensions for MIME, IMAP, POP3, SMTP and S/MIME.

Installation

Installation was completed on a CentOS 7 VPS instance. mailx is available in the base repository and can be installed with a simple yum command:

# yum install mailx

mailx configuration

Installation creates a default /etc/mail.rc file. You can then check the man page via man mailx for further configuration options. Since the plan is to use it for SMTP, searching for smtp turns up the relevant options.

I’m using Gmail, and the documentation from Google for email client configuration provided the required SMTP host:TLS-Port combination of smtp.gmail.com:587.

For the smtp-auth-password, I can’t use my own password since I’ve got 2-Step Verification enabled on my account; the server simply wouldn’t be able to send email if I had to verify it and provide a code each time. Gmail offers a way around this: App Passwords, for email clients that cannot use two-factor authentication. Creating an app password takes just a couple of steps. Each server or client using an App Password should have its own unique password, which gives you logs of its use and makes it easy to revoke if needed.

We can test our configuration as we go along with the following command:

# echo "mailx test" | mailx -s "Test Email" <EMAIL_ADDRESS>

The first round of doing that gave an error:

# smtp-server: 530 5.7.0 Must issue a STARTTLS command first. 207-v6sm21173418oie.14 - gsmtp
"/root/dead.letter" 11/308
. . . message not sent.

Easy enough to resolve: another look at the man page (or a quick grep) shows that we need to set smtp-use-starttls.

# man mailx | grep -ie starttls | grep -i smtp
       smtp-use-starttls
              Causes mailx to issue a STARTTLS command to make an SMTP session SSL/TLS encrypted.  Not all servers support this command; because of common implementation defects, it cannot be automatically
              There  are  two  possible methods to get SSL/TLS encrypted SMTP sessions: First, the STARTTLS command can be used to encrypt a session after it has been initiated, but before any user-
              related data has been sent; see smtp-use-starttls above.  Second, some servers accept sessions that are encrypted from  their  beginning  on.  This  mode  is  configured  by  assigning
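
Adding the option is a one-line change; appending it from the shell works (editing /etc/mail.rc directly is equivalent):

# echo "set smtp-use-starttls" >> /etc/mail.rc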

After updating the configuration, I found another error.

# Missing "nss-config-dir" variable.
"/root/dead.letter" 11/308
. . . message not sent.

To resolve that, I just looked for nss* directories in /etc (knowing that SSL information and certs live there) and added the result to the configuration.

# find /etc -type d -name "nss*"
/etc/pki/nssdb
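
With the directory located, that’s another one-line addition to the configuration:

# echo "set nss-config-dir=/etc/pki/nssdb/" >> /etc/mail.rc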

Then I got yet another error:

# Error in certificate: Peer's certificate issuer is not recognized.
Continue (y/n)? SSL/TLS handshake failed: Peer's certificate issuer is not recognized.
"/root/dead.letter" 11/308
. . . message not sent.

Time for a bit more sleuthing. For whatever reason, the certificate issuer was not recognized, prompting manual intervention. After some searching around I figured it might be due to Google’s new(ish) CA, but adding it to the PKI trusted CAs directly didn’t help. Eventually I found a page describing how to add these certs directly, but to just get the configuration running I opted for the lazy route and set ssl-verify to ignore, with the intention of revisiting this as an Ansible role at a later point.
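
For completeness, the non-lazy route would be importing the issuing CA into the nss database using certutil from the nss-tools package. A sketch, assuming you’ve saved the CA certificate to /tmp/google-ca.pem (the path and nickname here are illustrative):

# yum install -y nss-tools
# certutil -A -d /etc/pki/nssdb -n "google-ca" -t "C,," -i /tmp/google-ca.pem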

Finally, we have the configuration below.

# cat /etc/mail.rc
set from=<YOUR_EMAIL_ADDRESS>
set smtp-use-starttls
set nss-config-dir=/etc/pki/nssdb/
set ssl-verify=ignore
set smtp-auth=login
set smtp=smtp://smtp.gmail.com:587
set smtp-auth-user=<YOUR_GMAIL_USER>
set smtp-auth-password=<YOUR_APP_PASSWORD>

Running the testing command with these configuration settings results in a new email showing up in our inbox.

cron configuration

In order for cron to use mailx, we need to do two things. First, cron will only send mail if the MAILTO variable is set. We can add that directly to the crontab with crontab -e; afterwards, we’ll see it included in crontab -l.

And to test this, we should set up a cron job that produces output (also via crontab -e):

# crontab -l
MAILTO="<YOUR_EMAIL>"
* * * * * /usr/sbin/ip a

We also need to set crond to use mailx, by editing the crond configuration to specify /usr/bin/mailx for sending mail and passing the -t flag so mailx uses the To: header to address the email. After editing /etc/sysconfig/crond, restart crond.

# cat /etc/sysconfig/crond 
# Settings for the CRON daemon.
# CRONDARGS= :  any extra command-line startup arguments for crond
CRONDARGS=-m "/usr/bin/mailx -t"
# systemctl restart crond

Testing configuration

The crontab should now send the output of ip a to <YOUR_EMAIL> every minute. Once you’ve verified it works, be sure to remove that job to prevent flooding your inbox.

If you don’t see a new email, take a look at the system logs to see entries from the crond service in reversed order (newest entries first).

# journalctl -r --unit crond

Because of the certificate issue noted above, and because mailx strips the cron-added headers before sending, the following output may be included in the journald logs even for successfully sent mail.

Sep 03 19:35:01 <YOUR_HOST> crond[12378]: Error in certificate: Peer's certificate issuer is not recognized.
Sep 03 19:35:01 <YOUR_HOST> crond[12378]: Ignoring header field "X-Cron-Env: <USER=root>"
Sep 03 19:35:01 <YOUR_HOST> crond[12378]: Ignoring header field "X-Cron-Env: <LOGNAME=root>"
Sep 03 19:35:01 <YOUR_HOST> crond[12378]: Ignoring header field "X-Cron-Env: <PATH=/usr/bin:/bin>"
Sep 03 19:35:01 <YOUR_HOST> crond[12378]: Ignoring header field "X-Cron-Env: <HOME=/root>"
Sep 03 19:35:01 <YOUR_HOST> crond[12378]: Ignoring header field "X-Cron-Env: <SHELL=/bin/sh>"
Sep 03 19:35:01 <YOUR_HOST> crond[12378]: Ignoring header field "X-Cron-Env: <MAILTO=<YOUR_EMAIL>>"
Sep 03 19:35:01 <YOUR_HOST> crond[12378]: Ignoring header field "X-Cron-Env: <LANG=en_US.UTF-8>"
Sep 03 19:35:01 <YOUR_HOST> crond[12378]: Ignoring header field "X-Cron-Env: <XDG_RUNTIME_DIR=/run/user/0>"
Sep 03 19:35:01 <YOUR_HOST> crond[12378]: Ignoring header field "X-Cron-Env: <XDG_SESSION_ID=71363>"
Sep 03 19:35:01 <YOUR_HOST> crond[12378]: Ignoring header field "Precedence: bulk"
Sep 03 19:35:01 <YOUR_HOST> crond[12378]: Ignoring header field "Auto-Submitted: auto-generated"
Sep 03 19:35:01 <YOUR_HOST> crond[12378]: Ignoring header field "Content-Type: text/plain; charset=UTF-8"

And looking at the email source we should see something like the following (note, I did not include all output in the example below):

Return-Path: <YOUR_EMAIL>
Received: from <YOUR_HOST> ([<YOUR_IP_ADDRESS>])
        by smtp.gmail.com with ESMTPSA id <REDACTED>
        for <YOUR_EMAIL>
        (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
        Mon, 03 Sep 2018 12:35:01 -0700 (PDT)
Message-ID: <MESSAGE_ID>
From: "(Cron Daemon)" <YOUR_EMAIL>
X-Google-Original-From: "(Cron Daemon)" <root>
Date: Mon, 03 Sep 2018 19:35:01 +0000
To: <YOUR_EMAIL>
Subject: Cron <root@YOUR_HOST> /usr/sbin/ip a
User-Agent: Heirloom mailx 12.5 7/5/10
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
...

Now we can be sure our cron jobs mail us through an external SMTP server that will successfully deliver to our third party service. And with mailx configured, we can easily add an email component to any scripts we might want to run.
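
For example, piping a script’s output straight into mailx is all it takes (the script path here is a placeholder):

# /usr/local/bin/backup.sh 2>&1 | mailx -s "backup.sh output" <YOUR_EMAIL>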

Of note, Google’s app suite does provide access to an SMTP relay service that specifically allows sending email to addresses either inside or outside of your domain. There are some limits on the number of emails that can be sent based on the number of licenses on your account, but for my purposes and the imposed limits, configuring mailx was a suitable solution.
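
If you do go the relay route, the mail.rc changes are minimal. A sketch, assuming the relay is configured in the admin console to accept mail from your server’s IP (host and port per Google’s SMTP relay documentation; with IP-based authentication the smtp-auth-* settings can be dropped):

set smtp-use-starttls
set smtp=smtp://smtp-relay.gmail.com:587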

Vagrant

Simply put, Vagrant is open source software from HashiCorp that provides orchestration for a number of virtualization technologies. Vagrant allows for defining the properties of a virtual machine, such as memory, CPUs, name, and provisioning, and allows for quickly creating and accessing that machine.

Vagrant is an incredibly powerful tool in both development and operations. Whether it’s a shared development environment for a legacy application, testing server configuration management tools across varying operating systems, or anything in between, adding Vagrant to your workflow is incredibly useful.

Key Terms

Vagrant has a few key terms to keep in mind during an introduction.

  • Vagrantfile
    • Configuration file defining the machine(s) to create
  • Provider
    • Virtualization Platform (libvirt, VirtualBox, docker, etc)
  • Box
    • Base image (Operating system + configuration to use)
  • Provisioner
    • Additional configuration management to be completed after orchestration is complete
  • Plugin
    • Extends the functionality of Vagrant

Vagrantfile

The Vagrantfile is a Ruby configuration file that defines the machine or machines to orchestrate. A default Vagrantfile will be created in the working directory on running vagrant init. Exact configuration options vary by provider, meaning some configuration specifics may be available to libvirt but not VirtualBox.

Providers

A Vagrant Provider is the virtualization platform that Vagrant interacts with. Fedora defaults to libvirt, but this can instead be VirtualBox, docker, or even a cloud provider such as Rackspace or AWS. The provider can be defined at the command line or, more conveniently, via an environment variable.

Box

A box is the image to use with Vagrant, set via config.vm.box to define the base image of the machine to be created. Boxes can be custom created and stored in an artifact repository, or found and retrieved directly from Vagrant. A box can have default parameters, such as the number of CPUs and memory, that can be overridden via the Vagrantfile.

Once the Vagrantfile is defined and vagrant up is run for the first time, the box is pulled from a remote repository. Once pulled, the box is cached locally, meaning there isn’t a large download each time a box is spun up. The default configuration also checks whether a box has an update available, which can be applied with a separate command (vagrant box update).
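
The relevant box commands are all subcommands of vagrant box:

$ vagrant box list
$ vagrant box outdated
$ vagrant box update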

Provisioner

A provisioner is a method for hooking in further configuration management after the virtual machine orchestrated by Vagrant is up and accessible. This can be as simple as running a few bash commands, or as complex as running an Ansible playbook or connecting to an existing Salt master. The big configuration management methods are supported (Ansible, Salt, Chef, Puppet…), so it’s easy to set up the virtual machine exactly how you want, or even to test new or existing playbooks/manifests/state files meant for your existing infrastructure.
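
As a sketch, swapping the shell provisioner used later in this post for Ansible is just a different provision block in the Vagrantfile (the playbook name is a placeholder):

  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "playbook.yml"
  end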

Plugin

Plugins (as expected) are ways to extend the functionality and features of Vagrant. These include interactions with cloud providers, methods for creating custom boxes, and a way to build any functionality you need through custom plugins.

Using Vagrant

Vagrant configurations can be as complex as your imagination can create, but getting started is super easy. For this example I’m going to go over how to set up a quick Jenkins instance I used to test Jenkins Job Builder files.

Installation

For a simple example all you need is Vagrant and VirtualBox. Follow the installation instructions for your operating system, as this obviously varies.

Environment Variable configuration

I use Fedora, which defaults Vagrant to the libvirt provider. This is all well and good, but to keep my Vagrantfiles cross-platform, I prefer to use VirtualBox. The default can be overridden by setting the VAGRANT_DEFAULT_PROVIDER environment variable to virtualbox; if you’re not familiar, this is done via export VAGRANT_DEFAULT_PROVIDER=virtualbox.

Vagrantfile Configuration

In a new, empty directory, create a generic Vagrantfile with:

vagrant init

This creates a well-commented Vagrantfile that I’d recommend taking a few minutes to look through. We’ll need to modify it just a bit. The comments can be left in, but in the end we need the following lines:

Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"
  config.vm.network "forwarded_port", guest: 8080, host: 8080
  config.vm.provider "virtualbox" do |vb|
    vb.memory = "2048"
    vb.cpus = 2
  end
  config.vm.provision "shell", inline: <<-SHELL
    sudo yum install -y wget
    sudo yum update -y
    sudo wget -O /etc/yum.repos.d/jenkins.repo https://pkg.jenkins-ci.org/redhat/jenkins.repo
    sudo rpm --import https://jenkins-ci.org/redhat/jenkins-ci.org.key
    sudo yum install -y jenkins
    sudo yum install -y java-1.8.0-openjdk
    # EPEL provides the ansible package on CentOS/RHEL 7
    sudo yum install -y epel-release
    sudo yum install -y vim git ansible python-requests unzip
    sudo systemctl start jenkins
    sleep 300
    cat /var/lib/jenkins/secrets/initialAdminPassword
  SHELL
end

Basic config

One nice thing about this configuration is that it is very human-readable. We’re using the basic CentOS 7 image for our base and making port 8080 on the virtual machine accessible on localhost through port forwarding. Next we set the virtual machine to two CPUs and 2GB of memory.

Provisioning

To provision the virtual machine, this example just uses shell commands. There are a lot of benefits to using server configuration management such as Ansible for this, but for this simple example we’re just running a handful of commands specific to RHEL/CentOS 7.

The provision step installs wget, updates all packages via yum, adds the official Jenkins repository, installs the latest Jenkins LTS along with Java, and installs epel-release to allow the installation of other packages (git, ansible). Finally, it starts Jenkins and outputs the initial admin password so we can log into Jenkins.

Running the example

Spin it up with vagrant up. Not including all output for brevity:

$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'centos/7'...
==> default: Matching MAC address for NAT networking...
==> default: Checking if box 'centos/7' is up to date...
...
==> default:   python2-jmespath.noarch 0:0.9.0-3.el7
==> default:   python2-pyasn1.noarch 0:0.1.9-7.el7
==> default:   sshpass.x86_64 0:1.06-2.el7
==> default:   vim-common.x86_64 2:7.4.160-4.el7
==> default:   vim-filesystem.x86_64 2:7.4.160-4.el7
==> default: Complete!
==> default: 3297c139343744c9a05c235e6d73762e

The virtual machine can be accessed via vagrant ssh. Vagrant takes care of the host and port forwarding.

$ vagrant ssh
[vagrant@localhost ~]$ hostname
localhost.localdomain

Ideally, you wouldn’t need to log into the virtual machine and make changes, but it’s always handy to be able to troubleshoot or add further configuration. If you find yourself needing to log in to the virtual machine, whatever you’re doing is probably something that should be added to the provisioner step in the Vagrantfile for the next time you spin it up.

Access the Jenkins instance by opening your browser to http://localhost:8080. Setup and configuration of Jenkins is beyond the scope of this post.

If you’re all done with the virtual machine, or at some point changes were made to the point that you need to start fresh, this virtual machine can be completely removed via vagrant destroy.

$ vagrant destroy
    default: Are you sure you want to destroy the 'default' VM? [y/N] y
==> default: Forcing shutdown of VM...
==> default: Destroying VM and associated drives...

Overview

Vagrant is an easy and programmatic way to spin up a local virtual machine to review or test software, exercise configuration management, or just toy around with a different operating system. Thanks to the fairly shallow learning curve in getting started, Vagrant is a great addition to your productivity across a wide range of development and operations tasks.

Migrating email

Self hosting email is great, until it isn’t. There are plenty of options for where to get your email, whether you’re bringing your own domain or not. Moving from a self hosted mail server to managed email (or between two managed services) is primarily just DNS changes. This is a quick summary of what you need to change in DNS, and why.

Ups and Downs of self hosting email

I’ve been self hosting my email for quite a while, but it was always in the back of my mind that it would be worth offloading to avoid the inevitable maintenance and possible downtime. Really, once I had email set up, there wasn’t much maintenance outside of upgrades and reboots. Sure, there was extra work for DKIM, SPF, and the occasional log review to dial in Fail2Ban, but outside that it all just worked. The good thing about email is that even if the server is down for a bit, the sender will retry for some period of time, so I don’t consider a bit of downtime here and there an issue.

Catch-All

Because I’m the only user on my domain, I’ve gotten into the habit of handing out an email address relevant to the company/business/use, so that I can find out who gave out or lost my email address, and so that filtering is easier to set up. For example, if I’m handing out my email address to a salesman at a car dealer in Omaha, I would give them omaha-cardealer@my-domain.com. This tends to get a confused response (many times I’ve been asked “Do you work for us?”), but a quick blurb of “It’s for mail forwarding” is enough for people to lose interest. All it takes is enabling a catch-all email address (see your host/server configuration, or the sketch below) and then handling mail forwarding based on the To: field.
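
On a self-hosted Postfix server, for example, a catch-all is a single virtual alias entry. A minimal sketch, assuming virtual_alias_maps in main.cf already points at /etc/postfix/virtual (the domain and mailbox are placeholders):

# cat /etc/postfix/virtual
@my-domain.com    me@my-domain.com
# postmap /etc/postfix/virtual
# systemctl reload postfix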

Logs

Having access to your own mail logs is pretty handy. Let’s face it, if you’re self hosting anything, you’re going to be digging into logs at some point, and if you didn’t enjoy that at least a bit you wouldn’t be self hosting. How great is it to dial in your Fail2Ban, or manually drop the IP that won’t stop knocking on your server?

Someone didn’t get an email? A quick search and you can prove whether their email server accepted it or not. But that doesn’t mean much to a lot of people; unless you’re emailing another self-hoster, providing the log information, or even just the general “your server got it,” doesn’t mean a whole lot. Telling your recipient that their mail server received it is effectively the same as it not being sent at all, other than maybe a push to check spam folders.
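
A typical check looks something like this (the address and IP are placeholders; Postfix on CentOS logs to /var/log/maillog):

# grep "to=<recipient@example.com>" /var/log/maillog
# iptables -I INPUT -s <OFFENDING_IP> -j DROP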

Blocks from Providers

This is the worst, and what finally got me to stop self hosting. Running a mail server on a VPS runs into a whole host of problems, even if your configuration (DNS, DKIM, SPF, and DMARC) is set up perfectly. On creation of a VPS, there’s no way to know what an IP was used for before it was allocated to you. It may have been used and flagged for spamming before you got it. Even if not that IP, another IP (or several) in your new network block may have been flagged, resulting in a provider not accepting email from any IP address within that block. And even if everything is fine on creation of the server, this can change at any time.

Diagnostic-Code: smtp; 550 5.7.1 Unfortunately, messages from [YOUR.IP.ADD.RESS]
weren't sent. Please contact your Internet service provider since part of
their network is on our block list (<REMOVED>). You can also refer your
provider to http://mail.live.com/mail/troubleshooting.aspx#errors.

Typically your VPS provider will go to bat for you and work with the mail administrator blocking you. But again, this can change at any time. Some mail providers won’t even respond with this information, and instead just silently mark your email as spam.

Where to get email service

A quick search at your favorite internet search engine provides plenty of options for hosted email: a shared hosting platform, a VPS provider with a managed email service, or one of the “big” guys such as Outlook or Gmail. Each has advantages and disadvantages. Going with a shared or VPS host gets rid of the management aspect, but can still run into issues with blocks from other providers. Using Outlook or Gmail may cost a little more, but comes with the bonus of being a huge email provider plus additional business tools.

There’s way more to the decision than what I’ve given above if you have specific business needs, multiple users, etc. Definitely dig a little deeper if you’re looking for more than just someone else to host your email.

How to migrate

This is quite high level as there are too many mail and DNS providers to attempt to cover anything more than the basic steps. I use Rackspace DNS, and was previously hosting my email on a Linode server. The steps are fairly simple for the migration itself.

  1. Sign up with your new email host
  2. Update DNS to the new host
  3. Enable SPF
  4. Enable DKIM
  5. Enable DMARC

Sign up for your new email host

I went with Google because of the ability to add catch-all email forwarding, and because the integration with Android was worth more to me than access to Office features with Outlook. Prices were comparable between the two for my use.

Update DNS to point to the new mail host

This will be provided by your new email host. DNS for your domain uses MX records to direct mail. You may have one MX record (most likely if you’re self hosting), but multiple records with specified priorities are possible and common. Your new provider should tell you exactly what needs to be entered.

TTL is a very important concept with DNS here. Time To Live basically says “this record is good for X amount of time.” If your record’s TTL is 24 hours, it could be up to 24 hours before your new MX records are used and your mail is sent to the new provider. I always keep my TTLs at 1 hour or less (typically 5 minutes), but be sure not to tear down your existing mail setup until at least the TTL of the previous record has passed.
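
You can check the TTL with dig; without +short, the second column of each answer is the TTL in seconds, and it counts down when querying a caching resolver (one illustrative answer line shown):

$ dig dankolb.net MX +noall +answer
dankolb.net.            300     IN      MX      1 ASPMX.L.GOOGLE.COM.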

$ dig dankolb.net MX +short
1 ASPMX.L.GOOGLE.COM.
5 ALT2.ASPMX.L.GOOGLE.COM.
10 ALT3.ASPMX.L.GOOGLE.COM.
5 ALT1.ASPMX.L.GOOGLE.COM.
10 ALT4.ASPMX.L.GOOGLE.COM.

More DNS – SPF

Sender Policy Framework, or SPF, is a DNS record that essentially states where email for your domain may be sent from. This is a method of preventing spam by allowing the receiving mail server to validate that email received for a domain came from a mail server authorized to send it. There are plenty of options for SPF; take a look at the documentation for more information. Your new host should provide this record as well, in order to mark their sending servers as valid for your domain.

$ dig dankolb.net TXT +short | grep spf
"v=spf1 include:_spf.google.com ~all"

This record says to use _spf.google.com to find the list of approved senders, and to SoftFail everything else (accept the email but mark it). The SoftFail allows email to be sent from other places without flat-out rejection (such as directly from a WordPress installation).

Even More DNS – DKIM

DomainKeys Identified Mail, or DKIM, is a method for identifying email messages by signing and verifying them based on a public key published in the domain’s DNS. The sending server signs the email with a header field containing a hash of the message and the DNS record to query for the public key, along with additional information. This provides verification that the email was generated from an authorized server (or at least one that had the private key!). Again, your host should provide this record, though it may require additional steps such as generating the key.

$ dig google._domainkey.dankolb.net TXT +short 
"v=DKIM1; k=rsa; p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAgWG5dWv8XqN9UUqDsoi3F5wW8SwCahdslYbtygLHZageCccyNKM5ux7IhDG1sHKVM4ASG+jV6NvaMlxxIWMAAEQ3gQjZSVzsGzPXAdoaVJL73x+VfxuAmhpz8NPp4GLZMzGuMAH/Aq1w0IsCPzPGwd0jmZ1A8pOGPBDnlpYKAklTm+Rb/iv+8xUMy3O/jLLZj" "xK9/0Zo0+K28dB2QgozgIIABXFDSoYNUkg9yH4ag1cZmhSkaQpJ17TwLTqymHO6sw4pkm7EcIRYhPtjdmwunPEm53n6ObuT/fRK3UFNqjpRp2vb6VPdHmK8MjFZVOumsy+FMjaZaJhytoSICkNlfQIDAQAB"

A DKIM-signed email actually specifies the DNS record to use, in this example google._domainkey.dankolb.net, so there can be multiple records. And because this is just a header value, it is compatible with email servers that do not do DKIM validation.

Interesting side note: DKIM initially failed to enable after a new GSuite account creation, giving an Error #1000 pop-up after a 500 from the backend service. After reviewing with support, I received further information and an eventual follow-up email. Enabling DKIM can take some time for new GSuite accounts, and after two days I was able to enable it and continue. As mentioned above, for my personal email this wasn’t an issue, but when migrating multiple users this could be a problem. The easy (and probably more realistic) workaround is to set up the new account, give it some time, then modify DNS.

And Even More DNS – DMARC

Domain-based Message Authentication, Reporting, and Conformance (DMARC) provides a specified action for receiving mail servers based on SPF and DKIM results. Essentially, this allows the mail administrator to definitively state what to do with email that doesn’t pass SPF/DKIM, and where to send aggregated mail reports.

$ dig _dmarc.dankolb.net TXT +short 
"v=DMARC1; p=none; rua=mailto:postmaster@dankolb.net"

This is not a terribly exciting example of DMARC, as p=none prescribes no specific action for the receiving mail server. But, to keep with the SPF record allowing email to be sent from elsewhere, this is needed to prevent rejection of that email (quarantine is the middle-ground option).
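
Once you’re confident in your SPF and DKIM records, a stricter policy is a small change; for example, quarantine tells receivers to treat failing mail as suspicious (typically landing it in spam) while still sending aggregate reports:

v=DMARC1; p=quarantine; rua=mailto:postmaster@dankolb.net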

Additional (Optional) Steps

There are plenty of other things to do depending on your email and how you use it, but the above outlines the initial setup. Further steps to consider, not described here, include:

  1. Reconfiguring email clients (mobile, desktop)
  2. Migrating existing emails from the old host to the new host
  3. Updating email filtering and forwarding