I’ve recently been working with the new pipeline plugin DSL in Jenkins 2.0, and the more I work with it the more I like it. I truly believe it’s the next generation of pipeline as code, even though it has several limitations and gotchas that I will describe.

A brief history of pipelines and job generation in Jenkins

Jenkins before 2.0 was simply not designed for complex Continuous Delivery pipelines. What the community created with plugins such as the Build Pipeline plugin and the Delivery Pipeline plugin was essentially a workaround, stitching separate jobs together by passing parameters between them. It worked, but underneath it was a big ugly hack with lots of overhead.

And let’s talk about the way these pipelines were generated. A common solution was to have template jobs that were copied every time a new component pipeline was created. This is a maintenance nightmare: no code stored in SCM, everything lives locally in Jenkins, and forks of the templates occur, so you end up with several template generators that have diverged from one another.

Enter Job DSL! Originally created by Netflix, this plugin actually gave us a way to generate jobs using a Groovy DSL, which meant all the job config could be stored as code. Here’s a simple example taken directly from the project’s GitHub page:

def gitUrl = 'git://github.com/jenkinsci/job-dsl-plugin.git'

job('PROJ-unit-tests') {
  scm {
    git(gitUrl)
  }
  triggers {
    scm('*/15 * * * *')
  }
  steps {
    maven('-e clean test')
  }
}

While this is awesome, the jobs created were still translated into XML files, and it didn’t solve the headache of having several jobs tied together by passing parameters. Also, the Job DSL programs themselves usually bloated up considerably. A lot of plugins are supported in Job DSL, but the ones that aren’t will give you a real headache. The solution for those is to manipulate the underlying XML of the Jenkins config.xml directly. Here’s an example I had to solve: set the Sonar variable jobAdditionalProperties on the Sonar publisher:

configure { project ->
  project /
    publishers /
    'org.jenkins__ci.plugins.flexible__publish.FlexiblePublisher' /
    publishers /
    'org.jenkins__ci.plugins.flexible__publish.ConditionalPublisher' /
    publisherList /
    'hudson.plugins.sonar.SonarPublisher' /
    jobAdditionalProperties('...')
}

This gets old… real fast! Embarrassingly, it took me half a day and a Stack Overflow post to figure it out. I still think the maintainers of the project are doing a great job, so I’m not out to bash them here.

Jenkins 2.0 - Pipelines as first class citizens

So what’s the fuss about the pipelines in Jenkins 2.0? Well, the main improvement in Jenkins 2.0 is that pipelines are built in as first-class citizens. We no longer need to create 10 different jobs for every component and knit them together; one job can now describe the whole pipeline, which means far less overhead. I think the pipeline plugin builds on what the Job DSL plugin first achieved and goes way further.

With the Jenkinsfile you can describe the whole pipeline in a Groovy DSL. Let’s take a look at a simple Jenkinsfile:

node('master') {
  stage('checkout and build') {
    try {
      checkout scm
      env.PATH = "${tool 'Maven 3'}/bin:${env.PATH}"
      sh 'mvn clean install'
    } catch (e) {
      currentBuild.result = 'FAILURE'
      throw e
    } finally {
      step([$class: 'WsCleanup', cleanWhenFailure: false])
    }
  }
}

The example above runs the stage on the master node, checks out code from SCM and runs mvn clean install. We add a try/catch/finally block, which enables us to clean up the workspace only if the build did not fail. This is actually pretty sweet: we get the pipeline visualisation for free and we can define the build logic in a programmatic fashion.

So how should you use the Jenkinsfile? The first obvious thought is that each dev team puts a Jenkinsfile in their component’s repo and describes their pipeline à la Travis CI. However, if you’re not a small organisation I don’t think this approach is ideal at all. Continuous Delivery pipelines tend to get complex, and the Jenkinsfile can easily swell to 500 lines of code or more. I also don’t think dev teams should put too much focus into writing the DSL, because of the risk of pipelines diverging, and lastly there is no code reuse at all with this approach.

So what is a better choice then? My firm belief is: use the global lib plugin!

Global Lib Plugin / Shared Pipeline Libraries

This plugin allows you to create a shared code library repo which is stored in SCM and can be used by all pipeline jobs (aka Jenkinsfiles). It also enables us to not reinvent the wheel every time and keep the code DRY. Basically, we can create a single big modularized program that handles all the logic in your organization’s CD pipelines. Furthermore, we have the possibility to replace all the brittle shell scripts that many of us have :facepunch:

Let’s take a look at setting this up:

  • Define the name and SCM URL of your global library in Jenkins at: Manage Jenkins » Configure System » Global Pipeline Libraries; give it a name (I’ll use codemonkeyLib here)
  • Next, set up the directories; the project should use a standard Java directory structure:
roger@MacBook-Pro:~$ tree .
src
└── se
    └── codemonkey
        ├── builder
        │   ├── Gradle.groovy
        │   └── Maven.groovy
        ├── docker
        │   ├── Docker.groovy
        │   └── Kubernetes.groovy
        ├── scm
        │   └── Git.groovy
        └── utilities
            └── RestCaller.groovy

And a Groovy class can look like this (src/se/codemonkey/scm/Git.groovy):

package se.codemonkey.scm

class Git implements Serializable {
  String branch
  def stages

  Git(stages) {
    this.branch = stages.env.branch
    this.stages = stages
  }

  def checkout() {
    stages.echo 'checking out code from scm..'
    // checkout scm here, i.e. stages.checkout(...)
  }
}

Basically, all valid Groovy code is accepted, apart from some limitations that I will describe later. This class can be imported and used in the Jenkinsfile script like this:

@Library('codemonkeyLib')
import se.codemonkey.scm.Git

withEnv(['branch=master']) {
  node('master') {
    stage('checkout and build') {
      def git = new Git(this)
      git.checkout()
    }
  }
}

The @Library annotation refers to the library we defined earlier and tells Jenkins to check out that repo from SCM. Then we can import the available classes, instantiate objects and run methods. Note that to be able to run DSL-specific steps in your library classes you need to instantiate the object with the instance of the script (i.e. this); we have stored that as def stages in the constructor in the example above. This enables us to put all the logic into the library classes and use the Jenkinsfile/script as a base template that describes the structure of the pipeline.
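
To make this concrete, here’s a minimal sketch of what the Maven.groovy builder from the directory tree above could look like. The build method and its default goals are just my assumptions for the example; the important part is that every pipeline step goes through the script instance (stages) that was passed in:

package se.codemonkey.builder

class Maven implements Serializable {
  def stages

  Maven(stages) {
    this.stages = stages
  }

  // run maven through the pipeline 'sh' step on the passed-in script instance
  def build(String goals = 'clean install') {
    stages.sh "mvn ${goals}"
  }
}

In the Jenkinsfile you would then call new Maven(this).build() from within a node block, just like the Git example.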

What about the Job DSL plugin?

So is the Job DSL plugin totally deprecated now? From my point of view, no, although it will play a minor role. Let me explain: although the pipeline DSL is great, we still need a way to generate new jobs, and I have not found a good solution for that in the pipeline DSL itself. The Job DSL plugin actually supports the pipeline plugin, so with a wrapper Job DSL seed job we can easily create new pipeline jobs.

Take the following scenario: we have a JSON file in SCM that lets teams describe their pipelines in an array, and changes to this file trigger a Job DSL seed job in Jenkins that generates new pipeline jobs. It can look like this:

import groovy.json.JsonSlurper;

class JobGenerator {
  def static Add(job, url) {
    job.with {
      environmentVariables {
        env('scm', url)
      }
    }
    return job
  }
}

def returnType(name) {
  def result
  switch(name) {
    case 'springboot':
      result = 'pipeline-types/workflow.groovy'
    break
    case 'tomcat':
      result = 'pipeline-types/workflow2.groovy'
    break
  }
  return result
}

def SomeThing(String project, String name, String url, String builder) {

  folder("$project") {
    displayName(project)
  }

  def type = returnType(name)

  def myJob = pipelineJob("$project/$name")
  myJob = JobGenerator.Add(myJob, url)

  myJob.with {
    definition {
      cps {
        script(readFileFromWorkspace(type))
        sandbox()
      }
    }
  }
}

def InputJson = new JsonSlurper().parseText(readFileFromWorkspace("pipeline.json"))
def Project = InputJson.project

InputJson.pipelines.each { SomeThing(Project, it.name, it.url, it.builder) }
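
For reference, the pipeline.json consumed above could look something like this. The field names (project, pipelines, name, url, builder) come straight from the script; the values are made-up examples:

{
  "project": "codemonkey",
  "pipelines": [
    { "name": "springboot", "url": "git@github.com:codemonkey/service-product.git", "builder": "maven" },
    { "name": "tomcat", "url": "git@github.com:codemonkey/service-order.git", "builder": "gradle" }
  ]
}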

Pipeline plugin gotchas & limitations

Here are some things that I find bothersome and restricting with the pipeline plugin:

  • You cannot use .each {} loops in the Groovy classes, so you are limited to the old C-style for loop (see the sketch after this list)
  • Developing and testing the classes can be a bit tiresome and time-consuming: push the code, run the job, and so on
  • You sometimes get strange exceptions from Jenkins
  • You cannot use superclass/subclass hierarchies that pass the instance of this from the subclass constructor via super()
  • Rebuild stages (checkpoints) are a commercial addition, so if a build fails at the end we have to rebuild it from the start. You can work around this with try/catch and retry, but I really hope we will get an open source implementation soon :neckbeard:
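
To illustrate the first point, here’s a small sketch of the workaround (the service names are made up); closure-based iteration tends to break under the CPS transformation, while an indexed loop works fine:

def services = ['service-product', 'service-order']

// services.each { echo "deploying ${it}" }   // closure-based iteration like this tends to break in CPS-transformed code

// the old C-style loop works:
for (int i = 0; i < services.size(); i++) {
  echo "deploying ${services[i]}"
}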

UPDATE: 2017-09-08

This post is almost one year old now and has gotten a good number of views, which I’m happy to see. There have been quite a few updates and changes regarding Pipelines, as well as in my own experience working with them. This update will highlight some of those things.

Declarative vs. Scripted pipeline

In early 2017 CloudBees announced a new way to write the Jenkinsfile, which they call declarative pipeline, and it has since become the standard syntax in their documentation. I don’t know if this was a great idea; while the declarative syntax is more expressive and less verbose than the scripted (Groovy DSL) one, it lacks the same power and flexibility, and having two separate DSLs with different functionality adds confusion.
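
For comparison, the build stage from the scripted example earlier would look roughly like this in declarative syntax (a minimal sketch; the agent label and tool name are the same assumptions as before):

pipeline {
  agent { label 'master' }
  tools {
    maven 'Maven 3'
  }
  stages {
    stage('checkout and build') {
      steps {
        checkout scm
        sh 'mvn clean install'
      }
    }
  }
}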

Restart a failed pipeline

As I described earlier in this post, if a pipeline fails you do not have the ability to restart it at the failed stage; instead you need to re-run it from scratch (which is not ideal when you have a long-running and complex pipeline). I was happy to see the announcement from CloudBees that they were planning to open source the ‘Checkpoints’ plugin, and instantly disappointed when they withdrew that claim, see this jira

So the options we’re left with are try/catch blocks with a retry around potentially flaky segments, like the following example where we let the user retry the deploy step twice before we fail the whole job:

stage('Deploy to ci') {
  node('build') {
    try {
      sh('./deploy.sh')
    } catch (Exception e) {
      retry(2) {
        input "Deployment failed - retry or abort?"
        sh('./deploy.sh')
      }
    } finally {
      // optional step
    }
  }
}

This works, but it’s laborious and can be error-prone; unfortunately it’s the only workaround I have figured out so far.

Running stages/tasks in parallel

The ability to run tasks in parallel has been supported for quite some time, however only inside of a stage; the pipeline visualization will still only show one stage, not several. I suppose Blue Ocean has better visualization in this regard, but it also has some drawbacks (the amount of plugins required to install Blue Ocean is staggering).

For running tasks in parallel inside of a stage using the same Jenkinsfile you can do the following (here we’re running mvn clean install and npm install concurrently):

stage('Build & compile') {
  node('build') {
    parallel (
      phase1: { sh "mvn clean install" },
      phase1: { sh "npm install" }
    )
  }
}

The other option we have is to create another job which we call in parallel. This job could be either a freestyle job or another pipeline job that accepts parameters. The example below runs a deploy job against several environments in parallel:

stage('Deploy to ci, test and stage environment') {
  node('build') {
    parallel (
      phase1: { build job: 'generic-deploy-job', parameters: [string(name: 'environment', value: 'ci'), string(name: 'service', value: 'service-product'), string(name: 'version', value: '1.1')]},
      phase2: { build job: 'generic-deploy-job', parameters: [string(name: 'environment', value: 'test'), string(name: 'service', value: 'service-product'), string(name: 'version', value: '1.1')]},
      phase3: { build job: 'generic-deploy-job', parameters: [string(name: 'environment', value: 'stage'), string(name: 'service', value: 'service-product'), string(name: 'version', value: '1.1')]}
    )
  }
}

The advantage here is that you can split the logic: you don’t need to add a lot of logic inside the phases, you add it in the generic-deploy-job instead. However, the disadvantage is that this will spawn 3 jobs, so you will consume 4 executors, and you will have to jump between jobs to read the logs for each deploy job.

Manual Steps and Optional Steps

Let’s say your CI/CD workflow isn’t totally automated; perhaps when the code reaches the stage environment there is a manual QA process that needs to be completed before the job can proceed further. In this case we can just use the input step before we acquire a node (the job will not consume an executor in this state). I recommend wrapping it in a timeout, otherwise the job will wait forever. We can achieve this like so:

stage('QA sign-off') {
  timeout(time:14, unit:'DAYS') {
    input message: 'Confirm when ready to proceed to Release'
  }
  node('build') {
    // do stuff
  }
}

In the pipeline visualization the job will be presented as paused, and if you hover over the step with the mouse you will get a pop-up with ‘Proceed’ or ‘Abort’ options. An optional step works in a similar way but requires an ugly workaround to not mess up the visualization (when a stage is added or removed, the stage view of previous builds disappears). An optional step could, for example, be that it’s optional whether to deploy to a certain environment or not. Here is my take on that:

stage('Deploy to stage') {
  timeout(time:14, unit:'DAYS') {
    env.CHOICE = input message: 'Deploy to stage?', parameters: [choice(choices: 'YES\nNO', description: 'Choose YES to deploy to stage. Choose NO to skip to the next stage.')]
  }
  node('build') {
    if ( env.CHOICE == 'NO' ) {
      sh('echo skipping stage deploy')
    } else {
      sh('./deploy.sh')
    }
  }
}

This is certainly not pretty. The reason I have the if/else statement inside the node is that if the code never reaches that point it will mess up the visualization of all the previous builds. Hopefully you won’t have this requirement in the first place though.

Pipeline Shared Libraries

Still one of the things I like best about Jenkins. Using this plugin lets you put all the common/shared stuff inside a Groovy program and helps you reduce the logic needed in the Jenkinsfile. However, it’s still a bit awkward to work with and develop; an unfortunate commit on master can mess up all the jobs that import it. It’s almost a must to have an identical test Jenkins instance where you can test branches while developing new features. Another annoyance is that commits from the shared lib will show up in all the jobs that import it, which is very likely to cause confusion for the developers owning that job. There is a PR on GitHub that fixed this issue but it hasn’t gotten any love from the maintainer :/

EDIT: Since version 2.9 (released Sep 13, 2017) the default behavior is now to exclude shared libraries from the changelog
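
On the branch-testing point above: one thing that helps is that @Library accepts a version after an @ sign, so a scratch job can load a library branch without touching master. A minimal sketch (the branch name is made up):

@Library('codemonkeyLib@my-feature-branch')
import se.codemonkey.scm.Git

withEnv(['branch=master']) {
  node('master') {
    stage('library smoke test') {
      new Git(this).checkout()
    }
  }
}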