in Engineering

Building our new developer experience: Part 1

Our API is at the heart of everything we do. Not only does it allow developers to build powerful Direct Debit solutions, but it also powers the GoCardless Dashboard along with our integrations with partners like Xero, Teamup and Zuora.

As engineers, we know firsthand the importance of having great resources at your disposal when you’re getting started with an API. So back in September we kicked off a major project to revamp our developer onboarding experience. We wanted to build something we could be really proud of and which would also delight our customers.

In this blog post, the first of a series of two, we’ll take you through the journey of building our new developer site from idea to delivery.

Where we started

From the earliest days of GoCardless, developers have been a major focus for us. We’ve invested a lot of time in building an intuitive and powerful API with great tooling.

When we were building the current API in 2014 we learnt lots of lessons from our original legacy API from 2011 - we sought to provide consistency and predictability, implemented versioning from the get-go and made pagination the default.

One of the biggest problems with many APIs is that the documentation and libraries get out of date all too easily. When you’re moving as fast as we do and trying to continuously add value for customers, you’ll always be making improvements to your API, and regular changes every few weeks mean that developer resources quickly become stale. Typically, keeping them fresh means lots of painstaking manual work, updating documentation and libraries by hand whenever the API changes. We wanted the opposite: when we add a new optional attribute for creating a payment via the API, within minutes it should be mentioned in the documentation and supported by our API libraries.

To reduce friction in making changes to the API and help developers stay up to date, we’ve automated this process as much as possible. We describe our API in the form of a JSON Schema, embedding all the different resources, attributes and endpoints available. We then use an internal tool to apply this schema to templates we’ve built for our documentation and each of our client libraries.

Every time either the schema or one of these templates changes, we re-generate the resulting source code and automatically deploy it to our website or push it to GitHub.
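As a sketch of the idea, here's how generating reference docs from a schema might look. The schema and template below are toy examples for illustration, not our real JSON Schema or internal tooling:

```ruby
require "erb"
require "json"

# Hypothetical, much-simplified schema for a single resource. The real
# schema embeds every resource, attribute and endpoint in the API.
schema_json = <<~JSON
  {
    "resources": [
      {
        "name": "payments",
        "endpoints": [
          { "method": "POST", "path": "/payments", "description": "Create a payment" },
          { "method": "GET",  "path": "/payments", "description": "List payments" }
        ]
      }
    ]
  }
JSON

# A template for the reference docs; similar templates would exist per
# client library, all fed from the same schema.
template = ERB.new(<<~TPL)
  <% schema["resources"].each do |resource| %>
  ## <%= resource["name"].capitalize %>
  <% resource["endpoints"].each do |endpoint| %>
  - `<%= endpoint["method"] %> <%= endpoint["path"] %>` - <%= endpoint["description"] %>
  <% end %>
  <% end %>
TPL

schema = JSON.parse(schema_json)
docs = template.result(binding)
puts docs
```

A change to the schema (say, a new attribute) re-renders every template, so the docs and libraries can never silently drift from the API itself.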

This has provided a great base to work from for building documentation that we could be proud of and delivering what we saw as the bare minimum: up to date and complete documentation. We heard from satisfied users every week who found our docs great to work with. But we were aware that we didn’t have the right experience for people entirely new to the API - beyond a blog post, there were no code samples or step-by-step walkthroughs to help developers hit the ground running.

Working out what to build

With a basic understanding of the problem, we needed to decide what to build. To help us make the right choices, we defined three goals for the project:

  1. Help developers see the value in our API and reach that “aha!” moment as soon as possible
  2. Make it easy and seamless for people to transition from our sandbox environment to a live integration when they’ve finished building
  3. Identify the right developer leads and pass them to the sales team, so they can reach out and offer help at the right time

With the APIs we know and love, we can all remember the time when we first grasped their power. But as things stood, we were doing almost nothing to bring users to that point - after signing up to our sandbox environment, developers were just pointed to the docs and left to their own devices. We knew we could do better.

With that in mind, we started researching onboarding flows for other APIs, taking a look at Intercom, Stripe and Clearbit, amongst others, seeking to learn from what they did and didn't do well. For example, Clearbit did a great job with their “quick start” documentation, but it was frustrating that the flow didn’t remember your choice of what platform you wanted to integrate with.

I also spent time with our Sales and API Support teams internally - they spend all of their time talking directly with customers, so they could give us great insights into what our customers found most confusing and unintuitive. Both teams flagged up demand for code samples, which we weren’t providing beyond the readme files of our libraries.

Talking to the sales team made us all the more conscious of the importance of great documentation in the sales cycle - when large prospective users of our API are weighing up options for managing their Direct Debit payments, they’ll tend to get their developers to have a look.

The API Support team were able to give us great access to customers who had recently finished building integrations. We reached out to them and got direct feedback on what they found hardest in working with our API and what was least clear.

Weighing up the internal feedback, our conversations with real users of the API and our own intuitions as developers, it seemed clear that the biggest source of value would be a dedicated “getting started” guide for developers new to our API, supported by in-product messaging to guide them to and through it.

Delivering the new experience

The first improvement we made was to add code samples to our reference docs. We did think about automating this, but decided it would make it harder to provide contextually relevant code examples.

These examples made a big difference in terms of making our libraries easier to use but they didn’t do much to help customers understand the underlying concepts behind the API or point them to what they should be doing first. For this, we decided to write a “getting started” guide, separate from the reference documentation. Trying to merge the two together didn't make sense - they’re very different kinds of content: for example, a guide is naturally read start-to-finish whereas reference documentation is something you dip into as required.

First, we had to decide which steps should be in the guide. We wanted to introduce users to the basic concepts of Direct Debit (e.g. mandates and payments), help them set up their API client and then take them through the key API flows: adding customers, taking payments, and handling webhooks. This gave us what we needed to decide on a rough layout for the guide:

  • Setting up your language’s API client
  • Adding a customer using the Redirect Flow
  • Collecting payments
  • Listening for webhooks
  • What to do next (how to go live, further resources for more advanced integrations)

We decided to write a prototype guide with PHP code samples (our most-used language) as soon as possible, so we could test it out with real users and iterate, before writing code samples for all of the other languages we wanted to support.

From the beginning, we emphasised conveying best practices in a way which didn’t really fit with the reference documentation. For instance, we wanted to encourage developers to use a package manager, store their credentials in environment variables and take advantage of our API’s idempotency keys. We also wanted to show people ways to test their code and iterate quickly - for example by using ngrok to test webhooks locally and testing code snippets in a REPL like Ruby’s irb.
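To give a flavour of those practices, here's the shape of a payment-creation snippet from the guide. This is an illustrative sketch, not the exact client library API: the parameter names and helper are made up for the example, but the habits it shows (credentials from environment variables, an idempotency key on every create) are the ones the guide teaches:

```ruby
require "securerandom"

# Illustrative only: a default so the sketch runs standalone. In real code the
# token would be set in the environment, never committed to the codebase.
ENV["GC_ACCESS_TOKEN"] ||= "sandbox_example_token"

# Build the request we'd hand to an API client. The idempotency key means a
# request that times out can be safely retried without creating a duplicate
# payment.
def build_payment_request(amount_pence:, currency:, mandate_id:)
  {
    access_token: ENV.fetch("GC_ACCESS_TOKEN"), # credentials from the environment
    idempotency_key: SecureRandom.uuid,         # safe to retry on failure
    params: {
      amount: amount_pence,
      currency: currency,
      links: { mandate: mandate_id },
    },
  }
end

request = build_payment_request(amount_pence: 1000, currency: "GBP", mandate_id: "MD123")
puts request[:idempotency_key]
```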

Having written a first version, we put the draft in front of two developers completely new to the API who were considering an integration with us. At this stage it was really useful to work with people who’d never used the API or Direct Debit at all, since the guide is not just about showing you what code to write, but about explaining how GoCardless and the underlying Direct Debit system are structured. We learnt a huge amount from this - not only about minor mistakes (like missed syntax errors in our code samples), but also about where we’d not conveyed concepts as clearly as we’d hoped.

After iterating on our draft, we added code samples in other languages. Knowing that the quality (and thus copy-and-pastability!) of the samples is vitally important, we implemented automated syntax checking of our snippets as part of the continuous integration (CI) process, alongside a bunch of other automated tests (like checking the validity of links and making sure all our images have alt attributes) using HTMLProofer.
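A minimal sketch of that kind of snippet check (for Ruby samples only; our real CI covers each language we ship samples in, with HTMLProofer handling links and images) could look like:

```ruby
# Compile a Ruby code sample without running it; a SyntaxError means the
# snippet would break for anyone copying and pasting it, so CI should fail.
def ruby_snippet_valid?(source)
  RubyVM::InstructionSequence.compile(source)
  true
rescue SyntaxError
  false
end

puts ruby_snippet_valid?("puts 1 + 1")   # a well-formed snippet
puts ruby_snippet_valid?("def broken(")  # a snippet that won't compile
```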

In parallel, we also wanted to bring the design of our developer content up to date with our current website and brand to provide a consistent experience from first discovering GoCardless to integrating with the API to landing in your Dashboard. So James and Achim got to work on designing and implementing a brand new look for the site.

Our developer homepage

When trying to get the guides looking and feeling perfect, we spent a lot of time on our code samples - which are in many ways the heart of the guides. On the page, you can instantly switch between languages, with your choice remembered between pages. Many hours of painstaking effort went into the seemingly simple and inconsequential task of stopping the page from jumping around when you switch languages - but we felt that this attention to detail was hugely important in providing the experience we wanted.

Similarly, we wanted to make it easy to link to and bookmark parts of the guide, marked by subheadings. Eventually, we managed to get the behaviour just right so that the anchor in the address bar changes as you scroll up and down the page, whilst maintaining great performance.

Having finished and launched the “getting started” guide for the API, we had demand from elsewhere in the business to go further, so we also built and launched a guide to our Partner API for our partnerships team and a guide to our developer tools for our Developer Support team.

Our getting started guide

As a final icing on the cake, while getting inspired at the Write the Docs EU conference in Prague in September, I translated the guide into French, with a member of the French sales team, Côme, helpfully proofreading my work: a great way of carrying over the great new experience and showing our commitment to the French market.

What we’ve learnt

We’re hugely proud of what we’ve built: a beautiful new site for developers which not only includes up-to-date reference documentation, but also provides step-by-step help for developers using the API for the first time.

Through the process of building the new site, we learnt two key lessons which will feed into how we work in the future on our developer experience and the rest of our product:

1. Attention to detail is key

When we reviewed others’ API documentation as part of research for this project, the difference that love and attention to detail makes was clear. The small touches are hugely impactful - for instance, the animated transition when changing languages on the Stripe docs brought a real smile to our faces and was something that we wanted to show others on the team.

These small touches - things like stopping the page jumping when you change language - often take a disproportionate amount of time but are a key part of a polished final product and demonstrating just how much we care about our API.

2. Talk to users as early, and as often, as you can

It’s difficult to write a “getting started” guide as someone who is already familiar with an API and the concepts behind it. You can try and put yourself in the shoes of another developer but this will never be as effective as actually putting your work in front of someone.

We used the experiences of people who had just finished working on integrations to inform our first draft. We then showed that to people who’d never used the API or Direct Debit at all, to get a real sense of how effective our guide was for a complete beginner.

This project has been a clear reminder of just how important talking to customers is to delivering an end result with real value. We’re going to double down on this, making sure that our engineers as well as our Product team get regular developer-to-developer contact with our users.

Coming up next: building an in-product experience for developers

In part 2 of this series, I’ll introduce our new in-dashboard developer experience which guides developers through building their integration with real-time feedback, as well as generating valuable data which we can use to further improve the onboarding experience in future.


(Re)designing the DevOps interview process

Interviewing is hard. Both the company and the candidate have to make an incredibly important decision based on just a few hours’ worth of data, so it’s worth investing the time to make those precious hours as valuable as possible.

We recently made some changes to our DevOps interview process, with the aim of making it fairer, better aligned with the role requirements, and more representative of real work.

We started by defining the basics of our DevOps roles. What makes someone successful in this role and team? What are the skills and experience that we're looking for at different levels of the role?

It was important that the process work for candidates with varying levels of experience, so it needed to be flexible enough to assess skills clearly at each of those levels.

The skills we’re looking for fall into three broad categories: existing technical knowledge (e.g., programming languages), competency-based skills (e.g., problem solving), and personal characteristics (e.g., passion for the role, teamwork and communication skills). After defining these skills, we mapped out how we would assess them at each stage of the interview process.

The application review stage

At this stage, we want to check whether your background and interests align with our expectations for the role. Your CV and cover letter tell us about your experience and prior achievements; however, we also ask for a description of a technical project or problem that you’ve found particularly interesting. With this question, we’re looking to hear more about what motivates you and what kind of challenges you enjoy working on.

The phone interview

Assesses: motivations and communication

If we think there’s a match, the next step is a call with someone from the People team. In this call we’ll:

  • Say hi! 👋
  • Ask any questions we have about your application or background.
  • Find out what you’re looking for and your motivations for the role.
  • Guide you through how the rest of the interview process will work.
  • Give you plenty of opportunity to ask questions about the role, the company, or anything else on your mind - interviews are a two-way process, and it’s as important for you to learn about us as it is for us to learn about you.

The video interview

Assesses: motivations and interests, and web knowledge

This stage involves a one hour video call with two of our engineers. They will start by digging into your technical interests and background and finish with some questions about web fundamentals and troubleshooting. As with the phone interview, you will have plenty of time to ask our team questions, so don't be shy!

The live challenge

Assesses: problem solving, communication, logic, OS knowledge, troubleshooting

Research indicates that work sample tests provide some of the best indicators of future performance (e.g., Bock, 2015). For this interview, we looked through post-mortems for issues that we’ve debugged on the job, and built an EC2 instance that exhibits a number of these issues. We’ll give you remote access to the VM, and work with you to solve them.

The main aim of this stage is not to fix every issue, but to make steady progress as you encounter them during the interview. We want to see how you approach problems and why you choose one approach over another, and we’re interested in how you communicate with the team along the way and how you address the issues that you find.

The final stage

If everything else goes well, we’ll invite you to our office in London to meet the team. This final stage consists of three interviews.

The technical interview

Assesses: domains of technical knowledge (e.g., Linux internals, networking, storage)

The technical interview is usually conducted by someone from the Platform team, along with our CTO. We believe in giving people the best opportunity to show what they know, so you’ll choose the topics to discuss during the interview. The level of difficulty within each topic gradually increases until you decide to move on to the next one.

Pair-coding with our engineers

Assesses: collaboration, communication, logic and coding

This is another live exercise, but this time you’ll work directly with the team. We'll spend 90 minutes working together on a coding exercise that’s representative of the kind of task you’ll be doing on the job.

The soft skills interview

Assesses: motivation, teamwork and influence, and shared values

The final interview is a chat with someone from our management team. We’ll discuss your motivations and aspirations and whether your values match those of the company. Most of all, we want to know that you’ll thrive at GoCardless and within the Engineering team. This interview isn’t a personality test - we’re not looking to hire people who are the same as us. Instead, we’re looking to hire people who care about the same things as us. Making sure we add to our culture is as important as the tech skills you bring in; after all, we like where we work, but we love the people we work with!

The outcome(s)

Redesigning the interview process has hugely improved the experience of our engineering team and their confidence in making decisions after interviews. We've also received great feedback from candidates, especially regarding the live problem-solving challenge, which many find challenging and fun.

Overall, the redesign was a success in targeting the main issues that the team was facing. However, we're always looking for ways to improve and we're constantly adapting our processes based on feedback we receive from candidates and interviewers.

If you're curious about GoCardless, why not apply?


From idea to reality: containers in production at GoCardless

As developers, we work on features that our users interact with every day. When you’re working on the infrastructure that underpins those features, success is silent to the outside world - and failure is very public.

Recently, GoCardless moved to a container-based infrastructure. We were lucky, and did so silently. We think that our experiences, and the choices we made along the way, are worth sharing with the wider community. Today, we're going to talk about:

  • deploying software reliably
  • why you might want a container-based infrastructure
  • what it takes to reliably run containers in production

We'll wrap up with a little chat about the container ecosystem as it is today, and where it might go over the next year or two.

An aside - deployment artifacts

Before we start, it's worth clearing up which parts of container-based infrastructure we're going to focus on. It's a huge topic!

Some people hear "container" and jump straight to the building blocks - the namespace and control group primitives in the Linux kernel. Others think of container images and Dockerfiles - a way to describe the needs of their application and build an image to run it from.

It's the latter we're going to focus on today: not the Dockerfile itself, but on what it takes to go from source code in a repository to something you can run in production.

That "something" is called a build artifact. What it looks like can vary. It may be:

  • a jar for an application running on the JVM
  • a statically-linked native binary
  • a native operating system package, such as a deb or an rpm

To deploy the application, the artifact is copied to a bunch of servers, the old version of the app is stopped, and the new one is started. If it's not okay for the service to go down during deployment, you use a load balancer to drain traffic from the old version before stopping it.

Some deployment flows don't involve such concrete, pre-built artifacts. A popular example is the default Capistrano flow, which is, in a nutshell:

  • clone the application's source code repository on every server
  • install dependencies (Ruby gems)
  • run database schema migrations
  • build static assets
  • start the new version of the application

We're not here to throw shade at Capistrano - a lot of software is deployed successfully using this flow every day. We used it ourselves for over 4 years.

It's worth noting what's missing from that approach. Application code doesn't run in isolation. It needs a variety of functionality from the operating system and shared libraries. Often, a virtual machine is needed to run the code (e.g. the JVM, CRuby). All these need to be installed at the right version for the application, but they are typically controlled far away from the application's codebase.

There's another important issue. Dependency installation and asset generation (JavaScript and CSS) happen right at the end of the process - during deployment. This leaves you exposed to failures that could have been caught or prevented earlier.

It's easy to see, then, why people rushed at Docker when it showed up. You can define the application's requirements, right down to the OS-level dependencies, in a file that sits next to the application's codebase. From there, you can build a single artifact, and ship that to each environment (e.g. staging, production) in turn.
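In concrete terms, that file is a Dockerfile sitting next to the codebase. A minimal sketch for a Ruby application (illustrative, not our production setup) might be:

```dockerfile
# OS-level and language-level dependencies, declared with the app itself
FROM ruby:2.3
WORKDIR /app

# Install gems first so this layer is cached between builds
COPY Gemfile Gemfile.lock ./
RUN bundle install --deployment

# Copy the application code and do the up-front work at build time,
# not deploy time
COPY . .
RUN bundle exec rake assets:precompile

CMD ["bundle", "exec", "puma"]
```

Building this once produces an image that can be shipped unchanged to staging and then production.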

For us, and - I think - for most people, this was what made Docker exciting. Unless you're running at huge scale, where squeezing the most out of your compute infrastructure really matters, you're probably not as excited by the container primitives themselves.

What mattered to us?

You may be thinking that a lengthy aside on deployment artifacts could only be there to make this section easy, and you'd be right. In short, we wanted to:

  • have a uniform way to deploy our applications - to reduce the effort of running the ones we had, and make it easier to spin new ones up as the business grows
  • produce artifacts that can reproducibly be shipped to multiple environments
  • do as much work up-front as possible - detecting failure during artifact build is better than detecting it during deployment

And what didn't matter to us?

In a word: scheduling.

The excitement around containers and image-based deployment has coincided with excitement around systems that allocate machine resources to applications - Mesos, Kubernetes, and friends. While those tools certainly play well with application containers, you can use one without the other.

Those systems are great when you have a lot of computers, a lot of applications, or both. They remove the manual work of allocating machines to applications, and help you squeeze the most out of your compute infrastructure.

Neither of those are big problems for us right now, so we settled on something smaller.

What we built

Even with that cut-down approach, there was a gap between what we wanted to do, and what you get out-of-the-box with Docker. We wanted a way to define the services that should be running on each machine, and how they should be configured. Beyond that, we had to be able to upgrade a running service without dropping any requests.

We were going to need some glue to make this happen.

Step one: service definitions

We wanted to have a central definition of the services we were running. That meant:

  • a list of services
  • the machines a service should run on
  • the image it should boot
  • the environment variable config it should be booted with
  • and so on

We decided that Chef was the natural place for this to live in our infrastructure. Changes are infrequent enough that updating data bags and environment config isn't too much of a burden, and we didn't want to introduce even more new infrastructure to hold this state.

With that info, Chef writes a config file onto each machine, telling it which applications to boot, and how.
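The resulting per-machine config might look something like this. The field names here are illustrative, not Conductor's actual format:

```json
{
  "services": {
    "gocardless_app_production": {
      "image": "registry.example.com/gocardless/app",
      "instances": 2,
      "health_check_path": "/health",
      "env": {
        "RAILS_ENV": "production",
        "DATABASE_URL": "postgres://..."
      }
    }
  }
}
```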

Step two: using those service definitions

So we have config on each machine for what it should run. Now we need something to take that config and tell the Docker daemon what to do. Enter Conductor.

Conductor is a single-node orchestration tool we wrote to start long-lived and one-off tasks for a service, including interactive tasks such as consoles.

For the most part, its job is simple. When deploying a new version of a service, it takes a service identifier and git revision as arguments:

conductor service upgrade --id gocardless_app_production --revision 279d9035886d4c0427549863c4c2101e4a63e041

It looks up that identifier in the config we templated earlier with Chef, and uses the information there to make API calls to the Docker daemon, spinning up new containers with those parameters and the git SHA provided. If all goes well, it spins down any old container processes and exits. If anything goes wrong, it bails out and tells the user what happened.

For services handling inbound traffic (e.g. API requests), there's a little more work to do - we can't drop requests on the floor every time we deploy. To make deploys seamless, Conductor brings up the new containers, and waits for them to respond successfully on a health check endpoint. Once they do, it writes out config for a local nginx instance with the ports that the new containers are bound to, and issues a reload of nginx. Before exiting, it tells the old containers to terminate gracefully.
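The ordering of those steps is what makes the deploy seamless, and it can be sketched like this. The Docker, health-check and nginx collaborators are passed in as callables so the sequence is the focus; the names are illustrative, not Conductor's real internals:

```ruby
# Bring up new containers, wait until they pass their health checks, switch
# nginx over to them, and only then stop the old containers.
def upgrade_service(start_new:, healthy:, reload_nginx:, stop_old:, timeout: 30)
  new_ports = start_new.call
  deadline = Time.now + timeout
  until new_ports.all? { |port| healthy.call(port) }
    raise "new containers never became healthy" if Time.now > deadline
    sleep 0.1
  end
  reload_nginx.call(new_ports) # route traffic to the new containers
  stop_old.call                # then drain and stop the old ones
  new_ports
end

# Exercise the sequence with stubbed collaborators, recording the order.
steps = []
upgrade_service(
  start_new:    -> { steps << :start; [8001, 8002] },
  healthy:      ->(_port) { true },
  reload_nginx: ->(_ports) { steps << :nginx },
  stop_old:     -> { steps << :stop },
)
puts steps.inspect
```

If the health checks never pass, the old containers are left serving traffic untouched, which is exactly the failure mode you want.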

In addition to long-running and one-off tasks, Conductor supports recurring tasks. If the application supplies a generate-cron script, Conductor can install those cron jobs on the host machine. The application's generate-cron script doesn't need to know anything about containers. The script outputs standard crontab format, as if there was no container involved, and Conductor wraps it with the extra command needed to run in a container:

# Example job to clean out expired API tokens
*/30 * * * *  /usr/local/bin/conductor run --id gocardless_cron_production --revision 279d9035886d4c0427549863c4c2101e4a63e041 bin/rails runner ''
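The wrapping itself is simple string work, sketched below. The parsing is deliberately simplified and the example command is made up; the real generate-cron contract is just "emit standard crontab lines":

```ruby
CRON_FIELDS = 5 # minute, hour, day of month, month, day of week

# Take a plain crontab line from the application's generate-cron script and
# prefix its command with the conductor invocation needed to run it inside
# the right container.
def wrap_cron_line(line, service_id:, revision:)
  fields = line.split(" ", CRON_FIELDS + 1)
  schedule = fields[0...CRON_FIELDS].join(" ")
  command = fields[CRON_FIELDS]
  "#{schedule}  /usr/local/bin/conductor run --id #{service_id} " \
    "--revision #{revision} #{command}"
end

wrapped = wrap_cron_line(
  "*/30 * * * * bin/rails runner 'ApiToken.expired.delete_all'",
  service_id: "gocardless_cron_production",
  revision: "279d903",
)
puts wrapped
```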

Step three: triggering Conductor on deploys

There's one small piece of the puzzle we've not mentioned yet - we needed something to run Conductor on the right machines during deployment.

We considered a couple of options, but decided to stick with Capistrano, just in a reduced capacity. Doing this made it easier to run these deployments alongside deployments to our traditional app servers.

Unlike the regular Capistrano flow, which does most of the work in a deployment, our Capistrano tasks do very little. They invoke Conductor on the right machines, and leave it to do its job.

One step beyond: process supervision

At that point, we thought we were done. We weren't.

An important part of running a service in production is keeping it running. At a machine level this means monitoring the processes that make up the service and restarting them if they fail.

Early in the project we decided to use Docker's restart policies. The unless-stopped and on-failure options both looked like good fits for what we wanted. As we got nearer to finishing the project, we ran into a couple of issues that prompted us to change our approach.

The main one was handling processes that failed just after they started. Docker will continue to restart these containers, and neither of those restart policies makes it easy to stop this. To stop the restart policy, you have to get the container ID and issue a docker stop. By the time you do that, the process you're trying to stop has exited, been swept up by Docker, and a new one will soon be started in its place.

The on-failure policy does have a max-retries parameter to avoid this situation but we don't want to give up on a service forever. Transient conditions such as being isolated from the network shouldn't permanently stop services from running.

We're also keen on the idea of periodically checking that processes are still able to do work. Even if a process is running, it may not be able to serve requests. Not every process supervisor supports this kind of check, but having a process respond to an HTTP request tells you a lot more about it than simply checking that it's still running.

To solve these issues, we taught Conductor one more trick: conductor supervise. The approach we took was:

  • check that the number of containers running for a service matches the number that should be running
  • check that each of those containers responds to an HTTP request on its health check endpoint
  • start new containers if either of those checks fail
  • do that no more frequently than every 5 seconds to avoid excessive churn
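A single pass of those checks can be sketched as follows, with the Docker and HTTP collaborators injected as callables. The real tool loops no more often than every 5 seconds and also cleans up the failed containers; the names here are illustrative:

```ruby
# One supervision pass: start replacements for containers that are missing,
# and for running containers that fail their health check.
def supervise_pass(desired:, running:, healthy:, start_container:)
  (desired - running.size).times { start_container.call }
  running.each do |container|
    start_container.call unless healthy.call(container)
  end
end

started = 0
supervise_pass(
  desired: 3,
  running: ["c1", "c2"],          # one container is missing entirely
  healthy: ->(c) { c != "c2" },   # and one of the survivors is unhealthy
  start_container: -> { started += 1 },
)
puts started
```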

So far, this approach has worked well for us. It picks up containers that have fallen over, and we can tell conductor supervise to stop trying to pick up a service if we need to.

That said, it's code we'd love not to maintain. If we see a chance to use something else, and it's worth the time to make the switch, conductor supervise won't live on.

The road to production

So that's the setup, but moving our apps into that stack didn't happen overnight.

Our earliest attempts were at the end of last year (September/October 2015). We started with non-critical batch processes at first - giving ourselves space to learn from failure. Gradually, we were able to ramp up to running more critical asynchronous workers. By December we were serving a portion of live traffic for some of our services from the new stack.

We spent January and February porting the rest of our services over, and adjusting our setup as we learned more.

By early March we had everything working on the new stack, and on the 11th we shut down the last of our traditional app servers. 🎉

Many ways to get to Rome

So here we are, 3 months after completing the move to the new infrastructure. Overall, we've been happy with the results. What we built hits the mark on the goals we mentioned earlier. Since the move, we've seen:

  • more frequent upgrades of Ruby - now that the busy-work is minimal, people have been more inclined to make the jump to newer versions
  • more small internal services deployed - previously we'd held back on these because of the per-app operational burden
  • faster, more reliable deployments - now that we do most of the work up-front, in an artifact build, deployment is a simpler step

So should you rush out and implement something like this? It depends.

The world of deployment and orchestration is moving rapidly right now, and with that comes a lot of excitement and blog posts. It's very easy to get swept along and feel that you need to do something because a company you respect does it. Maybe you would benefit from a distributed scheduler such as Mesos. Perhaps the container-based systems are too immature and fast-moving, and you'd prefer to use full-on virtual machine (VM) images as your deployment primitive. It's going to vary from team to team.

Even if you decide that you want broadly similar things to us, there are multiple ways to get there. Before we finish, let's look at a couple of them.

A VM option

There are plenty of hosting providers that support taking a snapshot of a machine, storing it as an image, and launching new instances from it. Packer is a tool that provides a way to build those images from a template and works with a variety of providers (AWS, Digital Ocean, etc - it can even build Docker images now).

Once you have that, you need something to bring up those VMs in the right quantities, and update load balancers to point to the right places. Terraform is a tool that handles this, and has been gaining a lot of popularity recently.

With this approach you sidestep the pitfalls of the rapidly-changing container landscape, but still get the benefits of image-based deployments.

A different container option

Docker has certainly been centre stage when it comes to container runtimes, but there are others out there. One which provides an interesting contrast is rkt.

Docker, with its daemon model, assumes responsibility for parenting, supervising, and restarting container processes if they fail. In contrast, rkt doesn't have a daemon. The rkt command line tool is designed to be invoked and supervised by something else10.

Lately, a lot of Linux distributions have been switching to systemd for their default init process11. systemd brings a richer process supervision and notification model than many previous init systems. With it comes a new question of boundaries and overlap - is there a reason to supervise containerised processes in a different way to the rest of the processes on a machine? Is Docker's daemon-based approach still worthwhile, or does it end up getting in the way? I think we'll see these questions play out over the next year or two.

There's less contrast when it comes to images. There's the acbuild tool if you want to build rkt-compatible images directly, and rkt also cleverly supports Docker images: conversion is built in via the docker2aci tool, which means you can continue to use Docker's build tools and Dockerfile.

So...what's next?

We mentioned earlier that deployment and orchestration of services are fast-moving areas right now. It's definitely an exciting time - one that should see some really solid options stabilise over the next few years.

As for what to do now? That's tough. There's no one answer. If you're comfortable being an early-adopter, ready for the extra churn that comes with that, then you can go ahead and try out some of the newer tooling. If that's not for you, the virtual machine path is more well-established, and there's no shame in using proven technology.

To sum up:

  • start by thinking about the problems you have and avoid spending time on ones you don't have
  • don't feel you have to change all of your tooling at once
  • remember the tradeoff between the promise of emerging tools and the increased churn they bring

If you'd like to ask us questions, we'll be around on @GoCardlessEng on Twitter.

Thanks for reading, and good luck!

  1. If the Kernel's own docs are more your thing, you can read the man pages for unshare (namespaces) and cgroups

  2. There might be transitory issues with the gem server, or worse, the gem version you depend on might have been yanked. 

  3. To give an example of how our existing deployments weren't like this, we'd encountered situations where upgrading a native library would cause a bundle install to fail on a dependency with native extensions. The simplest way out was to move the existing bundle away, and rebuild the bundle from scratch - an inconvenient, error-prone workaround. 

  4. We were already using Chef, and didn't feel a strong need to introduce something new. 

  5. If this changes, we'll likely introduce one of etcd, Consul, or Zookeeper as a central store. 

  6. For example, if a Rails initialiser requires an environment variable to be present, and bails out early if it's missing. 

  7. And perhaps this shouldn't be part of Docker's responsibility. While we'd like it, it's completely fair that they've not added this. 

  8. Previously, our apps used templated config rather than environment variables, and didn't know how to log to anything other than a file. There was a fair amount of work in preparing our apps for a new, more 12-factor-like world. It ain't all glamorous! 

  9. conductor supervise didn't exist until the start of February. 

  10. The rkt docs have a section contrasting their approach to that of other runtimes. 

  11. The latest long-term support edition of Ubuntu, 16.04, ships with it, and many other distributions have also made the switch. 

Sound like something you'd enjoy?
Join our team

An introduction to our API

The GoCardless API allows you to manage Direct Debit payments via your own website or software. When a customer signs up for your services they can give you a Direct Debit authorisation online. Your integration can then create and manage payments and subscriptions automatically - there’s no need to manually add a new customer to GoCardless. Our API provides you with full flexibility when it comes to payment creation, and we offer it to all of our merchants at no extra cost.

In this blog post we’ll guide you through the steps needed to use our API, from customer creation to taking your first payment.

Let’s look at how Direct Debit payments work and how the GoCardless API is organised. In order to charge a customer’s bank account, you will first need their authorisation to collect payments via Direct Debit. This can be via our secure online payment pages or, if you’re using GoCardless Pro, you can take complete control of the payment process by hosting the payment pages on your own website.


Using GoCardless, the process of creating a new customer is as follows:

  1. You direct your customers to the GoCardless payment page, allowing them to complete the authorisation to take payments from their account.
  2. Once complete, we redirect your customers back to your website. We’ve called this the redirect flow. When the customer is returned to your website, the redirect flow will already have created a customer record on the GoCardless system. Associated with the customer record will be a customer bank account, which itself will be associated with a mandate.
  3. You can now create payments and subscriptions against this mandate.

GoCardless Pro

If you host your own payment pages your clients will never have to leave your website to give you a Direct Debit authorisation.

  1. You use our API to create a customer record, followed by a customer bank account which is linked to the customer.
  2. Next you create a mandate by referencing the customer bank account.
  3. You can now create payments and subscriptions against this mandate.

Example requests

Now that we’ve covered the basics let’s look at the actual requests to the API. In order for you to follow these steps you will need the following:

  • A GoCardless sandbox account, get one here
  • An access token to use the API, create one here

In order to send an HTTP request to our API you will first need to set the URL where you want the request sent. The base URLs for the GoCardless API are:

  • https://api.gocardless.com for live
  • https://api-sandbox.gocardless.com for sandbox

As we're using the sandbox, we'll use https://api-sandbox.gocardless.com, followed by the endpoint you want to send a request to. You will also need to specify whether you're sending a POST (sending information) or a GET (requesting information) request, and you will need to set the headers. Our API requires several headers to be set:

  • Authorization uses the access token you’ve created in the developer settings, preceded by the word Bearer.
  • Accept tells the API that you’re expecting data to be sent in the JSON format. This needs to be application/json.
  • GoCardless-Version specifies the version of the API you want to use, given as a date (the examples in this post use 2015-07-06).

If you’re sending data to us, for example to create a new payment, you’ll also need to specify the content type:

  • Content-Type specifies the format of the content sent to the API (if any). This needs to be application/json.

An example request to our customers endpoint to list all customers on an account using curl would look like this:

curl https://api-sandbox.gocardless.com/customers \
-H "Authorization: Bearer ZQfaZRchaiCIjRhSuoFr6hGrcrAEsNPWI7pa4AaO" \
-H "Accept: application/json" \
-H "GoCardless-Version: 2015-07-06"
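If you're working from Ruby rather than the command line, the same request can be built with the standard library's Net::HTTP. A minimal sketch (assuming the sandbox base URL https://api-sandbox.gocardless.com) that constructs the request and its headers without actually sending it:

```ruby
require 'net/http'
require 'uri'

# Build (but don't send) a GET request to the customers endpoint,
# with the same headers as the curl example above.
uri = URI("https://api-sandbox.gocardless.com/customers")
request = Net::HTTP::Get.new(uri)
request["Authorization"] = "Bearer ZQfaZRchaiCIjRhSuoFr6hGrcrAEsNPWI7pa4AaO"
request["Accept"] = "application/json"
request["GoCardless-Version"] = "2015-07-06"

# To actually send it:
# response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
#   http.request(request)
# end
```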

Creating a customer using the redirect flow

To send your customer to the GoCardless payment pages you will need to create a redirect flow. This will be a POST request, and the redirect flow endpoint takes two required parameters, plus an optional one:

  • session_token This is used as an identifier allowing you to link the redirect flow to the respective customer in your integration. You could use the customer's email address or generate a random ID for this - it’s how you will identify this customer when they’re returned to your site after authorising payments.
  • success_redirect_url This is the URL we redirect the customer to when they complete the payment pages.
  • description (optional) This will be shown to the customer when they’re on our payment page.

These parameters will need to be sent with the request in a JSON blob, wrapped in a redirect_flows envelope:

curl https://api-sandbox.gocardless.com/redirect_flows \
-H "Authorization: Bearer ZQfaZRchaiCIjRhSuoFr6hGrcrAEsNPWI7pa4AaO" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "GoCardless-Version: 2015-07-06" \
-d '{
  "redirect_flows": {
    "description": "Magazine subscription",
    "session_token": "session_ca853718-99ea-4cfd-86fd-c533ef1d5a3b",
    "success_redirect_url": "http://localhost/success"
  }
}'

The response from the API:

{
   "redirect_flows": {
      "id": "RE00005H8602K9J5C9V77KQAMHGH8FDB",
      "description": "Magazine subscription",
      "session_token": "session_ca853718-99ea-4cfd-86fd-c533ef1d5a3b",
      "scheme": null,
      "success_redirect_url": "http://localhost/success",
      "created_at": "2016-06-29T13:28:10.282Z",
      "links": {
         "creditor": "CR000035V20049"
      },
      "redirect_url": ""
   }
}
The response shows the redirect_url for the newly created redirect flow. An HTTP 303 redirect (or an alternative redirect method) can be used to send your customer to our payment pages. This should be done immediately, as the redirect link expires after 30 minutes. The customer will then see the GoCardless payment page and can enter their details to authorise a Direct Debit to be set up.

Once the form is complete, we will redirect the customer back to the success_redirect_url you originally specified and append the parameter redirect_flow_id like this: http://localhost/success?redirect_flow_id=RE00005H8602K9J5C9V77KQAMHGH8FDB.
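Back on your server, you'll need to read that redirect_flow_id out of the query string. A minimal Ruby sketch using only the standard library:

```ruby
require 'uri'
require 'cgi'

# The URL the customer lands back on after completing the payment pages.
return_url = "http://localhost/success?redirect_flow_id=RE00005H8602K9J5C9V77KQAMHGH8FDB"

# CGI.parse returns a Hash of param name => array of values.
params = CGI.parse(URI(return_url).query)
redirect_flow_id = params["redirect_flow_id"].first
```

In a real integration your web framework will usually expose this for you (e.g. `params[:redirect_flow_id]` in Rails).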

In order for the API to know that the customer has been returned safely back to your integration you will need to complete the redirect flow by sending the following request to the API. This is a mandatory step and the customer won’t be set up if this is not completed.

curl https://api-sandbox.gocardless.com/redirect_flows/RE00005H8602K9J5C9V77KQAMHGH8FDB/actions/complete \
-H "Authorization: Bearer ZQfaZRchaiCIjRhSuoFr6hGrcrAEsNPWI7pa4AaO" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "GoCardless-Version: 2015-07-06" \
-d '{
  "data": {
    "session_token": "session_ca853718-99ea-4cfd-86fd-c533ef1d5a3b"
  }
}'

Notice that the ID of the redirect flow and the required action were appended to the URL, and the session_token (as set by your integration when creating the redirect flow) was sent in the body of the request.

The response from the API:

{
   "redirect_flows": {
      "id": "RE00005H8602K9J5C9V77KQAMHGH8FDB",
      "description": "Magazine subscription",
      "session_token": "session_ca853718-99ea-4cfd-86fd-c533ef1d5a3b",
      "scheme": null,
      "success_redirect_url": "http://localhost/success",
      "created_at": "2016-06-29T13:49:00.077Z",
      "links": {
         "creditor": "CR000035V20049",
         "mandate": "MD0000TWJWRFHG",
         "customer": "CU0000X30K4B9N",
         "customer_bank_account": "BA0000TCWMHXH3"
      }
   }
}

The customer’s details have now been saved, and GoCardless will take care of setting up an authorisation to collect payments from their bank account. You’ll use the mandate ID (provided in the links) to create payments and subscriptions, so you’ll want to store that ID in your database. You may find it useful to store the other references to your customer's resources in your database as well.

Creating a payment will be just one more call to the API, using the payments endpoint. A quick look into the developer documentation shows the three required parameters:

  • amount The payment amount, given in pence/cents. So to take £10.00 the value would be 1000
  • currency The currency of the payment you’re taking
  • links[mandate] The mandate that should be charged

Another helpful parameter is charge_date, which specifies when the payment leaves the customer’s bank account. If no charge_date is provided, the payment will be charged on the earliest possible date.
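Since amount is an integer given in pence, it's easy to introduce rounding bugs when converting from a decimal currency value. A small Ruby sketch of building the request body (pounds_to_pence is our own illustrative helper, not part of the API or its client libraries):

```ruby
require 'json'

# Convert a decimal pounds value into the integer pence amount the API expects.
def pounds_to_pence(pounds)
  (pounds * 100).round
end

body = {
  payments: {
    amount: pounds_to_pence(10.00), # £10.00 => 1000
    currency: "GBP",
    links: { mandate: "MD0000TWJWRFHG" }
  }
}.to_json
```

This produces the same JSON blob as the curl request below.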

curl https://api-sandbox.gocardless.com/payments \
-H "Authorization: Bearer ZQfaZRchaiCIjRhSuoFr6hGrcrAEsNPWI7pa4AaO" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "GoCardless-Version: 2015-07-06" \
-d '{
  "payments": {
    "amount": 1000,
    "currency": "GBP",
    "links": {
      "mandate": "MD0000TWJWRFHG"
    }
  }
}'

The response from the API:

{
   "payments": {
      "id": "PM0001G6V7BSN4",
      "created_at": "2016-07-01T09:27:52.352Z",
      "charge_date": "2016-07-06",
      "amount": 1000,
      "description": null,
      "currency": "GBP",
      "status": "pending_submission",
      "amount_refunded": 0,
      "reference": null,
      "links": {
         "mandate": "MD0000TWJWRFHG",
         "creditor": "CR000035V20049"
      }
   }
}

You have now set up a customer and charged your first payment using the GoCardless API!

The API offers you many more options, allowing you to integrate Direct Debit functionality into your existing website or software.

If you’re using PHP, Java, Ruby or Python you can also make use of our client libraries. To get started, check out our easy "getting started" guide with copy and paste code samples.

Any API-related questions or feedback can be sent to our developer support team at [email protected].


Update on the GoCardless service outage


On Thursday, we experienced an outage of 1 hour and 40 minutes that affected all our services. During that time you may have seen brief service recovery but for the most part our service was unavailable.

What happened

On Tuesday 19th, at 22:00 UTC, we received a notification from our infrastructure provider scheduling emergency maintenance in one of their Amsterdam datacentres for the next day at 23:00 UTC.

They were hoping to fix a recurring issue with one of the routers by performing a reboot of the chassis pair. As part of this maintenance they also planned to perform firmware upgrades, for an estimated total downtime of approximately 20 minutes.

Since our outage in July 2015 we have been working on getting each component of our infrastructure to span multiple data centres so that we can gracefully handle this kind of datacentre-wide failure. All the infrastructure on the critical path for our service was, as of January 21st, available in multiple locations, except for our database.

On Wednesday 20th, in the morning, the Platform Engineering team gathered and put together a plan to migrate the database to a new datacentre before the start of the maintenance.

On Wednesday 20th, at 22:00 UTC, we decided that the plan could not be completed before the maintenance began. During the day, several complications had prevented us from properly testing the plan in our staging environment. We made the call to take the full 20 minutes of downtime rather than cause more disruption by executing a plan that we could not fully prepare.

On Thursday 21st, at 00:25 UTC, our provider performed the reboot of the chassis, which instantly caused the GoCardless API and Pro API to be unavailable.

At 00:49 UTC, they announced that the maintenance was over. Our services started slowly recovering.

At 00:50 UTC, our monitoring system alerted us that some of our services were still unavailable. We immediately started troubleshooting.

At 01:00 UTC, a pre-scheduled email announcing the end of the maintenance window was sent in error.

At 02:00 UTC, we discovered the issue. Our database cluster is set up in such a way that any write to the database, for example creating a payment, needs to be recorded on two servers before we say it’s successful. Unfortunately, during the maintenance, the link between our primary and our standby server broke. That meant that no write transactions could go through until that link was restored. Also, since the writes were blocking, we quickly exhausted our connection slots and read requests started failing too. After a few seconds the writes would time out and we would start having successful read requests again. That explains why we were sporadically up during that time. Once we brought our cluster back together, requests started flowing through again.

By 02:05 UTC, all services were fully operational.

Final words

We are following up with our infrastructure provider to figure out why they had to perform maintenance in that datacentre at such short notice.

We have now provisioned a standby cluster in another datacentre so that we are able to migrate to a new datacentre with very little downtime in case this happens again. Long term, as we said in our last blog post, we have plans for how we can further minimise disruption in the case of datacentre-wide failure. This project is the main focus of the Platform Engineering team for the next six months.


Zero-downtime Postgres migrations - a little help

We're pleased to announce the release of ActiveRecord::SaferMigrations, a library to make changing the schema of Postgres databases safer. Interested how? Read on.

Previously, we looked at how seemingly safe schema changes in Postgres can take your site down. We ended that article with some advice, and today we want to make that advice a little easier to follow.

A recap

In a nutshell, there are some operations in Postgres that take exclusive locks on tables, causing other queries involving those tables to block until the exclusive lock is released1. Typically, this sort of operation is run infrequently as part of a deployment which changes the schema of your database.

For the most part, these operations are fine as long as they execute quickly. As we explored in our last post, there's a caveat to that - if the operation has to wait to acquire its exclusive lock, all queries which arrive after it will queue up behind it.

You typically don't want to block the queries from your app for more than a few hundred milliseconds, maybe a second or two at a push2. Achieving that means reading up on locking in Postgres, and being very careful with those schema-altering queries. Make a mistake and, as we found out, your next deployment stops your app from responding to requests.

How we reacted

In the months following our lock-related outage, we became extra careful about every schema migration we ran.

We spent longer in code reviews. We pushed most schema changes outside of high traffic hours. We got more reliant on a few people's Postgres knowledge to greenlight even simple schema changes.

This isn't how we like running things. We'd ended up with a clunky process, and it was hampering our ability to build the product, so we started looking for a way out.

A better solution

Right at the end of the last blog post, we mentioned a Postgres setting called lock_timeout, which limits the amount of time Postgres will wait for a query to acquire a lock (and consequently how long other queries will queue up behind it).

As well as lock_timeout, there's statement_timeout, which sets a cap on how long an individual statement can hold a lock for. For schema migrations, this protects you from queries that take an exclusive lock on the table and hold it for a long time while updating rows (e.g. ALTER TABLE foos ADD COLUMN bar varchar DEFAULT 'baz' NOT NULL).

We decided that we wanted lock_timeout and statement_timeout to automatically be set to reasonable values for every migration3.

We wrote a library which does just that for ActiveRecord (the Rails ORM) migrations being run against a Postgres database4. Today, we're open sourcing it.

Introducing ActiveRecord::SaferMigrations

ActiveRecord::SaferMigrations automatically sets lock_timeout and statement_timeout before running migrations. The library comes with default values for both settings, but you can change them if your app has different constraints.

What this means is that each statement in your migration will spend no longer than:

  • 750ms waiting to acquire a lock
  • 1500ms running a single query

The default values can be changed globally or overridden per-migration, and we have a simple rule at GoCardless - if you want to override them then you need to explain in your pull request why it's safe to do so5.
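We won't reproduce the gem's internals here, but the effect boils down to issuing a couple of SET statements on the connection before the migration runs. A rough Ruby sketch of the idea (timeout_statements is a hypothetical helper, not the gem's actual interface):

```ruby
# Generate the SET statements that cap lock waits and statement runtime.
# The defaults mirror the library's 750ms / 1500ms values mentioned above.
def timeout_statements(lock_timeout_ms: 750, statement_timeout_ms: 1500)
  [
    "SET lock_timeout TO '#{lock_timeout_ms}ms'",
    "SET statement_timeout TO '#{statement_timeout_ms}ms'"
  ]
end

# Each statement would then be executed on the migration's connection, e.g.
# timeout_statements.each { |sql| connection.execute(sql) }
```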

If either of those limits is hit, Postgres will return an error, and you'll have to retry your migration later. Inconvenient, but delaying the deployment of a new feature beats taking the service down!

Of course, there's a 'but'

That's all good, but we're left with one problem - transactions. Let's start with a couple of facts about transactions:

  • When a lock is acquired inside a transaction, it is held until that transaction either commits or rolls back.
  • By default, ActiveRecord wraps every migration in a transaction6.

You can see where this is going. Every statement in a transaction might run faster than the limits we impose, but if the transaction is sufficiently large, the locks acquired by the first statement could be held for much longer than the service can tolerate.

There's no easy way out of this one, no transaction_timeout for us to fall back on. You've just got to make sure you keep your transactions short.

We've found the easiest solution is to split large schema migrations into multiple files so that ActiveRecord runs each part in a separate transaction. This means it's easy to retry deployments which fail7 and the risk of blocking app queries for too long is minimal.

What happened next

We've been running all our migrations this way for about 2 months now, and it's gone well. On a few occasions, the timeouts have prevented locking issues from taking our API down. Having sensible defaults and letting Postgres enforce them has made routine schema changes feel routine again.

It's a small library, but it's had a big impact for our team. We hope it'll do the same for you.

  1. Almost all DDL statements are like this. The topic is well covered in our original post, a post by Braintree and in the Postgres docs themselves

  2. Everyone's constraints are different. For some products, it's fine to have small periods of unavailability. For a payments API, that's costly. 

  3. What's reasonable for us might not be reasonable for you. Maybe a second is too long a pause, or maybe you're fine with 5. 

  4. While our library is specific to ActiveRecord, what it does isn't. You can use the same principle with other systems which manage schema migrations. 

  5. It's fairly common to override statement_timeout. There are some operations (e.g. CREATE INDEX CONCURRENTLY) which are safe, but not fast. 

  6. You can disable this behaviour with disable_ddl_transaction!, but often you don't want to. If a non-transactional migration bails out half way through, your schema is left in an intermediate state, and the migration won't apply unless you go in and revert those partial changes. 

  7. Unlike with disable_ddl_transaction!, this approach can't leave you with a partially-applied migration. 


The Troubleshooting Tales: issues scaling Postgres connections

After making some changes to our Postgres setup, we started noticing occasional errors coming from deep within ActiveRecord (Rails’ ORM). This post details the process we went through to determine the cause of the issue, and what we did to fix it.

The situation

First, it’s important to understand the changes we made to our Postgres setup. Postgres connections are relatively slow to establish (particularly when using SSL), and on a properly-tuned server they use a significant amount of memory. The amount of memory used limits the number of connections you can feasibly have open at once on a single server, and the slow establishment encourages clients to maintain long-lived connections. Due to these constraints, we recently hit the limit of connections our server could handle, preventing us from spinning up more application servers. To get around this problem, the common advice is to use connection pooling software such as PgBouncer to share a small number of Postgres connections between a larger number of client (application) connections.

When we first deployed PgBouncer, we were running it in “session pooling” mode, which assigns a dedicated Postgres server connection to each connected client. However, with this setup, if you have a large number of idle clients connected to PgBouncer you’ll have to maintain an equal number of (expensive) idle connections on your Postgres server. To combat this, there is an alternative mode: “transaction pooling”, which only uses a Postgres server connection for the duration of each transaction. The downside of transaction pooling is that you can’t use any session-level features (e.g. prepared statements, session-level advisory locks). After combing through our apps to remove all usages of session-level features, we enabled transaction pooling.
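For illustration, switching between the two modes is a single setting in pgbouncer.ini; a minimal fragment might look like this (the hostname, database name and pool size are made up):

```ini
[databases]
; Clients connect to PgBouncer as "myapp"; it forwards to the real server.
myapp = host=10.0.0.5 port=5432 dbname=myapp_production

[pgbouncer]
listen_port = 6432
; "session" dedicates a server connection to each client connection;
; "transaction" hands one out only for the duration of each transaction.
pool_mode = transaction
default_pool_size = 20
```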

Shortly after making the switch, we started seeing (relatively infrequent) exceptions coming from deep within ActiveRecord: NoMethodError: undefined method 'fields' for nil:NilClass. We also noticed that instances of this exception appeared to be correlated with INSERT queries that violated unique constraints.

Investigating the problem

Some initial digging indicated that on executing certain queries, the async_exec method in the Ruby Postgres driver was returning nil, rather than PG::Result as ActiveRecord was expecting. To get a better sense of what could be causing this, we set about finding a way to reliably reproduce the exception.

We set up a test database cluster that matched our production setup (see image above), and wrote a script that used the Ruby Postgres driver to issue lots of unique-constraint-violating queries in parallel, using one connection per thread. No dice - we didn't see any exceptions. Next, we tried introducing a generic connection pooling library, so we started sharing connections between threads. Again, it all worked as expected. Finally, we swapped out the connection pooling library for ActiveRecord, and we were immediately able to reproduce the exception.

Given that we had just switched over to transaction-pooling mode, we became curious about whether wrapping the INSERT in a transaction would change anything. We tried issuing a BEGIN, followed by the constraint-violating insert, then a COMMIT (sent to the database server one at a time), and the error persisted. However, when we wrapped it all up into one string: BEGIN; INSERT …; COMMIT, the error suddenly stopped occurring. Something didn’t seem right - wrapping a single statement in a transaction should have no effect at all. We then tried running SELECT 1; INSERT … and found that it had the same effect: the error went away. So this “fix” actually generalised to command strings that included multiple statements. Most peculiar.

To shed more light on why the issue happened with ActiveRecord but not other libraries, we turned to tcpdump to get a more complete view of what was going on. We quickly noticed that when using ActiveRecord, a load of extra queries were being sent over the wire. These queries all seemed to be changing session-level settings, and session-level settings don't play nicely with transaction pooling: each transaction (or query, when not in a transaction) may be sent to a different Postgres connection, so modifying a session-level setting changes it on a random connection, and won't necessarily affect successive queries, which may be sent to different connections entirely.

Looking over the settings one-by-one, most seemed pretty innocuous1 - ensuring that the right timezone was set, making sure “standard_conforming_strings” were being used, etc. Then we spotted a query that was setting client_min_messages to PANIC. client_min_messages determines which messages are reported back to the client. Usually it’s set to NOTICE. Postgres considers unique constraint violations to be ERRORs, which are below PANIC so were not being reported. The Ruby Postgres driver was issuing a query, and expecting either a normal result or an error. However, because errors were disabled, it was getting nothing back, causing it to return nil. Finally, we found the issue!

The resolution

But why was ActiveRecord disabling errors? It turns out that in Postgres 8.1 standard_conforming_strings was read-only, and any attempt to set it would result in an error. ActiveRecord enabled the setting if it was available, but didn't want to surface an error if it wasn't available or was read-only. Its workaround was to set client_min_messages to PANIC, then try to set standard_conforming_strings, then reset client_min_messages back to its original value. Fortunately, Rails has since dropped support for Postgres 8.1, so the fix was easy: simply remove the queries that modify client_min_messages and assume that standard_conforming_strings isn't read-only. Our patch that changes this has been present in Rails since v4.2.5.

  1. Even though most of the settings appear innocuous, they could still cause issues when combined with transaction pooling. For instance, you could end up with different connections in different timezones. We've brought this up on the rubyonrails-core list, but as of Dec '15 it's still an unresolved issue. 


In search of performance - how we shaved 200ms off every POST request

While doing some work on our Pro dashboard, we noticed that search requests were taking around 300ms. We've got some people in the team who have used Elasticsearch for much larger datasets, and they were surprised by how slow the requests were, so we decided to take a look.

Today, we'll show how that investigation led to a 200ms improvement on all internal POST requests.

What we did

We started by taking a typical search request from the app and measuring how long it took. We tried this with both Ruby's Net::HTTP and from the command line using curl. The latter was visibly faster. Timing the requests showed that the request from Ruby took around 250ms, whereas the one from curl took only 50ms.

We were confident that whatever was going on was isolated to Ruby1, but we wanted to dig deeper, so we moved over to our staging environment. At that point, the problem disappeared entirely.

For a while, we were stumped. We run the same versions of Ruby and Elasticsearch in staging and production. It didn't make any sense! We took a step back, and looked over our stack, piece by piece. There was something in the middle which we hadn't thought about - HAProxy.

We quickly discovered that, due to an ongoing Ubuntu upgrade2, we were using different versions of HAProxy in staging (1.4.24) and production (1.4.18). Something in those 6 patch revisions was responsible, so we turned our eyes to the commit logs. There were a few candidates, but one patch stood out in particular.

We did a custom build of HAProxy 1.4.18, with just that patch added, and saw request times drop by around 200ms. Job done.

Under the hood

Since this issue was going to be fixed by the Ubuntu upgrades we were doing, we decided it wasn't worth shipping a custom HAProxy package. Before calling it a day, we decided to take a look at the whole request cycle using tcpdump, to really understand what was going on.

What we found was that Ruby's Net::HTTP splits POST requests across two TCP packets - one for the headers, and another for the body. curl, by contrast, combines the two if they'll fit in a single packet. To make things worse, Net::HTTP doesn't set TCP_NODELAY on the TCP socket it opens, so it waits for acknowledgement of the first packet before sending the second. This behaviour is a consequence of Nagle's algorithm.
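The TCP_NODELAY behaviour is easy to observe from Ruby itself. Here's a self-contained sketch over a loopback connection showing how the option is set and read back - this is the option Net::HTTP leaves unset:

```ruby
require 'socket'

# Open a loopback connection and disable Nagle's algorithm on the client side.
server = TCPServer.new('127.0.0.1', 0)            # port 0 = pick any free port
client = TCPSocket.new('127.0.0.1', server.addr[1])

client.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY, 1)

# Read the option back to confirm it took effect (non-zero means enabled).
nodelay = client.getsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY).int

client.close
server.close
```

With TCP_NODELAY set, small writes go out immediately instead of waiting for the previous packet to be acknowledged.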

Moving to the other end of the connection, HAProxy has to choose how to acknowledge those two packets. In version 1.4.18 (the one we were using), it opted to use TCP delayed acknowledgement.

Delayed acknowledgement interacts badly with Nagle's algorithm, and causes the request to pause until the server reaches its delayed acknowledgement timeout3.

HAProxy 1.4.19 adds a special case for incomplete HTTP POST requests - if it receives a packet which only contains the first part of the request, it enables TCP_QUICKACK on the socket, and immediately acknowledges that packet.

More than just search

Having understood what was happening, we realised the fix had a far wider reach than our search endpoint. We run all of our services behind HAProxy, and it's no secret that we write a lot of Ruby. This combination meant that almost every POST request made inside our infrastructure incurred a 200ms delay. We took some measurements before and after rolling out the new version of HAProxy:

POST /endpointA
average (ms/req) before HAProxy upgrade: 271.13
average (ms/req) after HAProxy upgrade: 19.08

POST /endpointB
average (ms/req) before HAProxy upgrade: 323.78
average (ms/req) after HAProxy upgrade: 66.47

Quite the improvement!


Even though the fix was as simple as upgrading a package, the knowledge we gained along the way is invaluable in the long-term.

It would have been easy to say search was fast enough and move on, but by diving into the problem we got to know more about how our applications run in production.

Doing this work really reinforced our belief that it's worth taking time to understand your stack.

  1. We used the same query in each request, and waited for the response time to settle before measuring. We also tried Python's requests library, and it performed similarly to curl

  2. At the time, we were trialling Ubuntu 14.04 in our staging environment, before rolling it out to production. 

  3. On Linux the timeout is around 200ms. The exact value is determined by the kernel, and depends on the round-trip-time of the connection

Sound like something you'd enjoy?
We're hiring

Friday's outage post mortem

The below is a post-mortem on downtime experienced on Friday 3rd July. It was sent around the GoCardless team and is published here for the benefit of our integrators.

Any amount of downtime is completely unacceptable to us. We're sorry to have let you, our customers, down.


On Friday 3rd July 2015, we experienced three separate outages over a period of 1 hour and 35 minutes. The outages affected our API and our Pro API, as well as the dashboards powered by these APIs.

The first incident was reported at 16:37 BST and all our services were fully up and stable at 18:12 BST.

We apologise for the inconvenience caused. More importantly, we are doing everything we can to make sure this can't happen again.


On Friday 3rd July 2015, at 16:35 BST, we started experiencing network failures on our primary database cluster in SoftLayer’s Amsterdam datacentre. After several minutes of diagnosis it became evident that SoftLayer were having problems within their internal network. We were experiencing very high latency when troubleshooting and seeing high packet loss between servers in our infrastructure. We immediately got in touch with SoftLayer’s technical support team and they confirmed they were having issues with one of their backend routers, which was causing connectivity issues in the whole datacentre.

At 16:37 BST, all the interfaces on SoftLayer’s backend router went down. This caused our PostgreSQL cluster to be partitioned since all nodes lost connectivity to one another. As a result, our API and Pro API became unavailable, as well as the dashboards using these APIs. At 16:39 BST the network came back online and we started repairing our database cluster.

By 16:44 BST, we brought our PostgreSQL cluster up and normal service resumed.

At 17:07 BST, the interfaces on SoftLayer's router flapped again, causing our service to be unavailable. This time one of the standby nodes in our cluster became corrupted. While some of our team worked on repairing the cluster, the remainder started preparing to fail over our database cluster to another datacentre.

By 17:37 BST, all of SoftLayer’s internal network was working properly again. We received confirmation from SoftLayer that the situation was entirely mitigated. We came to the conclusion that at this point, transitioning our database cluster to a different datacentre would cause unnecessary further disruption.

At 18:03 BST, we saw a third occurrence of the internal network flapping, which caused our service to be unavailable again for a short period of time.

By 18:12 BST, our API, Pro API and dashboards were fully operational.

Post mortem

We don't yet have any further details from SoftLayer on the cause of the router issues, but we are awaiting a root cause analysis.

Currently, our PostgreSQL cluster automatically fails over to a new server within the same datacentre in case of any single-node failure. Following these events we have brought forward work to completely automate this failover across datacentres so that we can recover faster from datacentre-wide issues.

New API Version - 2015-07-06

Version 2015-07-06 is released today, with the following changes:

  • Removes /helpers endpoint from the API
  • Moves PDF mandate generation to a new mandate_pdfs endpoint
  • Renames subscription start_at and end_at to start_date and end_date
  • Enforces date format when passing a payment charge_date

For the majority of integrations, the upgrade will be extremely simple, but we will continue our support for v2015-04-29 until the 6th of January, 2016.


Upgrading to the new API version should be extremely simple:

  1. Update your version header:

    GoCardless-Version: 2015-07-06
  2. If you use the subscriptions endpoint, update your integration to use start_date and end_date keys instead of start_at and end_at.

  3. If you generate PDF mandates, update your integration to use the new mandate_pdfs endpoint.

  4. If you use the old /helpers/modulus_check endpoint, update your code to use the new bank_details_lookups endpoint.

Why are GoCardless making these changes?

The above changes achieve two improvements to the GoCardless API:

  1. Dates now all have an _date key, whilst timestamps all have an _at key.

  2. All endpoints act as first class resources. Previously the /helpers endpoints were inconsistent with the rest of the API, making them harder to use.

Off the back of these changes we will release version 1.0 of our Java and Ruby client libraries this week. Python and PHP will follow shortly afterwards.

Need help upgrading or have any questions?
Get in touch

Coach: An alternative to Rails controllers

Today we're open sourcing Coach, a library that removes the complexity from Rails controllers. Bundle your shared behaviour into highly robust, heavily tested middlewares and rely on Coach to join them together, providing static analysis over the entire chain. Coach ensures you only require a glance to see what's being run on each controller endpoint.

At GoCardless we've replaced all our controller code with Coach middlewares.

Why controller code is tricky

Controller code often suffers from hidden behaviour and tangled data dependencies, making it hard to read and difficult to test.

To keep your endpoints performant, you never want to run a cacheable operation more than once. This leads to memoizing your database queries, which raises the question of where to store the memoized data. If you want the rest of your controller code to be able to access it, storing it on the controller instance is the easiest way to make that happen.

In an attempt to reuse code, controller methods are then split out into controller concerns (mixins), which are included as needed. This leads to controllers that look skinny but have a large amount of included behaviour, all defined far from their call site.

Some of these implicitly defined methods are called in before_actions, making it even more unclear what code is being run when you hit your controllers. Inherited before_actions can lead to a controller that runs several methods before every action without it being clear which these are.

One of the first things we do when a request hits GoCardless is parse authentication details and pull a reference to that access token out of the database. As the request progresses, we make use of that token to scope our database queries, tag our logs and verify permissions. This sharing and reuse of data is what makes writing controller code complex.

So how does Coach help?

Coach rethinks this approach by building your controller code around the data needed for your request. All your controller code is built from Coach::Middlewares, which take a request context and decide on a response. Each middleware can opt to respond itself, or call the next middleware in the chain.

Each middleware can specify the data it requires from those that have run before it, and can declare what data it will pass to those that come after. This makes the flow of data explicit, and Coach will verify that the requirements have been met before you ever mount an endpoint.

Coach by example

The best way to see the benefits of Coach is with a demonstration...

Mounting an endpoint

class HelloWorld < Coach::Middleware
  def call
    # Middlewares return a Rack response
    [ 200, {}, ['hello world'] ]
  end
end

So we've created ourselves a piece of middleware, HelloWorld. As you'd expect, HelloWorld simply outputs the string 'hello world'.

In an example Rails app, called Example, we can mount this route like so...

Example::Application.routes.draw do
  match "/hello_world",
        to: Coach::Handler.new(HelloWorld),
        via: :get
end

Once you've booted Rails locally, the following should return 'hello world':

$ curl -XGET http://localhost:3000/hello_world

Building chains

Suppose we didn't want just anybody to see our HelloWorld endpoint. In fact, we'd like to lock it down behind some authentication.

Our request will now have two stages, one where we check authentication details and another where we respond with our secret greeting to the world. Let's split it into two pieces, one for each of the two subtasks, allowing us to reuse this authentication flow in other middlewares.

class Authentication < Coach::Middleware
  def call
    unless User.exists?(login: params[:login])
      return [ 401, {}, ['Access denied'] ]
    end

    next_middleware.call
  end
end

class HelloWorld < Coach::Middleware
  uses Authentication

  def call
    [ 200, {}, ['hello world'] ]
  end
end

Here we detach the authentication logic into its own middleware. HelloWorld now uses Authentication, and will only run if the Authentication middleware has called through to it.

Notice we also use params just like you would in a normal Rails controller. Every middleware class will have access to a request object, which is an instance of ActionDispatch::Request.

Passing data through middleware

So far we've demonstrated how Coach can help you break your controller code into modular pieces. The big innovation with Coach, however, is the ability to explicitly pass your data through the middleware chain.

An example usage here is to create a HelloUser endpoint. We want to protect the route by authentication, as we did before, but this time greet the user that is logged in. Making a small modification to the Authentication middleware we showed above...

class Authentication < Coach::Middleware
  provides :user  # declare that Authentication provides :user

  def call
    return [ 401, {}, ['Access denied'] ] unless user.present?

    provide(user: user)
    next_middleware.call
  end

  def user
    @user ||= User.find_by(login: params[:login])
  end
end

class HelloUser < Coach::Middleware
  uses Authentication
  requires :user  # state that HelloUser requires this data

  def call
    # Can now access `user`, as it's been provided by Authentication
    [ 200, {}, [ "hello #{user.login}" ] ]
  end
end

# Inside config/routes.rb
Example::Application.routes.draw do
  match "/hello_user",
        via: :get

Coach analyses your middleware chains whenever a new Handler is created. If any middleware requires :x when its chain does not provide :x, we'll error out before the app even starts.
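The kind of static check this implies can be sketched in a few lines: walk the chain in order and confirm that every `requires` is satisfied by an earlier `provides`. This is an illustration of the idea, not Coach's actual implementation or API:

```ruby
# Verify a middleware chain: each entry may only require data that an
# earlier entry provides. Raises before anything is mounted otherwise.
def verify_chain(chain)
  provided = []
  chain.each do |middleware|
    missing = middleware[:requires] - provided
    unless missing.empty?
      raise "#{middleware[:name]} requires #{missing.inspect}, which nothing provides"
    end
    provided |= middleware[:provides]
  end
  true
end

chain = [
  { name: "Authentication", requires: [],      provides: [:user] },
  { name: "HelloUser",      requires: [:user], provides: [] },
]
puts verify_chain(chain)  # true
```

Reordering the chain, or dropping Authentication, would make the check raise at load time rather than failing a request in production.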


Our problems with controller code were implicit behaviours, hidden data dependencies and, as a consequence of both, difficult testing. Coach tackles each of these, providing the framework to restructure our controllers into code that is easily understood and easily maintained.

Coach also hooks into ActiveSupport::Notifications, making monitoring performance of our API really easy. We've written a little adapter that sends detailed performance metrics up to Skylight, where we can keep an eye out for sluggish endpoints.

Coach is on GitHub. As always, we love suggestions, feedback, and pull requests!

We're hiring developers
See job listing

Prius: environmentally-friendly app config

We just open-sourced Prius, a simple library for handling environment variables in Ruby.

Safer environment variables

Environment variables are a convenient way of managing application config, but it's easy to misconfigure or forget them. This can cause big problems:

# If ENCRYPTION_KEY is missing, a nil encryption key will be used
encrypted_data = Crypto.encrypt(secret_data, ENV["ENCRYPTION_KEY"])

# If FOO_API_KEY is missing, this code will bomb out at run time
FooApi::Client.new(ENV.fetch("FOO_API_KEY")).make_request

Prius helps you guarantee that your environment variables are:

  • Present - an exception is raised if an environment variable is missing, so you can hear about it as soon as your app boots.
  • Valid - an environment variable can be coerced to a desired type (integer, boolean or string), and an exception will be raised if the value doesn't match the type.


# Load a required environment variable (GITHUB_TOKEN) into Prius.
Prius.load(:github_token)

# Use the environment variable.
Prius.get(:github_token)

# Load an optional environment variable:
Prius.load(:might_be_here_or_not, required: false)

# Load and alias an environment variable:
Prius.load(:short_name, env_var: "LONG_NAME_WE_HAVE_NO_CONTROL_OVER")

# Load and coerce an environment variable (or raise):
Prius.load(:my_flag, type: :bool)

How we use Prius

All environment variables we use are loaded as our app starts, so we catch config issues at boot time. If an app can't boot, it won't be deployed, meaning we can't release mis-configured apps to production.
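The fail-fast pattern can be illustrated without the gem itself. This is a minimal sketch of the idea, not Prius's implementation: resolve every required variable at boot and raise immediately if one is missing or mistyped.

```ruby
# Resolve a required environment variable at boot, optionally coercing
# it to a type, so misconfiguration surfaces before any request is served.
def load_env!(name, type: :string)
  raw = ENV.fetch(name) { raise "Missing required env var: #{name}" }
  case type
  when :int  then Integer(raw)  # raises ArgumentError on e.g. "80x"
  when :bool then
    raise "#{name} must be true or false" unless %w[true false].include?(raw)
    raw == "true"
  else raw
  end
end

ENV["PORT"] = "3000"
puts load_env!("PORT", type: :int)  # 3000
```

Calling this for every variable in an initializer means a bad deploy fails to boot, which is exactly the behaviour described above.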

We check a file of dummy config values into source control, which makes running the app in development and test environments easier. Dotenv is used to automatically load this in non-production environments.


Safely retrying API requests

Today we're announcing support for idempotency keys on our Pro API, which make it safe to retry non-idempotent API requests.

Why are they necessary?

Here's an example that illustrates the purpose of idempotency keys.

You submit a POST request to our /payments endpoint to create a payment. If all goes well, you'll receive a 201 Created response. If the request is invalid, you'll receive a 4xx response, and know that the payment wasn't created. But what if something goes wrong our end and we issue a 500 response? Or what if there's a network issue that means you get no response at all? In these cases you have no way of knowing whether or not the payment was created. This leaves you with two options:

  • Hope the request succeeded, and take no further action.
  • Assume the request failed, and retry it. However, if the request did succeed you'll end up with a duplicate payment.

Not an ideal situation.

Idempotency Keys

To solve this, we've now rolled out support for idempotency keys across all our creation endpoints. An idempotency key is a unique token submitted as a request header that guarantees only one resource will be created, regardless of how many times the request is sent to us.

For example, the following request can be made repeatedly, with only one payment ever being created:

POST /payments HTTP/1.1
Idempotency-Key: PROCESS-ME-ONCE

{
  "payments": {
    "amount": 100,
    "currency": "GBP",
    "charge_date": "2015-06-20",
    "reference": "DOLLAR01",
    "links": {
      "mandate": "MD00001EKBQ412"
    }
  }
}

If the request fails, then it's perfectly safe to retry as long as you use the same idempotency key. If the original request was successful, then you'll receive the following response:

HTTP/1.1 409 (Conflict)

{
  "error": {
    "code": 409,
    "type": "invalid_state",
    "message": "A resource has already been created with this idempotency key",
    "documentation_url": "",
    "request_id": "5f917bf9-df56-460f-a165-15d9e77414cb",
    "errors": [
      {
        "reason": "idempotent_creation_conflict",
        "message": "A resource has already been created with this idempotency key",
        "links": {
          "conflicting_resource_id": "PM00001KKVGTS0"
        }
      }
    ]
  }
}

It's worth noting that we haven't added support for idempotency keys to our update endpoints, as they're idempotent by nature. For example, trying to cancel the same payment multiple times will have no adverse effect.
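For creation endpoints, the client-side pattern is to generate the key once and reuse it on every attempt. A sketch of that retry loop, where `send_request` stands in for a real HTTP call and is an assumption for illustration, not part of any GoCardless client library:

```ruby
require "securerandom"

# Generate the idempotency key once, then reuse it on every attempt,
# so a retried request can never create a duplicate resource.
def create_with_retries(attempts: 3)
  key = SecureRandom.uuid
  attempts.times do
    response = yield(key)
    return response unless response == :server_error
  end
  raise "gave up after #{attempts} attempts"
end

calls = []
result = create_with_retries do |key|
  calls << key
  calls.size < 2 ? :server_error : :created  # fail once, then succeed
end

puts result           # created
puts calls.uniq.size  # 1 - the same key was reused on the retry
```

The crucial detail is that the key is generated outside the loop; a fresh key per attempt would defeat the whole mechanism.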

We're constantly improving our API to provide a better experience to our integrators. If you have any feedback or suggestions then get in touch, we'd love to hear from you!

Wondering whether GoCardless Pro could be for you?
Find out more

Zero-downtime Postgres migrations - the hard parts

A few months ago, we took around 15 seconds of unexpected API downtime during a planned database migration. We're always careful about deploying schema changes, so we were surprised to see one go so badly wrong. As a payments company, the uptime of our API matters more than most - if we're not accepting requests, our merchants are losing money. It's not in our nature to leave issues like this unexplored, so naturally we set about figuring out what went wrong. This is what we found out.


We're no strangers to zero-downtime schema changes. Having the database stop responding to queries for more than a second or two isn't an option, so there's a bunch of stuff you learn early on. It's well covered in other articles1, and it mostly boils down to:

  • Don't rename columns/tables which are in use by the app - always copy the data and drop the old one once the app is no longer using it
  • Don't rewrite a table while you have an exclusive lock on it (e.g. no ALTER TABLE foos ADD COLUMN bar varchar DEFAULT 'baz' NOT NULL)
  • Don't perform expensive, synchronous actions while holding an exclusive lock (e.g. adding an index without the CONCURRENTLY flag)

This advice will take you a long way. It may even be all you need to scale this part of your app. For us, it wasn't, and we learned that the hard way.

The migration

Jump back to late January. At the time, we were building invoicing for our Pro product. We'd been through a couple of iterations, and settled on model/table names. We'd already deployed an earlier revision, so we had to rename the tables. That wasn't a problem though - the tables were empty, and there was no code depending on them in production.

The foreign key constraints on those tables had out of date names after the rename, so we decided to drop and recreate them2. Again, we weren't worried. The tables were empty, so there would be no long-held lock taken to validate the constraints.

So what happened?

We deployed the changes, and all of our assumptions got blown out of the water. Just after the schema migration started, we started getting alerts about API requests timing out. These lasted for around 15 seconds, at which point the migration went through and our API came back up. After a few minutes collecting our thoughts, we started digging into what went wrong.

First, we re-ran the migrations against a backup of the database from earlier that day. They went through in a few hundred milliseconds. From there we turned back to the internet for an answer.

Information was scarce. We found lots of blog posts giving the advice from above, but no clues on what happened to us. Eventually, we stumbled on an old thread on the Postgres mailing list, which sounded exactly like the situation we'd run into. We kept looking, and found a blog post which went into more depth3.

In order to add a foreign key constraint, Postgres takes AccessExclusive locks on both the table with the constraint4, and the one it references while it adds the triggers which enforce the constraint. When a lock can't be acquired because of a lock held by another transaction, it goes into a queue. Any locks that conflict with the queued lock will queue up behind it. As AccessExclusive locks conflict with every other type of lock, having one sat in the queue blocks all other operations5 on that table.

Here's a worked example using 3 concurrent transactions, started in order:

-- Transaction 1
SELECT DISTINCT(email)     -- Takes an AccessShare lock on "parent"
FROM parent;               -- for duration of slow query.

-- Transaction 2
ALTER TABLE child          -- Needs an AccessExclusive lock on
ADD CONSTRAINT parent_fk   -- "child" /and/ "parent". AccessExclusive
  FOREIGN KEY (parent_id)  -- conflicts with AccessShare, so sits in
  REFERENCES parent        -- a queue.

-- Transaction 3
SELECT *                   -- Normal query also takes an AccessShare,
FROM parent                -- which conflicts with AccessExclusive
WHERE id = 123;            -- so goes to back of queue, and hangs.

While the tables we were adding the constraints to were unused by the app code at that point, the tables they referenced were some of the most heavily used. An unfortunately timed, long-running read query on the parent table collided with the migration which added the foreign key constraint.

The ALTER TABLE statement itself was fast to execute, but the effect of it waiting for an AccessExclusive lock on the referenced table caused the downtime - read/write queries issued by calls to our API piled up behind it, and clients timed out.

Avoiding downtime

Applications vary too much for there to be a "one size fits all" solution to this problem, but there are a few good places to start:

  • Eliminate long-running queries/transactions from your application.6 Run analytics queries against an asynchronously updated replica.
    • It's worth setting log_min_duration_statement and log_lock_waits to find these issues in your app before they turn into downtime.
  • Set lock_timeout in your migration scripts to a pause your app can tolerate. It's better to abort a deploy than take your application down.
  • Split your schema changes up.
    • Problems become easier to diagnose.
    • Transactions around DDL are shorter, so locks aren't held so long.
  • Keep Postgres up to date. The locking code is improved with every release.
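The lock_timeout advice above can be sketched as a hypothetical Rails migration (class, table and constraint names are illustrative; this is untested scaffolding, not a drop-in script):

```ruby
# Cap how long we will queue for locks, so the deploy aborts instead of
# blocking reads/writes behind our ALTER TABLE.
class AddParentFkToChild < ActiveRecord::Migration
  def up
    # If the lock isn't acquired within 1s, Postgres raises and the
    # migration fails - far better than piled-up API requests.
    execute "SET lock_timeout TO '1s'"
    execute <<-SQL
      ALTER TABLE child
        ADD CONSTRAINT parent_fk
        FOREIGN KEY (parent_id) REFERENCES parent;
    SQL
  end
end
```

A failed deploy can simply be retried at a quieter moment; the AccessExclusive lock is never held in the queue long enough to take the application down.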

Whether this is worth doing comes down to the type of project you're working on. Some sites get by just fine putting up a maintenance page for the 30 seconds it takes to deploy. If that's not an option for you, then hopefully the advice in this post will help you avoid unexpected downtime one day.

  1. Braintree have a really good post on this. 

  2. At the time, partly as an artefact of using Rails migrations which don't include a method to do it, we didn't realise that Postgres had support for renaming constraints with ALTER TABLE. Using this avoids the AccessExclusive lock on the table being referenced, but still takes one on the referencing table. Either way, we want to be able to add new foreign keys, not just rename the ones we have. 

  3. It's also worth noting that the Postgres documentation and source code are extremely high quality. Once we had an idea of what was happening, we went straight to the locking code for ALTER TABLE statements

  4. This still applies if you add the constraint with the NOT VALID flag. Postgres will briefly hold an AccessExclusive lock against both tables while it adds constraint triggers to them. 9.4 does make the VALIDATE CONSTRAINT step take a weaker ShareUpdateExclusive lock though, which makes it possible to validate existing data in large tables without downtime. 

  5. SELECT statements take an AccessShare lock. 

  6. If developers have access to a console where they can run queries against the production database, they need to be extremely cautious. BEGIN; SELECT * FROM some_table WHERE id = 123; /* Developer goes to make a cup of tea */ will cause downtime if someone deploys a schema change for some_table


How we built the new

Our website source code is no longer open source on GitHub. We are planning to release a more generic boilerplate project.

We recently deployed and open sourced a new version of our website. We've had some great feedback on how fast the site is so we wanted to give a high level overview of our technical approach in the hope that others will find it useful.

In this post we'll cover both how we made the site so fast and what our development setup looks like.


When browsing, you'll notice that after the initial page load, navigation around the site is super snappy. Until you leave our splash pages, or choose a different locale, all your browser is requesting are new images. If you happen to be on a slow connection, or you have disabled JavaScript, the site still works fine.


Using React to render on the server means that we can deliver not only a fully rendered page to the browser, but also the rest of our website. Once the initial page is displayed, a request is made to fetch the rest of the site as a React application (only 250kB). All subsequent navigation within the website is blindingly fast, not suffering the latency of HTTP requests since the browser already has the whole app.

If the user has disabled JavaScript, or they are on a poor connection and the JavaScript takes a while to load, they still experience the fully rendered page and only miss out on the fast navigation.

Static deploy

While we develop against an Express server, we don't deploy that. Since is entirely static, we're able to deploy and serve static HTML. One huge benefit of this is that we can host the site off an S3 bucket and not have to worry about a running web server and everything that goes along with it, such as exception handling, monitoring, security issues, etc.

In order to generate our static HTML we need to know every available URL. Since we're keeping all of our routing configuration in one place, this is trivial; we're able to easily extract the paths for all pages in every locale. Once we have those URLs, we simply crawl our locally running Express app and write the responses to disk ready for deployment.

Development tools


The newest version of JavaScript, ES6 (or ES2015 as it's now known), contains numerous improvements that make building applications in JavaScript a much nicer experience. We use Babel which makes it possible to take advantage of language additions now by compiling ES2015 (and even ES7 code) down to ES5, which has much better browser support.


We use webpack to bundle our CSS and JavaScript. There are plenty of other tools which do that too, however we particularly love the development experience when using the React Hot Loader plugin for webpack. This lets us see our changes live in the browser without losing app state.


The main motivation for rebuilding our website was to allow us to more easily manage pages across different locales. We wanted to have one place where we could see, for each page, its handler, which locales it was available in, and its route for each locale.

Having one data structure holding all of this information means that you can see the structure of the entire site in one place. A downside of this is that it can get a bit tricky querying such a large and nested structure. Since the whole application relies on this structure, we also need to be really careful not to accidentally mutate it. This is typically dealt with by lots of defensive cloneDeeping. Not doing so would (and initially did) lead to subtle, hard to diagnose bugs across the application.

Our solution to the above problems was to use Facebook's Immutable library. Using immutable data structures, such as Map and List, means that we don't have to worry about mutating the actual routes data structure as it gets passed into functions that operate on it. Immutable also has a great API for dealing with nested properties, including getIn and setIn, which we use extensively.


We're pretty happy with the simplicity of our solution and the user experience that it enables. We hope others can benefit from our experience and, of course, if you have any feedback or suggestions, please get in touch!


Interning at GoCardless

I'm wrapping up a four month internship at GoCardless and thought I'd share some of the projects I've worked on.

GoCardless has a real variety of projects, which has kept every day interesting and challenging. From front-end JavaScript to PHP client libraries, I've worked on almost everything at GoCardless and been able to learn from some awesome developers. Here are a few highlights:

React / How I came to love React

A couple of months ago we decided to rewrite the GoCardless marketing pages using React to support proper internationalisation. I was on the team from the inception of the project, figuring out how the app would work and seeing it through to completion.

Since last summer I'd been hearing great things about React, so it was great to get an opportunity to give it a try. We rewrote the site into an isomorphic site based on dynamic React components, and I was really impressed by the flexibility and speed of the framework - both in development and production. GoCardless is planning to open source the work, and I'll definitely use React again.

Query Counting / How hard can a count be?

While working on the GoCardless dashboard I saw a few cases where having a count in the API responses would improve search and exports. Knowing if your search has two or two hundred pages can be helpful! I chatted to the team, and set about adding it.

As it turns out, getting a precise count isn't so straightforward. The API is built for queries that could match millions of rows, and needs to quickly return the paginated collection. After a few experiments with count optimisation and estimate counting in PostgreSQL I worked with the Platform team and we settled on using an exact count up to a limit. It was great to dig a bit deeper into the limits of PostgreSQL, and to figure out a pragmatic solution.
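The "exact count up to a limit" trick works by wrapping the filtered query in a subquery with a LIMIT, so Postgres stops counting rows once the cap is hit instead of scanning millions of matches. A sketch of the SQL it produces (the helper and table names are illustrative, not GoCardless's implementation):

```ruby
# Build a capped count query: COUNT over a LIMITed subquery never
# examines more than `cap` rows, however many actually match.
def capped_count_sql(relation_sql, cap: 10_000)
  "SELECT COUNT(*) FROM (#{relation_sql} LIMIT #{cap}) AS capped"
end

puts capped_count_sql("SELECT 1 FROM payments WHERE status = 'pending'")
```

When the count comes back equal to the cap, the API can report it as "at least 10000" rather than paying for an exact figure.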

Stubby / Hijacking XMLHttpRequests for fun and profit

Internally, GoCardless uses JSON Schema extensively - all API requests are validated against a schema, and the GoCardless Pro docs are auto-generated from it. In my last couple of weeks at GoCardless I worked on adding another use for JSON Schema - stubbing out API requests in front-end apps' tests, while additionally validating that the requests themselves are valid.

There's more detail about Stubby, which is our open-source solution to the above, here. It was great to build something open-source, and I learned a lot about the process (docs, tests, READMEs and config all suddenly became much more important). I'm really glad that GoCardless as a company is invested in open source and emphasises giving back to the tech community as a whole, looking beyond their own product.

PHP Library / When I thought I was done writing PHP

Finally, when I started this internship I wasn't expecting to write any PHP. Even stranger, I wasn't expecting to write any PHP in Go...

A lot of our customers were requesting a PHP library for the new GoCardless API, so I stepped up to build one. GoCardless auto-generates all its Pro client libraries using an internal tool (soon to be open-sourced) called Crank, which combines Go template files with a JSON Schema to build a library.

I'll be honest, working with PHP was a bit of a hassle! In many cases we were torn between writing idiomatic PHP and making the library consistent with GoCardless libraries in other languages (we generally chose the former, but the trade-off in maintainability is very real). Some of PHP 5.3's new object oriented features such as magic methods and namespaces allowed for decent workarounds, but deciding whether to use them was certainly tricky. Having put a lot of thought and effort into it, the most satisfying part of the project was seeing integrators using the library immediately, and being able to respond to their feedback.

We're hiring developers
See job listing
in Engineering

Stubby: better mocking of HTTP requests in client side tests

Today we're open sourcing Stubby, a small library for stubbing client-side AJAX requests, primarily designed for use in front end tests. Stubby lets your implementation code make real HTTP requests which are intercepted just before they hit the network. This keeps your tests fast and easier to work with.


Stubbing a request is done by creating an instance of Stubby and calling stub:

var stubby = new Stubby();

stubby.stub({
  url: '/users',
}).respondWith(200, ['jack', 'iain']);

Any GET request to /users will be matched to the above stub, and ['jack', 'iain'] will be returned. It is possible to match requests against headers, the request body and query params. The Stubby README fully documents the functionality Stubby provides. The repository also includes examples of tests that use Stubby.


To keep our implementation tests close to reality we wanted to avoid stubbing any application code and instead stub the requests at the network layer. Stubby uses Pretender, which replaces the native XMLHttpRequest object, meaning our request stubs don't interact at all with our application code.

JSON Schema Validation

Stubby provides an optional plugin for validating stubs against a JSON schema. When this is used, any stubs will be validated against the given schema, ensuring the URLs, parameters and request bodies stubbed are all valid. If they are not, the tests will fail. This helps us ensure our stubs are accurate and realistic, and prevents our tests falling out of date with our API. At GoCardless we've invested heavily in the JSON Schema specification for describing our API.

Getting Started

You can install Stubby with Bower:

bower install stubby

For further information, and to report issues, check out Stubby on GitHub.


New API Version - 2015-04-29

We last released a new version of the GoCardless Pro API back in November. Since then we’ve made countless improvements and recently we've been working on several new features which warrant the release of a new version.

Version 2015-04-29 is being released on Wednesday, along with a corresponding update to our dashboards. For the majority of integrations, the upgrade will be extremely simple (see upgrading).

The new API docs are available at:

We will continue our support for v2014-11-03 until the 1st of July, 2015.

Summary of changes

  • Simplifies authentication and permissions by moving to OAuth2
    • The API now uses token-based authentication.
    • API Keys and Publishable API Keys have been replaced with Access Tokens and Publishable Access Tokens respectively.
    • Removes Users and Roles from the API (user permissions can still be managed using the dashboard).
  • Replaces the UK-specific sort_code field on bank accounts with the more generic branch_code.


For most integrators, upgrade steps are as follows:

  1. Generate an access token in the dashboard and update your authorisation header:

    Authorization: Bearer TOKEN
  2. Update your version header:

    GoCardless-Version: 2015-04-29
  3. Stop using the sort_code key in bank account creation calls, and instead use branch_code.
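In Ruby, steps 1 and 2 amount to setting two headers on each request; a minimal sketch using Net::HTTP (the endpoint path and token value are placeholders, not real credentials):

```ruby
require "net/http"
require "uri"

# Hypothetical request to a Pro API endpoint; the path is illustrative.
uri = URI.parse("https://api.gocardless.com/customers")
request = Net::HTTP::Get.new(uri)

# Token-based authentication replaces HTTP Basic
request["Authorization"] = "Bearer #{ENV.fetch('GOCARDLESS_TOKEN', 'TOKEN')}"
# Pin your integration to the new API version
request["GoCardless-Version"] = "2015-04-29"

# response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
#   http.request(request)
# end
```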

Usage of the users, roles, api_key and publishable_api_key endpoints was extremely low, and most integrators made no use of them at all. However, if you do use these in your integration and are unsure how to proceed, please don’t hesitate to get in touch.

Full details of the changes

API Authorisation

Permissions in the new API version are replaced by scopes. There are two possible scopes: full_access and read_only.


We've assigned scopes to your users based on their previous permissions. If a user had full access to any resources before, they have been assigned full_access, otherwise they've been given read_only access.

Please take a moment to check your dashboard and ensure you're happy with your users’ scopes.

API Authentication

To support future improvements we're switching API authentication from HTTP Basic to OAuth2.

In the future this will allow us to support partner application use cases, enabling you to authorize applications to perform actions on your behalf. It also simplifies our API authentication strategy (one token, as opposed to two components with base 64 encoding).

To authenticate with the new API version, you will need to modify your authorisation header (docs):

# Old
Authorization: Basic BASE64_ENCODED_CREDS

# New
Authorization: Bearer TOKEN

You can continue to manage API Keys and Publishable API Keys via the API using v2014-11-03, but they won't appear on your dashboard any more. For migration instructions, please see the upgrade guide.


Webhook Endpoints

We’ve moved the concept of webhook endpoints to their own section of the dashboard and changed the detail used to sign them: going forward, webhook endpoints have a “secret”, which is used to sign requests (docs).

For existing integrations, we've copied across webhook details from API Keys and set the secret to be the API Key's key. As a result, no changes to existing integrations are required unless you rely on the Webhook-Key-Id header, which has been removed.
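Verifying a signed webhook then amounts to recomputing a digest of the raw request body with the endpoint's secret and comparing it to the signature sent with the request. A minimal Ruby sketch, assuming an HMAC-SHA256 hex digest as in later GoCardless webhook documentation (the helper name and header handling are illustrative, not an official client API):

```ruby
require "openssl"

# Recompute the HMAC-SHA256 of the raw request body using the webhook
# endpoint's secret, and compare it with the signature from the request.
def valid_webhook_signature?(secret, body, signature)
  computed = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("sha256"), secret, body)
  # In production, prefer a constant-time comparison here
  computed == signature
end
```

Rejecting anything that fails this check ensures a webhook genuinely came from GoCardless and wasn't tampered with in transit.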


Farewell Sort Codes, Long Live Branch Codes!

Since we now support local details for all the countries you can collect payments from, we've replaced the UK-specific sort_code field on bank accounts and bank account tokens with the generic branch_code field.

Most integrations will already be using the branch_code field as it has been the documented interface since December.

Need help upgrading or have any questions?
Get in touch

GoCardless Basic API v2 Beta

We’re releasing a new system that promises a better integration experience for developers. Following the launch of our Pro API in December 2014, we've been working hard to extend some of the new Pro features to our Basic platform. In particular, we now provide better development tooling and the option to seamlessly transition your payments to your own SUN if the need arises.

We're looking for developers to beta test this new system. Do you meet the following criteria?

  • Do you plan to collect recurring / multiple payments from your customers?
  • Are you currently looking to integrate with the GoCardless Basic API?
  • Do you have the technical expertise required to perform such an integration? Our client libraries are also in development so you will be able to beta test these too.

As well as a better overall experience, we can offer the following:

  • No fees for 6 months
  • Additional support from our developers

If you match this profile or have any questions about the beta, get in touch at [email protected]!

If you're currently using v1 of the Basic API, don't worry - we will continue to support it, and you won't notice any difference.

Interested in early access to the GoCardless Basic API v2?
Contact us

It's now easier to take Direct Debit globally

Taking Direct Debit payments used to mean integrating with a different payments scheme in each country you wanted to collect from. We're changing that. From today, we no longer require you to specify the Direct Debit scheme when setting up a mandate through our Pro API. We'll automatically detect the best Direct Debit scheme to use, based on your customer's bank account:

HTTP/1.1 200 (OK)
{
  "customer_bank_accounts": {
    "account_holder_name": "Edith Piaf",
    "country_code": "FR",
    "currency": "EUR",
    ...
  }
}

POST /mandates HTTP/1.1
{
  "mandates": {
    "links": {
      "customer_bank_account": "BA123",
      "creditor": "CR123"
    }
  }
}

HTTP/1.1 201 (Created)
Location: /mandates/MD123
{
  "mandates": {
    "id": "MD123",
    "scheme": "sepa_core",
    ...
  }
}

We're the only provider that links UK and SEPA Direct Debit in a single API, but this is just the first step in building a truly global Direct Debit network. You shouldn't need to worry where your customers are located to be able to take payments from them, and we'll be working on making this even easier over the next few months.

First we'll be making it possible to take payments in any currency from any account, without worrying about foreign exchange, and also to receive payouts in whatever currency you prefer. This will mean you can collect £10 a month from all your customers, whether they're in London or Luxembourg. You'll also be able to collect in several currencies and receive all your payouts in one currency.

Then we'll be adding more schemes, from Swedish Autogiro to Australian BECS, so you can take Direct Debit payments from anywhere using just one API.

Wondering whether GoCardless Pro could be for you?
Find out more

Visualising GoCardless' UK Growth

The above animation is a time lapse of customers using GoCardless for the first time. It covers the last 3 years and only maps the UK for now, with each red dot representing a new customer joining GoCardless, then becoming a blue dot for the remainder of the clip. It works even better if you view it full-screen.

It started as just something to do for fun on the side. We've been seeing a lot of growth recently, and as we move into Europe and launch our new API, I was curious what it'd be like to take a moment to look back on how far we've come. The result turned out to be pretty interesting, and the post below explains how I generated it.

Generating the Data Set

Using street addresses would have been a little messy, but luckily UK addresses include a postal code. There are around 1.8 million postcodes, providing a good level of granularity throughout the UK and Northern Ireland for around 29 million potential addresses. Sadly, this meant that the rest of Europe wasn't plottable this time around - a challenge for another day.

Data from GoCardless
|     created_at      | postal_code |
| 2011-09-21 22:15:44 | EC1V 1LQ    |
| 2011-09-27 12:42:17 | TA9 3FJ     |

Unfortunately, postal codes also aren't distributed throughout the country in a neat grid system. There was no easy way to translate a postcode to a location on screen. What I really needed was something uniform and regular - latitudes and longitudes.

A quick search revealed several online services which provide the latitude and longitude for a given postcode. However, with 1.8m postcodes to potentially sift through, and taking into account the rate limiting on many of these services, this wasn't going to cut it.

As is often the case, it seems I'm not the first person to come across this problem, and after some more searching I discovered a freely available data dump containing all 1.8m+ UK postcodes together with their latitudes and longitudes!

Data from the postcode data dump
|   id    | postcode |      latitude      |     longitude      |
| 1237144 | EC1V1LQ  | 51.531058677023300 | -.100484683590648  |
|  210341 | TA93FJ   | 51.229186607412900 | -2.976539258481700 |

After importing all of this into a SQL database, a few queries later I finally had the data I needed.

|      timestamp      |      latitude      |     longitude      |
| 2011-09-21 22:15:44 | 51.531058677023300 | -.100484683590648  |
| 2011-09-27 12:42:17 | 51.229186607412900 | -2.976539258481700 |

Plotting Locations on a Map


R is a language I'd been looking for an excuse to experiment with for a while. It's free and open source, and after checking out some available packages like maps and mapdata it quickly became apparent that plotting latitudes and longitudes shouldn't be too much hassle.

Sure enough, after a little playing around, it was possible to map all the customers represented by little blue dots.

Very exciting, but still some room for improvement - the points seemed a bit large and very messy. In areas of high density (for example, London) the map was solid blue. It was awesome to see how many people have used us but it didn't make for the best of visuals.

After toying with some point sizes and opacity values, things were looking much more interesting and the densities naturally emerged.

# draw.r

# Load the mapping packages
library(maps)
library(mapdata)

# Set variables
gocardless_blue = rgb(0.31,0.57,0.85)
gocardless_blue_translucent = rgb(0.31,0.57,0.85,0.1)

# Read data
customer_data <- read.csv("data/r-customers-export.csv", header = TRUE)

# Set output file
png("output.png", height=3000, width=2000, pointsize = 80)

map('worldHires', c('UK', 'Ireland', 'Isle of Man','Isle of Wight'), xlim=c(-11,3), ylim=c(49,60.9), fill=FALSE, col=gocardless_blue, mar=rep(1,4))
points(customer_data$lon, customer_data$lat, col=gocardless_blue_translucent, pch=20, cex=0.2)
title("GoCardless Customers", col.main=gocardless_blue)

# Close the graphics device to flush the PNG to disk
dev.off()

# Execute with: R --slave -f draw.r

Where, Meet When

At this point I had plotted all the customers and made good use of the where portion of the data, but hadn't done anything with the when side of things.

The principle of animating this kind of data is generally conceptually straightforward. For each day, split the customers who were added on that day into chunks corresponding to the number of frames you want to animate per day. After rendering each chunk, output an image, then at the end, stitch them all together - you've got yourself an animation!
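Translated out of R, the per-day chunking step might look something like this Ruby sketch (the constants and data shapes are illustrative, not the original code):

```ruby
# Split each day's new customers into equal chunks, one chunk per
# animation frame, so every day occupies the same screen time.
FRAMES_PER_DAY = 4

def frames_for(customers_by_day, frames_per_day = FRAMES_PER_DAY)
  customers_by_day.flat_map do |_day, customers|
    chunk_size = (customers.length.to_f / frames_per_day).ceil
    chunks = customers.each_slice([chunk_size, 1].max).to_a
    # Pad with empty frames so quiet days still advance the clock
    chunks.fill([], chunks.length...frames_per_day)
  end
end
```

Each resulting chunk becomes one rendered image; stitching the images together in order produces the animation.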

To better visualize the new customers I highlighted them in red to make them stand out against the previously plotted customers. I also removed the outline of the UK until the very end frames, resulting in the emergent GoCardless UK outline you see above.

Speeding Up R

The first rendering of these frames took hours, and the concept of "rapid iteration" went squarely out the window. When you're fiddling with point sizes and opacities that really doesn't work - there had to be a better way.

After some digging, it transpired that R was only using one of my CPU cores. It turns out R has support for parallel operations, but in order to parallelise loops I'd need doParallel and foreach.

After altering the code to leverage these packages, I was then generating one frame on each core, in my case resulting in a 4x speed-up.

You can see the final code here. It's my first foray into R, so there are doubtless improvements to be made - if you spot any, I'd love to hear from you at [email protected].

Stitching it all together

The final step is to put together all the frames. Since we've named them using a %06d format, on OSX we can leverage ffmpeg to join them together.

ffmpeg -f image2 -r 60 -i output/frame_%06d.png customer-growth.mpg

If you're on Linux (using AWS, for example), you can do this with avconv in a similar way.

We're gonna need a bigger boat...

At this point you may have noticed the intense heat and noise of your laptop gradually melting in a fiery blaze of screaming CPUs. Thousands of frames and video rendering don't generally agree with your average laptop. To get around this, I suggest renting time on an AWS spot instance or Digital Ocean instance. You can leverage some seriously beefy machines with a bunch of CPUs and RAM, then just SCP down the results once it's done.

Next Steps

I'm thinking of doing some more visualisations in the future, and there are doubtless other areas of the business I could explore - if you have any ideas, let me know. Also, if this seems like the kind of thing you'd love to do, we have plenty of other interesting challenges - we're hiring and would love to hear from you.



Ibandit: simple IBAN manipulation

We just open-sourced Ibandit, a simple library for working with IBANs.


Constructing an IBAN from national details:

iban = Ibandit::IBAN.new(
  country_code: 'GB',
  bank_code: 'BARC', # optional if a BIC finder is configured
  branch_code: '200000',
  account_number: '55779911'
)

iban.to_s # => "GB60BARC20000055779911"

Deconstructing an IBAN into national details:

iban = Ibandit::IBAN.new("GB82 WEST 1234 5698 7654 32")

iban.country_code   # => "GB"
iban.check_digits   # => "82"
iban.bank_code      # => "WEST"
iban.branch_code    # => "123456"
iban.account_number # => "98765432"
# => "WEST98765432"

Validating an IBAN's format and check digits (national modulus checks are NOT applied):

iban = Ibandit::IBAN.new("GB81 WEST 1234 5698 7654 32")

iban.valid? # => false
iban.errors # => { check_digits: "Check digits failed modulus check. Expected '82', received '81'" }

Why convert to/from IBAN

IBANs are used for all SEPA payments, such as the collection of SEPA Direct Debit payments, but most people only know their national bank details. Further, most countries have validations which apply to their national details, rather than to IBANs.

Ibandit lets you work with national details when communicating with your customers, but with IBANs when communicating with the banks. Its conversions are based on data provided by SWIFT and our experience at GoCardless, and it's heavily commented with descriptions of each country's national bank details. Internally, we use it as our standard interface for "bank details".

Given an IBAN, Ibandit can also validate the format and IBAN check digits, and can deconstruct the IBAN so country-specific checks can be applied. It is not a modulus checking gem and will not perform national modulus checks for you, though it does include implementations of some of these checks.
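That check-digit validation is the standard ISO 7064 mod 97-10 scheme, which is compact enough to sketch in plain Ruby, independently of Ibandit's API (the helper name is ours, not the gem's):

```ruby
# ISO 7064 mod 97-10, as applied to IBANs: move the first four
# characters (country code + check digits) to the end, replace each
# letter with two digits (A=10 ... Z=35), and verify that the
# resulting integer leaves a remainder of 1 modulo 97.
def valid_iban_check_digits?(iban)
  compact = iban.delete(" ").upcase
  rearranged = compact[4..-1] + compact[0, 4]
  numeric = rearranged.chars.map { |c| c.to_i(36) }.join
  numeric.to_i % 97 == 1
end
```

This is why changing a single character of a valid IBAN (as in the GB81 example above) reliably fails validation.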

Other libraries

Another gem, iban-tools, also exists and is an excellent choice if you only require basic IBAN validation. However, iban-tools does not provide a comprehensive, consistent interface for the construction and deconstruction of IBANs into national details.


Syncing Postgres to Elasticsearch: lessons learned

At GoCardless we use Elasticsearch to power the search functionality of our dashboards. When we were building our Pro API, we decided to rethink how we got data into Elasticsearch.

At a high level, the problem is that you have your data in one place (for us, that's Postgres), and you want to keep a copy of it in Elasticsearch. This means every write you make (INSERT, UPDATE and DELETE statements) needs to be replicated to Elasticsearch. At first this sounds easy: just add some code which pushes a document to Elasticsearch after updating Postgres, and you're done.

But what happens if Elasticsearch is slow to acknowledge the update? What if Elasticsearch processes those updates out of order? How do you know Elasticsearch processed every update correctly?

We thought those issues through, and decided our indexes had to be:

  • Updated asynchronously - The user's request should be delayed as little as possible.
  • Eventually consistent - While it can lag behind slightly, serving stale results indefinitely isn't an option.
  • Easy to rebuild - Updates can be lost before reaching Elasticsearch, and Elasticsearch itself is known to lose data under network partitions.

Updating asynchronously

This is the easy part. Rather than generating and indexing the Elasticsearch document inside the request cycle, we enqueue a job to resync it asynchronously. Those jobs are processed by a pool of workers, either individually or in batches - as you start processing higher volumes, batching makes more and more sense.

Leaving the JSON generation and Elasticsearch API call out of the request cycle helps keep our API response times low and predictable.

Ensuring consistency

The easiest way to get data into Elasticsearch is via the update API, setting any fields which were changed. Unfortunately, this offers no safety when it comes to concurrent updates, so you can end up with old or corrupt data in your index.

To handle this, Elasticsearch offers a versioning system with optimistic locking. Every write to a document causes its version to increment by 1. When posting an update, you read the current version of a document, increment it and supply that as the version number in your update. If someone else has written to the document in the meantime, the update will fail. Unfortunately, it's still possible to have an older update win under this scheme. Consider a situation where users Alice and Bob make requests which update some data at the same time:

| Alice                         | Bob                         |
| Postgres update commits       | -                           |
| Elasticsearch request delayed | -                           |
| -                             | Postgres update commits     |
| -                             | Reads v2 from Elasticsearch |
| -                             | Writes v3 to Elasticsearch  |
| Reads v3 from Elasticsearch   | -                           |
| Writes v4 to Elasticsearch    | Changes lost                |

This may seem unlikely, but it isn't. If you're making a lot of updates, especially if you're doing them asynchronously, you will end up with bad data in your search cluster. Fortunately, Elasticsearch provides another way of doing versioning. Rather than letting it generate version numbers, you can set version_type to external in your requests, and provide your own version numbers. Elasticsearch will always keep the highest version of a document you send it.

Since we're using Postgres, we already have a great version number available to us: transaction IDs. They're 64-bit integers, and they always increase on new transactions. Getting hold of the current one is as simple as:

SELECT txid_current();

The asynchronous job simply selects the current transaction ID, loads the relevant data from Postgres, and sends it to Elasticsearch with that ID set as the version. Since this all happens after the data is committed in Postgres, the document we send to Elasticsearch is at least as up to date as when we enqueued the asynchronous job. It can be newer (if another transaction has committed in the meantime), but that's fine. We don't need every version of every record to make it to Elasticsearch. All we care about is ending up with the newest one once all our asynchronous jobs have run.
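To see why "highest version wins" resolves the Alice-and-Bob race above, here's a toy in-memory model of external versioning - just the semantics, not the real Elasticsearch client API:

```ruby
# Minimal model of Elasticsearch external versioning: a write only
# succeeds if its version is strictly higher than the stored one.
class ExternalVersionStore
  def initialize
    @docs = {}
  end

  def index(id, doc, version)
    current = @docs[id]
    return false if current && version <= current[:version]
    @docs[id] = { doc: doc, version: version }
    true
  end

  def get(id)
    @docs[id] && @docs[id][:doc]
  end
end

store = ExternalVersionStore.new
store.index("PM123", { amount: 200 }, 18) # Bob's newer write lands first
store.index("PM123", { amount: 100 }, 17) # Alice's delayed write arrives, rejected
```

With Postgres transaction IDs as versions, a delayed write can never clobber a newer one, no matter what order the jobs run in.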

Rebuilding from scratch

The last thing to take care of is to handle any inconsistencies from lost updates. We do so by periodically resyncing all recently written Postgres records, and the same code allows us to easily rebuild our indexes from scratch without downtime.

With the asynchronous approach above, and without a transactional, Postgres-backed queue, it's possible to lose updates. If an app server dies after committing the transaction in Postgres, but before enqueueing the sync job, that update won't make it to Elasticsearch. Even with a transactional, Postgres-backed queue there is a chance of losing updates for other reasons (such as the issues under network partition mentioned earlier).

To handle the above, we decided to periodically resync all recently updated records. To do this we use Elasticsearch's Bulk API, and reindex anything which was updated after the last resync (with a small overlap to make sure no records get missed by this catch-up process).
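The selection logic for that catch-up pass is simple; a Ruby sketch, with the field name and overlap chosen purely for illustration:

```ruby
# Pick out records written since the last resync, minus a small
# overlap, so records committed right around the boundary can't be
# missed by the catch-up process.
RESYNC_OVERLAP = 60 # seconds

def records_to_resync(records, last_resync_at)
  cutoff = last_resync_at - RESYNC_OVERLAP
  records.select { |record| record[:updated_at] >= cutoff }
end
```

In production this would be a WHERE clause over an indexed updated-at column rather than an in-memory filter, with the results fed to the Bulk API.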

The great thing about this approach is you can use the same code to rebuild the entire index. You'll need to do this routinely, when you change your mappings, and it's always nice to know you can recover from disaster.

On the point of rebuilding indexes from scratch, you'll want to do that without downtime. It's worth taking a look at how to do this with aliases right from the start. You'll avoid a bunch of pain later on.

Closing thoughts

There's a lot more to building a great search experience than you can fit in one blog post. Different applications have different constraints, and it's worth thinking yours through before you start writing production code. That said, hopefully you'll find some of the techniques in this post useful.


Using ES6 Modules with AngularJS 1.3

At GoCardless we use ES6 and AngularJS 1.3 to build front-end applications. Today, we’re open sourcing a skeleton app that demonstrates how we do it.

In October we travelled to Paris for Europe’s first AngularJS conference, ngEurope. AngularJS 2.0 was announced on the second day, and a lot is changing.


We’re particularly excited to see AngularJS embrace ECMAScript 6, part of which includes replacing its own module system with the ES6 module system.

The ES6 modules spec has been finalised, and it will now begin to be introduced to browsers, although we can't expect full support for a while yet. However, that shouldn’t stop you using it with AngularJS 1.3 today, and in this post we'll show you how.

Why ES6 modules?

Module and dependency management is essential when building a large client-side application, and the JavaScript community has come up with a number of solutions to this problem, including Browserify, RequireJS, and jspm, with most frameworks (including AngularJS) also providing their own. However, now that the syntax of ES6 modules is confirmed, it's a great time to get familiar with the specification and future-proof your application.

If you’d like to read about the syntax in depth, ECMAScript 6 modules: the final syntax by Axel Rauschmayer gives a fantastic, detailed overview of the syntax and practicalities of the module system.


ES6 usage is enabled by a combination of three libraries:

The ES6 Module Loader adds System.import, a new function that will be available in ES6 that allows us to programmatically load modules.

SystemJS is a universal module loader that supports three different module syntaxes in the browser:

  • Asynchronous Module Definition (used by RequireJS)
  • CommonJS (used by NodeJS, Browserify)
  • ES6 Modules

Finally, Traceur transpiles ES6 into ES5, enabling you to use not just modules, but many of the upcoming ES6 features.

An ES6 and AngularJS 1.3 sample application

To help you get started we are open sourcing a sample AngularJS 1.3 and ES6 application based exactly on ours, using the same tooling, frameworks and configuration. Please feel free to fork it, clone it and use it as a base for your applications. If you have any questions, please open an issue.

The application comes with the following set up:

  • a sample unit test using Jasmine and Karma
  • end to end tests using Protractor
  • the directory structure defined in our AngularJS style guide
  • ES6 imports used throughout
  • scripts for building the project, running tests and starting a server
  • bower for handling front end dependencies

The README has detailed instructions on how to set up and get everything running on your machine.


Introducing Logjam

Internally at GoCardless, we are undergoing a top-down restructuring of our server infrastructure. We are planning to move away from the one-app-per-server model running on a mixture of cloud and dedicated hardware, to a more general compute-style cluster powered by Apache Mesos and Docker on top of dedicated hardware.

One of the problems to solve leading up to this change is what we do with logs. Currently we use logstash-forwarder to pick up entries in our application's rails.log and forward them to our Logstash server. Given that our applications can come and go frequently, may be created for the sole purpose of running a one-off or recurring task, and that a container's filesystem is separate from the host's, we were posed an interesting problem, which led us to deem our current solution inadequate.

Possible solutions included:

  • Mounting an external volume on the Docker container, bound to its logs directory. However, this would require a large change to our applications to ensure they each write to a unique log file. Additionally, we find logstash-forwarder has difficulty with dynamic log collection and traversing a long directory tree.
  • Moving logstash-forwarder from the host server into each individual container. However, this would increase the complexity of the container, with each container now running an extra process.
  • Having applications log directly to Logstash. This potentially decreases the reliability of the application by making it depend directly on a further external service.

After reviewing our options, we decided to design our own log collection agent using a unique combination of existing methods to provide a simple and reliable system for our logging. We called this Logjam.

Logjam is a daemon which listens locally on a UDP socket and accepts a stream of JSON log entries. These entries are then buffered (either in memory or on disk) and forwarded to a remote collection server (in our case logstash) for storage and querying. In our use case, we configure logjam on the host system to listen on the network bridge interface created by Docker (called docker0). By doing this we can ensure that all running containers can communicate with logjam using the bridge's IP address.
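From inside a container, logging then needs nothing more than a UDP socket and a JSON encoder. A minimal Ruby sketch - the entry format and the address are illustrative, not Logjam's documented interface:

```ruby
require "json"
require "socket"

# Fire-and-forget a structured log entry at the local logjam daemon
# over UDP. If logjam is unavailable the datagram is simply dropped,
# so logging can never block or crash the application.
def log_to_logjam(entry, host:, port:)
  socket = UDPSocket.new
  socket.send(JSON.generate(entry), 0, host, port)
ensure
  socket&.close
end
```

In a container, host would be the docker0 bridge address described above.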

logjam pipeline

Logjam implements a simple processing pipeline consisting of two major subsystems. The first, called receiver, is responsible for listening on a local network interface and receiving incoming log entries. The second, called shipper, is responsible for taking log entries and reliably submitting them to a remote collection server for storage. Between the two sits an in-memory buffer (optionally persisted to disk) of configurable size, holding recent log entries to ship.

Logjam is written in Go and is available from GitHub.

We will be writing throughout this major re-architecting of our infrastructure, documenting our experiences as we design and implement it.

Find this kind of thing interesting?
Come work with us

Building APIs: lessons learned the hard way

This post is about the lessons we’ve learned building and maintaining APIs over the last three years. It starts off with some high-level thoughts, and then dives right into the detail of how and why we've made the design decisions we have, while building our second API - GoCardless Pro.

The problem with APIs

The hard thing about APIs is change. Redesign your website and your users adapt; change how timestamps are encoded in your API and all of your customers' integrations break. As a payments API provider, even a single broken integration can leave thousands of people unable to pay.

As a fast-moving startup we're particularly poorly positioned to get things right first time. We're wired to constantly iterate our products: ship early then tweak, adjust, and improve every day. As a startup the uncertainty goes even deeper: we're always learning more about our customers and making course corrections to our business.

Building dependable APIs in this environment is hard.

Different types of change

The most important lesson we've learned is to think about structural changes differently to functionality changes.

Structural changes affect the way the API works, rather than what it does. They include changes to the URL structure, pagination, errors, payload encoding, and levels of abstraction offered. They are the worst kind of changes to have to make because they are typically difficult to introduce gracefully and add little value for existing customers. Fortunately, they're also the decisions you have the best chance of getting right first time - an API's structure isn't tied to constantly-evolving business needs. It just takes time and effort, and is discussed in "Getting your structure right" below.

Functionality changes affect what the API does. They include adding new endpoints and attributes or changing the behaviour of existing ones, and they're necessary as a business changes. Fortunately they can almost always be introduced incrementally, without breaking backwards compatibility.

Getting your structure right

In our first API we made some structural mistakes. None of them are serious issues, but they have led to an API that is difficult to extend due to its quirks. For instance, the pagination scheme relies on limits and offsets, which causes performance issues. We also had frequent discussions about whether resources should be nested in URLs, which resulted in inconsistencies.

Before we started work on GoCardless Pro, we spent a lot of time laying the structural foundations. Thinking about the structure of GoCardless Pro was helpful for several reasons:

  • we were forced to think upfront about issues that would affect us down the line, such as versioning, rate-limiting, and pagination;
  • when implementing the API, we could focus on its functionality, rather than debating the virtues of PUT vs PATCH;
  • consistency across the API came for free, as we weren't making ad-hoc structural decisions.

We adopted JSON API as the basis for our framework and put together a document detailing our HTTP API design principles. The decisions made came from three years of experience running and maintaining a payments API, as well as from examining similar efforts by other companies.

Amongst other things, our framework includes:

  • Versioning. API versions should be represented as dates, and submitted as headers. This promotes incremental improvement to the API and discourages rewrites. As versions are only present in incoming requests, webhooks should not contain serialised resources - instead, provide an id for the resource that changed and let the client request it at the appropriate version.
  • Pagination. Pagination is enabled for all endpoints that may respond with multiple records. Cursors are used rather than the typical limit/offset approach to prevent missing or duplicated records when viewing growing collections, and to avoid the database performance penalties associated with large offsets.
  • URL structure. Never nest resources in URLs - it enforces relationships that could change, and makes clients harder to write. Use filtering instead (e.g. /payments?subscription=xyz rather than /subscriptions/xyz/payments)

This list is by no means exhaustive - for more examples, check out our full HTTP API design document on GitHub.

When starting work on a new API we encourage you to build your own document rather than just using ours (though feel free to use ours as a template). The exercise of writing it is extremely helpful in itself, and many of these decisions are a question of preference rather than "right or wrong".

Functionality changes

You can’t ever break integrations, but you have to keep improving your API. No product is right first time - if you can’t apply what you learn after your API is launched you’ll stagnate.

Many functionality changes can be made without breaking backwards compatibility, but occasionally breaking changes are necessary.

To keep improving our API whilst supporting existing integrations we:

  • Use betas extensively to get early feedback from developers who are comfortable with changes to our API. All new endpoints on the GoCardless Pro API go through a public beta.
  • Version all breaking changes and continue to support historic versions. By only ever introducing backwards-incompatible changes behind a new API version, we avoid breaking existing integrations. As we keep track of the API version that each customer uses, we can explain exactly what changes they'll need to make to take advantage of improvements we've made.
  • Release the minimum API possible. Where we have a choice between taking an API decision and waiting, we choose to wait. This is an unusual mentality in a startup, but when building an API, defaulting to inaction is probably the right approach. As we’re constantly learning, decisions made later are decisions made better.
  • Introduce “change management” to slow things down. This is not our typical approach - “change management” is a term that makes many of us shudder! But changes to public APIs need to be introduced carefully, so however uneasy it makes us, putting speed bumps in place can be a good idea. At GoCardless, all public API changes need to be agreed on by at least three senior engineers.

Stop thinking like a startup

To sum up: when building an API, you need to throw most advice about building products at startups out the window.

APIs are inflexible so the advice founded on change being easy doesn’t apply. You still need to constantly improve your product, but it pays to do work up-front, to make changes slowly and cautiously, and to get as much as possible right from the start.

Find this kind of thing interesting?
Come work with us
in Engineering

The GoCardless AngularJS Style Guide

We've used AngularJS since March 2013, and have blogged before about why we chose it and how we use it.

Since then a lot has changed: our team is 2x as large, we have 4x as much AngularJS to maintain, and we recently kicked off a major new project. To help keep our code consistent we've written and open-sourced a style guide.

The document is valuable as a reference and we hope you find it useful. For us, the process of writing it was equally valuable, so we encourage you to write your own too. While writing ours, developers who have written AngularJS at GoCardless for a long time reconsidered why they do things the way they do, and new developers quickly learnt both how AngularJS works at a low level and how it’s written at GoCardless.

TL;DR - AngularJS style at GoCardless

This style guide addresses four areas: High-level Goals; Directory and File Structure; Parts of AngularJS; and General Patterns and Anti-patterns.

High-level Goals

Our high-level goals guide low-level decisions. As a company one of our values is "Build to Last". As engineers we want our code to add long-term value. But you rarely get it right the first time, so the best you can do is make future change easy.

We would, for example, pick verbose over DRY if that results in code that’s easier to read and change. We have found that inheritance and mixins make change especially hard. The code is neat when first written, but as it changes your classes end up knowing and doing too much, and your code becomes complex through layers of indirection. Composable service objects fit well with AngularJS dependency injection and make logic easier to track down.

We plan to write a lot more AngularJS code going forward. In AngularJS 2.0, the biggest change will be to the module system, moving from a custom one to ES6 modules. In preparation for this we have adopted ES6 modules on top of the current module system. Having two module syntaxes is verbose, but will make the migration to AngularJS 2.0 easier.

Directory and File Structure

Our directory and file structure is designed for non-trivial web applications rather than specifically for AngularJS. This fits well with the plans for AngularJS 2.0, where all core parts are composable, dependency injection sits on ES6 modules, and directives become sugar on top of Web Components.

Each module of functionality, whether component, route, or service, is kept in its own directory. For example, a route module would include the route config, controller, template, and tests. Updating a route usually means updating the controller, template and tests at the same time. This makes refactoring easier as the likelihood of forgetting to update a part of a component is reduced, and those components without tests are revealed.

Parts of AngularJS; General Patterns and Anti-patterns

Instead of rewriting well-documented best practices, the remainder of the style guide focuses on patterns and anti-patterns we found particularly good or bad. First for each part of AngularJS, then application-wide.

The style guide is a living document. We will change it as and when the way we write AngularJS changes. In the meantime, we hope you find it useful and look forward to your feedback.

Read the style guide
Style Guide
in Engineering

gc-http-factory: an easier way to work with APIs in Angular

Today we're announcing and open sourcing gc-http-factory, a tool for working with APIs in Angular.

Using APIs in Angular without HttpFactory

In a regular Angular app you'd use a service to make requests to an API:

angular.module('app', []).factory('UsersService', function($http) {
  function findOne(id) {
    return $http.get('/api/users/' + id);
  }

  function findAll() {
    return $http.get('/api/users');
  }

  function create() {
    return $'/api/users');
  }

  return {
    findOne: findOne,
    findAll: findAll,
    create: create
  };
});
If you've got lots of API resources which follow similar conventions the above becomes repetitive pretty quickly.

Using HttpFactory

HttpFactory provides an abstraction for creating services that make requests to API resources. You use HttpFactory to create your service and methods by telling it some information about your API.

For example, to create the UsersService we defined previously, you'd use:

angular.module('app', ['gc.httpFactory']).factory('UsersService', function(HttpFactory) {
  return HttpFactory.create({
    url: '/api/users/:id'
  }, {
    findOne: { method: 'GET' },
    findAll: { method: 'GET' },
    create: { method: 'POST' }
  });
});
You can then use the service you've created as normal:

UsersService.findAll(); //=> GET /api/users

UsersService.findOne({
  params: { id: 2 }
}); //=> GET /api/users/2

UsersService.create({
  data: { name: 'Jack' }
}); //=> POST /api/users with { name: 'Jack' } as data


You can install gc-http-factory from Bower:

bower install gc-http-factory

Or just grab the source from GitHub.


We've been using HttpFactory in our Angular apps for a while and we're really happy with how cleanly it lets us create components to interface with our APIs. It's avoided duplication across our Angular apps and we hope it might do the same for you.

If you have any suggestions, ideas or bugs, please either report them on GitHub or feel free to find me on Twitter.

We're hiring developers
See job listing
in Engineering, Support

Getting started with the GoCardless PHP library

This post relates to the Legacy GoCardless API. If you're starting a new integration, you'll need to use the new GoCardless API - for help getting started, check out our guide.

PHP is the web's most popular coding language, used by nearly 40% of websites, so we've made it super simple to integrate with GoCardless using our open-source API library.

In 90% of cases, you'll be wanting to set up pre-authorizations. These are variable Direct Debits, letting you have your customer authorise once through our secure payment pages, after which you'll be able to charge them automatically whenever you need to with just two lines of code.

1. Add the PHP library to your project

First of all, you'll need to sign up for a GoCardless account and enable developer mode. Once you've done that, you'll get your authentication details which you'll be able to copy and paste into your code. This will take about 5 minutes - just follow our guide here.

Next, you'll need to grab the latest version of our library. There are two options for doing this:

  • Download the ZIP file here and add it to your project.
  • Add GoCardless to your composer.json file:
    "require": {
        "gocardless/gocardless": ">=0.4.2"
    }

We'd recommend storing the extracted GoCardless.php file and the gocardless folder inside a lib/ directory. This makes it easy to organise your code's dependencies.

Next, create a central configuration file. Using this will make it easy to share all the GoCardless setup between different pages, meaning you only have to change your settings in one place. I'll call it


<?php
  // Include the GoCardless library (stored in a lib/ directory,
  // as recommended above)
  require_once 'lib/GoCardless.php';

  // By default, your application will run in the sandbox - here,
  // you can test with no money changing hands. When you're
  // ready to go live, simply uncomment this line.
  // GoCardless::$environment = 'production';

  // Let's add your authentication details for the production and
  // sandbox. You can get them both from your Dashboard - copy them
  // in to replace the examples.
  if (GoCardless::$environment == 'production') {
    $account_details = array(
      'app_id'        => 'INSERT_LIVE_APP_ID',
      'app_secret'    => 'INSERT_LIVE_APP_SECRET',
      'merchant_id'   => 'INSERT_LIVE_MERCHANT_ID',
      'access_token'  => 'INSERT_LIVE_MERCHANT_ACCESS_TOKEN'
    );
  } else {
    $account_details = array(
      'app_id'        => 'INSERT_SANDBOX_APP_ID',
      'app_secret'    => 'INSERT_SANDBOX_APP_SECRET',
      'merchant_id'   => 'INSERT_SANDBOX_MERCHANT_ID',
      'access_token'  => 'INSERT_SANDBOX_MERCHANT_ACCESS_TOKEN'
    );
  }

  // Pass the details to the GoCardless library
  GoCardless::set_account_details($account_details);

2. Send the customer to the GoCardless payment pages

Setting up a pre-authorization will allow you to get authorisation from your customer once, and then charge them whenever you need to in the future. To do that, you'll need to send the customer through our secure payment process.

Our API library makes this simple:

<?php
// Bring in the file we made earlier, which includes the GoCardless
// library and all of our configuration

// Provide the details to use to set up the Direct Debit. In
// particular, make sure you set the amount to be higher than
// you'll ever need to collect in a month.
$pre_authorization_details = array(
  "amount" => "10000",
  "interval_length" => 1,
  "interval_unit" => "month"

// The GoCardless library puts everything together and
// generates a personalised link to our secure payment pages
$pre_authorization_url = GoCardless::new_pre_authorization_url($pre_authorization_details);

// Now we could put that generated URL behind a link...
echo "<a href='" . $pre_authorization_url . "'>Set up a Direct Debit</a>";
// or we could redirect the user right away...
header("Location: " . $pre_authorization_url);

3. Confirm the setup of the Direct Debit

Once the customer has finished going through our payment flow, we'll send them back to your website, where you'll need to confirm the authorization via our API.

Let's create a file called callback.php. You'll need to set the URL of that as your "redirect URL" in your Dashboard's developer settings:

Setting your redirect URI

<?php
// Let's include our central configuration file again.

// Pull out the required data from the redirect
$confirm_params = array(
  'resource_id'    => $_GET['resource_id'],
  'resource_type'  => $_GET['resource_type'],
  'resource_uri'   => $_GET['resource_uri'],
  'signature'      => $_GET['signature']

// We'll add this - it'll come into effect if you implement
// a more advanced integration
if (isset($_GET['state'])) {
  $confirm_params['state'] = $_GET['state'];

// Check the signature and confirm the authorization with
// GoCardless - an exception is thrown if anything is wrong
$confirmed_resource = GoCardless::confirm_resource($confirm_params);

// If all has gone well, you'll now be able to charge this
// customer. To do this, you'll need to store their
// pre-authorization's ID somewhere - for instance on a
// user's database record.
$pre_authorization_id = $confirmed_resource->id;

<p><b>Thank you!</b></p>

4. Charge your customer

When you're ready to charge your customer, it's just a few simple lines of code:


<?php
// You'll need to pull out the customer's pre-
// authorization ID which you saved somewhere in
// the previous step.
$pre_authorization_id = fetch_id_from_database();

$pre_auth = GoCardless_PreAuthorization::find($pre_authorization_id);

// Set the details for the payment to collect. We'll
// display the "name" in emails sent to the customer.
$bill_details = array(
  "name" => "Invoice for June 2014",
  "amount" => "25.00"

$bill = $pre_auth->create_bill($bill_details);

// It's likely that you'll want to store the bill's ID
// - this is useful for staying up to date on its status
// later.
$bill_id = $bill->id;

What next?

You've now seen how to set up the GoCardless PHP library, create your first Direct Debit and then take a payment. Here are a few things to check out next:

  • Web hooks allow you to be informed about the progress of a payment or any changes to your customers' Direct Debits - so for example, we'll let you know when money has been successfully collected.
  • Pre-populating information makes it simpler for your customers to pay and increases conversion.
  • Using the "state" parameter lets you pass data through the payment process, and collect it at the other end.

If you have any questions or we can help at all, just get in touch - email us.

Want to find out more or ask questions?
Book a chat with an expert
in Engineering

Business: simple business date calculations in Ruby

We just open-sourced business, a simple library for doing business date calculations.


calendar =
  working_days: %w( mon tue wed thu fri ),
  holidays: ["01/01/2014", "03/01/2014"]
)

calendar.business_day?(Date.parse("Monday, 9 June 2014"))
# => true
calendar.business_day?(Date.parse("Sunday, 8 June 2014"))
# => false

date = Date.parse("Thursday, 12 June 2014")
calendar.add_business_days(date, 4).strftime("%A, %d %B %Y")
# => "Wednesday, 18 June 2014"
calendar.subtract_business_days(date, 4).strftime("%A, %d %B %Y")
# => "Friday, 06 June 2014"

date = Date.parse("Saturday, 14 June 2014")
calendar.business_days_between(date, date + 7)
# => 5

But other libraries already do this

Another gem, business_time, also exists for this purpose. We previously used business_time, but encountered several issues that prompted us to start business.

Firstly, business_time works by monkey-patching Date, Time, and Fixnum. While this enables syntax like + 1.business_day, it means that all configuration has to be global. GoCardless handles payments across several geographies, so being able to work with multiple working-day calendars is essential for us. Business provides a simple Calendar class that is initialized with a configuration specifying which days of the week are working days, and which dates are holidays.

Secondly, business_time supports calculations on times as well as dates. For our purposes, date-based calculations are sufficient. Supporting time-based calculations as well makes the code significantly more complex. We chose to avoid this extra complexity by sticking solely to date-based mathematics.

We're hiring developers
See job listing
in Engineering

Generate new API App Secrets

This post relates to the Legacy GoCardless API. If you're starting a new integration, you'll need to use the new GoCardless API - for help getting started, check out our guide.

If you're a developer using the GoCardless API, it's now possible for you to change the App Secret associated with your account. Simply head to the Developer section of your GoCardless account and click ‘Generate new app secret’.

This new feature also allows you to have multiple App Secrets for a single account. You'll be able to sign requests to the GoCardless API with any of your active App Secrets. When GoCardless needs to communicate with your servers (i.e. when we send you web hooks), we'll use the App Secret marked as ‘Primary’ to sign our requests.

If you ever need to delete an App Secret, simply select the ‘Revoke’ option under the relevant key - requests can’t be sent with a revoked App Secret.

Not only does this feature give you more security, it also allows you to rotate your App Secrets with the minimum amount of downtime.

Log in to your dashboard to try it out
Developer settings

More choice of payment dates within Sage

If you’ve set specific payment terms within a Customer Record in Sage, instead of setting the payment to be collected as soon as possible, you can now choose for a customer’s account to be charged in accordance with these terms.

Once you’ve selected ‘Collect Payments’ for an invoice, you are given the option to collect the amount either in full or in installments - if the former is chosen, you will now have the ability to arrange for the payment to be charged on the ‘settlement date’ or ‘due date’ as well as ‘immediately’.

Interested in using GoCardless with Sage?
Find out more
in Engineering

Heartbleed response

Earlier this week, Heartbleed - a security vulnerability in the OpenSSL library - was publicly disclosed. GoCardless uses software that depends on OpenSSL, which means we were among the large number of companies affected.

Our engineering team patched our affected software on Tuesday morning (April 8th), and replaced our SSL certificates. This means that we are no longer vulnerable to Heartbleed.

We have no reason to believe that any GoCardless data has been compromised, but given the nature of the vulnerability we recommend taking the following precautions:

  • We recommend that GoCardless users reset their passwords.
  • We have invalidated any sessions that were in use prior to the resolution of the issue.
  • We are adding the ability for API users to reset their API keys; we'll post an update as soon as this is possible.

If you have any questions, don't hesitate to email us at [email protected].

in Engineering

Charge date added to bulk payments

We're really excited to announce that it is now possible to specify a charge date when making bulk payment submissions. This has long been one of our most requested features: it enables businesses to upload multiple payment CSVs less often, saving time so that they can concentrate on the more important parts of their business.

Specifying a charge date is really simple

After you have generated a template CSV file, you will find a new column called 'Charge Date' - just enter the date that you would like the customer's payment to be charged from their account (using the format DD/MM/YYYY).

Future features

In the coming months we hope to release many similar features that help make Direct Debit simpler for businesses everywhere. If you have any feedback about enhancements that you would like to see, just pop an email to [email protected]

Log in to your dashboard to try it out
Import payments
in Engineering

Partner payout breakdown

If you have a partner account, you can now view the individual payments that make up each of your payouts from the dashboard (and export a CSV of these payments for your own records) much like you can for your merchant account.

Simply click through to the individual reference to see which payments and/or refunds are included in each payout - to be emailed a CSV of these details, you just have to hit the ‘Export’ button.

Log in to your dashboard to view your payouts
Partner payouts
in Engineering

More control over scheduled payments in Sage

You can now cancel any upcoming/scheduled payments from within the GoCardless add-on in Sage.

You can do this by opening up the GoCardless add-on, selecting ‘Unpaid Invoices’ and double-clicking the invoice you’d like to cancel upcoming payments for. Then simply select ‘Cancel all scheduled payments’ - this will cancel all ‘Scheduled’ payments for that invoice.

Interested in using GoCardless with Sage?
Find out more

Clearer API Documentation

This post relates to the Legacy GoCardless API. If you're starting a new integration, you'll need to use the new GoCardless API - for help getting started, check out our guide.

We’ve spruced up our API documentation, including clearer explanations and diagrams on payment flow and resource creation.

We’ve also added a new troubleshooting section to provide some answers for the most common API-related queries we receive.

Check out the updated docs here and see what you think. As always, if there’s anything you think is missing/incorrect, feel free to submit your own pull request on GitHub for review.

Earn £50 in credit by referring someone today
Refer a friend

Payyr – A Direct Debit experiment for friends

Today, we're releasing Payyr, a free Direct Debit platform for friends. Think of it as a little Christmas gift.

Direct Debit is a fantastic payments system for businesses. As the UK's largest provider, we've helped thousands of them take control of their cash flow with it.

So what happens if you reimagine Direct Debit to help individuals collect small payments from each other? You get our new app, Payyr.

Payyr is Direct Debit for friends, on your iPhone. We've put an incredibly simple interface on top of GoCardless' next generation payments platform. Together they make IOUs a thing of the past.

Even better, Payyr is completely free, and takes 60 seconds to get set up. No contracts, no set up fees, and no transaction fees; just simple payments at last. The only limit is a collection cap of £200 per month per friend.

Payyr is very much in the experimental phase, so we can't wait to hear your thoughts.

Download Payyr from the App Store now
Get Payyr
in Engineering

Our new API docs and Node library

I'm excited to announce the release of our rewritten and redesigned API docs.

We've rethought our documentation to provide more clarity and a better understanding of the core concepts of GoCardless. We've also added more example code for each of our official client libraries.

To help improve our documentation even more we're also making it open source.

New Node library

Swiftly following the joy of fresh, cleanly formatted docs I have the pleasure of introducing our new API library for Node.

Let's cut straight to creating a one off bill:

var url = gocardless.bill.newUrl({
  amount: '10.00',
  name: 'Coffee',
  description: 'One bag of single origin coffee'
});
// Now redirect to `url`

... or perhaps a subscription (we really like our coffee)

var url = gocardless.subscription.newUrl({
  amount: '30.00',
  interval_length: '1',
  interval_unit: 'month',
  name: 'Coffee',
  description: 'Fresh roast coffee subscription'
});
// Now redirect to `url`

Our Node library is currently in beta, so please get in touch if you experience any unexpected behaviour. For more example usage and documentation, please check out the Node library docs.

Open source and available today

Naturally, both our new API docs and Node client library are open source, and available on GitHub today.

This is also an opportune moment to say that we're hiring. If you love working on big problems then help us fix online payments.

We're hiring software developers
See job listings
in Engineering

Statesman: A modern, robust Ruby state machine

There's no shortage of Ruby state machine libraries, but when we needed to implement a formal state machine we didn't find one which met all of our requirements:

  • Easily composable with other Ruby objects. We need to define a state machine as a separate class and selectively apply it to our Rails models.
  • DB level data integrity. We run multiple application servers, so state change related race conditions should be prevented by a database constraint.
  • Full audit history of state transitions. We needed to persist transitions to the database and include unstructured metadata with each transition.

Today, we are pleased to release Statesman, the state machine library which we wish had existed.

Statesman logo

Statesman is extremely opinionated and is designed to provide a robust audit trail and data integrity. It decouples the state machine logic from the underlying model and allows for easy composition with one or more model classes.

It provides the following features:

  • Database persisted transitions for a full audit history.
  • Database level transition duplication protection.
  • Transition metadata - store any JSON along with a transition.
  • Decoupled state machine logic

How it works

A state machine might look like this:

class PaymentStateMachine
  include Statesman::Machine

  # Define all possible states
  state :pending, initial: true
  state :confirmed
  state :paid
  state :cancelled

  # Define transition rules
  transition from: :pending, to: [:confirmed, :cancelled]
  transition from: :confirmed, to: :paid
end

This class expects a model instance and a transition class to be injected on instantiation. Let's assume we have ActiveRecord models set up like so:

class Payment < ActiveRecord::Base
  has_many :payment_transitions
end

class PaymentTransition < ActiveRecord::Base
  belongs_to :payment
end

We can set up a new state machine easily:

payment = Payment.find(some_id)
machine =, transition_class: PaymentTransition)

And call a few methods:

machine.current_state # => "pending"
machine.can_transition_to?(:confirmed) # => true
machine.transition_to!(:confirmed) # => true
machine.current_state # => "confirmed"

Under the hood, Statesman creates, associates and persists a new PaymentTransition object. When we send current_state to the machine object, it queries the associated transitions and returns the state of the most recent one. This is fully normalized: state is not stored on the parent Payment model.

Guards & Callbacks

Often, the current state is not the only factor in deciding whether a new state can be applied. Statesman supports guards, which should return true or false; a false return value prevents the transition.

# ...
guard_transition(from: :pending, to: :confirmed) do |payment|
# ...

# Assuming payment.passed_fraud_check? evaluates to false
machine.can_transition_to?(:confirmed) # => false
machine.transition_to!(:confirmed) # raises Statesman::GuardFailedError

Callbacks are defined in the same way as guards and allow extra actions to be performed before and after a state transition:

before_transition(to: :cancelled) do |payment, payment_transition|
  # e.g. check that the payment can still be cancelled
end

after_transition(to: :cancelled) do |payment, payment_transition|
  # e.g. email the customer to confirm the cancellation
end

Storage Adapters

The examples show a Rails app using ActiveRecord, but there's also an adapter for Mongoid, and an in-memory adapter out of the box. An adapter need only provide a small number of methods to be compatible, so it should be very easy to implement one for your favourite ORM. A set of RSpec examples is provided - pass these and your adapter should work just fine. Pull requests are very welcome!

Getting started

Statesman includes Rails generators - two that create a new table and transition class (one for each storage adapter), and one that adds the required columns to an existing transition class. All of them expect to be passed the parent model name and the transition model name:

$ rails generate statesman:active_record_transition Payment PaymentTransition
# => Creates a new transition model using the ActiveRecord adapter

$ rails generate statesman:mongoid_transition Payment PaymentTransition
# => Creates a new transition model using the Mongoid adapter

$ rails generate statesman:migration Payment PaymentTransition
# => Adds required attributes to an existing PaymentTransition model


We've been using Statesman in production for a while now and have been extremely happy with the results. We've removed hundreds of lines of ad-hoc state checking code and have really enjoyed the benefit of decoupling state machine logic from our data models.

Statesman is available via RubyGems and the source is on GitHub. Suggestions and contributions are very welcome - please open an issue or pull request, or just send me a tweet @appltn.

We're hiring developers
See job listing
in Business, Engineering

Rolling your own cloud phone system

At GoCardless, we're committed to providing great support to all of our users. This means building great tools, both for our customers and for internal use.

We've previously looked at our Nodephone cloud phone system which uses Twilio's telephony APIs and hooks up our phone support with our user data, as well as various internal tools.

Quite a few people have asked us about open sourcing the application. We might well do this, but my experience shows that different users will have very different requirements, so it makes sense to roll your own. This is a guide that will help you do just that.

Choosing the framework: Ruby on Rails + Backbone

I've previously experimented with Node.js and its Express framework and Ruby on Rails for these kinds of applications. Rails is the stand-out choice for three reasons:

  • It's the framework of choice for GoCardless and most other startups too, meaning the expertise is most likely already there
  • Rails provides so much for free, so you spend less time reinventing the wheel and more time building functionality
  • Ruby has an incredibly rich ecosystem of gems, providing everything from Twilio API access with twilio-ruby to one-line tagging with acts-as-taggable-on

On the frontend, I opted for Backbone.js. Nodephone was my first time working with a JavaScript framework like this, but I'm absolutely converted. Its models and collections made data super easy to handle, and encapsulating presentation logic in bound views helped avoid inevitable callback hell.

Choosing the platform: Heroku

We're experienced at deploying and scaling Rails apps at GoCardless, but a service like this didn't justify giving our dev ops team another service to manage. Instead, we chose Heroku to give us:

  • super simple deployments
  • high availability so users can always reach us

Heroku offers a lot of addons as well, making it easy to add extras like log management, a Redis database or exception tracking in a couple of clicks with no billing complications.

Heroku can be a little expensive. If you're looking for a cheaper alternative and more flexible customisation, a great choice would be Amazon Web Services Elastic Beanstalk, especially with the benefits of the AWS free tier.

Working with external services

One of the great things about building a custom phone system is the freedom to integrate with other services you use.

Nodephone pulls in the profiles of callers from GoCardless itself based upon their phone numbers, as well as working with external apps like Salesforce.

When you're working with Twilio, response times of your app are critical to providing a decent experience for callers. Slow external services, and worse still, applications that inevitably go down, threaten that.

The ever-popular background job library Resque provides an ideal solution, allowing tasks that depend on external services to be performed asynchronously.

When someone calls in, we look up their phone number in GoCardless to see if we know who they are. To do this, we enqueue a Resque job to perform the API request whilst displaying a progress indicator to the client:

Looking up...

Once the job has finished in the background (which usually takes seconds) the app provides a real-time update to clients using Pusher's web sockets:

response = Gocardless::Merchant.find(args["merchant_id"])
Pusher.trigger('calls', 'lookup', { id: args["call_sid"], merchant: response })

Looked up caller
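For context, that snippet runs inside a Resque job's `perform` method. Wrapped up, it looks something like this sketch (the class name and queue name here are assumptions, not Nodephone's actual code):

```ruby
# Hypothetical Resque job wrapping the lookup above
class LookupCaller
  @queue = :lookups  # picked up by a background worker, off Twilio's request path

  def self.perform(args)
    merchant = Gocardless::Merchant.find(args["merchant_id"])
    Pusher.trigger('calls', 'lookup', { id: args["call_sid"], merchant: merchant })
  end
end
```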

If GoCardless is slow to respond or out of action, there will only be a slight degradation of service, rather than failed calls.

Avoiding missed calls

We've all called customer support lines, and one of the biggest frustrations is having to wait for ages to speak to someone who can help.

In order to manage bursts of incoming calls, we introduced a queuing system. Since our average call lasts less than 3 minutes, most callers can get through to someone within a few minutes. However, we also wanted to ensure we didn't miss calls from people getting frustrated by longer waits in the queue.

We achieved this with more Resque magic. Once a caller has chosen an option, they're placed in the queue and agents can pick it up by hitting the "Answer" button on their screen. When a call is answered, we mark it as such in our database.

When the call joins the queue, we simultaneously schedule a job for 3 minutes' time which will check whether the call has been answered:

# Schedule a check on this call in 3 minutes' time (via resque-scheduler)
Resque.enqueue_in(3.minutes, ForwardUnansweredCall, call.sid)

If, 3 minutes from now, it hasn't, it'll ring on all of the phones in the office immediately. This means that every single call is answered within 3 minutes during office hours - but usually much sooner!
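The delayed job itself stays tiny. Here's a sketch - the `Call` model, the `TWILIO` client constant, and the redirect URL are all assumptions for illustration, not Nodephone's actual code:

```ruby
# Hypothetical job: if the call is still unanswered, ring the whole office
class ForwardUnansweredCall
  @queue = :calls

  def self.perform(call_sid)
    call = Call.find_by_sid(call_sid)
    return if call.answered?  # an agent got there in time -- nothing to do

    # Redirect the still-live Twilio call to TwiML that dials every office phone
    TWILIO.account.calls.get(call_sid).update(url: "https://example.com/twiml/ring_everyone")
  end
end
```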

Open for business

A particular challenge when building phone systems, as simple as it sounds, is managing opening hours, thanks to complications like time zones. Existing gems I found, like business_time, were one step removed from what I needed.

It quickly became clear that I wanted something that didn't already exist, so I built it myself in no time at all. My in_business library lets you set your business hours on a daily basis, and provides simple open? and closed? methods to call in your code:

# Configure the gem with hours of your choice, on a day-by-day basis
InBusiness.hours = {
  monday: "09:00".."18:00",
  tuesday: "10:00".."19:00",
  # ...
  saturday: "09:00".."12:00"
}

InBusiness.open? # => true [if now is within the set hours]
InBusiness.open? DateTime.parse("9th September 2013 08:00") # => false
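The underlying check is straightforward. Here's a self-contained sketch of the idea - a hypothetical helper for illustration, not the gem's actual implementation:

```ruby
# Illustrative re-implementation of an open-hours check (not in_business itself)
HOURS = {
  monday: "09:00".."18:00",
  saturday: "09:00".."12:00"
}

def open_at?(time)
  day = time.strftime("%A").downcase.to_sym  # e.g. :monday
  range = HOURS[day]
  # Lexicographic comparison works for zero-padded HH:MM strings
  !range.nil? && range.cover?(time.strftime("%H:%M"))
end

open_at?(Time.new(2013, 9, 9, 10, 0))  # Monday 10:00 => true
open_at?(Time.new(2013, 9, 9, 8, 0))   # Monday 08:00 => false
open_at?(Time.new(2013, 9, 8, 10, 0))  # Sunday: no hours configured => false
```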

Taking inspiration from business_time, I added support for the holidays gem, making it possible to take into account public holidays with just two lines of code:

Holidays.between(Date.civil(2013, 1, 1), 5.years.from_now, :gb).
  map { |holiday| InBusiness.holidays << holiday[:date] }

InBusiness.open? DateTime.parse("25th December 2013 10:00") # => false

An important final addition was adding overrides for the set hours. From time to time, you inevitably want to open and close on an ad-hoc basis. For instance, you might decide to open on a bank holiday when you hadn't planned to.

Overriding business hours

The end result

Twilio, accompanied by popular tools, frameworks, and some ingenious thinking, makes it relatively simple to build powerful bespoke phone systems that run entirely in the cloud.

Do you love talking to customers and building great tools to track and improve the service you provide?
Join the GoCardless team.

Hutch: Inter-Service Communication with RabbitMQ

Today we're open-sourcing Hutch, a tool we built internally, which has become a crucial part of our infrastructure. So what is Hutch? Hutch is a Ruby library for enabling asynchronous inter-service communication in a service-oriented architecture, using RabbitMQ. First, I'll cover the motivation behind Hutch by outlining some issues we were facing. Next, I'll explain how we used a message queue (RabbitMQ) to solve these issues. Finally, I'll go over what Hutch itself provides.

GoCardless's Architecture Evolution

GoCardless has evolved from a single, overweight Rails application to a suite of services, each with a distinct set of responsibilities. We have a service that takes care of user authentication, another that encapsulates the logic behind Direct Debit payments, another that serves our public API. So, how do these services talk to each other?

The go-to route for getting services communicating is HTTP. We're a web-focussed engineering team, used to building HTTP APIs, and debating the virtues of RESTfulness. So this is where we started. Each service exposed an HTTP API, which would be used via a corresponding client library from the dependent services. However, we soon encountered some issues:

  1. App server availability. There are several situations that cause inter-service communication to spike dramatically. We frequently receive and process information in bulk. For instance, the payment failure notifications we receive from the banks are processed once per day in a large batch. If another service needs to be made aware of these failures, an HTTP request would be sent to each service for each failure. This places our app servers under a significant amount of load. This issue could be mitigated by implementing special "bulk" endpoints, queuing requests as they arrive, or imposing rate limits, but not without the cost of additional complexity.

  2. Client speed. Often when we're sending a message from one service to another, we don't need a response immediately (or sometimes, ever). If a response isn't required, why are we waiting around for the server to finish processing the message? This situation is particularly detrimental if the communication occurs during an end-user's request-response cycle.

  3. Failure handling. When HTTP requests fail, they generally need to be retried. Implementing this retry logic properly can be tricky, and can easily cause further issues (e.g. thundering herds).

  4. Service coupling. Using HTTP for inter-service communication means that mappings between events and dependent services are required. For example: when a payment fails, services a, b, and c need to know; when a payment succeeds, services b, c, and d need to know; and so on. These dependency graphs become increasingly unwieldy as the system grows.

It quickly became evident that most of these issues would be solved by using a message queue for communication between services. After evaluating a number of options, we settled on RabbitMQ. It's a stable piece of software that has been battle-tested at large organisations around the world, and has some useful features not found in other message brokers, which we can use to our advantage.

How we use RabbitMQ

Note: the remainder of this post assumes familiarity with RabbitMQ. I put together a brief summary of the basics, which may help as a refresher. Alternatively, the official tutorials are excellent.

We run a single RabbitMQ cluster that sits between all of our services, acting as a central communications hub. Inter-service communication happens through a single topic exchange. All messages are assigned routing keys, which typically specify the originating service, the subject (noun) of the message, and an action (e.g. paysvc.mandate.transfer).

Each service in our infrastructure has a set of consumers, which handle messages of a particular type. A consumer is defined by a function, which processes messages as they arrive, and a binding key that indicates which messages the consumer is interested in. For each consumer, we create a queue, which is bound to the central exchange using the consumer binding key.
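To make the binding-key semantics concrete, here's a pure-Ruby sketch of the matching rules a topic exchange applies: `*` matches exactly one dot-separated word, `#` matches zero or more. RabbitMQ does this internally - this is purely illustrative, and it ignores the edge case where `#` matches zero words:

```ruby
# Illustrative AMQP topic matching: translate a binding key into a regex
def binding_matches?(binding_key, routing_key)
  pattern = binding_key.split(".").map do |word|
    { "*" => "[^.]+", "#" => ".*" }.fetch(word) { Regexp.escape(word) }
  end.join("\\.")
  routing_key.match?(/\A#{pattern}\z/)
end

binding_matches?("paysvc.mandate.*", "paysvc.mandate.transfer")  # => true
binding_matches?("paysvc.#", "paysvc.payment.chargedback")       # => true
binding_matches?("paysvc.mandate.*", "authsvc.user.created")     # => false
```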

RabbitMQ messages carry a binary payload: no serialisation format is enforced. We settled on JSON for serialising our messages, as JSON libraries are widely available in all major languages.

This setup provides us with a flexible way of managing communication between services. Whenever an action takes place that may interest another service, a message is sent to the central exchange. Any number of services may have consumers set up, ready to receive the message.


There are several mature Ruby libraries for interfacing with RabbitMQ; however, they're relatively low-level, providing access to the full suite of RabbitMQ's functionality. We use RabbitMQ in a specific, opinionated fashion, which resulted in a lot of repeated boilerplate code. So we set about building our conventions into a library that we could share between all of our services. We called it Hutch. Here's a high-level summary of what it provides:

  • A simple way to define consumers (queues are automatically created and bound to the exchange with the appropriate binding keys)
  • An executable and CLI for running consumers (akin to rake resque:work)
  • Automatic setup of the central exchange
  • Sensible out-of-the-box configuration (e.g. durable messages, persistent queues, message acknowledgements)
  • Management of queue subscriptions
  • Rails integration
  • Configurable exception handling

Here's a brief example demonstrating how consumers are defined and how messages are published (the consumer body is illustrative):

# Producer in the payments service
Hutch.publish('paysvc.payment.chargedback', payment_id: payment.id)

# Consumer in the notifications service
class ChargebackNotificationConsumer
  include Hutch::Consumer
  consume 'paysvc.payment.chargedback'

  def process(message)
    # React to the chargeback here, e.g. email the merchant
    payment_id = message[:payment_id]
    # ...
  end
end
At its core, Hutch is simply a Ruby implementation of a set of conventions and opinions for using RabbitMQ: subscriber acks, durable messages, topic exchanges, JSON-encoded messages, UUID message ids, etc. These conventions could easily be ported to another language, enabling this same kind of communication in an environment composed of services written in many programming languages.

Today, we're making Hutch open source. It's available here on GitHub, and any contributions or suggestions are very welcome. For questions and comments, discuss on Hacker News or tweet me at @harrymarr.


How to build a large Angular.js application

At GoCardless, Angular.js has been in production since March this year. We wanted to share some of the things we've learned while building and maintaining a fairly large single-page application (SPA). The core app is 9K lines of code.

This post is split into two sections: why we chose Angular and best practices that have worked for us during development.

Project background

Angular came up when discussing a new project early this year, a project that started as a way to make our existing PayLinks tool more flexible, but became a full dashboard re-write.

Our legacy dashboards are stitched together with lots and lots of jQuery selectors on top of Rails. PayLinks was built using Backbone and served from within our legacy dashboards. Maintaining a SPA within server-generated views and separate, often conflicting, jQuery logic had become too painful.

Why Angular.js?

At GoCardless we write a lot of specs, but when it came to our front end we had all but skipped unit testing and settled for high-level integration testing. The front end web development community has only recently started taking testing seriously. Tooling has been poor and frameworks have not been built with testing in mind.

Writing testable JavaScript without a solid framework and good tooling in place is just too hard. In practice you often rush to get stuff done. Add some jQuery selectors here and there, stick it in application.js and be done with that fix. What you often end up with is code that is almost impossible to test. Do I mock the entire page HTML in the spec? What data do I pre-populate it with?... and so on. Messy code, even messier tests.

Angular.js is built from the ground up with testing in mind. In our opinion this makes Angular different from all other frameworks out there. It is the reason we chose it.

This testability is due to the feature set of Angular and the tooling available. Feature wise, dependency injection (DI), modules, directives, data binding, and the internal event loop all work together to create a testable architecture.

Angular maintains its own event loop outside the browser event loop to do dirty checking and make sure data is in sync. It will check all known objects for changes on every loop tick. This is done asynchronously. Because this loop is maintained by Angular, you can flush the queue of any outstanding request or pending change at any time, meaning you can test async code in a synchronous manner.

The test runner, Karma, makes testing directives exceptionally easy. It transparently loads your templates as scripts and exposes them as Angular modules. You can use the same concept to package your app for production.

With DI, you simply ask for instances to be provided; DI figures out how to actually create them. That makes it easy to mock or decorate them in tests.

Thanks to Angular's testability we've been able to keep our app maintainable even as it's grown to 9K lines of code. We've also made a number of changes to the way we work with the framework which have improved maintainability, described below.

Organising your files

Most project scaffolds, tutorials, and sample apps have a folder for each type of file. All controllers in one folder, all views in another, and so on.

After a lot of experimentation with folder structures, we found ourselves putting all related and dependent files in the same component/page folder, as shown below.

You know where to look. And more importantly you know where and how to add stuff. Everything is logically grouped with closely associated files.

Our structure is still a work in progress however, and the next step for us is to move specs into each page/component folder, as well as css styles.

Modules and dependencies

Most sample Angular apps have a single 'app' module namespace and all other files use this module to declare themselves.

The downside of this is implicit dependencies: it becomes hard to track down where stuff is coming from and what dependencies the part you are working on has. Bad news for maintenance. You also have to make sure you load all your modules in the right order - a pain in testing, where you just want to glob-load all the files.

After realising this we moved to one module per file, both for app code and templates.

Here's an example of a router for one of our show pages:

angular.module('gc.customerPlans', [
  'gc.plansService' // dependencies declared explicitly (name illustrative)
]).config(['$routeProvider',
  function customerPlansRoute($routeProvider) {
    $routeProvider.when('/customers/:id/plans', {
      controller: 'CustomerPlansController',
      templateUrl: 'plans/customer-plans-template.html',
      resolve: {
        plans: ['$route', 'PlansService',
          function plansResolver($route, PlansService) {
            return PlansService.findAll({
              customer_id: $route.current.params.id
            });
          }
        ]
      }
    });
  }
]);

For us, the explicit dependencies are a huge win. It also means that you can be very specific about which modules to load in your unit tests, and whenever you start working on some code you haven't touched in a while, you immediately know what components are in use.

The angular-app project is a great example of this in the wild.

Testing setup

We currently use Karma as a test runner for our unit tests. Our dashboard app has around 500 expectations, with spec coverage at 77%. A full run takes around 1s using Karma and PhantomJS. If you're not using Karma, you should.

For integration testing we're using Protractor, which will replace the current Angular Scenario runner. An awesome project with less awesome docs.

See our configs for each at the bottom of the post.

Replacing Rails

We used to have all app assets served by Rails, which is not a great static file server: it's single-threaded and synchronous. Page reloads in development were taking 15s!

After some consideration, we skipped Rails entirely for assets and now use Grunt instead. Full page reloads now take ~1.5s. SCSS and Angular templates are compiled on the fly in development.

In development a connect server is run with the compilation middleware. A Grunt task then generates an index.html file that references all our app assets.

Rails is still used to serve the index.html page due to auth being done entirely in the backend. At some point Rails will only be an API to the front end.


So far, we're really happy with Angular: it's been easy to develop with and maintain for the last 6 months.

We also plan on fully splitting the front end from our Rails app - something we'll definitely blog about - and on open-sourcing a lot of our custom components, moving them to installable Bower packages.

For now, most of our config and setup are below. In the near future we'll release a seed project that incorporates this.


Getting started:






Making Direct Debit simpler

Over the last month, GoCardless has added more than 30 new features. Here are some of the most requested:

  • Take 1000s of payments at once - upload a CSV to request payments from all your customers at once (read more).
  • Easily identify customers - you can now add custom references and company names to your customers (read more).
  • Know when you will be paid - simply click on a payment to see the exact payout date (read more).
  • Cancel payments - cancel via your dashboard, right until we submit the payment to the bank.

Don't forget to follow our Twitter to be the first to find out about new GoCardless features!

We'd like to know...

What's the most ridiculous reason a customer has given you for not paying on time? Let us know and we'll feature the best responses.


Introducing bulk payments

Today, we're excited to announce a feature that's been on our roadmap since we started the redesign of our dashboards: bulk payment submission.

Now, you can take thousands of payments by Direct Debit in just a few clicks.

Taking multiple payments is easy. We generate a .csv file of your customer data which you can edit in your favourite spreadsheet software. You upload your modified .csv back to us and we'll validate it and let you review & submit your payments.

You can find a detailed guide to taking bulk payments here.

We hope this new feature will make it even easier to save time and money for your business. As always, we can't wait to hear your feedback.

How can Direct Debit help your business?
Find out more

Talking to our customers 2.0

Back in January, I wrote about our customer support phone system, Nodephone. Based on Twilio's telephony platform, it allowed us to collect valuable data about our phone calls with customers.

Since then, GoCardless has grown to become the UK's largest Direct Debit provider. On support, we now generate huge amounts of data every day. We've continued iterating our phone system to help us:

  • Prioritise development by generating user request data
  • Train our staff by recording calls
  • Distribute our product by linking up with Salesforce
  • Deliver great customer support by routing calls efficiently
  • Allow flexible working so support agents can work from across the UK

This post describes how the system works.

Prioritising development by collecting feedback

Around two months ago, we started using our new feature tracking tool to log our customers' thoughts and comments. Often, however, we found feedback from calls was lost, as the process for recording it was still too time-consuming.

With our new phone system, it is extremely simple to log a feature request whilst on a call.

The agent simply enters the information in a textbox on-screen either during the call or immediately after.

It is then submitted instantly to the tracker, complete with the caller's details, so we can keep them up to date on progress or get in touch if we want to delve deeper into their thoughts.

Training our staff with call recordings

Providing great quality support means providing great training for our support agents. Further, to keep close to our customers we insist that everyone in the company spends time on support. That means a lot of people who need training!

One powerful way we've found to help people improve is to play back calls and discuss them. Our new phone system makes that easy - thanks to Twilio, we're able to record every phone call, providing amazing resources for training and monitoring.

Linking up with Salesforce

Many of our potential customers first contact us via our support channels, so we want the handover to our sales team to be as smooth as possible.

Now, we record data about prospects into Nodephone, and this is passed through automatically to our Salesforce CRM using their API.

Recording a prospect

What's more, the process for our sales team of calling back leads is massively simplified with a smart integration between Nodephone and our internal Chrome extension.

Call from Salesforce

Click the "Call" button and your phone rings and connects you straight through. Make some notes or schedule a callback in the window that pops up, and all the data gets passed back through to Salesforce automatically, including the recording. Magic.

Salesforce tagging

This has increased the call volume our sales agents are able to make by about 20% - a real difference for our productivity.

Providing a great customer experience

We want to provide awesome support, and that doesn't include waiting ages for an answer and then being sent around the houses. Our new phone system makes call routing simple, and ensures no-one waits longer than 3 minutes to speak to us.

Now, when customers call, they begin at a quick 10-second menu where we work out what they're calling about - whether they're paying someone, already collecting, or looking to start taking payments with us.

Once they've chosen an option, they enter the queue and show up in our revamped agent interface.

Queued call

After 3 minutes, if no-one has answered, all calls are forwarded to every phone in the office, ensuring a speedy response.

Allowing flexible working

As GoCardless has grown, we've needed to expand our support team. As we've done so, we've become more and more interested in a distributed support model, where agents work part-time from wherever they are.

Historically, it's always been easy to answer emails from home but not so simple for phone calls. Our new phone system fixes that.

When configuring Nodephone, agents enter the phone number they'd like to use when answering calls. Then, when they hit the "Answer" button, the call is put through to that phone. Agents can choose any phone number or even take the call direct from their laptop using Twilio Client.

More accessible data

In the original Nodephone, we recorded plenty of data for long-term stats, but it wasn't easily accessible on a day-to-day basis. We've taken two steps to fix this.

Firstly, we have a full interface for looking back at previous calls, incorporating recordings, notes and other data.


Secondly, if a caller has called before, you can get an at-a-glance view of their previous interactions at the click of a button.

Previous calls modal

To sum up

Nodephone 2 is already bringing great benefits to our internal processes and our customers.

If you're more of a technical type, stay tuned - in a few weeks' time, we'll have an in-depth look at the technical implementation of this new app.

Are you excited by building great tools to solve real business problems? Join the GoCardless team.


New payment timelines

Today, we're introducing a much improved payment timeline to your GoCardless dashboards.

The new interface provides more clarity around each stage of the payment process by recording events as they happen, such as refunds or subscription cancellations. We also now combine each step of the process with actions that you can take.

When things don't go to plan, it's important to know exactly what's going on. The new timeline will display clearly any problems with your payment and allow you to easily resolve them.

For example, in the rare event that a payment fails because of insufficient funds, we'll tell you and let you retry the payment.

We hope the new payment timelines will make it easier to see the status of your payments as well as see exactly when you'll be getting paid.

Log in to your dashboard now to check them out.


Re-writing from scratch

Generally speaking, you should never rewrite from scratch. We recently discovered an exception: making fundamental changes to the UI of a (simple) web application. Counter-intuitively, starting from scratch made us much more iterative.


The old GoCardless interface had been designed to make it easy to set up fixed payments. "PayLinks" were perfect for this - a single link with all the payment information embedded. Clicking one and completing our payment pages would set up a fixed payment plan to a merchant. Here's how the interface looked:

The old GoCardless dashboard

What "PayLinks" weren't good for was setting up and managing variable Direct Debits, against which ad-hoc future payments could be taken. These were quickly becoming our unique selling point to merchants.

Incorporating variable Direct Debits into our old interface required big changes in our UI. Not only were set-up links required (as before), we needed functionality to manage customers once they were set up. No one was clear how to add this without adding significant complexity to the existing interface. Since simplicity is GoCardless's headline benefit, that wasn't going to fly.

Our solution was to start from scratch in a new application. We built a new interface and then switched customers over. Here's what we learnt:

Starting from scratch let us iterate

Our old dashboards were at a local maximum, making it hard to iterate. Introducing elements of a variable Direct Debit interface would have added complexity, with no benefit until the whole new interface was ready.

For a fundamental change in UI, that lack of iteration wasn't acceptable. Starting from scratch with an early beta let us collect feedback on the new interface immediately. Our speed of iteration also increased, as we didn't have to worry about breaking an existing interface.

Starting from scratch let us launch faster

We launched our completely redesigned interface months before we tackled most of the harder problems in it. Initially launched for new customers only, in its first month the new interface added 10% to our revenue; 3 months in, it accounts for 50%.

By building the new interface from scratch as a separate application, we were able to defer hard problems like migrating existing users over until after launch. Whilst we work on that migration, the new interface is already having a positive effect on revenue.

Those tack-ons were there for a reason

As we rolled out our new interface to our older users, we heard requests for old features. Most were already on our roadmap, but there were others we'd missed. Typically, they were the small "fixes" that felt tacked on, but some users found essential.

After 3 months of iterations, for example, we've ended up with a top-level nav that looks spookily similar to before we started. Payments, customers, and payouts are all there, as is "Plans", which is very similar to "PayLinks":

The new GoCardless dashboard

We'd do it again

The cost of reimplementing existing features is obvious. Measuring the benefit of a rewrite is much harder. In our case, we've got no doubt that the benefits of an iterative approach outweighed the cost because it helped us escape a local maximum.

Adding an interface for variable Direct Debit was on our roadmap for 12 months, despite consistent demand from our users. 10 months of that delay were due to uncertainty around how to fit it into our existing product. That's a long time for any startup to be building the wrong things. Starting from scratch fixed that.


Tier One Design Mobile App

GoCardless Mobile App

Here at GoCardless we love developers, especially when they put together something cool with our API.

We've tried to make our API as flexible as possible to encourage new and interesting use cases. Previously that mostly meant web applications, but no longer!

The clever folks at Tier One Design have created an iPhone app which allows you to see a summary of all your GoCardless information on the go. You can keep up to date with all your customers, bills, subscriptions and account information in one place, right on your phone.

We love hearing from people about their experience developing with GoCardless. Have you created something awesome recently, or are you planning to integrate GoCardless with your app? Let us know!

Update (10th July 2013):

Tier One Design have just released a major update: you can now take payments from within the app. Check it out on the App Store.

Interested in building something with GoCardless?
Read the API docs

Making something customers want

Since day one, GoCardless has been trying to build something businesses desperately want: a simple, easy, low-cost way to accept Direct Debit online.

Scaling brings problems. Thousands of businesses are now using us. It's easy to listen to your customers when you only have a dozen, but how do you collect and organise feedback when there are thousands of them?

Enter technology. To solve this, we built a nifty feedback tool: it enables us to collect every piece of feedback we receive no matter where it is, collate it in one place and share it with the team.

How we collect our feedback

Great options already exist for collating feedback from a single channel (Get Satisfaction and User Voice are two of our favourites), but nothing really helped us collect feedback across different channels. In a typical day, we'll get feedback through emails, feedback forms, phone calls, forums, in the pub and everywhere in between.

So we decided to build our own solution.

We started by linking up the feedback box on the new GoCardless Dashboard, an easy win allowing us to record feedback and customer information directly from within our app - this is a goldmine for feature suggestions.

We then updated our phone call tracking system, Nodephone, to include a feature request field, making it simple to track feedback and link it to specific phone conversations.

An Awesome Extension

What about feedback we receive in forums? In emails? Twitter? Facebook? Fleeting moments of genius from a team member on a caffeine-induced high? All of these happen in a browser (or at least in close proximity to one), so we went ahead and built a Chrome extension.

The extension enables us to select any piece of text, right click, and instantly log that feedback right there and then. It’s that simple.

After being tested by some brave volunteers, we launched to the whole team later that afternoon.

Turning 1000 voices into 1

We feed all of the information into a simple Rails app hosted on Heroku. With the new tool, feedback rapidly accumulated. Our next challenge was turning thousands of unstructured textual feedback snippets into an actionable, prioritised list that could drive development.

The primary challenges here are cataloging the feedback and then ensuring it gets seen by the right people. Rather than expecting everyone to look at every piece of feedback, someone from the Customer Support team blitzes through all submitted feedback each week and makes sure it’s tagged for the appropriate features and products.

This gets automatically compiled into a summary which outlines the top requested features and areas for improvement that week, comparing them to all-time demand.

Focus, Focus, Focus

The feature tool now sits at the centre of any dev-prioritisation meetings, allowing the most requested features to naturally shift towards the top of our roadmap. This helps to focus discussions, and back up our intuition with some hard data. On top of this, we can go into the tool and dig in past the summary to see exactly what our users have been saying.

We firmly believe every member of the team should understand our customers' needs, so every team member has access to the app to read feedback and add comments.

We encourage this behaviour with an automatic summary email each day outlining the previous day’s feedback, enabling people to quickly spot potential issues and tackle them early.

We have been using our feature tracker internally for just under a month now, and it has already led to a few great new product features:

  • Global search for payments, customers and plans
  • Custom meta-data on accounts (e.g. add a membership ID to each customer)
  • Pending account balance view (i.e. how much money they have outstanding)

Final Thoughts

Building the feedback tool only took a day of our time, but the business application was clear and it's had an immediate effect.

How do you manage feedback in your company? Should we open-source the Rails app / Chrome extension? Let us know!

in Engineering

Cooking up an Office Dashboard Pi

Our Raspberry Pi Developer Dashboard

Since GoCardless started hiring a lot of new faces we've been looking for ways to keep everyone in touch with what's going on. One part of the solution has been adding dashboard screens around the office.

Putting together your own metrics dashboard is actually pretty simple and yields a lot of benefits. This post is a full how-to guide for building your own with a Raspberry Pi, an HDTV and a bunch of hackery.

Step One - Buy your ingredients

Raspberry Pi Components

TV Components

The TV supports a USB connection, so using the USB->Micro USB adapter, we can actually power the Pi without needing any additional wires going to the mains. Combine this with using a Wifi adapter instead of an ethernet cable and you can attach the Raspberry Pi to the back of the TV without any visible wires. Sweet.

For other configurations see here

Step Two - Prepare your filling

For Dashing Dashboards, use Dashing...

There were several options available to us for creating data dashboards but in the end, the one which seemed most flexible whilst remaining easy to implement was Dashing built by the awesome guys at Shopify.

Dashing allows you to very easily create jobs to pull and generate metrics which you can then send to pre-built (or custom) widgets on any number of customized dashboards in real time.
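
A Dashing job is just a scheduled Ruby block that pushes data to a widget via send_event. A minimal sketch - the 'signups' widget and the counts source are hypothetical, and keeping the payload-building in a plain method makes it testable:

```ruby
# Build the payload a Dashing number widget expects: the current value
# plus the previous one (the widget uses `last` to show the change).
def signups_payload(counts)
  { current: counts.last, last: counts[-2] }
end

# In a real Dashing job file (e.g. jobs/signups.rb) this would be
# scheduled and pushed to a hypothetical 'signups' widget:
#
#   SCHEDULER.every '60s' do
#     send_event('signups', signups_payload(fetch_signup_counts))
#   end
```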

I won't go into more detail on how Dashing actually works, suffice to say it's awesome and you should check it out.

If you want to get hacking quickly, look at the Getting Started guide on the github page.


The one thing we found was missing for us was persistence. For regularly updated metrics, vanilla Dashing is great: if you reboot, you see fresh data within seconds. However, for longer-interval updates (CI status, deployments, etc.), turned-off screens or rebooted Pis left us with blank dashboards in the interim.

The workaround we devised was very simple: point Dashing's history setting at a Redis-backed hash instead of a standard Ruby hash.

To do this you will need to add redis-objects to your Gemfile:

# Lets us persist data across reboots
gem 'redis-objects'

and then in config.ru add:

  # Redis URI is stored in the REDISTOGO_URL environment variable
  # Use Redis for our event history storage
  # This works because a 'HashKey' object from redis-objects allows
  # the index access hash[id] and set hash[id] = XYZ that dashing
  # applies to the history setting to store events
  redis_uri = URI.parse(ENV["REDISTOGO_URL"])
  Redis.current = => redis_uri.host,
                            :port => redis_uri.port,
                            :password => redis_uri.password)

  set :history,'dashing-hash')

Continuous Integration

We've recently started using CircleCI for our Continuous Integration and we really wanted a visualization of our CI status where everyone could see it instantly.

Dashing makes it super easy to add new widgets and since CircleCI has an API, it was relatively easy to come up with an integration of the two, resulting in these widgets:

Our Raspberry Pi Developer Dashboard

I've open sourced both our Single Panel and List Style widgets; feel free to customize them and add your own improvements!
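
To give a flavour of the glue involved, here's a sketch of mapping build data to widget fields. This is not the open-sourced widgets themselves, and the field names assume the shape of CircleCI's legacy v1 recent-builds response:

```ruby
require 'json'

# Map a CircleCI-style recent-builds response onto the simple
# label/status/colour fields a dashboard widget can render.
def build_widget_items(json)
  JSON.parse(json).map do |build|
    {
      label:  build["reponame"],
      status: build["status"],
      color:  build["status"] == "success" ? "green" : "red"
    }
  end
end

# A sample response in the assumed shape:
sample = '[{"reponame":"payments","status":"success"},
           {"reponame":"dashboard","status":"failed"}]'
build_widget_items(sample)
```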

Step Three - Bake your Pi

1. Image the SD card

I was doing all this on OSX so I followed the RPi Easy SD Card Setup guide and installed Raspbian Wheezy.

2. Get connected

The whole idea of this is to have the Raspberry Pi hidden behind the screen so trailing Ethernet cables isn't ideal. Luckily the Pi supports a range of Wifi adapters.

The Edimax wireless adapter I use takes up a USB port, but since we don't need it for anything else that's not a problem. After plugging it in, you'll need to make a few modifications to your Pi's network configuration:

sudo nano /etc/network/interfaces

Ensure that it contains the following information:

auto wlan0
allow-hotplug wlan0
iface wlan0 inet manual

If you want to be able to access your Pi from a static IP (very useful for reliable SSH access when it's tied up behind a flatscreen) you'll need to make the following changes:

auto wlan0
allow-hotplug wlan0
iface wlan0 inet static
address 192.168.1.XXX
gateway 192.168.1.XXX

Then run:

wpa_passphrase <SSID> <Passphrase>\
 | sudo tee /etc/wpa_supplicant/wpa_supplicant.conf
sudo ifdown wlan0
sudo ifup wlan0

You may see an error when running ifup, but this didn't seem to affect the actual functionality, and a quick ping to Google confirmed everything was working fine. At this point I switched to connecting via SSH and controlled the Dashboard Pi from the comfort of my desk.

3. Update packages

sudo apt-get update && sudo apt-get upgrade -y # Update the Pi

4. Start Browser on Boot

Install x11 server utils and unclutter:

sudo apt-get install x11-xserver-utils unclutter

Install midori (you could also use epiphany, chromium or a host of other browsers):

sudo apt-get install midori

Then run:

sudo nano /etc/xdg/lxsession/LXDE-pi/autostart

Note: Thanks to Simon Vans-Colina, who pointed out that Midori is no longer the default browser and must be installed.

Comment out the following:

# @xscreensaver -no-splash

Add these lines:

# Turn off screensaver
@xset s off

# Turn off power saving
@xset -dpms

# Disable screen blanking
@xset s noblank

# Hide the mouse cursor
@unclutter

Note: Thanks to Tom Judge who pointed out that inline comments were causing issues in xset

Mine looks like this:

@lxpanel --profile LXDE
@pcmanfm --desktop --profile LXDE
# @xscreensaver -no-splash

@xset s off
@xset -dpms
@xset s noblank

Add the following line to automatically load up your dashboard:

@midori -e Fullscreen -a

I chose Midori as it appears to render all the elements then refresh, rather than (for example) chromium which renders elements one by one on screen.

Enable booting straight to the desktop by running:

sudo raspi-config

Step Four - Tuck in!

As you've seen, getting metrics and dashboards up in front of the whole company is a relatively simple process and it's super easy to build your own.

As of writing, we have around 5 dashboards - 3 on screens and the others used by teams internally, tracking things like outstanding GitHub Issues, Revenue, Volume, User Sign-ups, Sales Pipelines and more. How we use these and their impact on our business is left for another post.

Hopefully you've found this useful and if you have any questions, feel free to email me: [email protected]. If you create your own dashboards or come up with any improvements I'd love to hear from you!


New features in the GoCardless API

This post relates to the Legacy GoCardless API. If you're starting a new integration, you'll need to use the new GoCardless API - for help getting started, check out our guide.

Today we're extending our developer API to include three great new features: payouts, pagination & improved filtering.


Payouts

You can now query payouts in our REST API, making it possible to build features like bank reconciliation into your integration.

For each payout we'll tell you the amount, the fees charged and the reference that will appear on your bank statement.


Pagination

Using our new pagination features through the page and per_page parameters, you can cycle through your data in manageable chunks.

This is especially useful for merchants with a large number of customers or payments.


Filtering

API requests can now be filtered by a wide range of attributes. Full details are available here.

Available now

These additions have already been included in our API libraries.

If you have any questions, don't hesitate to get in touch by emailing us at [email protected] or giving us a call on 020 7183 8674.

in Engineering

Ditching responsive design

We've just redesigned our home pages, and moved from a responsive design to an unresponsive one. Given the trend towards a responsive web, we thought we'd share why.

Why we ditched responsive design

When we designed our old home pages we followed the trend towards responsive design. The result looked great on desktops and went some way towards being device agnostic.

Old design

I don't think anyone would argue against providing a mobile-friendly interface for your web applications. However, we had focused on fitting content to a flexible grid without really assessing the requirements of our site.

When we came to redesign our site again, we decided to think through the case for responsive design. There were three factors which tipped the balance against responsive design:

  • We were pitching to the wrong audience - it turns out that not many people shop around for Direct Debit on their mobiles. When we analysed our traffic, we found that only 2% of visits were from mobile devices.
  • It was much slower to implement - responsive designs took almost twice as long to design and implement compared to fixed-width designs. This was valuable time we could have spent improving other areas of the product.
  • It constrained our designs - we didn't have the resources to implement entirely different designs for desktop and mobile. This restricted us to simpler designs that could work for both formats. In some cases, this even led to compromised designs which weren't great for either format.

Old design

What did we do instead?

For our new design, we decided to stick to a fixed grid of 980px. This gave us a canvas that comfortably rendered on almost all desktops as well as on tablets.

New design

Using a fixed grid roughly doubled the speed of the design and development process. It also gave us more flexibility to implement designs which wouldn't have worked at smaller sizes.

New design

Not only did we save a lot of time by avoiding responsive design, we were also able to provide a better experience to 98% of our visitors.

When should you be responsive?

Sometimes the extra effort for responsive design is well worth the investment. We believe there are two criteria that determine this:

  • The proportion of mobile use - obviously, if a significant proportion of your traffic is mobile, you should design for that audience.
  • The purpose of the visit - will providing a better user experience for mobile users significantly impact your desired outcome?

For example, we believe it is really important to have a fully responsive design for our checkout pages. Even small changes in checkout page conversion can make a big difference to our customers. Whilst only ~3% of visitors to our checkout pages are on a mobile, having an appropriate design can significantly impact their conversion. So it's worth investing in, even for a small proportion of our visitor base.

Mobile visitors often have a very different set of objectives for visiting your site. In those cases, merely squashing content to fit on a smaller screen isn't particularly helpful. Instead, it is important to consider how different contexts change the content that users want.

Responsive design is definitely a useful tool to consider, but it's also important to be clear on the case for it before embarking on any new projects.

in Business, Engineering

Data-driven support

Here at GoCardless, we try to make all our decisions in a data-driven way. Staying customer-centric whilst we do so means making sense of all the interactions we have with our customers.

On the Customer Support team, we spend more time than anyone else speaking to customers. We generate a lot of rich but unstructured data: our email, calls and other interactions. Recently, we’ve been working on new ways to analyse it to improve our support and drive product development.

How things were

When I joined GoCardless we were providing great support but our tracking was ad-hoc. We were manually recording some interactions, but the data was patchy and the process was time-consuming.

Worse still, our support data wasn’t linked with the rest of our data, meaning we weren’t using what we did record. It was buried.

Enter Nodephone

We decided to begin our overhaul of support-tracking with our phone channel, and I set about building a new, metrics-driven system.

We had a good foundation to build on - all external calls come in on a Twilio number and are forwarded to phones in our office. To get good quality tracking, I built a system called "Nodephone" to sit in the middle. Built in Node.js with the Express framework, it sits on the other end of Twilio, responding with TwiML, but it also communicates with GoCardless and a web interface.

Any incoming call is logged and looked up in our merchant records. We then display the call on a simple web interface, where the support agent can see the caller’s name and click straight through to their records. At the end of the call, they can add descriptive tags and notes.

Nodephone interface

Now when customers call we know who they are and can greet them personally! If we don’t recognise their number and we find out who they are on the call, we can save it from the web client for future calls.

All the data entered, alongside the duration of the call, is saved on the merchant’s account for future reference.

A logged call

Data, data, data

All that data we’re collecting has already proved incredibly useful since we can analyse it to find the trends. For example, we want to know between what hours we should provide support, so I graphed the number of calls in different hours of the day over a typical month:

Calls over hours of the day

Clearly the vast majority of people call between 9am and 6pm, so we decided to set our office hours for then. We also use this kind of data to inform recruitment for the customer support team.

We can also see from the tags why people are calling - that is, whether they've paid via GoCardless (customers), collected money (merchants) or are interested in taking GoCardless payments (prospects):

A logged call

As a start-up that prides itself on reducing friction to signup we were amazed to see so many prospects trying to call us. What’s more, they were finding our number from parts of our site designed for our current users. From the data above, we decided to put our number front and centre on our merchant signup pages - check it out here.

The tags we logged on calls from merchants also showed that there was a lack of helpful information on the site to answer common questions, so we’re revamping our FAQs with a whole range of new content.

What’s next

We’ve built a powerful new automated metrics system for logging our phone calls.

Next we're targeting our other support channels. We'll be using our support software's API to analyse our email support, providing the ability to do all sorts of calculations which aren't possible with the software's own analytics.

We're also going to build an internal dashboard so anyone can see the headline stats for support in a couple of clicks.

Over time, we want to make all our internal analytics as powerful as we can. If you find problems like this interesting, we’re hiring and would love to hear from you.

in Engineering

Hacking on Side Projects: The Pool Ball Tracker

By day, we're a London-based start-up that spends most of our time making payments simple so that merchants can collect money from their customers online. Occasionally, however, we enjoy hacking on side projects as a way of winding down while continuing to build stuff as a team.

Since we have a pool table at our office, we decided to build a system to automatically score pool games. This post focuses on how we approached the initial version of the ball tracker. It's by no means complete, but it demonstrates the progress we made on it during the first 48 hour hackathon.

The balls would be tracked via a webcam mounted above the pool table (duct taped to the ceiling).

We split the system into three components:

  • Ball tracker: this reads the webcam feed (illustrated below), and tracks the positions of the balls.
  • Rules engine: accepts the ball positions as input, and applies rules to keep track of the score.
  • Web frontend: a web-based interface that shows the state of the game.

The Setup

  • Refurbished pub pool table
  • Set of pool balls (red and yellow)
  • Consumer webcam

The camera was set up directly above the centre of the pool table to avoid spending time fighting with projective transformations.

We chose to write the ball detector and tracker in C, using OpenCV. In retrospect, it may have made more sense to prototype the system in Python first. However, C is a language most people are comfortable with, and many of the online OpenCV resources cover the C API.

The Approach

We spent some time thinking about different approaches to tracking balls. There are a few main steps to the tracking process:

  • Filter the image based on the balls' colours, to consider only the relevant parts of the image.
  • Find objects that look roughly ball-shaped.
  • Use knowledge of previous ball positions to reduce noise and filter out anomalies.

Colour Range Extraction

We converted the input frames to the HSV colour space, which made selecting areas based on a given hue easier. The image could then be filtered using cvInRangeS, which makes it possible to find pixels that lie between two HSV values. We ran multiple passes of this process - once for each of the ball colours - yellow, red, black, and white.
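
The per-pixel test cvInRangeS performs can be sketched in a few lines - plain Ruby here purely for illustration, since the real tracker does this in C with OpenCV:

```ruby
# Produce a binary mask: 1 where an [h, s, v] pixel lies between the
# lower and upper bounds, 0 elsewhere.
def in_range_mask(pixels, lower, upper)
  pixels.map do |(h, s, v)|
    h.between?(lower[0], upper[0]) &&
      s.between?(lower[1], upper[1]) &&
      v.between?(lower[2], upper[2]) ? 1 : 0
  end
end

# Keep only yellow-ish pixels; these bounds are illustrative, not the
# values we actually tuned for our table.
yellow_lower = [20, 100, 100]
yellow_upper = [35, 255, 255]
in_range_mask([[30, 200, 200], [120, 80, 40]], yellow_lower, yellow_upper)
```

Running one pass per ball colour yields a mask per colour, which is then fed to the ball detector.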

Finding the Balls

Our initial stab involved using the Hough transform (cvHoughCircles) to locate the circular balls. After spending some time tweaking parameters, we got some promising results. 

Tracking Moving Balls

One problem that became immediately apparent was that the tracked balls would frequently pop in and out of existence. One cause for this was the Hough transform failing to handle the deformation of the balls in motion (caused by a relatively slow shutter speed). The colour mask would also occasionally hide balls due to changes in lighting. We needed some kind of tracking.

The first approach was the simplest thing we could do. The position of the balls was stored in memory, and if a pool ball was detected within a threshold it would add confidence to this position. Positions that hadn't been detected for a set number of frames were discarded.

Later, we expanded on this approach and mapped the balls onto previous positions with a simple distance heuristic. This meant they would more smoothly track across the table instead of leaving 'ghosts'. This approach can potentially be expanded in interesting ways - for example, using a basic physics simulation to predict where the ball should be based on its past trajectory.
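
That distance heuristic amounts to nearest-neighbour matching against the previous frame. Sketched here in Ruby for clarity (the tracker itself is C):

```ruby
# Euclidean distance between two [x, y] positions.
def distance(a, b)
  Math.sqrt((a[0] - b[0])**2 + (a[1] - b[1])**2)
end

# For each detection, snap to the nearest previous position if it is
# within the threshold; otherwise treat it as a newly seen ball.
def match_positions(previous, detected, threshold: 20.0)
  detected.map do |d|
    nearest = previous.min_by { |p| distance(p, d) }
    if nearest && distance(nearest, d) <= threshold
      { from: nearest, to: d }
    else
      { from: nil, to: d }
    end
  end
end

match_positions([[0, 0]], [[3, 4], [100, 100]])
```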

Balls Ain't Round

The approach so far was working for simple cases, where balls didn't touch each other and were sitting still. However, as soon as balls started to move they transformed from sharp, bright circles to blurry, elongated blobs. This made them very hard to track using the Hough transform.

We reimplemented the ball detection code using a generic blob detector, and a bit of morphology. A great deal of parameter tweaking was necessary before we started getting convincing results. In the end, the blob tracker performed much better than the Hough transform did, especially when it came to fast moving balls.

This video shows the ball tracking progress at the end of the hackathon:

Next Steps

Although we didn't get perfect results, we were happy with the progress we made. But this isn't the end - we plan to continue working on it. The main priorities are:

  • Speed. The tracker currently runs at about 10 frames / second, which isn't nearly fast enough. We're currently experimenting with moving parts of it to the GPU.
  • Frame rate. The new GoPro sports camera streams high-definition video at 60fps. This should make it easier to track moving balls between frames.
  • More advanced tracking. The motion tracking we're currently doing is very naïve. We've been discussing how we could use more intelligent approaches to compensate for more of the errors from the detection phase.

This is still a work in progress, so if you're interested in helping out or have any advice to offer us, drop us an email or a tweet. We also had help from the London Ruby community - thanks in particular to Riccardo Cambiassi on this project.

In a future post we'll talk about the other parts of the system - the rules engine and the web interface.

If you find problems like this interesting, get in touch - we're hiring.

Discuss this post on Hacker News

in Engineering

A Second Look at the GoCardless PHP Library

This post relates to the Legacy GoCardless API. If you're starting a new integration, you'll need to use the new GoCardless API - for help getting started, check out our guide.

In our first blog post about our PHP library we looked at how to create a payment URL and then confirm the payment once it had been created. Examples of this are included in the repo itself (in the /examples folder) and detailed in our documentation too. Now let's look at a couple of other features available with the library.

Passing a variable through the payment process

Very often when the user arrives at the Redirect URI (where you must confirm the new payment object) you will want a way to refer to what the payment was for. We have a variable for that called 'state'. You can pass in 'state' as a variable when you generate a new payment link:

// The parameters for the payment
$subscription_details = array(
 'amount' => '10.00',
 'interval_length' => 1,
 'interval_unit' => 'month',
 'state' => $reference
);

// Generate the url
$subscription_url = GoCardless::new_subscription_url($subscription_details);

// Display the link
echo '<a href="'.$subscription_url.'">New subscription</a>';

It then gets passed back to the Redirect URI page as a GET variable which you can access like this:

$reference = $_GET['state'];

Creating bills within a pre-authorization

Our pre-authorization payment type lets you bill your customer whenever you want, up to an agreed limit. To create a bill within a pre-authorization, you can do the following:

// Load your pre-authorization
$pre_auth = GoCardless_PreAuthorization::find($pre_auth_id);

// Details of the bill to create
$bill_details = array(
 'amount' => '5.00'
);

// Create the bill
$bill = $pre_auth->create_bill($bill_details);

Cancelling a subscription

// Load the subscription
$subscription = GoCardless_Subscription::find($subscription_id);

// Call the cancel method
$subscription = $subscription->cancel();


Webhooks

We fire off a webhook a few days after a payment is made to confirm whether the payment was successful or not. In the most recent version of the PHP library we included a webhook listener demo. The webhook content is sent in the body of the request (rather than in the headers) which you can extract like this:

// Use this line to fetch the body of the HTTP request
$webhook = file_get_contents('php://input');

// Convert the JSON blob to an array
$webhook_array = json_decode($webhook, true);

// Validate the webhook
if (GoCardless::validate_webhook($webhook_array['payload'])) {
 // Send a success header
 header('HTTP/1.1 200 OK');
}

in Engineering

GoCardless PHP library

This post relates to the Legacy GoCardless API. If you're starting a new integration, you'll need to use the new GoCardless API - for help getting started, check out our guide.

PHP is the most requested API client library here at GoCardless HQ. Today we’re pleased to announce that we have released our PHP library on Github. This post will give you a walkthrough of how to use it to collect a regular subscription payment. Once you've read it through then it might be useful to check out the examples included within the library.

First, check out our general overview of signing up to GoCardless. This walks you through signing up for a sandbox account. Once you have logged into the sandbox, click the 'Developer' tab to find your API keys - you'll need these shortly.

'git clone' or download the latest version of the PHP library from Github and copy it into a subdirectory. The essential files that you need are in the /lib folder. Then initialize the library within your code like this:

// Include library
include_once 'lib/gocardless.php';

// Config vars
$account_details = array(
 'app_id' => XXXXXXX,
 'app_secret' => XXXXXXX,
 'merchant_id' => XXXXXXX,
 'access_token' => XXXXXXX
);

// Initialize GoCardless
GoCardless::set_account_details($account_details);
Next, generate a URL to send users to so they can subscribe to your service:

// The parameters for the payment
$subscription_details = array(
 'amount' => '10.00',
 'interval_length' => 1,
 'interval_unit' => 'month'
);

// Generate the url
$subscription_url = GoCardless::new_subscription_url($subscription_details);

// Display the link
echo '<a href="'.$subscription_url.'">New subscription</a>';

When the user clicks this link they will be redirected to GoCardless to enter their bank details and create a new subscription. After this is complete, they will be redirected to the path you’ve set as the 'Redirect URI' in the Developer Panel.

The next step is to confirm the payment. You'll need the following code on the page that you've specified as the 'Redirect URI' in Developer settings:

// Default confirm variables
$confirm_params = array(
 'resource_id' => $_GET['resource_id'],
 'resource_type' => $_GET['resource_type'],
 'resource_uri' => $_GET['resource_uri'],
 'signature' => $_GET['signature']
);

// State is optional
if (isset($_GET['state'])) {
 $confirm_params['state'] = $_GET['state'];
}

$confirmed_resource = GoCardless::confirm_resource($confirm_params);

GoCardless will now generate a new payment for £10 and debit it directly from the user’s bank account every month. The first payment will be taken immediately.

You can now fetch information about all of your subscriptions from the API.


And you're done!

We hope this post has been useful. If you want to dive deeper into the GoCardless API then check out our documentation and our follow-up post on the PHP library.

Oh and we're hiring too!

in Engineering

Getting started with the GoCardless Ruby gem

This post relates to the Legacy GoCardless API. If you're starting a new integration, you'll need to use the new GoCardless API - for help getting started, check out our guide.

In this post, we’re going to learn how to implement recurring payments with GoCardless into a Ruby-powered website in just a few minutes. We’ll be using the GoCardless Ruby gem. You can find the full documentation for our Ruby library here and the source code is on Github.

The code below also powers our example site at

Before we start, check out our overview of getting started with GoCardless and follow the instructions for signing up for a sandbox account and getting your authentication details from the Developer panel - you’ll need them later.

We’re going to use Sinatra, a lightweight Ruby framework, for this demo. Nathan Humburt has written a simple post on getting Sinatra running on Heroku if you need a refresher.

First of all, let’s initialize our GoCardless client:

# In app.rb, or your main app file
# We're using the sandbox environment for testing

GoCardless.environment = :sandbox
GoCardless.account_details = {
  :app_id => 'XXXXXXX',
  :app_secret => 'XXXXXXX',
  :token => 'XXXXXXX'
}

We’ll then fetch some information about existing subscribers from the API:

# Define the index path in app.rb

get '/' do
  @subscriptions = GoCardless.client.merchant.subscriptions
  haml :index
end

Now we need a view to display that information, and a form for creating new subscriptions:

# views/index.haml

%h2= "We've had #{@subscriptions.length} signups so far!"

%form{:action => url('/subscribe'), :method => 'POST'}
  %h2 Enter your email to subscribe
  %input{:type=>"text", :name => "email"}

Next, we need to handle the POST request sent from the form submission. It’s extremely important that you don’t allow user input for sensitive values - particularly the amount! The interval_length and interval_unit parameters determine how often the user will be billed - in this case, every 2 months.

You can provide more information about the user, including their first_name and last_name. This is used to pre-populate the checkout form, and is particularly useful if the person has already filled in those details as part of your website’s signup process. See the docs for more details.

# Implement the subscribe path in app.rb

post '/subscribe' do
  # We'll be billing everyone £10 every 2 months
  # for a premium subscription
  url_params = {
    :amount => 10,
    :interval_unit => "month",
    :interval_length => 2,
    :name => "Premium Subscription",
    # Set the user email from the submitted value
    :user => {
      :email => params["email"]
    }
  }

  url = GoCardless.new_subscription_url(url_params)
  redirect to(url)
end

At this point, the user will be redirected to GoCardless to enter bank details and create a new subscription. After this is complete, they will be redirected back to the path you’ve set as the “Redirect URL” in the Developer Panel. You can use localhost:9292/confirm if you’re developing on a local machine.

Every 2 months from now on, GoCardless will generate a new payment for £10 and debit it directly from the user’s bank account. The first payment will be taken immediately.

Finally, you’ll need to confirm that your server has received this new subscription before it’s activated. We can do this with a new route in app.rb:

# Implement the confirm path in app.rb

get '/confirm' do
  begin
    GoCardless.confirm_resource(params)
    "New subscription created!"
  rescue GoCardless::ApiError => e
    @error = e
    "Could not confirm new subscription. Details: #{e}"
  end
end

You’re all done! The great thing about this simple solution is that you don’t need to store any user data.