Internally at GoCardless, we are restructuring our server infrastructure from the top down. We plan to move away from the one-app-per-server model, which runs on a mixture of cloud and dedicated hardware, to a more general compute cluster powered by Apache Mesos and Docker on top of dedicated hardware.
One of the problems to solve leading up to this change is what we do with logs. Currently we use logstash-forwarder to pick up entries in our application's rails.log and forward them to our Logstash server. In the new model, applications can come and go frequently or be created solely to run a one-off or recurring task, and the container's filesystem is separate from the host's. Together these made our current solution inadequate.
Possible solutions included:
- Mounting an external volume on the Docker container, bound to its logs directory. However, this would require a large change to our applications to ensure each writes to a unique log file. Additionally, we find logstash-forwarder has difficulty with dynamic log collection and traversing a long directory tree.
- Moving logstash-forwarder from the host server into each individual container. However, this would increase the complexity of the container, and each container would now run N+1 processes.
- Having applications log directly to Logstash. This potentially decreases the reliability of the application by making it depend on a further external service.
After reviewing our options, we decided to design our own log collection agent using a unique combination of existing methods to provide a simple and reliable system for our logging. We called this Logjam.
Logjam is a daemon which listens locally on a UDP socket and accepts a stream of JSON log entries. These entries are then buffered (either in memory or on disk) and forwarded to a remote collection server (in our case Logstash) for storage and querying. In our use case, we configure Logjam on the host system to listen on the network bridge interface created by Docker (called docker0). By doing this we can ensure that all running containers can communicate with Logjam using the network bridge's IP address (by default
Logjam implements a simple processing pipeline consisting of two major subsystems. The first, called receiver, is responsible for listening on a local network interface and receiving incoming log entries. The second, called shipper, is responsible for taking log entries and reliably submitting them to a remote collection server for storage. Between the two subsystems sits a buffer of configurable size, held in memory and optionally persisted to disk, containing the recent log entries waiting to be shipped.
Logjam is written in Go and is available from GitHub.
We will be writing throughout this major re-architecting of our infrastructure, documenting our experiences as we design and implement it.