Given our roots in open data, transparency is big here. The Namara DevNotes series is a glimpse inside our tech, bringing the latest updates on our Namara platform. Chris Sandison is the Principal Engineer at ThinkData Works and provides an overview of the Namara integration with Sentry.io.
I joined ThinkData late in the summer of 2016. At the time, the development team’s responsibilities included:
- the Namara API (a mature Ruby on Rails monolith),
- the Ingest pipeline (four Ruby services),
- Unity (a Ruby on Rails full-stack application, which was just crawling out of its alpha stage).
The users we designed for were our in-house team of data engineers and research analysts. Our use cases centred around fielding client requests and reliably ingesting a growing volume of data.
Close proximity to our power users offered some advantages:
- We could quickly work through design cases and perform comprehensive QA on new features
- We would receive immediate notification if anything went wrong with a new release
- We were able to work alongside affected users to diagnose and reproduce issues
Addressing issues involved log parsing, re-ingesting partial or entire datasets, and having users take us through the steps to reproduce bugs. As much as we love to get our hands dirty, that's not something you can do with enterprise clients. It was clear that integrating with an issue tracking and error notification system was needed.
We required a solution that:
- Notified us when errors occur on any part of the platform
- Tracked error history, frequency, and regressions
- Exposed error context and specifics, including but not limited to affected user details, request payload, suspect data, and any other customizable fields
The Evolving Use Case
Fast forward to 2020: summer is here, and with it, a lot of changes to our platform and how we design. In the past year, we have been re-architecting Namara as a suite of micro-services (we'll tell you all about that in a later blog post).
We've been using the Twirp RPC framework by Twitch, which generates server and client code using Golang’s net/http library. Services are initialized from an evolving boilerplate application that defines our best practices for coding and hooks into any shared infrastructure — namely logging, message queueing, and our OpenTracing service.
The growth of our user base means we design for remote data scientists working on dedicated deployments of Namara. They ingest and manage their own data in an isolated and secure environment. This change means that we have lost the luxury of building a product for an in-house team. Our team has had to change the way that we design, code, test, and deploy.
We now required a solution that would:
- Notify us when errors occur on any part of the platform at least as fast as any one client can report it
- Track error history, frequency, and regressions
- Expose error context and specifics, including but not limited to affected user details, request payload, suspect data, affected client deployments, and any other customizable fields
- Allow us to triage and verify errors without the client’s involvement or access to their data
After evaluating several solutions, we went with Sentry. Along with meeting all of our core requirements, it boasts a continually growing collection of client libraries and integrations. We have made excellent use of their team and project management features by including Sentry signup as part of our engineering team on-boarding. We also use their GitHub tie-ins, along with email and Slack notifications, to surface and respond to issues faster.
Most importantly, triaging and addressing incoming bugs has become a task that can be delegated to a single developer, fostering broad product ownership across the team.
Integrating with a Monolith
For our APIs, we employed two patterns:
- A middleware approach to notify us of any unexpected errors for any request
- A service call that could be dropped in-line for any application
In both cases, wrapping Sentry's Raven library in a service class did the job: it could accept user and request context, environment variables, and generic payloads.
Integrating with Micro-services
Golang applications do not handle errors with the same broad strokes as a Ruby on Rails application. Handling returned errors is a cornerstone of Go development, and in our case, it always touches the same pieces of service infrastructure. It’s only natural that registering errors with Sentry is another piece of that infrastructure.
By starting with Sentry’s golang net/http docs, we construct a per-request wrapper that ties into Sentry’s notifier, alongside logging, tracing, error handling, and service recovery. On any error or panic, the handler constructs a custom error context with details, tags, a stacktrace, and the final error to return to the service consumer.
What This Means for the Namara Platform
In all cases, integrating with Sentry improved turnaround for addressing bugs, tracking error regressions, catching issues with fresh releases, detecting problems with service-to-service communication, and watching for suspicious user behaviour.
Sentry is part of our daily work — all the way from design to maintenance. It is crucial to our application infrastructure, right alongside logging, OpenTracing, and service recovery patterns. It has scaled out to many services, in many environments, and on many deployments. It has allowed the Namara platform codebase to stay reliable, extensible, and accessible to any developer who joins our team.