It’s hard to believe that 2018 is coming to a close, and I’m reflecting on all that we have accomplished this year as a company. From my vantage point at Graphium Health, I can see so many great achievements this past year. Given my area of primary responsibility, I feel it’s only fitting to highlight the large-scale infrastructure migration we completed in November as my personal favorite among 2018 feats. Not only is it a feather in the cap of Graphium Health and our team of engineers, but it offers a rare peek behind the curtain to anyone interested in the technical operations that underpin the Graphium Health platform.
We are a partnership of clinicians and technologists working side by side, which I believe is one of the things that makes us unique as a company. We are innovators who love working in the healthcare space, specifically anesthesia, but we don’t innovate just for the sake of innovation. We are driven to simplify common workflows that have been neglected for too long. On one hand, the bar is low for the specialty of anesthesia, at least in the realm of innovation. It’s been largely ignored by the big EHR vendors. On the other hand, these “common” workflows in both the practice and business of anesthesia can be very complex. That’s why we always try to stay focused on making sure our technical innovations don’t get in the way. We believe technology should be simple. Achieving that goal means we often bear the burden of simplification for our end-users in order to keep the hard parts behind the scenes.
That’s why this migration effort is so relevant. I have been working with data, systems architecture and systems integration over the course of my entire career. As a former consultant for Fortune 500/1000 companies, I have also performed much larger-scale data migrations. But this particular migration was much more operationally complex because our platform never stops. Our users are spread across the nation in every time zone, and surgeries are obviously being performed around the clock. Likewise, our real-time integrations with customers and their facilities never stop. If they do, we’ve got problems, and alerts are likely going off somewhere!
Additionally, there are a lot of moving parts. As such, this effort had been at least two years in the planning. We knew there was going to be downtime, and we knew it would impact every one of our customers to varying degrees. We set an ambitious goal of completing it before the end of the year, and efforts ramped up significantly as we entered Q4.
While it was honestly grueling at times, it was also very exciting. We had to tackle difficult problems, yet we were able to leverage very cool technology for solutions. We were faced with the realities of early design decisions in our platform, yet we observed how well the vast majority of our original design had stood the test of time. And, it is here I must pause and pay homage to our legacy infrastructure provider – you know who you are. We had a significant set of infrastructure components that had run flawlessly for over 6 years without any outages. That’s a lot of uptime, folks!
Here are some key lessons I personally learned (or was at least reminded of) throughout this process:
- There’s no such thing as over-planning. Regardless of how much you prepare, things will still go wrong. Did things go wrong for us? At points along the way, yes. The first migration window, in fact, went three times as long as originally scheduled. We hit a snag we had just not planned for. All that planning and preparation, and yet we still found a scenario that had never occurred to us.
- There’s no such thing as over-communication. Regardless of how much you communicate, someone will still miss the message. We opted to break our communication plan into both global and customer-specific missives that aligned with our incremental migration strategy. We spent combined hours preparing and reviewing contact lists for each customer to make sure no one was missed. Yet, at the end of the day, some people were surprised by the outage. Things still worked out well, however. In fact, the vast majority of our customers experienced a much shorter downtime than was originally allotted.
- There’s no need to be afraid of the hard things. Instead, choose to do hard things well. We certainly were forced to tackle some hard things: technology decisions with significant performance, scalability, and cost implications, and decisions that, if made incorrectly, could haunt us for years into the future. And any time data is involved (and to be fair, when isn’t it?), there is the potential for things to go very badly if not handled with extreme care. We had to do a lot of testing, a lot of trial and error, and a lot of contingency planning in case we were forced to roll back at any time. The point is, we faced the hard things and pressed through to the end.
- There’s something to be said for trusting your gut. When we were originally designing the platform, there were honestly times we struggled wondering if we were over-engineering certain features. It sure seemed as if we were, yet our collective gut told us to be prepared for growth. Fast forward 6 years, and I remain convinced we chose the right infrastructure partner in the beginning. I also remain convinced we chose the right multi-tenancy architecture. This architecture allowed us to migrate customers incrementally, in batch sizes of our choosing. That alone made it possible for us to afford the proper sensitivities to our larger customers when it came to downtime planning.
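For the technically curious, the incremental approach from that last lesson can be sketched in a few lines of Python. This is only an illustration of the idea, not our actual migration tooling: the function name `plan_batches`, the `solo` parameter, and the tenant names are all hypothetical. It assumes each customer is an independent tenant that can be moved on its own, and that larger customers should get a migration window to themselves.

```python
from typing import Iterable, List, Sequence


def plan_batches(
    tenants: Sequence[str],
    batch_size: int,
    solo: Iterable[str] = (),
) -> List[List[str]]:
    """Group tenants into migration batches of at most `batch_size`.

    Tenants listed in `solo` (hypothetical: the larger customers) are
    always placed in a batch of their own so that their downtime window
    can be scheduled separately.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    solo_set = set(solo)
    batches: List[List[str]] = []
    current: List[str] = []

    for tenant in tenants:
        if tenant in solo_set:
            # A large customer gets a dedicated batch.
            batches.append([tenant])
            continue
        current.append(tenant)
        if len(current) == batch_size:
            batches.append(current)
            current = []

    # Whatever is left over forms a final, smaller batch.
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # Hypothetical tenant names, for illustration only.
    plan = plan_batches(["a", "b", "big", "c", "d", "e"], batch_size=2, solo=["big"])
    for number, batch in enumerate(plan, start=1):
        print(f"window {number}: {batch}")
```

Because every batch is a plain list of tenants, each one can be paired with its own downtime window and its own customer-specific communication, which is the property that made the incremental strategy workable.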
What was the final result? Over 100 database and application components were successfully migrated, including our primary data and service tiers. We were delighted to see API response times drop to half their previous averages, and the consolidation of services was already projected to bring a similar reduction in infrastructure spend. We completed all the core migration efforts right before Thanksgiving, ahead of schedule and under budget, giving all of us one more very big thing for which to be thankful. In fact, the long Thanksgiving weekend was worth all the work leading up to it, because it was one of the quietest “ops” weekends in recent memory.
So, hats off to the team at Graphium, as well as to all of our customers who endured their respective periods of downtime during this migration. I’m thankful for everything we accomplished together in 2018, and I look forward to the exciting things coming in 2019.