Technology stack transitions are hard

Mike Webb 2023-07-15

Technology

Businesses often face the challenge of transitioning between technology stacks. In 2019, we made the difficult decision to move away from Ruby and adopt Haskell as our primary stack. We have found this process to be complex and demanding, particularly when dealing with legacy systems. Retaining skilled developers who know both the old and new stacks has been crucial to our transition. We’ve focused on making incremental progress, delivering tangible value at each stage. In this post, we look at what it takes to manage a transition between technology stacks: retaining talent, maximizing value, and handling “leftover” code.

In a transition like this, your legacy stack will inevitably come to be viewed as “lesser” and, once this happens, it is difficult to retain developers capable of maintaining it. Once we had made the decision to move to Haskell, we lost several developers who were committed to Ruby and did not see a transition to Haskell as their preferred career path. We believed that preserving knowledge of the legacy codebase was vital for a smooth transition. We identified developers who were willing to learn the new stack and made sure they were well supported as they learned. We ran introductory workshops, book clubs and mentoring sessions, and assigned developers to projects building basic features in the new stack. This strategy mitigated the risks of the transition and preserved continuity.

In a business like ours, where software is not the “product” but the business is critically reliant upon it, we need to transition in safe increments and produce tangible value with each increment. In some instances the act of transitioning alone delivers sufficient value because the new stack is a better fit. With Haskell, we have been able to demonstrate a significant reduction in technical problems and outages as we migrate parts of our systems. In other instances we need to wait until a change is required in the legacy system and port that functionality in the process of delivering the change. We try to find ways to add value during this process. For example, we have many systems whose configuration rules are written in domain-specific languages (DSLs), and Haskell is fantastic for DSLs. We’ve been able to re-use a common DSL for configuration across these systems and, by doing so, have kept our internal-user interfaces consistent while the DSLs become more powerful with each new release. Adding the ability to configure a system using this common DSL has been a great selling point for transitioning legacy modules.
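To give a flavour of why Haskell suits this kind of work, here is a minimal sketch of an embedded configuration DSL. It is not our production DSL: every name in it (`Condition`, `Rule`, `matches`, `resolve`, and the carrier example) is made up for illustration, and it assumes records are simple lists of named text fields.

```haskell
-- Hypothetical sketch: none of these names come from a real codebase.

-- A record is a bag of named text fields (for example an order).
type Record = [(String, String)]

-- Conditions form a small expression language that every system can share.
data Condition
  = FieldEquals String String   -- field name, expected value
  | FieldOneOf String [String]  -- field name, allowed values
  | Not Condition
  | And Condition Condition
  | Or Condition Condition
  deriving (Eq, Show)

-- A rule pairs a condition with the setting it selects.
data Rule a = Rule
  { ruleWhen :: Condition
  , ruleThen :: a
  } deriving (Eq, Show)

-- Interpret a condition against a record.
-- A missing field fails both FieldEquals and FieldOneOf.
matches :: Condition -> Record -> Bool
matches (FieldEquals k v) r = lookup k r == Just v
matches (FieldOneOf k vs) r = maybe False (`elem` vs) (lookup k r)
matches (Not c)           r = not (matches c r)
matches (And a b)         r = matches a r && matches b r
matches (Or a b)          r = matches a r || matches b r

-- The first matching rule wins; otherwise fall back to the default.
resolve :: a -> [Rule a] -> Record -> a
resolve def rules r =
  case [ruleThen rule | rule <- rules, matches (ruleWhen rule) r] of
    (x : _) -> x
    []      -> def

-- Example rules: choosing an (invented) shipping carrier.
carrierRules :: [Rule String]
carrierRules =
  [ Rule (FieldEquals "country" "AU" `And` FieldEquals "express" "yes") "au-express"
  , Rule (FieldOneOf "country" ["AU", "NZ"]) "anz-standard"
  ]
```

Because rules are ordinary data, the same `Condition` type and interpreter can be shared by several systems, and adding a constructor extends the language for all of them at once. In this sketch, `resolve "manual-review" carrierRules [("country", "NZ")]` selects `"anz-standard"`, and a record that matches no rule falls back to `"manual-review"`.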

A natural consequence of an incremental strategy is that the transition takes a long time to complete. Bellroy is roughly 3 years into this transition and, looking at the trend in our Ruby line count, we’re probably about 2 years away from completing it. For all of that time, we need to keep our dependencies up to date, fix the odd bug in the legacy code, and support the growth of the business. So far, we have found that Haskell offers significant improvements in reusability, stability and maintainability. We expect the cumulative benefits of those improvements to let us complete the transition before the legacy codebase becomes too much of a burden.

Even with careful planning and execution, as we approach the end of the transition we’ll have a substantial amount of “leftover” legacy code. This is code that is critical to ongoing operations, but is difficult to improve or add value to during the transition to the new stack. Migrating this code may not directly contribute to Bellroy’s business objectives, making it challenging to allocate resources for the migration. However, leaving the code in its current state can have negative implications for future maintenance and scalability. Quantifying these intangible costs can help with making the case to allocate resources to migrate that code. At Bellroy, we have built a project evaluation engine (in Haskell, naturally) that lets us estimate the probabilistic Net Present Value (NPV) of a proposed project using textual descriptions of future cash flows. We use this engine to model the opportunity cost of maintaining legacy code (as opposed to building new features) and the risk of retaining legacy systems with a dwindling amount of specialised expertise in the team.
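The idea behind a probabilistic NPV can be sketched in a few lines. This is not our evaluation engine, and it skips the textual cash-flow descriptions entirely: it assumes a handful of discrete scenarios, each with a probability we assign by hand, and weights the NPV of each by that probability. All names and figures are invented for illustration.

```haskell
-- Hypothetical sketch: a scenario-weighted NPV, not a real evaluation engine.

-- A cash flow at the end of a given year (year 0 means today).
data CashFlow = CashFlow
  { cfYear   :: Int
  , cfAmount :: Double
  } deriving (Eq, Show)

-- One possible future, with the probability we assign to it.
data Scenario = Scenario
  { scProbability :: Double
  , scCashFlows   :: [CashFlow]
  } deriving (Eq, Show)

-- Net present value of some cash flows at an annual discount rate.
npv :: Double -> [CashFlow] -> Double
npv rate flows =
  sum [cfAmount f / (1 + rate) ^^ cfYear f | f <- flows]

-- Probability-weighted NPV across scenarios.
-- Assumes the probabilities sum to 1; they are not checked or normalised.
expectedNpv :: Double -> [Scenario] -> Double
expectedNpv rate scenarios =
  sum [scProbability s * npv rate (scCashFlows s) | s <- scenarios]

-- Made-up figures: migrating a legacy module costs 100 up front.
migrationScenarios :: [Scenario]
migrationScenarios =
  [ -- 70%: the migration goes to plan and saves 60 a year in maintenance.
    Scenario 0.7 [CashFlow 0 (-100), CashFlow 1 60, CashFlow 2 60]
  , -- 30%: it overruns, costing 50 more in year 1 before savings arrive.
    Scenario 0.3 [CashFlow 0 (-100), CashFlow 1 (-50), CashFlow 2 60]
  ]
```

Calling `expectedNpv 0.1 migrationScenarios` gives a single figure for the migration that can be compared with the same calculation for leaving the legacy code in place, where the "cash flows" would be ongoing maintenance cost and the risk of losing the people who understand it.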

Now if you’ll excuse me, I need to go and think about which part of our systems we’ll supercharge next…