Putting the Ops in DevOps – An Infrastructure Story - IT Revolution

In this transcript from the 2021 DevOps Enterprise Summit, Stephen Farley shows how Nationwide Insurance, a large Fortune 100 company, is infusing DevOps principles into infrastructure engineering even though DevOps has had a strong presence there for many years. Hear the story of “Putting the Ops in DevOps” after many years of DevOps adoption for our Software Development community. Come on the journey of how infrastructure teams support software teams while also thinking about their own product orientation to drive efficiency for the software we serve and the company products we sell. We will illustrate the guiding principles, building blocks, and playbooks used to embrace DevOps “Product Principles” and improve how we build, ship, and run our infrastructure in partnership with our software teams. Attendees will get a perspective of Product Taxonomy at a large Fortune 100 company and the complexity of defining products in a world that consists of business products, software products, platform as a product, and the journey maps and customer experiences they enable. See how the equation of Lean+Agile+DevOps is being implemented for Infrastructure teams and how we transformed the way we price and deliver infrastructure at Nationwide.

I’ll tell you a little bit about myself, kind of how I got into this infrastructure role in this story, but really, this is a story of our journey in DevOps, which started many, many years ago and how maybe we left infrastructure behind and what we’re doing to bring infrastructure into the picture. As Gene said, my new role is Vice President of Infrastructure and Operations, what we know as hosted solutions, hosting solutions in Nationwide, and I spent over 30 years at Nationwide in the IT industry all on the software development side of the house.

This all was about building software and building solutions for people. 18 months ago, close to two years ago, I was given the opportunity to take a new role at Nationwide, which was to lead Infrastructure and Operations. In my prior role, as we’ve spoken here before, and I’ll tell part of the story, is we had been on a multi-year journey to infuse Lean, Agile, and DevOps principles into well over 200, now 300 Agile teams across Nationwide, and it was very rewarding for me. You’ll hear a little bit about that story, but the story I’m going to tell today is about how we really brought infrastructure teams into the fold. As I mentioned, I’ve been here a long, long time.

Graduated in computer technology from Ohio State University, the Ohio State University, and hold my insurance designation, FLMI designation. Very proud of working for an insurance and financial services company so that I have the context to the business as I was building software throughout the years and leading software organizations. Interesting fun facts, so if you ever get a chance to … I wish we were in person out in Vegas, but if you ever do get a chance to kind of meet me, I’m a second-generation by marriage at Nationwide. My father-in-law was a Nationwide agent for many, many years.

My wife works here at Nationwide as a Release Manager. The very fun fact of the day is for anybody who loves horses: come talk to me, because I have a small horse farm just north of here in Columbus, Ohio, so that's just a little bit about me. I’ll give some credit here to the things that have influenced me most over the last 10 years, and you can see, a little credit to Gene there, The DevOps Handbook, The Phoenix Project, and some credit to Mik Kersten and Project to Product, as both Gene and Mik have presented here at Nationwide at our conference, which is now called TechX and was previously called TechCon, as we’ve worked that journey. Then a little bit of context about Nationwide. A lot of people don’t know we are a Fortune 100 company in insurance and financial services. On the money side of the equation, we have $46 billion in sales on an annualized basis and assets of $274 billion.

We have an investment portfolio of $125 billion. We’re actually number one in what we know as public sector 457, Retirement Plan Management. We are actually number one in pet insurance. We’re number one in insuring farms and ranches, which I hold a farm policy myself on my little horse farm, and so we do a lot of great things, a big company, so this is an at-scale kind of conversation today. With that said, you can see the context of Nationwide Technology as well.

We have a technology workforce of about 8,400 associates and contractors working on our technology on a regular basis. We’re articulating now about 300 plus software development Agile teams, which we call lines here at Nationwide. We spend a little north of a billion dollars in annual IT expenses across many technologies. We have everything, as Gene would’ve alluded to: we have mainframes, we have distributed technologies, we have cloud-hosted technologies, we have recordkeeping systems, digital solutions, mobile apps, a plethora of what we call primary technologies. For the context today, pay attention to this: we have 60 plus infrastructure product teams, and that’s a pretty new number to us.

We just now articulated, “What does product mean to our infrastructure teams?” Next up, this has been our journey, so this is what Gene would’ve alluded to in our prior presentations. It traces way back to about 2000, the pre-2011 era when Agile was coming on the scene, and we made our way through that evolution. In about 2014, we had about 50 Agile software development teams working with traditional Agile best practices in a co-located, Sprint-based, Scrum-based fashion, and then 2016 hit, and the word DevOps came on the scene. If you look at that 2018 number, we actually had 200 plus teams, where 86% of them were defined as mature in Agile and in many DevOps practices.

Clearly, from a Lean principle, and you can see the perspective, we believe in all things transparency, Gemba walks, accountability, systems of accountability, systems of transparency, visual management, and you can see those intersections that exist with Agile that we have around those principles and practices. You can see some of the detailed Agile principles and practices in the yellow section of the slide, but it was really for illustrative purposes. We were doing this as we started to tear down that separation of siloed plan, build, and run mindsets, and get away from those vertical practices to horizontal improvement of full-stack, build-it, run-it types of teams. We believed that DevOps was a set of practices and mechanisms that facilitated continuous delivery in support of our already existing Lean management systems and our recognition that we needed to go faster. Then, if you look at kind of what was happening, I can do this build-out here of the slides, and you can see that circa 2000, we were still a dominantly waterfall type of company, and we were doing physical hardware provisioning. We evolved to those iterative solution practices, Agile moving into some early water-scrum-fall practices, which often were not healthy, but that’s how we evolved into it. Here we are circa 2020 around continuous delivery with DevOps, and you can see I drew a pink circle off on the right.

Not red, pink because it’s not like red in trouble, but it’s, “Hey, we need to focus on the infrastructure components, and how do we infuse and transform our infrastructure environment as well?,” so starting with infrastructure automation and getting into this concept of product-centric infrastructure and product-centric implementation, which is really what the rest of the talk is all about. When I came to the Infrastructure team 18 months ago, I felt like something was missing. I actually had somewhat of a guilty feeling like, “Oh, I was so indexed on the software side of the house that I may have not been paying the right attention to that ops word.” That’s the key component of DevOps, and like, “What are the infrastructure teams doing to not only infuse themselves with the software teams but also to become DevOps themselves or become Lean and Agile themselves within the products that they are responsible for?” We had just come off a presentation by Dr. Mik Kersten on the book, Project To Product, and we said, “Well, product seems to be the place to go, that we need to organize around the principles and mindsets, and this could be our catalyst.”

In that mindset, we need to use all the things at our disposal from our learnings over the last 10 years and really bring them all forward for our Infrastructure teams. We said, “Well, why do we want to take this journey from an industry perspective?” so we’re seeing strong emergence in product-centricity and product focus in the industry. We’re getting positive feedback. We are seeing better quality from teams who truly own their Build and Run together with full accountability for their product, product teams who demonstrated lead time improvement or speed improvement when they own the entire life cycle, and it actually fostered better employee engagement overall. As we headed into this, we said, “Well, what are our challenges?” Well, one of the biggest challenges we face and one of the things I could use help from this community is, “What is the definition of product?”

If you’re coming from a software company, compared to an insurance company, compared to any other company in the industry, the definition of product is complex. Is it a homeowner’s insurance policy? Is it an annuity, a life insurance policy, or a pet policy? The other challenge is our technology is tightly coupled, and it’s very decentralized in how we build software and manifest infrastructure, so how would we define our products in a tightly coupled world was a very interesting conversation, and so those kinds of independent assets that are owned across the company from a technology perspective, they made it difficult to understand the true value stream independence that we may or may not have. We just felt that enabling full-stack product capability would drive a lot of unit cost improvement for us in the time when we needed efficiency in our IT organization and where we needed efficiency across the company.

We felt like we could drive organizational efficiency as well in the design of product teams, but we knew designing what we call a product taxonomy would be difficult, and we needed to do that in partnership with finance because one of the key elements of this product-centric model was, “How do we charge for our products? How do you unitize them?,” and I’ll talk about that, and, “How do you actually bring transparency to what people want to pay for and what they are paying for?” so those good decisions can be happening by the consumers of your products. These are all the things that we thought about why the product model seemed right for us. What did we want to get out of this? We knew the three biggest outcomes that we thought we could get out of this is we could actually get labor-related savings. We could bring efficiency to organizational alignment and structure.

We could organize around full-stack teams and we could drive labor-related savings and how we deliver the work against our products. The second thing is really key. We could get product stack savings in the form of hardware and software, and then one of the most important things but one of the hardest things to harvest is productivity by both development teams who use our products, but also productivity by the teams ourselves as we build, run, ship our own products that get consumed by the software development community. Next was the tough, tough part, which is, “Well, what is a product for an Infrastructure team, and how do we think about product at Nationwide?” This slide, we built says, “Okay, everything that we know from the industry is product takes on many forms,” and at Nationwide, we talk about this all the time, and everybody has a different perspective on what is that product, so you could go all the way to the left.

It could be a customer experience. In the middle there, it could be a business product like a homeowner’s policy. It could be something like an annuity product that we sell in financial services, but more importantly, as you look through the DevOps lens as an IT leader, what it meant to me was there’s some software product orientation as well, and there’s some technology platform product orientation as well, and where did we fit in this ecosystem within Nationwide and Infrastructure? Where we landed in Infrastructure and Operations is that we want to treat our products as technology products or products as a platform. First up is we had to understand our products.

We had to create focus. We had to establish a mindset. What does it mean to be a product manager of an Infrastructure platform, what are those guiding principles, and what does it mean to leverage Lean, Agile, and DevOps? Part of that was mindset training for a bunch of new roles that we define, which we went through role definition, so now, we have technology product managers of our teams, and we said, “Hey, in order to be a product manager, you have to own all the components of the product. You have to have sustainable operations. You have to be paying attention to unit cost and drive unit cost.”

“You have to understand your SLAs and your metrics, and you have to understand the complete life cycle, and you have to create feedback loops to drive customer feedback,” so everything around this circle represents the mindset of what we wanted product managers to be accountable for. No longer do we plan it, throw it over the wall to a Build team, who then throws it over the wall to a Run It team, and this is complete product management for the products we defined. We also said the guiding principles needed to be anchored to, we wanted industry-leading, competitive costs. That meant unitizing our products. How do we want to charge for our products in the form of a unit, and how do we work with our finance and our CFO teams to actually build what we call a financial simplification model to be able to outwardly charge for the consumption of our products?

We wanted to drive speed. We wanted to drive availability. We wanted to improve feedback loops of, “How did people like our products, and how did they actually weigh in on the backlog of our products?,” and then we needed to transform how we deliver those products, and this is a key element of infusing Lean and Agile and DevOps into Infrastructure teams. We ended up with 44 product teams that get supported by 26 service teams. A product team has to be a team that truly builds, ships, and runs a product, that delivers something that’s unitized and paid for by consumers.

A services team is one of the many teams that help foster the adoption of those products, the experiences with those products, the success of those products, and help people be successful across a very large organization. We worked through a taxonomy that started with, “What is a product suite?” “What is a product grouping?” “What is a product at the lowest level that gets unitized and that we charge for?” and, “How do we create the hierarchy or the relationship of those three connotations?” On the left-hand side, we tend to organize around product suites, and this is how we departmentalize our organization. On the right-hand side, we tend to unitize at the lowest level and charge for our products. Again, these are representative products that we literally unitized and said, “These are the things that we build, ship, and run.”

“These are the things that we will improve through product evolution, independent if these things are on-prem or on the cloud, and how do we actually help consumers understand how they’re consuming these products and pay for these products?” That was what we mean by product as a platform. These are the platforms we service and run. They change over time as new things come in. Event streaming, for example, is fairly new over the last couple of years for us.

As that came into play for us, we also had lots of storage solutions, lots of compute type solutions, and lots of database types of solutions that we charge for independently. Here’s an example of what we did for identity and access management. Actually, one of the members of my team, Tod Bickley, had given another DevOps Summit presentation around moving identity and access management specifically to a product model, and that meant some pretty good coupling with our CSO office and our Information Risk Management office around their definition of product. Our IRM office basically defined product as a capability and said, “We need to drive capability-based products,” and one of their products is identity and access management. Now, we build, ship, and run very specific platforms related to that, and we actually have three subproducts that we build, ship, and run, but it has a direct correlation to the product model that our CSO office and our Information Risk Management office have, and those alignments are really important.

You can see on the right the result of that, which is we’re able to articulate the cost per ID of an internal ID and the cost of an ID for an external ID. This is a product that we do an allocated model on. We don’t unitize it, but you can see, we could easily get there and say, “Okay. Well, who do we want to charge $260 per ID?” That’s what it costs to manage this across both the Security office and the Infrastructure Product office.
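To make the difference between an allocated model and a unitized model concrete, here is a minimal sketch. Only the $260 per-ID figure comes from the talk; the consumer names, ID counts, and the 50/50 allocation split are hypothetical values chosen for illustration, not Nationwide's actual numbers.

```python
# Sketch: allocated vs. unitized charging for an identity product.
# Only COST_PER_ID comes from the talk; everything else is assumed.

COST_PER_ID = 260  # dollars per managed ID, as quoted in the talk

# Hypothetical consumers and how many IDs each one holds.
id_counts = {"business_unit_a": 1_000, "business_unit_b": 3_000}

# Hypothetical fixed allocation percentages (the allocated model).
allocation_pct = {"business_unit_a": 50, "business_unit_b": 50}


def allocated_charges(total_cost, shares_pct):
    """Split a total cost by fixed percentages, regardless of usage."""
    return {name: total_cost * pct // 100 for name, pct in shares_pct.items()}


def unitized_charges(rate, counts):
    """Charge each consumer for the units it actually consumes."""
    return {name: rate * count for name, count in counts.items()}


total_cost = COST_PER_ID * sum(id_counts.values())
allocated = allocated_charges(total_cost, allocation_pct)
unitized = unitized_charges(COST_PER_ID, id_counts)

for name in id_counts:
    print(f"{name}: allocated ${allocated[name]:,} vs unitized ${unitized[name]:,}")
```

Both models recover the same total; the unitized one ties each consumer's bill to the number of IDs it holds, which is the kind of transparency described here.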

All right. New roles are critical. If you’re going to be a Product Manager, you have to have roles in the organization that actually set the stage for what should a Product Manager do, so we mapped over 700 associates and we moved from 14 to eight roles. We introduced new roles in this journey, one called a Technology Product Manager away from what used to be like a General Engineering Leader, Engineering Manager to a defined Technology Product Manager. We cleaned up some of the basic engineering roles, but we also inserted new roles like a Technology Delivery Leader.

You’ll not see us use the words project manager in our Infrastructure teams as we moved from Project to Product thinking, so we moved to Technology Delivery Leader. We put in Solution Engineering roles, and we put in a TSP, Technology Services Practitioner, role. Site Reliability Engineers emerged and came into play, as did a role called Technology Delivery Professional. That is a role likened to the Scrum master role in the industry that we embraced on the software side of the house, but we wanted to make a leap of faith into kind of the next generation of what that represented and say, “Hey, this is a role that we want to be able to also contribute to the work of the product team, to be engineering grounded, but also guide those best practices of the team,” so not just best practices from a Lean, Agile, and DevOps perspective, but truly anchored to engineering the product that they support. We went through a pretty massive role change to help people understand the expectations of the work that they did and the role that they held. The next big thing was really building a foundation and helping people learn, so rolling out things like a Lean management system, some common language, some learning paths for both leaders and associates, and you’ll see kind of what we did in that. “How did we get beyond project thinking in our portfolio and get into product-centric thinking in our portfolio, and then how do we get financially ready, as I mentioned earlier around a FINSIM model, to help bring transparency to the cost of the products and allow our consumers, the software teams and the business solution area IT teams, to understand what they’re paying for?” That’s what we call our FINSIM model, for financial simplification. For learning paths for associates and leaders, we actually partnered with a community college, Columbus State Community College, and we built a series of trainings on, “What does it mean to learn some of these new techniques?”

We actually fund this, and we get tuition reimbursement for our associates who go and get certified in these capabilities. It’s a certification program that takes 31 to 56 weeks, but it is really deep-rooted technology learning about what it means to think in a product-centric way and to apply some of the DevOps practices. You can see, we teach things like infrastructure as code as it relates to GitHub and source code management, some of the new networking capabilities and cloud basics, and infusing a developer mindset into infrastructure teams with a focus on developer tools and mindset. Those were core partnerships in trying to transform the organization and move it forward. A key element was unitizing our products and working with our finance office to bring transparency and cost recovery, so we basically pay for these products. We know the cost of the full-stack product, and then we unitize it and recover the cost of those products in our charge structure to those who consume the product.
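The unitize-and-recover idea can be sketched as a small calculation: take the full-stack cost of a product, divide by the units consumed to get a unit rate, then charge each consumer for its units and check that the charges add back up to the cost. Every figure and name below (the cost components, the database counts, the team names) is a hypothetical assumption for illustration; the talk does not give these numbers.

```python
# Sketch: derive a unit rate from a full-stack product cost and
# verify that the resulting charges recover that cost.
# All figures and names are hypothetical.

# Assumed annual full-stack cost of a database product, in dollars.
stack_cost = {"labor": 700_000, "hardware": 300_000, "software": 200_000}

# Assumed units consumed (number of databases) per consuming team.
units_consumed = {"claims_team": 150, "billing_team": 250}


def unit_rate(cost_components, total_units):
    """Full-stack cost divided by the total number of units consumed."""
    return sum(cost_components.values()) / total_units


def charges(rate, consumption):
    """Charge each consumer its consumed units multiplied by the rate."""
    return {team: rate * units for team, units in consumption.items()}


total_units = sum(units_consumed.values())
rate = unit_rate(stack_cost, total_units)
bill = charges(rate, units_consumed)

print(f"unit rate: ${rate:,.2f} per database")
for team, amount in bill.items():
    print(f"{team}: ${amount:,.2f}")
print(f"recovered: ${sum(bill.values()):,.2f} of ${sum(stack_cost.values()):,}")
```

In this simple form the rate is recomputed from actual cost and consumption, so recovery is exact by construction. A real charge structure would likely fix the rate ahead of time and true it up later, which is outside what the talk describes.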

We also decided, “Well, we need to understand what a product stack looks like,” and this is kind of representative of that. It says, “Here are the drivers of product unit costs.” Part of it is, “What does it take to operate the product?” We could have incidents. We have patching and vulnerabilities and minor upgrades and things like that, but we also have product evolution. These are things where products are evolving and moving on us, and we need to embrace those evolutions and invest in them.

Then, we have to consult what we do outwardly for the adoption of our products. Then, last but not least, we do product startup. We might find ourselves in a situation where there’s actually a new product that we want to start up and offer out there that we want to unitize and charge for in our infrastructure. This is kind of how we think about our product stack and what their unit cost is made up of. We envisioned what it meant to drive empathy, what it meant to drive feedback loops, and we kind of came up with a three-loop model that said, “We’d like to have feedback and empathy across our products, so if we’re managing 44 defined, detailed products of infrastructure, we want to know, how do people feel about the collective set of products?”

“Is something missing? How are we performing organizationally? We want to get feedback for the purpose of product evolution, like what is happening in a product as it’s evolving, especially as it moves from on-prem to cloud-native offerings and services?” That’s in the middle loop, and then at the moment of delivery, this is classic, Agile principles of product owner sitting with the team and helping to groom backlog and making sure this product team is actually doing the right things, and investing in the right things to make our consumers happy, and we ask each team to go out and start thinking about their product loops and figure out how do they get that perspective, which just so you know, one of the takeaways here today is like, “What do we struggle with?” That’s still one of the things, like infrastructure teams, like who acts as a product owner when you’re delivering product around compute or storage or an iSeries platform that multiple consumers use, and how do you get feedback on what’s the right things to work on in that product?

One thing I had a passion for was if we create feedback loops, how do we make sure that we respond appropriately? Nothing worse than a feedback loop that feels like a meeting, and basically, people go to a meeting and nothing happens. People feel like you’re unheard at that point, so how do you take and create continuous improvement from those feedback loops so that people feel heard and that you’re baking those CIs into your backlogs and actually addressing them as part of your product orientation and your flow? Those things, we built these customer feedback frameworks for people to take and adopt, not trying to be prescriptive, but trying to give frameworks so people can pull themselves forward. Then, we get to, “Well, how are we going to help these teams transform?”

We actually got to this transformation approach that says, “These are the things that help guide product model support.” I talked a lot about facilitated trainings. We still offered some Agile learnings because we had some teams that were still coming forward on what Agile is in an infrastructure world. We had some infused DevOps training, and we had what we call a Leader series to help leaders, because the worst thing you can do is go have a bunch of training for your associates and leave leaders behind. It helps them understand, “Hey, we want you to be the advocates for where we’re heading, and we want our leaders to be able to interact with our associates in Gemba walks and help them understand the why behind the equation and help them understand how do we best move in this new world, and how do we manage our work in this new world?” We also had self-service resources.

I’m a big believer in pull systems, so basically, how do teams pull themselves forward in their practices? One of the greatest benefits that we see that we get from a product-oriented mindset is being able to see the complete picture, being able to see what the cost of this product is, how we’re either infusing new cost, how we’re taking out costs, and how we’re driving unit cost efficiency. I’ll show you some examples, but until you start to look at it, you often can’t make decisions about how to drive unit cost improvement, and happy to say, one of the biggest outcomes we’ve achieved in this transformation is over $25 million of identified savings over the next couple of years through product stack efficiency and analysis, and saying, “Hey, we can make decisions about the tooling within our product stack, about the contracts within our product stack.” I’ll give you a couple of examples of that of how that evolved. Here’s an example, so storage as a product.

Storage gets broken down into object, file, and block, if you know infrastructure, and it gets broken down into data protection storage. As for the key technologies, these are the vendors we use, but don’t get anchored to that. We go work with these vendors and say, “Hey, you are a critical component of our stack and we want to optimize that stack, and we want to leverage your capabilities, but we also want to optimize our spend, so how do we go through that analysis and work with our vendors, and then ultimately, drive out these opportunities?” You can see in this stack over the next year, we see about a $1.1 million opportunity as we analyzed everything from contracts to the hardware and software that we use, and how we bring in different components of the stack to manage that product. Another example would be network as a product, so we treat the network as a product that has four separate product streams, covering everything from our firewalls to our web gateways, to our connectivity and our devices in our campuses and our data centers. What does that stack look like, and what do we see in opportunities? Not as much opportunity this year, but we do see some opportunity in firewall rule optimization that we can achieve, and that kind of analysis will be ongoing.

Then, another one you have to take into consideration is that we have a container-based product, so we made a transformation to Rancher as part of this stack, but this is a stack that you don’t get a lot of savings out of because this product is in a high-growth state. Basically, we’re spending money now and we’re growing the number of consumers as people move to a container-based platform to move to the cloud, so we actually expect more consumers of this product to help us drive down that unit cost, versus driving down expense within the product. That is an example of a growth-oriented product that we have. I mentioned the outcomes that we were striving for, and one of those outcomes was customer empathy and feedback: delighting our customers with our unit cost and with understanding what they’re getting charged for. Our history has been that we allocated our expenses out for many years and people didn’t know what they were paying for. How much of this is mainframe versus how much of this is distributed technology and the products? What I can say today is, I think if you ask any of our CTOs, they are delighted by the transparency that we now bring in this model, clearly knowing how many databases they’re consuming and paying for in the services they get for those products, how many compute environments, how much storage, how many MIPS they’re using on the mainframe as we unitize that, how many IDs they have in the mix and how they’re consuming those IDs, so I think that’s a great outcome that we achieved in this model.
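The growth-product point about the container platform can be shown with simple arithmetic: if platform spend stays roughly flat while the number of consumers grows, the unit cost falls even though no expense is removed. The platform cost and consumer counts below are hypothetical and only illustrate the shape of the effect.

```python
# Sketch: unit cost of a growth-stage product as consumers are added.
# The platform cost and consumer counts are hypothetical.

PLATFORM_COST = 600_000  # assumed flat annual cost of the platform, in dollars


def unit_cost(total_cost, consumers):
    """Cost per consumer when a fixed cost is spread across all of them."""
    if consumers <= 0:
        raise ValueError("consumers must be positive")
    return total_cost / consumers


# Assumed growth in consumers over three successive years.
consumers_by_year = [100, 200, 300]
unit_costs = [unit_cost(PLATFORM_COST, n) for n in consumers_by_year]

for n, cost in zip(consumers_by_year, unit_costs):
    print(f"{n} consumers -> ${cost:,.2f} per consumer")
```

A real platform's cost would not stay perfectly flat as adoption grows, so this is the simplest possible version of the argument rather than a forecast.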

The other one was delighting our associates, and these journeys are always tough. Some people move faster in the change curve than others, but if you just look through the lens of our Gallup engagement scores, we dramatically raised those engagement scores over the last couple of years. We’re now across just about all teams in infrastructure and operations, top-quartile Gallup scores. I hope that some of the product and organizational changes that we put in place are helping to drive that because one of the key questions on raising associate engagement is, “I know what’s expected of me,” and I just firmly believe that if you know that you own a product and you know that you’re accountable for the plan, build, run, you are accountable to build it, ship it, run it and make sure that you’re delighting your customers, you’re naturally going to raise engagement. I think that’s happening.

I’m not going to say it’s all attributed to our product model and our Lean, Agile, DevOps journey, but I think it has something to do with it. I’m going to close with really, what obstacles still remain? This is more of a question mark where I could use all of your help. A couple of those obstacles are clarity on product across the enterprise, so while we may have clarity inside of our infrastructure teams, how do we create those right intersections with other people in the software world, the business world, the consumer experience, customer experience world as they think about their products? We’re still working on that as an enterprise team and how we can think about product definition and product taxonomy, but it’s still a remaining obstacle at the enterprise level. Maybe not so much for our Infrastructure teams.

We’re in the middle of a cloud transformation at the same time, so we live in a very hybrid world, and kind of what is the cloud going to bring and the orientation of product for us? How do we define those products as we make those migrations to the cloud, and how do product teams that traditionally have managed infrastructure from an on-prem perspective, and how do they think about that product orientation in the cloud? Does it even exist, or do we have some responsibilities as a product team in the cloud? At the end of the day, that’s what I have to offer in our journey, and hopefully, that’s an interesting infrastructure story and how we started to really think about the journey we need to take in Infrastructure side by side with what’s been happening on the software development side of the house. I think that is all I had, and thank you very much.
