Mike Kavis: Cloud Complexity is a CIO Job Killer

Today we share a recent conversation with Mike Kavis, Chief Cloud Architect at Deloitte Consulting. Mike has more than 30 years of experience in software development and architecture and has held numerous technical roles, including CTO, Chief Architect, and VP positions. He led a team that built the world's first high-speed transaction network in Amazon's public cloud and won the 2010 AWS Global Startup Challenge. Kavis shares his thoughts on new operating models for IT and the difficulty of making cloud infrastructure decisions.

OpsRamp: What are enterprise IT executives struggling with the most now when it comes to developing the right cloud strategy for their organization?

MK: Many companies have been on their cloud journey for a while. They know how to build in the cloud but struggle to run IT in the cloud. I'm working with clients to design new operating models to help them run what they build. What’s happening is that companies are missing SLAs because they are using old processes and ticketing systems, which were never meant for rapid deployment on virtual infrastructure. There’s a mindset shift needed right now in IT operations to build resilient cloud-based systems.

Another hot topic is the integration between cloud and edge. I keep reading articles claiming that the cloud is dead, but you need both. The edge is often used for triggering real-time decisions based on sensor readings; the data is then sent back to the cloud so that the edge devices can have their decision criteria fine-tuned based on analytics and machine learning in the cloud. Executives right now are thinking about creating new cloud operating models and developing a strategy for multi-cloud and hybrid cloud implementations to support that.

OpsRamp: Tell us more about this new operating model for IT operations.

MK: We hear a lot about shifting left in DevOps with testing and security. Only recently are people talking about shifting operations left. The new model is “you build it, you run it”. But it gets dangerous when businesspeople build and run solutions without adhering to the numerous security policies and regulatory controls that enterprises must support these days. So the idea is to create a platform team that acts like a cloud service provider. They put up the guardrails and standards on top of clouds. Then the business units can only consume what is approved and made available to them. There are lots of human considerations to this change. You’ll have operations and security engineers who code and follow the same software development practices as developers. Platform teams are set up as full stack teams made up of cross-functional expertise like development, security, operations, etc. The platform team is essentially becoming a product team. Their customer is the developer, which is a foreign concept, because there has typically been a lot of contention between developers and other teams. But now this is a developer-centric product organization that exists to provide capabilities to developers so that they can move fast but in a safe manner.

OpsRamp: So are developer roles moving outside of the IT organization?

MK: We used to have one team owning the whole stack including the apps. Now we’ll have a platform team that builds and manages the platform, and they really don’t care what the developers build. They are just providing the services and are making sure the services meet their SLOs, like AWS. Three to five years out, IT will serve primarily as an integrator and a platform provider and will run core IT services like email and SaaS for payroll and finance. Meanwhile, the business units will have budgets to build and run on top of the platforms and core IT services. So yes, the business units are starting to own the application build and run piece. This is a big change and there will be many steps to get there.

OpsRamp: What’s the end goal of this change?

MK: The speed of business is moving at a pace like never before. But we can't let developers run wild. The overall purpose of the platform team is to provide developers with an agile platform, while giving them a safe way to build and cutting out all the meetings, tickets and legacy processes that were designed for physical datacenters and biannual releases of large change sets. We need to move to the new model of small, incremental releases that are weekly, daily or even multiple times a day. The cloud platform provides that secure and compliant environment while replacing a lot of legacy processes with automation.

OpsRamp: What are your thoughts about hybrid cloud environments? Is this a transition solution or is it long-lasting?

MK: First, let’s start with definitions. When I talk about hybrid cloud, I am talking about on-prem and public cloud, while multi-cloud is using multiple public clouds. Then you also have hybrid apps, which is where I have to support an application both on-premises and in the cloud. There are very few use cases for a hybrid app. I can think of a cruise ship, because the ship is disconnected once it leaves the shore, or a mainframe application that is a system of record: you can’t get rid of it, but you want to create a cloud front end. Going back to hybrid cloud infrastructure, this will persist for a long time. Unless there is a solid business reason to shut down all the datacenters, spending the time and money to replace all of the applications isn’t feasible. But year-over-year, the on-premises footprint should continue to shrink.

OpsRamp: What about multi-cloud – there are lots of different opinions here.

MK: If you talk to the infrastructure and governance people, they like the multi-cloud option to mitigate the risk of vendor lock-in. This approach limits how much of the cloud provider’s service catalog you can use and puts more onus on you to manage infrastructure and middleware on the cloud. However, developers typically prefer one cloud because they can go faster and deeper into the cloud provider’s service catalog; they can use a powerful service like Google’s BigQuery on day one instead of waiting for an infrastructure team to stand up and manage a Hadoop cluster or some other third-party solution.

To me, using the right cloud for the right workload is a better recipe than the cloud-agnostic approach mentioned above. For example, I can go deep into the Google stack for analytics and machine learning, Azure for their excellent IoT services, and AWS for everything else since they are the most mature. However, complexity kills, and building cloud-agnostic platforms is a very expensive and complex undertaking. You might be better served spending the money and investing in a PaaS solution. In my opinion, it all depends on the business priorities.

OpsRamp: In what cases would you want an application to run in multiple clouds?

MK: The complexity of having one app in multiple clouds is huge, so you better have a significant use case to justify the additional complexity. Cloud is already complex because you are creating highly distributed systems, and now you are doing this across multiple cloud providers and on premises. There is a dream that I put everything in a container and it magically makes everything portable across clouds. But that’s all it is: a dream. Each cloud provider has its own security, identity and access management, and other services that make up their security framework, and those are not portable. The middleware is portable, but the app is only portable if you don’t leverage the higher-level APIs of the cloud provider. This turns the cloud into nothing more than commoditized infrastructure in someone else’s datacenter. The value of the cloud is in those higher-level services like database-as-a-service, machine learning, IoT and others.

OpsRamp: Making these decisions sounds very complex.

MK: This is what gets CIOs fired! This conflict over cloud strategy and the need to address new operating models is a job killer. CIOs have to get rid of the silos. I work with so many clients who are working on their third cloud strategy and their third CIO. Usually there is a TCO analysis at the beginning. Then the project starts and all these turf wars start, and there’s no progress. Then two years later, they’ve got a million-dollar cloud spend, the business is still waiting for new capabilities, and nothing gets shut off in the datacenter. Then the CIO gets fired. And it starts all over again. The CIO has to take a stand on the strategy, give the marching orders, and get people aligned. Success in the cloud requires rethinking how software is delivered and how the organization needs to be designed to support the new approach for the cloud. Right now, I believe we need top-down leadership to make this work and scale across a large organization.