SaaSOps Lessons for IT Operations

Across the IT community, there is a buzz around the SaaSOps movement, and it has been steadily gaining momentum. As David Politis of BetterCloud describes it: “SaaSOps is a practice referring to how Software as a Service (SaaS) applications are managed and secured through centralized and automated operations (Ops), resulting in reduced friction, improved collaboration, and better employee experience.” With increased SaaS adoption across the enterprise, the old IT playbook is no longer relevant. Organizations need a new set of principles, roles, and responsibilities to manage the modern enterprise IT stack. We asked Viswanatha Penmetsa, OpsRamp’s Director of SaaSOps, to share the secret sauce and explain how we deploy OpsRamp for SaaSOps excellence.


SaaSOps At OpsRamp

OpsRamp is a SaaS platform used to manage the entire lifecycle of IT operations. We also leverage AIOps to reduce alert noise and improve operational efficiency. To deliver the platform, we deploy PODs (Points of Delivery), the locations from which the software is delivered. We have 10 PODs across North America, Europe, and Japan, hosted in data centers and in the public cloud. We manage roughly 1.4 million resources and process around 6.5 billion metrics and 500 million events per day.

Below, I’ve laid out the key areas of focus for our SaaSOps practice:

  • Ensuring 24/7 Performance. As a comprehensive ITOM solution, OpsRamp monitors on-prem, cloud, and cloud-native resources, so the platform has many touchpoints and moving parts that must run efficiently. To handle this, we use a feature called Alert Browser to monitor the entire infrastructure, along with alert correlation to reduce noise. OpsRamp creates tickets and routes them to the right team members based on policies, which speeds up resolution. We also have a knowledge base portal integrated within OpsRamp that all teams can access regardless of location and time zone.

  • Managing Global PODs. To achieve complete visibility across infrastructure elements, data centers, and public clouds, and to ensure that we are the first to know about an issue, we have deployed the PODs within OpsRamp as clients. This gives our globally distributed Ops team members the ability to view all the infrastructure in one place. We use role-based access control (RBAC) and two-factor authentication for added security. OpsRamp automatically records all remote sessions so that we can go back and see who made changes, which aids troubleshooting and audits.

  • Automation for Efficiency. Our POD deployment process is 90-95% automated, and we are working to increase runbook automation to reduce repetitive tasks. Another exciting feature we are implementing is process automation, which executes a sequence of automation tasks; workflows can be triggered by alerts, by updates to resources, or on a recurring schedule. Ramping up automation has made a great difference in managing operations remotely this year. It helps our teams use their time efficiently and allows us to scale faster.

  • Service Maps. This capability in OpsRamp shows how related infrastructure elements support an application or service so we can find the root cause of issues faster. Service maps help us quickly identify service health rather than focusing on individual elements.

  • User Experience Monitoring. As cloud adoption grows and matures, performance metrics will be less about infrastructure and more about service availability, customer responsiveness, and customer experience. We use synthetic monitoring to measure end-user experience, so we can see whether users have trouble completing tasks or accessing features, or whether pages load slowly.
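
To make the service map idea from the list above more concrete, here is a minimal Python sketch of how service health could be rolled up from the elements a service depends on. It is an illustration only, not OpsRamp's implementation: the Node class, the state names, and the element names ("checkout", "web", "api", "db") are all hypothetical, and the dependency graph is assumed to contain no cycles.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Health states ordered from best to worst (hypothetical names, for illustration).
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}


@dataclass
class Node:
    """One element in a service map: a service, application, or resource."""
    name: str
    own_state: str = "ok"                      # state reported for this element itself
    depends_on: list[Node] = field(default_factory=list)


def service_health(node: Node) -> str:
    """Return the worst state found in the node or anything it depends on.

    Assumes the dependency graph is acyclic.
    """
    worst = node.own_state
    for dep in node.depends_on:
        dep_state = service_health(dep)
        if SEVERITY[dep_state] > SEVERITY[worst]:
            worst = dep_state
    return worst


def root_cause_candidates(node: Node) -> list[str]:
    """Return the deepest unhealthy elements reachable from the node.

    An element is listed only if it is unhealthy and nothing beneath it is.
    """
    found: list[str] = []
    for dep in node.depends_on:
        for name in root_cause_candidates(dep):
            if name not in found:
                found.append(name)
    if not found and node.own_state != "ok":
        found.append(node.name)
    return found


# Hypothetical service: checkout -> web -> api -> db
db = Node("db", "critical")
api = Node("api", "warning", [db])
web = Node("web", "ok", [api])
checkout = Node("checkout", depends_on=[web])

print(service_health(checkout))         # critical
print(root_cause_candidates(checkout))  # ['db']
```

The point of the rollup is the one made in the Service Maps item: the team looks at a single health state for the service and follows the dependencies down to the likely culprit, rather than inspecting each element on its own.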

With the explosion of enterprise SaaS applications and cloud workloads, the SaaSOps movement is no longer a choice, but an essential practice for IT organizations to embrace for efficiency, agility and reliability.
