In this article:

  • Learn about OpsQ Recommend Mode, a new feature delivering auto-suggestions for alert escalation policies with OpsRamp’s machine learning algorithms.
  • Recommend Mode builds upon Observed Mode by enabling one-click automation of suggested actions for alert escalation management.
  • You can follow two approaches to use OpsQ Recommend Mode: First-Response and Alert Escalation. 

During last year's Thanksgiving and Black Friday holiday, Costco's North American e-commerce websites suffered an outage that lasted more than 16 hours and cost nearly $11 million. Given the high stakes (lost sales, reputational damage, and customer dissatisfaction) involved in technology outages, enterprises have raced to adopt AIOps tools to get ahead of service disruptions and maintain the quality of their digital experiences.

AIOps solutions combine machine learning and big data techniques to deliver real-time insights and impact analysis for IT incident management. IT professionals can tame alert storms and reduce constant firefighting with predictive insights, provided they can confidently rely on the accuracy and interpretability of the machine learning algorithms.

In 2019, OpsRamp introduced OpsQ Observed Mode, which presents shadow alert inferences for live event streams. Observed Mode lets incident responders preview the power of the OpsQ event management engine without actually enabling AIOps in a production environment. It gives enterprise IT teams a safe, low-risk way to assess the accuracy and utility of machine learning algorithms for incident analysis.

In 2019, we also introduced first-response policies to auto-suppress seasonal and learned alert noise. Machine learning insights come into play here so that IT operations teams no longer need to address each repetitive and redundant alert. Observed Mode for alert escalation and first-response policies enables instant recommendations on incident routing, prioritization, and categorization.

Recommend Mode: Informed Decision Making for Event and Incident Management

Can machine learning algorithms help IT operations teams proactively address and resolve frequently occurring issues with auto-suggested actions? Typically, IT practitioners spend countless hours trying to understand the critical context behind a sudden service disruption. Once an incident responder has a good grasp on the specifics behind an outage, they need to immediately route the incident to the right on-call teams that can promptly address the issue. 

The OpsRamp Winter 2020 Release introduces Recommend Mode (powered by OpsQ Bot) that delivers machine learning-based suggestions for first-response policies. Recommend Mode speeds up incident response by offering clear next steps for alert escalation policies and by executing recommended actions in a single click. Instead of forcing IT pros to dig through hundreds of alerts for incident diagnosis, Recommend Mode suggests appropriate actions to resolve the issue. OpsQ Recommend Mode is powered by predictive analytics and helps IT practitioners understand algorithmic recommendations for dynamic event management.  

Digital operations teams can enable Recommend Mode so that the OpsQ Bot tags relevant next steps for both native and third-party alerts. Suggested actions can include auto-suppression of an alert or incident ticket creation with relevant metadata. Recommend Mode delivers the right situational context for incident management by showcasing the alert history and showing who has previously worked on a similar issue. IT teams can either accept machine-based suggestions for first-response or override the recommendation when appropriate. 

Here are the two approaches for configuring Recommend Mode for intelligent incident management: 

  1. First-Response Policies. IT teams can use first-response policies to automate the suppression of seasonal alerts without affecting their overall operations. You can configure Recommend Mode for existing and new first-response policies to predict and provide recommendations for seasonal alert suppression. From the Setup tab, navigate to Alerts and then click on First Response to enable Recommend Mode. 

    Figure 1 - Eliminate alert noise and prevent operators from drowning in repetitive alerts with Recommend Mode for first-response policies.

  2. Alert Escalation Policies. Alert escalation policies deliver automated response actions for incoming alerts and help IT teams manage incidents as per defined service levels. OpsRamp learns and delivers specific recommendations around resolver groups, assignees, category, sub-category, priority, and related knowledge articles. IT teams can set an alert escalation policy to ON, Recommend, Observed, or OFF mode, which helps reduce manual incident creation and delayed actions in incident response.

    Figure 2 - Fix outages faster with Recommend Mode for alert escalation policies.
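
To make the difference between the four policy modes concrete, here is a minimal Python sketch of how a suggested action might be handled in each mode. It is an illustration only, not OpsRamp's implementation or API: the names PolicyMode, Alert, route_suggestion, accept_recommendation, and override_recommendation are all hypothetical, and the behavior assigned to each mode is an assumption based on the descriptions in this article.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PolicyMode(Enum):
    """Hypothetical model of the four policy modes named in the article."""
    ON = "on"                # assumed: the suggested action is applied automatically
    RECOMMEND = "recommend"  # assumed: the suggestion waits for an operator decision
    OBSERVED = "observed"    # assumed: the suggestion is only recorded as a shadow inference
    OFF = "off"              # assumed: the policy does nothing


@dataclass
class Alert:
    """Illustrative alert record; the field names are assumptions."""
    alert_id: str
    applied_action: Optional[str] = None
    pending_recommendation: Optional[str] = None
    shadow_inference: Optional[str] = None


def route_suggestion(alert: Alert, suggestion: str, mode: PolicyMode) -> Alert:
    """Store or apply a machine-suggested action according to the policy mode."""
    if mode is PolicyMode.ON:
        alert.applied_action = suggestion
    elif mode is PolicyMode.RECOMMEND:
        alert.pending_recommendation = suggestion
    elif mode is PolicyMode.OBSERVED:
        alert.shadow_inference = suggestion
    # PolicyMode.OFF: leave the alert untouched.
    return alert


def accept_recommendation(alert: Alert) -> Alert:
    """Model the one-click acceptance of a pending recommendation."""
    if alert.pending_recommendation is not None:
        alert.applied_action = alert.pending_recommendation
        alert.pending_recommendation = None
    return alert


def override_recommendation(alert: Alert, action: str) -> Alert:
    """Model an operator replacing the pending recommendation with another action."""
    alert.applied_action = action
    alert.pending_recommendation = None
    return alert


if __name__ == "__main__":
    alert = route_suggestion(Alert("alert-1"), "suppress", PolicyMode.RECOMMEND)
    print(alert.pending_recommendation)  # the suggestion is held, not yet applied
    print(accept_recommendation(alert).applied_action)
```

In this sketch, only ON changes the alert without human input. Recommend holds the suggestion until an operator accepts or overrides it, and Observed keeps it purely as a record for later review.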
