Director, IT Incident and Problem Management

2 months ago


Belfast, United Kingdom Smarsh Full time

Summary

The Director of IT Incident and Problem Management is a senior leader responsible for shaping and transforming incident and problem management into a predictive and proactive discipline. You will drive an agile approach to incident response, building and leveraging AI-driven insights to enhance responsiveness and operational efficiency. Your leadership will underpin our pivot from a product-focused to a platform-focused service, ensuring seamless, resilient service delivery that meets our high standards for reliability and customer satisfaction.

As a forward-thinking leader, you will balance traditional ITIL frameworks with modern tools and practices, such as incident.io and FireHydrant, and embed accountability across engineering and operational teams. You will work closely with cross-functional stakeholders including Engineering, Product, and Customer Support to ensure that incidents are resolved promptly and root causes are addressed comprehensively, with the overarching goal of minimizing business impact.

How will you contribute?
  • Strategic Leadership: Provide visionary leadership to evolve our incident and problem management practices, embedding modern approaches that use AI, automation, and predictive capabilities to reduce response times and anticipate potential issues before they impact service.
  • Accountability and Performance: Foster a culture of accountability, holding engineering teams and incident responders to high standards for incident resolution. Ensure robust tracking and reporting of incident response metrics, creating transparency and setting clear performance expectations.
  • Platform-Centric Incident Management: Drive alignment between incident/problem management and the organization's shift towards a unified platform model, ensuring that incident management processes are scalable, adaptable, and aligned with platform objectives.
  • Modern Tool Proficiency: Deploy and optimize advanced incident management platforms such as incident.io and FireHydrant, utilizing these tools to enhance visibility, speed, and effectiveness of response across our platform. Adapt methodologies beyond traditional ITIL to remain agile and customer-focused.
  • Root Cause Analysis and Prevention: Lead comprehensive root cause analysis for major incidents, advocating a preventative stance through continuous improvement and resilience-focused practices. Apply SRE principles and drive actionable outcomes to prevent recurrence.
  • Data-Driven Insights and Reporting: Utilize data-driven insights to inform incident response strategies. Present trends, risk factors, and improvement opportunities to senior executives and stakeholders, supporting business decisions with clear, actionable metrics.

Typical Tasks:
  • Define and implement strategic roadmaps for incident and problem management, ensuring alignment with business objectives and platform goals. Regularly update practices to incorporate the latest in AI, automation, and predictive analytics.
  • Oversee major incident response efforts, ensuring fast, effective containment, resolution, and customer impact mitigation. Lead executive-level post-mortems and ensure comprehensive follow-ups.
  • Conduct and oversee in-depth root cause analyses for recurring or high-impact incidents, developing and deploying preventive measures across the platform to reduce recurrence.
  • Collaborate closely with IT operations, engineering, product, and support teams to ensure a unified approach to incident and problem resolution, with a focus on consistent customer experience.
  • Define, monitor, and optimize KPIs and performance metrics related to incident and problem management, keeping processes agile and aligned with evolving business requirements.
  • Lead continuous improvement initiatives, including evaluating and refining AI algorithms and predictive models to keep pace with evolving business needs and platform scalability.
  • Drive modular and scalable incident management practices, adaptable to the complexities of a multi-service platform architecture.
  • Develop and deliver reports on incident and problem management metrics for stakeholders, including executive leadership, product management, and customer success teams, to provide insights into trends, risks, and opportunities for improvement.

What will you bring?
  • Strategic Incident and Problem Management Expertise: 10-15 years of experience in IT incident and problem management, ideally within SaaS and platform-based environments, with a minimum of 5 years in a senior leadership capacity.
  • Modern Practices in Incident Management: Demonstrated expertise in using cutting-edge incident management tools (e.g., incident.io, FireHydrant) and AI-driven solutions to streamline processes, drive rapid response, and enhance service reliability.
  • Problem Management: Expertise in leading comprehensive root cause analysis and problem resolution efforts, incorporating Google SRE principles for preventive actions.
  • Google SRE Methodologies: In-depth knowledge of Google SRE philosophies, including error budget management, service level indicators/objectives (SLIs/SLOs), and effective incident response strategies.
  • Platform and SaaS Experience: Strong understanding of platform-oriented operations within B2B SaaS, ideally with experience in supporting a pivot from product to platform. FinTech experience is advantageous but not required.
  • Leadership and Accountability: Proven record of building and leading high-performing teams, with an emphasis on holding teams accountable to clear standards and ensuring consistency in incident response and resolution.
  • Collaborative Communication Skills: Excellent ability to influence and collaborate with cross-functional teams and executive-level stakeholders. Skilled in delivering complex insights to both technical and non-technical audiences.
  • Innovation and Continuous Improvement: Ability to drive continuous improvement through innovative practices, data insights, and strategic thinking. An advocate for evolving incident/problem management to proactively support business goals.
  • Cross-Cloud Environments: Experience managing incident and problem resolution in cross-cloud environments, ideally with a focus on seamless integration of diverse platforms.

Preferred Qualifications:
  • Bachelor’s degree in Computer Science, Information Systems, or a related field; a Master’s degree is preferred.
  • ITIL Expert certification and familiarity with Google SRE principles; advanced certifications in cloud platforms (AWS, GCP, Azure) or incident management tools are highly advantageous.
  • Familiarity with leveraging AI and machine learning within incident and problem management to predict incidents, automate responses, or identify root causes, showcasing an ability to bring innovative solutions to the role.
