Director of Production Engineering

2 weeks ago


North East, United Kingdom External Full time

Toshiba Global Commerce Solutions is seeking a Director of Production Engineering (Reliability Platform Engineering) to lead the reliability backbone of our global POS cloud and middleware platform. This strategic role owns system availability, resilience, performance, observability and release reliability across a distributed, mission‑critical commerce ecosystem. This leader will unify Site Reliability Engineering (SRE), Resilience & Performance Engineering, Observability and AI‑driven Reliability Automation into one cohesive function. As AI accelerates development velocity, verification and reliability become the core bottleneck, making this role a cornerstone of our engineering organization. You will partner closely with Architecture, Cloud Operations, Functional Quality Engineering and Software Development to ensure predictable reliability, smooth releases and dramatically fewer Sev‑1 / Sev‑2 incidents.

Responsibilities

• System Reliability & Uptime: Define and enforce SLO / SLA frameworks, error budgets and release criteria. Lead availability, resilience and performance strategy across all services. Own MTTR, MTBF, incident prevention and rollback strategies at scale.
• Unified Reliability Engineering Organization: Lead teams across SRE & L3 Engineering, Resilience & Performance Engineering, Observability & Telemetry, and AI Reliability Automation. Build a culture focused on prevention over firefighting.
• Architecture‑Level Reliability: Collaborate with Principal Engineers and Architects to define system guardrails, resilience patterns and failure modes. Ensure high‑quality Production Readiness Reviews (PRRs) and architectural consistency.
• Resilience & Performance Engineering: Own chaos, failover, load, stress and soak testing strategies. Validate store‑mode behavior, payment workflows, edge‑device dependencies and multi‑service interactions.
• Observability & Telemetry: Ensure complete, accurate signals across logs, traces, metrics and business health. Partner with AI systems to build intelligent anomaly detection pipelines.
• AI‑Driven Release Reliability: Integrate AI‑based reliability scoring, resiliency prediction, automated gating, regression analysis and incident pattern detection. Define the path toward autonomous release‑reliability pipelines.
• Cross‑Org Leadership: Partner with Software Development, Functional Quality Engineering, Cloud Operations, Architecture and TPM / TPO teams. Drive multi‑team initiatives and ensure readiness across complex release trains.

Required Experience

• Bachelor's Degree in Computer Science or Engineering, or 10–15 years of direct experience.
• 10–15 years in SRE, Reliability Engineering, Production Engineering, Distributed Systems and Performance / Resilience Engineering.
• Proven ownership of uptime and system reliability in complex distributed architectures.
• Expertise in distributed systems, cloud platforms (AKS, Kubernetes), observability stacks (OpenTelemetry, Grafana, App Insights, Datadog), performance tuning, fault tolerance, network fundamentals, DB/service scaling and chaos testing.
• Architectural Leadership: Experience designing resilience patterns (timeouts, retries, hedging, circuit breakers) and strong partnership with architects and senior engineers.
• Operational Maturity: Led SRE/on‑call organizations. Defined SLOs, SLIs and error budgets at scale. Track record of driving an incident‑prevention culture.
• Leadership & Communication: Builds strong engineering teams and hires top talent. Influential communicator with executives and cross‑functional teams. Highly collaborative and low‑ego.

Preferred Requirements

• AI‑driven anomaly detection, regression analysis, incident clustering and reliability scoring.
• Experience with retail POS payments, edge devices or store environments.
• Hybrid cloud edge architectures.
• Leading reliability transformations and scaling engineering organizations (200–500).

Why This Role Matters

• Uptime becomes engineered, not reactive.
• Development and QA operate at AI‑enabled speed.
• Our platform grows safely while delivering stability and performance.
• We match or surpass best‑in‑class tech organizations (Google, Amazon, Azure, Stripe).

Benefits

• Group health coverage (medical, dental & vision)
• Employee Assistance Programs
• Pre‑tax spending accounts
• 401(k) plan with company match
• Company‑provided life insurance
• Pet insurance
• Employee discounts
• Generous paid holiday schedule, paid vacation & sick / personal days

EEO

Toshiba Global Commerce Solutions is an equal opportunity/affirmative action employer that evaluates qualified applicants without regard to age, ancestry, color, religious creed, disability, marital status, medical condition, genetic information, military or veteran status, national origin, race, sex, gender, gender identity, gender expression, sexual orientation or any other protected factor. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. Individuals who need a reasonable accommodation because of a disability for any part of the employment process should email to request an accommodation.

DIVERSITY EQUITY & INCLUSION

We at Toshiba Global Commerce Solutions firmly believe that our people are integral to the success of our customers. We are committed to Diversity Equity & Inclusion for all our people, as highlighted by our 5 Core Principles (Create Outreach, Foster Belonging, Unleash Opportunity, Diverse Cultural Engagement, and Culture of Transparency). We’re passionate about our customers, the retail industry and becoming a more responsible company as we help create a brighter future.

Key Skills

Go, Lean, Management Experience, React, Node.js, Operations Management, Project Management, Research & Development, Software Development, Team Management, GraphQL, Leadership Experience

Employment Type : Full Time
Experience : years
Vacancy : 1


  • Production Engineer

    1 week ago


    East Sussex, United Kingdom Premier Engineering Full time

    **JOB- Production Engineer** **LOCATION- East Sussex** **TERM- Permanent** **SALARY- £35,000 - £40,000 per annum (dependent on experience)** We are looking for a Production Engineer on a permanent basis in the East Sussex area with experience in the electronics industry. As part of a busy and varied manufacturing team, they are looking for someone to take on...


  • North East, United Kingdom External Full time

    A leading global commerce solutions provider is seeking a Director of Production Engineering to oversee reliability for a cloud platform. This strategic role requires 10–15 years of experience in Reliability Engineering, focusing on uptime and incident prevention. Ideal candidates should possess strong skills in SRE, performance engineering, and cloud...


  • North East, United Kingdom IC Resources Full time

    Product Assurance Engineer Join to apply for the Product Assurance Engineer role at IC Resources. Be among the first 25 applicants. We are seeking a Product Assurance Engineer to join our client’s growing team and play a vital role in delivering safe, robust, and mission‑critical aerospace systems. About The Role An engineer responsible for ensuring that...


  • North East, United Kingdom Port of Tyne Authority Full time

    Remunerated at £14,857 per annum for circa 1 day per month, and based at Maritime House, Tyne Dock, South Shields, NE34 9PT. This is a public appointment made by the Secretary of State for Transport with a three‑year tenure and the potential for renewal. Welcome to the Port of Tyne, and thank you for your interest in joining our Board as a Non‑executive...


  • North East, United Kingdom WSP in the UK & Ireland Full time

    Overview: Associate and Associate Director - Fire Engineering at WSP. You will work on diverse projects, from universities to high-rise residential towers, aiming to enable innovative architectural design without compromising fire safety. Responsibilities: Review design information and liaise with other WSP disciplines (Facades, Acoustics, Structures). Present...

  • QHSE Director

    5 days ago


    North East, United Kingdom Jackson Hogg Ltd Full time

    Jackson Hogg are proudly supporting a specialist manufacturing business in the Durham area on a QHSE Director position, reporting to the Chief Operating Officer. Our client is seeking an experienced QHSE Director to lead the QHSE and Security functions on site. A senior leadership position that requires strong leadership skills, the ability to engage with...

  • Director of Quality

    4 days ago


    North East, United Kingdom Jackson Hogg Full time

    This range is provided by Jackson Hogg. Your actual pay will be based on your skills and experience — talk with your recruiter to learn more. Senior Consultant - QHSE - Operations at Jackson Hogg Limited Jackson Hogg are proudly supporting a specialist manufacturing business in the Durham area on a QHSE Director position, reporting to the Chief Operating...


  • East Hertfordshire, Hertfordshire, United Kingdom SES Engineering Services Full time

    SES Engineering Services are looking for an Operations Director to join our team to oversee multiple complex Mechanical and Electrical projects across the Northern Home Counties, Northamptonshire, Cambridgeshire and East Anglia region. This is a great opportunity to progress your career as part of a talented, diverse, and supportive team. About The Role Our...


  • East Kilbride, United Kingdom Technical Futures. Full time

    Job Description: Rewarding opportunity for a hands-on Semiconductor industry Director of Software Engineering with the technical expertise to deliver complex, production-grade systems. Hybrid working available. You'll be a key part of cutting-edge technology development which is revolutionizing wired connectivity by enabling the building of scalable, energy...


  • East Kilbride, United Kingdom Technical Futures. Full time

    Job Description: Rewarding opportunity for a hands-on Director of Software Engineering with the technical expertise to deliver complex, production-grade systems within the Semiconductor domain. Hybrid working available. You'll be a key part of cutting-edge technology development which is revolutionizing wired connectivity by enabling the building of scalable,...