
Technical Duty Officer, Network Operations

2 months ago


London, Greater London, United Kingdom Box Full time


WHAT IS BOX?

Box is the market leader for Cloud Content Management. Our mission is to power how the world works together. Box is partnering with enterprise organizations to accelerate their digital transformation by creating a single platform for secure content management, collaboration and workflow. We have an amazing opportunity to further establish ourselves as leaders in the space, and we need strong advocates to help us achieve that goal.

By joining Box, you will have the unique opportunity to help capture a majority of this developing market and define what content management looks like for the digital enterprise. Today, Box powers over 97,000 businesses, including 70% of the Fortune 500 who trust Box to manage their content in the cloud.

WHY BOX NEEDS YOU

Box is looking for a dynamic Global Site Reliability Technical Duty Officer (TDO) to help lead our Global Technical Operations and oversee the continuous health, availability, and reliability of our industry-leading platforms and SaaS offerings. It is the responsibility of the TDO team to lead 24x7 GTOC teams in preventing, monitoring, identifying, troubleshooting, mitigating, and resolving issues that affect the availability and quality of Box's platforms and services.

This role is an integral shift-based leader and the single point of technical escalation within the GTOC organization, assuming accountability for overall production site health and the performance of core customer-facing journeys. The role helps maintain total site awareness by detecting metric and service deviations, providing the final level of change approval, and proactively identifying potential issues and resolving them before they escalate to customer-impacting incidents.

We are building a world-class Operations Center and need the best talent possible to get us there. That's where you come in.

WHAT YOU'LL DO

  • Own and direct live-site Major Incident Management, from detection through identification, escalation, mitigation, and recovery.
  • Triage, refine, and verify the Problem Statement, notify and coordinate the efforts of all appropriate SME resources, and lead cross-functional Incident Bridges to quickly identify and mitigate the problem and restore service. You'll be evaluated on how well you are able to reduce MTTD to MTTR.
  • Ensure accurate, valid and timely communication to key stakeholders and business entities.
  • Lead daily Incident and Change ticket reviews, coordinate and monitor change windows, and coordinate with Problem Management on TopOps Issues and action items.
  • Operate across organizational boundaries (Business, Dev, Ops, CS) to protect our customers, their data, and the availability of all Box services, from internal and external security threats, unanticipated volume surges, and significant performance issues.
  • Troubleshoot and identify critical problems in a SOA/API-based, global hybrid cloud, distributed edge architecture across multiple enterprise and public cloud regions.
  • Provide day-to-day technical expertise and experience to the organization to address issues in globally diverse, high-velocity 24x7 environments, from policy and procedural decisions to key architectural and tooling insights that improve Box's Incident, Change, and Problem Management engineering capabilities.
  • Lead daily reviews of planned changes (CAB) in Jira; accountable for reviewing and minimizing change risk, ensuring adequate and appropriate change timing and duration, and complete rollout, validation, and rollback plans that are optimized to prevent site or service impact.
  • Ensure all customer-impacting Incident tickets are completely and correctly documented and augmented with appropriate metrics, timelines, actions taken, and actions still pending.
  • Contribute to and review Incident postmortems to ensure adequate documentation and appropriate prioritization of action items related to reducing MTTI, MTTM and MTTR.
  • Participate in Problem Management scrums and Postmortems to identify leading organizational and company-wide technical issues, threats, and trends that block the ability of the organization or teams to perform their roles and provide services optimally and reliably.
  • Lead projects to improve tools and processes related to overall site and service manageability, observability, and resiliency.
  • Coordinate regularly with Infosec, Customer Success, Platform and Dev leaders to continuously assess new security and customer on-boarding threats and known issues.
  • Continuously mentor and train Global NOC and system engineers.

WHO YOU ARE

  • You have 5+ years of large-scale production/platform operations experience in large SaaS provider environments, preferably as a TDO/Major Incident Manager, SRE team leader, or Infrastructure (IaaS) or Platform (PaaS) Architecture SME in a Managed Service Provider environment.
  • You have experience in bare metal, OpenStack, and K8s architectures supporting a large number of SOA/API-based services.
  • You have exposure to Open Source Service Meshes, Proxies, Caching, Message Buses (Kafka, MQS), NoSQL (HBase, Hadoop), MySQL clusters, and Search environments (Solr, ES).
  • You should be competent in debugging global, distributed Web/API sites based on Linux systems (Ubuntu, RHEL, CentOS), BGP, iBGP, and IP Anycast networking in multi-vendor virtualized, Edge and hybrid public cloud architectures.
  • You are not expected to be an expert in all areas, but you should be familiar with common terminologies, processes, and architectures in Linux Open Source environments, and have a thorough understanding of Virtualization, Containers, and Kubernetes.
  • You are confident and comfortable communicating and interacting with everyone from individual contributors to C-level executives, across multiple countries, ethnicities, and backgrounds.
  • You have a rock solid command presence and are calm and collected in highly stressful situations, such as a major service outage.
  • You're driven to continuously learn new skills and technologies.
  • Bachelor's degree in Computer Science or Information Systems or equivalent technical field, or similar work experience in a large-scale 24/7 production environment supporting critical, real-time applications.
  • Flexibility to work different shifts and provide weekend coverage depending on need.

Required Skills

  • Solid understanding of ITILv4 Service Lifecycle Management, Service Delivery KPIs, SLIs, SLOs, and Incident, Change, and Problem Management framework, terminology, tools (ServiceNow, Remedy, Jira Service Desk), and processes
  • Solid knowledge and understanding of security standards and best practices, such as: OWASP, W3C, ISO 27001, SOC1-2, PCI, and SOX
  • Ability to troubleshoot secured protocols such as: SSH, SSO, TLS, FTPS, WebDAV, HTTPS
  • Solid understanding and debugging skills in TCP/IP, BGP, IP Anycast, and distributed internal and external DNS
  • Two years of working experience with, and knowledge of, multi-regional public cloud providers
  • Experience with observability tools and distributed tracing in large-scale environments (Splunk, Datadog, Wavefront, Catchpoint, ThousandEyes, Sensu, SignalFx RUM, OpenTelemetry, SNMP)
  • Good understanding and experience with configuration management tools and CI/CD pipelines - Puppet, Ansible, Terraform, Artifactory
  • Excellent interpersonal and communication skills

Desired Skills

  • Understanding of Agile methods and tools (Jira).
  • Experience with WAF, Bot Managers, and Content Delivery Networks (Cloudflare, Akamai)
  • Experience working in and transitioning into multi-regional hybrid cloud architectures (GCP preferred, AWS)
  • Understanding of Apache Zookeeper and Hadoop.
  • Experience with large production Scala, Java, Node, PHP environments helpful.
  • Experience working with various message bus technologies (Kafka, RabbitMQ, MQS)
  • Experience working with relational and non-relational databases and search engines (MySQL, Postgres, HBase, Elasticsearch, Solr)
  • Experience with caching apps (Squid, Redis, Memcache)
  • Experience with service mesh technologies in a hybrid-cloud environment (Zookeeper, SmartStack)


BENEFITS

The Box benefits package includes pension, medical and dental coverage. We have a robust wellness program including 25 days of vacation (plus your birthday off) and subsidized gym membership. There is such a thing as a free lunch: our in-house chef prepares it daily, along with lots of snacks and drinks. Our EMEA HQ office is located in the impressive White Collar Factory on Old Street, with European offices in Paris and Munich.

EQUAL OPPORTUNITY

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, disability, or any other protected ground of discrimination under applicable human rights legislation. Box strives to respect the dignity and independence of people with disabilities and is committed to giving them the same opportunity to succeed as all other employees. Accommodations are available throughout the application process and an employee's employment at Box.

For details on how we protect your information when you apply, please see our Personnel Privacy Notice.


