Senior AI Infrastructure Engineer

4 weeks ago


London, Greater London, United Kingdom Chan Zuckerberg Initiative Full time
About the Role

We are seeking a highly skilled Senior AI Infrastructure Engineer to join our team at the Chan Zuckerberg Initiative. As a key member of our AI/ML and Data Infrastructure organization, you will play a critical role in building and scaling our shared tools and platforms to support our initiatives.

As a Senior AI Infrastructure Engineer, you will be responsible for designing, building, and deploying efficient, stable, and secure AI/ML and Data infrastructure engineering solutions. You will work closely with our research scientists, data scientists, and engineers to develop and implement complex systems integrating with our large-scale AI/ML GPU compute infrastructure and platform.

Key Responsibilities:

  • Design and implement efficient, stable, and secure AI/ML and Data infrastructure engineering solutions.
  • Collaborate with our research scientists, data scientists, and engineers to develop and implement complex systems integrating with our large-scale AI/ML GPU compute infrastructure and platform.
  • Build containerized applications and infrastructure on Kubernetes in support of our large-scale GPU research cluster.
  • Collaborate with our partners on data management solutions for our heterogeneous collection of complex datasets.
  • Help build tooling that makes optimal use of our shared infrastructure, including a world-class GPU compute cluster and other compute environments, to empower our AI/ML efforts.

Requirements:

  • BS or MS degree in Computer Science or a related technical discipline or equivalent experience.
  • 5+ years of relevant coding experience.
  • 3+ years of systems architecture and design experience, with a broad range of experience across Data, AI/ML, Core Infrastructure, and Security Engineering.
  • Experience scaling containerized applications on Kubernetes or Mesos, including creating custom containers from secure AMIs and building continuous deployment systems that integrate with Kubernetes or Mesos.
  • Proficiency with Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure, and experience with On-Prem and Colocation Service hosting environments.
  • Proven coding ability with a systems language such as Rust, C/C++, C#, Go, Java, or Scala.
  • Demonstrated ability with a scripting language such as Python, PHP, or Ruby.
  • AI/ML Platform Operations experience in an environment with demanding data and systems platform requirements, including large-scale Kafka and Spark deployments.
  • MLOps experience working with medium to large-scale GPU clusters in Kubernetes (Kubeflow), HPC environments, or large-scale Cloud-based ML deployments.
  • Working knowledge of Nvidia CUDA and AI/ML custom libraries.
  • Knowledge of Linux systems optimization and administration.
  • Understanding of Data Engineering, Data Governance, Data Infrastructure, and AI/ML execution platforms.

About Us

The Chan Zuckerberg Initiative is a philanthropic organization dedicated to advancing human potential and promoting equal opportunity. We believe that the strongest teams and best thinking are defined by the diversity of voices at the table. We are committed to fair treatment and equal access to opportunity for all CZI team members and to maintaining a workplace where everyone feels welcomed, respected, supported, and valued.

We offer a wide range of benefits to support the people who make all we do possible, including a generous employer match on employee 401(k) contributions, annual benefit for employees, CZI Life of Service Gifts, paid time off to volunteer, funding for select family-forming benefits, and relocation support for employees who need assistance moving to the Bay Area.

Join us in our mission to build a more inclusive, just, and healthy future for everyone.



  • London, Greater London, United Kingdom microTECH Global LTD Full time

    Job Title: Senior AI Infrastructure Engineer. Job Type: Fixed Term Contract. We are seeking a highly skilled Senior AI Infrastructure Engineer to join our team at microTECH Global LTD. As a key member of our AI Infrastructure Team, you will be responsible for managing large-scale AI development and training infrastructure. Key Responsibilities: * Oversee GPU...


  • London, Greater London, United Kingdom Chan Zuckerberg Initiative Full time

    The Chan Zuckerberg Initiative is a leading organization in the field of AI/ML and Data Infrastructure. We are seeking a highly skilled Senior AI Infrastructure Engineer to join our team. About the Role: The Senior AI Infrastructure Engineer will be responsible for designing and building efficient, stable, performant, scalable, and secure AI/ML and Data...


  • London, Greater London, United Kingdom Chan Zuckerberg Initiative Full time

    The Chan Zuckerberg Initiative is a pioneering organization that leverages technology to drive meaningful change. As a Senior AI Infrastructure Engineer, you will play a critical role in building and maintaining the technical foundation that enables our mission. The Opportunity: We are seeking a highly skilled engineer to join our AI/ML and Data Infrastructure...


  • London, Greater London, United Kingdom Chan Zuckerberg Initiative Full time

    About the Role: We are seeking a highly skilled Senior AI Infrastructure Engineer to join our team at the Chan Zuckerberg Initiative. As a key member of our AI/ML and Data Infrastructure organization, you will play a critical role in building shared tools and platforms to support our initiatives across the organization. Our team is responsible for designing,...


  • London, Greater London, United Kingdom Chan Zuckerberg Initiative Full time

    The Chan Zuckerberg Initiative is a leader in harnessing the power of technology to drive social impact. We are seeking a highly skilled Senior AI Infrastructure Engineer to join our team and help us build a more inclusive, just, and healthy future for everyone. About the Role: We are looking for a talented engineer to design, build, and scale software systems...


  • London, Greater London, United Kingdom Chan Zuckerberg Initiative Full time

    About the Role: The Chan Zuckerberg Initiative is seeking a highly skilled Senior AI Infrastructure Engineer to join our AI/ML and Data Infrastructure team. As a key member of our team, you will be responsible for designing, building, and scaling software systems to support our mission to build a more inclusive, just, and healthy future for everyone. We are...


  • London, Greater London, United Kingdom ZipRecruiter Full time

    Job Title: Senior MLOps Engineer. La Fosse is currently working with a cutting-edge AI start-up that utilises advanced robots to maximise human capacity and effectiveness. In this role, you will oversee the end-to-end lifecycle of AI/ML models, from development to deployment. You will ensure the reliability, scalability, and security of AI/ML infrastructure,...


  • London, Greater London, United Kingdom Gradient Labs AI Full time

    We are Gradient Labs, a pioneering AI company based in the UK. Our mission is to redefine customer support for the next decade by building a suite of LLM-based autonomous agents that can safely automate complex queries. We are looking for a skilled Backend Engineer to contribute to our "operating system" for future AI agents, ensuring it is safe, scalable, and...


  • London, Greater London, United Kingdom Xcede Full time

    Xcede is seeking an experienced AI Infrastructure Engineer to join our growing GenAI team. This role requires a strong background in Python and proficiency in AWS, with a bonus for experience with Kafka, Databricks, and RAG. Your primary responsibility will be to develop effective prompts for AI models while fine-tuning them, collaborating with Data Science...


  • London, Greater London, United Kingdom Aitopics Full time

    Job Title: Senior Infrastructure Engineer - AI Development and Training. Huawei R&D UK is seeking a highly skilled Senior IT Engineer to manage a large-scale AI development and training infrastructure. The role involves overseeing GPU servers, Kubernetes clusters (Rancher), and storage systems to ensure seamless operations and optimized performance. You will...


  • London, Greater London, United Kingdom Chan Zuckerberg Initiative Full time

    The Chan Zuckerberg Initiative is a pioneering organization that combines technology with grantmaking, impact investing, and collaboration to drive progress toward its mission of building a more inclusive, just, and healthy future for everyone. Our Central Operations & Partners team provides the support needed to push this work forward, and we are seeking a...


  • London, Greater London, United Kingdom Artifact AI Full time

    At Artifact AI, we're pushing the boundaries of accounting automation with intelligent, enterprise-grade AI agents. Our agentic workflows streamline complex, end-to-end accounting processes for businesses and accounting firms, enabling them to scale efficiently and focus on high-value tasks. Artifact AI empowers organizations by delivering automation with...

  • Senior AI Engineer

    2 weeks ago


    London, Greater London, United Kingdom FactSet Full time

    FactSet Senior AI Engineer - Cloud Infrastructure Expert. At FactSet, we are looking for a highly skilled Senior AI Engineer to join our team as a Cloud Infrastructure Expert. This exciting opportunity will involve developing and maintaining machine learning pipelines to support our cutting-edge models, ensuring seamless integration and maintenance of model...


  • London, Greater London, United Kingdom microTECH Global LTD Full time

    Job Title: DevOps Engineer. Job Type: Fixed Term Contract. Our client, microTECH Global LTD, is a global telecommunication company seeking a highly skilled Senior DevOps Engineer to manage their AI Infrastructure Team. We are looking for an experienced professional to oversee the large-scale AI development and training infrastructure, ensuring seamless...


  • London, Greater London, United Kingdom ZipRecruiter Full time

    Job Title: Senior MLOps Engineer (AWS). Job Type: Full-time. Location: Central London. Job Description: We are seeking a highly skilled Senior MLOps Engineer to join our team in Central London. As a key member of our AI/ML infrastructure team, you will be responsible for overseeing the end-to-end lifecycle of AI/ML models, from development to deployment. Key...


  • London, Greater London, United Kingdom La Fosse Full time

    Job Title: Senior MLOps Engineer (AWS). Job Summary: La Fosse is currently working with a cutting-edge AI start-up that utilises advanced robots to maximise human capacity and effectiveness. As a Senior MLOps Engineer (AWS), you will oversee the end-to-end lifecycle of AI/ML models, from development to deployment, ensuring the reliability, scalability, and...