AI Infrastructure Engineer
1 week ago
About Writer
We are a leading provider of transformative AI solutions for enterprises, empowering hundreds of customers like Accenture, Intuit, L'Oreal, and Vanguard to revolutionize their workflows.
Our all-in-one platform makes it easy to deploy customized AI apps and workflows that accelerate growth, increase productivity, and ensure compliance. We provide enterprise-grade accuracy, security, and efficiency through our suite of development tools supported by Palmyra – our state-of-the-art family of LLMs – alongside our industry-leading graph-based RAG and customizable AI guardrails.
About this role
As an AI Infrastructure Engineer, you will be responsible for deploying and managing cutting-edge infrastructure crucial for AI/ML operations. You will collaborate with AI/ML engineers and researchers to develop a robust CI/CD pipeline that supports safe and reproducible experiments. Your expertise will also extend to setting up and maintaining monitoring, logging, and alerting systems to oversee extensive training runs and client-facing APIs.
This role demands a proactive approach to maintaining large Kubernetes clusters, optimizing system performance, and providing operational support for our suite of software solutions. Some key responsibilities include:
- Designing and deploying a CI/CD pipeline that ensures safe and reproducible experiments
- Setting up and managing monitoring, logging, and alerting systems for extensive training runs and client-facing APIs
- Ensuring training environments are consistently available and prepared across multiple clusters
- Improving reliability, quality, and time-to-market of our suite of software solutions
- Measuring and optimizing system performance
- Providing primary operational support and engineering for multiple large-scale distributed software applications
We are looking for someone with professional experience in the following areas:
- Model training
- Hugging Face Transformers
- PyTorch
- vLLM
- TensorRT
- Infrastructure as code tools like Terraform
- Scripting languages such as Python or Bash
- Cloud platforms such as Google Cloud, AWS or Azure
- Git and GitHub workflows
- Tracing and monitoring
Familiarity with high-performance, large-scale ML systems is also essential. You should have a knack for troubleshooting complex systems and enjoy solving challenging problems. Proactive identification of problems, performance bottlenecks, and areas for improvement is also required.
We offer a competitive salary of $150,000 - $180,000 per year, depending on experience, plus benefits including generous PTO, medical, dental, and vision coverage, paid parental leave, fertility and family planning support, and annual work-life stipends for home office setup, cell phone, internet, wellness, and learning and development.
-
AI Infrastructure Engineer
3 weeks ago
London, Greater London, United Kingdom Xcede Full time
Xcede is seeking an experienced AI Infrastructure Engineer to join our growing GenAI team. This role requires a strong background in Python and proficiency in AWS, with a bonus for experience with Kafka, Databricks, and RAG. Your primary responsibility will be to develop effective prompts for AI models while fine-tuning them, collaborating with Data Science...
-
Cybersecurity Research Engineer
2 weeks ago
London, Greater London, United Kingdom AI Safety Institute Full time
We are seeking an exceptional Cybersecurity Research Engineer to join our team at the AI Safety Institute. Our goal is to develop first-of-its-kind government-run infrastructure to benchmark the progress of advanced AI capabilities in cyber security. The selected candidate will work closely with a cross-functional team of cybersecurity researchers, machine...
-
London, Greater London, United Kingdom AI Safety Institute Full time
As advanced AI systems continue to evolve, the potential risks associated with their cyber capabilities pose a significant threat to organizational and individual security. These risks are particularly concerning when combined with other AI risk areas, such as harmful outcomes from biological and chemical capabilities, and autonomous systems. The AI Safety...
-
AI Infrastructure Specialist
2 weeks ago
London, Greater London, United Kingdom Microsoft Full time
We are looking for a highly skilled and motivated AI Infrastructure Specialist to join our team at Microsoft AI. About Us: At Microsoft AI, we are on a mission to create the leading pretraining platform to develop the world's most capable AI frontier models. This platform will span one of the world's foremost GPU clusters, pushing the boundaries of scale,...
-
AI Alignment Research Engineer
2 weeks ago
London, Greater London, United Kingdom Atla Ai Full time
Atla Ai: Safeguarding the Future of Humanity. About Us: We're Atla Ai, a pioneering London-based start-up dedicated to engineering safe and beneficial AI systems. Our mission is to drive positive change in the world by developing cutting-edge AI evaluation models. Role Overview: As our alignment research engineer, you'll play a pivotal role in shaping the future...
-
AI Data Insights Engineer
2 weeks ago
London, Greater London, United Kingdom Engine AI Full time
We are expanding the AI capabilities of our company, Engine AI, and seeking a seasoned AI Data Insights Engineer to spearhead the development of Data Agents. This critical role involves creating tools that translate natural language questions into actionable insights, including SQL query generation, entity matching, and data visualizations. This is an...
-
AI Alignment Research Engineer
1 month ago
London, Greater London, United Kingdom Atla Ai Full time
About Atla: We are a London-based start-up building the most capable AI evaluation models. Our mission is to engineer safe, beneficial AI systems that will have a massive positive impact on the future of humanity. Role: As Atla's alignment research engineer, you'll develop language models as evaluators and use your insights to construct safety guardrails for...
-
AI Infrastructure Engineer
4 days ago
London, Greater London, United Kingdom Encord Full time
We are on the cusp of a revolution in AI infrastructure, and we need your expertise to take us to the next level. As a seasoned software engineer, you will play a crucial role in building and extending our cutting-edge platform. With $30M in Series B funding, we're a talented team of 60, working at the forefront of computer vision and deep learning. As a key...
-
AI Expert
5 days ago
London, Greater London, United Kingdom Engine AI Full time
Senior AI Engineer: We're expanding our AI capabilities at Engine AI and seeking a seasoned Senior AI Engineer to spearhead the development of Data Agents. This role involves crafting tools that translate natural language queries into actionable insights, including SQL query generation, entity matching, and data visualizations. As a key member of our team,...
-
AI Solutions Architect
1 week ago
London, Greater London, United Kingdom C3 AI Full time
About the Role: We are seeking an experienced AI Solutions Architect to join our team at C3.ai. The ideal candidate will have a strong background in developing and deploying enterprise-scale AI applications. Job Description: The successful candidate will work with large companies to build the next generation of AI-powered enterprise applications on the C3 AI...
-
Innovative AI Engineer Position
1 week ago
London, Greater London, United Kingdom Atla Ai Full time
Atla Ai is committed to creating safe, beneficial AI systems that will have a significant positive impact on humanity's future. We are a London-based start-up developing the most capable AI evaluation models. Role and Responsibilities: As Atla Ai's Research Engineer, you'll develop and fine-tune language models as evaluators and use your insights to construct...
-
Senior Software Engineer
4 weeks ago
London, Greater London, United Kingdom Signal AI Full time
About the Reputation Team: The Reputation Team at Signal AI is dedicated to delivering exceptional customer experiences in the Reputation space. Our mission is to provide innovative tools and solutions that help PR executives and Chief Communications Officers navigate the vast volume of world media data. As a key member of our team, you will be responsible for...
-
Senior Infrastructure Engineer
1 month ago
London, Greater London, United Kingdom Aitopics Full time
Job Title: Senior Infrastructure Engineer - AI Development and Training. Huawei R&D UK is seeking a highly skilled Senior IT Engineer to manage a large-scale AI development and training infrastructure. The role involves overseeing GPU servers, Kubernetes clusters (Rancher), and storage systems to ensure seamless operations and optimized performance. You will...
-
AI Safety Engineer
1 week ago
London, Greater London, United Kingdom AI Safety Institute Full time
The Post-Training Team at the AI Safety Institute is dedicated to optimizing AI systems for state-of-the-art performance in various risk domains. This involves a combination of scaffolding, prompting, supervised and RL fine-tuning of AI models.
Key Responsibilities:
- Improve model performance using cutting-edge machine learning techniques
- Develop methodologies...
-
AI Engineering Specialist
4 days ago
London, Greater London, United Kingdom Higher - AI recruitment Full time
About the Job Description: This Data Engineer position is an exciting opportunity to join our team at Higher - AI recruitment and contribute to the development of sophisticated data-driven products that support our clients' journey towards Net Zero. As a mid-senior level Data Engineer, you will have the opportunity to work with cutting-edge technologies and...
-
AI Engineering Team Lead
1 week ago
London, Greater London, United Kingdom Symphony Industrial AI, Inc. Full time
Job Summary: A highly skilled AI Engineer is needed to lead the development of cutting-edge AI solutions for our London-based Trading & Investing team. This role involves collaborating with Data Engineers and Full Stack developers to build innovative AI systems, leveraging expertise in multi-agent generative AI frameworks, machine learning, AIOps, and...
-
AI Infrastructure Specialist
1 week ago
London, Greater London, United Kingdom Tag Full time
AI Infrastructure Specialist: We are seeking a highly skilled AI Infrastructure Specialist to join our team in London. In this role, you will be responsible for designing and implementing AI infrastructure that meets the needs of our data science teams. About the Role: You will work closely with our data science teams to ensure seamless integration of machine...
-
AI Infrastructure Engineer
1 week ago
London, Greater London, United Kingdom Encord Full time
About Encord: We are a cutting-edge AI infrastructure company that is revolutionizing the field of computer vision and deep learning. Our team of talented engineers is pushing the boundaries of what is possible with AI, and we are looking for an experienced engineer to join us. Job Description: We are seeking an outstanding AI infrastructure engineer to help us...
-
Visionary AI Architect
5 days ago
London, Greater London, United Kingdom Artifact AI Full time
Redefine accounting with intelligent AI agents, automating complex financial processes for businesses and accounting firms. Lead ML research and develop enterprise-grade AI agents to automate workflows. Drive technical vision and create robust AI agents handling bookkeeping, tax compliance, and more. Design scalable AI agents tackling real-world challenges in...
-
Cloud Systems Engineer
1 week ago
London, Greater London, United Kingdom Predict X Full time
PredictX is a leading SaaS scale-up revolutionising critical decision-making for global businesses. Job Description: We are seeking an experienced Cloud Systems Engineer to lead the migration of our on-prem systems to Google Cloud Platform (GCP). As a key player in our IT infrastructure team, you will be responsible for setting up and maintaining computer...