Data Engineer

3 days ago


United Kingdom Oh Full time

About Us

 

We're building the future of uncensored AI infrastructure & products. Our technology powers hyper-immersive experiences and enables the ownership of personalized, interoperable AI characters, unlocking vast monetization opportunities across our ecosystem and beyond.

 

We are initially focused on the Creator and Social-Fi landscapes, building interoperable 'superModel' characters powered by our advanced proprietary multi-modal, uncensored AI models. These superModels can be first experienced on our platform, OhChat, with additional platform integrations in the works.

OhChat has gained 70,000 users across 174 countries in a matter of weeks. The site lets users enjoy hyper-immersive experiences with digital AI characters, including real-time interactions and uncensored exchanges with original characters as well as ‘digital twins’ that are based on celebrities and real-world creators and launched in partnership with them.

Job Overview

As a Data Engineer at Oh, you will play a crucial role in building and optimizing our data pipeline and infrastructure. You’ll be responsible for data collection, particularly large-scale image scraping, and managing structured and unstructured datasets for training generative AI models. You will work closely with machine learning engineers and developers to ensure data quality, availability, and scalability.

Key Responsibilities

  • Data Pipeline Development: Design, build, and maintain data pipelines to support the collection, ingestion, and processing of large-scale image, video, and audio datasets.
  • Data Scraping and Collection: Develop and optimize web scraping scripts to collect high-quality multimedia datasets.
  • Data Storage and Management: Implement efficient storage solutions for large volumes of structured and unstructured data, ensuring data accessibility and scalability.
  • ETL Processes: Develop and manage ETL processes to transform raw data into formats suitable for model training.
  • Data Quality Assurance: Ensure data quality and consistency across different sources. Implement monitoring tools and workflows to maintain data accuracy and relevance.
  • Documentation: Maintain clear documentation of data sources, scraping processes, and pipeline workflows for team reference and reproducibility.

Required Skills & Qualifications

  • Programming Languages: Proficiency in either Python or JavaScript for data scraping, ETL, and pipeline development.
  • Web Scraping: Experience with web scraping tools and libraries (e.g., BeautifulSoup, Scrapy).
  • Data Storage and Processing: Experience with databases (SQL and NoSQL, such as PostgreSQL, MongoDB) and cloud storage (e.g., AWS S3, Redshift).
  • Data Pipeline and Workflow Orchestration: Familiarity with data pipeline tools such as Apache Airflow, Prefect, or Luigi.
  • Data Transformation: Strong knowledge of data transformation and processing techniques (e.g., Pandas, Dask for Python).
  • Data Quality Control: Experience with data quality monitoring tools (e.g., dbt, Great Expectations).
  • Version Control: Proficient in using Git for version control, as well as data versioning tools (e.g., DVC).
  • Pipeline Monitoring: Strong experience implementing and owning pipeline monitoring stacks (e.g., Sentry, Grafana, AWS CloudWatch).
  • Testing and Code Quality: Extensive experience with common frameworks for unit, behavioural, integration, and end-to-end testing (e.g., Pytest, Behave, Postman) and general code quality tools and principles (e.g., Ruff, MyPy, Bandit, Black).

Preferred Qualifications

  • Experience in Generative AI Data Collection: Understanding of the types of data needed for training generative AI models (e.g., GANs, LLMs, diffusion models).
  • Knowledge of ML/DL Basics: Familiarity with machine learning concepts, particularly around data needs for training and evaluation in the context of generative models.
  • Familiarity with Blockchain: Though not mandatory, a keen interest in the blockchain ecosystem and data sources is an advantage.
  • Data Governance: Understanding of legal and ethical implications of data collection, including copyright and privacy concerns.
  • Experience with Image and Video Processing: Familiarity with libraries for image processing (e.g., OpenCV, PIL) and video data handling is a plus.
  • Big Data Experience: Familiarity with big data tools and frameworks (e.g., Spark, Hadoop) is a plus.
  • DevOps: Some experience with common DevOps tools (e.g., CI/CD pipelines, Terraform/CDK, Docker) and best practices is a bonus.

As part of our team, you’ll enjoy:

  • The hustle of a startup with the impact of a global business
  • Tremendous opportunity to join a business pioneering the future of AI
  • Working with an extraordinary team of smart, creative, fun and highly motivated people
  • Flexible working hours, including remote working
  • Modern, uplifting work environment
  • Pension scheme
  • Generous starting salary

 


  • Senior Data Scientist

    1 month ago


    United Kingdom Data Science Talent Full time

    Data Scientist / Senior Data Scientist - F1 Motorsport Location: The time it takes to service an entire F1 racing car at the pit stop. The data you and your team help to deliver could make or break lap time for one of the world's leading Formula 1 teams. A lap time that over 500 million people will see at the most-watched annual sporting series in the...

  • Senior Data Scientist

    3 weeks ago


    United Kingdom Data Science Talent Full time

    Data Scientist / Senior Data Scientist - F1 Motorsport Location: The time it takes to service an entire F1 racing car at the pit stop. The data you and your team help to deliver could make or break lap time for one of the world's leading Formula 1 teams. A lap time that over 500 million people will see at the most-watched annual sporting series in the...

  • Data Scientist

    1 month ago


    United Kingdom Data Science Talent Full time

    Data Scientist / Senior Data Scientist - F1 Motorsport Location: South East England -----2.62 seconds. The time it takes to service an entire F1 racing car at the pit stop. 2 weeks. How soon you could see the impact of your work translating into visible results for the team on the track. The data you and your team help to deliver could make or break lap...

  • Data Scientist

    3 days ago


    United Kingdom Data Science Talent Full time

    Data Scientist / Senior Data Scientist - F1 Motorsport Location: South East England ----- 2.62 seconds. The time it takes to service an entire F1 racing car at the pit stop. 2 weeks. How soon you could see the impact of your work translating into visible results for the team on the track. The data you and your team help to deliver could make or...


  • United Kingdom Equans Data Centres Full time

    Equans Data Centres are seeking a highly skilled and experienced Authorising Engineer (AE) in Electrical Engineering to oversee and ensure compliance with industry regulations and standards. The role involves providing expert guidance on safe electrical systems, reviewing designs, conducting audits, and authorising personnel to perform electrical work. Act...

  • Authorising Engineer

    3 weeks ago


    United Kingdom Equans Data Centres Full time

    Equans Data Centres are seeking a highly skilled and experienced Authorising Engineer (AE) in Electrical Engineering to oversee and ensure compliance with industry regulations and standards. The role involves providing expert guidance on safe electrical systems, reviewing designs, conducting audits, and authorising personnel to perform electrical work. What...

  • Data Engineer

    1 month ago


    United Kingdom nineDots Full time

    Unlock Your Data Engineering Potential. Are you ready to take your data engineering skills to the next level in a dynamic and exciting role? nineDots is partnering with Cloudsmith to find a talented Data Engineer to help them build a world-class data platform on AWS. As a Data Engineer at Cloudsmith, you'll be a key player in designing and developing scalable,...

  • Data Engineer

    1 week ago


    United Kingdom Adecco Full time £40

    Role Title: Data Engineer Location: UK Remote Duration: 12 Months Working Hours: Normal business hours Minimum Hourly Rate: £40.38 The Role: The main function of the Data Engineer is to develop, evaluate, test and maintain architectures and data solutions within our organization. The typical Data Engineer executes plans, policies, and practices...

  • Data Engineer

    1 week ago


    United Kingdom Adecco Full time

    Role Title: Data Engineer Location: UK Remote Duration: 12 Months Working Hours: Normal business hours Minimum Hourly Rate: £40.38 The Role: The main function of the Data Engineer is to develop, evaluate, test and maintain architectures and data solutions within our organization. The typical Data Engineer executes plans, policies, and practices that control,...

  • Data Engineer

    3 days ago


    United Kingdom Adecco Full time

    Role Title: Data Engineer Location: UK Remote Duration: 12 Months Working Hours: Normal business hours Minimum Hourly Rate: £40.38 The Role: The main function of the Data Engineer is to develop, evaluate, test and maintain architectures and data solutions within our organization. The typical Data Engineer executes plans, policies, and practices that...

  • Data Engineer

    1 week ago


    United Kingdom ShortList Recruitment Limited Full time £40,000

    Junior Data Engineer Chester £40,000 ShortList has an exciting opportunity for a Junior Data Engineer to join a thriving and growing business based in Chester. This is a dynamic and growing technology led business with an innovative approach to business. You will be joining an organisation that values creativity, collaboration, and continuous learning. ...

  • Data Engineer

    1 week ago


    United Kingdom Synchro Full time €500 - €600

    Data Engineer – 3 Month Contract (Extension Available) – OUTSIDE IR35 – Fully Remote, UK We’re seeking an experienced Data Engineer to join our client for a 3-month contract with the possibility of extension. This role is essential for advancing data-driven initiatives by transforming unstructured data, such as PDFs and emails, into structured...
