Data Engineer

  • Location:

    England

  • Sector:

    Life Sciences

  • Job type:

    Permanent

  • Salary:

    Negotiable

  • Contact:

    Janne Bate

  • Contact email:

    Janne.Bate@volt.eu.com

  • Job ref:

    76915-PHARM-JNB_1596541227

  • Published:

    4 months ago

  • Expiry date:

    2020-09-03

  • Start date:

    22/6/20

Data Engineer required to join a new Data team in a leading biotech start-up.

Role:

You will design and build future-proof databases, large-scale processing systems and APIs in collaboration with Bioinformatics, Machine Learning and modelling experts, by developing, constructing, testing and maintaining data acquisition and dissemination methods. This includes deciding on the best methods to acquire, curate, store and retrieve many primary and secondary data types, along with metadata pertaining to various data domains.

Analysing characteristics of data sets (-omics, imaging, structural) required by Bioinformatics, Machine Learning and Science team members, and using that understanding to discover and develop methods to make them available.

Developing and implementing optimal methods for regular extraction, curation, transformation, storage, retrieval and delivery of large and complex scientific datasets for Research and Product Development.

Recommending and implementing ways to improve data reliability, efficiency and quality through systems integration methods and automation of acquisition and quality control/assurance processes.

Actively identifying patterns and anomalies in datasets using data surveillance tools as part of data performance reviews, and identifying methods to improve existing processing pipelines.

Requirements:

Bachelor's or Master's degree in computing science or equivalent experience
Prior experience of working as a software or data engineer
Experience writing Python/R scripts
Ability and eagerness to rapidly learn new languages/frameworks as required
Experience working in a Linux command-line environment
Experience with Git version control
Experience working with containerisation (e.g. Docker)
Ideally:
Experience with Continuous Integration and R packages/Python modules
Knowledge of Data Management best practices
Demonstrable experience with processing and visualisation of biological datasets
Experience working with the Atlassian toolchain (Jira, Bitbucket, Confluence)
Experience working in an Agile/Scrum environment