
Lead Data Engineer

At H1 we are creating a healthier future by delivering a platform that connects stakeholders across the healthcare ecosystem for greater collaboration and discovery. We believe providing a trusted and single source of truth for healthcare professional information will power connections in healthcare - and that these connections will lead us to a healthier future. Visit to learn more about us.

Data Engineering is responsible for the development and delivery of our most important asset - our data. Looking across thousands of data sources from across the globe, the data engineering team is responsible for making sense out of that data to create the world’s most extensive and comprehensive knowledge base of healthcare stakeholders and the ecosystem they influence. It is our job to ensure that only accurate, normalized data flows to our customers, and at a velocity that keeps up with the changes in the real world. As we rapidly expand the markets we serve and the breadth and depth of data we want to collect for our customers, the team must grow and scale to meet that demand.


As a Lead Data Engineer, you will be responsible for big data engineering, data wrangling, data analysis and user support primarily focused on the AWS platform. You will have direct founder-level interactions. You’ll not only learn about great technology and a great product, but you’ll also learn from the decision-makers who have successfully built and exited multiple startups. You will work directly with stakeholders across our company to deliver the best scalable, stable, and high-quality healthcare data application in the market.

You will:

  • Analyze business needs, profile large data sets, and build custom data models and applications to drive business decision-making and customer experience
  • Build workflows that empower analysts to efficiently validate large volumes of data
  • Design optimized big data solutions for data ingestion, data processing, data wrangling, and data delivery
  • Design, develop, and tune data products, streaming applications, and integrations on large-scale data platforms (Spark, Kafka/Kinesis streaming, SQL Server, data warehousing, big data, etc.) with an emphasis on performance, reliability, scalability, and, above all, quality
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for efficient extraction, transformation, and loading of data from a wide variety of data sources
  • Build data tools that help analytics and data science team members build and optimize our product into an innovative industry leader
  • Peer-review code developed by team members

Requirements:

  • 6+ years of professional experience with big data systems, pipelines, data processing, and reporting
  • 3+ years of experience working with big data technologies like Spark or Hadoop, preferably on AWS EMR
  • Practical hands-on experience with technologies like Apache Spark, Apache Flink, and Apache Hudi
  • Experience with data processing technologies like Spark Streaming, Kafka Streams, KSQL, Spark SQL, or MapReduce
  • Understanding of file formats used in distributed systems, such as Apache Avro and Apache Parquet, and of common data transformation methods
  • A serious commitment to data quality and security, with the ability to isolate, deconstruct, and resolve complex data engineering challenges
  • Experience with AWS cloud preferred
  • Experience with the ELK stack is a plus
  • Ability to be highly present either in person or virtually, to be reliable, and to act as a steward of H1
  • Be a great human who contributes to an amazing, accepting, and diverse culture
  • Someone who values documenting their work, so that others can pick up where they left off while they move on to the next challenge

Not meeting all the requirements but still feel like you’d be a great fit? Tell us how you can contribute to our team in a cover letter!

Benefits:

  • A competitive compensation package including stock options
  • A full suite of health insurance options, in addition to Unlimited Paid Time Off
  • Flexible work hours & the opportunity to work from anywhere, with optional commuter benefits
  • Investment in your growth, with the skills, knowledge, and mentorship you need to succeed
  • An opportunity to work with leading biotech and life sciences companies, in an innovative industry with a mission to improve healthcare around the globe

H1 Insights is proud to be an equal opportunity employer that celebrates diversity and is committed to creating an inclusive workplace with equal opportunity for all applicants and teammates. Our goal is to recruit the most talented people from a diverse candidate pool regardless of race, color, ancestry, national origin, religion, disability, sex (including pregnancy), age, gender, gender identity, sexual orientation, marital status, veteran status, or any other characteristic protected by law.

H1 is committed to working with and providing access and reasonable accommodation to applicants with mental and/or physical disabilities. If you require an accommodation, please reach out to your recruiter once you've begun the interview process. All requests for accommodations are treated discreetly and confidentially, as practical and permitted by law.





Job type



Data Engineering


Apache Flink, Avro, AWS, Big Data