Location: San Mateo, CA
Position: Data Engineer
We have an opportunity to truly impact the lives of millions of patients with our intelligent care system. To do that, we are building a team that is passionate about providing service to others in the best way we know how: creating life-altering software. With real-world data, real-time symptom management that leverages machine learning, and a tool for clinicians to quickly and intuitively view and restructure patient information, our platform allows for truly individualized care for every patient.
Our intention is to have a team centered around that mission. To get there, we believe that navigating ambiguity with data, being an owner, inspiring authenticity, working as one team with the same dream, and constantly designing will drive us toward success as a team and as a business.
We’re ready to build the most meaningful technology in cancer care. Will you join us?
As one of the first engineers to join the Data Platform Team, you will develop complex distributed streaming systems to ingest and process large volumes of varied data arriving from and delivered to our partner healthcare systems. Additionally, you will develop the Data Insights Platform, which will power ML workloads, in partnership with the Data Science team. Our stack is Python, Spark, Kafka, MySQL, Oracle ADWC, and Cassandra. You will learn about and build healthcare knowledge graphs that will be central to improving outcomes for cancer patients.
What You Will Do:
- Develop distributed streaming data systems, services and frameworks to address high-volume complex data collection, processing, transformation, ingestion and reporting.
- Develop data models, fixture data, and multi-stage distributed processing code for the models.
- Write code and unit tests in Python and conduct code reviews.
- Drive continuous improvements by taking ownership of technical aspects of software development and identifying opportunities to adopt innovative methods and technologies.
- Partner with peers to collaboratively build software solutions that address users' pain points.
What We're Looking For:
- B.S. in Computer Science or 5+ years of experience delivering production-quality software
- Expert in Python, Ruby or Java; expert in SQL
- Expert-level skills in building products using distributed technologies
- Strong and demonstrable experience with more than one of:
  - Relational Stores (e.g., Postgres, MySQL, Oracle)
  - Columnar or NoSQL Stores (Oracle ADWC, Redshift, Cassandra, DynamoDB)
  - Distributed / Async Processing (Apache Spark, Apache Storm, Celery, Sidekiq)
  - Distributed Queues (Apache Kafka, Kinesis, RabbitMQ)
- Experience working with Oracle OCI, AWS or similar cloud platform technologies
- Experience with data science or machine learning, especially supervised ML algorithms, clustering, or natural language processing preferred
- Knowledge of healthcare datasets and data formats preferred
- Experience working in a regulated industry (e.g., healthcare or finance) preferred
- Experience with hierarchical, relational and unnormalized data formats preferred
What We Offer:
Our goal is to remove as many obstacles as we can so you are able to do the best work of your life. We offer the following benefits to help you do that:
- Opportunity to make an enormous impact on hundreds of millions of lives, while growing your career
- A team that is passionate about achieving our mission and each other’s success
- Medical, dental, and vision benefits
- Commuter stipend
- Quarterly learning stipend
- Phenomenal location within walking distance of the San Mateo Caltrain station