GCP shop - GCP experience is a must have
Databricks on GCP experience is required
Healthcare, pharma, or medical device experience is a nice to have; other regulated-industry experience is a secondary plus
Primary Responsibilities:
Design, develop, and maintain data pipelines to ingest, transform, and store healthcare data from various enterprise data sources
Analyze raw data sources and define the data transformation and structural requirements for new software and applications
Identify, design, and implement internal process improvements, such as:
Encrypting data
De-identifying data to ensure data security
Implementing role-based data access
Automating manual processes
Optimizing data delivery
Tuning data processes for greater scalability
Design, build, and implement automation for production releases of data engineering, machine learning, and business intelligence processes
Code, test, document, and maintain high-quality, scalable big data solutions
Define security and backup strategies for data solutions
Build and implement batch and near-real-time data pipelines
Help to ensure successful deployment of newly implemented data solutions
Work with less structured, more complex issues
Serve as a technical leader and mentor to others
You’ll be rewarded and recognized for your performance in an environment that will challenge you, give you clear direction on what it takes to succeed in your role, and provide development for other roles you may be interested in.
Required Qualifications:
High School Diploma/GED (or higher)
6+ years of experience in Data Engineering
2+ years of experience in Python (or Scala/Java)
2+ years of experience in SQL, including query optimization, performance tuning, and handling large datasets
2+ years of hands-on experience with data lake and data warehouse optimization techniques and the Medallion Architecture
2+ years of experience with job workflow and token management
2+ years of experience in data migration, cross-cloud integrations (AWS/Azure experience is a bonus), and cloud networking
1+ years of proven experience with GitHub and cloud environment tools
1+ years of hands-on experience developing big data processes with tools such as Spark, Hadoop, and Kafka
Preferred Qualifications:
Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent industry experience
Google Cloud Professional Data Engineer certification
Experience leading teams, managing people, and mentoring
Experience with data visualization tools and technologies
Experience with metadata management and data catalog processes, tools, and technologies
Experience using Spark for data processing and machine learning
Hands-on experience with related/complementary open-source software platforms and languages (e.g., Java, Angular, Scala, Linux, Apache, Unix)
Experience with API technologies
Healthcare industry knowledge and experience, with exposure to EDI, HIPAA, HL7, and FHIR integration standards