Sr Data Engineer
Mexico (Remote)
Full-Time
Tools: SQL, Python (application-level experience, not just scripting), AWS, Snowflake
The Senior Data Engineer will act as the technical liaison between multiple groups, including the data science team, the engineering team, product management, and business stakeholders. No prior insurance knowledge is required; however, you must quickly dive deep into the insurance world and ask questions to become a subject matter expert. You will be responsible for building a data platform that supports the data science team. You must be a self-starter who can build out features such as a data pipeline from scratch, with support from both engineering and data science for any buildout. This is a senior-level position.
Position Responsibilities
Assemble large, complex data sets that meet functional / non-functional business requirements.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater flexibility, etc.
Build and maintain the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using AWS technologies, SQL, Python, Docker, and Airflow (a minimal Airflow sketch follows this list).
Work with stakeholders including the Executive, Product, Data Science, and Engineering teams to assist with data-related technical issues and support their data infrastructure needs.
Work with data science and analytics teams to extend data systems with greater functionality using existing infrastructure and tooling.
Take ownership of technical project implementations from the requirements-gathering stage through initial release and maintenance, using a Kanban approach to track key milestones.
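To make the pipeline work above concrete, here is a minimal Airflow sketch of the extract-then-load pattern. The DAG id, task names, and helper bodies are hypothetical, and the `schedule` argument assumes Airflow 2.4+ (older releases use `schedule_interval`):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract() -> None:
    # Hypothetical: pull raw policy data from an external API or S3 drop.
    ...


def load() -> None:
    # Hypothetical: stage the transformed records into Snowflake.
    ...


# Minimal two-task DAG: extract runs before load, once per day.
with DAG(
    dag_id="example_policy_pipeline",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task
```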
Minimum Qualifications
5+ years of data engineering experience, from requirements through production and maintenance
Bachelor’s degree in Computer Science, a related field, or equivalent experience
Strong experience building integrations between external systems and a Snowflake data warehouse, preferably using custom Python code to wrangle messy data sources (see the first sketch after this list)
5+ years of experience developing production-grade Python applications (not just scripts) in a cloud-native production environment
Strong knowledge of Python software development best practices: modular design, PEP 8 compliance, type hinting, and test automation (pytest preferred); see the second sketch after this list
Experience with CI/CD concepts and tooling (specific experience with GitHub Actions preferred)
Comfortable working with lightweight internal frameworks and building abstractions using object-oriented programming or other design patterns in Python
Experience building and maintaining cloud infrastructure, preferably with AWS cloud services: EC2, ECS, Batch, S3
Experience with version control: git
Experience with container technologies: Docker
Experience with Python packaging and dependency management (poetry or uv preferred)
A successful history of transforming, processing, and extracting value from large, disconnected datasets drawn from a variety of data sources (flat files, Excel, databases, APIs, etc.)
Experience building processes supporting data transformation, data structures, metadata, dependency, and workload management.
Experience taking hands-on technical ownership of projects ranging from small to enterprise impact, and leading communication with stakeholders
Experience working dynamically and collaboratively in a small team within a complex, fast-moving environment
Strong ability to mentor, collaborate with, and communicate with other team members and cross-functional stakeholders
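As a first sketch of the Snowflake integration work named above, here is a minimal example using the snowflake-connector-python package. The account, credentials, warehouse, database, schema, and table names are all placeholders, and error handling is omitted:

```python
import os

import snowflake.connector  # pip install snowflake-connector-python


def load_rows(rows: list[tuple[int, float]]) -> None:
    """Insert wrangled rows into a Snowflake table (names are placeholders)."""
    conn = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        warehouse="ANALYTICS_WH",  # hypothetical warehouse
        database="RAW",            # hypothetical database
        schema="POLICIES",         # hypothetical schema
    )
    try:
        cur = conn.cursor()
        # The connector's default paramstyle accepts %s placeholders.
        cur.executemany(
            "INSERT INTO policy_premiums (policy_id, premium) VALUES (%s, %s)",
            rows,
        )
    finally:
        conn.close()
```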
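And as a second sketch, a small example of the Python style we mean by modular, type-hinted, pytest-tested code; the premium-parsing helper is purely illustrative:

```python
from __future__ import annotations


def normalize_premium(raw: str | None) -> float | None:
    """Parse a messy premium string such as ' $1,200.50 ' into a float."""
    if raw is None or not raw.strip():
        return None
    return float(raw.strip().lstrip("$").replace(",", ""))


# A matching pytest test would normally live in tests/test_premiums.py:
def test_normalize_premium() -> None:
    assert normalize_premium(" $1,200.50 ") == 1200.50
    assert normalize_premium("") is None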
Strongly Preferred Qualifications
Insurance industry systems and technology experience
Experience with data pipeline and workflow management tools: Airflow, Jenkins, AWS Glue, Azkaban, Luigi, etc.
Experience working with relational databases, including strong SQL query authoring and working familiarity with a variety of engines (Redshift, MySQL, MSSQL, etc.)
Preferred Qualifications
The following technologies represent our internal technology stack. Experience with these tools is preferred but not required if you have equivalent experience with similar technologies.
Experience working with Python packages: SQLAlchemy and Pydantic (see the first sketch after this list)
Experience working with Python in concurrent or parallel settings: threading, asyncio, multiprocessing (see the second sketch after this list)
Experience writing Python code with statically enforced types via a type checker (mypy, pyright, or similar)
Strong experience building, optimizing, and debugging data models, pipelines, and data warehouses using dbt
Strong analytic skills related to working with unstructured datasets
Experience with AWS Database Migration Service for full-load or change-data-capture tasks
Experience developing inside a dev container for local application development
Experience with VS Code as your editor
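For the SQLAlchemy, Pydantic, and static-typing items above, here is a minimal, self-contained sketch; an in-memory SQLite database stands in for a real warehouse so the example runs anywhere, and the table and model names are made up:

```python
from pydantic import BaseModel
from sqlalchemy import create_engine, text


class Policy(BaseModel):
    """Pydantic model validating each row as it leaves the database."""
    policy_id: int
    premium: float


# SQLite in memory keeps the sketch self-contained; a real pipeline would
# point the engine at a warehouse instead (e.g. via snowflake-sqlalchemy).
engine = create_engine("sqlite:///:memory:")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE policies (policy_id INTEGER, premium REAL)"))
    conn.execute(text("INSERT INTO policies VALUES (1, 1200.50), (2, 980.00)"))
    rows = conn.execute(text("SELECT policy_id, premium FROM policies")).mappings()
    policies = [Policy(**row) for row in rows]

print(policies)  # two validated Policy objects
```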
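And for the concurrency item, a second small sketch using asyncio; the fetch coroutine and source names are hypothetical stand-ins for real I/O-bound API calls:

```python
import asyncio


async def fetch_source(name: str) -> str:
    """Hypothetical stand-in for an I/O-bound call to an external data source."""
    await asyncio.sleep(0.1)  # simulate network latency
    return f"{name}: ok"


async def main() -> None:
    # Fan out over several sources concurrently rather than sequentially.
    sources = ["policies_api", "claims_api", "rates_feed"]  # hypothetical names
    results = await asyncio.gather(*(fetch_source(s) for s in sources))
    print(results)


asyncio.run(main())
```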