We are seeking a highly skilled Senior Software Engineer (SSE) / Staff Data Engineer to design, build, and optimize large-scale data solutions. The ideal candidate will have strong expertise in Python, Spark, and AWS, with a proven track record of developing robust data pipelines, enabling advanced analytics, and ensuring data governance and security.
Key Responsibilities:
Data Engineering & ETL:
Design and implement scalable ETL pipelines for data ingestion, transformation, and integration.
Ensure data quality, governance, and compliance across all stages.
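For illustration, here is a minimal PySpark sketch of the ingest-transform-load pattern this role owns. The bucket paths, column names, and app name are hypothetical placeholders, not references to any actual system.

```python
# Minimal PySpark ETL sketch: ingest raw CSV, clean and type it, and write
# partitioned Parquet. All paths and columns are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Ingest: read raw files from a (hypothetical) landing bucket.
raw = spark.read.option("header", True).csv("s3://example-landing/orders/")

# Transform: enforce types, drop rows missing required keys, and derive a
# partition column for efficient downstream querying.
clean = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .dropna(subset=["order_id", "order_ts"])
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write curated, partitioned Parquet for analytics consumers.
clean.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-curated/orders/"
)
```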
Big Data & Distributed Systems:
Develop and optimize solutions using Apache Spark, Hadoop, Hive, and Kafka.
Handle large-scale data processing and real-time streaming.
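As a flavor of the streaming side, here is a minimal Spark Structured Streaming sketch that consumes a Kafka topic and appends parsed events to Parquet. The broker address, topic name, schema, and paths are illustrative assumptions, and the spark-sql-kafka connector is assumed to be on the classpath.

```python
# Minimal Kafka -> Spark Structured Streaming -> Parquet sketch.
# Brokers, topic, schema, and paths below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("events-stream").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("value", DoubleType()),
])

# The Kafka source exposes a binary `value` column; parse it as JSON.
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092")
         .option("subscribe", "events")
         .load()
         .select(F.from_json(F.col("value").cast("string"),
                             event_schema).alias("e"))
         .select("e.*")
)

# Append parsed events to Parquet; the checkpoint makes the query restartable.
query = (
    events.writeStream.format("parquet")
          .option("path", "s3://example-curated/events/")
          .option("checkpointLocation", "s3://example-checkpoints/events/")
          .outputMode("append")
          .start()
)
query.awaitTermination()
```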
Cloud & Infrastructure:
Build and maintain data solutions on AWS (S3, Glue, Redshift, Athena, IAM).
Implement secure and cost-efficient cloud architectures.
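For the AWS side, here is a minimal boto3 sketch of querying curated S3 data through Athena. The region, database, table, and output location are hypothetical.

```python
# Minimal boto3 Athena sketch: submit a query, poll for completion, and
# print the first page of results. All identifiers below are hypothetical.
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

run = athena.start_query_execution(
    QueryString="SELECT order_date, COUNT(*) AS n FROM orders GROUP BY order_date",
    QueryExecutionContext={"Database": "example_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
query_id = run["QueryExecutionId"]

# Athena is asynchronous: poll the execution state until it is terminal.
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)[
        "QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    page = athena.get_query_results(QueryExecutionId=query_id)
    for row in page["ResultSet"]["Rows"]:
        print([col.get("VarCharValue") for col in row["Data"]])
```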
APIs & Automation:
Develop RESTful APIs and automation scripts for data services and integrations.
Collaborate with application teams to enable data-driven microservices.
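To illustrate the API work, here is a minimal Flask sketch of a read-only data-service endpoint. The route, in-memory lookup, and response shape are placeholders for what would normally be backed by a warehouse such as Redshift or Athena.

```python
# Minimal Flask data-service sketch. The in-memory dict stands in for a
# real warehouse lookup; the route and payload shape are hypothetical.
from flask import Flask, abort, jsonify

app = Flask(__name__)

_ORDERS = {"o-1": {"order_id": "o-1", "amount": 42.0}}

@app.route("/orders/<order_id>")
def get_order(order_id):
    order = _ORDERS.get(order_id)
    if order is None:
        abort(404)  # unknown order id
    return jsonify(order)

if __name__ == "__main__":
    app.run(port=8080)
```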
DevOps & CI/CD:
Implement CI/CD pipelines using Jenkins, Git, Docker, and Kubernetes.
Ensure smooth deployment and monitoring of data applications.
Security & Compliance:
Apply best practices for data governance, risk management, and regulatory compliance.
Agile Collaboration:
Work closely with cross-functional squads and stakeholders to deliver data solutions in an Agile environment.
Required Skills & Experience:
Strong programming skills in Python (Scala experience is a plus).
Hands-on experience with Spark, Hadoop, Hive, and Kafka.
Expertise in AWS services (S3, Glue, Redshift, Athena, IAM).
Solid understanding of ETL processes, data modeling, and pipeline orchestration.
Familiarity with RESTful APIs, microservices, and automation scripting.
Knowledge of CI/CD tools, containerization (Docker), and orchestration (Kubernetes).
Understanding of data security, compliance, and governance principles.
Excellent problem-solving skills and ability to work in a fast-paced Agile environment.
Preferred Qualifications:
Experience with data lake and data warehouse architectures.
Exposure to machine learning pipelines or advanced analytics.
Certification in AWS or Big Data technologies.