We're seeking a Senior Data Engineer with expertise in building scalable data architectures and real-time data processing systems. You'll design and implement large-scale data pipelines that process unstructured data and turn it into actionable insights and business value.
Key Responsibilities:
Design and develop scalable data architectures using Spark Streaming, PySpark, and Scala
Process and analyze large volumes of unstructured data from various sources
Build and maintain real-time data pipelines for data integration and analytics (see the example sketch after this list)
Collaborate with cross-functional teams to integrate data insights into business applications
Optimize data processing workflows for performance, reliability, and scalability
Troubleshoot data pipeline issues and ensure high data quality
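For a sense of the day-to-day work, here is a minimal PySpark Structured Streaming sketch of the kind of real-time pipeline this role involves. It is illustrative only: the Kafka broker, topic name, and event schema are assumptions for the example, not details of our stack.

# Minimal sketch of a streaming pipeline: read JSON events from Kafka,
# parse them, and count events per type in 5-minute windows.
# Broker address, topic, and schema below are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("event-pipeline-sketch").getOrCreate()

# Hypothetical schema for semi-structured JSON events.
event_schema = StructType([
    StructField("user_id", StringType()),
    StructField("event_type", StringType()),
    StructField("event_time", TimestampType()),
])

# Read the raw event stream from a Kafka topic.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
)

# Parse the JSON payload and aggregate with a watermark for late data.
counts = (
    raw.select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
    .withWatermark("event_time", "10 minutes")
    .groupBy(F.window("event_time", "5 minutes"), "event_type")
    .count()
)

# Write running aggregates to the console; a production pipeline would
# target a durable sink such as Kafka, Delta, or a warehouse table.
query = counts.writeStream.outputMode("update").format("console").start()
query.awaitTermination()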
Requirements:
10+ years of experience in data engineering, with a focus on building scalable data systems
Strong expertise in Spark Streaming, PySpark, and Scala
Experience working with large-scale unstructured data and real-time data processing
Proficiency in data processing frameworks and tools (e.g., Apache Spark, Apache Kafka)
Strong analytical and problem-solving skills, with attention to detail and a focus on scalability
Excellent communication and collaboration skills
Nice to Have:
Experience with machine learning algorithms and model deployment
Knowledge of cloud-based data platforms (e.g., AWS, GCP, Azure)
Familiarity with containerization (e.g., Docker) and orchestration (e.g., Kubernetes)
What We Offer:
Competitive salary and benefits package
Opportunity to work on complex data engineering projects
Collaborative and dynamic work environment