Skills Required:
Netezza
SQL Server 2012 Integration Services (SSIS)
Role and Responsibilities:
Understand requirements from product owners and translate them into requirement and scope documents
Select the technologies and services best suited to the requirements within scope
Create solutions for data ingestion and transformation using Hadoop services such as Spark, Spark Streaming, Hive, etc.
Create technical design documents to communicate solutions and mentor the team in solution development
Build solutions with AWS and Hadoop services as per design specifications
Assist teams in building test cases and support testing efforts
Coordinate with upstream, downstream, and other supporting teams for production implementation
Provide post-production support for implemented solutions
Mandatory Skills and Experience:
Strong hands-on working experience with AWS data services: EMR, S3, and Glue (Spark with Scala)
Strong hands-on experience with Hadoop services such as Spark
Extensive experience in building batch workloads on AWS using AWS EMR
Adept at analyzing and refining requirements and consumption query patterns, and at choosing the right technology fit (RDBMS, data lake, data warehouse)
Additional Technical Skills:
Knowledge of analytical data modeling on any RDBMS/MPP platform
Knowledge of Python
Practical experience migrating Hadoop-based data lakes from on-premises to AWS EMR on EMRFS
Nice-to-Have Skills and Experience:
Experience handling terabytes/petabytes of data and millions of transactions per day
Ability to develop ETL pipelines using Airflow
Knowledge of Spark Streaming or other streaming jobs
Ability to deploy code using AWS CodePipeline and Bitbucket
Expertise in programming languages such as Scala and Java, and comfort working on the Linux platform
Knowledge of cloud-based MPP platforms
Experience Required: 6-8 years