Duties
Design, develop, and optimize ETL processes using tools such as Talend and Informatica, along with custom scripts (Python, Bash, and other shell scripting), to extract, transform, and load data from sources including Oracle, Microsoft SQL Server, and cloud storage solutions such as Azure Data Lake.
Build and maintain scalable data pipelines utilizing big data frameworks such as Hadoop, Apache Hive, Spark, and related technologies to handle large volumes of structured and unstructured data.
Develop and implement data models for warehousing solutions supporting analytics platforms like Looker and other BI tools.
Collaborate with data scientists on model training and validation by providing clean, well-structured datasets.
Design database schemas and optimize database performance by applying sound database design principles.
Integrate diverse data sources including linked data environments to enable comprehensive analysis capabilities.
Develop RESTful APIs for seamless data access across applications.
Ensure compliance with security standards and best practices for data privacy and governance.
Participate in Agile development cycles to deliver continuous improvements in data infrastructure.
Conduct analysis to identify opportunities for process automation and efficiency improvements within the data ecosystem.
Requirements
Proven experience with cloud platforms such as AWS and Azure, including Azure Data Lake for scalable storage.
Strong programming skills in Java, Python, VBA, and Bash or other Unix shell scripting.
Extensive knowledge of big data technologies including Hadoop, Spark, Apache Hive, and related ecosystems.
Proficiency with ETL tools such as Talend or Informatica; experience with SQL databases including Microsoft SQL Server and Oracle.
Familiarity with modern analytics tools like Looker; understanding of linked data concepts.
Experience designing large-scale data warehouses supporting enterprise analytics initiatives.
Knowledge of RESTful API development for system integrations.
Ability to work within Agile methodologies to deliver iterative enhancements efficiently.
Strong analytical skills with the ability to interpret complex datasets to inform business strategies.
An excellent understanding of database design principles, combined with experience in model training for predictive analytics, is a plus.
This position offers an opportunity to contribute to innovative projects at the forefront of data technology while working in a collaborative environment focused on continuous learning and growth.
Job Type: Full-time
Pay: $70,000.00-$130,000.00 per year