Data Engineer - Expert

What will you do?

• Develop, optimize, and manage robust ETL pipelines to handle large-scale data processing and transformation.

• Build and maintain data lake architectures on AWS S3 and data warehouse solutions on Redshift for efficient storage and retrieval.

• Use Databricks for advanced data processing, pipeline orchestration, and integration with downstream AI/ML systems.

• Leverage AWS services - EC2, EMR, RDS, S3, Athena, Redshift, Lambda, Glue ETL, Glue Crawlers, Glue Data Catalog, Glue Studio - for scalable, production-grade data solutions.

• Develop data processing logic in Python, PySpark, SQL, and object-oriented or functional languages such as Java or C++.

• Establish data quality assurance processes including validation, cleansing, and error handling to maintain data integrity.

• Enforce data privacy, governance, and security best practices to ensure compliance with corporate and regulatory policies.

• Monitor data pipeline performance and troubleshoot issues to maintain operational reliability and data availability.

• Integrate and optimize real-time and batch data ingestion pipelines to support analytics, AI models, and automation workflows.

• Implement infrastructure-as-code for data workloads and ensure CI/CD practices for data pipelines.

• Conduct code reviews, mentor junior data engineers, and define best practices for high-performance data engineering.

• Collaborate closely with data scientists, AI engineers, and business stakeholders to align data infrastructure with analytical and automation needs.

What skills and capabilities will make you successful?

o 5+ years of experience in data engineering, with successful delivery of enterprise-scale data projects.

o Proven track record in data lake design and ETL/ELT pipeline development on AWS S3, AWS Redshift, and Databricks.

o Strong knowledge of AWS cloud services including EC2, EMR, RDS, S3, Athena, Redshift, Lambda, Glue ETL, Glue Data Catalog, Glue Studio, and Glue Crawlers.

o Advanced skills in Python, PySpark, and SQL with exposure to Java, C++, or similar programming languages.

o Hands-on experience with big data processing frameworks (Hadoop, Spark, Kafka) and stream-processing systems (Storm, Spark Streaming).

o Proficiency with data pipeline orchestration tools (Airflow, Azkaban, Luigi).

o Knowledge of AWS log formats and expertise in processing operational and application logs at scale.

o Experience in data privacy, governance, and security frameworks suitable for enterprise environments.

o Strong ability to ensure data readiness for AI/ML and RPA use cases, including feature engineering pipelines and real-time data delivery.

o Excellent problem-solving, analytical thinking, and communication skills.

o Proven experience in mentoring technical teams and setting engineering best practices.



