Data Engineer (Python + AWS)
Appsierra
Job Description
Job Scope: We are looking for candidates with a proven track record of handling large-scale data processing jobs, optimizing PySpark jobs, and designing efficient data pipelines. Candidates should have strong experience in big data processing and ETL engineering, with deep, hands-on knowledge of PySpark and Databricks, as well as strong experience with cloud platforms, particularly AWS. Familiarity with Snowflake warehousing technologies is a plus.
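To give candidates a concrete sense of the day-to-day work, here is a minimal sketch of the kind of PySpark pipeline this role involves: reading raw data from S3, applying a transformation, and writing partitioned Parquet. The bucket paths and column names are hypothetical placeholders, not a description of our actual systems.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Read raw CSV from S3 (bucket and path are hypothetical)
orders = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("s3://raw-zone/orders/")
)

# Basic cleansing and aggregation (column names are hypothetical)
daily_totals = (
    orders
    .filter(F.col("status") == "COMPLETED")
    .withColumn("order_date", F.to_date("created_at"))
    .groupBy("order_date")
    .agg(F.sum("amount").alias("total_amount"))
)

# Write partitioned Parquet to a curated zone for downstream consumers
(
    daily_totals.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://curated-zone/daily_order_totals/")
)
```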
Job Responsibilities:
∙ Build and maintain batch data pipelines across the advanced analytics platform.
∙ Design, develop, and orchestrate highly robust and scalable ETL/ELT pipelines.
∙ Write highly optimized code and perform Spark optimizations for big data use cases (see the sketch after this list).
∙ Design, develop, and deploy effective monitoring and testing strategies for the data products.
∙ Collaborate with stakeholders and advanced analytics business partners to understand business needs and translate requirements into scalable data engineering solutions.
∙ Collaborate with data scientists to prepare data for model development and production.
∙ Collaborate with data visualization and reporting application developers to ensure the sustainability of production applications and reports.
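As referenced in the responsibilities above, one common example of a Spark optimization is broadcasting a small dimension table in a join to avoid shuffling the large side. The sketch below is illustrative only; the table paths and join keys are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("join_optimization").getOrCreate()

facts = spark.read.parquet("s3://curated-zone/transactions/")  # large fact table
dims = spark.read.parquet("s3://curated-zone/merchants/")      # small dimension table

# Broadcasting the small table avoids a full shuffle of the large one
enriched = facts.join(F.broadcast(dims), on="merchant_id", how="left")

# Repartitioning by the partition column before the write keeps
# output file counts and sizes under control
(
    enriched.repartition("txn_date")
    .write.mode("overwrite")
    .partitionBy("txn_date")
    .parquet("s3://curated-zone/enriched_transactions/")
)
```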
REQUIREMENTS:
- Computer science degree from a good academic institution.
Mandatory Skills:
Expertise in SQL
PySpark
Databricks
AWS
Python with OOP concepts
Working knowledge of CSV, JSON, and Parquet file formats and API access in Python (see the sketch after this list)
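To make the file-format and API-access items concrete, a small sketch using pandas and requests is shown below. The endpoint URL and file paths are hypothetical, and the API is assumed to return a JSON array of records.

```python
import pandas as pd
import requests

# Read the three common formats (paths are hypothetical)
csv_df = pd.read_csv("data/events.csv")
json_df = pd.read_json("data/events.json", lines=True)  # newline-delimited JSON
parquet_df = pd.read_parquet("data/events.parquet")     # requires pyarrow or fastparquet

# Pull records from a REST API (hypothetical endpoint) into a DataFrame
resp = requests.get(
    "https://api.example.com/v1/events",
    params={"limit": 100},
    timeout=30,
)
resp.raise_for_status()
api_df = pd.DataFrame(resp.json())  # assumes the response is a list of records

# Persist as Parquet for downstream Spark jobs
api_df.to_parquet("data/api_events.parquet", index=False)
```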
Secondary Skills (good to have):
Snowflake
DBT
Linux CLI
Python web scraping
Datadog
GitHub