2 Nov 2023

Data Engineer at SankuPHC



Job Description

Every day 8,000 children die and 2 billion people suffer from preventable illnesses because their diets lack basic vitamins and minerals. Sanku has the solution. Our vision: a world where everyone, everywhere, has guaranteed, affordable access to the nutrients they need to survive and thrive.

Role Overview: 

The Data Engineer role at Sanku Project Healthy Children is vital to the orchestration and optimization of the organization’s data pipeline. Tasked with ensuring the seamless integration of diverse data sources into a unified, secure, and fault-tolerant system, the role emphasizes the robustness and reliability of data flows. The Data Engineer is responsible for monitoring cloud data systems, overseeing database administration, administering data governance, and ensuring compliance with local and international regulations. Collaborating closely with the Data Scientist and Senior Data Analyst, the Data Engineer ensures the timely, secure, and efficient processing of data, empowering the organization to fulfill its mission effectively.

Key Responsibilities: 

Data Pipeline Administration: 

  • Monitor the data pipeline on AWS, ensuring its robustness and reliability.
  • Build, maintain, and optimize data models, data structures, and ETL processes using Apache Airflow and Python (see the sketch after this list).
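
By way of illustration, an ETL job of the kind this responsibility involves might look like the minimal Airflow sketch below. The DAG id, task names, schedule, and record fields are hypothetical placeholders, not Sanku's actual pipeline.

  # Minimal Airflow 2.x ETL sketch; all names are illustrative.
  from datetime import datetime, timedelta

  from airflow import DAG
  from airflow.operators.python import PythonOperator

  def extract():
      # Pull raw records from a source system (placeholder logic).
      return [{"device_id": 1, "flour_kg": 25.0}]

  def transform():
      # Clean and reshape the extracted records (placeholder logic).
      pass

  def load():
      # Write the transformed records to the warehouse (placeholder logic).
      pass

  with DAG(
      dag_id="example_etl",                  # hypothetical name
      start_date=datetime(2023, 1, 1),
      schedule="@daily",                     # Airflow 2.4+; older versions use schedule_interval
      catchup=False,
      default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
  ) as dag:
      t_extract = PythonOperator(task_id="extract", python_callable=extract)
      t_transform = PythonOperator(task_id="transform", python_callable=transform)
      t_load = PythonOperator(task_id="load", python_callable=load)

      t_extract >> t_transform >> t_load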

Performance Monitoring: 

  • Regularly monitor data performance with CloudWatch and make pipeline modifications as needed (see the sketch after this list).
  • Debug complex data pipeline issues, ensuring continuous data flow.
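
For instance, a pipeline task can publish custom metrics that CloudWatch dashboards and alarms then watch. A minimal boto3 sketch, with a hypothetical namespace and metric name, might look like this:

  # Publish a custom pipeline metric to CloudWatch; names are illustrative.
  import boto3

  cloudwatch = boto3.client("cloudwatch")  # region and credentials come from the environment

  def report_rows_loaded(pipeline: str, rows: int) -> None:
      # One data point per pipeline run; an alarm can fire if this drops to zero.
      cloudwatch.put_metric_data(
          Namespace="DataPipeline",  # hypothetical namespace
          MetricData=[{
              "MetricName": "RowsLoaded",
              "Dimensions": [{"Name": "Pipeline", "Value": pipeline}],
              "Value": float(rows),
              "Unit": "Count",
          }],
      )

  report_rows_loaded("example_etl", 1200)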

Database Administration: 

  • Oversee and optimize databases such as MySQL for efficient data handling and querying.
  • Generate, document, and test the scripts essential for operational metrics and reports (an example follows this list).
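
An operational-metrics script of this kind might be a short, documented Python program against MySQL. The schema, credentials, and query below are placeholders, not Sanku's actual database:

  # Documented, testable reporting script against MySQL (hypothetical schema).
  import pymysql

  def daily_active_devices(conn):
      # Rows of (day, active_devices) for the last 30 days.
      query = """
          SELECT DATE(reported_at) AS day,
                 COUNT(DISTINCT device_id) AS active_devices
          FROM device_readings
          WHERE reported_at >= NOW() - INTERVAL 30 DAY
          GROUP BY DATE(reported_at)
          ORDER BY day
      """
      with conn.cursor() as cur:
          cur.execute(query)
          return cur.fetchall()

  if __name__ == "__main__":
      conn = pymysql.connect(host="localhost", user="report",
                             password="change-me", database="ops")
      for day, n in daily_active_devices(conn):
          print(day, n)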

Data Validation and Cleaning: 

  • Validate data extracted from the pipeline against other relevant data sources.
  • Automate processes using AWS Lambda, ensuring consistent and accurate data extraction (see the sketch after this list).
  • Develop and implement algorithms to clean and validate data.
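
As an illustration of Lambda-based validation, the handler below splits incoming records into valid and rejected sets. The event shape and required fields are assumptions made for the sketch:

  # AWS Lambda handler sketch; event shape and field names are illustrative.
  import json

  REQUIRED_FIELDS = {"device_id", "timestamp", "flour_kg"}  # hypothetical schema

  def lambda_handler(event, context):
      valid, rejected = [], []
      for rec in event.get("records", []):
          missing = REQUIRED_FIELDS - rec.keys()
          if missing:
              rejected.append({"record": rec, "missing": sorted(missing)})
          else:
              valid.append(rec)
      # Writing the valid rows onward (e.g. to S3 or RDS) is omitted in this sketch.
      return {
          "statusCode": 200,
          "body": json.dumps({"valid": len(valid), "rejected": len(rejected)}),
      }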

Collaboration: 

  • Work closely with the Data Scientist and Senior Data Analyst to refine data-driven strategies.
  • Assist teams with data-related technical issues and fulfill their data pipeline needs.
  • Identify system enhancements and recommend changes.

Data Governance Administration: 

  • Implement and enforce standards and guidelines across the ETL and data pipeline processes.
  • Work collaboratively with stakeholders to define and maintain metadata standards, ensuring consistent data definitions and clarity.
  • Oversee data quality assurance processes, ensuring data integrity and reliability throughout the data pipeline.
  • Advocate for data privacy and security protocols, ensuring compliance with relevant local and international regulations.

Data Engineer Stack: 

  • Data Warehousing: Amazon S3, Amazon RDS
  • Data Integration & Processing: Apache Airflow, Python, AWS Lambda
  • Monitoring & Alerts: CloudWatch
  • Database Management: MySQL
  • Data Visualization & Reporting: Power BI
  • Data Exchange: REST APIs, JSON, NetSuite REST API, Postman
  • Infrastructure & Networking: AWS ECS, AWS EC2, AWS VPC, AWS Subnets, AWS Route Tables, AWS Security Groups
  • Automation & Scripting: IaC automation using Terraform, SQL, Python, Bash and Linux Scripting
  • Version Control: Git, AWS CodeCommit, GitHub Actions, AWS Code Pipeline

Key Performance Indicators (KPIs): 

  • Data Pipeline Efficiency: Measure the performance, speed, reliability, and cost-effectiveness of data pipelines, ensuring data is available for analysis in a timely manner.
  • Data Accuracy and Integrity: Monitor the accuracy of data ingested into systems and ensure that data cleaning processes are effectively maintaining the quality of data.
  • System Uptime and Resilience: Ensure that data pipelines, including databases and ETL processes, are consistently available with minimal downtime.

Qualifications & Experience: 

  • Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or a related field.
  • Minimum of 5 years of progressive experience in data engineering, especially in large-scale and complex FMCG data environments.
  • Proficiency in the aforementioned data stack.
  • Demonstrated ability to build scalable data models and data pipelines.
  • Familiarity with big data tools and environments.
  • Strong problem-solving skills and analytical thinking.
  • Ability to work in a team-oriented environment and communicate effectively.




Method of Application

Submit your CV and application on the company website.

Closing Date: 15 November 2023




