This job listing has expired and may no longer be relevant!
25 Sep 2023

Engineer, Reliability at Standard Bank Group


Job Description

Standard Bank Group is the largest African banking group by assets, offering a full range of banking and related financial services. “Africa is our home, we drive her growth.” Our vision is to be the leading financial services organisation in, for and across Africa, delivering exceptional client experiences and superior value. This sets the primary goals and standard of excellence we intend to achieve in the medium term.

Job Purpose

To create a bridge between development and operations by applying a software engineering mindset to system administration. To focus on operations/on-call duties and developing systems and software that help increase site reliability and performance. To build self-service tools for users that rely on such services; to collaborate with product developers to ensure that the designed solution responds to non-functional requirements such as availability, performance, security, and maintainability.

Key Deliverables 

  • Automate CI/CD pipeline for both legacy architecture and containerized platforms using infrastructure as code and software development skills so as to increase the speed and quality of software delivery.
  • Automate the provision of, and modifications to infrastructure of production and non-production environments to minimize configuration drift and maintain consistency across environments.
  • Build dashboards to improve visibility of the build and release processes, system performance, availability, latency, throughput and error rate.
  • Conduct and document post-mortems and incident reviews, and take action on outcomes to maximise learnings so as to prevent repeat incidents and improve future responses.
  • Continuously improve monitoring, incident response, and the optimisation of service availability and performance, and suggest methodical approaches for implementation. Communicate proposed changes across the organisation to ensure efficient and structured production support and emergency response.
  • Define and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack.
  • Define and implement mechanisms to monitor service-level indicators for the underlying service by setting units of measurement that define the service level that customers can expect of the system, defining the desired outputs of the system in terms of availability, and communicating the expected reliability of the service to customers in order to facilitate the speed at which business can release new features and services.
  • Design and implement monitoring solutions in order to identify performance errors and maintain service availability.
  • Develop software to automate manual processes to expedite problem detection and mitigation.
  • Drive collaboration between people, processes and technology to lead to a proactive system of incident response and remediation.
  • Drive the improvement of service performance metrics such as latency, page load speed and ETL by proactively identifying performance issues across the system so that customers are enabled to make full use of the system.
  • Ensure an efficient system for incident response by making the appropriate information available in order to quickly identify and fix problems.
  • Identify and automate manual and repetitive work to reduce toil.
  • Identify and implement mechanisms to reduce the noise in alerting and maximise the signal, so that notifications are sent only for problems that need human intervention and are directly related to a defined and agreed SLO.
  • Identify opportunities and implement solutions to optimise service monitoring, availability and performance.
  • Provide insight and guidance on the end-to-end performance and operability of a service. Partner with development teams to define and implement improvements in service architecture.
  • Provide insights into the design and implementation of services with a focus on security, resiliency, scale, and performance by having a rich understanding of the end-to-end configuration, technical dependencies, and overall behavioural characteristics of the production service/s.
  • Validate recovery and failover strategies by performing rigorous system failure testing.

QUALIFICATIONS

Minimum Qualifications

  • Type of Qualification: First
  • Field of Study: Information Technology

Experience Required
Software Engineering

  • Technology
  • 5-7 years
  • Proven experience in IT software development with at least one programming language, and experience building scalable systems with service-oriented architectures.

ADDITIONAL INFORMATION

Behavioral Competencies:

  • Adopting Practical Approaches
  • Articulating Information
  • Checking Details
  • Developing Expertise
  • Documenting Facts
  • Embracing Change
  • Examining Information
  • Interpreting Data
  • Managing Tasks
  • Producing Output
  • Taking Action
  • Team Working

Technical Competencies:

  • Application Knowledge for Support
  • Business Continuity and Disaster Recovery Planning
  • Information Technology Architecture
  • Infrastructure and Platforms Support
  • IT Design Driven Development
  • Service Management Processes
  • Use of Build and Test Automation
  • Use of Version Control


Method of Application

Submit your CV and application on the company website.

Closing Date: 15 October 2023




