This job listing has expired and may no longer be relevant!
26 Oct 2023

Site Reliability Engineer III (Traffic) at Wikimedia Foundation



Job Description

The Wikimedia Foundation is the nonprofit that hosts Wikipedia and our other free knowledge projects. We want to make it easier for everyone to share what they know. To do this, we keep Wikipedia and Wikimedia sites fast, reliable, and available to all. We protect the values and policies that allow free knowledge to thrive. We build new features and tools to make it easy to read, edit, and share from the Wikimedia sites. Above all, we support the communities of volunteers around the world who edit, improve, and add knowledge across Wikimedia projects.

Summary

As an engineer in the SRE team you will be involved in defining and running the infrastructure and services that form the base of Wikimedia Foundation projects. This will include frequent work with other members of the SRE team to improve our infrastructure in terms of scalability, high availability, recoverability, monitoring and logging. You will participate in incident response and be on call. You will also interact frequently with people outside SRE, such as Security, Release and Software Engineers, who all strive to maintain and improve MediaWiki and related software.

You are responsible for:

  • Performing day-to-day operational/DevOps tasks on Wikimedia’s public-facing infrastructure (deployment, maintenance, configuration, troubleshooting)
  • Implementing and utilizing configuration management and deployment tools (Puppet, Kubernetes)
  • Leading continuous improvement, by automating the installation, configuration and maintenance of services on our platform
  • Assisting in the architectural design of new services and making them operate at scale
  • Assisting in or leading incident response, diagnosis, and follow-up on system outages and alerts across Wikimedia’s production infrastructure
  • Sharing our values and working in accordance with them

Requirements

Skills and Experience:

  • 2+ years of experience in an SRE/Operations/DevOps role as part of a team
  • Experience with operating highly available infrastructure
  • Comfortable with shell and a programming language used in an SRE/Operations engineering context (Python, Go, Ruby, etc.)
  • Experience with package management for operating systems (Debian, etc.)
  • Comfortable with Open Source configuration management and orchestration tools (Puppet, Ansible, Terraform, etc.)
  • Past exposure to automation and streamlining of tasks
  • Communicative technical English

Additionally, we’d love it if you have:

  • A history of contributing to Open Source projects
  • Prior participation in the Wikimedia movement


Method of Application

Submit your CV and application on the company website.

Closing Date: 15 November 2023




