Overview

Interswitch is an Africa-focused integrated digital payments and commerce company that facilitates the electronic circulation of money as well as the exchange of value between individuals and organisations on a timely and consistent basis. We started operations in 2002 as a transaction switching and electronic payments processing company, and have progressively evolved into an integrated payment services company, building and managing payment infrastructure as well as delivering innovative payment products and transactional services throughout the African continent. At Interswitch, we offer unique career opportunities for individuals capable of playing key roles and adding value in an innovative and fun environment.

Job Position: Site Reliability Engineer

Job Location: Lagos

Job Description

  • Manage Availability and Capacity on the Core Applications. Provide support for the Applications and ensure their optimal performance. Implement setup of new Applications in the company’s environment.

Job Responsibilities

  1. Deployment of Applications
    • Support the deployment of Applications on the production environment
    • Implement projects involving the setup and deployment of new Applications and the enhancement of existing Applications
  2. Automation
    • Automate the activities involved in the management of Applications
  3. Application Environment Management
    • Ensure 24×7 availability of all Core Applications
    • Carry out capacity planning to ensure Applications are always available to meet demand
    • Create visibility into site health and key performance indicators of the Application Systems
    • Ensure up-to-date patching of the Application Systems and full compliance with security standards
    • Ensure up-to-date documentation of all Core Applications and of changes made to them
    • Balance feature development speed and reliability with well-defined Service Level Objectives (SLO) and Service Level Indicators (SLI)
  4. Systems Monitoring
    • Monitor the performance, health, and capacity of servers, databases, services, storage, and network links
    • Use a variety of monitoring tools such as Nagios, SolarWinds, Kibana, PagerDuty, AppDynamics, etc.
  5. Troubleshooting
    • Troubleshoot reported issues, and proactively identify areas in need of optimization
    • Work with technical support engineers to resolve critical incidents
    • Create and update clear troubleshooting guides for Applications
  6. Request Fulfilment
    • Implement requests relevant to the operation and enhancement of the Core Processing Applications

Job Requirements

  1. Academic Qualification(s) – Good First Degree in Computer Science, Computer Engineering, or a related field
  2. Professional Qualification(s) – Service Management certifications (e.g. ITIL) are an advantage
  3. Experience (Number of relevant years) – Minimum of one (1) year of relevant experience

Other Requirements:

  1. Expertise in Linux and Windows Operating systems and Shell scripting
  2. Technical experience working with cloud technologies
  3. Build and Deployment Management (Jenkins) in a CI/CD workflow
  4. Experience with Chef, Puppet or Ansible, automating all aspects of system and server management
  5. Good understanding of distributed systems and of container infrastructure and orchestration technologies such as Docker and Kubernetes
  6. Good understanding of SLO and SLI for Applications
  7. Experience with DNS, Networking and High Availability solutions
  8. Proficient in at least one of the following languages: Python, Ruby, Go
  9. Ability to work across teams to continuously analyze system performance in production, troubleshoot reported issues, and proactively identify areas in need of optimization
  10. Previous experience with developing and driving real-time monitoring solutions that provide visibility into site health and key performance indicators
  11. Working knowledge of databases
  12. Working understanding of load balancing technologies
  13. Working understanding of IT service management (Incident, Problem, Change and Knowledge management)
  14. Ability to work within a technical team of support engineers through day-to-day operations and critical incidents

Application Deadline
8th September, 2022.

How to Apply
Interested and qualified candidates should:
Click here to apply online
