IBM

Site Reliability Engineering Professional (Compute SRE)

IBM

  •  Expires 10/06/2024
  •  Costa Rica


Job Details

Business Area: Information Technology
Position Requested: Systems Engineer
Vacancies: 1
Employment Type: Full-time
Experience Level: No experience
Maximum Salary (USD)
Minimum Salary (USD)
Vehicle: No preference
Country: Costa Rica
Department: Other

Job Description

Introduction
At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most challenging problems? If so, let's talk.

Your Role and Responsibilities
As a Compute Operations Site Reliability Engineer, you will perform the following tasks:

- Remotely administer Power Server hardware environments across numerous datacenter locations around the world (currently 18 datacenters and growing).
- Develop automation to reduce toil (manual, repetitive tasks that can be automated) using shell scripts (Bash, etc.), Python, Ansible, and related tools and languages.
- Perform code stack updates on infrastructure systems (VIOS, firmware, PowerVC, HMC, NovaLink, NIM servers) as well as cloud supporting systems (jump servers, sobox, network nodes, gateways, TSM servers).
- Upload/maintain stock images.
- Remotely administer AIX and Linux servers.
- Maintain user IDs (add/delete) and passwords.
- Monitor daily/weekly backups to ensure they are working.
- Manage and maintain the Nagios monitoring environment, and troubleshoot scripts/plug-ins when issues arise.
- Perform periodic Live Partition Mobility (LPM) operations, inactive migrations, or remote restarts of customer VMs to carry out system maintenance, balance workloads, or free up resources.
- Monitor and report the capacity utilized in each datacenter.
- Attend scheduled meetings planned by the customer for cutover/maintenance windows.
- Verify capacity requirements when customers report provisioning failures.
- Work with customers to resolve any RSCT issues so that LPM activities can be performed without impacting customer workloads.

Required Technical and Professional Expertise

- In-depth knowledge of Power Server hardware.
- Significant scripting/coding experience for automating all aspects of IBM Power systems administration.
- Automation using Python, shell scripting (Bash, etc.), Ansible, and related tools and languages.
- Experience with AIX and Linux administration, commands, and networking; the role requires experience at the OS level.
- Strong experience in one or more of the following: VIO, NovaLink, and PowerVC. Familiarity with at least one more of these (including installation, configuration, and administration).
- In-depth knowledge of PowerVM including installation/configuration and administration.
- High-level knowledge of the operating systems supported on Power Systems (AIX and IBM i).
- In-depth knowledge of how storage is connected and allocated to Power systems via NPIV connections.
- Good understanding of Power Systems network configuration at the system level.

Preferred Technical and Professional Expertise

- Experience configuring and tuning PowerVC.
- Experience training new personnel on tooling and processes.
- Storage & Power RTS, MVS Network for Cisco, Juniper; general support skills

Higher Education

Computer Science | Systems
Required
University Completed | Graduate