Site Reliability Engineer (SRE)
Job Description

We are looking for a motivated junior Site Reliability Engineer (SRE) to join our infrastructure and reliability team. In this role, you will help ensure the reliability, availability, and performance of our systems while learning best practices in automation, monitoring, and incident management. You will work closely with senior SREs and software engineers to support production systems and improve operational efficiency.

This role is ideal for candidates with a strong interest in cloud infrastructure, automation, and reliability engineering who are looking to grow into a full‑fledged SRE.

Responsibilities

1. System Reliability & Operations

  • Assist in maintaining the availability, performance, and reliability of production systems
  • Monitor system health using dashboards, alerts, and logs
  • Perform routine operational tasks such as system checks, backups, and deployments
  • Participate in on-call rotation with guidance from senior team members

2. Incident Management

  • Respond to system alerts and incidents following established runbooks
  • Assist in troubleshooting and resolving production issues
  • Support post-incident reviews (postmortems) and help document lessons learned

3. Automation & Tooling

  • Use automation and AI-driven tools to design, implement, and maintain automated solutions for deployment, monitoring, scaling, and maintenance, with the aim of faster incident resolution and higher productivity
  • Help develop and maintain scripts and tools to automate repetitive operational tasks
  • Assist in improving CI/CD pipelines and deployment processes
  • Contribute to infrastructure automation using Infrastructure as Code (IaC) tools

4. Collaboration & Documentation

  • Work closely with software engineers to improve system design and operability
  • Maintain clear documentation for systems, procedures, and runbooks

Requirements

  • Bachelor’s degree in Computer Science, Engineering, Information Technology, or a related field (or equivalent practical experience)
  • Basic understanding of Linux/Unix systems and networking fundamentals
  • Familiarity with at least one programming or scripting language (e.g. Python, Bash, Go)
  • Basic knowledge of cloud platforms (AWS, GCP, or Azure)
  • Understanding of version control systems (e.g. Git)
  • Strong problem-solving skills and eagerness to learn
  • Basic understanding of AI products and concepts (e.g. capabilities and limitations of LLMs, common business use cases)
  • Communication and stakeholder management skills, with the ability to collaborate across diverse teams
  • Exposure to containers and orchestration tools (Docker, Kubernetes)
  • Experience with monitoring and observability tools (e.g. Prometheus, Grafana)
