About Tencent
Tencent is an Internet-based platform company founded in Shenzhen, China, in 1998. We use technology to enrich the lives of Internet users and assist the digital upgrade of enterprises. Our mission is "Value for Users, Tech for Good". We embrace a culture of teamwork and creativity and are driven by our values: Integrity, Proactivity, Collaboration and Creativity.
We are rapidly expanding our international operations and are looking for top talent to propel us forward. Combining the results-oriented nature of a start-up with the resources of a profitable and leading Internet company, Tencent offers a unique opportunity for aspiring individuals to thrive.
About WeChat
With over 1.3 billion users worldwide, WeChat is changing the mobile landscape by connecting people, services, and businesses in China and around the world. The WeChat team in Singapore is responsible for managing and growing our core product, including messaging and social networking, for users around the world.
Join the WeChat team to play an impactful role in keeping people around the world connected, help redefine how people use their mobile devices to communicate and interact online, and better understand user behaviour and preferences.
Roles & Responsibilities
Ensure site reliability by managing the deployment, scaling, and maintenance of new and existing online services that connect over a billion users around the world
Leverage your engineering skills while working directly with developers in order to help test and diagnose issues with newly deployed services, infrastructure resources, or code before and after they reach the production environment
Manage high-severity and user-impacting incidents by focusing on service monitoring, alerting, and rapid recovery
Use stress testing to help measure, tune, and optimize system performance and reliability for a wide variety of services
Develop and maintain automation tools/systems to help eliminate repetitive manual operations and ensure better site reliability
Produce and maintain documentation and standard operating procedures (SOPs) to more efficiently and reliably handle regular operations in conjunction with colleagues around the world
Qualifications
Bachelor's or higher degree in Computer Science, Computer Engineering, or related fields
Prior work experience in Cloud Engineering, Site Reliability Engineering (SRE), or DevOps for a major, public-facing internet service
Hands-on experience with at least one of the following programming languages: Bash, Go, or Python
Good command of the Linux environment, with a deep understanding of the Linux operating system, including the kernel, memory, processes, threads, static and shared libraries, IPC, RPCs, and signals
Understanding of standard networking protocols such as HTTP, DNS, SSL, TCP/IP, and ICMP
Experience in large-scale distributed environments and familiarity with distributed systems concepts, including the CAP theorem and microservices
Experience with container technologies such as Docker and Kubernetes
Experience with monitoring tools like Prometheus and Zabbix
Demonstrated strong sense of ownership, reliability, and integrity
Passion for eliminating repetitive manual processes via automation
Fast learner and a good team player
Fluency in both English and Mandarin to work with international stakeholders and stakeholders based at HQ