Ensure the stability of the company's exchange business, respond quickly to incidents together with the R&D team, and establish mechanisms that improve incident-handling efficiency
Participate in building operation and maintenance tools and platforms and in identifying system risks (including DB/middleware), and promote operation and maintenance automation
Drive system optimization through continuous, comprehensive analysis of operational data (including historical incidents, online issues, resource utilization, etc.)
Handle alerts and ensure each one is properly triaged and resolved
Define operation and maintenance standards to raise the overall quality of operations
Build budget management, cost measurement, cost monitoring, and cost optimization systems, provide solutions for cost governance, and promote their implementation
Requirements
Bachelor's degree or above in computer science or a related field, with 3+ years of experience in SRE, operation and maintenance, or cloud-native work
Solid fundamentals in computer software; proficient in day-to-day operation, maintenance, and troubleshooting of Linux systems
Proficient in the principles, operation, and maintenance of core distributed-system components, such as MySQL (master-slave replication, read-write separation), Redis (clustering, persistence), and Kafka (reliability of message delivery)
Familiar with one or more scripting or programming languages, such as Python, Shell, or Go
Possess systematic problem-solving skills, good communication skills, and a sense of ownership
Experience with related computing, distributed, or big data systems (Nginx, Kubernetes, Docker, etc.) is preferred
Nice-to-Have
Experience operating, maintaining, and optimizing blockchain nodes
Experience operating and maintaining exchange or DeFi projects
*Only shortlisted candidates will be contacted.