Google will be prioritizing applicants who have a current right to work in Singapore, and do not require Google's sponsorship of a visa.
This role requires you to work in a shift pattern or non-standard work hours as required. This may include weekend work.
Minimum qualifications:
Bachelor's degree in Science, Technology, Engineering, Mathematics, or equivalent practical experience.
6 years of experience writing code in one or more general purpose programming languages (e.g., C++, Java, Python, Go).
Preferred qualifications:
Experience working directly with AI/ML computing hardware, including GPUs or other accelerators.
Experience working with large-scale distributed systems, and with common solutions, design patterns, or best practices.
Experience with containerization and orchestration technologies like Kubernetes or Slurm in an on-prem or cloud environment.
Experience with ML frameworks (e.g., TensorFlow, PyTorch).
Experience troubleshooting and advocating for customer needs, and triaging technical issues across the stack (e.g., hardware faults, low-level software, networking, virtualization, kernel drivers, firmware, and performance).
Ability to participate in an on-call rotation, including non-standard working hours, night shifts, weekends and holidays.
About the job
In this role, you will be a part of a global team that provides support to help customers seamlessly make the switch to Google Cloud. When customers cannot resolve issues themselves, your job is to ensure that we have the necessary tools and processes to resolve the issue. You will troubleshoot technical problems for customers with a mix of debugging, networking, system administration, updating documentation, and when needed, coding/scripting. You will make the products easier to adopt and to use by making improvements to the product, tools, processes, and documentation. The Customer Solutions Engineering team is focused on customer needs, and you will help drive the success and business growth of Google Cloud by understanding and advocating for our customers’ issues and tests.
Google Cloud accelerates every organization’s ability to digitally transform its business and industry. We deliver enterprise-grade solutions that leverage Google’s cutting-edge technology, and tools that help developers build more sustainably. Customers in more than 200 countries and territories turn to Google Cloud as their trusted partner to enable growth and solve their most critical business problems.
Responsibilities
Manage customers’ problems through effective diagnosis, resolution, or implementation of new investigation tools to increase productivity in handling customer issues on AI/ML infrastructure.
Develop an in-depth understanding of AI/ML workloads and underlying hardware architectures by troubleshooting, reproducing, and determining the root cause of customer-reported issues, and build tools for faster diagnosis.
Be a consultant and subject matter expert for internal stakeholders in engineering, sales, and customer organizations to resolve complex deployment and operational obstacles in AI infrastructure environments.
Work closely with multiple product and engineering teams to find ways to improve the product, and interact with our Site Reliability Engineering (SRE) teams to drive high-quality production.
Participate in rotating on-call schedules, including during nights, weekends and holidays, to ensure prompt and proper resolution of customer-impacting technical challenges.
Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. See also Google's EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know by completing our Accommodations for Applicants form.