At Apple, our Platform Architecture group is responsible for connecting our hardware and software into one unified system. You’ll collaborate with engineers across Apple to design how all of our technologies work in unison, drive development of our renowned system-on-a-chip architecture, and develop forward-looking prototype systems and software. Our team is driving performance enhancements in application and system software and developing novel algorithms to deliver integrated, highly optimized solutions based on Apple silicon. In this role, you will analyze existing and new workloads to identify performance bottlenecks in the hardware and/or software. Working with your colleagues, you will address performance limitations and provide recommendations for Apple hardware and software improvements. In addition to working directly with developers, you will identify patterns of performance challenges on Apple silicon, track emerging usage models, and provide feedback to the silicon and software teams on potential improvements.
Description
- Conduct performance studies to inform and validate architecture decisions.
- Create optimized implementations of machine learning workloads on Apple silicon, including Neural Engine, GPU, and CPU.
- Collaborate with system teams to create performance models of emerging AI/ML techniques and analyze system architecture trade-offs.
- Work with software development tools teams to deliver performance analysis instruments and optimized libraries and frameworks for AI/ML applications.
Minimum Qualifications
Bachelor’s degree or equivalent job-related experience in Computer Engineering, Computer Science, or a related field.
Knowledge of computer architecture fundamentals.
Proficiency in one or more C/C++ family programming languages and in a scripting language such as Python.
Experience in software development for at least one of the following hardware IPs: AI/ML hardware accelerators, GPUs, image/video encoders, or similar.
Preferred Qualifications
M.S. or Ph.D. in Computer Science, Computer Engineering, Electrical Engineering, or a closely related field.
10+ years of relevant experience in software performance optimization, performance analysis tools, performance optimization processes, and the development of efficient computational algorithms.
Ability to prototype and benchmark algorithms on CPU, GPU, and Neural Engine platforms, analyze performance metrics, and create high-level complexity models.
Experience with AI/ML, graphics, or HPC performance benchmarks and workloads.
Proficiency in popular AI/ML frameworks, such as PyTorch, and relevant software stacks.
Experience in developing highly efficient low-level performance libraries for AI/ML accelerators or GPUs.
Knowledge of operating system internals and compiler technologies.
Technical aptitude and curiosity, as well as ability to collaborate effectively with team members, partners, and stakeholders.