Sr. Manager, Hardware Debug

Date: Sep 4, 2025

Location: San Jose, California, United States

Company: Super Micro Computer

Job Req ID: 27404

About Supermicro:

Supermicro® is a Top Tier provider of advanced server, storage, and networking solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, Hyperscale, HPC, and IoT/Embedded customers worldwide. We are the #5 fastest-growing company among the Silicon Valley Top 50 technology firms. Our unprecedented global expansion has provided us with the opportunity to offer a large number of new positions to the technology community. We seek talented, passionate, and committed engineers, technologists, and business leaders to join us.

Job Summary:

Supermicro Computer is looking for an experienced Sr. Manager, Hardware Debug, to join the Rack Solutions team located at our HQ in San Jose, CA.

Essential Duties and Responsibilities:

Includes the following essential duties and responsibilities (other duties may also be assigned):
• Build and manage a new system debugging engineering team focused on GPU-based systems and clusters
• Troubleshoot and debug system HW and find root causes after L10 and L11 testing
• Work closely with production to solve system HW problems and improve build quality
• Provide engineering support at customer data centers and solve system HW problems in deployments
• Develop efficient system troubleshooting methodologies and create related SOPs
• Provide training to the production team and the onsite deployment team
• Work with the R&D team, system lab, product managers, and testing engineers to improve overall product quality
• Work with customers and suppliers to solve system HW problems

Qualifications:

• Bachelor's or Master's degree in Electrical Engineering, Computer Engineering, or equivalent
• 12+ years of experience in server HW debugging
• Hardware Expertise: In-depth knowledge of server hardware components, architectures (such as x86 and ARM CPUs and Nvidia and AMD GPUs), and technologies
• Troubleshooting & Problem Solving: Strong analytical and problem-solving skills to diagnose and resolve hardware-related issues efficiently
• Diagnostic Tools Proficiency: Familiarity with Nvidia DCGM and Field Diagnostics is a plus
• Scripting Skills: Proficiency in scripting languages such as Python and Bash for automating tasks and streamlining workflows
• Circuit Design & Analysis: Understanding of analog and digital circuit design principles
• Networking & Security Concepts: Familiarity with network infrastructure, protocols, and security best practices for server environments
• Communication & Collaboration: Effectively communicating technical information with diverse teams and individuals
• Attention to Detail: Ensuring precision and accuracy in troubleshooting and repair processes
• Adaptability & Continuous Learning: Staying updated with emerging technologies and adapting to new challenges in the field

Salary Range

$155,000 - $179,000

The salary offered will depend on several factors, including your location, level, education, training, specific skills, years of experience, and comparison to other employees already in this role. In addition to a comprehensive benefits package, candidates may be eligible for other forms of compensation, such as participation in bonus and equity award programs.

EEO Statement

Supermicro is an Equal Opportunity Employer and embraces diversity in our employee population. It is the policy of Supermicro to provide equal opportunity to all qualified applicants and employees without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, protected veteran status or special disabled veteran, marital status, pregnancy, genetic information, or any other legally protected status.


Job Segment: Cloud, Electrical Engineering, Electrical, Data Center, Engineer, Technology, Engineering