Sr. System Engineer

Date: Apr 2, 2025

Location: San Jose, California, United States

Company: Super Micro Computer

Job Req ID: 26371

About Supermicro:

Supermicro® is a Top Tier provider of advanced server, storage, and networking solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, Hyperscale, HPC, and IoT/Embedded customers worldwide. We are the #5 fastest growing company among the Silicon Valley Top 50 technology firms. Our unprecedented global expansion has given us the opportunity to offer a large number of new positions to the technology community. We seek talented, passionate, and committed engineers, technologists, and business leaders to join us.

Job Summary:

As a Sr. System Engineer, you will work with the team to port, optimize, and benchmark AI/HPC applications on Supermicro server hardware platforms to enhance performance and efficiency. You will work closely with engineers from different teams, customers, and partners to support application deployment, troubleshooting, and performance tuning.

Your role will involve building, configuring, and maintaining AI/HPC clusters, both in-house and at customer sites, ensuring smooth operation and optimal resource utilization. You will contribute to technical documentation, assist with debugging, and provide solutions to complex system issues.

As part of a skilled engineering team, you will play a key role in on-site AI/HPC deployments, acceptance testing, and customer support, ensuring high availability, security, and performance of mission-critical infrastructure.

Essential Duties and Responsibilities:

Essential duties and responsibilities include the following (other duties may also be assigned):
• Work with the team to optimize AI/HPC hardware platforms
• Set up and configure test software or applications by following provided instructions and documentation
• Identify and report gaps in test setups while assisting in implementing solutions for successful execution
• Troubleshoot installation and configuration issues, escalating complex problems as needed
• Write and deploy basic scripts to support specific tasks during onsite visits
• Develop technical relationships with customers and partners to support AI/HPC performance improvements
• Gain a strong understanding of AI/HPC domains and collaborate with customers and partners on solutions
• Provide technical training and knowledge-sharing sessions on AI/HPC applications
• Support the team in resolving customer support issues related to AI/HPC systems
• Assist in building processes and procedures for AI/HPC solutions
• Contribute to proof-of-concept testing and benchmarking for AI/HPC applications
• Assist in BIOS, OS, and network tuning for optimized system performance
• Support on-site deployment services and customer acceptance verification
• Draft and maintain technical documentation, including notes, diagrams, and reports
• Work closely with Product Management and Engineering teams to relay customer feedback for future product improvements

Qualifications:

• BS or higher in a computationally intensive science or engineering field
• 6+ years of experience in AI/Deep Learning, HPC, scientific computing, or related areas involving application optimization, compilers, digital signal processors, or GPUs
• Proficiency in Linux OS, shell scripting, and system internals
• Experience with shell/Python scripting, containers, and OpenMPI, plus familiarity with CUDA or other parallel programming models
• Hands-on experience with any of the following AI/HPC applications or benchmarks is a plus: LS-DYNA, OpenFOAM, PowerFLOW, STAR-CCM+, Ansys, WRF, NAMD, Amber, LAMMPS, TensorFlow, PyTorch, MXNet, Keras, MLPerf, etc.
• Understanding of networking, storage systems, and batch scheduling in AI/HPC environments
• Strong problem-solving skills, ability to multitask, and a proactive mindset
• Effective communication skills, with the ability to work both independently and as part of a team
• Willingness to work in customer-facing roles, providing on-site support as needed
• Travel is required, and occasional work outside of regular business hours may be necessary

Salary Range:

$140,000 - $158,000 

The salary offered will depend on several factors, including your location, level, education, training, specific skills, years of experience, and comparison to other employees already in this role. In addition to a comprehensive benefits package, candidates may be eligible for other forms of compensation, such as participation in bonus and equity award programs.

EEO Statement:

Supermicro is an Equal Opportunity Employer and embraces diversity in our employee population. It is the policy of Supermicro to provide equal opportunity to all qualified applicants and employees without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, protected veteran status or special disabled veteran, marital status, pregnancy, genetic information, or any other legally protected status.

