Sr. System Engineer

Date: Oct 28, 2024

Location: San Jose, California, United States

Company: Super Micro Computer

Job Req ID: 25576

About Supermicro:

Supermicro® is a top-tier provider of advanced server, storage, and networking solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, Hyperscale, HPC, and IoT/Embedded customers worldwide. We are the #5 fastest-growing company among the Silicon Valley Top 50 technology firms. Our unprecedented global expansion has provided us with the opportunity to offer a large number of new positions to the technology community. We seek talented, passionate, and committed engineers, technologists, and business leaders to join us.
 

Job Summary:

As a Senior System Engineer, you will port and optimize Digital Manufacturing and general HPC/AI applications on Supermicro server hardware platforms, enabling breakthroughs in this rapidly moving field. You will create application notes and blog content, and work closely with engineering teams, customers, and partners. You will also act as a senior technical figure within our product support organization, debugging customer issues and providing concise summaries and recommended fixes to our core engineering teams. You will help build out, benchmark, and troubleshoot clusters for our customers, covering both in-house implementation and on-site HPC/AI deployment and acceptance. You will be part of a talented team of engineers who demonstrate superb technical competency, delivering mission-critical infrastructure and ensuring the highest levels of availability, performance, and security.
 

Essential Duties and Responsibilities:

Includes the following essential duties and responsibilities (other duties may also be assigned):
• Optimize the HPC/AI hardware platform
• Set up and configure complex test software or applications, even when provided with incomplete steps or unclear instructions.
• Analyze incomplete test setups, identify gaps, and independently devise solutions to ensure successful execution.
• Troubleshoot installation and configuration issues, using creative methods to resolve obstacles without all necessary information
• Write and deploy custom scripts for ad-hoc tasks to meet specific needs during onsite visits
• Develop strong technical relationships with our customers and partners and achieve breakthroughs in HPC/AI performance
• Develop a deep understanding of the state of the art in HPC/AI domains and work with our customers and partners
• Become a recognized expert on HPC/AI applications and deliver compelling training to our customers and partners
• Become a thought leader on HPC/AI applications; field and resolve challenging, complex customer support issues
• Build processes and procedures for HPC/AI solutions
• Design and test proofs of concept, and provide optimized benchmarks for HPC/AI-related applications in a timely manner
• Optimize BIOS settings and tune the OS and network; develop efficient configurations for various types of simulations and workloads
• Provide on-site deployment services, customer acceptance verification, and post-deployment Level 1 and Level 2 support
• Draft and maintain technical documentation, including technical notes, blog posts, drawings, and diagrams
• Develop, review and understand the HPC roadmap to be able to plan future software and hardware upgrades and refresh cycles to maintain outstanding HPC infrastructure
• Work with Product Management and Engineering to ensure a good flow of customer feedback that can be incorporated into future products

Qualifications:

• MS or higher in a related computationally intensive science or engineering field.
• 8+ years of either AI/Deep Learning experience or related experience writing and optimizing applications in HPC, scientific libraries, compilers, digital signal processors or GPUs. 
• Strong scripting and Linux OS internals knowledge.
• Solid grasp of networking, storage systems and batch systems.
• Deep experience with C or Fortran, Shell/Python, and CUDA, and in-depth knowledge of computer architectures, high-performance programming, and parallel programming.
• Deep experience benchmarking at least three of the following HPC/AI applications: LS-DYNA, OpenFOAM, PowerFLOW, STAR-CCM+, Ansys, WRF, NAMD, Amber, LAMMPS, TensorFlow, PyTorch, MXNet, Keras, MLPerf, etc.
• Ability to multitask effectively in a fast-paced environment; action-oriented with strong analytical and problem-solving skills.
• Strong written and oral communication skills with the ability to effectively interface with management and engineering.
• Comfortable in a customer-facing environment; strong teamwork and excellent interpersonal skills.
• Willingness to work onsite at customer locations to complete assignments and projects within tight deadlines.
• Travel is required, and the role may involve working outside of regular business hours.

Salary Range

$140,000 - $158,000 

The salary offered will depend on several factors, including your location, level, education, training, specific skills, years of experience, and comparison to other employees already in this role. In addition to a comprehensive benefits package, candidates may be eligible for other forms of compensation, such as participation in bonus and equity award programs.

EEO Statement

Supermicro is an Equal Opportunity Employer and embraces diversity in our employee population. It is the policy of Supermicro to provide equal opportunity to all qualified applicants and employees without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, protected veteran status or special disabled veteran, marital status, pregnancy, genetic information, or any other legally protected status.


Job Segment: Cloud, Systems Engineer, Embedded, Manufacturing Engineer, Testing, Technology, Engineering