System Engineer, AI

Date: Jan 31, 2024

Location: San Jose, California, United States

Company: Super Micro Computer

Job Req ID: 22565

About Supermicro:

Supermicro® is a Top Tier provider of advanced server, storage, and networking solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, Hyperscale, HPC, and IoT/Embedded customers worldwide. We are the #5 fastest-growing company among the Silicon Valley Top 50 technology firms. Our unprecedented global expansion has provided us with the opportunity to offer a large number of new positions to the technology community. We seek talented, passionate, and committed engineers, technologists, and business leaders to join us.
 

Job Summary:

Supermicro is seeking an experienced System Engineer to lead AI solutions development and integration on GPU server and workstation products, taking charge of each product from the development stage through end of life (EOL). As a System Engineer, you will play a key role in developing GPU server products by leading projects across multiple areas of expertise, including design, testing, installation, troubleshooting, and debugging, together with members of cross-functional teams.

Essential Duties and Responsibilities:

Includes the following essential duties and responsibilities (other duties may also be assigned): 
• Develop AI/Deep Learning solutions, perform analytical work, and implement tools and service programs
• Write and maintain custom automation applications to increase system efficiency and minimize human intervention

• Manage and monitor all installed systems and infrastructure

• Install, configure, test and maintain operating systems, application software and system management tools

• Proactively ensure the highest levels of systems and infrastructure availability

• Monitor and benchmark application performance on GPU servers, identify potential bottlenecks, and propose solutions

• Maintain security, backup, and redundancy strategies

• Draft and maintain technical documentation, including drawings and diagrams

• Provide on-site software deployment and customer acceptance verification testing

• Provide post-sales Level 1 and Level 2 support

Qualifications:

• Bachelor's degree or above in Computer Science or another engineering-related major is desirable

• Minimum of 3 years of working experience installing, configuring, and troubleshooting UNIX/Linux-based environments is preferred

• In-depth technical knowledge of computer/GPU server, storage, and network systems

• Solid programming skills (C, C++, SQL, Java)

• Experience with AI/Deep Learning frameworks (PyTorch, TensorFlow, MXNet)

• Solid scripting skills (shell scripts, Python)

• Hands-on experience with workload managers/schedulers (Slurm, Kubernetes) for server clusters

• Experience with virtualization and containerization (VMware, VirtualBox, Docker)

• This position requires travel for on-site projects

Salary Range

$80,000 - $137,000 

The salary offered will depend on several factors, including your location, level, education, training, specific skills, years of experience, and comparison to other employees already in this role. In addition to a comprehensive benefits package, candidates may be eligible for other forms of compensation, such as participation in bonus and equity award programs. 

EEO Statement

Supermicro is an Equal Opportunity Employer and embraces diversity in our employee population. It is the policy of Supermicro to provide equal opportunity to all qualified applicants and employees without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, protected veteran status or special disabled veteran, marital status, pregnancy, genetic information, or any other legally protected status.

