Senior HPC Specialist

Location: US-NY-New York
Compensation Grade: Band 53
NYU IT (WS1170)
Research Technology Services

Position Summary

Provide technical leadership in the design, development, installation and maintenance of hardware and software for the central High-Performance Computing systems and/or research computing services at New York University. Plan, design, and install Linux operating systems, hardware, cluster management software, scientific computing software and/or network services. Analyze performance of computing systems; plan new system configurations; direct implementation of operating system enhancements that improve reliability and performance. Develop and implement monitoring and testing tools; monitor the systems for performance and security; analyze malfunctions; direct and/or support resolution of systems' technical problems. Plan, implement, debug, document and maintain software to develop and/or support user applications for NYU. Research, evaluate, devise, select and authorize configurations for new systems, hardware and software. Participate in strategic planning for long-term requirements of system operations; develop and implement information system enhancements for the University.


Required Education:
Bachelor's degree

Preferred Education:
Master's degree and Ph.D. preferred

Required Experience:
5 years' relevant experience, with related specific experience in the following: running scientific applications on large-scale computers; optimizing and/or developing applications on UNIX-based systems; designing and developing system enhancements and software applications; and programming with modern languages, application software, protocols, tools and utilities. This may include installation, maintenance and support, or an equivalent combination.

Preferred Experience:
Master's degree in a technical field with 4 or more years' experience as a Unix/Linux cluster administrator. The ability to learn new tools and concepts quickly is essential. Experience with some of the following will be important: compiling scientific software; Open MPI; PBS/Torque/Grid Engine/SLURM; configuring database, file, and web servers; installing and maintaining various UNIX operating systems, Rocks, Ganglia, Nagios, Hadoop, etc.; knowledge of backup solutions for large quantities of data; and tuning configurations for high-performance needs, such as minimizing latency and maximizing throughput.

Required Skills, Knowledge and Abilities:
Proficiency with multi-vendor hardware/software configuration. May require any of the following: problem identification/resolution, performance management/tuning, and design configuration/planning. Knowledge of related large-scale computing systems and/or product installs and maintenance. Ability to provide technical leadership and management of complex, large-scale computing systems projects. Ability to clearly communicate technical concepts to a non-technical audience. Excellent organizational and communication skills.

Preferred Skills, Knowledge and Abilities:
Familiarity with queuing systems, schedulers and workload managers, configuration management, Lustre, and OFED. Experience with MATLAB, R, shell scripting, Perl and/or Python.

Additional Information

EOE/AA/Minorities/Females/Vet/Disabled/Sexual Orientation/Gender Identity
