Shared Scalable Compute Cluster for Research (Hyak)
Hyak is a shared, high-performance computing cluster dedicated to research computing at UW. Participating academic units (sponsors) invest in common high-performance infrastructure, while individual users fund the purchase of compute nodes and supplemental storage configured to their specific requirements. The system design guarantees users immediate access to their own nodes while also allowing them fair access to idle CPUs throughout the cluster. Hyak infrastructure is deployed in Islands (sub-clusters), each supporting roughly 500 compute nodes. All Islands share a common job scheduler, high-performance scratch storage, and login/data-mover nodes with high-speed external network links.
- Low-latency (≤3.5 µs) 10 Gb/s connections among all nodes in each Island
- 300 Gb/s or more of bandwidth between Islands
- 100 GB of shared high-speed scratch space for each node
- 500 GB of capacity in the lolo Archive File System for each node
- 50 GB of capacity in the lolo Collaboration File System for each node
- Multiple 10 Gb/s network connections to campus, the Internet, and lolo
- SSH-protocol access (ssh, scp, sftp, etc.) from all locations
- Two-factor authentication, using your UW NetID and a hardware token
- Red Hat Enterprise Linux operating system
- Full Intel software development environment and other development tools
- Several high-level software tools, including Mathematica and R
- Support for single-node virtualization, including MS Windows
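Since all access is over the SSH family of protocols (per the list above), a typical session might look like the following. The hostname and paths here are illustrative placeholders, not actual Hyak addresses:

```shell
# Log in to a Hyak login node (hostname is a placeholder, not the real address)
ssh your-uwnetid@hyak.example.washington.edu

# Copy an input file to scratch storage (the path is illustrative)
scp input.dat your-uwnetid@hyak.example.washington.edu:/scratch/your-group/

# Or transfer files interactively with sftp
sftp your-uwnetid@hyak.example.washington.edu
```

With two-factor authentication enabled, each connection prompts for both the UW NetID password and the hardware-token code.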
Hyak is an excellent option for research groups that need a fast, convenient, flexible, and cost-effective alternative to operating their own scalable, high-performance computing resource.
Hyak is part of an integrated, scalable scientific-computing infrastructure operated by UW-IT that includes the lolo Archive and Collaboration File Systems and a high-performance research network, supporting fast data transfers among these systems, the campus, and the Internet.
Customers may purchase anywhere from one to more than 100 compute nodes, and from 22 TB to more than 1 PB of storage. Compute nodes and storage subsystems are available in a wide variety of configurations.
Customers may install and use nearly any software they like, including commercial applications, provided they are supported by the Hyak operating environment.
Customers may configure one or more of their nodes for per-core, rather than per-node, scheduling. This is typically done when a group intends to use some of its Hyak capacity for interactive, rather than batch-mode, operations. A node configured in this fashion supports as many separate sessions as it has CPU cores; for example, a 16-core node supports 16 separate interactive sessions.
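On a batch scheduler such as SLURM, an interactive session on a per-core-scheduled node might be requested as sketched below. The scheduler, partition name, and time limit are assumptions for illustration, not confirmed details of the Hyak configuration:

```shell
# Request a single core on the group's partition for a two-hour
# interactive shell. The partition name "mygroup" and the use of
# SLURM's srun are assumptions, not confirmed Hyak details.
srun -p mygroup -N 1 -c 1 --time=2:00:00 --pty /bin/bash
```

On a 16-core node configured for per-core scheduling, up to sixteen such single-core sessions could run concurrently.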
Customers may assign one or more users as group admins with the authority to add and remove user access to the system and modify job priorities.
- UW faculty
- UW staff
- UW students
- UW researchers
- UW academic units
- Researchers at UW and UW affiliates associated with a sponsored academic unit
Email firstname.lastname@example.org and include the service name "Hyak" in the subject line.
Evaluation accounts are available upon request.
Outside scheduled maintenance, the system is designed to be available 24x7x365 with minimal unscheduled downtime. New equipment is typically deployed within six weeks of the PO clearing UW purchasing.
Hyak is offline from 9:00 a.m. to 5:00 p.m. for scheduled maintenance the second Tuesday of every month. Every third month (January, April, July, October), the maintenance window will last from 9:00 a.m. to 9:00 a.m. the following morning.
Other maintenance performed on Hyak is relatively rare, short (30-60 minutes), and does not impact running jobs. This type of maintenance is announced to the hyak-users list.
Customers pay a one-time charge for equipment that UW-IT deploys and operates in Hyak on their behalf. No indirect costs are charged for equipment deployed in Hyak. Compute nodes are operated for four years, while storage may be operated for up to six years. Prices are updated roughly quarterly.
Information about pricing is available HERE.
Sample prices for compute nodes, including all supporting infrastructure, are:
|Model|CPU (dual)|GB RAM|GB Disk|Price|
|---|---|---|---|---|
|IBM HS23|E5-2650 v2|32|600|$4,471.50|
|IBM HS23|E5-2650 v2|64|600|$4,675.00|
|IBM HS23|E5-2650 v2|128|600|$5,527.50|
|IBM HS23|E5-2650 v2|256|600|$7,760.50|
|IBM HS23|E5-2650 v2 + M2090|64|600|$8,125.70|
A full price list for supplemental storage is available HERE.
Sample prices for storage (including file servers, GPFS licenses, network connections, etc.) are:
|Model|Drive GB|Drive RPM|# Drives|Usable TB|Price|
|---|---|---|---|---|---|
Hyak infrastructure and operations are funded by various sponsors, each of which is entitled to a portion of the system's overall capacity. All customers deploying equipment in Hyak must receive the approval of their sponsor, typically their dean's office. UW-IT can manage the approval process. Academic units currently sponsoring Hyak are:
- The College of Arts and Sciences
- The College of Engineering
- The College of the Environment
- The iSchool
- The School of Medicine, Baker Lab
- The School of Medicine, Department of Biochemistry
- UW Bothell STEM program
Other academic units interested in sponsoring Hyak capacity must be approved by the Hyak Governance Board and are limited by available data-center capacity. Sponsorships supporting 50 or more compute nodes cost $3,000 per slot ($150,000 for 50 nodes). A slot is the system capacity necessary to support a single compute node for six years.
Hyak is a word in Chinook Jargon meaning "fast." Chinook Jargon is the trade language of the Pacific Northwest, incorporating terms from Chinook, Chehalis, and other local languages, as well as French and English. We've chosen words from Chinook Jargon for the names of systems in the UW research cyberinfrastructure to emphasize their role in supporting the broad range of UW research users and our ties to our place between the mountains and the Salish Sea.
Hyak support responds to problem reports within one to four hours; responses can be escalated for critical problems. Our goal is always to minimize user downtime by responding as quickly as possible.