Comparison of cluster software

The following tables compare general and technical information for notable computer cluster software. This software can be broadly divided into four categories: job scheduler, node management, node installation, and integrated stack (all of the above).

General information


Software Maintainer Category Development status Architecture High-Performance/ High-Throughput Computing License Platforms supported Cost Paid support available
Amoeba No active development MIT
Base One Foundation Component Library Proprietary
DIET INRIA, SysFera, Open Source All in one GridRPC, SPMD, Hierarchical and distributed architecture, CORBA HTC/HPC CeCILL Unix-like, Mac OS X, AIX Free
Enduro/X Mavimax, Ltd. Job/Data Scheduler actively developed SOA Grid HTC/HPC/HA GPLv2 or Commercial Linux, FreeBSD, MacOS, Solaris, AIX Free / Cost Yes
Ganglia Monitoring actively developed BSD Unix, Linux, Windows NT/XP/2000/2003/2008, FreeBSD, NetBSD, OpenBSD, DragonflyBSD, Mac OS X, Solaris, AIX, IRIX, Tru64, HPUX. Free
Grid MP Univa (formerly United Devices) Job Scheduler no active development Distributed master/worker HTC/HPC Proprietary Windows, Linux, Mac OS X, Solaris Cost
Apache Mesos Apache actively developed Apache license v2.0 Linux Free Yes
Moab Cluster Suite Adaptive Computing Job Scheduler actively developed HPC Proprietary Linux, Mac OS X, Windows, AIX, OSF/Tru-64, Solaris, HP-UX, IRIX, FreeBSD & other UNIX platforms Cost Yes
NetworkComputer Runtime Design Automation actively developed HTC/HPC Proprietary Unix-like, Windows Cost
OpenHPC OpenHPC project all in one actively developed HPC Linux (CentOS) Free No
OpenLava None. Formerly Teraproc Job Scheduler Halted by injunction Master/Worker, multiple admin/submit nodes HTC/HPC Illegal due to being a pirated version of IBM Spectrum LSF Linux Not legally available No
PBS Pro Altair Job Scheduler actively developed Master/worker distributed with fail-over HPC/HTC AGPL or Proprietary Linux, Windows Free or Cost Yes
Proxmox Virtual Environment Proxmox Server Solutions Complete actively developed Open-source AGPLv3 Linux, Windows, other operating systems are known to work and are community supported Free Yes
Rocks Cluster Distribution Open Source/NSF grant All in one actively developed HTC/HPC OpenSource CentOS Free
Popular Power
ProActive INRIA, ActiveEon, Open Source All in one actively developed Master/Worker, SPMD, Distributed Component Model, Skeletons HTC/HPC GPL Unix-like, Windows, Mac OS X Free
RPyC Tomer Filiba actively developed MIT License *nix/Windows Free
SLURM SchedMD Job Scheduler actively developed HPC/HTC GPL Linux/*nix Free Yes
Spectrum LSF IBM Job Scheduler actively developed Master node with failover/exec clients, multiple admin/submit nodes, Suite addOns HPC/HTC Proprietary Unix, Linux, Windows Cost and Academic - model - Academic, Express, Standard, Advanced and Suites Yes
Oracle Grid Engine (Sun Grid Engine, SGE) Altair Job Scheduler active Development moved to Altair Grid Engine Master node/exec clients, multiple admin/submit nodes HPC/HTC Proprietary *nix/Windows Cost
Son of Grid Engine daimh Job Scheduler actively developed (stable/maintenance) Master node/exec clients, multiple admin/submit nodes HPC/HTC Open-source SISSL *nix Free No
SynfiniWay Fujitsu actively developed HPC/HTC  ? Unix, Linux, Windows Cost
Techila Distributed Computing Engine Techila Technologies Ltd. All in one actively developed Master/worker distributed HTC Proprietary Linux, Windows Cost Yes
TORQUE Resource Manager Adaptive Computing Job Scheduler actively developed Proprietary Linux, *nix Cost Yes
UniCluster Univa All in One Functionality and development moved to UniCloud (see above) Free Yes
UNICORE
Xgrid Apple Computer

Table explanation

  • Software: The name of the application that is described

Technical information

Software Implementation Language Authentication Encryption Integrity Global File System Global File System + Kerberos Heterogeneous/ Homogeneous exec node Jobs priority Group priority Queue type SMP aware Max exec node Max job submitted CPU scavenging Parallel job Job checkpointing Python interface
Enduro/X C/C++ OS Authentication GPG, AES-128, SHA1 None Any cluster Posix FS (gfs, gpfs, ocfs, etc.) Any cluster Posix FS (gfs, gpfs, ocfs, etc.) Heterogeneous OS Nice level OS Nice level SOA Queues, FIFO Yes OS Limits OS Limits Yes Yes No No
HTCondor C++ GSI, SSL, Kerberos, Password, File System, Remote File System, Windows, Claim To Be, Anonymous None, Triple DES, BLOWFISH None, MD5 None, NFS, AFS Not official, hack with ACL and NFS4 Heterogeneous Yes Yes Fair-share with some programmability basic (hard separation into different nodes) tested ~10000? tested ~100000? Yes MPI, OpenMP, PVM Yes Yes, and native Python Binding
PBS Pro C/Python OS Authentication, Munge Any, e.g., NFS, Lustre, GPFS, AFS Limited availability Heterogeneous Yes Yes Fully configurable Yes tested ~50,000 Millions Yes MPI, OpenMP Yes Yes
OpenLava C/C++ OS authentication None NFS Heterogeneous Linux Yes Yes Configurable Yes Yes, supports preemption based on priority Yes Yes No
Slurm C Munge, None, Kerberos Heterogeneous Yes Yes Multifactor Fair-share yes tested 120k tested 100k No Yes Yes PySlurm
Spectrum LSF C/C++ Multiple - OS Authentication/Kerberos Optional Optional Any - GPFS/Spectrum Scale, NFS, SMB Any - GPFS/Spectrum Scale, NFS, SMB Heterogeneous - HW and OS agnostic (AIX, Linux or Windows) Policy based - no queue to computenode binding Policy based - no queue to computegroup binding Batch, interactive, checkpointing, parallel and combinations yes and GPU aware (GPU License free) > 9,000 compute hosts > 4 million jobs a day Yes, supports preemption based on priority, supports checkpointing/resume Yes, e.g. parallel submissions for job collaboration over, e.g., MPI Yes, with support for user, kernel or library level checkpointing environments Yes
Torque C SSH, munge None, any Heterogeneous Yes Yes Programmable Yes tested tested Yes Yes Yes Yes

Table explanation

  • Software: The name of the application that is described
  • SMP aware:
    • basic: hard split into multiple virtual hosts
    • basic+: hard split into multiple virtual hosts with some minimal/incomplete communication between virtual hosts on the same computer
    • dynamic: split the resources of the computer (CPU/RAM) on demand
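
The difference between the "basic" and "dynamic" levels of SMP awareness can be illustrated with a toy model. The sketch below is not taken from any of the schedulers listed; the names `Job`, `schedule_basic`, `schedule_dynamic` and the `slot_size` parameter are hypothetical, only CPUs are modelled (RAM is ignored), and the "basic+" level is left out.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cpus: int  # number of CPUs the job requests


def schedule_basic(total_cpus, slot_size, jobs):
    """Hard split: the machine is cut into fixed virtual hosts of slot_size CPUs.

    Each job occupies one whole virtual host, so a job that needs more CPUs
    than slot_size can never run, and a smaller job wastes the remainder.
    """
    free_slots = total_cpus // slot_size
    placed = []
    for job in jobs:
        if job.cpus <= slot_size and free_slots > 0:
            free_slots -= 1
            placed.append(job.name)
    return placed


def schedule_dynamic(total_cpus, jobs):
    """Dynamic: CPUs are handed out on demand from a single shared pool."""
    free_cpus = total_cpus
    placed = []
    for job in jobs:
        if job.cpus <= free_cpus:
            free_cpus -= job.cpus
            placed.append(job.name)
    return placed


# Hypothetical workload on one 8-CPU machine.
jobs = [Job("a", 1), Job("b", 1), Job("c", 6), Job("d", 1)]
basic_result = schedule_basic(total_cpus=8, slot_size=2, jobs=jobs)
dynamic_result = schedule_dynamic(total_cpus=8, jobs=jobs)
print(basic_result)    # ['a', 'b', 'd']
print(dynamic_result)  # ['a', 'b', 'c']
```

In this toy workload the hard split can never place the 6-CPU job because no virtual host is large enough, while the dynamic pool places it but then has no CPU left for the last job. Real schedulers add priorities, queues, and memory limits on top of this.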

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.