Lustre Introduction

What Is Lustre?

The name Lustre is a combination of ‘Linux’ and ‘cluster’, reflecting the type of system it is mainly used on. Lustre is a parallel distributed file system used widely in the HPC market on many of the world’s largest and fastest clusters, all of which run Linux.

Whamcloud and Pre-Intel

The project to develop Lustre was started in 1999 at Carnegie Mellon University. Since then Lustre has been owned and managed by a number of companies, and it continued to be developed under a grant from the US Department of Energy by companies including HP and Intel.

In 2007 Sun Microsystems bought the rights to the Lustre file system and provided it as part of their Solaris OS. When Sun was bought out by Oracle, Oracle initially continued to manage and release Lustre. In 2010 Oracle announced that it would stop releases of Lustre at version 1.8.

After the announcement from Oracle, the development of Lustre was taken up by a number of organizations, including Whamcloud. Whamcloud continued to progress Lustre, and in 2012 Whamcloud was acquired by Intel, securing the future development of the file system.

While Intel owns Lustre, it remains open source and free to use. In a similar way to Red Hat and RHEL, Intel provides an Enterprise Edition of Lustre with added support.

Lustre Architecture

A Lustre system is made up of three types of component: Metadata Servers (MDS), Object Storage Servers (OSS) and clients. The MDSs and OSSs are attached to Metadata Targets (MDT) and Object Storage Targets (OST) respectively. Each server can have one or more targets.

When a client requests access to a file within the Lustre file system, it must first query the metadata server for the location of the file. As the metadata servers hold the locations of all files, they are the most critical part of a Lustre file system and the most likely bottleneck to performance.

As every request goes through the metadata server, any delay in responding to requests will have a large impact on the speed of access. For this reason the specification of the MDSs is often higher than that of the OSSs, particularly in terms of the disks used for the MDT, which are higher performance and are often SSDs.

Once the client has been given the location of the file, it requests the file directly from the OSS that holds it. Files are distributed evenly over the OSSs to maximise the performance of the system by using all of the available network bandwidth and storage IOPS.
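
The two-step access path described above can be illustrated with a small toy model. This is only a sketch: the class names, the in-memory dictionaries and the round-robin placement policy are assumptions made for illustration and are not Lustre APIs or Lustre's actual allocation logic.

```python
# Toy model of the read path: ask the metadata server where a file lives,
# then fetch it from the object storage server that holds it.
# All class and method names are hypothetical, not Lustre APIs.
from itertools import cycle


class ToyOSS:
    """Stands in for an Object Storage Server and its targets."""

    def __init__(self, name):
        self.name = name
        self.objects = {}  # path -> file contents

    def read(self, path):
        return self.objects[path]


class ToyMDS:
    """Stands in for a Metadata Server: records which OSS holds each file."""

    def __init__(self, oss_list):
        self.locations = {}  # path -> ToyOSS
        self._next_oss = cycle(oss_list)  # round-robin placement (assumed policy)

    def create(self, path, data):
        oss = next(self._next_oss)
        oss.objects[path] = data
        self.locations[path] = oss

    def lookup(self, path):
        return self.locations[path]


def client_read(mds, path):
    oss = mds.lookup(path)  # step 1: metadata query
    return oss.read(path)   # step 2: data comes straight from the OSS


servers = [ToyOSS("oss0"), ToyOSS("oss1")]
mds = ToyMDS(servers)
for i in range(4):
    mds.create(f"/scratch/file{i}", f"data{i}".encode())

print(client_read(mds, "/scratch/file2"))          # b'data2'
print({s.name: len(s.objects) for s in servers})   # {'oss0': 2, 'oss1': 2}
```

Every read passes through `lookup` first, which is why a slow metadata server slows down all access, while the data itself is spread across the storage servers.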

Adding more OSSs to the file system increases not only the number of files you can store, but also the performance of the system, due to the increased bandwidth.
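
As a rough illustration of this scaling, the sketch below estimates an upper bound on aggregate throughput as OSSs are added. The function name, the per-server figure of 5 GB/s and the 100 GB/s network cap are hypothetical numbers chosen for the example, not measurements, and the cap itself is an added assumption that the fabric between clients and servers can also limit throughput.

```python
def aggregate_bandwidth(num_oss, per_oss_gb_s, network_cap_gb_s):
    """Upper bound on throughput in GB/s.

    Assumes every OSS contributes equally and that the shared network
    places a ceiling on the total (both are simplifying assumptions).
    """
    return min(num_oss * per_oss_gb_s, network_cap_gb_s)


# Hypothetical figures: 5 GB/s per OSS, 100 GB/s network ceiling.
for n in (2, 4, 8, 32):
    print(n, aggregate_bandwidth(n, per_oss_gb_s=5.0, network_cap_gb_s=100.0))
```

With these example figures the bound doubles each time the number of OSSs doubles, until the assumed network ceiling is reached.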

Intel Manager for Lustre

Intel Manager for Lustre (IML) provides an easy way to install, configure and manage the Lustre file system. IML provides a graphical front end to the open source Lustre distribution provided by Intel, giving users an integrated set of management tools for Lustre. Through a web interface, servers can be added and set up, allowing their targets to be used as part of the Lustre storage.

Lustre High Availability

The system can be configured in High Availability (HA) mode, allowing the file system to continue operating normally should a server fail. To configure HA mode, each target must be attached to two servers. Lustre will configure the system with a primary and a secondary server; should the primary server fail, the secondary server will take over with minimal downtime. IML provides an easy-to-access central point from which to monitor and manage the file systems. This reduces the cost of managing the systems through reduced complexity.
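
The primary and secondary arrangement can be pictured with the following minimal sketch. It is a toy model only: the `ToyTarget` class, the `active_server` function and the server names are hypothetical, and real failover is carried out by the HA software rather than by code like this.

```python
from dataclasses import dataclass


@dataclass
class ToyTarget:
    """A storage target attached to two servers (names are hypothetical)."""
    name: str
    primary: str
    secondary: str


def active_server(target, healthy):
    """Return the server that should serve the target, preferring the primary."""
    if target.primary in healthy:
        return target.primary
    if target.secondary in healthy:
        return target.secondary
    raise RuntimeError(f"no healthy server for {target.name}")


ost0 = ToyTarget("ost0", primary="oss0", secondary="oss1")
print(active_server(ost0, {"oss0", "oss1"}))  # oss0
print(active_server(ost0, {"oss1"}))          # oss1, the primary has failed
```

Because each target is reachable from two servers, losing one server leaves the target available through the other; only the loss of both makes it unreachable.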

Boston Lustre Solutions with Intel Enterprise Edition

Boston HPC provide a range of Lustre solutions which can be tailored to meet your requirements. We can also provide access to test systems within Boston Labs to demonstrate the ease of installation and the performance of a Lustre file system.