Tiered Storage Hierarchy

The ZFS filesystem makes it convenient to use cloned filesystems with copy-on-write (CoW) thin provisioning to get the most out of storage. Even though multiple devices may boot identical images, each device's actual on-disk size is only as large as the differences that instance has written to the server since it was provisioned.

clusterducks employs a tiered storage hierarchy:

tiered storage diagram

  • Volumes are top-level objects that form the basis for Group images
  • Groups are clones of Volumes and take up only the space of their differences from their parent Volume
  • Devices are clones of Groups and take up only the space of their temporary storage
  • vDisks may be assigned to Devices. Linux devices may use overlayfs to separate local changes from the OS image.
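The Volume → Group → Device chain maps naturally onto ZFS snapshots and clones, since a ZFS clone is always created from a snapshot of its parent. The following is a minimal sketch of that idea, not clusterducks' actual implementation: the dataset names (`tank/volumes/win10` and so on), the snapshot name `base`, and the helper `clone_commands` are all hypothetical. The code only builds the command lines and does not run them.

```python
def clone_commands(parent, child, snapshot="base"):
    """Return the two zfs commands needed to clone `parent` into `child`.

    A clone must originate from a snapshot, so the parent is snapshotted first.
    Dataset and snapshot names are illustrative assumptions.
    """
    snap = f"{parent}@{snapshot}"
    return [
        ["zfs", "snapshot", snap],      # freeze the parent's current state
        ["zfs", "clone", snap, child],  # thin, copy-on-write child dataset
    ]


if __name__ == "__main__":
    # Hypothetical layout: one Volume, one Group cloned from it,
    # and one Device cloned from the Group.
    chain = [
        ("tank/volumes/win10", "tank/groups/cad2012"),
        ("tank/groups/cad2012", "tank/devices/ws01"),
    ]
    for parent, child in chain:
        for cmd in clone_commands(parent, child):
            print(" ".join(cmd))
```

Each clone starts out sharing all of its blocks with the parent snapshot, which is why a Group or Device initially consumes almost no additional space.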

Example:

  • Volume A = 40GB. Contains Windows 10 installation, plus some drivers and utilities.
    • Group A-1 = 10GB on-disk (device sees 50GB). Contains "Volume A" plus "CAD Software 2012"
      • CAD artists boot from this profile
    • Group A-2 = 15GB on-disk (device sees 55GB). Contains "Volume A" plus "CAD Software 2016 (new testing version)"
      • DevOps uses this profile for testing new CAD software. Once testing is complete, users are changed from "CAD 2012" to this template with just a couple clicks.
    • Group A-3 = 1GB on-disk (device sees 41GB). Contains "Volume A" plus "Office Software"
      • Devices for users who do not need a CAD software license (reception, etc.) boot from this profile
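The space accounting in this example can be checked with a few lines of arithmetic. This sketch uses only the sizes listed above and assumes, for comparison, that a fully independent image would occupy its whole apparent size. Per-device differences are left out because the example does not give them.

```python
# Sizes in GB, taken from the example above.
VOLUME_GB = 40
GROUP_DELTAS_GB = {"A-1": 10, "A-2": 15, "A-3": 1}

# What a device sees: the parent Volume plus the Group's own differences.
apparent_gb = {name: VOLUME_GB + delta for name, delta in GROUP_DELTAS_GB.items()}

# Tiered layout: the Volume is stored once, each Group stores only its delta.
tiered_total_gb = VOLUME_GB + sum(GROUP_DELTAS_GB.values())

# Comparison case (assumption): every Group image stored as a full copy.
independent_total_gb = sum(apparent_gb.values())

if __name__ == "__main__":
    print(apparent_gb)           # {'A-1': 50, 'A-2': 55, 'A-3': 41}
    print(tiered_total_gb)       # 66
    print(independent_total_gb)  # 146
```

Under these assumptions, the three Groups need 66GB in the tiered layout versus 146GB as independent images.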

Because of the three-tier layout, administrators can create a large number of templates from a single source, each of which consumes only the space of its differences. If Groups were the only tier available, every environment would need a full, independent OS install.

The tiered storage hierarchy could be considered a poor man's deduplication: it shares common blocks without the absurdly high bandwidth and memory requirements of calculating and storing the hash table for actual inline deduplication.