Multiple ways to peel an orange

Use Cases

  • Datacenter system deployment (imaging/roll-out) & management
  • Workstation image management for any size office space
    • Virtual or bare metal desktops running Windows or Linux without any modification to the OS
    • Transparent to end-users, no performance loss
    • Update one workstation, others can reboot to reprovision to the new version automatically
    • Changes to the OS can be discarded on reboot; top-down policy enforcement by the network eliminates the risk of end-user tampering
  • Developer environments
    • Create multiple netboot templates to test software across many environments
    • Temporarily (or permanently) roll back template changes on a per-device basis, allowing bug verification against past platforms with fewer headaches
    • Free developers from VM lifecycle management so they can spend more time on real issues

Configuration Examples

Every network has different requirements, shaped by factors such as budget, scale of implementation, reliability, performance, and capacity.

Single-server

It is entirely possible (although less safe) to run all storage and compute services from a single node. However, as of v0.9.3 there is no straightforward backup method for a single-node deployment, aside from running a virtual server on the same host that is connected to separate disks. External ZFS backup tools could be leveraged, but if they interact with Group or vDisk snapshots, you may want to reconsider.

If a user were to write a script that uses our API to drive send/recv from database snapshots, it would probably be added to our clusterducks/misc-scripts repository.
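
A minimal sketch of what such a script might look like in Python, assuming a hypothetical /api/snapshots route, panel URL, and dataset names (the real clusterducks API may differ):

    import subprocess
    import requests

    # Hypothetical values; the panel URL, API route, and token scheme are
    # assumptions, not documented clusterducks interfaces.
    PANEL = "https://panel.example.com"
    API_KEY = "changeme"

    # Ask the panel which snapshots its database knows about for one vDisk.
    resp = requests.get(f"{PANEL}/api/snapshots",
                        params={"vdisk": "tank/vdisks/ws01"},
                        headers={"Authorization": f"Bearer {API_KEY}"})
    resp.raise_for_status()
    latest = resp.json()[-1]["name"]  # e.g. "tank/vdisks/ws01@2024-01-01"

    # Pipe a full zfs send of that snapshot into a receive on a backup pool.
    send = subprocess.Popen(["zfs", "send", latest], stdout=subprocess.PIPE)
    subprocess.run(["zfs", "recv", "-F", "backup/vdisks/ws01"],
                   stdin=send.stdout, check=True)
    send.wait()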

Multiple servers

When a network is configured, the first server is designated as the master server. This is done mainly to accommodate filesystems that do not support native clustering (such as ZFS), so the connection between the master and all slaves should be sized accordingly to avoid long replication times when datasets change significantly.

Though the relationship is described as master-slave, all slave nodes are active and serve "read-only" OS images whose writes are discarded when the device reboots. This differs from the typical high-availability configuration, where a server sits on standby until needed; on clusterducks networks, all resources are available whenever possible.

Group replication occurs via API request; there is a built-in scheduling interface, but it is still in alpha.
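
As a hedged illustration, a replication request might look like the following; the endpoint path, group ID, and authentication scheme are assumptions, not the documented API:

    import requests

    # Hypothetical endpoint: ask the master to replicate Group 42 to the
    # slaves. The route, ID, and auth header are illustrative assumptions.
    resp = requests.post("https://panel.example.com/api/groups/42/replicate",
                         headers={"Authorization": "Bearer changeme"})
    resp.raise_for_status()
    print(resp.json())  # presumably returns a job handle to poll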

vDisk replication checks are triggered every time the statistics-collection cron job runs (every minute, or a longer interval), though data is only sent/received when needed: when the amount of data written to disk, the time elapsed since the last snapshot, or both exceed their thresholds.
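
The decision amounts to a simple OR over the two thresholds. A sketch, with invented threshold values:

    import time

    # Made-up thresholds; the real values live in the panel configuration.
    MAX_BYTES_WRITTEN = 256 * 1024 * 1024  # replicate after 256 MiB of writes...
    MAX_SNAPSHOT_AGE = 3600                # ...or after an hour, whichever first

    def needs_replication(bytes_written, last_snapshot_ts):
        """True if a vDisk is due for a snapshot and send/recv."""
        age = time.time() - last_snapshot_ts
        return bytes_written >= MAX_BYTES_WRITTEN or age >= MAX_SNAPSHOT_AGE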

Devices running on slave nodes still have access to persistent data from assigned vDisks. Replication between slave servers is planned for a future release.

  • Load Balancing

    • Any thawed device (OS volume is not reprovisioned during boot) is redirected to the network master during PXE initialization to ensure that any changes to the OS image can be replicated to the slaves appropriately.
    • Devices in their default state (frozen: OS volume is reprovisioned during boot) can use a load balance mode that sorts the list of attempted servers by a score the panel calculates during statistics collection, based on metrics such as CPU load, network and disk throughput, and the number of connected clients (see the sketch after this list).
  • Round Robin

    • A simple incremental approach to attempting servers is also available. The counter is stored per-device, so one device's behaviour does not affect another's. This mode may not result in balanced resource usage; however, for sites that frequently lose network access it may be the better choice, since the Load Balancing approach must contact the control panel to retrieve updated server-sorting scores.
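
A rough sketch of the two modes above; the metric weights and field names are invented for illustration:

    # Load balance mode: sort candidate servers by a panel-computed score.
    # The real scoring formula is internal to the panel; these weights and
    # field names are assumptions (lower score = less loaded).
    def score(server):
        return (0.5 * server["cpu_load"]
                + 0.3 * server["io_throughput"]
                + 0.2 * server["client_count"])

    def load_balanced_order(servers):
        return sorted(servers, key=score)

    # Round robin mode: a per-device counter walks the server list, so one
    # device's behaviour never affects another's.
    def round_robin_pick(servers, device):
        server = servers[device["counter"] % len(servers)]
        device["counter"] += 1
        return server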

In the Datacenter

clusterducks is being used to run multiple networks on commodity server offerings from OVH.

A single dedicated server with 32GB of RAM and 500GB of storage leaves quite a bit of room for experimentation; clusterducks has native libvirt integration, and our Debian installer preconfigures a basic network bridge for guests to bind to.
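
For example, defining a guest bound to that bridge could look like this with the libvirt Python bindings; the bridge name br0 and the bare-bones domain XML are assumptions about the installer's defaults:

    import libvirt

    # Minimal domain definition; "br0" is an assumption about the bridge
    # the clusterducks Debian installer preconfigures -- adjust to your setup.
    DOMAIN_XML = """
    <domain type='kvm'>
      <name>test-guest</name>
      <memory unit='GiB'>2</memory>
      <vcpu>2</vcpu>
      <os>
        <type arch='x86_64'>hvm</type>
        <boot dev='network'/>
      </os>
      <devices>
        <interface type='bridge'>
          <source bridge='br0'/>
        </interface>
      </devices>
    </domain>
    """

    conn = libvirt.open("qemu:///system")
    dom = conn.defineXML(DOMAIN_XML)  # persist the guest definition
    dom.create()                      # boot it; PXE boots via the bridge
    conn.close()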

Although clusterducks does not have an interface for managing the system firewall, it does not prevent or disrupt the use of existing tools or network appliances.

If you have multiple servers within a single rack, clusterducks can be used to manage deployment of bare metal environments.

Across multiple datacenter points-of-presence (PoP)

If you have servers in multiple buildings, whether office spaces or datacenters, you will need to consider similar requirements:

  • Bandwidth between sites
    • 100Mbit will work, but 1000Mbit takes much less time to transfer large image changes (a back-of-the-envelope example follows this list)
    • Remember: as of v0.9.2, all updates come from the master server. Secondary servers cannot push updates; only receiving is supported
  • Image requirements - what do the sites do with their systems?
    • If two sites have vastly different requirements, do not link them; set up separate networks with at least two servers in each one
    • Netbooting Windows workstations from a shared image typically requires that the workstations have nearly identical hardware; at minimum, the network driver should be the same between them
    • Netbooted Linux servers, whether virtual or physical, do not have this strict requirement, and one image can cover multiple types of hardware
  • No live-migration support as of v0.9.2; VMs and bare metal must reboot for migration
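
To put the bandwidth point in perspective, a quick back-of-the-envelope calculation (the 10 GB delta is just an example):

    # Rough transfer times for an image delta, ignoring protocol overhead.
    delta_gb = 10  # example: 10 GB of image changes to replicate
    for mbit in (100, 1000):
        seconds = delta_gb * 8 * 1000 / mbit  # GB -> megabits -> seconds
        print(f"{mbit} Mbit/s: ~{seconds / 60:.0f} min")
    # 100 Mbit/s: ~13 min; 1000 Mbit/s: ~1 min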