The headline feature of LDoms is the ability to live migrate a guest from one physical server to another, as long as both servers are configured correctly and use shared storage.

Live migration is most useful when the server pool has enough spare memory and CPU to absorb the workload of one server. All guests can then be migrated off that physical server to allow for primary domain patching, firmware upgrades, and planned hardware changes.

For this reason, if you want to use live migration, it may be more appropriate to have many small servers such as the SPARC T4-1 or SPARC T5-1B, instead of just two large servers such as the SPARC T4-4 or SPARC T5-8. The smaller each physical server is, the less CPU and memory you need to reserve in order to evacuate a whole physical server. For example, if you have only two servers, you need to reserve half of the total resource on each server in order to have room to evacuate one server, whereas if you have ten smaller servers, you only need to reserve one tenth of the total resource on each server.
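
The arithmetic behind this sizing rule can be sketched in a few lines of Python. This is only an illustration: it assumes identical servers with load spread evenly across them, and the function names `reserve_fraction` and `can_evacuate` are hypothetical, not part of any LDoms tooling.

```python
from fractions import Fraction


def reserve_fraction(num_servers: int) -> Fraction:
    """Fraction of each server to keep free so that any one server can be evacuated.

    Assumption (not from the article): all servers are identical and evenly loaded.
    """
    if num_servers < 2:
        raise ValueError("need at least two servers to evacuate one")
    return Fraction(1, num_servers)


def can_evacuate(num_servers: int, used_fraction: Fraction) -> bool:
    """True if one server's load fits into the free space on the remaining servers."""
    free_elsewhere = (num_servers - 1) * (1 - used_fraction)
    return used_fraction <= free_elsewhere


print(reserve_fraction(2))   # 1/2
print(reserve_fraction(10))  # 1/10
```

With two servers, `can_evacuate(2, Fraction(1, 2))` holds but anything above 50% utilization fails; with ten servers, each can run at up to 90% and the pool can still absorb one evacuated server.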

The main benefits of live migration are:

  • The ability to non-disruptively evacuate a physical server in order to patch and restart the server without affecting guests, and to perform hardware changes and firmware updates
  • The ability to move guest domains around in order to make more efficient use of CPU and memory resources

The main drawbacks are:

  • There is additional work required to configure shared storage across the server pool
  • There is still a single point of failure in the shared storage, and allowing shared access can increase the risk of corrupting the data
  • Up until Solaris 11.1, it was not possible to resize the memory of a guest domain after it had been migrated, which defeated the purpose of migrating in order to make better use of memory
  • Solaris 10 guests cannot be migrated between physical servers with different CPU frequencies (e.g. a SPARC T4-1 at 2.85GHz and a SPARC T4-4 at 3.0GHz)
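
The CPU frequency restriction is the kind of rule worth checking before planning an evacuation. The following is a minimal sketch that encodes only the Solaris 10 rule stated above. The `Guest` class and `frequency_blocks_migration` function are hypothetical names used for illustration, and other migration requirements (shared storage, CPU family, firmware levels) are deliberately not modeled.

```python
from dataclasses import dataclass


@dataclass
class Guest:
    name: str           # hypothetical guest domain name
    solaris_major: int  # 10 or 11


def frequency_blocks_migration(guest: Guest, source_mhz: int, target_mhz: int) -> bool:
    """True if the Solaris 10 CPU-frequency rule prevents this migration.

    Frequencies are integers in MHz to avoid floating-point comparisons.
    """
    return guest.solaris_major == 10 and source_mhz != target_mhz


# Example from the text: SPARC T4-1 at 2.85GHz to SPARC T4-4 at 3.0GHz.
legacy = Guest(name="legacy-app", solaris_major=10)
print(frequency_blocks_migration(legacy, 2850, 3000))  # True
```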

Alternatives:

  • Service domain redundancy also allows you to patch and restart the service domain layer
  • High availability software (e.g. clustering) at the application level can provide the same flexibility

Useful links: