Oracle VM for SPARC (LDoms)

LDoms provide hardware-assisted separation between multiple Solaris instances on the same SPARC server. The model will feel familiar if you have used VMware, Hyper-V, Xen or VirtualBox. Once the control and service domains are set up and resources have been allocated, most procedures are the same for a guest ldom as they are for a standalone server.

However, to get to that point there are a number of design decisions to make. This section will take you through some common pitfalls and solutions.

Side note: The name "Oracle VM for SPARC" can be misleading because it is completely different from "Oracle VM for x86", which is built on Xen technology. When a customer asks me for help with "Oracle VM", I always have to confirm the platform up front, so I find it much easier to use the term "ldoms".

In an ldoms configuration, a service domain is responsible for handling hardware resources such as network and storage on behalf of its guests. The simplest configuration is to use a single primary domain which functions as a service domain for all of the guest domains. 
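As a concrete sketch of that simplest layout, the commands below create a guest served entirely by the primary domain. All the names (guest0, the volume, the ZFS dataset, net0) and the sizes are examples, not recommendations:

```shell
# Create the guest and give it CPU and memory
ldm add-domain guest0
ldm set-core 2 guest0
ldm set-memory 16G guest0

# Virtual disk service in the primary, backed here by a ZFS volume
ldm add-vds primary-vds0 primary
zfs create -V 40g rpool/guest0-disk0
ldm add-vdsdev /dev/zvol/dsk/rpool/guest0-disk0 guest0-vol0@primary-vds0
ldm add-vdisk vdisk0 guest0-vol0@primary-vds0 guest0

# Virtual switch in the primary, bound to a physical NIC
ldm add-vsw net-dev=net0 primary-vsw0 primary
ldm add-vnet vnet0 primary-vsw0 guest0

# Bind resources and boot
ldm bind guest0
ldm start guest0
```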

It is possible to configure SPARC servers with redundant service domains (e.g. primary and secondary).
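With two service domains in place, a guest's disk can be exported through both of them and tied together with an mpgroup, so the guest keeps running if one service domain is taken down. This is a sketch only; the mpgroup name, volume names and the shared LUN path are placeholders:

```shell
# Export the same shared LUN from both service domains, joined by an
# mpgroup so the vdisk fails over between paths
ldm add-vdsdev mpgroup=disk0-mp /dev/dsk/cXtYdZs2 \
    guest0-vol0@primary-vds0
ldm add-vdsdev mpgroup=disk0-mp /dev/dsk/cXtYdZs2 \
    guest0-vol0@secondary-vds0

# The guest sees a single vdisk; path failover is handled underneath
ldm add-vdisk vdisk0 guest0-vol0@primary-vds0 guest0
```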

The headline feature of ldoms is the ability to live migrate a guest instance from one physical server to another, provided both machines are configured correctly and the guest's storage is shared between them.
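A migration looks something like the following (guest and host names are examples). The dry-run option is worth using first, because it reports configuration mismatches without moving anything:

```shell
# Dry run: check that guest0 can migrate to target-host
ldm migrate-domain -n guest0 target-host

# Perform the live migration
ldm migrate-domain guest0 target-host
```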

If you are converting several previously standalone machines into a smaller number of ldoms, you should at a minimum use link aggregation on 1Gb links to provide some level of load balancing. If you plan on using 10Gb Ethernet, note that the ldoms virtualization layer can introduce a significant performance penalty for 10Gb Ethernet unless configured correctly.

LDoms are most efficient when they are allocated whole CPU cores. On the SPARC T4 and T5 servers, each CPU core has 8 threads. If multiple ldoms share a core:

  • The per-core cache may be thrashed by the different workloads, causing unpredictable performance
  • The automatic single-thread optimization feature on the CPU may not work correctly, potentially halving single-thread performance

I would only ever use partial-core allocations on a "play" server where there is a need for more guest ldoms than there are CPU cores.
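Requesting whole cores rather than individual threads is straightforward; a minimal sketch, with the domain name and core count as examples:

```shell
# Allocate two whole cores (16 threads on T4/T5) to the guest,
# rather than scattering vcpus across cores shared with other domains
ldm set-core 2 guest0

# Verify which physical cores are bound to the domain
ldm list -o core guest0
```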

One use case for limiting the CPU usage is to reduce Oracle license costs through hard partitioning. Note that there are specific instructions for configuring ldoms to comply with Oracle's hard partitioning rules.
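In outline, Oracle's documented procedure combines whole-core allocation with a max-cores constraint on the domain. The sketch below uses example counts; check the current hard partitioning document for the exact steps, since changing max-cores generally requires the domain to be stopped and unbound first:

```shell
# Whole cores plus a matching max-cores constraint, as required for
# hard partitioning (domain stopped and unbound beforehand)
ldm set-core 4 guest0
ldm set-domain max-cores=4 guest0
```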

The Oracle documentation on configuring virtual disk devices provides instructions on several ways to configure vdisks for your guest ldoms, but unfortunately doesn't warn you about the severe performance problems that some of those configurations can cause.
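In my experience, file-backed vdisks sitting on a filesystem are the usual suspect; exporting a raw LUN (or a ZFS volume) directly as the backend tends to behave far more predictably. A hedged sketch, with the device path and names as placeholders:

```shell
# Export a whole LUN (slice 2) as the vdisk backend instead of a
# file on a filesystem; path and names are examples
ldm add-vdsdev /dev/dsk/cXtYdZs2 guest0-data@primary-vds0
ldm add-vdisk data0 guest0-data@primary-vds0 guest0
```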

In most new environments I like to use 802.3ad (LACP) link aggregation, also known as bonding and port-channeling.
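On Solaris 11 that looks roughly like this; link and switch names are examples, and your switch ports must be configured for LACP as well:

```shell
# Create an active-mode LACP (802.3ad) aggregation over two links
dladm create-aggr -L active -l net0 -l net1 aggr0

# Hang the ldoms virtual switch off the aggregation
ldm add-vsw net-dev=aggr0 primary-vsw0 primary
```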

Side note: The term "trunking" confuses people because it means two different things. It can mean link aggregation (bundling multiple physical links to create one logical link), or it can mean VLAN tagging (sending multiple separate networks over one link). I therefore avoid the term "trunking" and instead talk about link aggregation or VLAN tagging.