After reading some of the Slashdot comments on an article regarding a VMware Server web application performance test, I felt compelled to write up a “tips” post. I’m amazed at how much misunderstanding (and assumption-making) there is regarding VMware and virtualization in general!
I’m an old hand at VMware. I began using it back in the late 90s when the company was still very young. Along the way, I’ve picked up quite a bit of knowledge of the product. Although VMware’s ESX Server offering is really “where it’s at”, when VMware Server was released early last year it brought true “server” virtualization within reach of small IT shops with very tiny budgets (like The Linux Fix!)
But first off, VMware Server is no replacement for ESX Server. ESX is a very robust part of the Virtual Infrastructure, within which you can do almost magical things. However, if you have only a handful of VMs, or are running a small shop where you’d like to set things up “proper” and keep things clean rather than ganging up all your software onto one OS installation, VMware Server is a great way to make sure you’re getting the most out of the money you’ve invested in your server equipment. Better yet, it doesn’t involve investing any more money, which makes it an especially great fit for us!
One misconception is that VMware Server isn’t “stable” or enough of a “performer” for production use. This is absolutely wrong: VMware Server chugs along happily. For example, our environment regularly goes as long as six months without needing to be taken down, and when we do take it down it is generally for some unrelated reason or simply to upgrade VMware itself. You cannot get the same performance numbers or quantity of VMs on a host as with ESX, nor can you do the magic of balancing resources, but in most small environments that isn’t needed and its absence isn’t a problem.
What’s more important is that your entire setup is done properly from the beginning, and that mostly means hardware and the host operating system. Below are the things The Linux Fix has learned along the way, and if you follow them I’m sure you’ll experience the same success with VM-S that we have:
Tip #1 - The Host Machine
Tip #2 - The Network
- Running more than a handful of virtual machines (over 10 moderately busy ones) on a server can stress the network port your server is plugged into. Even a gigabit link can get flooded with traffic. Remember, raw throughput doesn’t prevent congestion! Avoid some of this by using port aggregation and trunking. Use a quality, manageable Layer-2 switch that has a high-throughput backbone, and trunk 2, 3, or even 4 of your server’s NIC ports (if you have them) together in a load-balance & failover scheme. This helps spread the load among multiple ports, should keep you from unintentionally DoS’ing your switch, and also provides a bit of redundancy at the network layer.
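To judge whether a single port is actually getting close to saturation before you bother with trunking, you can sample the kernel's interface byte counters twice and compare. The following is a minimal sketch, not a monitoring tool: it assumes a Linux host that exposes /proc/net/dev, the function names are made up for illustration, and the 1000 Mbit/s default is only an assumption about your link speed.

```python
def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev-style text."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        # Each interface line carries 16 counters: 8 receive, then 8 transmit.
        if len(fields) < 16 or not fields[0].isdigit():
            continue
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def link_utilisation(before, after, interval_s, link_mbps=1000):
    """Fraction of link capacity used in the busier direction.

    before/after are (rx_bytes, tx_bytes) tuples taken interval_s seconds
    apart. Counter wrap-around is ignored in this sketch.
    """
    rx = after[0] - before[0]
    tx = after[1] - before[1]
    bits_per_s = max(rx, tx) * 8 / interval_s
    return bits_per_s / (link_mbps * 1_000_000)


def read_counters(path="/proc/net/dev"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as handle:
        return parse_net_dev(handle.read())
```

Call read_counters() twice, a known number of seconds apart, and pass one interface's two tuples to link_utilisation(). A value that sits near 1.0 during busy periods suggests that port is a good candidate for aggregation.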
Tip #3 - The Host OS
- Use CentOS with the “Minimal” installation option as a base for VMware Server. Since it is based on Red Hat Enterprise Linux, it’s very stable and just about guarantees compatibility with any vendor-specific server agents that you may (and should) run on your servers (see tip #4).
Install CentOS, and use ‘yum update’ to update your system. Shut down and disable the extra services: cupsd, NFS, smartd (your server’s management backplane should do the monitoring), pcmcia, and sendmail, among others. Try to set up CentOS so that there are no ports other than SSH listening on the network. Then do any necessary miscellaneous software installs (ntp, snmpd, etc.), install and configure VMware Server, and *don’t touch it* from that point on. Don’t patch it, configure it, or fondle it in any way.
Yes, I know: some admins would call this bad practice. However, this host’s copy of CentOS *should not* be facing the outside world or your clients. What you’re trying to do is emulate what ESX does: make the host minimal and stable, remove the unnecessary stuff, and do not touch the rest.
Get around the security issues of this by securing your network. Again, disable any unnecessary network services. Use VLANs and VLAN-trunking to give your physical host a different network segment than your VMs. Set up a reputable hardware firewall (we like Fortigate) to control access between them. Make the host’s VLAN accessible only to those who must administer VMware itself.
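One way to check the "nothing but SSH listening" goal is to audit the output of ‘netstat -tln’ on the host. The sketch below is only an illustration under a few assumptions: it expects the usual net-tools column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State), the function name is hypothetical, and the default whitelist of port 22 is a guess at your policy. If you run agents such as snmpd or ntp you will have other sockets open, and those are UDP, which this TCP-only check does not look at.

```python
def unexpected_listeners(netstat_output, allowed_ports=(22,)):
    """Return the sorted TCP ports in LISTEN state that are not whitelisted."""
    found = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a listening TCP socket.
        if len(fields) < 6:
            continue
        if not fields[0].startswith("tcp") or fields[5] != "LISTEN":
            continue
        # The local address looks like 0.0.0.0:22 or :::22; the port
        # is whatever follows the last colon.
        port = fields[3].rsplit(":", 1)[-1]
        if port.isdigit() and int(port) not in allowed_ports:
            found.add(int(port))
    return sorted(found)
```

Feed it the captured netstat text; an empty list means only the whitelisted ports are listening. Anything else is a service to disable, or to deliberately add to allowed_ports (the VMware Server management port, for example).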
Tip #4 - Use your server vendor’s management tools
- By all means, use your tools effectively. For example, Dell’s DRAC does far more than give you lights-out console access. It also provides a full management backplane which, in conjunction with the operating-system-based server agents, can provide information that you can use to proactively prevent down time. Funky DIMMs, failing fans and disks, hot spots, and even CPU problems will all be detected by the hardware backplane far earlier than your software monitoring can detect them. Set up SNMP and use it to forward traps to a good monitoring package, like Nagios. The goal here is to give you early warning of problems so you can fix them before they result in down time!
Tip #5 - Set up and tune your VMs carefully
- VMware Server doesn’t give you the robust resource scheduling that ESX does, but you can offset some of this by ensuring you only give your VMs the resources that they need, and by manually load balancing busy VMs across different physical hosts. Install the vmware-mui on your host, and use that to monitor how much your virtual machines are really using. Most decently busy Apache VMs can get away with as little as 192MB of RAM, but of course your mileage may vary. Monitor the OS in the virtual machine whenever you decrease allocated RAM to ensure you’re not swapping to virtual memory in the guest OS. Also check that the guest isn’t doing too much buffering or caching. Giving a VM too much RAM wastes resources: the excess goes to disk cache in the VM and requires extra work by the host to manage it all. Keep in mind: VM OS disk caching isn’t really important since a) you’re caching at the host OS already, and b) if you bought a server with a good RAID controller, it’s cached there too.
Continually fine-tune your VMs so that they do not hog any more of the host’s resources than is absolutely necessary. DNS servers running djbdns usually do fine with 96MB of RAM. Don’t assign multiple vCPUs in a VM unless you fully understand why that is good and bad (and most of the time, it’s bad).
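The "watch the guest after changing its RAM" routine can be roughed out in a few lines run inside a Linux guest. This is a sketch under stated assumptions rather than a tuning rule: it reads /proc/meminfo-style text, the function names are hypothetical, and both thresholds (1024 kB of swap in use, half of RAM sitting in buffers and cache) are arbitrary starting points that you should adjust to your own workloads.

```python
def parse_meminfo(text):
    """Return {field: value_in_kB} from /proc/meminfo-style text."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def ram_hint(meminfo_text, swap_threshold_kb=1024, cache_fraction=0.5):
    """Suggest whether a guest's RAM allocation looks too small or too large.

    Thresholds are illustrative assumptions, not VMware guidance.
    """
    info = parse_meminfo(meminfo_text)
    swap_used = info["SwapTotal"] - info["SwapFree"]
    if swap_used > swap_threshold_kb:
        return "increase"  # the guest is swapping: it was cut too far
    cached = info.get("Buffers", 0) + info.get("Cached", 0)
    if cached / info["MemTotal"] > cache_fraction:
        return "decrease"  # most RAM is only serving as guest disk cache
    return "keep"
```

Inside a guest, pass it the contents of /proc/meminfo, for example ram_hint(open("/proc/meminfo").read()). Check it over several days of normal load before acting on the answer, since a single sample says very little.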
Following those basic tips will easily allow 4-6 moderately loaded VMs per physical core, which is right around what VMware claims VMware Server can provide. On a four-core Sun X2200M I acquired for testing, I was able to concurrently run 25 VMs in a light-to-medium load environment (network services) without overtaxing the server at all. Quite an amazing feat for a free virtualization package!
All in all, The Linux Fix has approximately 100 virtual machines spread over four physical hosts. The savings in power, rack space, and management-related costs have easily come to several thousand dollars over the past year.
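The per-core rule of thumb above turns into a quick planning range. This is nothing more than the arithmetic from the paragraph wrapped in a function (the name is hypothetical), using the 4-6 VMs-per-core figures as defaults.

```python
def vm_capacity(cores, per_core_low=4, per_core_high=6):
    """Return the (low, high) count of moderately loaded VMs for a host."""
    return cores * per_core_low, cores * per_core_high


# A four-core host, like the test box described above:
print(vm_capacity(4))  # (16, 24)
```

The 25 VMs on the four-core test box land just above the top of that range, which fits with their load being light-to-medium rather than moderate.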
At some point, ESX Server’s performance gains make it the better investment. Soon, we’ll have enough physical hosts to make the $6000 purchase a wise one: on ESX you can roughly double the number of virtual machines any particular host can handle. At five physical machines running VMware Server, we could technically reduce that to two, and sell the rest to recoup some of the licensing costs. Again, you need to crunch your own numbers and see what works. The line at which that happens is different for everyone, but it always exists!
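If you want to crunch those numbers for your own shop, a rough sketch might look like the following. Everything in it beyond the $6000 figure and the "roughly double" density is an assumption made for illustration: the function names are hypothetical, the license is treated as a single one-off cost, and the resale value per freed host is a number you have to supply yourself.

```python
import math


def hosts_after_esx(current_hosts, density_gain=2.0):
    """Hosts needed if each ESX host carries density_gain times as many VMs."""
    return math.ceil(current_hosts / density_gain)


def net_esx_cost(current_hosts, license_cost=6000, resale_per_host=0,
                 density_gain=2.0):
    """License cost minus what you recoup by selling the freed hosts."""
    freed = current_hosts - hosts_after_esx(current_hosts, density_gain)
    return license_cost - freed * resale_per_host
```

Note that with an exact doubling, five hosts round up to three; getting down to two needs a density gain of about 2.5, which is within the spirit of "roughly double". Power, rack space, and management savings from running fewer hosts are not modeled here and would pull the break-even point earlier.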