Geeks in the West Country

Entries categorized as ‘VMware’

VirtualCloud bears fruit

July 31, 2007 · No Comments

Our testing system is finally coming together. Once the last few parts for the server cluster arrive, integration testing can begin. Meanwhile, one of its components is finding uses outside of its parent project.

Managed.VIM, our C# proxy for the VIM webservice, is available for download from the VirtualCloud Sourceforge site. It is currently in beta (revision 105) and probably not ready for use in a production environment. It doesn’t yet implement all the functionality of VIM either; only the bits we need for VirtualCloud have been written so far.

It’s a nice tool for PowerShell monitoring of VirtualCenter, though. Once the Managed.VIM assembly has been loaded into PowerShell, the proxy object can be created as follows:

$vim = New-Object Managed.VIM.Vim("https://localhost:8443", "vc_username", "vc_password");

Replace the URL, username and password with values appropriate to your VirtualCenter configuration.

The Vim object exposes collections of VMs, Hosts, Farms, Datastores and Templates. This is the root of the VirtualCenter object hierarchy. Nodes in the hierarchy expose children by type, in the same way as the root. For instance, a Farm can contain Hosts, VM groups and VMs; these are accessible via the Hosts, Groups and VirtualMachines indexed properties respectively, and can be retrieved by name.

The root of the hierarchy also exposes collections of all VMs and Hosts managed by the VirtualCenter service. Hosts are accessible by name, as they are within Farms, but the collection of all VMs is indexed by UUID. This is because you can have multiple VMs with the same name as long as they are in different groups.

The following will get the object corresponding to the host ‘testserver1’ in the server farm ‘TestFarm’:

$testServer = $vim.Farms["TestFarm"].Hosts["testserver1"];
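The difference between by-name and by-UUID indexing can be tried without the assembly. The following is only a sketch: the hierarchy is a stand-in built from plain hashtables, the group name, VM name and UUID are made up, and it assumes that a group exposes its VMs through a VirtualMachines collection in the same way a Farm does.

```powershell
# Stand-in for the Managed.VIM hierarchy, built from hashtables so the
# indexing can be tried without the assembly. All names and UUIDs are made up.
$vm = @{ Name = "xp-client"; Uuid = "564d0000-0000-0000-0000-000000000001" }
$testHost = @{ Name = "testserver1" }

$vim = @{
    Farms = @{
        "TestFarm" = @{
            Hosts           = @{ "testserver1" = $testHost }
            Groups          = @{ "Clients" = @{ VirtualMachines = @{ "xp-client" = $vm } } }
            VirtualMachines = @{}
        }
    }
    Hosts           = @{ "testserver1" = $testHost }
    VirtualMachines = @{}
}
# The root collection of all VMs is keyed by UUID, not by name.
$vim.VirtualMachines[$vm.Uuid] = $vm

# Within a Farm, children are retrieved by name...
$testServer = $vim.Farms["TestFarm"].Hosts["testserver1"]
$byName = $vim.Farms["TestFarm"].Groups["Clients"].VirtualMachines["xp-client"]

# ...but a VM fetched from the root has to be looked up by its UUID.
$byUuid = $vim.VirtualMachines[$vm.Uuid]
```

Both lookups return the same VM object; only the key differs.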

Operations supported by VirtualMachine objects currently include:

  • Start, stop and suspend,
  • Change persistence mode of hard disks (VirtualCenter does not permit this, but VIM does),
  • Get or set device or ISO image used for CD drives,
  • Get CPU count and memory allocated to the VM,
  • Get Host.

Operations supported by Host objects include:

  • Get CPU and memory info,
  • Get accessible datastores,
  • Get hosted VMs,
  • Get Farm.

Operations supported by Template objects include:

  • Get size, memory allocation, datastore and guest OS,
  • Deploy template to Host, with optional customisation of Windows guests.

Customisation of guest operating systems has not actually been tested yet and, like most of this assembly, is not documented yet either. It involves the Managed.VIM.Customisation.WindowsCustomisation class, which exposes some fairly self-explanatory properties and methods. Use at your own risk.

The leaf nodes of the hierarchy (Hosts, VirtualMachines, etc) are heavyweight objects. They are kept synchronised with the VirtualCenter service and therefore generate a certain amount of network traffic. For this reason, each collection also exposes a list of the names of all objects in it, like the Keys property of a dictionary. For example:

$hosts = $vim.Farms["TestFarm"].Hosts.Names;

This is to be preferred when you don’t actually need any of the properties or functions on a Host or VirtualMachine, but merely need to determine which ones are available or where they are.
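As an illustration, here is a sketch of finding which farm holds a given host using nothing but name lists. The data is a stand-in made of plain objects, and the farm and host names are invented. Against the real proxy the farm names would presumably come from the Names list of the Farms collection rather than from the Keys of a dictionary.

```powershell
# Stand-in data: each farm's Hosts collection is reduced to its Names list.
# Farm and host names are invented for the example.
$vim = [pscustomobject]@{
    Farms = [ordered]@{
        "TestFarm" = [pscustomobject]@{
            Hosts = [pscustomobject]@{ Names = @("testserver1", "testserver2") }
        }
        "ProdFarm" = [pscustomobject]@{
            Hosts = [pscustomobject]@{ Names = @("prodserver1") }
        }
    }
}

# Return the name of the farm containing the named host, or $null.
# Only name lists are read; no Host object is ever retrieved.
function Find-FarmForHost($vim, [string]$hostName) {
    foreach ($farmName in $vim.Farms.Keys) {
        if ($vim.Farms[$farmName].Hosts.Names -contains $hostName) {
            return $farmName
        }
    }
    return $null
}

$farm = Find-FarmForHost $vim "prodserver1"
```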

Datastores are a bit of a sore point at present. Multiple datastores can technically have the same name, although I think this only applies to the default datastore on a host, called ‘local’. Working on this assumption I’ve allowed for multiple ‘local’ datastores, accessed via $vim.Datastores.Local, which is indexed by Host. All other stores are accessed by name (through $vim.Datastores) and must therefore have unique names. Otherwise Managed.VIM’s internals don’t work correctly and will attempt to log lots of errors through log4net. They’ll do this on pretty much every update request too, which will eat lots of CPU and effectively kill the update mechanism. I’m not going to try to fix this until I know it’s a problem, because the fix breaks some other things.

Custom properties on Hosts and VirtualMachines are supported by Managed.VIM. Getting and setting values is possible, but adding new properties requires the VirtualCenter client. This is apparently a limitation of the VIM SDK. This was a bit worrying initially since VirtualCloud will require several custom properties. Fortunately, VirtualCenter doesn’t appear to mind having new properties poked directly into its database, so the VirtualCloud installer (when written) will probably want to know where this database is in order to update it. I’d very much like to know if this is considered acceptable, or whether most VirtualCenter admins want other apps to consider that database inviolate; the alternative is asking the admin to add about twelve properties to VirtualCenter by hand…

By the way, Console is made of pure win. Tabbed PowerShell is extremely useful.

Categories: PowerShell · VMware · VirtualCloud · Virtualisation

Small solution to an Enterprise problem

June 28, 2007 · 1 Comment

Given a set of VMs representing possible client configurations, a set of Selenium test scripts, and a collection of multiprocessor servers, how can the process of running all tests against all VMs be automated and scaled to available processing capacity?

On-demand tests against a developer’s copy of the product would also be useful, as would a means of testing computers for compatibility at the client’s site.

Back in May I wrote about the script-based VM management system I put together in preparation for automated UI testing. While it is sufficient for building our test server and managing our production builds (which it has been doing very well for over a month now), it’s become apparent that maintaining the scripts necessary to run a testing farm is going to quickly become impossible. PowerShell’s great for small stuff, but something this big needs some serious software behind it.

There are existing solutions out there which do what we need, e.g. Surgient’s Virtual Lab Management Applications, or VMware’s own Lab Manager, but they tend to be aimed primarily at the larger businesses that have either the infrastructure or budget (usually both) for Enterprise-class solutions. We don’t have the space or staff to run a datacenter, and a managed solution doing the amount of processing we need does not come cheap.

So the plans were laid for building a smaller scale solution to the problem of automating tests across multiple VMs. We stuck with VMware, because our existing infrastructure is based upon their free Server product and the competing products weren’t competitive enough for converting to be worthwhile. The target machines for this system were to be three multi-CPU rack servers, which will later also take on the burdens of the three glorified workstations that currently occupy the server cupboard.

We looked for a way to abstract the VM capacity of the physical servers and treat them as a computing cloud. The original plan was to write some daemons in .NET that would use VMware’s VIX API to talk to the VMware Server instances, but VIX functions often take callbacks and reverse P/Invoke didn’t want to play nicely. While it might’ve been possible to write the daemons in C/C++ instead, it was far cheaper just to fork out for VirtualCenter 1.4 and learn to use the Virtual Infrastructure Management (VIM) SDK from C#. This basically reduced our complex distributed solution to a single service that could use VirtualCenter to do its dirty work.

After a little research it was discovered that switching to ESX Server would cost too much after factoring in VirtualCenter 2 and the necessary hardware (SANs do not come cheap), although it would have let us use version 2 of the VIM SDK, which has proper C# bindings. It was decided that our computing cloud would run a lightweight Linux distro to minimise the overhead of the native OS on our cloud’s nodes. Gentoo was suggested, but I wasn’t inclined to spend eight hours installing it when Ubuntu Server is so much quicker and simpler. We have an evaluation copy of ESX somewhere; at some point I should probably determine exactly how much more efficient a hypervisor would make our computing cloud.

My initial reaction to the version 1.4 VIM SDK was not entirely favourable. I wasn’t happy about the use of HTTP webservices for everything, since this means that the client has to pull updates down from the server via a message loop and there was no C# proxy API provided. Implementing that message loop gave me a headache to begin with; it’s uncharted territory for a SOAP newbie like me :P. But after a little hard work and some background reading I began to understand the protocol and the sense behind it, and now we have a nice extensible C# proxy for it which handles object synchronisation entirely in the background.

(I’m still confused about the use of XML diffs. Surely if minimisation of network traffic is a priority, XML isn’t the best format to use anyway?)

Once we’d determined which parts of our original design worked and what could replace the bits that didn’t, a new design was drawn up. It mutated slightly over the following week or so, but has stayed broadly the same.

VirtualTest design

  • The TestBuilder generates ISO images containing the necessary programs and data to run the tests. These ISOs will autoplay if attached to an active VM. VIM doesn’t seem to support attaching devices while a VM is running, however, so each VM includes a service that autoplays an attached ISO at system start time.
  • The Resource Manager handles scheduling of jobs. It knows nothing about these jobs except the resources they consume. The Resource Controller (the Managed.VIM namespace, used for talking to VirtualCenter) eventually got split out of this module and became a separate entity.
  • VirtualTest runs as a service and links everything together.

The use of an ISO to inject tasks into a VM is rather similar to the ‘parameter disk’ concept used in our PowerShell-based system. It’s also the technique VMware use to install their Tools package on a VM, which incidentally is where I got the idea in the first place…

The design above is actually split into two parts. The testing system consists of the dashboard, the VM, and the contents of the ISO. VirtualTest (the green part) is really just a job control system dealing with VM processing resource; TestBuilder would be better named IsoBuilder, and we’ll have to rename VirtualTest too. Strangely, the potential flexibility of this system only really became clear once the design had been adjusted to make it fully testable. Yet another victory for TDD, I feel.

This split also made dividing the workload simple. I worked on the VirtualTest system, and Ben has been developing the testing framework using Ruby and Selenium. Integrating the two systems consists of implementing a Job object that will build and deploy the appropriate ISO to the VM; apart from that, neither system need know anything about the other, since test results are uploaded to the dashboard by the testing software running from the ISO. We also get for free a way to do on-site testing of client configurations, because the ISOs can be burned to CD and run on any machine.

At this point in time, VirtualTest is not quite ready for deployment. The CLI tools do not yet work across TCP (I may well solve this by dropping Remoting and just using WCF instead) and all they do is add a simple demo job to the queue. The ISO builder for tests has not yet been implemented. Support for on-demand tests is on hold until we actually have something working.

The core of the VirtualTest system demonstrably works, however, and will soon take over from the PowerShell scripts that currently schedule our nightly builds. It’s almost complete enough for our needs, but the following improvements could also be made:

  • Fix resource management so resources can be assigned properly. Resources requested by a job and resources viewed as allocated to that job should be the same thing, but presently the resource manager isn’t smart enough to remember how much is allocated, and instead just looks at how much is being used.
  • Fix the whole thing so multiple resource types can be managed. The only resource it understands is ‘VM capacity’. It’d be nice to be able to track registration key usage, so we could make optimal use of Windows product keys and ensure that only one active VM is using a given key at a time; currently every VM has to have a separate key whether it’s running or not, which is somewhat wasteful.
  • It’d also be nice to allocate CPU and memory intelligently instead of abstracting these as ‘VM capacity’, but that’s probably out of our reach.
  • Fix the whole thing so jobs can be scheduled dependent on multiple resource types. This ties in with the registration key thing.
  • Make the Managed.VIM proxy safe for multithreaded use without requiring lots of locking in client code.
  • Ensure that Managed.VIM works seamlessly with VIM version 2, so migration to ESX Server is easy.
  • Guarantee that all Job objects are serialisable, allowing them to be instantiated by a client app and sent across the wire to the VirtualTest service. At present, the Remoting interface demands a Type and an array of constructor arguments, which is horrible.
  • Improve unit test coverage. The Managed.VIM assembly is mostly untested. We need integration tests for this as well.

I’m going to move the codebase over to Sourceforge as soon as possible, at which time I’ll blog again. The ‘testing’ component of this system is Ben’s domain, so I’ll let him explain it.

UPDATE: We now have a Sourceforge project, called VirtualCloud. The codebase has been migrated and the above improvements added to the Feature Request tracker.

Categories: Testing · VMware · VirtualCloud · Virtualisation

VMware, PowerShell and much automation

May 11, 2007 · 3 Comments

We run many of our servers as virtual machines on VMware GSX Server. This makes backups easy; simply power down the VM and copy the relevant disk image. It’s also rather nice for setting up a test server from a known-good state every day.

We already have a VM that builds in the early hours of the morning, right after the nightly production build. It has a non-persistent disk image, which means that all changes to the hard disk are lost when it is switched off. A script shuts it down prior to the production build, then starts it again afterwards. Another script starts on the VM at boot time to run the necessary installers.

This setup is simple and effective, but the installers can take a while to run. The sample data is a particular problem, sometimes taking as long as forty-five minutes to insert into the database. This means that, should the VM fail to build during the night, there’s a very long period of time between fixing the problem and having a working test server. Plus, the server image is connected to the Windows domain and Active Directory sometimes gets confused by the disk image reverting. This always requires manual intervention and the process for fixing it is prone to error.

With the new requirement for multiple similar servers running automatic tests in parallel, I decided to solve all our problems at once by writing a set of scripts to build server images independently of the network and the domain. The first criterion, network independence, arises from the fact that all the servers will come from a single base image with a single hostname, and running more than one at a time is going to really confuse the network. Changing the hostname is high on the list of priorities for the image builder.

The system I decided upon involves attaching a ‘parameter disk’ to the VM, containing a Bootstrap.ps1 script and all the programs necessary to configure the VM. The base image will search for the bootstrap script every time it starts and run it if possible. This lets us shut down a built image without losing everything on it (as we did with the old system) and without having the installers run on every boot. Once the image is set up, we just disconnect the parameter disk and it’ll behave as an ordinary VM.
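The startup search might look roughly like the sketch below. It assumes the bootstrap script sits at the root of whichever drive the parameter disk is mounted on; the function name Find-Bootstrap is made up for the example, and the real base image may do this differently.

```powershell
# Return the path of the first Bootstrap.ps1 found at the root of any of the
# given locations, or $null if none of them has one.
function Find-Bootstrap([string[]]$roots) {
    foreach ($root in $roots) {
        $candidate = Join-Path $root "Bootstrap.ps1"
        if (Test-Path -LiteralPath $candidate -PathType Leaf) {
            return $candidate
        }
    }
    return $null
}

# At startup: look on every filesystem drive and run the script if present.
# With no parameter disk attached nothing is found and the VM boots normally.
$driveRoots = @(Get-PSDrive -PSProvider FileSystem |
    Where-Object { $_.Root } |
    ForEach-Object { $_.Root })
$bootstrap = Find-Bootstrap $driveRoots
if ($bootstrap) {
    & $bootstrap
}
```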

Implementing all this has taken most of a fortnight and expanded our PowerShell script library significantly. Most of the complexity comes from the scripts’ interaction with VMware. This interaction occurs primarily through VMware’s COM API, but some things aren’t supported, particularly modifications to the VMs’ hardware configuration.

On the host machine, where the VMs are manipulated:

The image building script does the following:

  1. Copies the base image to the target directory,
  2. Calls another script which sets up the parameter disk,
  3. Boots the VM and waits for it to exit,
  4. Disconnects the parameter disk.
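The order of those four steps can be sketched as follows. Only the copy is real here: the three VMware-specific steps are passed in as script blocks because they depend on our own scripts and the COM API, so their names and signatures are hypothetical.

```powershell
# Skeleton of the image build. The three script-block parameters are
# hypothetical hooks standing in for the VMware-specific steps.
function Build-Image {
    param(
        [string]$BaseImageDir,
        [string]$TargetDir,
        [scriptblock]$SetupParameterDisk,
        [scriptblock]$BootAndWait,
        [scriptblock]$DisconnectParameterDisk
    )

    # 1. Copy the base image. $TargetDir must not exist yet, otherwise
    #    Copy-Item nests the copy inside it instead of creating it.
    Copy-Item -Path $BaseImageDir -Destination $TargetDir -Recurse

    # 2. Build the parameter disk and attach it to the copied VM.
    & $SetupParameterDisk $TargetDir

    # 3. Boot the VM and block until it has shut itself down.
    & $BootAndWait $TargetDir

    # 4. Detach the parameter disk so the image behaves as an ordinary VM.
    & $DisconnectParameterDisk $TargetDir
}
```

Each hook receives the target directory, which is enough to locate the copied VM.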

Our test server VM requires a few more things to be done afterwards, like checking the logs on the parameter disk, setting the new image’s MAC address, attaching it to the network, shutting down the old test server and booting the new one.

I needed to be able to set the VM’s MAC address so that the VM could be given a known IP address. All our other machines are on the domain, but the new test server cannot be, so it has to take over the IP of the old test server, which was on the domain and is the host most of our other machines have bookmarked.

The API provides a means for modifying configuration info like MAC addresses and hard disk devices in memory (VmCtl.Config) but any changes made this way are lost when the VM process terminates. Making permanent changes that will persist when you move the VM image elsewhere requires direct manipulation of the .vmx file, which does not appear to be very well documented. I take this to mean that it’s not really meant to be tinkered with, but that’s just an invitation…

Fortunately, Google turned up this page among the documentation, describing the procedure for setting a static MAC address.
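For reference, here is a sketch of that edit as I understand VMware’s procedure: remove the generated-address entries for the adapter and add addressType and address entries instead. The function name is made up, the adapter is assumed to be ethernet0, and the VM should be powered off while the file is changed. VMware reserves 00:50:56:00:00:00 to 00:50:56:3F:FF:FF for static addresses.

```powershell
# Set a static MAC address on one virtual NIC by rewriting the .vmx text.
# Assumes the VM is powered off and that $mac is in VMware's static range.
function Set-VmxStaticMac([string]$vmxPath, [string]$mac, [string]$nic = "ethernet0") {
    $prefix = [regex]::Escape($nic)

    # Drop any existing address settings for this NIC (static or generated),
    # leaving every other line of the file untouched.
    $lines = @(Get-Content -LiteralPath $vmxPath | Where-Object {
        $_ -notmatch "^\s*$prefix\.(addressType|address|generatedAddress|generatedAddressOffset)\s*="
    })

    # Append the two entries that define a static address.
    $lines += "$nic.addressType = `"static`""
    $lines += "$nic.address = `"$mac`""

    Set-Content -LiteralPath $vmxPath -Value $lines
}
```

Calling Set-VmxStaticMac with the path to the .vmx file and an address such as 00:50:56:01:02:03 rewrites the file in place, so it is worth keeping a copy of the original.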

On the VM, where the parameter disk is consumed:

Building two dozen servers for automated testing isn’t much use if they can’t connect to the network because they all have the same hostname. We need some means of changing the VM’s hostname. All Windows machines have a supposedly-unique security identifier (SID) as well, and I’d like to change that during image construction just in case we need to build machines for the domain.

Windows does not support changing the SID after the GUI phase of system installation has begun. The base image contains a thoroughly cooked Windows system, which has not only fully completed the OS installation but has several apps installed as well. Fortunately for Windows sysadmins everywhere, there’s a nice little app called NewSID, a part of the Sysinternals collection, which can change the SID after installation is complete. Running it in non-interactive mode can be done via a command-line switch, ‘/a’.

At least, in theory.

Microsoft has added a EULA to each of the Sysinternals tools, which means that NewSID will sit and wait for someone to click a button before doing anything, even in ‘non-interactive’ mode. This can be circumvented by creating a registry key before you run NewSID:

[HKEY_CURRENT_USER\SOFTWARE\Sysinternals\NewSID]
"EulaAccepted"=dword:00000001

Of course, NewSID will reboot the machine after it runs. For our purposes it should only be run once, no matter how many times the system is rebooted subsequently.

So the PowerShell script for running NewSID on a machine and changing its hostname to $compName looks something like this:

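if($env:COMPUTERNAME -eq $compName)
{
    # The hostname already matches, so NewSID has run: do subsequent setup tasks here.
}
else
{
    # Pre-accept the Sysinternals EULA so NewSID doesn't wait for a click.
    # -Force creates the parent key if it is missing and tolerates an existing key.
    $keyPath = "HKCU:\SOFTWARE\Sysinternals\NewSID";
    New-Item -Path $keyPath -Force | Out-Null;
    Set-ItemProperty -Path $keyPath -Name "EulaAccepted" -Value 1;

    # Run NewSID non-interactively; it renames the machine and then reboots it.
    $p = [diagnostics.process]::Start("newsid.exe", "/a ${compName}");
    $p.WaitForExit();
}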

Construction of the parameter disk:

Parameter disks are defined by INCLUDES configuration files which contain a list of files to put on the disk. This turned out to be a bit limiting; NewSID will need to be run for all our server configurations and adding that functionality isn’t as simple as just putting files on the disk.

So packages can also be included. A package is basically a directory with a BUILD.ps1 script in it, which takes a path to the parameter disk and can assume it’s running from the package’s own directory. For most packages this script just copies files, but the NewSID one has to do something a little more complicated.

A package is not allowed to make any assumptions about the parameter disk apart from the presence of a Bootstrap.ps1 file. All parameter disk configurations will have this file and it will be the first thing run. NewSID’s BUILD.ps1 has to set up the disk so that its own Bootstrap.ps1 gets run first, which is done by renaming the existing Bootstrap.ps1 and chaining to it from its own.

This does come with the caveat that NewSID must be the last package included, otherwise it might get displaced by another.
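The renaming trick can be sketched like this. It is not the real NewSID BUILD.ps1: the function name and the name given to the displaced script are invented, and the generated script simply runs its own body and then chains, whereas the real one would only chain once the hostname has been changed. It also relies on $PSScriptRoot, which needs a newer PowerShell than the one these scripts were written for.

```powershell
# Put a new Bootstrap.ps1 in front of the one already on the parameter disk.
# "Bootstrap.Original.ps1" is a made-up name for the displaced script.
function Install-ChainedBootstrap([string]$diskPath, [string]$newBootstrapBody) {
    $bootstrap = Join-Path $diskPath "Bootstrap.ps1"
    $renamed = "Bootstrap.Original.ps1"

    # Move the existing bootstrap script out of the way...
    Rename-Item -LiteralPath $bootstrap -NewName $renamed

    # ...and write a replacement that runs the new body first, then chains
    # to the displaced script sitting next to it on the disk.
    $chain = '& (Join-Path $PSScriptRoot "' + $renamed + '")'
    Set-Content -LiteralPath $bootstrap -Value @($newBootstrapBody, $chain)
}
```

If a second package used the same trick with the same file name, the rename would fail because the target already exists, which is one way the ordering caveat shows up in practice.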

An INCLUDES file looks something like this:

Bootstrap.ps1
@Installers
@NewSID [hostname]

Filenames are relative to the directory containing the INCLUDES file. Any line starting with an ‘@’ is assumed to be a package name. Packages can be given a list of comma-delimited arguments, which will be passed as an array as the second argument to BUILD.ps1.
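A sketch of how such a file could be parsed is below. It is not our actual script: the function name and the shape of the returned objects are invented, and it assumes the package name is separated from its arguments by whitespace, as in the example above.

```powershell
# Turn the lines of an INCLUDES file into objects describing what to put on
# the parameter disk. Kind is "File" or "Package"; Args is always an array.
function Read-Includes([string[]]$lines) {
    foreach ($line in $lines) {
        $line = $line.Trim()
        if ($line -eq "") { continue }

        if ($line.StartsWith("@")) {
            # "@Name arg1,arg2" -> package name plus comma-delimited arguments
            $name, $rest = $line.Substring(1) -split "\s+", 2
            $pkgArgs = @()
            if ($rest) {
                $pkgArgs = @($rest -split "," | ForEach-Object { $_.Trim() })
            }
            [pscustomobject]@{ Kind = "Package"; Name = $name; Args = $pkgArgs }
        }
        else {
            # Anything else is a filename relative to the INCLUDES file
            [pscustomobject]@{ Kind = "File"; Name = $line; Args = @() }
        }
    }
}
```

The Args array of a package entry is what would be handed to BUILD.ps1 as its second argument.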

At present, our scripts are rather tied to our filesystem layout and server configurations. Once they’re a bit more portable I’ll make them available for download.

UPDATE: These scripts are rapidly approaching obsolescence, what with our decision to manage our VMs with VirtualCenter. Managed.VIM and VirtualCloud will supersede the scripts and be a lot less nasty to maintain.

Categories: PowerShell · VMware · Virtualisation