Deleting a VM swap file fails

The first time I attempt to delete a VM on srv3, it fails with this error:

Deleting c7d7.cloud.virtualmin.com .. .. failed : Logical volume vg_srv3/c7d7_cloud_virtualmin_com_swap is used by another device.

Deleting the VM a second time seems to successfully delete the swap file and the VM.

This may be related to Cloudmin being unable to ping the VM (I dunno if it's up or down, Cloudmin thinks it's down, but the error makes me think it's still up and network problems are preventing pings). The fact that it deletes on the second attempt further makes me think it's the VM itself hanging on to that file, and that something in the earlier delete attempt actually shut it down.

This may have something to do with something being borked on srv3, as I am not having much luck creating a functional VM. Doing more experimenting now.

Status: 
Closed (fixed)

Comments

Submitted by Joe on Wed, 09/10/2014 - 19:20 Pro Licensee

Further information: if one doesn't attempt the second delete almost immediately, it will fail again. Then, if you follow up with an immediate second deletion, it will succeed. Seems very much like some sort of race condition. No idea how to troubleshoot it beyond this point.

Submitted by Joe on Wed, 09/10/2014 - 19:31 Pro Licensee

And, further still, the deletions of swap volumes don't actually happen, even though Cloudmin reports successful deletion on the second attempt and the VM is gone from the dropdown of managed servers.

Submitted by Joe on Wed, 09/10/2014 - 19:52 Pro Licensee

OK, so it seems to be an issue of the underlying system. Using lvremove does report the volume is in use, even though it's not listed in /proc/mounts, mount, or lsof.

Rebooting the host system (srv3) does not alter the situation. Webmin's logical volume module and the lvremove command both claim the volume is in use. So confusing.

I saw this on CentOS 7 as well - it is due to some mapper entries still existing and pointing to partitions within the LV. I'll look into it.

Submitted by Joe on Wed, 09/10/2014 - 20:36 Pro Licensee

Oops, red herring. That was specific to one logical volume which was being mounted by the host system.

After reboot, volumes were removable.

Submitted by Joe on Wed, 09/10/2014 - 22:25 Pro Licensee

I found a ticket about that issue in the RHEL ticket tracker. It is allegedly fixed, though I think we're running the latest everything. Maybe it was borked from a prior boot and new ones won't have the problem.

Ok, the logical volume removal error will be fixed in the next Cloudmin release. Turns out that some /dev/mapper links need to be removed with dmsetup remove before the LV deletion will work.
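
For anyone hitting this before the release, here is a rough Python sketch of the cleanup order described above. It is not Cloudmin's actual code. It assumes the leftover mappings are partition maps named after the LV's device-mapper name plus a partition number (optionally with a "p" in between), which is only a naming guess, so check the matches by hand (for example against the output of dmsetup ls) before removing anything. The function names and the swap1 entry in the usage note are made up for illustration. The sketch only builds the command lists and does not run them.

```python
import re


def dm_name(vg, lv):
    # LVM doubles any hyphen inside a VG or LV name, then joins the two
    # names with a single hyphen to form the /dev/mapper entry name.
    return vg.replace("-", "--") + "-" + lv.replace("-", "--")


def leftover_partition_maps(vg, lv, dm_names):
    # Name-based guess (an assumption, not a guarantee): partition maps
    # look like "<lv map name><N>" or "<lv map name>p<N>". Another LV
    # whose name happens to end in digits would also match, so the
    # result should be reviewed before use.
    pattern = re.compile(re.escape(dm_name(vg, lv)) + r"p?[0-9]+")
    return sorted(n for n in dm_names if pattern.fullmatch(n))


def removal_commands(vg, lv, dm_names):
    # Remove the partition maps first, then the logical volume itself.
    # "-f" tells lvremove to skip the confirmation prompt.
    cmds = [["dmsetup", "remove", n]
            for n in leftover_partition_maps(vg, lv, dm_names)]
    cmds.append(["lvremove", "-f", vg + "/" + lv])
    return cmds
```

Given the names found under /dev/mapper, removal_commands("vg_srv3", "c7d7_cloud_virtualmin_com_swap", names) would list a dmsetup remove for a hypothetical vg_srv3-c7d7_cloud_virtualmin_com_swap1 entry, followed by the lvremove for the volume itself. Each command would need to be run as root.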

Automatically closed -- issue fixed for 2 weeks with no activity.