<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Utility Computing dot China &#187; esx</title>
	<atom:link href="http://www.utilitycomputing.com.cn/tag/esx/feed" rel="self" type="application/rss+xml" />
	<link>http://www.utilitycomputing.com.cn</link>
	<description>数 据 嘉 年 华</description>
	<lastBuildDate>Mon, 26 Jul 2010 08:59:29 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>ESX: Recover from expanded disk with existing snapshot or corrupted snapshots</title>
		<link>http://www.utilitycomputing.com.cn/the-cloud/recover-from-expanded-disk-with-existing-snapshot-or-corrupted-snapshots</link>
		<comments>http://www.utilitycomputing.com.cn/the-cloud/recover-from-expanded-disk-with-existing-snapshot-or-corrupted-snapshots#comments</comments>
		<pubDate>Tue, 13 Nov 2007 16:35:32 +0000</pubDate>
		<dc:creator>richard</dc:creator>
				<category><![CDATA[The Cloud]]></category>
		<category><![CDATA[esx]]></category>
		<category><![CDATA[expansion]]></category>
		<category><![CDATA[snapshots vi3]]></category>
		<category><![CDATA[virtual disks]]></category>
		<category><![CDATA[vmkfstools]]></category>
		<category><![CDATA[vmware]]></category>

		<guid isPermaLink="false">http://www.utilitycomputing.com.cn/?p=120</guid>
		<description><![CDATA[I had a nasty shock this week with ESX3. I was going about expanding virtual disks and reallocating resources for one client. Now, I have done this MANY times, so I thought that &#8220;the 2 day old backup is sufficient&#8221; and did not wait 3-4 hours for a new backup, right before what will be [...]]]></description>
			<content:encoded><![CDATA[<p>I had a nasty shock this week with ESX3.</p>
<p>I was going about expanding virtual disks and reallocating resources for one client.  Now, I have done this MANY times, so I thought that &#8220;the 2 day old backup is sufficient&#8221; and did not wait 3-4 hours for a new backup right before what should have been a 10 minute task.</p>
<p>I went to expand the virtual disks from the COS and noticed that there were some &#8220;Virtual-Disk-000001-delta.vmdk&#8221; and &#8220;Virtual-Disk-000001.vmdk&#8221; files present.</p>
<p><em>&#8220;Oh, a snapshot is here for some reason..?&#8221;</em>, I pondered.  I then went into the VI3 management console, drilled down to said VPS and went to the snapshot manager, expecting to find a snapshot and then simply commit it to the main disk so I could get back to expanding.</p>
<p><span id="more-120"></span> What I found was &#8220;No Snapshots for this Virtual Server&#8221;.</p>
<p>Hmmm&#8230;.. <em>&#8220;maybe they are old snapshot files that should have been deleted, but weren&#8217;t&#8221;</em>, I further mused.  And =IF= it is a snapshot, surely vmkfstools will not let me run a dangerous or incompatible command.  So off I went to expand this virtual disk by another 100GB.</p>
<p>- Expand disk:</p>
<p><strong>&#8220;vmkfstools -X 220G Virtual-Disk.vmdk&#8221;</strong></p>
<p>Expansion done.  All looks good.  Fire up VPS&#8230;..</p>
<p>&#8220;Sorry VPS can&#8217;t be started because one of the base files that a given snapshot is based on has been modified and thus can&#8217;t be mounted&#8221;.</p>
<p><strong> &#8221; *^&amp;#*^&amp;@(*&amp;!(@&amp;(!*&amp;(&amp;! &#8220;</strong></p>
<p>Ok, no harm no foul.  The actual disk is not changed.  Doing an expand with vmkfstools just adds a marker for more size&#8230; surely I can just remove the extra addition, &#8216;rollback the expansion&#8217; so to speak and all will be spiffy?</p>
<p>Nup.  Even though I knew in the back of my head that shrinking a VMDK was NOT POSSIBLE in ESX3 as it was in ESX2.5, I still went searching in the faint hope that I had overlooked some trick during past information gathering exercises when I was not under so much pressure and panic as I was this time.</p>
<p>No dice.  What I knew was confirmed.  I can&#8217;t shrink it.  I can&#8217;t even load up ghost and mirror, because the main problem is that this <em>Virtual-Disk-000001-delta.vmdk</em> file should be appended to the end.  And seeing as it was 25GB in size, for what is a 100GB virtual disk, and the date stamp was some 3 months prior &#8211; there is A LOT of data and changes that are at risk now.</p>
<p><strong> &#8221; *^&amp;#*^&amp;@(*&amp;!(@&amp;(!*&amp;(&amp;! &#8220;</strong></p>
<p>OK, on to Google again.  It took a lot of searching and query refining, because information was scant, which surprised me once I found out that these are the #4 and #5 global support issues for VMWARE.  I did manage to find a couple of blogs with very brief reviews of the recent VMWorld summit, lacking in all technical detail.</p>
<p>So with that hook, I then started to search for detailed info from that summit and managed to get a PPT file from one of the developers.  And inside were all the details that I needed.  Or thought that I needed.  Because with any system as complicated as VMWARE, definitions of words and correct semantics can make it very difficult to get a clear grasp of one problem, versus a slight variation of it.  And even a slight change can come with very different procedures to use, and using the wrong ones could make a problem worse.  First rule &#8211; do no more harm.</p>
<p>I then went to the page that was titled <strong>&#8220;Expanding the size of a VMDK with an existing Snapshot&#8221;</strong>.  I did not know if this meant, <em>&#8220;how to expand a VMDK with an existing snapshot and keep it intact&#8221;</em>, or <em>&#8220;How to recover from a monumental screw up that only an idiot would do, when expecting vmkfstools to do all due diligence for him and has fucked up the VMDK that happened to have a current and active snapshot that wasn&#8217;t committed to the main VMDK file first&#8221;</em></p>
<p>I assumed it meant the latter, being &#8220;tech support&#8221; and &#8220;high rating&#8221;&#8230; if it were documentation of a feature or process it would have been, well, better documented.</p>
<p>The procedure is this:</p>
<p>- After I was an idiot and issued this command to cause all the problems:</p>
<p><strong><em> &#8220;vmkfstools -X 220G Virtual-Disk.vmdk&#8221; </em></strong></p>
<p>- Check the &#8220;Virtual-Disk.vmdk&#8221; file with vi and look for the following lines:</p>
<p><em><strong> RW 482344960 VMFS &#8220;Virtual-Disk-flat.vmdk&#8221;</strong></em></p>
<p>- Now check the &#8220;Virtual-Disk-000001.vmdk&#8221; file and look for the following lines:</p>
<p><em><strong> RW 209715200 VMFSSPARSE &#8220;Virtual-Disk-000001-delta.vmdk&#8221;</strong></em></p>
<p>What we now know is the current RW value on the newly expanded &#8220;Virtual-Disk.vmdk&#8221;, and it is 482344960.  We want to &#8216;trick&#8217; the system into thinking that the expand never happened, so we replace that value with the one we got from the snapshot&#8217;s &#8220;Virtual-Disk-000001.vmdk&#8221; file.  So we replace 482344960 with 209715200.</p>
<p>- Now we need to commit all snapshots:</p>
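<p>As a rough cross-check before editing anything: the RW value in a VMDK descriptor is a count of 512-byte sectors, so both numbers can be turned back into sizes.  The following is a minimal sketch only, assuming a plain POSIX shell with cp and sed; the helper names <em>sectors_to_gib</em> and <em>set_rw_sectors</em> are made up for illustration and are not VMware tools.  It keeps a backup copy of the descriptor before changing it.</p>

```shell
# Sketch only: the sector counts are the ones quoted in this post.

# Convert a descriptor RW value (512-byte sectors) to whole GiB.
sectors_to_gib() {
    echo $(( $1 * 512 / 1024 / 1024 / 1024 ))
}

# Swap the extent size in a descriptor, keeping a backup copy first.
# usage: set_rw_sectors DESCRIPTOR OLD_SECTORS NEW_SECTORS
set_rw_sectors() {
    cp -p "$1" "$1.bak" || return 1
    sed "s/^RW $2 /RW $3 /" "$1.bak" > "$1"
}

sectors_to_gib 209715200    # prints 100
sectors_to_gib 482344960    # prints 230

# Hypothetical call, left commented out so nothing is modified by accident:
# set_rw_sectors Virtual-Disk.vmdk 482344960 209715200
```

<p>Note that 482344960 sectors works out to 230 GiB, so use whatever value your own descriptor shows rather than trusting the figures quoted here.</p>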
<p><strong><em> &#8220;vmware-cmd /vmfs/volumes/VMFSVOLUME/VPS/VPS.vmx removesnapshots&#8221;</em></strong></p>
<p>Unfortunately I was not done yet.  The system reported back that the virtual machine &#8220;VPS.vmx&#8221; did not have any snapshots present!  <em>&#8220;Ah ha&#8221;</em> I thought.  While this is not good, it is also the reason why vmkfstools went on and screwed everything at the start.  There is a snapshot there &#8211; that is a fact &#8211; but the system does not believe so.</p>
<p>This is where global common VMWARE problem #5 comes in, <strong>&#8220;Corrupted .VMSD file&#8221;</strong>.  In a nutshell this means that the file that tracks all this snapshot info (amongst other tidbits) is somehow compromised.  So a new one is needed.  This is also fairly simple once you know how:</p>
<p>- First rename the current VMSD file:</p>
<p><strong> mv VPS.vmsd VPS.vmsd.old</strong></p>
<p>- Now create a new snapshot to force the system to generate a new all-encompassing VMSD file:</p>
<p><strong>&#8220;vmware-cmd VPS.vmx createsnapshot addedforrecovery 'You are an IDIOT'&#8221; </strong></p>
<p>- Now commit all snapshots like we wanted to do before anyway.  You have to commit them all:</p>
<p><strong>&#8220;vmware-cmd VPS.vmx removesnapshots&#8221;</strong></p>
<p>Now that all the snapshots are committed (the original one and the temp one we made to help recreate the VMSD file) we can get back to fixing our disk expansion problem.  This is as simple as re-running the initial vmkfstools expand command, the one that caused all the problems.  It is needed so that the correct RW values are set in &#8220;Virtual-Disk.vmdk&#8221;, because in the end the virtual disk IS expanded already.</p>
<p>- So issue the command:</p>
<p><strong>&#8220;vmkfstools -X 220G Virtual-Disk.vmdk&#8221; </strong></p>
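<p>The order of the recovery steps matters, so here is the whole sequence in one place as a dry-run sketch.  By default it only prints the commands; <em>DRY_RUN</em>, <em>run</em> and <em>recover_plan</em> are names invented for this illustration, and the snapshot name and description are arbitrary.  It uses the vmkfstools form of the expand command quoted earlier, and assumes the RW value in the descriptor has already been put back by hand.</p>

```shell
# Dry-run sketch of the recovery order described in this post.
# DRY_RUN, run() and recover_plan() are illustrative names, not VMware tooling.
DRY_RUN=${DRY_RUN:-1}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# usage: recover_plan VM_NAME DISK_DESCRIPTOR NEW_SIZE
recover_plan() {
    # 1. Move the suspect snapshot database out of the way.
    run mv "$1.vmsd" "$1.vmsd.old" || return 1
    # 2. Take a throwaway snapshot so a fresh VMSD file gets written.
    run vmware-cmd "$1.vmx" createsnapshot addedforrecovery "temporary snapshot" || return 1
    # 3. Commit every snapshot, old and new.
    run vmware-cmd "$1.vmx" removesnapshots || return 1
    # 4. Re-run the expand so the descriptor matches the real disk size.
    run vmkfstools -X "$3" "$2"
}

recover_plan VPS Virtual-Disk.vmdk 220G
```

<p>Run it from the virtual machine&#8217;s directory and read the printed plan first; only set DRY_RUN=0 once the file names match what is actually on your VMFS volume.</p>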
<p>In the end, I am NOT STUPID enough to try and expand a virtual disk with a snapshot.  However if you DO SEE delta files in your file system, do not trust the VI3 client&#8217;s snapshot manager if it says &#8220;No Snapshots present&#8221;.  As a matter of caution, I would follow the process above to recreate a new VMSD file to be sure and commit the temporary and any other snapshots that may exist.  Then you can go on and expand your disks.</p>
<p>Also, make sure that you have backups.  I did, although they weren&#8217;t totally fresh, and the client was not too upset when briefed on the situation, but it could have been much worse.</p>
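<p>A check like this is easy to script so it happens before every expand.  A minimal sketch, assuming the virtual machine&#8217;s files all live in one directory; <em>has_delta_files</em> and <em>safe_to_expand</em> are hypothetical helper names, and the path in the usage comment is just the placeholder used earlier in this post.</p>

```shell
# has_delta_files DIR: succeeds if DIR holds any *-delta.vmdk files.
has_delta_files() {
    ls "$1"/*-delta.vmdk >/dev/null 2>/dev/null
}

# safe_to_expand DIR: refuse to go on while snapshot delta files exist.
safe_to_expand() {
    if has_delta_files "$1"; then
        echo "delta files found in $1: commit snapshots before expanding"
        return 1
    fi
    echo "no delta files in $1"
}

# Example (placeholder path):
# safe_to_expand /vmfs/volumes/VMFSVOLUME/VPS
```

<p>The point of testing the file system rather than asking the snapshot manager is exactly the failure described above: the delta files are the ground truth, the VMSD file is only a record of them.</p>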
<p><strong>ALWAYS BACKUP! </strong></p>
<p><strong>DON&#8217;T LET A JUNIOR TECH TOUCH THINGS!</strong></p>
<p><strong>TAKE THE TIME TO RELAX AND ASSESS THE SITUATION BEFORE YOU POSSIBLY MAKE IT WORSE! </strong></p>
]]></content:encoded>
			<wfw:commentRss>http://www.utilitycomputing.com.cn/the-cloud/recover-from-expanded-disk-with-existing-snapshot-or-corrupted-snapshots/feed</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>When 73GB is not 73GB!  Enter LVM</title>
		<link>http://www.utilitycomputing.com.cn/fossgnulinux/when-73gb-is-not-73g</link>
		<comments>http://www.utilitycomputing.com.cn/fossgnulinux/when-73gb-is-not-73g#comments</comments>
		<pubDate>Sat, 06 Oct 2007 20:38:25 +0000</pubDate>
		<dc:creator>richard</dc:creator>
				<category><![CDATA[FOSS/GNU/Linux]]></category>
		<category><![CDATA[array]]></category>
		<category><![CDATA[dell]]></category>
		<category><![CDATA[esx]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[LVM]]></category>
		<category><![CDATA[raid]]></category>
		<category><![CDATA[servers]]></category>
		<category><![CDATA[vmware]]></category>

		<guid isPermaLink="false">http://www.utilitycomputing.com.cn/?p=99</guid>
		<description><![CDATA[Thought I should write something tech for a change! It is golden week here and all are away on break. So instead of forcing a staff member to come back, I thought I would take care of some stuff myself. My problems started when a client who has a large advertising cluster, was running their [...]]]></description>
			<content:encoded><![CDATA[<p>Thought I should write something tech for a change!  <img src='http://www.utilitycomputing.com.cn/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p>It is golden week here and all are away on break.  So instead of forcing a staff member to come back, I thought I would take care of some stuff myself.</p>
<p>My problems started when a client who has a large advertising cluster, was running their main statistics database (for click fraud detection) on a Dell 1950 with only 1 SAS 15K drive.</p>
<p>I had suggested that this node, not being part of a redundant set like the Tomcat servers, be made individually redundant: DRAC card, redundant power and RAID.</p>
<p>Anyway, some new blades, Dell 1955s, arrived for the cluster and I thought, well, let&#8217;s save the client some money: why not image the old 1950 DB server and load it onto a new 1955 server?</p>
<p>I thought this would be simple with Acronis.</p>
<p>No it wasn&#8217;t.</p>
<p>It turns out that a 3.5 inch 73GB SAS drive is not the same size as a 2.5 inch 73GB SAS drive.  So I could not write my system image to the blade&#8217;s RAID 1 array of 2 x 15K 73GB SAS drives.</p>
<p><span id="more-99"></span> Shite!  If I reinstall the DB server it is not worth my time.  Cheaper to buy the upgrade parts for the 1950.</p>
<p>Then I thought, well, I have LVM, I should be able to do this, after all I have used LVM before many times on large storage arrays.</p>
<p>So my goal was this: I needed a system image that was a couple hundred megs smaller than it was now, so it would go onto the 1955s OK.</p>
<p>This is where I added in VMWARE to the mix and made this an easy task.  The steps are below:</p>
<ol>
<li>Image the 1950 server to some ACRONIS TIB files somewhere.  I used FTP</li>
<li>Image the TIB file to a new VPS made with a 73GB virtual disk</li>
<li>Create and attach a new 65GB virtual disk to the virtual machine</li>
<li>Image the MBR and /boot partitions using acronis onto the new 65GB virtual disk</li>
<li>Boot virtual machine with a rescue/live CD</li>
<li>Load FDISK for /dev/sdb and create a new LVM (Type 8e) partition in the remaining space on the 65GB virtual drive</li>
<li>Enter LVM with the <em>&#8220;lvm&#8221;</em> command</li>
<li>Activate all Volume Groups with the command <em>&#8220;vgchange -a y&#8221;</em></li>
<li>EXIT out of LVM and then run this command to resize the EXT3 file system, <em>&#8220;resize2fs /dev/VolGroup00/LogVol00 40G&#8221;</em>; you may have to run <em>&#8220;e2fsck -f /dev/VolGroup00/LogVol00&#8221; </em>first too</li>
<li>Enter LVM again with the <em>&#8220;lvm&#8221;</em> command</li>
<li>Now we can reduce the Logical Volume that had the recently shrunk EXT3 file system on it with this command, <em>&#8220;lvreduce -L 45G /dev/VolGroup00/LogVol00&#8221;</em></li>
<li>Because we are making a new custom boot image and we have already imaged over the /boot partition and MBR, we now want our old 73GB virtual drive to not have any of the same markings as our embryonic new 65GB virtual drive.  To do this we need to change the Volume Group and Logical Volume names to something new:</li>
<li><em>lvrename VolGroup00 LogVol00 LogVol10</em></li>
<li><em>lvrename VolGroup00 LogVol01 LogVol11</em></li>
<li><em>lvchange -a n VolGroup00/LogVol10</em></li>
<li><em>lvchange -a n VolGroup00/LogVol11</em></li>
<li><em>vgrename VolGroup00 VolGroup10</em></li>
<li>Now we can create the new Logical Volumes and Volume Groups on the 65GB virtual disk in preparation for cloning our now 40GB EXT3 file system</li>
<li>Make a new Physical Volume first with <em>&#8220;pvcreate /dev/sdb2&#8221;</em></li>
<li><em>pvscan</em></li>
<li><em>vgcreate VolGroup00 /dev/sdb2</em></li>
<li><em>vgscan</em></li>
<li><em>lvcreate VolGroup00 /dev/sdb2 -n LogVol01 -L 2G</em></li>
<li><em>lvcreate VolGroup00 /dev/sdb2 -n LogVol00 -L 50G</em></li>
<li><em>lvscan</em></li>
<li>Now we need to bring our new Logical Volumes online and make them visible to the system, so we run vgchange again,  <em>&#8220;vgchange -a y&#8221;</em></li>
<li>EXIT</li>
<li>Back at the command prompt we now need to set up our new SWAP partition, so we issue the command, <em>&#8220;mkswap /dev/VolGroup00/LogVol01&#8221;</em></li>
<li>Now we can clone our old 40GB EXT3 partition that we shrunk to our new LVM, which is larger than 40GB but smaller than 65GB, so it will image onto the PE1955 2.5 inch SAS drive array</li>
<li>We use an old favourite for this, <em>&#8220;dd if=/dev/VolGroup10/LogVol10 of=/dev/VolGroup00/LogVol00&#8221;</em></li>
<li>Once done, shutdown the VPS, boot up in Acronis and image the new 65GB virtual drive and then load it onto the PE1955 server.  DONE!</li>
</ol>
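<p>The order of the resize steps is what keeps the data safe: the file system is shrunk (40G) before the Logical Volume underneath it (45G), and the new target volume (50G) has to be at least as big as the volume dd copies from.  A minimal sketch of that sanity check follows; <em>check_plan</em> is a hypothetical helper, the sizes are whole GiB taken from the steps above, and it only compares numbers without touching any disk.  It also ignores the /boot partition and LVM metadata, so treat the last check as a rough one.</p>

```shell
# Sizes in GiB, taken from the steps in this post.
fs_size=40      # resize2fs target for the EXT3 file system
src_lv=45       # lvreduce target for the old LogVol00 (later LogVol10)
dst_lv=50       # new LogVol00 on the 65GB virtual disk
swap_lv=2       # new LogVol01 (swap)
disk=65         # size of the new virtual disk

# usage: check_plan FS SRC_LV DST_LV SWAP_LV DISK
check_plan() {
    # The file system must fit inside the reduced source volume.
    [ "$1" -le "$2" ] || { echo "filesystem larger than reduced LV"; return 1; }
    # dd copies the whole source volume, so the target must hold all of it.
    [ "$2" -le "$3" ] || { echo "source LV will not fit in target LV"; return 1; }
    # The new volumes together must leave room on the new disk.
    [ $(( $3 + $4 )) -lt "$5" ] || { echo "new LVs exceed the disk"; return 1; }
    echo "plan ok"
}

check_plan "$fs_size" "$src_lv" "$dst_lv" "$swap_lv" "$disk"    # prints: plan ok
```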
<p>I did all of this in single user mode so as to minimise the need for KUDZU to rescan and change all hardware.  I can do this because of the hardware commonality in the 9th generation Dell servers.</p>
<p>Also once done, the new LVM will be smaller than the full capacity of the 73GB 2.5 inch SAS drive.  This is easily fixed while online by making a new partition /dev/sda3 of LVM type (8e), making it into a Physical Volume, adding it to VolGroup00 and then extending the LogVol00 logical volume with the newly added extents.   Once that is done, go back to the command prompt and use the ext2online command to finally expand your EXT3 partition to use all the space on the LVM.</p>
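<p>That final grow step could look roughly like the dry-run sketch below.  It only prints the commands by default; <em>DRY_RUN</em>, <em>run</em> and <em>grow_plan</em> are names invented for this illustration.  Creating /dev/sda3 with fdisk is left as a manual step, and the <em>-l +100%FREE</em> form of lvextend is an assumption about your LVM2 version; on an older release, give an explicit size with -L instead.</p>

```shell
# Dry-run sketch of growing the volume once the image is on the blade.
# DRY_RUN, run() and grow_plan() are illustrative names only.
DRY_RUN=${DRY_RUN:-1}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# usage: grow_plan NEW_PARTITION VOLUME_GROUP LOGICAL_VOLUME
grow_plan() {
    run pvcreate "$1" || return 1                         # new partition becomes a Physical Volume
    run vgextend "$2" "$1" || return 1                    # add it to the Volume Group
    run lvextend -l +100%FREE "/dev/$2/$3" || return 1    # hand the free extents to the LV
    run ext2online "/dev/$2/$3"                           # grow the mounted EXT3 file system
}

grow_plan /dev/sda3 VolGroup00 LogVol00
```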
]]></content:encoded>
			<wfw:commentRss>http://www.utilitycomputing.com.cn/fossgnulinux/when-73gb-is-not-73g/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
