<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Utility Computing dot China &#187; array</title>
	<atom:link href="http://www.utilitycomputing.com.cn/tag/array/feed" rel="self" type="application/rss+xml" />
	<link>http://www.utilitycomputing.com.cn</link>
	<description>Data Carnival</description>
	<lastBuildDate>Mon, 26 Jul 2010 08:59:29 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Drive Roaming.  DELL PERC4 Controllers</title>
		<link>http://www.utilitycomputing.com.cn/uncategorized/drive-roaming-dell-perc4-controllers</link>
		<comments>http://www.utilitycomputing.com.cn/uncategorized/drive-roaming-dell-perc4-controllers#comments</comments>
		<pubDate>Sun, 18 Nov 2007 15:31:28 +0000</pubDate>
		<dc:creator>richard</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[array]]></category>
		<category><![CDATA[big iron]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[raid]]></category>
		<category><![CDATA[servers]]></category>

		<guid isPermaLink="false">http://www.utilitycomputing.com.cn/?p=124</guid>
		<description><![CDATA[During a recent test run to see if a new PostgreSQL back end server would hasten things up in a main cluster &#8211; that has now become CPU bound and NOT IO&#8230;&#8230; the wizardry of that I will blog about later. In any case, the short of it is, that we were juggling PERC4 cards [...]]]></description>
			<content:encoded><![CDATA[<p>During a recent test run to see if a new PostgreSQL back end server would hasten things up in a main cluster &#8211; that has now become CPU bound and NOT IO&#8230;&#8230; the wizardry of that I will blog about later.</p>
<p>In any case, the short of it is that we were juggling PERC4 cards between servers (PCI-X here, PCIe there) along with complete RAID 1 and RAID 10 arrays.  The cards are supposed to &#8220;detect&#8221; the correct array type from the drives if the configuration in the card&#8217;s firmware was missing.  Anyway, through a comedy of errors, it worked exactly one time in three.  The other times we had to remember the exact settings of our arrays (stripe size, etc.) and how each one was structured, so that we could clear the PERC cards and then recreate the arrays &#8211; taking special care not to initialise the new arrays, since that would wipe the data.</p>
<p>So in the end, you can move arrays and channels about.  And with LVM, even device names like /dev/sda and /dev/sdb being reordered is not an issue.  However, you should rely on the good old-fashioned manual way of doing things: before you start, write down all the salient details of your arrays.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.utilitycomputing.com.cn/uncategorized/drive-roaming-dell-perc4-controllers/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>When 73GB is not 73GB!  Enter LVM</title>
		<link>http://www.utilitycomputing.com.cn/fossgnulinux/when-73gb-is-not-73g</link>
		<comments>http://www.utilitycomputing.com.cn/fossgnulinux/when-73gb-is-not-73g#comments</comments>
		<pubDate>Sat, 06 Oct 2007 20:38:25 +0000</pubDate>
		<dc:creator>richard</dc:creator>
				<category><![CDATA[FOSS/GNU/Linux]]></category>
		<category><![CDATA[array]]></category>
		<category><![CDATA[dell]]></category>
		<category><![CDATA[esx]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[LVM]]></category>
		<category><![CDATA[raid]]></category>
		<category><![CDATA[servers]]></category>
		<category><![CDATA[vmware]]></category>

		<guid isPermaLink="false">http://www.utilitycomputing.com.cn/?p=99</guid>
		<description><![CDATA[Thought I should write something tech for a change! It is golden week here and all are away on break. So instead of forcing a staff member to come back, I thought I would take care of some stuff myself. My problems started when a client who has a large advertising cluster, was running their [...]]]></description>
			<content:encoded><![CDATA[<p>Thought I should write something tech for a change!  <img src='http://www.utilitycomputing.com.cn/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p>It is Golden Week here and everyone is away on break.  So instead of forcing a staff member to come back, I thought I would take care of some stuff myself.</p>
<p>My problems started when a client with a large advertising cluster was running their main statistics database (for click fraud detection) on a Dell 1950 with only one 15K SAS drive.</p>
<p>I had suggested that this node, which is not one of a redundant set like the Tomcat servers, be made redundant in its own right: DRAC card, redundant power and RAID.</p>
<p>Anyway, some new blades, Dell 1955s, arrived for the cluster and I thought, well, let&#8217;s save the client some money: image the old 1950 DB server and load it onto a new 1955 server.</p>
<p>I thought this would be simple with Acronis.</p>
<p>No it wasn&#8217;t.</p>
<p>It turns out that a 3.5-inch 73GB SAS drive is not the same size as a 2.5-inch 73GB SAS drive.  So I could not write my system image to the blade&#8217;s RAID 1 array of 2 x 15K 73GB SAS drives.</p>
<p><span id="more-99"></span> Shite!  Reinstalling the DB server is not worth my time; it would be cheaper to buy the upgrade parts for the 1950.</p>
<p>Then I thought, well, I have LVM, I should be able to do this, after all I have used LVM before many times on large storage arrays.</p>
<p>So my goal was this: I needed a system image a couple of hundred megabytes smaller than it was now, so that it would fit onto the 1955s.</p>
<p>This is where I added VMware to the mix, which made this an easy task.  The steps are below:</p>
<ol>
<li>Image the 1950 server to some Acronis TIB files somewhere.  I used FTP</li>
<li>Restore the TIB file to a new virtual machine made with a 73GB virtual disk</li>
<li>Create and attach a new 65GB virtual disk to the virtual machine</li>
<li>Image the MBR and /boot partitions onto the new 65GB virtual disk using Acronis</li>
<li>Boot the virtual machine with a rescue/live CD</li>
<li>Run fdisk on /dev/sdb and create a new LVM (type 8e) partition in the remaining space on the 65GB virtual drive</li>
<li>Enter LVM with the <em>&#8220;lvm&#8221;</em> command</li>
<li>Activate all Volume Groups with the command <em>&#8220;vgchange -a y&#8221;</em></li>
<li>Exit the LVM shell and resize the EXT3 file system with <em>&#8220;resize2fs /dev/VolGroup00/LogVol00 40G&#8221;</em>; you may have to run <em>&#8220;e2fsck -f /dev/VolGroup00/LogVol00&#8221;</em> first</li>
<li>Enter LVM again with the <em>&#8220;lvm&#8221;</em> command</li>
<li>Now we can reduce the Logical Volume that holds the recently shrunk EXT3 file system, leaving it a little larger than the 40GB file system inside it, with this command: <em>&#8220;lvreduce -L 45G /dev/VolGroup00/LogVol00&#8221;</em></li>
<li>Because we are making a new custom boot image and we have already imaged over the /boot partition and MBR, we now want our old 73GB virtual drive to not have any of the same markings as our embryonic new 65GB virtual drive.  To do this we need to change the Volume Group and Logical Volume names to something new:</li>
<li><em>lvrename VolGroup00 LogVol00 LogVol10</em></li>
<li><em>lvrename VolGroup00 LogVol01 LogVol11</em></li>
<li><em>lvchange -a n VolGroup00/LogVol10</em></li>
<li><em>lvchange -a n VolGroup00/LogVol11</em></li>
<li><em>vgrename VolGroup00 VolGroup10</em></li>
<li>Now we can create the new Volume Group and Logical Volumes on the 65GB virtual disk in preparation for cloning our now 40GB EXT3 file system</li>
<li>Make a new Physical Volume first with <em>&#8220;pvcreate /dev/sdb2&#8221;</em></li>
<li><em>pvscan</em></li>
<li><em>vgcreate VolGroup00 /dev/sdb2</em></li>
<li><em>vgscan</em></li>
<li><em>lvcreate -n LogVol01 -L 2G VolGroup00 /dev/sdb2</em></li>
<li><em>lvcreate -n LogVol00 -L 50G VolGroup00 /dev/sdb2</em></li>
<li><em>lvscan</em></li>
<li>Now we need to bring our new Logical Volumes online and make them visible to the system, so we run the vgchange command again: <em>&#8220;vgchange -a y&#8221;</em></li>
<li>Exit the LVM shell</li>
<li>Back at the command prompt we now need to set up our new swap volume, so we issue the command <em>&#8220;mkswap /dev/VolGroup00/LogVol01&#8221;</em></li>
<li>Now we can clone the old EXT3 file system that we shrank to 40GB onto our new Logical Volume, which is larger than 40GB but sits on a disk smaller than 65GB, so the result will image onto the PE1955&#8217;s 2.5-inch SAS drive array</li>
<li>We use an old favourite for this: <em>&#8220;dd if=/dev/VolGroup10/LogVol10 of=/dev/VolGroup00/LogVol00&#8221;</em></li>
<li>Once done, shut down the virtual machine, boot it up in Acronis, image the new 65GB virtual drive and then load that image onto the PE1955 server.  DONE!</li>
</ol>
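<p>The LVM part of the steps above can be sketched as one script.  This is only a dry-run sketch under assumptions, not the exact session I typed: the <em>run</em> helper, the <em>DRY_RUN</em> switch, the variable names and the <em>bs=1M</em> on dd are my additions for illustration, and it assumes the old disk&#8217;s volume group is VolGroup00 and the new LVM partition is /dev/sdb2.  By default it only prints the commands.</p>

```shell
#!/bin/sh
# Dry-run sketch of the LVM shrink-and-clone steps. By default each command
# is only printed; set DRY_RUN=0 to execute (as root, from a rescue CD).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

VG=VolGroup00        # name the boot image expects
OLD_VG=VolGroup10    # temporary name for the old 73GB disk's volume group
NEW_PV=/dev/sdb2     # LVM partition on the 65GB virtual disk (assumed)

shrink_and_clone() {
    run vgchange -a y
    # Shrink the file system first, then the logical volume around it.
    run e2fsck -f "/dev/$VG/LogVol00"
    run resize2fs "/dev/$VG/LogVol00" 40G
    run lvreduce -L 45G "/dev/$VG/LogVol00"
    # Rename the old volumes so the original names are free for the new disk.
    run lvrename "$VG" LogVol00 LogVol10
    run lvrename "$VG" LogVol01 LogVol11
    run lvchange -a n "$VG/LogVol10"
    run lvchange -a n "$VG/LogVol11"
    run vgrename "$VG" "$OLD_VG"
    # Build the new, smaller layout on the 65GB disk.
    run pvcreate "$NEW_PV"
    run vgcreate "$VG" "$NEW_PV"
    run lvcreate -n LogVol01 -L 2G "$VG" "$NEW_PV"
    run lvcreate -n LogVol00 -L 50G "$VG" "$NEW_PV"
    run vgchange -a y
    run mkswap "/dev/$VG/LogVol01"
    # Copy the shrunken volume into the new one (bs=1M is an assumption).
    run dd "if=/dev/$OLD_VG/LogVol10" "of=/dev/$VG/LogVol00" bs=1M
}

shrink_and_clone
```

<p>Read the printed list against your own volume names before running anything for real; lvreduce and dd are unforgiving if pointed at the wrong volume.</p>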
<p>I did all of this in single user mode so as to minimise the need for kudzu to rescan and change all the hardware.  I can do this because of the hardware commonality in the 9th generation Dell servers.</p>
<p>Also, once done, the new LVM volume will be smaller than the full capacity of the 2.5-inch 73GB SAS drive.  This is easily fixed while online by making a new partition /dev/sda3 of LVM type (8e), making it into a Physical Volume, adding it to VolGroup00 and then extending the LogVol00 logical volume with the newly added extents.  Once that is done, go back to the command prompt and use the ext2online command to finally expand your EXT3 file system to use all the space on the logical volume.</p>
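<p>As a rough sketch of that online grow, assuming the new partition really is /dev/sda3 and has already been created with fdisk: the <em>run</em> helper and <em>DRY_RUN</em> switch are illustrative additions, and the script only prints the commands unless told otherwise.  Newer e2fsprogs releases do the online grow with resize2fs instead of ext2online, so treat the last line as dependent on your distribution.</p>

```shell
#!/bin/sh
# Dry-run sketch: grow VolGroup00/LogVol00 into a newly added partition.
# Set DRY_RUN=0 to execute the commands instead of printing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

grow_online() {
    new_part="$1"
    run pvcreate "$new_part"                            # make it a Physical Volume
    run vgextend VolGroup00 "$new_part"                 # add it to the volume group
    run lvextend -l +100%FREE /dev/VolGroup00/LogVol00  # hand all free extents to the LV
    run ext2online /dev/VolGroup00/LogVol00             # grow the mounted EXT3 file system
}

grow_online /dev/sda3
```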
]]></content:encoded>
			<wfw:commentRss>http://www.utilitycomputing.com.cn/fossgnulinux/when-73gb-is-not-73g/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>PostgreSQL Upgrade Part 3</title>
		<link>http://www.utilitycomputing.com.cn/fossgnulinux/postgresql-upgrade-part-3</link>
		<comments>http://www.utilitycomputing.com.cn/fossgnulinux/postgresql-upgrade-part-3#comments</comments>
		<pubDate>Sat, 01 Sep 2007 12:40:16 +0000</pubDate>
		<dc:creator>richard</dc:creator>
				<category><![CDATA[FOSS/GNU/Linux]]></category>
		<category><![CDATA[array]]></category>
		<category><![CDATA[array reconstruction]]></category>
		<category><![CDATA[big iron]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[LVM]]></category>
		<category><![CDATA[mirror]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[raid]]></category>
		<category><![CDATA[raid level migration]]></category>
		<category><![CDATA[scsi]]></category>

		<guid isPermaLink="false">http://www.utilitycomputing.com.cn/?p=49</guid>
		<description><![CDATA[I knew the config files were different between 7.4 and 8.1 and that some items merely changed names and some were deprecated. I used this excellent resource before. Even the very rudimentary tweaks I did, and with a RAID array that is in a 90% rebuild rate, background initialisation, this 8.1 version is FAST! I [...]]]></description>
			<content:encoded><![CDATA[<p>I knew the config files were different between 7.4 and 8.1 and that some items merely changed names and some were deprecated.</p>
<p>I used <a href="http://www.powerpostgresql.com/Downloads/annotated_conf_80.html">this excellent resource</a> before.</p>
<p>Even with the very rudimentary tweaks I did, and with the RAID array doing a background initialisation at a 90% rebuild rate, this 8.1 version is FAST!  I have no idea why people use MySQL, it really is such a piece of crud.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.utilitycomputing.com.cn/fossgnulinux/postgresql-upgrade-part-3/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PostgreSQL Upgrade Part 2</title>
		<link>http://www.utilitycomputing.com.cn/fossgnulinux/postgresql-upgrade-part-2</link>
		<comments>http://www.utilitycomputing.com.cn/fossgnulinux/postgresql-upgrade-part-2#comments</comments>
		<pubDate>Sat, 01 Sep 2007 12:03:14 +0000</pubDate>
		<dc:creator>richard</dc:creator>
				<category><![CDATA[FOSS/GNU/Linux]]></category>
		<category><![CDATA[array]]></category>
		<category><![CDATA[array reconstruction]]></category>
		<category><![CDATA[big iron]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[LVM]]></category>
		<category><![CDATA[mirror]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[raid]]></category>
		<category><![CDATA[raid level migration]]></category>
		<category><![CDATA[scsi]]></category>

		<guid isPermaLink="false">http://www.utilitycomputing.com.cn/?p=48</guid>
		<description><![CDATA[Well, the PostgreSQL upgrade was a snap, sorta. I needed to do a full dump and restore as this was a major version change &#8211; no surprises there. What pissed me off though, is that when using the binary data type for dump files when using pg_dump (&#8220;-F c&#8221;) the resulting backup file is of [...]]]></description>
			<content:encoded><![CDATA[<p>Well, the PostgreSQL upgrade was a snap, sorta.  I needed to do a full dump and restore as this was a major version change &#8211; no surprises there.  What pissed me off, though, is that when using pg_dump&#8217;s custom (binary) format for dump files (&#8220;-F c&#8221;), the resulting backup file is of no use to remote workers who aren&#8217;t at the actual console.</p>
<p>Let me expand on this;</p>
<p>This type of backup file is advertised as &#8220;more convenient&#8221; and offers more options at restore time: selective data restores, data reordering, index tricks and the like.  However, no matter WHAT I did, pg_restore sent a copy of all the data being restored to standard output too!!  This means that, basically, I was going to have the full text of 20GB worth of database data shoved down my SSH session!</p>
<p>Yes &#8211; this makes the whole affair much slower!</p>
<p><span id="more-48"></span></p>
<p>Luckily, being the lateral thinking kind of dude that I am, I never put all my eggs in one basket.  That is just a recipe for data omelette.</p>
<p>I also had some plain text file dumps made with pg_dump.  So I went into a PostgreSQL session on template1 and used the &#8220;\i&#8221; command to import my big .SQL file.  Perfect!  Only important updates were sent to standard output and not the whole damn enchilada.</p>
<p>The disk restore is still going on.  At a maximum throughput of 20MB a second over a 1000BaseT network, that is at best 1.2GB a minute or 72GB an hour, so approximately 3.5 hours for 250GB.  However, I was still doing a RAID array background initialisation, so even after setting the rebuild rate to 10%, the system still needs a lot of time to move the files back &#8211; I didn&#8217;t bother to do the maths, because at 3.5 hours or 10 hours it would breach my maintenance window either way.</p>
<p>Because the end of my advertised downtime window was rapidly approaching, I had no choice but to let the copy proceed, redirect the main cluster to access the data store via NFS over the network, and bring services back online.  I will then be able to do an rsync later, in less than 1 hour, to bring the haphazardly copied set on the main cluster in line with the now used and modified data store on my hot standby server.</p>
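<p>For reference, a dry-run sketch of the two restore routes.  The database name and file paths are hypothetical, and the <em>run</em> helper only prints the commands by default.  One thing I am sure of: pg_restore writes the restore script to standard output when it is not given a target database with -d.  Whether that is what happened in the session above is only a guess, since the exact command line is not recorded here.</p>

```shell
#!/bin/sh
# Dry-run sketch of a custom-format dump/restore next to a plain-SQL restore.
# DB and the file paths are hypothetical; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

DB=officedb                      # hypothetical database name
DUMP=/backup/officedb.dump       # custom-format archive
SQL=/backup/officedb.sql         # plain-text dump

restore_plan() {
    # Custom-format dump: -F c selects the format, -f names the output file.
    run pg_dump -F c -f "$DUMP" "$DB"
    # Restore straight into a database; without -d the SQL goes to stdout.
    run createdb "$DB"
    run pg_restore -d "$DB" "$DUMP"
    # Plain-text alternative, equivalent to \i inside a psql session.
    run psql -d "$DB" -f "$SQL"
}

restore_plan
```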
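<p>The back-of-envelope sum above is easy to redo for other rates and sizes.  A small sketch, assuming decimal units (1GB = 1000MB) and a sustained rate with no other load on the array; the function name is made up for illustration.</p>

```shell
#!/bin/sh
# Estimate copy time from a sustained rate in MB/s and a data size in GB.
# Assumes decimal units: 1 GB = 1000 MB.
transfer_hours() {
    awk -v rate="$1" -v size="$2" 'BEGIN {
        gb_per_min = rate * 60 / 1000
        gb_per_hour = gb_per_min * 60
        printf "%.1f GB/min, %.0f GB/hour, %.1f hours\n", gb_per_min, gb_per_hour, size / gb_per_hour
    }'
}

# 20 MB/s and 250 GB, the figures from the paragraph above.
transfer_hours 20 250   # prints: 1.2 GB/min, 72 GB/hour, 3.5 hours
```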
]]></content:encoded>
			<wfw:commentRss>http://www.utilitycomputing.com.cn/fossgnulinux/postgresql-upgrade-part-2/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PostgreSQL Upgrade</title>
		<link>http://www.utilitycomputing.com.cn/fossgnulinux/postgresql-upgrade</link>
		<comments>http://www.utilitycomputing.com.cn/fossgnulinux/postgresql-upgrade#comments</comments>
		<pubDate>Fri, 31 Aug 2007 21:35:53 +0000</pubDate>
		<dc:creator>richard</dc:creator>
				<category><![CDATA[FOSS/GNU/Linux]]></category>
		<category><![CDATA[array]]></category>
		<category><![CDATA[array reconstruction]]></category>
		<category><![CDATA[big iron]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[LVM]]></category>
		<category><![CDATA[mirror]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[raid]]></category>
		<category><![CDATA[raid level migration]]></category>
		<category><![CDATA[scsi]]></category>

		<guid isPermaLink="false">http://www.utilitycomputing.com.cn/?p=47</guid>
		<description><![CDATA[Well this week was fun. For some reason one of our main clusters that runs client ASP software for general office, file, email, collaboration, etc&#8230; went crazy. First I noticed that the usual night time &#8220;vacuums&#8221; that are needed to keep the PostgreSQL planner at its most efficient and indexes clean, were running right into [...]]]></description>
			<content:encoded><![CDATA[<p>Well this week was fun.  For some reason one of our main clusters that runs client ASP software for general office, file, email, collaboration, etc&#8230; went crazy.</p>
<p>First I noticed that the usual night time &#8220;vacuums&#8221;, which are needed to keep the PostgreSQL planner at its most efficient and the indexes clean, were running right into the day time!  They usually needed less than an hour for the 20GB database we currently have.</p>
<p>So after many failed attempts to get an online vacuum done, I stayed up really late, took the cluster down and did a FULL vacuum.  Full vacuums are slow, and you can&#8217;t run anything while they happen because they take full table locks, whereas online vacuums do not block normal reads and writes.</p>
<p>Anyway, the database seemed speedier, but the system was still sluggish.  I have all data separated.  Database files are on large RAID10 arrays with U320 SCSI drives spinning at 15K &#8211; split over TWO SCSI buses!  Yeah tis fast.  Big disks are used because, relative to the size of the disk, more of the data then sits on the outer edge of the platters, which moves under the heads faster than the centre.  I also keep PostgreSQL&#8217;s transaction log on a separate RAID1 array of 73GB 15K U320 drives, with a 256MB battery backed cache.</p>
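<p>The two kinds of vacuum can be sketched with the vacuumdb wrapper that ships with PostgreSQL.  The database name is hypothetical and the <em>run</em> helper only prints the commands by default; this is an illustration of the difference, not a record of what I ran that night.</p>

```shell
#!/bin/sh
# Dry-run sketch: routine online vacuum versus an offline full vacuum.
# DB is a hypothetical database name; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

DB=officedb

nightly_vacuum() {
    # Online: reclaims dead rows and refreshes planner statistics.
    run vacuumdb --analyze "$DB"
}

maintenance_vacuum() {
    # Offline: takes exclusive table locks, so run it with the cluster down.
    run vacuumdb --full --analyze "$DB"
}

nightly_vacuum
maintenance_vacuum
```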
<p><span id="more-47"></span></p>
<p>Now database-based operations were zippy again.  The system, while set by me not to rate a disk seek at too high a cost because of the super-fast disk IO that I have, should still get SOME data out of the cache and not run straight to the comparatively sloooow disks.  After doing this all was cool again database-wise.</p>
<p>However, operations that needed to access the data store were still piss slow.  So email ingestion, file usage, etc. crawled.  The data store, which complements the database data, weighs in at about 250GB now.  And this is on a single RAID1 array with 300GB U320 10K drives.  I also remembered that when backing up over a 1000BaseT network to a robotic tape library, the data store array (RAID1) maxes out at about 750MB a minute, while the RAID10 array with the database does 2700+MB a minute.</p>
<p>So I thought, &#8220;let&#8217;s add some more disks to the data store array&#8221;.  I get more space, so relatively speaking more of the data is on the outside edges of all the platters (another topic for what one can do with LVM), and the array has double the heads and spindles too &#8211; plus some RAID0 goodness inside that RAID10 nested set.</p>
<p>Now Dell&#8217;s storage white paper from 2005 does state that a RAID1 to RAID10 migration/RAID level reconstruction is supported, RAID10 being two RAID1 arrays striped together in RAID0 (with the mirror part straddling 2 SCSI buses for speed and channel redundancy).  I put the drives in the chassis and went into OpenManage expecting to be able to &#8220;reconstruct/migrate levels&#8221; to a RAID10, then pop into LVM, create some new data devices, add them to my Volume Group, expand my partition and finally my file system &#8211; all while still being online.  Instead I was greeted with only the option to reconstruct/migrate to RAID5 (Get ^*^*&amp;^) or RAID0 (Uh, yeah, OK..).</p>
<p>So, feeling quite annoyed now, I went and made the two new drives into a new RAID1 set, thinking that I would then be able to add this to a final &#8220;nested&#8221; array with the current RAID1 set and make a RAID10 out of them.  Well, I thought it was working, but it wasn&#8217;t.  All I managed to do was end up with my existing array being made into a RAID0 (heart attack!!) and the new array sitting there untouched.  Also, the OpenManage array management setup gave me no choice of stripe size.  It defaulted to 64K.  I prefer 128K to give records/tuples the best chance of fitting into one whole stripe, which maximises concurrency of head operations on different data records &#8211; and also because the PERC4 family of RAID cards does not suffer any penalty if the stripe size is too big for the data used &#8211; so why not eh?</p>
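<p>To put those two backup rates side by side, here is a small sketch that converts MB per minute into MB per second and into hours for a full pass over the 250GB data store.  It assumes decimal units (1GB = 1000MB), and the function name is made up for illustration.</p>

```shell
#!/bin/sh
# Convert a backup rate in MB/min into MB/s and hours for a given size in GB.
# Assumes decimal units: 1 GB = 1000 MB.
rate_summary() {
    awk -v mb_per_min="$1" -v size_gb="$2" 'BEGIN {
        hours = size_gb * 1000 / mb_per_min / 60
        printf "%.1f MB/s, %.1f hours for %d GB\n", mb_per_min / 60, hours, size_gb
    }'
}

rate_summary 750 250    # RAID1 data store: 12.5 MB/s, 5.6 hours for 250 GB
rate_summary 2700 250   # RAID10 database array: 45.0 MB/s, 1.5 hours for 250 GB
```

<p>So the RAID10 array is moving data about 3.6 times faster than the RAID1 data store over the same network.</p>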
<p>So what am I doing now at 5:32 AM?</p>
<p>Rsyncing all the data to the hot standby server again.  I will then power off the cluster, log in through the DRAC card, go into the BIOS of the RAID card and redo my friggin array from scratch, as RAID10 with a 128K stripe, and then boot up and copy my damn data back.</p>
<p>Not happy pappy.  Not happy.  I think it is SAN time&#8230;unless SANs can be this anal as well?</p>
<p>So what does this have to do with the PostgreSQL upgrade?  Well, since the cluster is down anyway, I might as well go from 7.4 to 8.1, get some of that autovacuum goodness and lap up some of the apparently massive speed boosts from the two years of development between the two versions.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.utilitycomputing.com.cn/fossgnulinux/postgresql-upgrade/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
