
your thoughts on striping


exsnafu

Technical User
Apr 25, 2008
99
US
I'd like to pose a question regarding striping at the LVM layer and on the backend.

We have ESS SAN storage made up of RAID-5 packs carved up and sent out to the LPARs. I don't have a lot of exposure to the backend, but I trust the SAN guys know what they're doing and we don't particularly have any issues.

The issue I have, though, is that I've inherited a bunch of boxes with this backend disk and there's no striping at the LVM layer. I understand the argument that since the storage array is striping under the covers, it's unnecessary at the OS level.

My question, though, is this: is it really unnecessary? The problem I have is that when I look at the distribution on the LVs, pretty much all the used space ends up on one or two of the 8 vpaths in a VG and is slowly growing out to more. Personally I've always striped at the LVM level to ensure equal distribution across all my spindles. I've been told this can actually hurt performance too, but I haven't seen anything to back that up.
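For reference, the two flavours I'm talking about look roughly like this (just a sketch, the VG name, vpath names and LP count are made up):

# true LVM striping: 64K stripe across 4 vpaths
mklv -y datalv01 -t jfs2 -S 64K datavg 400 vpath0 vpath1 vpath2 vpath3

# the softer option: PP spreading via inter-PV allocation policy "maximum",
# which round-robins whole PPs across the vpaths without a true stripe
mklv -y datalv01 -t jfs2 -e x datavg 400 vpath0 vpath1 vpath2 vpath3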

Also, as an aside: something else that's disturbing me on these LPARs is that they have *8* HBAs going out to the fabric. I guess the logic was to add an HBA per certain amount of storage allocated, but to me this is insanity. I've never seen the purpose behind going beyond 4 HBAs on an LPAR, but I guess on newer P6 hardware there might be a reason to have 8 or more HBAs on a heavy I/O system?

Anyways... anyone have strong opinions one way or another?
 
My only experience was with a DS4000 series SAN. I tried striping but realized the following: it gave my system more spindles for the VG but less performance, because cache hits went way down. So it may improve writes, but it impacted reads, which is where we had the issue. By aligning the LTG to be a multiple of the SAN stripe size, we improved both reads and writes.
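If you want to line that up yourself, something like this (a sketch; the VG and vpath names are made up, the 256 is only an example, and depending on your AIX level the LTG may be set dynamically at varyonvg time rather than with chvg):

lsvg datavg | grep -i ltg     # current LTG size for the VG
lquerypv -M vpath0            # max transfer size the disk/path reports
chvg -L 256 datavg            # set LTG (in KB) to a multiple of the SAN stripe size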

So in my experience on the DS4000, which has active/passive pathing, you only gain if you are maxing out the HBA card, and generally I would advise against striping a SAN volume.


Tony ... aka chgwhat

When in doubt,,, Power out...
 

A few tips:

Your SAN guys should be carving out LUNs of consistent size... And in the case of a purely AIX environment, the LUN sizes should even be carved in such a way as to allow the LVM to make best use of the LUN. For carving increments, we use 15.8GB, 31.7GB, 63.5GB, etc.

Another reason you want to have a large number of LUNs of equal size is FlashCopy. If your SAN guy is carving out LUNs of arbitrary size, he's going to need to find (or else carve) another LUN of equivalent size or larger to serve as a corresponding FlashCopy target. With oddball LUN sizes, this translates into a lot of pricey storage going to waste if you have to pair a 50GB LUN with a 190GB LUN as its target because that's the only one available. We did the math once here: our ESS 800 costs about $1100 per gigabyte. So yeah, it matters. It also shows management you know what the hell you're doing behind the wheel. :)

Standardized LUN sizes allow you to build storage for hosts in sort of a lego-brick fashion. Instead of having one large arbitrarily-sized LUN, you also gain the benefit of another layer of parallelism behind multiple vpaths, host-side. All the flexibility with none of the fat.
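A quick way to sanity-check what the LVM actually gets out of a given carve increment (just a sketch, the vpath name is an example):

lspv vpath4
# compare PP SIZE x TOTAL PPs against the raw LUN size: the difference is LVM
# overhead plus whatever the carve increment leaves stranded, and that waste
# repeats on every LUN you cut at that size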

Anyway, your original question -- striping at the host on top of a volume that's part of an existing RAID 5 set. Host-side LVM striping doesn't override the fact that your data is still being housed as RAID 5, and it enjoys a great deal of parallelism by default. RAID 5 is striping + distributed parity. In other words, by striping your volumes, you're striping data that's already BEING striped at the hardware layer. So yeah, you're actually incurring more work than needs to be done, with no tangible benefit. You already -are- running I/O against all spindles by virtue of it being housed in RAID 5.

As for the number of HBAs, it depends on the number of LPARs you have, the kind of I/O you typically see, and the speed of each HBA. Think of it this way: a 1-Gb HBA can move 128MB/sec at full bore, and most modern systems are 4Gb/sec. If low latency is critical, then increasing the number of HBAs would be beneficial. But otherwise, you've got a system with a pipe the size of a freight train and only a trickle of data running through it.
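If you want to see whether those 8 HBAs ever earn their keep, the rough math plus a quick check (the adapter name and intervals are just examples):

# 4Gb FC moves roughly 400MB/sec per port after encoding overhead,
# so 8 HBAs is on the order of 3GB/sec of aggregate bandwidth
fcstat fcs0        # per-adapter FC statistics, including bytes transmitted/received
iostat -a 5 3      # adapter-level throughput, three 5-second samples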



 
We do have LUNs that are all the same size; each LUN is a chunk of a RAID-5 pack in the backend Shark.

While I agree that I'm already striping across "all" the spindles for one LUN at the hardware layer, when I have an LV with 4 LUNs in it, I'd rather have the data equally spread across all 4 physical LUNs (each LUN itself consisting of multiple spindles), yes?

What I have now is 4 LUNs: 100% utilization on the first, 20% on the second, and 0 on the third and fourth... Sure, I have plenty of room to grow, but depending on the data I end up with heavy I/O against a single LUN while the others sit mostly idle.
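Roughly what I'm staring at (names made up):

lsvg -p datavg       # used/free PPs per vpath: first nearly full, last two untouched
lslv -l datalv01     # which vpaths each LV actually sits on
lvmstat -v datavg    # per-LV I/O stats, after enabling them with lvmstat -e -v datavg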

 
exsnafu said:
What I have now is 4 LUNs: 100% utilization on the first, 20% on the second, and 0 on the third and fourth... Sure, I have plenty of room to grow, but depending on the data I end up with heavy I/O against a single LUN while the others sit mostly idle.

So? Why should you care? Your data in the ESS is already spread across a whole string of disks. It doesn't matter that your host sees one nearly full LUN and a couple of nearly empty ones. It just means that some of your NULL data is also spread across the disks. For the ESS's caching algorithm it may even be better like this.

Your app reads data: the ESS has it in cache and gives it to your app, OR the ESS reads the backend disks, stores disk blocks in cache and hands a few of them to your app.
Your app writes data: the blocks are transferred to the ESS's write cache - end of story for your app. It now becomes the ESS's problem.
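If you want to convince yourself, watch service times rather than space usage (a sketch; the vpath name is an example and iostat -D needs a reasonably recent AIX level):

iostat -D vpath0 5 3
# extended per-disk stats: if the ESS cache is doing its job, even the
# "full"-looking LUN should show low average read/write service times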

HTH,

p5wizard
 
So basically the argument against is that, due to the ESS caching algorithm, you can hurt yourself by striping at the volume level. OK, I can buy that.

Why do I care? Because (yes, I'm thick-headed) I like everything in a certain, uniform way, and it just irks me when I look at LV distribution and see it all dogpiled, so I wondered why it *wasn't* a good idea... plus it's just how I was raised.
 
OK, with a striped LV you can balance the data across, say, 4 different LUNs in maybe 4 different LSSs in the ESS. But then what happens when you run out of space in the LV and you need to increase its size? You'd need to add 4 more LUNs to continue the striped layout of the LV...
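Roughly what growing a striped LV looks like (just a sketch; the names and sizes are made up, and the exact procedure depends on your AIX level -- on 5.3+ I believe extendlv takes -u to add a new stripe column, but check your level):

# LV was originally built as a 4-way stripe:
#   mklv -y datalv01 -S 64K -u 4 datavg 400 vpath0 vpath1 vpath2 vpath3
# to grow past those 4 LUNs you add a whole new stripe-width of LUNs
# and raise the upperbound while extending:
extendvg datavg vpath4 vpath5 vpath6 vpath7
extendlv -u 8 datalv01 400 vpath4 vpath5 vpath6 vpath7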

What I'm trying to say is, with already striped RAID5 volumes on the SAN server layer, there's really no need at all to try to balance LUN usage for a server's filesystems. With enough servers using the SAN storage, it sort of all balances itself out, provided you have enough cache in the SAN server and the backend disks in the SAN servers are fast enough.

But hey, you're the boss!



HTH,

p5wizard
 
p5wizard,
a long time ago I found a white paper that described double striping, on the SAN side and on the LVM side.
The writer claimed that was the most performant configuration.
The fact is, having two different levels of striping probably WOULD give you something in performance... it certainly gives you a lot of headaches when you have to increase the size of the LV.
 
So, I'll write a redpaper someday explaining that it won't make a whole lot of difference anyway, and that you can do without all the hassle of managing the LVs, and especially of increasing a striped LV once it's filling up... ;-)

[soapbox]
What if you have nice striped LVs on all of your AIX boxes across all the RAID ranks in the SAN server, but you still have - or a colleague still has - one heavy-hitter server linked to the same SAN server that won't (or can't) do LVM striping? You end up with one or more RAID ranks loaded more heavily than others, messing up the nicely balanced loading of the RAID ranks that you are so carefully guarding...


...long time ago... like ice age? Just kidding...


HTH,

p5wizard
 