
software raid

post #1 of 9
Thread Starter 
it's slightly offtopic in the fact that it's not directly related to laptops, but it is related to linux (sorta).

i popped into the sager general section for the first time in... a long time. there was a thread in there about raid. they're discussing hardware raid vs software raid, and the general consensus over there seems to be that hardware raid is cheaper and better than software raid.

wait... software raid is free. how is that more expensive than hardware raid? windows xp allows you to create raids out of dynamic disks, and linux has raid built into the kernel. nothing to buy.

a week ago, i created two raid 5 setups in my desktop with evms in linux. five 120 gig hard drives in one set for 480 gigs total, and 4 160 gig drives in the other also totaling 480 gigs. how much did it cost me? _nothing_. if i had invested in hardware raid, i would have had to buy at least two raid5 controller cards - at least 100 dollars each. not only that, but those cards almost always come with only 4 ide buses (i think more than two mass controllers on one pci card and one irq causes problems), so i could only use 4 drives, and i wouldn't have even been able to create that one with the 5 drives. (you could do 8 drives on 4 buses, but having two drives on the same cable and bus decreases bandwidth. not only that, if one drive fails, it's possible that the bus will die, killing both drives)
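to spell out the capacity math above, here's a quick sketch. the function name is just something i made up for illustration; the only assumption is the usual raid 5 rule that one drive's worth of space goes to parity.

```python
def raid5_usable_gb(drive_count, drive_size_gb):
    """usable space of a raid 5 set: one drive's worth of space holds parity."""
    if drive_count < 3:
        raise ValueError("raid 5 needs at least 3 drives")
    return (drive_count - 1) * drive_size_gb


print(raid5_usable_gb(5, 120))  # 480
print(raid5_usable_gb(4, 160))  # 480
```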

not only that, but software raid, being more expandable, has support for all four parity striping methods - left-symmetric, left-asymmetric, right-symmetric, right-asymmetric. left-symmetric is the best, and is default in linux. hardware ones use left-asymmetric, and don't usually support others.
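if the four names don't mean much, here's a rough sketch of where the parity chunk and the data chunks land in each stripe. it follows the convention as i understand it from the linux md driver, so treat it as an illustration rather than the definitive layout code. `stripe_row` and `LAYOUTS` are names i picked for this example.

```python
LAYOUTS = ("left-asymmetric", "left-symmetric",
           "right-asymmetric", "right-symmetric")


def stripe_row(layout, stripe, disks):
    """return one stripe as a list with one entry per disk:
    'P' marks the parity chunk, numbers are data chunk indexes in the stripe."""
    side, kind = layout.split("-")
    if side == "left":
        # parity starts on the last disk and walks backwards
        parity = (disks - 1) - (stripe % disks)
    else:
        # parity starts on the first disk and walks forwards
        parity = stripe % disks
    row = [None] * disks
    row[parity] = "P"
    for d in range(disks - 1):
        if kind == "asymmetric":
            # data fills the disks in order, skipping the parity disk
            disk = d if d < parity else d + 1
        else:
            # data starts right after the parity disk and wraps around
            disk = (parity + 1 + d) % disks
        row[disk] = d
    return row


for layout in LAYOUTS:
    print(layout)
    for stripe in range(4):
        print("  ", stripe_row(layout, stripe, 4))
```

in the symmetric layouts, consecutive data chunks keep going round-robin across all the disks from one stripe to the next, which is the usual explanation for why sequential reads spread out more evenly.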

sure, software raid incurs a slight overhead because the processor has to do the calculations, but it's negligible. raid 5 would probably use the most cpu power, because it has to calculate parity bits as well as distributing blocks. but look at these benchmarks for various raid5 checksumming functions on my system:

8regs - 2052 MB/sec
32regs - 1684 MB/sec
pIII_sse - 3680 MB/sec
pII_mmx - 3556 MB/sec
p5_mmx - 4560 MB/sec

my system is an athlonxp, so it would use the pIII_sse function since it supports sse. no matter how much you write, your bandwidth will not hit 3.6 gigs. assuming that each drive operates at 75 megs/sec (which is generous - ide drives almost never hit that), you would need about 50 drives to saturate that checksumming speed. and with hardware raid, the checksumming speed is limited by the chip on the card. but with software, it's determined by the processor - so if you've saturated your checksumming bandwidth, you can just get a faster processor.
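the "about 50 drives" figure is just a division. a tiny sketch, with a made-up function name, using the pIII_sse number from the list and the generous 75 megs/sec per drive:

```python
import math


def drives_to_saturate(checksum_mb_s, per_drive_mb_s):
    """how many drives, all streaming at full speed, would it take
    to reach the parity checksumming rate?"""
    return math.ceil(checksum_mb_s / per_drive_mb_s)


print(drives_to_saturate(3680, 75))  # 50
```

with a more realistic per-drive speed the number only goes up.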

wait, i've got another one. what happens if your raid card dies? the bios dies, or the chip that does the striping dies, or whatever. the card was the one that handled the ordering of the drives, and now you don't know anymore. you would have to line up those drives in the exact same order to get the data back intact - that's not easy unless you have the drives explicitly labeled. in software raid (on linux, at least), there's something written to the hard drives called the persistent superblock. it has general information about the raid setup and device mapping. so all the linux system has to do when it boots up is autodetect that persistent superblock, and it will arrange the drives correctly and your raid will be intact. so you could completely switch around all your drives to different ide buses and orders, or move all the drives to another linux computer, and the raid would just be redetected and reconstructed correctly.
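here's a toy model of the idea. this is not the real on-disk superblock format; the `Superblock` fields and the `assemble` function are hypothetical, made up only to show why cable order stops mattering once each drive carries its own position in the set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Superblock:
    array_id: str  # hypothetical: which raid set this drive belongs to
    slot: int      # hypothetical: this drive's position within the set
    total: int     # hypothetical: how many drives the set has


def assemble(found):
    """found maps a device name to the Superblock read from that device.
    returns array_id -> device names in slot order (None = missing drive)."""
    arrays = {}
    for dev, sb in found.items():
        arrays.setdefault(sb.array_id, {})[sb.slot] = (dev, sb.total)
    ordered = {}
    for array_id, members in arrays.items():
        total = next(iter(members.values()))[1]
        ordered[array_id] = [
            members[i][0] if i in members else None for i in range(total)
        ]
    return ordered


# drives plugged into "wrong" buses still come out in the right order
scrambled = {
    "hdg": Superblock("set-a", 0, 3),
    "hda": Superblock("set-a", 2, 3),
    "hde": Superblock("set-a", 1, 3),
}
print(assemble(scrambled))  # {'set-a': ['hdg', 'hde', 'hda']}
```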

if i had the money and got an 8890 with the hardware raid, i would use software raid _anyway_. the hardware raid controller in there only does two drives, and only does 1 or 0. (0+1, maybe? dunno) with software raid, i could use three drives and make a raid 5, which is beyond the capability of that small raid controller. i could swap out the drives temporarily for an extra cd drive or something, and put them back in any order without having to remember exactly which drive bay each drive went into, and it would still be intact.

... i don't see how hardware raid is better. but if you think so, feel free to speak your mind. i'm always up for a friendly debate.

sorry, this is pretty offtopic, and it might get moved at some point or something, but it confused me so much that i had to say something. and if i went into that thread in sager general and said it, i would sound like a pretty big jerk.
post #2 of 9
WOW. I always just assumed that hardware raid was tons better since software raid was dependent on the processor, but you present a VERY convincing argument. Props to you on doing your homework before posting; it's always better to back up what you're saying with facts. If I ever do RAID I think it's gonna be software from now on.
post #3 of 9
This is actually pretty interesting, as I'm getting ready to make my old desktop machine (purchased 2 yrs ago with this in mind, as it has an ABit motherboard with onboard IDE RAID controller) into my new server -- now that I've transferred all my stuff to my new Sager.

I was thinking of just using the onboard RAID controller, as I have two 60GB drives that I was simply going to mirror (RAID 1, I think; RAID 0 would be striping). Now you've got me rethinking that.

Last time I set up RAID, it was years ago, with Novell Netware 4.1, so I'm expecting a challenge.
post #4 of 9
I was thinking of software RAIDing my soon-to-be 5680, but if I remember correctly, when you do the 2 HDD setup you lose the #2 bay. So if I ever wanted to take out the HDD and use the floppy drive, nothing would work, since the data would be split between the two disks. My only other option would be to use RAID 1, and that wouldn't speed anything up. Any thoughts on this? Correct me if I'm wrong.

And to say it again this is a great thread!
post #5 of 9
Thread Starter 
i don't know much about the 5680, but i wouldn't set up a raid on any disks that have to be swapped out. like you said, the only mode that would work is raid 1. and even then, with one hard drive, the raid would be working in degraded mode. so after you did whatever and put the other drive back in, the raid would have to rebuild the data on that drive, which takes a long time. and besides, running in degraded mode with _any_ raid is a dangerous situation for your data. you're supposed to replace that drive as soon as possible.

like the name says... it's an array of hard disks that act as one unit. if you _really_ want that raid, you could get an 88xx based laptop, which has 5 spindles.
post #6 of 9
Okay, I got a Raid 0 card for $20 and it works a charm.

Both drives are on their own channels, and both CD ROMs are on their own channels (the mobo controller).

Hardware RAID in Windows is a must because there is minimal improvement if you have 2 drives each with a nonraid partition and then a raid between them. Since all 3 drives have 3 different partitions which can be accessed individually there is no real benefit.

PLUS, Raid allows simultaneous reading and writing, so with software raid, the controller cannot simultaneously write or read as easily. Also, if you have a CD sharing the channel, then the CD interrupts the process further degrading performance!

CPU latency is negligible perhaps, but it is there, on top of the CPU usage caused by the drives themselves.

Also, GOOD Raid controllers have buffers onboard in the megabyte range, in addition to buffers on the hard drives.


Quote:
wait, i've got another one. what happens if your raid card dies? the bios dies, or the chip that does the striping dies, or whatever. the card was the one that handled the ordering of the drives, and now you don't know anymore. you would have to line up those drives in the exact same order to get the data back intact
Not necessarily, you can put the drives on any channel and they are seen as a SET. The set is the same no matter what channel combination or slave/master setting. The SET doesn't change unless you add a new drive to it. In Raid 1 this is negligible.

Quote:
these benchmarks for various raid5 checksumming functions on my system:

8regs - 2052 MB/sec
32regs - 1684 MB/sec
pIII_sse - 3680 MB/sec
pII_mmx - 3556 MB/sec
p5_mmx - 4560 MB/sec
Those are pretty lame benchmarks because EACH controller would have to be on its own bus, every drive on its own channel, all operating at UDMA6 ATA/133, and since on most systems the controller is on PCI bus, you would max out 133Mbytes/second anyway, despite having two ATA/133 controllers, because the bus bottlenecks you. And for both the ATA/133 controllers and the PCI bus, you have overhead.

Quote:
hardware ones use left-asymmetric, and don't usually support others.
Okay, any good controller would support that. Maybe my $20 Raid 0, 1, 0+1 card doesn't, but who cares. If left-asymmetric is best and default, GREAT, why would I want to change it or care about setting it otherwise?


Quote:
week ago, i created two raid 5 setups in my desktop with evms in linux. five 120 gig hard drives in one set for 480 gigs total, and 4 160 gig drives in the other also totaling 480 gigs. how much did it cost me? _nothing_. if i had invested in hardware raid, i would have had to buy at least two raid5 controller cards - at least 100 dollars each. not only that, but those cards almost always come with only 4 ide buses (i think more than two mass controllers on one pci card and one irq causes problems), so i could only use 4 drives, and i wouldn't have even been able to create that one with the 5 drives. (you could do 8 drives on 4 buses, but having two drives on the same cable and bus decreases bandwidth. not only that, if one drive fails, it's possible that the bus will die, killing both drives)
Okay, even with software RAID performance is still degraded with multiple devices per channel, and you still need 5 channels in that case. And not to mention, since RAID relies on simultaneous writing of data to drives all at once, you lose that ability with multiple devices on a channel.




Well, that's just some stuff I've noticed about your argument, though I admit, if you are using LINUX it perhaps makes more sense to use software. But if you are in the 90% majority with WINDOWS, you need a good HARDWARE raid card to get the job done reasonably well.
post #7 of 9
Thread Starter 
i suppose. i know very little about recent windows. why do you think this is posted in the linux forum?

Quote:
Hardware RAID in Windows is a must because there is minimal improvement if you have 2 drives each with a nonraid partition and then a raid between them. Since all 3 drives have 3 different partitions which can be accessed individually there is no real benefit.
sorry, but i don't quite understand what you mean here.

Quote:
PLUS, Raid allows simultaneous reading and writing, so with software raid, the controller cannot simultaneously write or read as easily. Also, if you have a CD sharing the channel, then the CD interrupts the process further degrading performance!
which is another reason why each hard drive is supposed to have its own bus. i have nine ide buses for 9 hard drives in a raid, and a tenth for my cdrom and system hard drive which is not a part of a raid.

Quote:
CPU latency is negligible perhaps, but it is there, on top of the CPU usage caused by the drives themselves.
that's still negligible. i can use all 9 drives in my raid simultaneously with less than 10 percent cpu usage. and with software raid, you can set maximum and minimum rates if you want more or less speed, or more or less cpu usage.

Quote:
Also, GOOD Raid controllers have buffers onboard in the megabyte range, in addition to buffers on the hard drives.
buffers are a bad thing for data security. a power outage could be disastrous for that data. you'd want a minimal amount to stay in there.

Quote:
Not necessarily, you can put the drives on any channel and they are seen as a SET. The set is the same no matter what channel combination or slave/master setting. The SET doesn't change unless you add a new drive to it. In Raid 1 this is negligible.
ok, maybe that's true. i don't know; i don't have a raid card.

Quote:
Those are pretty lame benchmarks because EACH controller would have to be on its own bus, every drive on its own channel, all operating at UDMA6 ATA/133, and since on most systems the controller is on PCI bus, you would max out 133Mbytes/second anyway, despite having two ATA/133 controllers, because the bus bottlenecks you. And for both the ATA/133 controllers and the PCI bus, you have overhead.
ATA133 almost _never_ hits 133, just like ATA100 almost never hits 100. a 7200 rpm drive at best will do maybe a little over 50, limited not by the bus, but by the rotation of the drive. the most channels i have on one pci card is 4, which is why i put 4 5400 rpm drives on there. those generally do maybe 20-30 megs/sec. so assuming 30 meg/sec (on a good day), it would do 120megs/sec, which is under the pci bus's 133meg limit.
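the bus budget above, as a quick sketch. the 133 is the pci figure already used in this thread, the per-drive speeds are my rough guesses from above, and the function name is made up.

```python
PCI_LIMIT_MB_S = 133  # pci bus figure used in this thread


def bus_headroom(drive_rates_mb_s, bus_limit_mb_s=PCI_LIMIT_MB_S):
    """return (total demand, headroom left on the bus) in megs/sec.
    negative headroom means the bus is the bottleneck."""
    demand = sum(drive_rates_mb_s)
    return demand, bus_limit_mb_s - demand


print(bus_headroom([30] * 4))  # (120, 13)   four 5400 rpm drives fit
print(bus_headroom([50] * 4))  # (200, -67)  four fast drives would not
```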

those numbers aren't the speed of my raid; they're the speed of the various raid 5 parity functions. that's how many megabytes per second my processor could calculate parity bits for - _theoretically_. that my raid is not that fast is a _good_ thing.

and i guess i forgot to mention that my processor is 1.1ghz. those scores aren't bad for that old a processor.
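for anyone wondering what those parity functions actually compute: raid 5 parity is a plain xor across the chunks of a stripe. a minimal sketch (nothing like the optimized kernel routines, and `xor_parity` is just a name i picked) that also shows why one lost chunk can be rebuilt from the rest:

```python
from functools import reduce


def xor_parity(chunks):
    """xor equal-length byte chunks together, byte position by byte position."""
    if len({len(c) for c in chunks}) != 1:
        raise ValueError("need at least one chunk, all the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


data = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(data)

# pretend the drive holding data[1] died: xor the survivors with the parity
rebuilt = xor_parity([data[0], data[2], parity])
print(rebuilt)  # b'efgh'
```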

Quote:
Okay, any good controller would support that. Maybe my $20 Raid 0, 1, 0+1 card doesn't, but who cares. If left-asymmetric is best and default, GREAT, why would I want to change it or care about setting it otherwise?
left-asymmetric is not the best, but it's the default on hardware raid. left-symmetric is the best and the default on linux software raid. and certain modes could be better for certain applications, so it's good to be able to choose. for example, left-symmetric is good for reading large files or many chunks. it might not be as good for working with many small files and tiny reads.

Quote:
Okay, even with software RAID performance is still degraded with multiple devices per channel, and you still need 5 channels in that case. And not to mention, since RAID relies on simultaneous writing of data to drives all at once, you lose that ability with multiple devices on a channel.
... i don't have multiple ide devices on a channel. (i used bus and channel interchangeably. same thing)

Quote:
Well, that's just some stuff I've noticed about your argument, though I admit, if you are using LINUX it perhaps makes more sense to use software. But if you are in the 90% majority with WINDOWS, you need a good HARDWARE raid card to get the job done reasonably well.
maybe. like i said, i don't know all that much about windows anymore. and i have a feeling that a windows system drive would be better with hardware raid. windows probably wouldn't like starting up a software raid before the rest of the system, so it might be better to have the whole thing transparent to the system. (actually, _could_ it even start up from a raid? windows doesn't have a separate kernel with all functionality built in, the way linux does)

i use raid 5 as storage; nothing is run off of it. (in fact, i have the 'noexec' mount option for it) so speed is not so much an issue for me. i wanted some data redundancy, while still making the best use of what space i had. (i don't have much money, and sacrificing half of my drives for a mirrored setup is just too much)

one thing - a lot of your arguments are about hardware limitations. for example, how the pci bus only can handle 133 mb/sec. but that applies either way.... whether you use a hardware or software raid, your pci bus is _still_ only going to do 133 mb/sec.
post #8 of 9
RAID is all about the controllers and channels. People forget that most of the time RAID is used to increase uptime, not performance. You need to be able to lose a controller and have your disks remain available.

If, however, this is not what you are trying to do, then choosing HW over SW raid (provided you have a big enough pipe to your disks) is not going to make a huge difference. If you are doing a lot of IO you are going to take a hit on CPU usage; failing that, speeds should be relatively similar.
post #9 of 9
Thread Starter 
absolutely right. the only raid that is for performance is raid 0, (and arguably, the esoteric raid 3) which is why a lot of times you'll see people say that raid 0 is not a real raid.

i use my raid 5 for data integrity and security. (that's not _exactly_ what raid is for, either... but better data integrity leads to less downtime - not as much to recover from backup.) i don't have money flowing out from my orifices to maintain hundreds of backups the way companies do, so a raid 5 is the best i can manage.

naturally, i would never use software raid on a mission-critical business server. software is more susceptible to hacking and corruption and whatnot... you can't really modify the firmware of a hardware raid card too easily. and a hardware raid requires a little less maintenance (consequently decreasing downtime). you'd want as much as possible of the server to be hardware based. but for the average home user's desktop (or laptop), while either one would be fine, i personally prefer the cost and flexibility of software raid.

for an example of flexibility, i came across an extremely clever use of software raid on the gentoo forums. while hardware raid only lets you put entire physical disks into raid, software raid works with partitions. so suppose you have three hard drives of the same size. and suppose each one has the same partition table - a small boot partition at the front, 'x' amount of space for swap, and the rest is for the files (/). what you could do with software raid is to put the three boot partitions in raid 1, the three swap partitions in raid 0, and the three system partitions in raid 5. the boot partition is mirrored across three drives for backup - your kernel is in there, after all. and /boot is not mounted during normal usage. the swap is striped with raid 0 for improved swapping performance. and the system drive is raid 5, for _some_ data security while still using a good amount of the space (2/3). now _that's_ something you can't do with hardware raid.
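to put numbers on that layout, here's a small sketch. the partition sizes are made up for illustration (the gentoo post didn't give any that i remember), and `usable_mb` is a hypothetical helper covering only the three levels used here.

```python
def usable_mb(level, members, size_mb):
    """usable space of a raid set built from equal-sized partitions."""
    if level == 0:
        return members * size_mb        # striping: everything is usable
    if level == 1:
        return size_mb                  # mirroring: one copy's worth
    if level == 5:
        return (members - 1) * size_mb  # one member's worth goes to parity
    raise ValueError("only levels 0, 1 and 5 are handled in this sketch")


DRIVES = 3
# hypothetical per-drive partition sizes in megabytes
layout = {"boot": (1, 100), "swap": (0, 1024), "root": (5, 100000)}

for name, (level, size_mb) in layout.items():
    print(name, "raid", level, usable_mb(level, DRIVES, size_mb), "MB")
```

with these sizes the root set comes out to 200000 MB out of 300000 MB of raw space, which is the 2/3 mentioned above.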