Thursday, February 26, 2015

Replace failed RAID controller

OK, here is some experience from building a RAID server and recovering from a hardware failure. This is quite interesting, as software engineers do not often get the chance to work on hardware.

The server uses a PERC 6/E card to connect to a disk array. The disks in the array all have the same physical capacity, say 1TB, but they are configured as logical drives of larger capacity (that is, several physical drives are combined into one logical drive). PERC stands for PowerEdge RAID Controller. The "E" in 6/E stands for External, which means the HDDs sit in an expansion enclosure separate from the server box, such as an MD1000 or MD3000, attached to the server over a SAS (Serial Attached SCSI) connection. Correspondingly, the "I" in 6/I means Internal, where the HDDs are in the same box as the server. PowerEdge is Dell's server brand. Relevant servers are listed on this wiki page: List of Dell PowerEdge Servers.
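To make the "several physical drives, one logical drive" idea concrete, here is a minimal sketch of how usable capacity follows from the RAID level. The function name `usable_capacity_tb` and the minimum disk counts are my own assumptions based on the standard RAID definitions, not anything the PERC utility exposes, and the real usable size is slightly smaller because of metadata.

```python
def usable_capacity_tb(level, disks, disk_tb=1.0):
    """Approximate usable capacity of one logical drive built from
    identical physical disks (hypothetical helper, standard RAID math)."""
    if level == 0:                      # striping only, no redundancy
        data_disks = disks
    elif level == 1:                    # two-disk mirror
        if disks != 2:
            raise ValueError("RAID 1 is modeled here as exactly 2 disks")
        data_disks = 1
    elif level == 5:                    # one disk's worth of parity
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        data_disks = disks - 1
    elif level == 6:                    # two disks' worth of parity
        if disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        data_disks = disks - 2
    elif level == 10:                   # striped mirrors, half the disks hold copies
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        data_disks = disks // 2
    else:
        raise ValueError("RAID level not covered by this sketch")
    return data_disks * disk_tb


for level, disks in [(0, 4), (5, 4), (6, 6), (10, 4)]:
    print("RAID", level, "with", disks, "x 1TB disks:",
          usable_capacity_tb(level, disks), "TB usable")
```

So four 1TB disks give roughly a 3TB logical drive under RAID 5, but only 2TB under RAID 10.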

[1] is a very good step-by-step guide to configuring a PERC 6/I card. The procedure is the same for a PERC 6/E card. [2] is a more general and complete guide.

There are several things one can do.

At boot time, press CTRL-R to enter the PERC configuration utility. Once in the utility, you see a list of PERC cards; select one and press Enter to open its configuration. On the next page, "VD Mgmt", press F2 for a menu of actions, including Create New VD, Clear Config, and Foreign Config (Import/Clear).

When the PERC card has failed, every operation in the configuration utility can be painfully slow. Once the failed card is replaced with a good one, all steps are much faster. The difference is something like 10-30 minutes versus a few seconds.

The PERC configuration is also written to each RAID disk. If a PERC card fails, just replace the card and "import" the foreign configuration. A configuration is "foreign" when the configuration held by the PERC card does not match the configuration information on the RAID disks. When a PERC card fails, its configuration may become inconsistent with the information on the disks; you may see the warning lights on the disks blinking, with a different pattern after each reboot.

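If one or two disks in the array have failed, just pull each failed disk out and insert a new one. The RAID should rebuild automatically, provided the RAID level has enough redundancy to survive that many failures.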
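Whether "1 or 2 disks" is survivable depends on the RAID level of the virtual disk. The sketch below encodes the guaranteed fault tolerance of the common levels; the table and the `can_rebuild` helper are illustrative names of mine, and RAID 10 can sometimes survive more than one failure if the failed disks are in different mirror pairs, which this simple check ignores.

```python
# Number of simultaneous disk failures each level is guaranteed to survive
# (standard RAID definitions; not read from the controller).
GUARANTEED_FAILURES = {0: 0, 1: 1, 5: 1, 6: 2, 10: 1}


def can_rebuild(level, failed_disks):
    """Return True if an array at this RAID level is guaranteed to still
    hold its data, and so can rebuild, after this many disk failures."""
    return failed_disks <= GUARANTEED_FAILURES[level]


print(can_rebuild(5, 1))   # one failed disk in RAID 5
print(can_rebuild(5, 2))   # two failed disks in RAID 5
print(can_rebuild(6, 2))   # two failed disks in RAID 6
```

In other words, swapping two failed disks at once is only safe on something like RAID 6; on RAID 5 a second failure before the rebuild finishes loses the array.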

If a large number of disks are in a foreign state, you may want to clear the foreign configuration and recreate it from scratch. Note that for a new setup you can "Initialize" the disks to clean them, but for disks with existing data (as when recovering from a failed PERC card, where the data on the disks must stay intact) you should not initialize the disks, since initialization would wipe the data you want to preserve.

If you need to create a new VD (Virtual Disk), choose the RAID level, then assign the disks to use, and set a few options.

It's actually quite a lot of fun.

[1] DELL Tutorial: Create RAID Using PERC 6/i Integrated BIOS Configuration Utility
[2] Dell™ PERC 6/i, PERC 6/E and CERC 6/i User's Guide


Saturday, February 21, 2015

Redis better than memcache?

Why Redis beats Memcached for caching

Memcached: good when 1) the data is static and small, 2) simple horizontal scaling matters most

Redis: good when:
1) more data structure support
2) faster for more clients
3) larger object size limit (512MB vs. 1MB for Memcached)
4) stored data can be manipulated, not opaque
5) more control over LRU policies (6 eviction policies to choose from)
6) tunable persistence
7) offers replication
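
For readers unfamiliar with the LRU policies mentioned in point 5, here is a minimal sketch of what least-recently-used eviction means. It is a toy in-process cache built on `collections.OrderedDict`; the `LRUCache` class is my own illustration and is not how Redis implements eviction (Redis uses an approximated LRU based on sampling).

```python
from collections import OrderedDict


class LRUCache:
    """Toy cache that evicts the least recently used key when full."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._data = OrderedDict()   # oldest access first, newest last

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # a read counts as a use
        return self._data[key]

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)  # a write counts as a use
        if len(self._data) > self.max_items:
            self._data.popitem(last=False)  # drop the least recently used key

    def keys(self):
        return list(self._data)


cache = LRUCache(max_items=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")        # "a" is now more recently used than "b"
cache.set("c", 3)     # cache is full, so "b" is evicted
print(cache.keys())   # ['a', 'c']
```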

On Redis, Memcached, Speed, Benchmarks and The Toilet
  Redis: ~ 80-100k GET/SET per sec, better for more clients.
  Memcached: ~ 60-80k GET/SET per sec

An update on the Memcached/Redis benchmark
  Memcached: one instance runs multiple threads; the threading overhead lowers the speed per thread.
  Redis: single-threaded, but multiple instances can run side by side.
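
Running several single-threaded Redis instances only helps if the client spreads keys across them. A minimal sketch of one way to do that, using a CRC32 hash modulo the instance count: the addresses in `INSTANCES` and the `instance_for` helper are hypothetical, and this is plain modulo sharding, not consistent hashing and not the hash-slot scheme used by Redis Cluster.

```python
import zlib

# Hypothetical addresses of Redis instances running on the same machine.
INSTANCES = ["127.0.0.1:6379", "127.0.0.1:6380", "127.0.0.1:6381", "127.0.0.1:6382"]


def instance_for(key, instances=INSTANCES):
    """Pick the instance responsible for a key. CRC32 is deterministic,
    so every client maps the same key to the same instance."""
    checksum = zlib.crc32(key.encode("utf-8"))
    return instances[checksum % len(instances)]


for key in ["user:1", "user:2", "session:abc"]:
    print(key, "->", instance_for(key))
```

The catch with modulo sharding is that changing the number of instances remaps most keys, which for a cache means a burst of misses.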
