
January 29th, 2014 10:00

PERC S100 Raid 5 Array - Steps for proper Replace Drive & Rebuild

We had one of our drives go bad in our RAID 5 setup on a PERC S100. The current config is 4x 1TB Seagate drives. During POST, I pressed Ctrl+R to get into the controller admin and noticed that the array status was degraded: drive 0:2 was offline while the other drives were showing online. So I purchased 2 new drives (one for the replacement and another for the future). I powered down, removed the bad drive, put the new drive in, and powered back up. When I pressed Ctrl+R to go into the admin again, it was showing a new array with that drive as NON-RAID, and I could not delete that array or do anything else with it. From what I read online, I would need to add this drive to the array via OMSA.

So I downloaded and installed OMSA, went into it, and deleted the new NON-RAID array. It then allowed me to add my new drive as a spare. The tasks in OMSA really did not give me any other option, so I assumed that was what I needed to do.

After I added it as a spare, it went through the rebuilding process, which took over 26 hours. WOW!

After the rebuild, OMSA is now stating that drive 0:1 is degraded with failure predicted. So it seems that besides the one that failed, another drive is about to fail. But before I replace that drive, I have some concerns, which are below.

I booted our server and pressed Ctrl+R to view the admin, and I noticed that the first replacement drive 0:2 has its status set to SPARE while the other drives show ONLINE.

My question here is this: since I replaced a faulty drive within the array, shouldn't that new drive be part of the set? In other words, shouldn't that new drive show a status of ONLINE since it replaced the faulty drive?

Before I shut down and replace the faulty drive 0:1, I want to make sure I am doing this process correctly for replacement drives within our array.

Any advice on the proper steps on how to get all our drives to be part of the set would be appreciated.

Regards

Joey

5 Posts

January 29th, 2014 12:00

Thank you for the information and details; this helps a great deal. I do have a few questions now.

1) Before I replace the faulty drive I have now, should I unassign the first replacement drive as a global hot spare?

2) In the steps you provided for replacing the drive, I noticed that the first step was to set the predicted-failure drive to offline. When I am in OMSA, there is no task offered to do this.

Joey

Moderator


6.2K Posts

January 29th, 2014 12:00

Hello Joey

My question here is this: since I replaced a faulty drive within the array, shouldn't that new drive be part of the set? In other words, shouldn't that new drive show a status of ONLINE since it replaced the faulty drive?

On the S100, hot spares retain their hot spare status even after they have been built into an array. You can unassign the drive as a hot spare in OpenManage.
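If you prefer the command line, the same thing can be done with the omconfig utility that installs alongside the OpenManage web interface. This is only a sketch, assuming the CLI components are installed and the S100 exposes the hot spare task there; the controller and disk IDs are examples and should be taken from the output of "omreport storage pdisk controller=0":

  omconfig storage pdisk controller=0 pdisk=0:2 action=assignglobalhotspare assign=no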

Before I shut down and replace the faulty drive 0:1, I want to make sure I am doing this process correctly for replacement drives within our array.

Any advice on the proper steps on how to get all our drives to be part of the set would be appreciated.

The steps you took previously are correct. When the S100 detects a new drive, the default is to set it as Non-RAID. Non-RAID is considered a virtual disk mode, and you will need to delete that virtual disk before you can do anything with the drive. Here are the steps to replace the predictive failure drive (a rough command-line equivalent is sketched after the list):

  • In OpenManage, offline the predictive failure drive
  • Remove the predictive failure drive
  • Insert the replacement drive
  • Delete the Non-RAID VD on the new drive
  • Set the new drive as a hot spare
  • After the drive completes rebuilding, unassign it as a hot spare
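For reference, here is a rough command-line equivalent of those steps using OMSA's omreport and omconfig utilities. Take this as a sketch only, under the assumption that the S100 exposes these tasks through the CLI; the controller number, disk IDs, and virtual disk ID below are examples and need to be read from your own omreport output:

  omreport storage pdisk controller=0
  omconfig storage pdisk controller=0 pdisk=0:1 action=offline
  (power down, swap the drive, power back up)
  omreport storage vdisk controller=0
  omconfig storage vdisk controller=0 vdisk=<id> action=deletevdisk
  omconfig storage pdisk controller=0 pdisk=0:1 action=assignglobalhotspare assign=yes
  omconfig storage pdisk controller=0 pdisk=0:1 action=assignglobalhotspare assign=no

The first omreport confirms the drive states, the offline step only applies if that task is offered, the deletevdisk line removes the Non-RAID VD that appears on the new drive (use the VD number shown by the vdisk report in place of <id>), and the final assign=no is run after the rebuild finishes.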

The S100 does not maintain a controller log, so we will be unable to tell if the array is punctured. If another drive goes into predictive failure, then you may need to delete and recreate the array. Corrupt array data causes block errors to be reported in the HDD SMART data, and once a certain threshold of bad blocks has been logged to SMART, the drive is marked as predictive failure. To ensure the array metadata is rewritten, you will need to create a dissimilar array and initialize it (something other than the RAID 5 you are using). This will be a very long process on the S100. You will need to wait for the initialization to complete and then create the RAID level you want.
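If it does come to deleting and recreating the array, that can in principle also be done from the command line with omconfig. Again, this is just a sketch with placeholder IDs; verify the exact keywords (RAID level, size, pdisk list format) against the omconfig documentation for your OMSA version before running anything, because these commands destroy the data on the member drives:

  omconfig storage vdisk controller=0 vdisk=<id> action=deletevdisk
  omconfig storage controller controller=0 action=createvdisk raid=r0 size=max pdisk=0:0,0:1,0:2,0:3
  (wait for the dissimilar array to finish initializing; initialization may need to be started from the new VD's tasks)
  omconfig storage vdisk controller=0 vdisk=<newid> action=deletevdisk
  omconfig storage controller controller=0 action=createvdisk raid=r5 size=max pdisk=0:0,0:1,0:2,0:3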

Thanks

Moderator


6.2K Posts

January 29th, 2014 12:00

1) Before I replace the faulty drive I have now, should I unassign the first replacement drive as a global hot spare?

It doesn't matter; you can unassign it whenever you like. While the drive is part of an array, its hot spare assignment does nothing: it will not be used as a spare for anything else, and unassigning it does not change any functionality.

2) In the steps you provided for replacing the drive, I noticed that the first step was to set the predicted-failure drive to offline. When I am in OMSA, there is no task offered to do this.

The S100 may not have the offline option. If it doesn't, just skip that part. The S100 does not support hot-swap drives, so you will need to power down to replace the drive. Offlining a drive is for hot-swap situations: it lets the controller stop read/write requests to the drive and avoid data corruption, which is safer than pulling a drive that is still online. If you do have the option, it will be listed under the physical disk's available tasks in OpenManage.

Thanks

5 Posts

January 29th, 2014 13:00

Thank you!

No, there are no available tasks so I will skip that step.

I will go ahead and do the steps and see if everything works out.


I appreciate the help.


Joey
