Administering virtual disks

Possible problems

A virtual disk can be online or offline. In the online state, the virtual disk is active and all data is accessible.

If a failure occurs, appropriate console warning messages and status information will be displayed. If a single drive fails on a RAID array (levels 1, 4 and 5), the virtual disk will remain online and all data will be accessible; in other circumstances, the virtual disk may go offline. Simple, concatenated and stripe virtual disk types always remain online as disk errors are passed back to the application.
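The rules in the preceding paragraph can be summarized in a small model. This is a minimal sketch for illustration only, not part of the Virtual Disk Manager: the function name and the type labels ("raid1", "stripe", and so on) are hypothetical, and the "may go offline" result reflects that the text does not say the disk always goes offline in the remaining cases.

```python
# Hypothetical labels for the virtual disk types named in the text above.
REDUNDANT_TYPES = {"raid1", "raid4", "raid5"}
PASS_THROUGH_TYPES = {"simple", "concatenated", "stripe"}


def state_after_failures(disk_type: str, failed_drives: int) -> str:
    """Return the expected state of a virtual disk after drive failures."""
    if failed_drives < 0:
        raise ValueError("failed_drives must not be negative")
    if disk_type in PASS_THROUGH_TYPES:
        # Disk errors are passed back to the application; the disk stays online.
        return "online"
    if disk_type in REDUNDANT_TYPES:
        # A single failed drive is tolerated; beyond that the disk may go offline.
        return "online" if failed_drives <= 1 else "may go offline"
    raise ValueError(f"unknown virtual disk type: {disk_type!r}")


if __name__ == "__main__":
    for disk_type, failed in [("raid5", 1), ("raid5", 2), ("stripe", 2)]:
        print(disk_type, failed, state_after_failures(disk_type, failed))
```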

Status information is displayed in the Virtual Disk Manager after the disk or piece information. Possible error states are:

OOS (out of service) identifies a disk failure. One of the pieces in the array is inaccessible. If a single piece is in the OOS state, the array remains online.

OOD (out of date) identifies a piece with corrupt parity. The array remains online unless a disk failure occurs. Perform a restore operation as soon as possible. If a disk fails while one of the disk pieces in the array is in the OOD state, the entire array goes offline.

indicates which disk piece is the hot spare (when configured). If the hot spare is in use, the status indicator IN USE is added. The hot spare piece will automatically replace the piece identified as OOS.

indicates a virtual disk with bad parity. This might also mean a new disk which has not had a restore operation, or an array that has just been brought online.

The array enters the offline state when virtual disk I/O accesses fail. When an array goes offline, repair the virtual disk and restore data from a backup.
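The way the OOS and OOD piece states combine can be sketched as follows. This is an illustrative model under stated assumptions, not the Virtual Disk Manager's own logic: the function name and the "OK" label for a healthy piece are hypothetical, the hot spare is ignored, and the cases in which this topic says an array goes (or may go) offline are all treated as offline.

```python
from collections import Counter


def array_state(piece_states):
    """Estimate the state of a RAID array from the status of its pieces.

    piece_states is a list of strings: "OOS" (out of service),
    "OOD" (out of date), or "OK" (a hypothetical label for a healthy piece).
    """
    counts = Counter(piece_states)
    out_of_service = counts["OOS"]
    out_of_date = counts["OOD"]

    # More than one piece out of service: the array cannot stay online.
    if out_of_service > 1:
        return "offline"
    # One piece out of service while parity is out of date.
    if out_of_service == 1 and out_of_date >= 1:
        return "offline"
    # A single OOS piece, or OOD parity alone, leaves the array online.
    return "online"


if __name__ == "__main__":
    print(array_state(["OK", "OOS", "OK"]))
    print(array_state(["OOS", "OOD", "OK"]))
```

In this model a single fault leaves the array online, which is why a restore operation should follow an OOD status promptly: a second fault takes the array offline.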

Invalid timestamp on root device mirror

The Virtual Disk Manager uses timestamps on RAID virtual disk configurations to ensure proper operation and data integrity. If a timestamp on one of the mirror virtual disk pieces becomes invalid, the piece will not be fully configured. You cannot set the timestamps to a known state on a mirror virtual disk that is online.

If this happens, the out-of-service piece on the mirrored root device cannot be restored. To correct the problem:

  1. Unmirror the root device.

  2. Shut down the system and reboot, as described in ``Starting and stopping the system''.

  3. Mirror the root device again.

Mirror root failure

If the primary disk fails during the system reboot (when mirroring the root disk for the first time), the array goes offline and the system boot fails. At this point in the boot sequence, the system cannot switch over to the secondary disk if it has not been completely restored. Before replacing the primary drive or rebuilding the system, remove power from the secondary disk and try to boot the system. If the primary disk has not failed completely, the system will boot. Once the system has booted, unmirror the root device. After the problem has been corrected, try to mirror the root device again.

Offline disk array

When an array or mirrored virtual disk has a mounted filesystem and the array or mirror goes offline due to error conditions, the filesystem becomes unusable. At this point, the filesystem cannot be unmounted (much the same as a hard disk failure). The system must be rebooted to clear this condition.

An array or mirror may go offline when more than one piece is out of service or one piece is out of service and parity is out of date. To rectify this:

  1. Disable the virtual disk.

  2. Force the virtual disk online.

  3. Restore the parity.

  4. Restore the filesystem data from the backup as described in ``Restoring a scheduled filesystem backup''.

If the system fails (crashes) while running on a RAID array, the parity data is automatically regenerated when the system is booted. This ensures that the parity data accurately reflects the data on the other drives in the array. If an I/O error is encountered while the parity information is being restored, the array goes offline. A UPS (uninterruptible power supply) is recommended to reduce the risk of power outages on systems using virtual drives.

Kernel virtual memory shortage

When the system has many RAID virtual disk configurations with large cluster sizes or a heavy I/O load, the performance of the array may be reduced by high contention for system resources (buffers, kernel virtual memory, and so on). Increasing the total amount of physical memory can improve system and array performance. See ``Warning messages'' for more information on driver error messages related to kernel virtual memory.


© 2003 Caldera International, Inc. All rights reserved.
SCO OpenServer Release 5.0.7 -- 11 February 2003