SCSI hard disks, although very reliable, suffer from physical defects like any other magnetic media. These defects cause soft errors where data loss is recoverable, and hard errors where data loss is irrecoverable. SCSI disk controllers that support an error correction code (ECC) typically can correct at least 15 to 25 incorrect bits per sector; this is known as their correction span. A hard error is reported if the number of incorrect bits is greater than the correction span.
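The relationship between the correction span and the two error classes can be sketched as follows. This is an illustration only, not controller firmware: the function name `classify_sector_error` and the default span of 15 bits are assumptions chosen for the example.

```python
def classify_sector_error(incorrect_bits: int, correction_span: int = 15) -> str:
    """Classify a sector read by comparing its bit errors with the ECC correction span.

    The default span of 15 bits is an assumed example value; the real span
    depends on the disk controller.
    """
    if incorrect_bits < 0 or correction_span < 0:
        raise ValueError("bit counts must be non-negative")
    if incorrect_bits == 0:
        return "ok"
    # Within the span, the ECC can rebuild the data: a recoverable (soft) error.
    if incorrect_bits <= correction_span:
        return "soft"
    # Beyond the span, the data cannot be reconstructed: a hard error is reported.
    return "hard"
```

With the assumed span of 15, a sector with 15 incorrect bits is still classified as a soft error, while one with 16 is classified as a hard error.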
Most SCSI disk controllers allow bad blocks to be remapped to good blocks on the disk; note that these are SCSI logical blocks not filesystem logical blocks. (SCSI logical blocks can range from 48 to 4096 bytes in size depending on the manufacturer. The logical block size is usually set to 512 bytes when the disk is formatted.) The logical addresses of bad blocks are added to the disk's grown defect list (G-List). Whenever a bad block is accessed, the disk controller invisibly remaps the request to the good block. The manufacturer's defect list or primary defect list (P-List) is written to the disk during its manufacture and is unchangeable.
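The remapping described above can be pictured with a toy model. The class `DefectRemapper` and its methods are hypothetical names used only for illustration; a real controller keeps the G-List on the disk itself and may handle a block that is reassigned twice differently.

```python
class DefectRemapper:
    """Toy model of a grown defect list (G-List) backed by a pool of spare blocks."""

    def __init__(self, spare_blocks):
        self._spares = list(spare_blocks)  # replacement blocks not yet used
        self._glist = {}                   # bad logical block -> replacement block

    def reassign(self, bad_lba: int) -> int:
        """Add bad_lba to the G-List and return the block that now replaces it."""
        if bad_lba in self._glist:
            # Simplification: an already remapped block keeps its replacement.
            return self._glist[bad_lba]
        if not self._spares:
            raise RuntimeError("no spare blocks left")
        self._glist[bad_lba] = self._spares.pop(0)
        return self._glist[bad_lba]

    def resolve(self, lba: int) -> int:
        """Return the block that actually services a request for lba."""
        return self._glist.get(lba, lba)
```

After `reassign(42)` on a remapper created with spares `[1000, 1001]`, `resolve(42)` returns 1000 while `resolve(43)` still returns 43, which mirrors how the remapping stays invisible to the caller.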
You can use the interactive badtrk(ADM) utility to reallocate bad blocks on SCSI disks in the same way that you can reallocate bad tracks for ST506 and ESDI disk controllers. badtrk adds bad blocks to the G-List by issuing a SCSI REASSIGN BLOCKS command to the disk controller. If the controller does not support block reallocation in hardware, or all the spare blocks on the disk have been used, the driver remaps bad blocks using an operating system-managed reserved area near the start of a disk partition (as for other disk types).
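For readers curious about what a REASSIGN BLOCKS request looks like on the wire, the sketch below builds the 6-byte command descriptor block and the short-format defect list defined by SCSI-2. It only assembles the bytes; how badtrk and the driver construct and submit the command is not described here, and the helper name `build_reassign_blocks` is an assumption.

```python
import struct

REASSIGN_BLOCKS = 0x07  # SCSI operation code for REASSIGN BLOCKS


def build_reassign_blocks(lbas):
    """Return (cdb, parameter_list) for reassigning the given logical blocks."""
    lbas = sorted(set(lbas))  # the defect list is sent in ascending order
    if not lbas:
        raise ValueError("at least one LBA is required")
    if any(lba < 0 or lba > 0xFFFFFFFF for lba in lbas):
        raise ValueError("LBA does not fit in 32 bits")
    if 4 * len(lbas) > 0xFFFF:
        raise ValueError("defect list too long for a 16-bit length field")
    # 6-byte CDB: opcode, four reserved bytes (LUN field left at 0), control byte.
    cdb = bytes([REASSIGN_BLOCKS, 0, 0, 0, 0, 0])
    # Header: two reserved bytes, then the defect list length in bytes (big-endian).
    header = struct.pack(">HH", 0, 4 * len(lbas))
    # Each defect descriptor is a 4-byte big-endian logical block address.
    body = b"".join(struct.pack(">I", lba) for lba in lbas)
    return cdb, header + body
```

Calling `build_reassign_blocks([0x1234, 5])` yields a 12-byte parameter list: the 4-byte header with a length of 8, followed by the two addresses in ascending order.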
Depending on the disk controller, it may be possible to configure it to add defects to the G-List automatically using Automatic Write Relocation (AWR) and Automatic Read Relocation (ARR). These are SCSI-2 features that reallocate a block as soon as any soft error is detected in it. You can use badtrk to turn AWR and ARR on or off; you are also given the opportunity to do this when the disk is first installed. It is usually undesirable to reallocate blocks in response to soft errors, since these may be as small as a single incorrect bit; if provided, the ECC should be capable of correcting such errors.
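In SCSI-2, these two settings correspond to the AWRE and ARRE bits in byte 2 of the Read-Write Error Recovery mode page (page code 01h). The sketch below shows only the bit manipulation on a copy of that page; the function name `set_auto_reallocation` is hypothetical, and fetching or writing the page with MODE SENSE and MODE SELECT is outside its scope.

```python
AWRE = 0x80  # byte 2, bit 7: automatic write reallocation enabled
ARRE = 0x40  # byte 2, bit 6: automatic read reallocation enabled


def set_auto_reallocation(page: bytes, awr: bool, arr: bool) -> bytes:
    """Return a copy of a Read-Write Error Recovery page with AWRE/ARRE updated."""
    # Byte 0 carries the page code in its low six bits; 01h identifies this page.
    if len(page) < 3 or (page[0] & 0x3F) != 0x01:
        raise ValueError("not a Read-Write Error Recovery mode page")
    out = bytearray(page)
    out[2] &= ~(AWRE | ARRE) & 0xFF  # clear both bits, keep the other flags
    if awr:
        out[2] |= AWRE
    if arr:
        out[2] |= ARRE
    return bytes(out)
```

The other flag bits in byte 2 are preserved, so switching reallocation off does not disturb the remaining error recovery settings.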
The driver cannot recover from hard errors that occur while reading from disk. Such errors mean that data has almost certainly been lost. The default action of the driver is to report these errors to the system administrator as a WARNING message (see messages(M) for more details).
Sdsk_verbosity values
Setting | Errors reported |
---|---|
SDSK_V_NONE | none |
SDSK_V_HARD | unrecoverable hard errors only |
SDSK_V_WARN | recoverable soft errors, and unrecoverable hard errors |
SDSK_V_ALL | all |
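The table can be read as a simple filter on error class. The sketch below restates it in code; the numeric values given to the SDSK_V_* names are placeholders (the real constants may differ), and `should_report` is a hypothetical helper, not part of the driver.

```python
from enum import IntEnum


class Verbosity(IntEnum):
    # Placeholder values for illustration; the driver's real constants may differ.
    SDSK_V_NONE = 0
    SDSK_V_HARD = 1
    SDSK_V_WARN = 2
    SDSK_V_ALL = 3


# Error classes reported at each setting, taken from the table above.
REPORTED = {
    Verbosity.SDSK_V_NONE: frozenset(),
    Verbosity.SDSK_V_HARD: frozenset({"hard"}),
    Verbosity.SDSK_V_WARN: frozenset({"soft", "hard"}),
}


def should_report(setting: Verbosity, error_kind: str) -> bool:
    """Return True if an error of the given kind is reported at this setting."""
    if setting == Verbosity.SDSK_V_ALL:
        return True  # "all": every error is reported
    return error_kind in REPORTED[setting]
```

For example, a recoverable soft error is suppressed at SDSK_V_HARD but reported at SDSK_V_WARN and SDSK_V_ALL.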
NOTICE: Sdsk, Spurious interrupt
No command was pending when an interrupt was received.

WARNING: Sdsk: Bad block size SDsk: Block size (n) must be between NBPSCTR and SBUFSIZE
The block size on the device has been found to be outside the allowed limits.
The block numbers reported by the Sdsk driver when an unrecoverable read or write error is detected on the disk are unreliable. You should scan the partition containing the defective block using the badtrk utility. This will report the true SCSI logical block addresses (LBAs) of any bad blocks in that partition. You can then reallocate these blocks using badtrk. If you need to discover in which files the bad blocks occurred, use the LBA reported by badtrk with the badblk command.