The cluster size chosen for striped configurations (RAID 0, 4, and 5) has a large effect on the performance of a virtual disk array. In a multiuser environment, you should adjust the cluster size to match the predominant request size of the applications that access the array. In this context, applications can include the operating system buffer cache or relational database buffering mechanisms; it does not necessarily mean user programs that access a filesystem.
In a multiuser environment, the aim is for each request to affect only a single data piece. In this way, disk activity is minimized and spread evenly across the disks. If a request begins partway through a cluster, it must access two disks in the array; such a request is known as a split job, and its effect is to increase the overall disk I/O and reduce job throughput. If this type of request occurs frequently, it may be worth increasing the cluster size so that more requests fit within a single cluster. However, if you make the cluster size too large, contention between processes for access to individual clusters will also increase disk I/O and decrease job throughput.
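As an illustration only (this is not the vdisk driver's internal code; the function name, the constant, and the example offsets are hypothetical), the following C sketch shows when a request stays within one cluster and when it crosses a cluster boundary and becomes a split job:

    /* Illustrative sketch only -- not the vdisk driver's logic.
     * A request that starts and ends within one cluster touches a
     * single disk; a request that crosses a cluster boundary is a
     * split job and must access two disks.
     */
    #include <stdio.h>

    #define BLOCK_SIZE 512L                /* bytes per disk block */

    /* Return 1 if the request crosses a cluster boundary (a split job). */
    static int
    is_split_job(long offset, long length, long cluster_blocks)
    {
        long cluster_bytes  = cluster_blocks * BLOCK_SIZE;
        long first_cluster  = offset / cluster_bytes;
        long last_cluster   = (offset + length - 1) / cluster_bytes;

        return first_cluster != last_cluster;
    }

    int
    main(void)
    {
        long cluster_blocks = 32;          /* 32 blocks = 16KB cluster */

        /* A 16KB request aligned on a cluster boundary: one disk. */
        printf("aligned 16KB request:   %s\n",
            is_split_job(32768L, 16384L, cluster_blocks) ? "split" : "single");

        /* The same request starting halfway through a cluster: two disks. */
        printf("unaligned 16KB request: %s\n",
            is_split_job(40960L, 16384L, cluster_blocks) ? "split" : "single");

        return 0;
    }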
It is sometimes beneficial to make the dominant I/O size equal to the stripe size of a virtual disk array; examples are a single application performing synchronous I/O, or a single-user system. This improves throughput because I/O requests are performed in parallel across all the disks in the array. The highest throughput is obtained when write requests are the same size as the stripe and are aligned on its boundaries. On RAID 4 and 5 arrays, such full-stripe writes enhance performance because no old data needs to be read in order to generate parity.
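To see why a full-stripe write avoids extra reads, consider this C sketch (an illustration of RAID 4/5 parity generation in general, not the driver's implementation; the function name and constant are hypothetical). When every data cluster in the stripe is supplied by the caller, the parity cluster is simply the XOR of the new data, so no old data or old parity has to be read back from disk:

    /* Illustrative sketch only: parity for a full-stripe write is the
     * XOR of the new data clusters.  A partial-stripe write, by
     * contrast, needs a read-modify-write cycle to regenerate parity.
     */
    #include <stddef.h>
    #include <string.h>

    #define CLUSTER_BYTES 16384            /* e.g. a cluster size of 32 blocks */

    /* Compute the parity cluster for 'ndata' new data clusters. */
    static void
    full_stripe_parity(unsigned char *data[], size_t ndata,
                       unsigned char parity[CLUSTER_BYTES])
    {
        size_t i, j;

        memset(parity, 0, CLUSTER_BYTES);
        for (i = 0; i < ndata; i++)
            for (j = 0; j < CLUSTER_BYTES; j++)
                parity[j] ^= data[i][j];
    }

    int
    main(void)
    {
        static unsigned char d0[CLUSTER_BYTES], d1[CLUSTER_BYTES];
        static unsigned char d2[CLUSTER_BYTES], d3[CLUSTER_BYTES];
        static unsigned char parity[CLUSTER_BYTES];
        unsigned char *stripe[] = { d0, d1, d2, d3 };   /* 4 data pieces */

        full_stripe_parity(stripe, 4, parity);
        return 0;
    }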
A relational database server would appear to be an ideal application for full-stripe I/O. However, such applications often provide their own facilities for disk load-balancing and protection against disk failure; for the best possible performance, use these features in preference to virtual disk arrays. Note, however, that tuning such applications can be time consuming. Virtual disk arrays provide a quicker and more easily configurable way of obtaining a performance improvement over simple single disks.
The vdisk driver keeps counts of the types of requests that have been made to an array. You can examine these counts using the -ps options to dkconfig(ADM).
In this example, dkconfig is used to examine the request statistics for the virtual disk array /dev/dsk/vdisk3:
    /dev/dsk/vdisk3:  16384 iosz   397195 reads   153969 writes   551164 io
     piece 1 /dev/dsk/2s1   404172 reads   140260 writes   544432 io
     piece 2 /dev/dsk/3s1   350326 reads   137769 writes   488095 io
     piece 3 /dev/dsk/4s1   382089 reads   135147 writes   517236 io
     piece 4 /dev/dsk/5s1   365069 reads   135808 writes   500877 io
    Job Types:
     Full Stripe         0 reads        0 writes
     Group          174463 reads    94708 writes
     Cluster        332476 reads   119464 writes
     Split Jobs     260919
    IO Sizes:
     16384 bytes   205180 io
      1024 bytes    94662 io
      2048 bytes    73729 io
      3072 bytes    48287 io
      8192 bytes    14454 io
      4096 bytes       23 io
     12288 bytes        4 io
     13312 bytes        3 io
      5120 bytes        2 io
     10240 bytes        1 io
     7819 resets to IO size statistics

The counts include:
iosz
    The I/O size, in bytes, corresponding to the configured cluster size (the cluster size in 512-byte blocks multiplied by 512).

Full Stripe
    Requests that covered a complete stripe and were aligned on its boundaries.

Group
    Requests that spanned more than one cluster but less than a complete stripe.

Cluster
    Requests that fitted within a single cluster.

Split Jobs
    Requests that crossed a cluster boundary and so had to access two clusters on separate disks.

io
    The numbers of I/O requests (reads and writes combined) of different sizes.

Use these counts to help you determine the optimum cluster size for your application.
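For example, the following C sketch derives two simple ratios from the counters shown above (the values are copied by hand from the dkconfig output; no programmatic interface to the driver is assumed). A high split-job fraction suggests the cluster size is too small or requests are poorly aligned; a high full-stripe fraction suggests the array is already seeing efficient whole-stripe I/O:

    /* Illustrative sketch only: derive ratios from counters copied
     * by hand from "dkconfig -ps" output for /dev/dsk/vdisk3.
     */
    #include <stdio.h>

    int
    main(void)
    {
        long reads       = 397195;
        long writes      = 153969;
        long split_jobs  = 260919;
        long full_reads  = 0;
        long full_writes = 0;

        long total = reads + writes;

        printf("split jobs:  %.1f%% of all requests\n",
            100.0 * split_jobs / total);
        printf("full stripe: %.1f%% of all requests\n",
            100.0 * (full_reads + full_writes) / total);

        return 0;
    }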
If you are using the block device to access a disk array, such as when using an array for a filesystem, you may often achieve the best performance by setting the cluster size to 32 (16KB) or greater. This is because the buffer cache reads ahead 16KB.
In the example shown above, the system was running a benchmark to stress test a filesystem implemented on a RAID 5 array with a cluster size of 32. The dominant request size was 16KB, as expected for access through the buffer cache, but the number of split jobs was comparable to the number of requests. In this case, better performance might be achieved by increasing the cluster size to 40 or 48.
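For example, with a cluster size of 32, each cluster is 32 x 512 bytes = 16KB, so a 16KB request avoids being split only if it happens to start exactly on a cluster boundary. The 260919 split jobs recorded above amount to roughly 47% of the 551164 requests made to the array. A cluster size of 40 or 48 (20KB or 24KB) gives a 16KB request 4KB or 8KB of slack, so it can still fit within a single cluster even when it does not start on a cluster boundary.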