
[Milberg09] Chapter 10. Disk I/O: Introduction


Section IV: Disk I/O



This section gives you an overview of disk management on AIX, including how to monitor and tune your disk I/O subsystem. We also discuss best practices for disk placement, file system management, optimum hardware configuration, and concepts such as direct and concurrent I/O, and asynchronous I/O (AIO).


Chapter 10. Disk I/O: Introduction

What, exactly, is involved in tuning your disk subsystem? Tuning disk I/O is a little trickier than tuning your CPU or virtual memory subsystem. One important reason is that you can do more to optimize throughput during the initial configuration of your I/O devices than you can ever do with tuning. It's simply much easier to move things around during the initial build-out of your environment than to re-architect a production system.

Furthermore, understand that the slowest operation for running programs is the time spent on actually retrieving your data from disk. This activity involves the physical disk as well as its logical components, such as the Logical Volume Manager (LVM). All the tuning in the world will do little if you have a poorly architected subsystem. Let's look at the I/O stack, which is depicted in Figure 10.1.

Figure 10.1. I/O stack

The figure clearly shows the tight integration between physical components as they relate to both the logical disk and its application I/O. When you configure your disk, you should work from the ground up. Start with the physical system and then move to the device layers, logical volumes, file systems, files, and applications. The physical component is crucial. Configuring this component involves determining the amount of disk, type (speed), size, and throughput.

One important challenge to note with storage technology is that although the storage capabilities of disk are increasing dramatically, disk rotational speed increases more slowly. Disk I/O is clearly the weakest link on a system: while RAM access takes about 540 CPU cycles, disk access can take 20 million CPU cycles.

To reiterate, poor layout of your data affects I/O performance much more than any tunable I/O parameter. Returning to the I/O stack, you can clearly see the truth in this statement just by looking at where the tunables are on the stack. They are much closer to the top than disk placement and logical volumes.

10.1. Direct I/O

First introduced in AIX 4.3, direct I/O bypasses the Virtual Memory Manager (VMM), enabling the transfer of data directly to disk from the user's buffer. Direct I/O is not for everyone, because although it is possible to improve performance using this technique, it is also possible to degrade performance if you turn on direct I/O where you shouldn't.

Implementing direct I/O can provide near raw logical volume performance while maintaining the flexibility and manageability of file systems. What are good candidates for direct I/O? Applications that have files with poor cache utilization are one example. Another is applications that use synchronous writes, because these writes must go to disk anyway. Because direct I/O bypasses the cache and goes straight to disk, CPU usage drops: the second copy of the data is eliminated.

What are not good candidates for direct I/O? Applications that issue smaller requests against persistent segments (which translate into permanent disk locations).
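Like concurrent I/O (covered next), direct I/O can be enabled at mount time. As a minimal sketch, assuming a JFS2 file system already defined at the hypothetical mount point /u01:

# mount -o dio /u01

All files in that file system are then accessed with direct I/O; benchmark before and after, because the wrong workload will get slower, not faster.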

10.2. Concurrent I/O

Introduced in AIX 5.2, concurrent I/O (CIO) is nearly identical to direct I/O but goes one step further. With direct I/O, inodes (the data structures associated with files) are locked to prevent a condition in which multiple threads might try to change the contents of a file at the same time. CIO bypasses this inode lock, letting multiple threads read and write data concurrently to the same file. This capability is possible because JFS2 is implemented with a write-exclusive inode lock, which lets multiple users read the same file simultaneously. This design dramatically increases performance when multiple users read from the same data file.

Direct I/O can cause major problems with databases that continuously read from the same file. Concurrent I/O solves this problem, making it the preferred method for running databases. You turn on CIO either by mounting the file system with the cio option or through the open() system call. It's as simple as running the mount command:

# mount -o cio /u01

When you mount the file system using this method, all files in the file system will use CIO.

Unlike direct I/O, you can use CIO only with JFS2. As with direct I/O, some environments won't benefit from turning on CIO. For example, applications that could benefit from a file system read-ahead or high buffer cache might actually experience decreased performance. Test, test, test, and then test some more!
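A mount issued this way lasts only until the file system is unmounted. To make the option persistent across reboots, one approach is to record it in /etc/filesystems using chfs; a sketch against the same hypothetical /u01:

# chfs -a options=cio /u01

Every subsequent mount of /u01 then picks up the cio option automatically.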

10.3. Asynchronous I/O

Asynchronous I/O (AIO) conceptually relates to whether applications are waiting for I/O to complete before processing additional data. In other words, AIO lets applications continue to process while I/O runs in the background. This approach improves performance because processing can occur simultaneously.
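On AIX 5.x, the AIO subsystem appears as a pseudo-device whose server counts you can inspect and tune; as a sketch (aio0 is the legacy AIO device name on AIX 5.3):

# lsattr -El aio0
# iostat -A 1 5

lsattr shows attributes such as minservers and maxservers, and iostat -A reports AIO activity alongside the usual disk statistics.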

An AIX 6.1 note: virtually everything AIO-related has changed with the implementation of AIX 6.1. For information about these changes, see Chapter 16.

10.5. Intra-Disk Policy

Figure 10.2 depicts the relationship between the logical volumes and the physical disk.

Figure 10.2. System layers

The logical volume layer sits between the application and physical layers. The application layer correlates to the file system or raw logical volume, while the physical layer consists of the actual disks. The Logical Volume Manager (LVM) is the AIX disk management system that maps data between logical and physical storage. LVM lets data reside on multiple physical disks and be managed and analyzed using specialized LVM commands, and it controls all the physical disk resources on your system while providing a logical view of the storage subsystem.

Knowing that the logical layer sits directly between the application layer and the physical layer should help you understand why the logical layer is probably the most important of all the layers. Even your physical volumes themselves are part of the logical layer because the physical layer encompasses only the actual physical components.

What about the other elements that make up the preceding illustration? From the bottom up, each drive is named as a physical volume. Multiple physical volumes make up a volume group. Logical volumes are defined within the volume group, and LVM lets their data span multiple physical drives even though they belong to a single volume group. A logical volume consists of one or more logical partitions, and each logical partition maps to one or more physical partitions. This is where you actually mirror your system: by keeping multiple physical-partition copies of each logical partition.
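You can walk this hierarchy from the command line. As a sketch (rootvg and hd4 are standard names on most AIX systems; substitute your own volume group and logical volume):

# lspv
# lsvg -l rootvg
# lslv hd4

lspv lists the physical volumes, lsvg -l shows the logical volumes within a volume group, and lslv reports a logical volume's characteristics, including its intra- and inter-disk policies.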

How does logical volume creation correlate with physical volumes? Figure 10.3 illustrates the storage position on the physical disk platter.

Figure 10.3. Physical disk platter layout

As a general rule, data written toward the center of the platter has faster seek times than data written on the outer edge, because average head movement is smallest there. The inner edge usually has the slowest seek times, so the most I/O-intensive data should be placed closer to the center of the physical volumes. Is this always the case? There are exceptions. Disks hold more data per track on the edge than at the center, so logical volumes that are accessed sequentially should actually be placed on the edge for better performance. The same holds true for logical volumes that have Mirror Write Consistency Check (MWCC) turned on, because the MWCC sector is on the edge of the disk, not at the center. This placement is what the intra-disk policy of a logical volume controls.
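You set the intra-disk policy with the -a flag when creating a logical volume, or change it later with chlv. A sketch, assuming a hypothetical volume group vg01 and a 10-partition logical volume lvdata:

# mklv -y lvdata -a c vg01 10
# chlv -a e lvdata

Valid positions are e (outer edge), m (outer middle), c (center), im (inner middle), and ie (inner edge).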

10.6. Inter-Disk Policy

The inter-disk policy defines the number of physical disks on which the physical partitions of a logical volume reside. The general rule is that the minimum policy provides the greatest reliability and availability, while the maximum policy improves performance. Simply put, the more drives your data is spread across, the better the performance. Some other best practices include the following:

  • Allocating intensive logical volumes to separate physical volumes

  • Defining the logical volumes to the maximum size you need

  • Placing frequently used logical volumes close together

These are all reasons to understand your data before configuring your systems so that you can create policies that make sense from the start. You can define your policies when creating the logical volumes themselves using the System Management Interface Tool (SMIT) fastpath command:

# smitty mklv
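The smitty panel ultimately drives the mklv command, where the inter-disk (range) policy is the -e flag: x for maximum, m for minimum. A sketch, reusing the hypothetical vg01:

# mklv -y lvfast -e x vg01 10

Running lslv lvfast afterward should report an INTER-POLICY of maximum.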

10.7. File Systems

Two types of kernels exist in AIX: a 32-bit kernel and a 64-bit kernel. (AIX 6.1 has only a 64-bit kernel.) Although both types of kernels share some common libraries and most commands and utilities, you should understand their differences and how the kernel relates to overall performance tuning. JFS2 is optimized for the 64-bit kernel, while JFS is optimized for the 32-bit kernel. Always use JFS2 if you can. Both JFS and JFS2 are journaling file systems, and journaling has been associated with performance overhead. In fact, with JFS, where availability was not an issue and peak performance was necessary, you could disable metadata logging to increase performance. With JFS2 on AIX 5.3, that technique is no longer possible (or necessary), because the file system is tuned to handle metadata-intensive applications more efficiently. With AIX 6.1, you can once again mount file systems without logging.

The most important advantage of JFS2 lies in its ability to scale. With JFS2, you can have files up to 16 TB; JFS imposes a file size limit of 64 GB. JFS2 also includes changes in the directory organization. It uses a binary tree representation while performing inode searches, rather than the linear method used by JFS.
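As a sketch of putting this section together, the following creates a JFS2 file system (the volume group vg01 and mount point /u01 are hypothetical) and, on AIX 6.1, mounts it without logging:

# crfs -v jfs2 -g vg01 -a size=1G -m /u01 -A yes
# mount -o log=NULL /u01

The log=NULL option is the AIX 6.1 no-logging mount mentioned above; use it only where you can tolerate weaker crash-consistency guarantees.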


 

