
[Milberg09] Chapter 15. Network I/O: Tuning

 Stefen 2010-09-25


The most important command for tuning AIX network parameters is the no command. First, take a look at the first few parameters, using the -a flag:

root@lpar37p682e[/] > no -a

arpqsize = 12
arpt_killc = 20
arptab_bsiz = 7
arptab_nb = 149
bcastping = 0
clean_partial_conns = 0
delayack = 0
delayackports = {}

As an alternative, you can use the -L flag, which provides much more detailed information.

The no command provides more than 100 parameters you can tune. In older versions of AIX, thewall was an important tunable whose defaults you needed to change; this parameter defined the upper limit for network kernel buffers. Today, this size is defined at installation time depending on the amount of RAM and the kernel type. For example, if you are running AIX 5.3 on a 64-bit kernel, the parameter is set at half the size of real memory. (I actually used to enjoy playing around with thewall, so I'm not sure I like the new approach.) You can use netstat -m to detect shortages or failures of network memory requests. In the following example, there are no shortages (failures):

root@lpar37p682e[/etc/tunables] > netstat -m
Kernel malloc statistics

******* CPU 0 *******
By size   inuse   calls   failed  delayed   free   hiwat   freed

32          117     217        0        0     11    5240       0
64          109    6523        0        1     83    5240       0
128         975   15951        0       29    785    2620       0
256         520   67637        0       30   1016    5240       0

Streams mblk statistic failures

0 high priority mblk failures
0 medium priority mblk failures
0 low priority mblk failures
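To spot shortages mechanically rather than by eye, you can sum the failed column of saved netstat -m output. The sample rows and the awk one-liner below are an illustrative sketch (a trimmed stand-in for real output, with one failure injected into the 128-byte row), not AIX-specific tooling:

```shell
# Sum the "failed" column (4th field) of per-size rows captured
# from `netstat -m`. Sample rows stand in for real AIX output;
# the 128-byte row has a failure injected for illustration.
netstat_m_sample='32 117 217 0 0 11 5240 0
64 109 6523 0 1 83 5240 0
128 975 15951 2 29 785 2620 0'

failed_total=$(printf '%s\n' "$netstat_m_sample" | awk '{sum += $4} END {print sum}')
echo "total failed requests: $failed_total"
```

A nonzero total is the cue to look more closely at network memory limits.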

Although you can change many parameters using the no utility, most of them are better left alone. The most important parameters are those that relate to TCP streaming workload tuning:

  • tcp_sendspace — This parameter controls how much kernel buffer space is used to buffer the sending application's data. You generally want to raise this value from the default, because once the limit is reached, the sending application is suspended until TCP drains the buffer by transmitting data.

  • tcp_recvspace — In addition to controlling the amount of buffer space consumed by receive buffers, this value determines the size of the TCP receive window that AIX advertises to the sender.

  • udp_sendspace — When using UDP, you can set this value no higher than 65536 because IP has an upper limit of 65,536 bytes per packet.

  • udp_recvspace — This value should be greater than udp_sendspace because a socket may need to buffer many incoming UDP packets at once. A common rule of thumb is to set this parameter to 10 times the value of udp_sendspace.
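The UDP sizing rules above reduce to simple arithmetic (values taken from the text; this is illustration, not a tuning script):

```shell
# UDP sizing rules from the text: udp_sendspace tops out at 65536
# (the IP per-packet limit), and udp_recvspace is set to 10x that.
udp_sendspace=65536
udp_recvspace=$((udp_sendspace * 10))
echo "udp_sendspace=$udp_sendspace udp_recvspace=$udp_recvspace"
```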

Let's use no to make a few changes. First, increase the size of udp_sendspace:

root@lpar37p682e[/] > no -p -o udp_sendspace=65536

Setting udp_sendspace to 65536
Setting udp_sendspace to 65536 in nextboot file

Next, change udp_recvspace to the recommended configuration of 10 times udp_sendspace:

root@lpar37p682e[/] > no -p -o udp_recvspace=655360

Setting udp_recvspace to 655360
Setting udp_recvspace to 655360 in nextboot file
Change to tunable udp_recvspace, will only be effective for future connections


Note that the -p flag makes the change persist across reboots by appending the updated values to the /etc/tunables/nextboot stanza file.
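For reference, the stanza written by -p looks roughly like the following (the exact layout shown here is illustrative; inspect your own /etc/tunables/nextboot for the real format):

```
no:
        udp_sendspace = "65536"
        udp_recvspace = "655360"
```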

As for the TCP parameters on higher-speed adapters, setting tcp_sendspace to twice the value of tcp_recvspace works well.

Two other important workload parameters of the no command are rfc1323 and sb_max. The rfc1323 tunable enables the TCP window scaling option, which lets TCP use a window size larger than 64 KB; turning it on enables the best TCP performance. The sb_max tunable sets an upper limit on the number of socket buffers queued to an individual socket, controlling the total buffer space that can be queued to a sender or receiver socket. This number should usually be less than thewall and approximately four times the largest of the TCP or UDP send and receive settings. For example, if your udp_recvspace value is 655360, you can't go wrong by doubling it to 1310720.
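The sb_max guidance can likewise be written as arithmetic (a sketch using the text's example values):

```shell
# sb_max must exceed the largest send/receive buffer setting;
# the text's example simply doubles udp_recvspace.
udp_recvspace=655360
sb_max=$((udp_recvspace * 2))
echo "sb_max=$sb_max"
```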

Another useful no tunable, tcp_nodelayack, prompts TCP to send an immediate rather than a delayed acknowledgment. Although sending an immediate acknowledgment can add more overhead in some environments, it can greatly improve network performance in others. If changing this parameter does not improve performance in your environment, you can quickly change it back.

Let's also review ipqmaxlen. This tunable controls the length of the IP input queue. If netstat -s shows a nonzero overflow counter for this queue, increasing ipqmaxlen can help eliminate the overflows.
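As a sketch, you can pull the overflow counter out of saved netstat -s output with awk. The sample text below is a made-up stand-in for real output (AIX reports this counter as "ipintrq overflows"):

```shell
# Extract the ipintrq overflow counter from captured `netstat -s`
# output; a nonzero value suggests raising ipqmaxlen.
netstat_s_sample='ip:
        1045 total packets received
        12 ipintrq overflows'

overflows=$(printf '%s\n' "$netstat_s_sample" | awk '/ipintrq overflows/ {print $1}')
echo "ipintrq overflows: $overflows"
```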

What about Address Resolution Protocol (ARP)? When many clients are connected to the system, you might want to tune the ARP cache. You can examine the relevant statistics using netstat:

root@lpar37p682e[/etc/tunables] > netstat -p arp

arp:
10 packets sent
0 packets purged

If you see a high purge count, increase the size of the ARP table. In the preceding example, no increase is needed.

Here are the no parameters that relate to arp:

root@lpar37p682e[/etc/tunables] > no -a | grep arp

arpqsize = 12
arpt_killc = 20
arptab_bsiz = 7
arptab_nb = 149
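A rough capacity check, assuming the table holds arptab_nb buckets of arptab_bsiz entries each (values from the output above):

```shell
# Approximate ARP table capacity: buckets x entries per bucket.
arptab_nb=149
arptab_bsiz=7
arp_capacity=$((arptab_nb * arptab_bsiz))
echo "ARP table capacity: $arp_capacity entries"
```

If purge counts climb while the client population approaches this figure, raising arptab_nb or arptab_bsiz is the natural next step.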

You can tune these buffers either systemwide or according to specific interfaces. To tune by interface, set the no command's use_isno option to 1 (this option is enabled by default in AIX 5.3):

root@lpar37p682e[/etc/tunables] > no -a | grep use
use_isno = 1

Disabling the use_isno parameter (by setting it to 0) can serve as a diagnostic aid: it applies the systemwide buffer values across the board, which helps isolate performance problems. When these values are set on specific interfaces, they override the defaults shown by no, which can sometimes confuse system administrators. You can view interface-specific settings using either ifconfig or lsattr:

# ifconfig en0

en0: flags=1e080863,480<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,
GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
inet 172.29.135.44 netmask 0xffffc000 broadcast 172.29.191.255
tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1

In this example, look at the settings using ifconfig (see the last line, which references a couple of the tunables mentioned earlier). You can change these options (by interface) using SMIT or the chdev or ifconfig command. Note that ifconfig will not update the Object Data Manager (ODM), so on reboot, the settings will revert to their previous values. For this reason, you should use SMIT. Use the smit tcpip fastpath, and go to Further configuration > Network interfaces > Change/Show characteristics of an interface.

15.1. Name Resolution

Name resolution is another area that can impact performance. If you know how you want to resolve names (using either DNS or the hosts file), make sure name resolution is set up correctly in the /etc/netsvc.conf file. If you're using DNS, remove the local entry if you are not using a hosts file at all, or keep it as a backup to DNS (listed second). If you're not using DNS, remove the bind entry; when bind is listed first, every lookup begins by trying a name server that doesn't exist, which slows performance.
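For example, to try DNS first and fall back to the local hosts file, /etc/netsvc.conf would contain a line like this:

```
hosts = bind, local
```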

15.2. Maximum Transfer Unit

The maximum transfer unit (MTU) is defined as the largest packet that can be sent over a network. The size depends on the type of network. For example, 16-bit token-ring has a default MTU size of 17,914, while Fiber Distributed Data Interface (FDDI) has a default size of 4,352. Ethernet's default size is 1,500 (or 9,000 with jumbo frames enabled). Larger packets mean fewer packet transfers, which results in higher bandwidth utilization on your system. An exception to this rule is if your application prefers smaller packets.
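A quick back-of-the-envelope illustration of why larger MTUs mean fewer transfers (payload only; header overhead ignored):

```shell
# Packets needed to move 1 MB of payload at standard vs. jumbo MTU.
payload=$((1024 * 1024))
std=$(( (payload + 1499) / 1500 ))    # ceiling division
jumbo=$(( (payload + 8999) / 9000 ))
echo "MTU 1500: $std packets; MTU 9000: $jumbo packets"
```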

If you're using Gigabit Ethernet, you can use the jumbo frames option. To support the use of jumbo frames, your switch must be configured accordingly. To change to jumbo frames, use the smit device fastpath and go to Communication > Ethernet > Adapter > Change/Show Characteristics of an Ethernet Adapter. You can make the change from there.

15.3. Tuning: Client

The biod daemon plays an important role in connectivity. While biod self-tunes the number of threads (the daemon process creates and kills threads as needed), you can adjust the maximum number of biod threads, depending on the overall load. An important concept to understand here is that increasing the number of threads alone will not alleviate performance problems caused by CPU, I/O, or memory bottlenecks. For example, if your CPU is near 100 percent utilization, increasing the number of threads won't help you at all.

Increasing the number of threads can help when multiple application threads access the same files and you don't find any other types of bottlenecks. Using the lsof command can help you further determine which threads are accessing which files. From earlier tuning sections, you might remember the Virtual Memory Manager parameters minperm and maxperm. Unlike when you tune database servers, with NFS you want to let the VMM use as much RAM as possible for NFS data caching. Most NFS clients have little need for working segment pages. To ensure that all memory is used for file caching, set both maxperm and maxclient to 100 percent:

root@lpar24ml162f_pub[/tmp] > vmo -o maxperm%=100

Setting maxperm% to 100
root@lpar24ml162f_pub[/tmp] > vmo -o maxclient%=100
Setting maxclient% to 100

Note that in the event that your application uses databases and could benefit from performing its own file data caching, you should not set maxperm and maxclient to 100 percent. In this situation, set these numbers low and mount your file systems using concurrent I/O over NFS. NFS maintains caches on each client system that contain attributes of the most recently accessed files and directories. The mount command controls the length of time that these entries are kept in cache.

The mount parameters you can change include the following: acdirmin, acdirmax, acregmin, acregmax, and actimeo. For example, the acregmin parameter specifies the minimum length of time after an update that cached file entries are retained; when a file is updated, how soon its entry leaves the cache depends on this parameter's value.

Using the mount command, you can also specify whether you want a hard or soft mount. With a soft mount, if an error occurs, it is reported immediately to the requested program; with a hard mount, NFS keeps retrying. These retries themselves could lead to performance problems. From a reliability standpoint, hard mounting read and write directories is recommended to prevent possible data corruption.

Mount parameters rsize and wsize define the maximum sizes of RPC packets for reads and writes, respectively. The default value is 32,768 bytes. With NFS Versions 3 and 4, if your NFS volumes are mounted on high-speed networks, you should increase this setting to 65,536. On the other hand, if your network is extremely slow, you might consider decreasing the default to reduce packet fragmentation by sending shorter packets. However, if you decrease the default, more packets must be sent, which could increase overall network utilization.
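As a sketch only (the host and export names are made up, and the options assume NFS Version 3 over a fast network), a mount that sets these sizes explicitly and hard-mounts the file system might look like:

```
mount -o vers=3,hard,rsize=65536,wsize=65536 nfssrv:/export/data /mnt/data
```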

Understand your network, and tune it accordingly!

15.4. Tuning: Server

Before examining specific NFS parameters, always try to decrease the load on the network while also looking at your CPU and I/O subsystems. CPU bottlenecks often contribute to what appears to be an NFS-specific problem. For example, NFS can use either TCP or UDP, depending on the version and your preference. Make sure your tcp_sendspace and tcp_recvspace tunables are set to values higher than the defaults because this can have an impact on your server by increasing network performance. You tune these values with the no command:

root@lpar24ml162f_pub[/tmp] > no -a | grep send

ipsendredirects = 1
ipsrcroutesend = 1
send_file_duration = 300
tcp_sendspace = 1638
udp_sendspace = 9216

root@lpar24ml162f_pub[/] > no -o tcp_sendspace=524288

Setting tcp_sendspace to 524288
Change to tunable tcp_sendspace, will only be effective for future connections


If you are running Version 4 of NFS, make sure you turn on nfs_rfc1323. Doing so allows for TCP window sizes greater than 64K. Set this value on the client as well.

root@lpar24ml162f_pub[/] > no -o rfc1323=1

Setting rfc1323 to 1

As an alternative, you can set the rfc1323 tunable using the nfso command, which manages the NFS tuning parameters:

root@lpar24ml162f_pub[/] > nfso -o nfs_rfc1323=1

Setting nfs_rfc1323 to 1

Setting rfc1323 with nfso configures the TCP window to affect only NFS (as opposed to no, which applies this setting across the board). If you have already set this option with no, you don't need to change it, although you might want to in case some other Unix administrator decides to play around with the no command.

Similar to the client, if the server is a dedicated NFS server, make sure you tune your VMM parameters accordingly. Modify maxperm and maxclient to 100 percent to make sure the VMM controls the caching of the page files, using as much memory as possible in the process.

On the server, tune nfsd, which is multithreaded, the same way you tuned biod. (Other daemons you can tune include rpc.mountd and rpc.lockd.) Like biod, nfsd self-tunes, depending on the load. Increase the number of threads using the nfso command. One parameter to check is nfs_max_read_size, which sets the maximum size of RPCs for read replies. Look at what nfs_max_read_size is set to below:

root@lpar24ml162f_pub[/tmp] > nfso -L nfs_max_read_size

NAME                     CUR   DEF   BOOT  MIN   MAX   UNIT   TYPE
DEPENDENCIES
---------------------------------------------------------------------------
nfs_max_read_size        32K   32K   32K   512   64K   Bytes  D

Let's increase it to 64K (using bytes):

root@lpar24ml162f_pub[/tmp] > nfso -o nfs_max_read_size=65536
root@lpar24ml162f_pub[/tmp] > nfso -L nfs_max_read_size

NAME                     CUR   DEF   BOOT  MIN   MAX   UNIT   TYPE
DEPENDENCIES
---------------------------------------------------------------------------
nfs_max_read_size        64K   32K   32K   512   64K   Bytes  D

We just changed nfs_max_read_size to the maximum value allowed. To keep the new value after a reboot, rerun the command with the -p flag so that the change is recorded in the /etc/tunables/nextboot file.

The nfso command offers additional parameters you can modify. To list them all, use the -a or -L flag.


