A poorly performing system is almost always CPU-, memory-, or I/O-bound. Here are a few items to check that many sysadmins easily miss.
1) Check your ulimit settings. By default, users (including root) are capped on file size and memory usage. If your users complain that they cannot use enough memory or create a large file, check their ulimit settings. To check ulimits, log in as the appropriate user and run ulimit -a and ulimit -aH. The first command shows the soft limits and the second shows the hard limits.
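As a rough sketch of that check, the function below prints the soft and hard values side by side for the two limits mentioned above (file size and data segment). The function name limit_report is only an illustration, and the units depend on the shell you run it in.

```shell
# Print soft and hard limits side by side for file size and data segment.
# Units are whatever the running shell reports (ksh and bash differ).
limit_report() {
    printf 'file size:    soft=%s hard=%s\n' "$(ulimit -S -f)" "$(ulimit -H -f)"
    printf 'data segment: soft=%s hard=%s\n' "$(ulimit -S -d)" "$(ulimit -H -d)"
}

limit_report
```

If a soft limit is lower than the hard limit, the user can raise it for the current shell with ulimit. On AIX the persistent per-user values are normally kept in /etc/security/limits.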
2) Use topas (shipped with AIX 4.3.3 and above). Topas is a great tool for viewing many performance statistics at once. Pay attention to the %busy for disks as well as paging rates and the resource consumption of individual processes. Topas will help determine if a system is CPU-, memory-, or I/O-bound. Be aware that topas can use a lot of system resources. Check the man page for options to decrease this load, such as -i and -p.
3) If a system is CPU- or memory-bound, check your processes. Are your most important processes receiving the resources they need? If less important processes are hogging CPU or memory, you can use Workload Manager (distributed with 4.3.3 and up) to throttle CPU, RAM, and I/O usage by user, group, or executable. There is a LOT to WLM, so if you are interested, I encourage you to read the Redbook. Be aware that WLM adds a little overhead to manage the resources, but this overhead is often worth it.
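Before reaching for WLM, it helps to see which processes are actually on top. A minimal sketch, assuming the usual ps aux column layout (column 3 is %CPU, column 4 is %MEM); the helper name top_procs is made up for this example.

```shell
# Read `ps aux`-style output on stdin and print the heaviest processes.
#   $1 = column to sort on (3 = %CPU, 4 = %MEM in the usual `ps aux` layout)
#   $2 = number of rows to print (default 5)
top_procs() {
    awk 'NR > 1' | sort -k "$1,$1" -rn | head -n "${2:-5}"
}

# Typical use:
#   ps aux | top_procs 3 5    # top five CPU consumers
#   ps aux | top_procs 4 5    # top five memory consumers
```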
4) Most slow performance is caused by I/O bottlenecks. I/O tuning is an art in and of itself, but there are several key items to look at:
4.1) Use filemon. Filemon is an excellent tool to help pinpoint I/O bottlenecks. Among other things, filemon will display the most heavily used physical volumes, logical volumes and file systems.
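A typical filemon run looks something like the sketch below: start the trace, let the workload run for a while, then stop the trace so the report gets written. The wrapper name io_trace, the report path and the 60-second default are assumptions for illustration. It needs root on AIX and simply returns 1 anywhere filemon is not installed.

```shell
# Trace I/O for a fixed period and write the filemon report to a file.
#   $1 = report file, $2 = seconds to trace (default 60)
# Must be run as root on AIX; returns 1 where filemon is not available.
io_trace() {
    command -v filemon >/dev/null 2>&1 || {
        echo "filemon not found (AIX only)" >&2
        return 1
    }
    filemon -o "$1" -O lv,pv    # start tracing logical and physical volumes
    sleep "${2:-60}"            # let the workload run
    trcstop                     # stop the trace; filemon then writes the report
}

# Typical use:
#   io_trace /tmp/filemon.out 60
```

Keep the trace window short on a busy system, since the trace itself adds load.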
4.2) File System Buffers. By default, the number of file system buffers is set to 196. For high I/O systems, this is typically too small. To see if you are blocking I/O due to not having enough file system buffers, run: vmstat -v. For JFS file systems, look at the "filesystem I/Os blocked with no fsbuf" line. For JFS2 file systems, look at the "external pager filesystem I/Os blocked with no fsbuf" line (the "client filesystem I/Os blocked with no fsbuf" line covers NFS and other client file systems). If these values are more than a couple thousand, you may need to increase the respective parameters. For JFS file systems, you will need to change the numfsbufs parameter. For JFS2 file systems, change the j2_nBufferPerPagerDevice parameter. Changing either parameter does not require a reboot, but the new value only takes effect when a file system is mounted, so you will have to unmount and remount the file system.
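To make the vmstat -v check repeatable, something like the following filter can pick out the "blocked with no ...buf" counters that are above a threshold. The function name fsbuf_blocked and the default threshold of 2000 ("a couple thousand") are assumptions for this sketch.

```shell
# Read `vmstat -v` output on stdin and print every "blocked with no ...buf"
# counter that is above the threshold.
#   $1 = threshold (default 2000)
fsbuf_blocked() {
    awk -v limit="${1:-2000}" '/blocked with no/ && $1 + 0 > limit + 0 {
        count = $1
        $1 = ""                 # drop the counter from the description
        sub(/^ +/, "")          # trim the space left behind
        print count ": " $0
    }'
}

# Typical use:
#   vmstat -v | fsbuf_blocked 2000
```

These counters accumulate from boot, so compare two readings taken some time apart before deciding anything. On AIX 5.2 and later the two parameters are normally changed with the ioo command; earlier releases used vmtune.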
4.2) JFS Log Devices. Heavily used file systems should ALWAYS have their own JFS log on a separate physical disk. Every metadata update on a JFS (or JFS2) file system (file creation, deletion, extension and so on) is written to the JFS log. By default, there is only one JFS log created for any volume group containing JFS file systems. This means that the log writes for ALL the file systems in the volume group go to ONE PHYSICAL DISK!! (That is, unless your underlying disk structure is striped or uses another form of RAID for performance.) Creating separate JFS logs on different physical disks is very important to getting the most out of the AIX I/O subsystem.
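The usual sequence for giving a file system its own log is: create a log logical volume on the chosen disk, format it, point the file system at it, and remount. The sketch below only prints those commands so you can review them first. The helper name jfslog_plan and the names datavg, datalog01, hdisk3 and /data are hypothetical.

```shell
# Print (do not run) the commands for moving a file system to a dedicated log.
#   $1 = volume group, $2 = new log LV name, $3 = target disk, $4 = mount point
#   $5 = log type: jfslog (default) or jfs2log
jfslog_plan() {
    logtype=${5:-jfslog}
    printf 'mklv -t %s -y %s %s 1 %s\n' "$logtype" "$2" "$1" "$3"
    printf 'logform /dev/%s\n' "$2"
    printf 'umount %s\n' "$4"
    printf 'chfs -a log=/dev/%s %s\n' "$2" "$4"
    printf 'mount %s\n' "$4"
}

# Hypothetical names, for illustration only.
jfslog_plan datavg datalog01 hdisk3 /data
```

Note that logform asks for confirmation because it destroys whatever is on the logical volume, so double-check the device name before answering yes.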
4.3) Paging Space Devices. Paging is not necessarily a bad thing; how much is acceptable depends on the system, the application, the workload and the required performance. That said, paging will always make the system at least a little slower, and excessive paging can bring a system to its knees. If your system pages heavily and you cannot afford to increase physical memory, at least move the paging logical volumes to unused or lightly used physical disks. You should also have multiple paging spaces, all the same size. This will ensure that AIX uses each paging space equally.
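A quick way to audit the current layout is to run lsps -a through a small filter like the one below. It is a sketch that assumes the usual lsps -a column order (name, physical volume, volume group, size, %used); the function name paging_check is made up for this example.

```shell
# Read `lsps -a` output on stdin, summarize each paging space and warn when
# sizes differ or when two paging spaces share a physical disk.
paging_check() {
    awk 'NR > 1 {
        sizes[$4]++
        disks[$2]++
        printf "%s on %s: %s, %s%% used\n", $1, $2, $4, $5
    }
    END {
        n = 0
        for (s in sizes) n++
        if (n > 1) print "WARNING: paging spaces differ in size"
        for (d in disks)
            if (disks[d] > 1) print "WARNING: more than one paging space on " d
    }'
}

# Typical use:
#   lsps -a | paging_check
```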
4.4) VM minperm% and maxperm% settings. If you do not understand minperm% and maxperm%, USE CAUTION HERE! These settings are thresholds, expressed as percentages of real memory occupied by file pages, that determine whether the page replacement algorithm (the lrud) steals only file pages or both file and computational pages. As such, changing these parameters can have a very adverse effect on performance. When running a DBMS (Oracle, Sybase, etc.), these values are typically decreased since the DBMS uses its own buffer cache. If you do change these values, it is advisable to decrease them a little (no more than 10%) at a time and monitor performance at each step.
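To illustrate the "a little at a time" advice, the sketch below prints a stepped plan from the current value down to a target, never moving more than 10 at once. It reads "10%" as ten percentage points, which is an assumption, and it only prints the commands. It uses the vmo syntax from AIX 5.2 and later (older releases used vmtune). The function name perm_steps and the values 80 and 55 are hypothetical.

```shell
# Print (do not run) a stepped plan for lowering a VMM percentage tunable.
#   $1 = tunable name, $2 = current value, $3 = target value
# Each step lowers the value by at most 10 (percentage points, by assumption).
perm_steps() {
    awk -v name="$1" -v cur="$2" -v target="$3" 'BEGIN {
        while (cur > target) {
            cur = (cur - 10 > target) ? cur - 10 : target
            printf "vmo -o %s=%d\n", name, cur
        }
    }'
}

# Typical use (hypothetical values):
#   perm_steps 'maxperm%' 80 55
```

Apply one step, watch the system under real load for a while, and only then move to the next one.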
4.5) Check asynchronous I/O. If your application can utilize AIO, make sure you have it enabled. Databases in particular typically require AIO to perform with any speed. AIO is not turned on by default. It must be enabled by the administrator.
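On releases that expose AIO as the aio0 device (an assumption; later AIX versions handle AIO differently), the current setting shows up in lsattr -El aio0. A small sketch for pulling out the relevant attribute, with a made-up function name aio_state:

```shell
# Read `lsattr -El aio0` output on stdin and report the autoconfig attribute.
# "defined" means AIO is not configured at restart; "available" means it is.
aio_state() {
    awk '$1 == "autoconfig" { found = 1; print "AIO autoconfig: " $2 }
         END { if (!found) print "AIO autoconfig: not reported" }'
}

# Typical use:
#   lsattr -El aio0 | aio_state
```

If it reports "defined", AIO is commonly enabled through smitty aio, or with mkdev -l aio0 for the running system plus chdev -l aio0 -P -a autoconfig=available to make it stick across restarts.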
4.6) Use raw devices for databases. Raw devices typically increase the performance of a database. First, raw devices bypass the file system overhead for I/O. When using file systems with a database, the data is double-buffered: once in the file system buffer cache AND again in the DBMS buffer cache. Using raw devices avoids this double-buffering. Additionally, when using file systems, a write to a table or a log takes an inode lock on the actual file. This inode lock is not released until the I/O is complete. The inode lock can negate the performance benefits of row/table locking in the DBMS.
While this is by no means an exhaustive list of performance tuning steps for AIX, it is a start if you are new to this operating system. Happy tuning!!