Slicehost Articles: Customizing nginx web log...

 kokogood 2010-11-16

Customizing nginx web logs

You can create your own custom formats for nginx web logs, to record more information or to make them easier to read. Here's how.


Changing the log format

If you know how to read web logs then you may have an idea of how you would want to write them differently: maybe add a little here, trim a little out there, switch the order around a bit. Luckily, you can do that with the access logs through a couple of built-in directives and a handful of log variables.

log_format

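Nginx's "log_format" directive is what lets you define your own access log setup. Let's look at how that directive would be used to define the combined log format (abbreviated "CLF" in this article, although that abbreviation more commonly refers to the older Common Log Format, which lacks the referrer and user agent fields):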

log_format combined '$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"';

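The first argument gives a nickname to the log format you're creating. In this case it's "combined", the name of the default combined log format. Nginx predefines a format with this name, so the definition above is shown for illustration; pick a different nickname for formats you declare yourself.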

The second argument, in single quotes (and broken up across a few lines for readability), is the string that defines the log format itself.

The format string contains a bunch of placeholders that describe the data to be included in the log. That first one, for example, is "$remote_addr" and represents the IP address of the visitor (the identifier for their host). A bit further on, "$time_local" represents the time of the request.

Components of the CLF

Let's look at that CLF format string side-by-side with an access log entry in the format:

$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent"
123.65.150.10 - - [23/Aug/2010:03:50:59 +0000] "POST /wordpress3/wp-admin/admin-ajax.php HTTP/1.1" 200 2 "http://www./wordpress3/wp-admin/post-new.php" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; en-US) AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.472.25 Safari/534.3"

Okay, they don't look too pretty together, but there is a correlation between each element in the format string and the components of the log entry below it. Breaking down what the stuff in the format string means:

$remote_addr        The remote host
$remote_user        The authenticated user (if any)
$time_local         The time of the access
$request            The first line of the request
$status             The status of the request
$body_bytes_sent    The size of the server's response, in bytes
$http_referer       The referrer URL, taken from the request's headers
$http_user_agent    The user agent, taken from the request's headers

So reading along, we see that in place of "$remote_addr" is "123.65.150.10" - the remote host.

After that, "-" stays, well, "-". It fills the slot that the common log format reserves for the remote logname (the identd identity of the client), which nginx doesn't look up. Because "-" is not a variable it doesn't get replaced in the log entry.

The "$remote_user" format element turns into "-" for the remote user (since this connection didn't require authentication), "$time_local" is replaced with "23/Aug/2010:03:50:59 +0000" because it's the time the request was sent, and so on.

I feel compelled to note that for "$http_referer", "referer" is misspelled. That's the spelling of the header name in the HTTP standards, however, so it is "Referer" for all time when talking about web link referrers. A bit of lexicographical trivia for you there. Enjoy.
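The field-by-field breakdown above can be checked mechanically. What follows is a minimal Python sketch, not part of nginx, that mirrors the combined format string with a regular expression and splits the sample entry into its named parts. The names COMBINED_RE and parse_combined are made up for this illustration, and the pattern assumes the quoted fields contain no literal double quotes and that the user name contains no spaces.

```python
import re

# One named group per variable in the combined format string.
# Assumption: quoted fields hold no literal double quotes.
COMBINED_RE = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)


def parse_combined(line):
    """Return a dict of variable name -> logged value, or None if no match."""
    match = COMBINED_RE.match(line)
    return match.groupdict() if match else None


# The access log entry quoted earlier in the article, unchanged.
SAMPLE = (
    '123.65.150.10 - - [23/Aug/2010:03:50:59 +0000] '
    '"POST /wordpress3/wp-admin/admin-ajax.php HTTP/1.1" 200 2 '
    '"http://www./wordpress3/wp-admin/post-new.php" '
    '"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; en-US) '
    'AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.472.25 Safari/534.3"'
)

if __name__ == "__main__":
    # Print each variable next to the text that replaced it in the entry.
    for name, value in parse_combined(SAMPLE).items():
        print(f"{name:16} {value}")
```

Running it lists each variable name beside the piece of the entry it produced, which is the same pairing the table above describes.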

Other format components

Apart from what we saw in our breakdown of the combined log format, there are other components you can include in a log_format entry. Some commonly-used components are:

$cookie_COOKIE

The contents of the cookie named "COOKIE" for the request.

$http_HEADER

The contents of the HTTP header named "HEADER" for the request. The name of the header should be converted to lower-case and any dashes replaced with underscores, as in "$http_user_agent".

$server_name

The name of the server that handled the request. If you have multiple virtual hosts logging to the same access log, recording the server name (which should be set for each host) will help you see which connection was for which site.

$connection

The number of connections that have been handled since nginx was last started. Note that this is a cumulative total of connections, making no distinction between individual users or IP addresses. For most people this value might be interesting to see but otherwise wouldn't be terribly useful to track.

For a full list of format variables see the nginx core documentation and the nginx log format documentation.
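The naming rule for "$http_HEADER" variables (lower-case the header name and swap dashes for underscores) is easy to get wrong by hand. As a small illustration, here is a Python helper that applies the rule; the function name nginx_header_variable is invented for this sketch and is not something nginx provides.

```python
def nginx_header_variable(header_name):
    """Return the nginx variable name for a request header.

    Applies the rule described above: lower-case the header name and
    replace each dash with an underscore, then prefix with "$http_".
    """
    return "$http_" + header_name.lower().replace("-", "_")


if __name__ == "__main__":
    for header in ("User-Agent", "Referer", "X-Forwarded-For"):
        print(header, "->", nginx_header_variable(header))
```

The result is what you would paste into a log_format string to record that header.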

Make your own log format

While the log_format entry is useful for interpreting what appears in the logs, it can also be used to create your own formats.

If you want your log to add the length of time it takes to serve requests to its access entries, you might make a log_format directive that looks like:

log_format timed_combined '$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent" $request_time';

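All we have to do is add a "$request_time" to the end of the format string, then give it a new nickname. For our example, that's "timed_combined". The "$request_time" variable records how long nginx spent processing the request, in seconds with millisecond resolution.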
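Once the request time is the last field on every line, picking out slow requests takes very little code. The following Python sketch assumes entries written in the "timed_combined" format; the helper names, the one-second threshold, and the sample lines are all made up for illustration.

```python
def request_time(line):
    """Return the trailing $request_time field of an entry, in seconds."""
    # The time is the last whitespace-separated token on the line.
    return float(line.rsplit(None, 1)[1])


def slow_requests(lines, threshold=1.0):
    """Yield (seconds, line) for entries at or above the threshold."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        seconds = request_time(line)
        if seconds >= threshold:
            yield seconds, line


# Hypothetical entries, invented for this example.
EXAMPLE_LINES = [
    '10.0.0.1 - - [23/Aug/2010:03:50:59 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleAgent" 0.004',
    '10.0.0.2 - - [23/Aug/2010:03:51:02 +0000] "GET /report HTTP/1.1" 200 90210 "-" "ExampleAgent" 2.310',
]

if __name__ == "__main__":
    for seconds, line in slow_requests(EXAMPLE_LINES):
        print(seconds, line)
```

To run it against a real log you would pass an open file object instead of EXAMPLE_LINES, since iterating over a file yields its lines.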

Using the new log format

Now, if you want to tell your virtual host to make an access log using the new format, you can include in the virtual host definition:

access_log /var/log/nginx/timed.log timed_combined;

That second argument to access_log is where you specify the log format you defined. The default format (used if no log format name is included in the directive) is "combined", for the combined log format (CLF).

To recap: A log_format directive takes a format you give it and assigns it a nickname you choose. Then you use access_log to tell nginx to write the access log using the new format by telling it where to write the log and the nickname of your log format.

Adding more custom logs

You can have more than one access_log directive for a virtual host. If you already have an access_log using the CLF format you don't have to remove it when adding your "timed_combined" log. This can be useful if you want to maintain one log in CLF that a web log analyzer program can read and another log file with just the information you care about when you're skimming the entries.

So if you wanted another log with just the stuff you wanted in it, you might take that "timed_combined" format and remove the things you feel are distractions. If you decided to remove the "-" placeholder, the user entry, and the user agent entry, you could create that format with:

log_format slim '$remote_addr [$time_local] "$request" $status $body_bytes_sent "$http_referer" $request_time';

And then create a new access_log to use the "slim" format:

access_log /var/log/nginx/slim.log slim;
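A custom format like "slim" won't be understood by tools that expect the combined format, so you may end up reading it back yourself. One possible approach, sketched here in Python, is to derive a regular expression from the format string, turning each variable into a named group. The function name format_to_regex is invented for this example, the sample entry and its 0.005 request time are made up, and the sketch assumes each variable appears only once and that the literal separators are enough to tell fields apart.

```python
import re


def format_to_regex(log_format):
    """Build a compiled regex from an nginx log_format string.

    Literal text is escaped as-is; each $variable becomes a non-greedy
    named group. Assumes no variable is repeated in the format.
    """
    parts = []
    pos = 0
    for var in re.finditer(r"\$(\w+)", log_format):
        parts.append(re.escape(log_format[pos:var.start()]))
        parts.append("(?P<%s>.*?)" % var.group(1))
        pos = var.end()
    parts.append(re.escape(log_format[pos:]))
    return re.compile("^" + "".join(parts) + "$")


# The "slim" format string from the log_format directive above.
SLIM = ('$remote_addr [$time_local] "$request" $status '
        '$body_bytes_sent "$http_referer" $request_time')

# Hypothetical entry in the slim format, invented for this example.
ENTRY = ('123.65.150.10 [23/Aug/2010:03:50:59 +0000] '
         '"POST /wordpress3/wp-admin/admin-ajax.php HTTP/1.1" 200 2 "-" 0.005')

if __name__ == "__main__":
    match = format_to_regex(SLIM).match(ENTRY)
    print(match.groupdict() if match else "no match")
```

Because the pattern comes from the format string, changing the log_format only requires updating the string passed to the function.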

Precedence

Note that any logs defined in a virtual host will override log directives in the main nginx config file. So if the main config file has the access_log entry (remembering that "combined" is the default if no format is specified):

access_log /var/log/nginx/access.log;

And the virtual host has another access_log entry:

access_log /var/log/nginx/.log combined;

Then the virtual host will log its accesses to the ".log" file, but not to the "access.log" file. If you wanted accesses to be logged to both files, you would need to include a line for the main access.log file in the virtual host definition, as in:

access_log /var/log/nginx/access.log;
access_log /var/log/nginx/.log combined;
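The override rule can be stated as a tiny model: a virtual host that declares any access_log of its own uses only its own list, and otherwise falls back to the list from the main config. The Python below is only a sketch of that rule as described in this section, not of nginx internals; the function name and the "site.log" file name are hypothetical.

```python
def effective_access_logs(main_logs, vhost_logs):
    """Return the log files a virtual host actually writes to.

    Models the rule above: any access_log in the virtual host replaces
    the inherited list entirely; with none, the main list applies.
    """
    if vhost_logs:
        return list(vhost_logs)
    return list(main_logs)


MAIN = ["/var/log/nginx/access.log"]
SITE = ["/var/log/nginx/site.log"]  # hypothetical file name

if __name__ == "__main__":
    print(effective_access_logs(MAIN, []))           # inherits the main log
    print(effective_access_logs(MAIN, SITE))         # own log only
    print(effective_access_logs(MAIN, MAIN + SITE))  # both, listed explicitly
```

The third call corresponds to repeating the main access_log line inside the virtual host, as in the example above.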

Rotating new logs

When you create any new logs, you should remember to configure logrotate to rotate them regularly. Otherwise they may grow and grow until they eat all your disk space right up. Any logs in the default nginx log directory should get rotated under nginx's default rules, but if you put a new log in another directory you may need to add a rule to logrotate.

Summary

Log customization is a really handy web server feature. You can tailor the access logs to make them more readable, or to fit a format required by a log analyzer program. Or you can do both, logging accesses to both a log for a program to analyze and to another log in a more human-readable format.

All you need to do is define the log format, then tell the server where to use it.
