[Title]: Nagios Configuration Study Notes
[Keywords]: Nagios
[Source]: http://www.cublog.cn/u/1589/showart.php?id=114110

Nagios Configuration Study Notes


2005.05.17

chnl@163.com




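Notes

This document covers only the basic installation and configuration of Nagios. Configuring commands (plugins), monitoring local and remote hosts and services, and integrating with SNMP management will be covered in follow-up documents.

The test host runs Redhat 9.0.

This document is based mainly on the official Nagios documentation, Nagios_2_0_Docs.pdf.

This document is owned by chenl (chnl@163.com) and is released under the GPL. It may be freely copied and redistributed, provided it is kept complete. Commercial use of any kind is strictly prohibited.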


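I. Introduction

For some time I worked on projects at a company specializing in IT network management, so I have kept an eye on related products. Recently I had a few Linux servers to manage. Because the setup was small, I searched Google for open source options and found Nagios. After configuring and using it myself, I found it quite good. Nagios provides powerful monitoring and handling for hosts, network services, and the like. It can respond to problems in IT infrastructure in a variety of flexible ways, and it exposes a plugin interface for site-specific needs.

The following is a simple configuration walkthrough. I hope it helps anyone who is interested.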

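II. Installation

The installation steps in the official Nagios documentation all start from source, and the PDF describes that process in great detail. To save effort, I installed Nagios from binary RPM packages instead.

Files used for the installation: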

gd-2.0.33.tar.gz

nagios-2.2-1.rh9.rf.i386.rpm

nagios-devel-2.2-1.rh9.rf.i386.rpm

nagios-plugins-1.4.2.tar.gz



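1. Installing gd

Nagios depends on gd, so install gd-2.0.33.tar.gz first.

For the detailed procedure, see the INSTALL file shipped in the unpacked package. If you have no special requirements, change into the unpacked source directory and run:

# ./configure && make && make install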


2. Installing Nagios

Nagios is installed from RPMs here. Only the following two packages are needed:

nagios-2.2-1.rh9.rf.i386.rpm

nagios-devel-2.2-1.rh9.rf.i386.rpm (optional)

A typical installation amounts to running:

# rpm -ivh nagios*.rpm


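3. Installing nagios-plugins

The latest nagios-plugins package can be downloaded from the official Nagios website. Installation is simple and similar to that of gd. Remember that once the installation finishes, you need to copy all of the files generated in libexec to /usr/lib/nagios/plugins (the default plugin directory when Nagios is installed from the RPM).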
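The copy step can be scripted. The following is a minimal Python sketch, not part of the original procedure: the function name `copy_plugins` is made up for illustration, and the location of the built `libexec` directory depends on how nagios-plugins was configured, so it is passed in as a parameter.

```python
import shutil
from pathlib import Path


def copy_plugins(libexec_dir, plugins_dir):
    """Copy every regular file from libexec_dir into plugins_dir.

    Returns the sorted names of the files that were copied.
    Subdirectories are skipped.
    """
    src = Path(libexec_dir)
    dst = Path(plugins_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in sorted(src.iterdir()):
        if entry.is_file():
            # copy2 keeps the permission bits, so plugins stay executable
            shutil.copy2(entry, dst / entry.name)
            copied.append(entry.name)
    return copied


# Example call (the source path is an assumption, adjust it to your build):
# copy_plugins("/path/to/libexec", "/usr/lib/nagios/plugins")
```

Writing to /usr/lib/nagios/plugins normally requires root privileges.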

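III. Configuration

After installation, the default configuration directory is /etc/nagios. The files that need to be modified are listed below.

1. nagios.cfg

To make debugging easier, this file is set to use only minimal.cfg. All other object files are commented out, as shown here: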

cfg_file=/etc/nagios/minimal.cfg

#cfg_file=/etc/nagios/contactgroups.cfg

#cfg_file=/etc/nagios/contacts.cfg

#cfg_file=/etc/nagios/dependencies.cfg

#cfg_file=/etc/nagios/escalations.cfg

#cfg_file=/etc/nagios/hostgroups.cfg

#cfg_file=/etc/nagios/hosts.cfg

#cfg_file=/etc/nagios/services.cfg

#cfg_file=/etc/nagios/timeperiods.cfg

In addition, the references to checkcommands.cfg and misccommands.cfg must also be commented out.

See the appendix for the full configuration.
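A quick way to confirm that only minimal.cfg is still active is to list the `cfg_file` lines that are not commented out. The sketch below is an illustration in Python. The helper name `active_cfg_files` and the shortened `SAMPLE` text are assumptions, not part of Nagios.

```python
def active_cfg_files(main_cfg_text):
    """Return the cfg_file paths that are not commented out."""
    paths = []
    for raw in main_cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "cfg_file":
            paths.append(value.strip())
    return paths


# Shortened, illustrative excerpt of /etc/nagios/nagios.cfg
SAMPLE = """\
cfg_file=/etc/nagios/minimal.cfg
#cfg_file=/etc/nagios/contacts.cfg
#cfg_file=/etc/nagios/checkcommands.cfg
log_file=/var/log/nagios/nagios.log
"""

print(active_cfg_files(SAMPLE))  # ['/etc/nagios/minimal.cfg']
```

For a full check, the nagios binary itself can verify a configuration when run with the -v option followed by the path of the main config file.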


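2. minimal.cfg

This is the main object configuration file. It defines the commands, hosts, services, and so on. Most definitions are written as templates, so it is easy to add your own entries by following the existing ones.

See the appendix for the full configuration.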
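To make the template idea concrete, here is a rough Python sketch of how a definition inherits values through the `use` directive. It is a simplified model for illustration only: the sample host definitions are made up, only a single parent is followed, and there is no protection against templates that refer to each other in a loop. The real parser in Nagios handles more cases.

```python
import re

BLOCK_RE = re.compile(r"define\s+(\w+)\s*\{(.*?)\}", re.DOTALL)

# Directives that describe the template itself and are not inherited
NOT_INHERITED = ("name", "use", "register")


def parse_objects(text):
    """Parse 'define <type>{ ... }' blocks into (type, directives) pairs."""
    objects = []
    for obj_type, body in BLOCK_RE.findall(text):
        directives = {}
        for raw in body.splitlines():
            line = raw.split(";", 1)[0].strip()  # drop inline ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            directives[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        objects.append((obj_type, directives))
    return objects


def resolve(obj_type, directives, objects):
    """Merge in the directives inherited through 'use' (single parent)."""
    merged = {}
    parent = directives.get("use")
    if parent:
        for other_type, other in objects:
            if other_type == obj_type and other.get("name") == parent:
                inherited = resolve(other_type, other, objects)
                merged.update(
                    (k, v) for k, v in inherited.items() if k not in NOT_INHERITED
                )
                break
    merged.update(directives)  # local values override inherited ones
    return merged


# Illustrative definitions in the style of minimal.cfg
SAMPLE = """\
define host{
    name                generic-host   ; template name
    max_check_attempts  10
    register            0              ; template only, not a real host
    }

define host{
    use        generic-host
    host_name  localhost
    address    127.0.0.1
    }
"""

objects = parse_objects(SAMPLE)
host = resolve(*objects[1], objects)
print(host["host_name"], host["max_check_attempts"])
```

The second definition never sets max_check_attempts itself. It picks the value up from the generic-host template, which is why adding a new host usually takes only a few lines.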


3. htpasswd.users

This is the authentication file for the Nagios web pages and CGIs. It must be generated with htpasswd.

The following command can be used:

htpasswd -c /etc/nagios/htpasswd.users nagiosadmin

Then enter the password for nagiosadmin when prompted.
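The resulting file holds one `user:hash` entry per line. As a small sanity check, the Python sketch below lists the user names in such text. The helper name `htpasswd_users` is made up for illustration, and the hash in the sample is a placeholder, not a real htpasswd hash.

```python
def htpasswd_users(text):
    """Return the user names found in htpasswd-style 'user:hash' lines."""
    users = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        user, sep, _hash = line.partition(":")
        if sep:
            users.append(user)
    return users


# Placeholder content; a real file would hold an actual password hash
SAMPLE = "nagiosadmin:PLACEHOLDER_HASH\n"

print(htpasswd_users(SAMPLE))  # ['nagiosadmin']
```

In practice the text would be read from /etc/nagios/htpasswd.users.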

4. httpd configuration

First, edit /etc/httpd/conf/httpd.conf and confirm that it contains the following line:

Include /etc/httpd/conf.d/nagios.conf

Next, edit /etc/httpd/conf.d/nagios.conf. The individual parameters can be adjusted to suit your environment. Confirm that the following configuration is present:

ScriptAlias /nagios/cgi-bin "/usr/lib/nagios/cgi"

<Directory "/usr/lib/nagios/cgi">

# SSLRequireSSL

Options ExecCGI

AllowOverride None

Order allow,deny

Allow from all

# Order deny,allow

# Deny from all

# Allow from 127.0.0.1

AuthName "Nagios Access"

AuthType Basic

AuthUserFile /etc/nagios/htpasswd.users

Require valid-user

</Directory>

Alias /nagios "/usr/share/nagios"

<Directory "/usr/share/nagios">

# SSLRequireSSL

Options None

AllowOverride None

Order allow,deny

Allow from all

# Order deny,allow

# Deny from all

# Allow from 127.0.0.1

AuthName "Nagios Access"

AuthType Basic

AuthUserFile /etc/nagios/htpasswd.users

Require valid-user

</Directory>



IV. Startup

Once everything is configured, start Nagios with the following command:

/etc/init.d/nagios start

Follow the log file with:

tail -f /var/log/nagios/nagios.log

Most errors come from a mistake in the configuration. Checking the official documentation usually resolves them.
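When the log grows, it can help to filter it instead of reading it line by line. The Python sketch below assumes the usual Nagios log layout of a bracketed Unix timestamp followed by the message, and it assumes that configuration problems are logged with an `Error:` or `Warning:` prefix. The sample lines are invented for illustration and are not taken from a real log.

```python
import re
from datetime import datetime, timezone

LINE_RE = re.compile(r"^\[(\d+)\]\s+(.*)$")


def parse_log_line(line):
    """Split '[epoch] message' into (UTC datetime, message), or return None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    when = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
    return when, match.group(2)


def find_problems(lines):
    """Return parsed entries whose message starts with Error: or Warning:."""
    problems = []
    for line in lines:
        parsed = parse_log_line(line)
        if parsed and parsed[1].startswith(("Error:", "Warning:")):
            problems.append(parsed)
    return problems


# Invented sample lines in the assumed format
SAMPLE_LOG = [
    "[1116288000] Nagios 2.2 starting... (PID=1234)",
    "[1116288001] Error: something is wrong in the configuration",
    "not a log line",
]

for when, message in find_problems(SAMPLE_LOG):
    print(when.isoformat(), message)
```

To use it on the real log, pass the lines of /var/log/nagios/nagios.log to find_problems.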

Nagios needs httpd for its web interface, so httpd must be started as well:

/etc/init.d/httpd start

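V. Usage

Open a browser and enter the host's IP address followed by the Nagios path. For example, if the IP address is 192.168.200.172, enter http://192.168.200.172/nagios.

As shown in the figure below:

Enter the correct username and password to log in.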
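The login prompt is plain HTTP Basic authentication, as set by `AuthType Basic` in the httpd configuration. The browser sends the username and password base64-encoded in an `Authorization` header. The Python sketch below builds such a request with the standard library. The function name `basic_auth_request` and the password `secret` are placeholders for illustration.

```python
import base64
import urllib.request


def basic_auth_request(url, user, password):
    """Build a request that carries an HTTP Basic Authorization header."""
    credentials = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Placeholder credentials; the IP address is the example from the text
req = basic_auth_request("http://192.168.200.172/nagios/", "nagiosadmin", "secret")
print(req.get_header("Authorization").split()[0])  # Basic

# To actually send it (needs a reachable Nagios host):
# urllib.request.urlopen(req)
```

Because Basic authentication only encodes the password and does not encrypt it, it is worth considering the commented-out SSLRequireSSL line in nagios.conf on untrusted networks.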

Below are screenshots of some of the pages:








Appendix:

Configuration file listings:

1. nagios.cfg

##############################################################################

#

# NAGIOS.CFG - Sample Main Config File for Nagios

#

# Read the documentation for more information on this configuration

# file. I've provided some comments here, but things may not be so

# clear without further explanation.

#

# Last Modified: 11-23-2005

#

##############################################################################



# LOG FILE

# This is the main log file where service and host events are logged

# for historical purposes. This should be the first option specified

# in the config file!!!


log_file=/var/log/nagios/nagios.log




# OBJECT CONFIGURATION FILE(S)

# This is the configuration file in which you define hosts, host

# groups, contacts, contact groups, services, etc. I guess it would

# be better called an object definition file, but for historical

# reasons it isn't. You can split object definitions into several

# different config files by using multiple cfg_file statements here.

# Nagios will read and process all the config files you define.

# This can be very useful if you want to keep command definitions

# separate from host and contact definitions...


# Plugin commands (service and host check commands)

# Arguments are likely to change between different releases of the

# plugins, so you should use the same config file provided with the

# plugin release rather than the one provided with Nagios.

#cfg_file=/etc/nagios/checkcommands.cfg


# Misc commands (notification and event handler commands, etc)

#cfg_file=/etc/nagios/misccommands.cfg


# You can split other types of object definitions across several

# config files if you wish (as done here), or keep them all in a

# single config file.


cfg_file=/etc/nagios/minimal.cfg


#cfg_file=/etc/nagios/contactgroups.cfg

#cfg_file=/etc/nagios/contacts.cfg

#cfg_file=/etc/nagios/dependencies.cfg

#cfg_file=/etc/nagios/escalations.cfg

#cfg_file=/etc/nagios/hostgroups.cfg

#cfg_file=/etc/nagios/hosts.cfg

#cfg_file=/etc/nagios/services.cfg

#cfg_file=/etc/nagios/timeperiods.cfg


# Extended host/service info definitions are now stored along with

# other object definitions:

#cfg_file=/etc/nagios/hostextinfo.cfg

#cfg_file=/etc/nagios/serviceextinfo.cfg


# You can also tell Nagios to process all config files (with a .cfg

# extension) in a particular directory by using the cfg_dir

# directive as shown below:


#cfg_dir=/etc/nagios/servers

#cfg_dir=/etc/nagios/printers

#cfg_dir=/etc/nagios/switches

#cfg_dir=/etc/nagios/routers




# OBJECT CACHE FILE

# This option determines where object definitions are cached when

# Nagios starts/restarts. The CGIs read object definitions from

# this cache file (rather than looking at the object config files

# directly) in order to prevent inconsistencies that can occur

# when the config files are modified after Nagios starts.


object_cache_file=/var/log/nagios/objects.cache




# RESOURCE FILE

# This is an optional resource file that contains $USERx$ macro

# definitions. Multiple resource files can be specified by using

# multiple resource_file definitions. The CGIs will not attempt to

# read the contents of resource files, so information that is

# considered to be sensitive (usernames, passwords, etc) can be

# defined as macros in this file and restrictive permissions (600)

# can be placed on this file.


resource_file=/etc/nagios/resource.cfg




# STATUS FILE

# This is where the current status of all monitored services and

# hosts is stored. Its contents are read and processed by the CGIs.

# The contents of the status file are deleted every time Nagios

# restarts.


status_file=/var/log/nagios/status.dat




# NAGIOS USER

# This determines the effective user that Nagios should run as.

# You can either supply a username or a UID.


nagios_user=nagios




# NAGIOS GROUP

# This determines the effective group that Nagios should run as.

# You can either supply a group name or a GID.


nagios_group=nagios




# EXTERNAL COMMAND OPTION

# This option allows you to specify whether or not Nagios should check

# for external commands (in the command file defined below). By default

# Nagios will *not* check for external commands, just to be on the

# cautious side. If you want to be able to use the CGI command interface

# you will have to enable this. Setting this value to 0 disables command

# checking (the default), other values enable it.


check_external_commands=0




# EXTERNAL COMMAND CHECK INTERVAL

# This is the interval at which Nagios should check for external commands.

# This value works off the interval_length you specify later. If you leave

# that at its default value of 60 (seconds), a value of 1 here will cause

# Nagios to check for external commands every minute. If you specify a

# number followed by an "s" (e.g. 15s), this will be interpreted to mean

# actual seconds rather than a multiple of the interval_length variable.

# Note: In addition to reading the external command file at regularly

# scheduled intervals, Nagios will also check for external commands after

# event handlers are executed.

# NOTE: Setting this value to -1 causes Nagios to check the external

# command file as often as possible.


#command_check_interval=1

#command_check_interval=15s

command_check_interval=-1




# EXTERNAL COMMAND FILE

# This is the file that Nagios checks for external command requests.

# It is also where the command CGI will write commands that are submitted

# by users, so it must be writeable by the user that the web server

# is running as (usually 'nobody'). Permissions should be set at the

# directory level instead of on the file, as the file is deleted every

# time its contents are processed.


command_file=/var/log/nagios/rw/nagios.cmd




# COMMENT FILE

# This is the file that Nagios will use for storing host and service

# comments.


comment_file=/var/log/nagios/comments.dat




# DOWNTIME FILE

# This is the file that Nagios will use for storing host and service

# downtime data.


downtime_file=/var/log/nagios/downtime.dat




# LOCK FILE

# This is the lockfile that Nagios will use to store its PID number

# in when it is running in daemon mode.


lock_file=/var/run/nagios.pid




# TEMP FILE

# This is a temporary file that is used as scratch space when Nagios

# updates the status log, cleans the comment file, etc. This file

# is created, used, and deleted throughout the time that Nagios is

# running.


temp_file=/var/log/nagios/nagios.tmp




# EVENT BROKER OPTIONS

# Controls what (if any) data gets sent to the event broker.

# Values: 0 = Broker nothing

# -1 = Broker everything

# <other> = See documentation


event_broker_options=-1




# EVENT BROKER MODULE(S)

# This directive is used to specify an event broker module that should

# be loaded by Nagios at startup. Use multiple directives if you want

# to load more than one module. Arguments that should be passed to

# the module at startup are separated from the module path by a space.

#

# Example:

#

# broker_module=<modulepath> [moduleargs]


#broker_module=/somewhere/module1.o

#broker_module=/somewhere/module2.o arg1 arg2=3 debug=0





# LOG ROTATION METHOD

# This is the log rotation method that Nagios should use to rotate

# the main log file. Values are as follows..

# n = None - don't rotate the log

# h = Hourly rotation (top of the hour)

# d = Daily rotation (midnight every day)

# w = Weekly rotation (midnight on Saturday evening)

# m = Monthly rotation (midnight last day of month)


log_rotation_method=d




# LOG ARCHIVE PATH

# This is the directory where archived (rotated) log files should be

# placed (assuming you've chosen to do log rotation).


log_archive_path=/var/log/nagios/archives




# LOGGING OPTIONS

# If you want messages logged to the syslog facility, as well as the

# Nagios log file, set this option to 1. If not, set it to 0.


use_syslog=1




# NOTIFICATION LOGGING OPTION

# If you don't want notifications to be logged, set this value to 0.

# If notifications should be logged, set the value to 1.


log_notifications=1




# SERVICE RETRY LOGGING OPTION

# If you don't want service check retries to be logged, set this value

# to 0. If retries should be logged, set the value to 1.


log_service_retries=1




# HOST RETRY LOGGING OPTION

# If you don't want host check retries to be logged, set this value to

# 0. If retries should be logged, set the value to 1.


log_host_retries=1




# EVENT HANDLER LOGGING OPTION

# If you don't want host and service event handlers to be logged, set

# this value to 0. If event handlers should be logged, set the value

# to 1.


log_event_handlers=1




# INITIAL STATES LOGGING OPTION

# If you want Nagios to log all initial host and service states to

# the main log file (the first time the service or host is checked)

# you can enable this option by setting this value to 1. If you

# are not using an external application that does long term state

# statistics reporting, you do not need to enable this option. In

# this case, set the value to 0.


log_initial_states=0




# EXTERNAL COMMANDS LOGGING OPTION

# If you don't want Nagios to log external commands, set this value

# to 0. If external commands should be logged, set this value to 1.

# Note: This option does not include logging of passive service

# checks - see the option below for controlling whether or not

# passive checks are logged.


log_external_commands=1




# PASSIVE CHECKS LOGGING OPTION

# If you don't want Nagios to log passive host and service checks, set

# this value to 0. If passive checks should be logged, set

# this value to 1.


log_passive_checks=1




# GLOBAL HOST AND SERVICE EVENT HANDLERS

# These options allow you to specify a host and service event handler

# command that is to be run for every host or service state change.

# The global event handler is executed immediately prior to the event

# handler that you have optionally specified in each host or

# service definition. The command argument is the short name of a

# command definition that you define in your host configuration file.

# Read the HTML docs for more information.


#global_host_event_handler=somecommand

#global_service_event_handler=somecommand




# SERVICE INTER-CHECK DELAY METHOD

# This is the method that Nagios should use when initially

# "spreading out" service checks when it starts monitoring. The

# default is to use smart delay calculation, which will try to

# space all service checks out evenly to minimize CPU load.

# Using the dumb setting will cause all checks to be scheduled

# at the same time (with no delay between them)! This is not a

# good thing for production, but is useful when testing the

# parallelization functionality.

# n = None - don't use any delay between checks

# d = Use a "dumb" delay of 1 second between checks

# s = Use "smart" inter-check delay calculation

# x.xx = Use an inter-check delay of x.xx seconds


service_inter_check_delay_method=s




# MAXIMUM SERVICE CHECK SPREAD

# This variable determines the timeframe (in minutes) from the

# program start time that an initial check of all services should

# be completed. Default is 30 minutes.


max_service_check_spread=30




# SERVICE CHECK INTERLEAVE FACTOR

# This variable determines how service checks are interleaved.

# Interleaving the service checks allows for a more even

# distribution of service checks and reduced load on remote

# hosts. Setting this value to 1 is equivalent to how versions

# of Nagios previous to 0.0.5 did service checks. Set this

# value to s (smart) for automatic calculation of the interleave

# factor unless you have a specific reason to change it.

# s = Use "smart" interleave factor calculation

# x = Use an interleave factor of x, where x is a

# number greater than or equal to 1.


service_interleave_factor=s




# HOST INTER-CHECK DELAY METHOD

# This is the method that Nagios should use when initially

# "spreading out" host checks when it starts monitoring. The

# default is to use smart delay calculation, which will try to

# space all host checks out evenly to minimize CPU load.

# Using the dumb setting will cause all checks to be scheduled

# at the same time (with no delay between them)!

# n = None - don't use any delay between checks

# d = Use a "dumb" delay of 1 second between checks

# s = Use "smart" inter-check delay calculation

# x.xx = Use an inter-check delay of x.xx seconds


host_inter_check_delay_method=s




# MAXIMUM HOST CHECK SPREAD

# This variable determines the timeframe (in minutes) from the

# program start time that an initial check of all hosts should

# be completed. Default is 30 minutes.


max_host_check_spread=30




# MAXIMUM CONCURRENT SERVICE CHECKS

# This option allows you to specify the maximum number of

# service checks that can be run in parallel at any given time.

# Specifying a value of 1 for this variable essentially prevents

# any service checks from being parallelized. A value of 0

# will not restrict the number of concurrent checks that are

# being executed.


max_concurrent_checks=0




# SERVICE CHECK REAPER FREQUENCY

# This is the frequency (in seconds!) that Nagios will process

# the results of services that have been checked.


service_reaper_frequency=10





# AUTO-RESCHEDULING OPTION

# This option determines whether or not Nagios will attempt to

# automatically reschedule active host and service checks to

# "smooth" them out over time. This can help balance the load on

# the monitoring server.

# WARNING: THIS IS AN EXPERIMENTAL FEATURE - IT CAN DEGRADE

# PERFORMANCE, RATHER THAN INCREASE IT, IF USED IMPROPERLY


auto_reschedule_checks=0




# AUTO-RESCHEDULING INTERVAL

# This option determines how often (in seconds) Nagios will

# attempt to automatically reschedule checks. This option only

# has an effect if the auto_reschedule_checks option is enabled.

# Default is 30 seconds.

# WARNING: THIS IS AN EXPERIMENTAL FEATURE - IT CAN DEGRADE

# PERFORMANCE, RATHER THAN INCREASE IT, IF USED IMPROPERLY


auto_rescheduling_interval=30





# AUTO-RESCHEDULING WINDOW

# This option determines the "window" of time (in seconds) that

# Nagios will look at when automatically rescheduling checks.

# Only host and service checks that occur in the next X seconds

# (determined by this variable) will be rescheduled. This option

# only has an effect if the auto_reschedule_checks option is

# enabled. Default is 180 seconds (3 minutes).

# WARNING: THIS IS AN EXPERIMENTAL FEATURE - IT CAN DEGRADE

# PERFORMANCE, RATHER THAN INCREASE IT, IF USED IMPROPERLY


auto_rescheduling_window=180




# SLEEP TIME

# This is the number of seconds to sleep between checking for system

# events and service checks that need to be run.


sleep_time=0.25




# TIMEOUT VALUES

# These options control how much time Nagios will allow various

# types of commands to execute before killing them off. Options

# are available for controlling maximum time allotted for

# service checks, host checks, event handlers, notifications, the

# ocsp command, and performance data commands. All values are in

# seconds.


service_check_timeout=60

host_check_timeout=30

event_handler_timeout=30

notification_timeout=30

ocsp_timeout=5

perfdata_timeout=5




# RETAIN STATE INFORMATION

# This setting determines whether or not Nagios will save state

# information for services and hosts before it shuts down. Upon

# startup Nagios will reload all saved service and host state

# information before starting to monitor. This is useful for

# maintaining long-term data on state statistics, etc, but will

# slow Nagios down a bit when it (re)starts. Since it's only

# a one-time penalty, I think it's well worth the additional

# startup delay.


retain_state_information=1




# STATE RETENTION FILE

# This is the file that Nagios should use to store host and

# service state information before it shuts down. The state

# information in this file is also read immediately prior to

# starting to monitor the network when Nagios is restarted.

# This file is used only if the retain_state_information

# variable is set to 1.


state_retention_file=/var/log/nagios/retention.dat




# RETENTION DATA UPDATE INTERVAL

# This setting determines how often (in minutes) that Nagios

# will automatically save retention data during normal operation.

# If you set this value to 0, Nagios will not save retention

# data at regular interval, but it will still save retention

# data before shutting down or restarting. If you have disabled

# state retention, this option has no effect.


retention_update_interval=60




# USE RETAINED PROGRAM STATE

# This setting determines whether or not Nagios will set

# program status variables based on the values saved in the

# retention file. If you want to use retained program status

# information, set this value to 1. If not, set this value

# to 0.


use_retained_program_state=1




# USE RETAINED SCHEDULING INFO

# This setting determines whether or not Nagios will retain

# the scheduling info (next check time) for hosts and services

# based on the values saved in the retention file.

# If you want to use retained scheduling info, set this

# value to 1. If not, set this value to 0.


use_retained_scheduling_info=0




# INTERVAL LENGTH

# This is the seconds per unit interval as used in the

# host/contact/service configuration files. Setting this to 60 means

# that each interval is one minute long (60 seconds). Other settings

# have not been tested much, so your mileage is likely to vary...


interval_length=60




# AGGRESSIVE HOST CHECKING OPTION

# If you don't want to turn on aggressive host checking features, set

# this value to 0 (the default). Otherwise set this value to 1 to

# enable the aggressive check option. Read the docs for more info

# on what aggressive host check is or check out the source code in

# base/checks.c


use_aggressive_host_checking=0




# SERVICE CHECK EXECUTION OPTION

# This determines whether or not Nagios will actively execute

# service checks when it initially starts. If this option is

# disabled, checks are not actively made, but Nagios can still

# receive and process passive check results that come in. Unless

# you're implementing redundant hosts or have a special need for

# disabling the execution of service checks, leave this enabled!

# Values: 1 = enable checks, 0 = disable checks


execute_service_checks=1




# PASSIVE SERVICE CHECK ACCEPTANCE OPTION

# This determines whether or not Nagios will accept passive

# service checks results when it initially (re)starts.

# Values: 1 = accept passive checks, 0 = reject passive checks


accept_passive_service_checks=1




# HOST CHECK EXECUTION OPTION

# This determines whether or not Nagios will actively execute

# host checks when it initially starts. If this option is

# disabled, checks are not actively made, but Nagios can still

# receive and process passive check results that come in. Unless

# you're implementing redundant hosts or have a special need for

# disabling the execution of host checks, leave this enabled!

# Values: 1 = enable checks, 0 = disable checks


execute_host_checks=1




# PASSIVE HOST CHECK ACCEPTANCE OPTION

# This determines whether or not Nagios will accept passive

# host checks results when it initially (re)starts.

# Values: 1 = accept passive checks, 0 = reject passive checks


accept_passive_host_checks=1




# NOTIFICATIONS OPTION

# This determines whether or not Nagios will send out any host or

# service notifications when it is initially (re)started.

# Values: 1 = enable notifications, 0 = disable notifications


enable_notifications=1




# EVENT HANDLER USE OPTION

# This determines whether or not Nagios will run any host or

# service event handlers when it is initially (re)started. Unless

# you're implementing redundant hosts, leave this option enabled.

# Values: 1 = enable event handlers, 0 = disable event handlers


enable_event_handlers=1




# PROCESS PERFORMANCE DATA OPTION

# This determines whether or not Nagios will process performance

# data returned from service and host checks. If this option is

# enabled, host performance data will be processed using the

# host_perfdata_command (defined below) and service performance

# data will be processed using the service_perfdata_command (also

# defined below). Read the HTML docs for more information on

# performance data.

# Values: 1 = process performance data, 0 = do not process performance data


process_performance_data=0




# HOST AND SERVICE PERFORMANCE DATA PROCESSING COMMANDS

# These commands are run after every host and service check is

# performed. These commands are executed only if the

# process_performance_data option (above) is set to 1. The command

# argument is the short name of a command definition that you

# define in your host configuration file. Read the HTML docs for

# more information on performance data.


#host_perfdata_command=process-host-perfdata

#service_perfdata_command=process-service-perfdata




# HOST AND SERVICE PERFORMANCE DATA FILES

# These files are used to store host and service performance data.

# Performance data is only written to these files if the

# process_performance_data option (above) is set to 1.

<