What is Nagios?
Nagios is a framework for setting up monitoring of hosts, services, and networks.
What are the components of Nagios?
- nagios – the main server software and web scripts
- nagios-plugins – the common set of check scripts used to query services
- nagios-nrpe – Nagios Remote Plugin Executor
- nagios-nsca – Nagios Service Check Acceptor
nagios-nrpe and nagios-nsca are known as addons.
What is the difference between Nagios-Plugins and Nagios-AddOns?
Plugins are the check programs that Nagios runs to test a host or service. Addons, by contrast, extend the base Nagios system itself. Addons come in two types:
- Core-Addons (Addons provided by the Nagios-Developer Ethan Galstad)
- Community-Addons (Addons provided by the Nagios-Community)
Nagios Core Addons
- Database – NDOUtils – NDOUtils allows you to export current and historical data from one or more Nagios instances to a MySQL database. Several community addons use this as one of their data sources.
- Distribution – NSCA – NSCA allows you to integrate passive alerts and checks from remote machines and applications with Nagios. Useful for processing security alerts, as well as redundant and distributed Nagios setups.
- Distribution – NRPE – NRPE allows you to remotely execute Nagios plugins on other Linux/Unix machines. This allows you to monitor remote machine metrics (disk usage, CPU load, etc.). NRPE can also communicate with some of the Windows agent addons, so you can execute scripts and check metrics on remote Windows machines as well.
Nagios Architecture
Nagios consists of a central server running the Nagios daemon. It can be used to monitor many network services, such as SMTP, HTTP and DNS, on remote hosts. It also supports SNMP, which lets you check things like processor load on routers and servers.
- A service is anything on the remote host that you want checked. Its state can be one of: OK, Warning, Critical or Unknown.
- A check is a script run on the central monitoring server whose exit status determines the state of the service monitored on the remote machine: 0 (OK), 1 (Warning), 2 (Critical) or 3 (Unknown).
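The link between a check's exit status and the service state can be sketched in Python. This is only an illustration: the names ServiceState and state_from_exit_status are hypothetical and not part of Nagios, and treating any status outside 0 to 3 as Unknown is an assumption made for this sketch.

```python
from enum import IntEnum


class ServiceState(IntEnum):
    """Service states keyed by the conventional plugin exit status."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def state_from_exit_status(status: int) -> ServiceState:
    """Translate a plugin exit status into a service state.

    Assumption for this sketch: any status outside 0-3 is
    reported as UNKNOWN rather than raising an error.
    """
    try:
        return ServiceState(status)
    except ValueError:
        return ServiceState.UNKNOWN
```

For example, state_from_exit_status(2) gives ServiceState.CRITICAL, while an unexpected status such as 42 falls back to ServiceState.UNKNOWN.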
How does Nagios Check the services?
Nagios is capable of monitoring hosts and services in two ways:
- Active
- Passive
Active checks
The main features of active checks are as follows:
- Active checks are initiated by the Nagios daemon in the central server
- Active checks are run on a regularly scheduled basis
How Are Active Checks Performed?
The Nagios daemon initiates active checks through its check logic. When Nagios needs to verify the status of a host or service, it runs a plugin and provides details about what to check. The plugin checks the host or service’s operational state and sends the results back to the Nagios daemon. After receiving the results, Nagios processes the data and takes appropriate actions, such as sending notifications or running event handlers.
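The run-a-plugin-and-read-the-result cycle described above can be sketched in Python. This is not how the Nagios daemon itself is implemented; the function name run_plugin and the default timeout are hypothetical, and reporting a plugin that cannot be started or that times out as Unknown (3) is an assumption of this sketch.

```python
import subprocess
from typing import List, Tuple


def run_plugin(command: List[str], timeout: float = 10.0) -> Tuple[int, str]:
    """Run a plugin command and return (exit_status, first_output_line)."""
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Assumption: a plugin that cannot run or times out counts as UNKNOWN.
        return 3, f"UNKNOWN - {exc}"
    lines = completed.stdout.splitlines()
    first_line = lines[0] if lines else ""
    return completed.returncode, first_line
```

The caller would then act on the returned status, for instance by deciding whether to notify someone or to run an event handler.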
Why use plugins to do active checks?
Nagios relies on external programs, known as plugins, to check the status of hosts and services on your network. Instead of using internal mechanisms, Nagios uses these plugins to perform all monitoring tasks and gather status information.
What Are Plugins?
Plugins are executable programs or scripts—such as Perl or shell scripts—that you can run from the command line to check the status of a host or service. Nagios runs these plugins and uses their results to determine the current status of all hosts and services across your network.
Nagios executes a plugin whenever it needs to check the status of a host or service. The plugin performs the required check and returns the results to Nagios. After receiving the results, Nagios processes them and takes necessary actions, such as running event handlers or sending notifications.
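To make the idea of a plugin concrete, here is a minimal sketch of the core of a disk usage check in Python. The function names, the default thresholds (80% and 90%) and the message wording are all assumptions chosen for illustration; they are not taken from the official nagios-plugins package.

```python
import shutil
from typing import Tuple

# Conventional plugin exit statuses.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_usage(percent_used: float, warn: float, crit: float) -> Tuple[int, str]:
    """Compare a usage percentage against warning and critical thresholds."""
    if percent_used >= crit:
        return CRITICAL, f"DISK CRITICAL - {percent_used:.1f}% used"
    if percent_used >= warn:
        return WARNING, f"DISK WARNING - {percent_used:.1f}% used"
    return OK, f"DISK OK - {percent_used:.1f}% used"


def check_disk(path: str = "/", warn: float = 80.0, crit: float = 90.0) -> Tuple[int, str]:
    """Measure disk usage for path and return (exit_status, message)."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # The check itself could not be carried out.
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        return UNKNOWN, "DISK UNKNOWN - reported size is zero"
    percent_used = 100.0 * usage.used / usage.total
    return evaluate_usage(percent_used, warn, crit)


# A command-line plugin would finish by printing the message and
# exiting with the status, e.g.:
#   status, message = check_disk("/")
#   print(message)
#   sys.exit(status)
```

Keeping the threshold comparison in its own function makes it easy to test without touching a real file system.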
Advantage of plugin architecture
With a plugin architecture, Nagios can monitor just about anything you can think of. If a check can be automated, you can write your own plugin for it.
Disadvantage of plugin architecture
The drawback is that Nagios has no knowledge of what you are actually monitoring. It only tracks changes in the state of hosts, services, and other resources. Only the plugins know exactly what is being monitored and how to perform the checks that provide status information to Nagios.
What Plugins Are Available?
There are plugins currently available to monitor many different kinds of devices and services, including:
- HTTP, POP3, IMAP, FTP, SSH, DHCP
- CPU Load, Disk Usage, Memory Usage, Current Users
- Unix/Linux, Windows, and Netware Servers
- Routers and switches, etc.
When are active checks executed?
- At regular intervals, as defined by the check_interval and retry_interval options in your host and service definitions
- On-demand as needed
Regularly scheduled checks occur at intervals equaling either the “check_interval” or the “retry_interval” in your “host” or “service” definitions, depending on what type of state the host or service is in. If a host or service is in a HARD state, it will be actively checked at intervals equal to the check_interval option. If it is in a SOFT state, it will be checked at intervals equal to the retry_interval option.
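The rule for choosing the next check interval can be written down as a small Python sketch. The class CheckSchedule and the function next_check_delay are hypothetical names for illustration; the intervals are in whatever time unit your configuration uses.

```python
from dataclasses import dataclass


@dataclass
class CheckSchedule:
    check_interval: float  # used while the host or service is in a HARD state
    retry_interval: float  # used while the host or service is in a SOFT state


def next_check_delay(schedule: CheckSchedule, state_type: str) -> float:
    """Return the delay before the next regularly scheduled check."""
    if state_type == "HARD":
        return schedule.check_interval
    if state_type == "SOFT":
        return schedule.retry_interval
    raise ValueError(f"unknown state type: {state_type!r}")
```

With a check_interval of 5 and a retry_interval of 1, a service in a SOFT state is rechecked five times as often as one in a HARD state.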
On-demand checks are performed whenever Nagios sees a need to obtain the latest status information about a particular host or service.
Nagios performs active checks using its plugins or the check_nrpe command from the NRPE addon. The NRPE addon lets Nagios execute plugins on remote Linux or Unix machines. By default, Nagios runs plugins on the monitoring node. With NRPE, the plugins run on the monitored machine itself, which gives them direct access to local metrics such as disk usage and CPU load.
check_nrpe (from NRPE) is used to initiate an active check from the Nagios machine. Nagios calls it much like a normal plugin, but it works differently: check_nrpe contacts the NRPE daemon running on a remote host, asks it to run a pre-configured check command, and returns the results to Nagios.
- Nagios will execute the check_nrpe plugin and tell it what service needs to be checked
- The check_nrpe plugin contacts the NRPE daemon on the remote host over an (optionally) SSL-protected connection
- The NRPE daemon runs the appropriate Nagios plugin to check the service or resource
- The results from the service check are passed from the NRPE daemon back to the check_nrpe plugin, which then returns the check results to the Nagios process.
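The four steps above can be imitated in a few lines of Python. This is an in-process simulation only, with no networking or SSL: FakeNrpeDaemon, its handle_request method and the wording of the error message are hypothetical, and the check_nrpe function here merely stands in for the real check_nrpe plugin.

```python
from typing import Callable, Dict, Tuple

CheckResult = Tuple[int, str]  # (exit_status, plugin_output)


class FakeNrpeDaemon:
    """In-process stand-in for the NRPE daemon on the remote host."""

    def __init__(self, commands: Dict[str, Callable[[], CheckResult]]) -> None:
        # Pre-configured check commands, keyed by command name.
        self._commands = commands

    def handle_request(self, command_name: str) -> CheckResult:
        plugin = self._commands.get(command_name)
        if plugin is None:
            # Only pre-configured commands may be run.
            return 3, f"UNKNOWN - command '{command_name}' is not configured"
        return plugin()


def check_nrpe(daemon: FakeNrpeDaemon, command_name: str) -> CheckResult:
    """Ask the daemon to run a named command and pass the result back."""
    return daemon.handle_request(command_name)
```

The point of the sketch is that the monitoring server only sends a command name; which plugin actually runs, and with which arguments, is decided by the configuration on the remote host.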
Passive Checks
Host and service checks that are performed by external applications (running on the monitored hosts) and then submitted to Nagios are called passive checks. Active checks, by contrast, are host or service checks initiated by Nagios itself.
The NSCA (Nagios Service Check Acceptor) addon helps in performing passive checks.
Nagios uses NSCA to process passive checks from remote hosts. A wrapper script on the remote host, triggered by cron or another scheduling tool, runs a plugin and sends the results to the send_nsca program. The send_nsca program communicates with the nsca daemon on the Nagios server, which forwards the check results to Nagios for further processing.
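As a rough sketch of what the wrapper script hands to send_nsca: a service check result is a single line of tab-separated fields (host name, service description, return code, plugin output). The Python helper below builds such a line. The function name and the example host and service names are hypothetical, and the field layout assumes send_nsca's default tab delimiter.

```python
def format_nsca_service_result(host: str, service: str,
                               return_code: int, output: str) -> str:
    """Build one passive service check result line for send_nsca."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0, 1, 2 or 3")
    for field in (host, service, output):
        # Tabs and newlines would break the field layout.
        if "\t" in field or "\n" in field:
            raise ValueError("fields must not contain tabs or newlines")
    return f"{host}\t{service}\t{return_code}\t{output}\n"


# Hypothetical example values:
line = format_nsca_service_result("web01", "Backup Job", 0, "OK - backup finished")
```

The wrapper script would then pipe this line into send_nsca, pointing it at the Nagios server and at its own configuration file.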
When to use active and passive checks?
Typically, passive checks are used when a firewall prevents the Nagios server from making requests to the client, or when the client runs an application that is asynchronous, in other words when the timing of a service is erratic and cannot be fully determined in advance. Security events are one example of a situation where you do not know when the event may occur. Passive checks may also be used for distributed monitoring, where multiple Nagios servers provide information to a master Nagios server.
With active checks, Nagios itself is responsible for checking the status of a device at regular intervals. With passive checks, the device reports its status to Nagios on its own, for example when that status changes.
For example, I may wish to use passive checks for most of the services on a device, but I may want to check for the reachability — via a PING check using check_ping — using an active check.
When does Nagios send us notifications?
When a service fails for the first time, Nagios marks it as being in a soft state. Nagios then checks the service multiple times to determine whether it recovers. If the service does not recover within the configured number of checks, Nagios changes the state to hard and sends notifications. If the service recovers during those checks, Nagios skips notifications because the issue has resolved.
Event handlers let Nagios perform specific actions when a service changes state, whether soft or hard. For example, if an Apache web service fails, an event handler can restart the Apache process automatically.
If the event handler fails to restart the service, Nagios eventually sets the service to a hard state, which triggers a notification. The max_check_attempts parameter, set for both hosts and services, defines how many checks Nagios performs before changing the service or host to a hard state. To make Nagios send a notification on the very first failure, set max_check_attempts to 1.
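The soft and hard state logic can be sketched as a small state tracker in Python. This is a simplified model, not the Nagios implementation: the class ServiceTracker is hypothetical, each check is reduced to pass or fail, and recovery notifications and repeat notifications are left out on purpose.

```python
from dataclasses import dataclass


@dataclass
class ServiceTracker:
    max_check_attempts: int
    failures: int = 0
    state_type: str = "HARD"  # the service starts in a hard OK state

    def record(self, ok: bool) -> bool:
        """Record one check result.

        Returns True only when the service has just entered a hard
        problem state, i.e. when a problem notification is due.
        """
        if ok:
            # Recovery: reset the failure counter, no problem notification.
            self.failures = 0
            self.state_type = "HARD"
            return False
        if self.state_type == "HARD" and self.failures >= self.max_check_attempts:
            # Already in a hard problem state; this sketch does not re-notify.
            return False
        self.failures += 1
        if self.failures >= self.max_check_attempts:
            self.state_type = "HARD"
            return True
        self.state_type = "SOFT"
        return False
```

With max_check_attempts set to 3, the first two failures leave the service in a soft state and only the third triggers a notification. With max_check_attempts set to 1, the first failure does.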