control for "downed host detection"
I have several hosts, and the downed host feature is neat for some of them, but it needs to be possible to turn it off (without hunting for an obscure post in this forum about clearing SNMP community strings). So, three requests:
1) put this info into the FAQ
2) put a control for each host/device to set the method per host rather than globally
3) have an extra setting available
OFF
snmp
snmp+ping
ping
Let me explain that a bit more.
I have several hosts that I can ping (ICMP only) but cannot reach with SNMP. I have a router/firewall that responds only to SNMP and not to ping, for security reasons (yes, it's the same rule on both ports). And I have some boxes where I can use neither SNMP nor ping, but retrieve data with wget + Perl scripts.
Currently I have the option set to SNMP only, and turn it off on all but one host, as that's the only way to get it to work. That seems like the wrong thing to do.
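To make request 3 concrete, here is a minimal sketch in C of what a per-host setting could look like. The enum names and host_is_up() are hypothetical and not taken from the Cacti source, and I am assuming that snmp+ping means both probes have to succeed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-host availability methods, mirroring the list above. */
typedef enum {
    AVAIL_OFF,
    AVAIL_SNMP_ONLY,
    AVAIL_SNMP_AND_PING,
    AVAIL_PING_ONLY
} avail_method_t;

/* Decide whether a host counts as up, given the probe results.
 * Each result is only consulted when the chosen method uses it. */
bool host_is_up(avail_method_t method, bool snmp_ok, bool ping_ok)
{
    switch (method) {
    case AVAIL_OFF:           return true;               /* never mark the host down */
    case AVAIL_SNMP_ONLY:     return snmp_ok;
    case AVAIL_PING_ONLY:     return ping_ok;
    case AVAIL_SNMP_AND_PING: return snmp_ok && ping_ok; /* assumption: both must answer */
    }
    assert(0 && "unknown availability method");
    return false;
}
```

With this, the router that answers SNMP but drops ping would use AVAIL_SNMP_ONLY, the ping-only hosts AVAIL_PING_ONLY, and the wget + Perl boxes AVAIL_OFF.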
Phil
the FAQ text ( starting point )
When I click on devices, some of my devices are always DOWN, even though I can ping them. How do I fix this?
In the settings menu, under the poller tab, there is a setting to choose a method that can be used to "ping" the hosts or devices. This prevents multiple requests being sent to a host which is down. It uses either an SNMP get, a network ping or both. If it is set to SNMP and you do not have SNMP running on the target device, or you do not have permission to read the [????] OID, then the device will appear to be down. Equally, if you have SNMP running on the target device but ping traffic is blocked by a firewall or turned off, then using ping will not work.
To work around this, use SNMP for the availability check and disable SNMP on the hosts where it will not work. That is, set the poller to use SNMP only for "downed host detection" in the poller tab of the settings window. Then, for each device that does not support SNMP, go into the devices window, click on the device and clear the text in the SNMP community box.
Now wait 5 minutes and you should see your devices recovering and then up.
regards Phil
- TheWitness
- Developer
- Posts: 17007
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
I am planning on incorporating a per-host availability setting in 0.8.x. This occurred to us when we found people who were not able to access the SysDescr OID on some of their devices.
Otherwise, if you don't use SNMP for a host, simply clear the read community and it will not poll SNMP, even though your availability polling uses SNMP.
TheWitness
True understanding begins only when we realize how little we truly understand...
Life is an adventure, let yours begin with Cacti!
Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages
For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
-
- Posts: 25
- Joined: Sun Sep 12, 2004 1:13 pm
I've been having trouble with downed host detection. When I was using the PHP-based poller, the default UDP ping + SNMP worked fine, but when I switched to cactid, SOME of my devices went down. Most of them were Cisco 35xx devices, with no other similarities between them... However, when I switched host detection to SNMP only, they came back online.
- TheWitness
Basilio Cat,
UDP ping relies on the device sending an error back to the server when the probed UDP port is not in use (an ICMP port-unreachable message). We have found a few issues with it. Namely:
a) some devices ignore the probe and do not respond at all (firewalls, for example)
b) some operating systems respond with different errors (yes, UDP ping results in an error), and therefore your system may be responding with yet another error number.
If you could, it would be helpful to run one pass using cmd.php in DEBUG and either post the log output or e-mail it to me, so that I can make sure that the UDP ERROR NO is appropriately coded into the cmd.php/ping.php code.
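To illustrate points a) and b), here is a rough sketch in C of how a UDP ping can interpret what comes back. It is not the cmd.php or cactid code; udp_probe(), classify_udp_probe() and the probe payload are hypothetical names for this example. On Linux a connected UDP socket usually reports the ICMP port-unreachable error as ECONNREFUSED, but as noted above other systems may return a different error number.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

typedef enum { PROBE_UP, PROBE_DOWN, PROBE_UNKNOWN } probe_result_t;

/* Interpret the outcome of recv() on a connected UDP socket.
 * recv_rc is recv()'s return value, err is errno saved right after it. */
probe_result_t classify_udp_probe(long recv_rc, int err)
{
    if (recv_rc >= 0)
        return PROBE_UP;              /* something actually answered */
    switch (err) {
    case ECONNREFUSED:
        return PROBE_UP;              /* port unreachable: the host is alive */
    case EHOSTUNREACH:
    case ENETUNREACH:
        return PROBE_DOWN;            /* no route to the host */
    case EAGAIN:
        return PROBE_UNKNOWN;         /* timed out: probe or reply was dropped */
    default:
        return PROBE_UNKNOWN;         /* an error number we have not coded for */
    }
}

/* Send one UDP datagram to ipv4:port and wait up to timeout_ms for a reaction. */
probe_result_t udp_probe(const char *ipv4, unsigned short port, int timeout_ms)
{
    struct sockaddr_in addr;
    struct timeval tv;
    char reply[256];
    ssize_t rc;
    int err, fd;

    assert(ipv4 != NULL);
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return PROBE_UNKNOWN;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4, &addr.sin_addr) != 1) {
        close(fd);
        return PROBE_UNKNOWN;
    }

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    /* connect() makes the kernel deliver ICMP errors to this socket */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        send(fd, "cacti-probe", 11, 0) < 0) {
        err = errno;
        close(fd);
        return classify_udp_probe(-1, err);
    }

    rc = recv(fd, reply, sizeof(reply), 0);
    err = errno;
    close(fd);
    return classify_udp_probe((long)rc, err);
}
```

A firewall that silently drops the probe ends up in the EAGAIN branch, which is why it cannot be told apart from a host that is really down.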
TheWitness
-
- Posts: 25
- Joined: Sun Sep 12, 2004 1:13 pm
Well, more debugging is probably needed, but here is what I found by comparing the sources:
1. Socket timeouts
The PHP poller passes the value as seconds (erm? I suppose it's stored in milliseconds):
Code:
socket_set_option($this->socket,
    SOL_SOCKET,   // socket level
    SO_RCVTIMEO,  // timeout option
    array(
        "sec"  => $this->timeout, // timeout in seconds
        "usec" => 0               // I assume timeout in microseconds
    ));
The C poller sets it as milliseconds * 1000 in tv_usec, which seems like the correct way, but only for timeouts of less than 1 second?
Code:
/* establish timeout value */
timeout.tv_sec = 0;
timeout.tv_usec = set.ping_timeout * 1000;
...
setsockopt(udp_socket, SOL_SOCKET, SO_RCVTIMEO, (char*)&timeout, sizeof(timeout));
2. The main difference is in the recv timeouts:
There is no real timeout in PHP:
Code:
$this->start_time();
socket_write($this->socket, $this->request, $this->request_len);
$code = @socket_recv($this->socket, $this->reply, 256, 0);
/* get the end time */
$this->time = $this->get_time($this->precision);
The C poller sets an unavoidable timeout via select (the timeout value is the same as the one passed to setsockopt before):
Code:
send(udp_socket, request, request_len, 0);
select(numfds, &socket_fds, NULL, NULL, &timeout);
if (FD_ISSET(udp_socket, &socket_fds)) {
    return_code = read(udp_socket, socket_reply, 256);
} else {
    return_code = -10;
}
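On the timeout question: tv_usec is meant to stay below 1,000,000, so putting milliseconds * 1000 into tv_usec with tv_sec = 0 is only well-formed for timeouts under one second, and what happens with larger values depends on the platform. A minimal sketch of a conversion that also handles longer timeouts follows; ms_to_timeval() is a hypothetical helper, not a function in cactid.

```c
#include <assert.h>
#include <sys/time.h>

/* Convert a timeout in milliseconds into a normalized struct timeval,
 * splitting it into whole seconds and the remaining microseconds. */
struct timeval ms_to_timeval(long timeout_ms)
{
    struct timeval tv;

    assert(timeout_ms >= 0);
    tv.tv_sec  = timeout_ms / 1000;          /* whole seconds */
    tv.tv_usec = (timeout_ms % 1000) * 1000; /* remainder, always < 1,000,000 */
    return tv;
}
```

The same struct could then be passed both to setsockopt and to select, so that the two calls agree on the timeout.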
- geraldocastro
- Posts: 1
- Joined: Wed Aug 24, 2005 2:57 pm
Availability has been a problem for us.
We edited poller.c and commented out the lines that deal with availability.
Code:
/* perform a check to see if the host is alive by polling it's SysDesc
 * if the host down from an snmp perspective, don't poll it.
 * function sets the ignore_host bit */
/*
if ((set.availability_method == AVAIL_SNMP) && (host->snmp_community == "")) {
    update_host_status(HOST_UP, host, ping, set.availability_method);

    if (set.verbose >= POLLER_VERBOSITY_MEDIUM) {
        snprintf(logmessage, LOGSIZE, "Host[%i] No host availability check possible for '%s'\n", host->id, host->hostname);
        cacti_log(logmessage);
    }
}else{
    if (ping_host(host, ping) == HOST_UP) {
        update_host_status(HOST_UP, host, ping, set.availability_method);
    }else{
        host->ignore_host = 1;
        update_host_status(HOST_DOWN, host, ping, set.availability_method);
    }
}
*/
// LINE ABOVE INSERTED
update_host_status(HOST_UP, host, ping, set.availability_method);
cactid compiled OK!
Hosts are now always UP.
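One side note on the commented-out condition: if snmp_community is a C string (a char array or a char pointer), then comparing it to "" with == compares addresses rather than contents, so that branch may never be taken. A small sketch of a content check, using a simplified, hypothetical host struct rather than cactid's real one:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for cactid's host structure. */
typedef struct {
    int  id;
    char snmp_community[100];
} demo_host_t;

/* True when the host has no SNMP community configured. */
bool community_is_empty(const demo_host_t *host)
{
    assert(host != NULL);
    return host->snmp_community[0] == '\0'; /* check contents, not the address */
}
```

If that is what is happening, a check like this would be a smaller change than forcing every host to HOST_UP.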