Nodes showing offline intermittently

Support questions about the Network Weather Map plugin

SSami
Posts: 2
Joined: Tue Nov 22, 2016 12:18 am

Nodes showing offline intermittently

Post by SSami »

Hi,

We are facing an issue with our Cacti: it keeps showing some nodes as offline (gray arrows) even though the devices are marked as up and we are able to ping them. We have tried increasing the polling interval, increasing the ping timeout value for these nodes, removing the nodes and adding them again, clearing their statistics, and eventually restarting the service, all to no avail.

These nodes sometimes appear online (green arrow) and then, after some time, show as offline again.

Any suggestions?
Rno
Cacti Pro User
Posts: 692
Joined: Wed Dec 07, 2011 9:19 am

Re: Nodes showing offline intermittently

Post by Rno »

I have the same issue, but only with SNMP: I get an error that the device is not responding to SNMP, but ping is OK!
And the SNMP timeout is over 10 seconds! But that doesn't solve the problem.
Test
Almalinux
php 8.2.14
mariadb 10.6.16
Cacti 1.2.27
Spine 1.2.27
RRD 1.7.2
thold 1.8
monitor 2.5
syslog 3.2
flowview: 3.3
weathermap 1.0 Beta
Howie
Cacti Guru User
Posts: 5508
Joined: Thu Sep 16, 2004 5:53 am
Location: United Kingdom

Re: Nodes showing offline intermittently

Post by Howie »

Grey arrows in weathermap?

What does your scale look like?

Weathermap doesn't show anything as up or down - it just takes a number and turns it into a colour. If the scale (which does the translation) has gaps, or is incorrect for your interpretation of the data, then you might get grey arrows. The most common problem like this is where people have a scale like

SCALE myscale 0 1   255 0 0
SCALE myscale 1 2   0 255 0
or

SCALE myscale 1 1   255 0 0
SCALE myscale 2 2   0 255 0
and then get a value of 1.5. If you are storing things like interface status in rrd files, then you sometimes get values in between 1 and 2, even if you think that you can only get 1 or 2 back from the device, depending on how you have set up the rrd parameters. If those parameters aren't correct, the rrd file will store an average of the samples rather than the raw values.
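
To make the gap problem concrete, here is a minimal Python sketch of a scale lookup. It is a simplified model, not Weathermap's actual code: it assumes a SCALE row matches when min <= value <= max and that the first matching row wins. The names `lookup_colour` and `GAPPED_SCALE` are made up for illustration.

```python
# Simplified model of a scale lookup (an assumption, not Weathermap's real code).
# Each row is (min, max, (r, g, b)), mirroring "SCALE myscale min max r g b".
GAPPED_SCALE = [
    (1, 1, (255, 0, 0)),
    (2, 2, (0, 255, 0)),
]


def lookup_colour(scale, value):
    """Return the colour of the first row whose range contains value, else None."""
    for low, high, colour in scale:
        if low <= value <= high:
            return colour
    # No row matched: in this model the arrow would get a default (grey) colour.
    return None


exact = lookup_colour(GAPPED_SCALE, 1)     # lands exactly on a row
in_gap = lookup_colour(GAPPED_SCALE, 1.5)  # falls between the two rows
print(exact, in_gap)
```

In this model an averaged value such as 1.5 matches no row at all, which is the kind of situation that would leave an arrow grey.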

The easiest fix is:

SCALE myscale 0.5 1.5   255 0 0
SCALE myscale 1.5 2.5   0 255 0
Now there are no gaps. The actual state might still be wrong though - if the average is 1.7 because the state has only recently changed from 2 to 1, the map will show an incorrect state. So the real answer is to fix the rrd parameters if this is what the problem is.
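
The averaging effect can be sketched in Python as well. This is an illustration under stated assumptions, not how RRDtool works internally: it assumes ten equally weighted samples consolidated with a plain mean, and a simplified first-match scale lookup. All names here (`CONTIGUOUS_SCALE`, `lookup_colour`, `samples`) are hypothetical.

```python
# The gap-free scale from above, as (min, max, (r, g, b)) rows.
CONTIGUOUS_SCALE = [
    (0.5, 1.5, (255, 0, 0)),
    (1.5, 2.5, (0, 255, 0)),
]


def lookup_colour(scale, value):
    """Simplified lookup (assumption): first row with min <= value <= max wins."""
    for low, high, colour in scale:
        if low <= value <= high:
            return colour
    return None


# Hypothetical history: the state was 2 for seven samples, then changed to 1.
samples = [2] * 7 + [1] * 3
average = sum(samples) / len(samples)  # 17 / 10

colour_for_average = lookup_colour(CONTIGUOUS_SCALE, average)
colour_for_latest = lookup_colour(CONTIGUOUS_SCALE, samples[-1])
print(average, colour_for_average, colour_for_latest)
```

The average lands in the 1.5 to 2.5 row even though the most recent state is 1, so the map would be drawn with the colour for the old state.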
Weathermap 0.98a is out! & QuickTree 1.0. Superlinks is over there now (and built-in to Cacti 1.x).
Some Other Cacti tweaks, including strip-graphs, icons and snmp/netflow stuff.
(Let me know if you have UK DevOps or Network Ops opportunities, too!)
SSami
Posts: 2
Joined: Tue Nov 22, 2016 12:18 am

Re: Nodes showing offline intermittently

Post by SSami »

Hi Howie,

Thanks for your reply, much appreciated. Basically, we are monitoring traffic on our devices. If any of them stops passing traffic, the link (or arrow) turns from green to grey on the weathermap and the device is marked as down in the devices list.

The issue we are facing is that even though a device is marked up, is passing traffic, and is reachable from Cacti, its link still shows grey, as if it were not passing traffic.

I have attached a sample of the logs we are getting for one of the devices affected by this issue.

PS: We have tried rebuilding the rrd file manually, but to no avail.

Also, could you please tell me where to check the scale? I'm not sure where it is defined.

Thanks,
Attachments
cacti rrd error.PNG (57.7 KiB) Viewed 1911 times
Howie
Cacti Guru User
Posts: 5508
Joined: Thu Sep 16, 2004 5:53 am
Location: United Kingdom

Re: Nodes showing offline intermittently

Post by Howie »

The scale is defined at the top of the weathermap map config file.

If you are getting 'no valid data' though, it's probably not that.

At the time that you get that error in your logs, please check the last-modified date on that specific file.
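
One way to do that check from a script is sketched below in Python. The idea, which is my reading of the suggestion rather than something stated above, is that a file whose last-modified time is much older than the polling interval is probably not being updated. The 300-second interval, the "two intervals" threshold, and the function names are all assumptions for illustration. The demo runs on a throwaway file, so point it at your own rrd file instead.

```python
import os
import tempfile
import time


def seconds_since_modified(path, now=None):
    """Age of the file's last modification, in seconds."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def looks_stale(path, poll_interval=300, now=None):
    """Assumed rule of thumb: stale if not modified for two polling intervals."""
    return seconds_since_modified(path, now) > 2 * poll_interval


# Demo on a temporary file so the snippet runs anywhere.
with tempfile.NamedTemporaryFile(suffix=".rrd", delete=False) as tmp:
    demo_path = tmp.name

fresh = looks_stale(demo_path)  # just created, so not stale
mtime = os.path.getmtime(demo_path)
stale_later = looks_stale(demo_path, now=mtime + 3600)  # pretend an hour passed
os.remove(demo_path)
print(fresh, stale_later)
```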