frequent drops in cacti graphs

sangayya.alagundimath
Posts: 20
Joined: Thu Aug 20, 2009 7:41 am

Post by sangayya.alagundimath »

Hi

All of a sudden there are frequent drops in existing graphs, and newly added graphs (Windows disk, CPU) are not being generated.

Please find the cacti.log below:
12/15/2009 10:02:29 PM - CMDPHP: Poller[0] Host[173] DS[2651] WARNING: Result from SERVER not valid. Partial Result: U
12/15/2009 10:02:29 PM - CMDPHP: Poller[0] Host[173] DS[2652] WARNING: Result from SERVER not valid. Partial Result: U
12/15/2009 10:02:29 PM - CMDPHP: Poller[0] Host[173] DS[2652] WARNING: Result from SERVER not valid. Partial Result: U
12/15/2009 10:02:29 PM - CMDPHP: Poller[0] Host[174] DS[2714] WARNING: Result from SERVER not valid. Partial Result: U
12/15/2009 10:02:29 PM - CMDPHP: Poller[0] Host[174] DS[2714] WARNING: Result from SERVER not valid. Partial Result: U
12/15/2009 10:02:29 PM - CMDPHP: Poller[0] Host[174] DS[2713] WARNING: Result from SERVER not valid. Partial Result: U
12/15/2009 10:02:29 PM - CMDPHP: Poller[0] Host[174] DS[2713] WARNING: Result from SERVER not valid. Partial Result: U
12/15/2009 10:02:29 PM - CMDPHP: Poller[0] Host[174] DS[2712] WARNING: Result from SERVER not valid. Partial Result: U
12/15/2009 10:02:29 PM - CMDPHP: Poller[0] Host[174] DS[2712] WARNING: Result from SERVER not valid. Partial Result: U
12/15/2009 10:02:29 PM - CMDPHP: Poller[0] Host[174] DS[2710] WARNING: Result from SERVER not valid. Partial Result: U
12/15/2009 10:02:32 PM - CMDPHP: Poller[0] Host[178] DS[2736] WARNING: Result from SERVER not valid. Partial Result: U
12/15/2009 10:02:32 PM - CMDPHP: Poller[0] Host[178] DS[2743] WARNING: Result from SERVER not valid. Partial Result: U


Your help will be appreciated. Thanks in advance.
Attachment: Capture.GIF
--SanG
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

The Microsoft SNMP agent, well, it kind of XXXX's. So, on systems with high I/O, you will need to increase your SNMP timeout, possibly to as high as 10 seconds. Also, consider removing any floppy drives, if the system even has them any longer. With the Microsoft SNMP agent, the floppy disk drive gets inspected almost every time you perform a poll. How lame is that.

TheWitness
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?

Post by sangayya.alagundimath »

Thank you, I will check this once I am in the office, but these frequent drops are happening even on network devices and all Unix flavors.

Please let us know if you have a solution. I am getting "Partial Result" / "Result from SNMP not valid" warnings from all these devices in the Cacti log.
--SanG

Post by TheWitness »

Turn off process leveling then. There is a bug there. Console -> Settings -> Poller

TW

Post by sangayya.alagundimath »

I have unchecked Balance Process Load in the Poller settings, but the problem is still the same.

I have changed the script timeout from 25 sec to 60 sec.

I have also changed Maximum SNMP OID's Per SNMP Get Request from 10 to 1.

Some graphs are dropping intermittently and some are always NaN:

error.log:
12/16/2009 09:36:48 PM - CMDPHP: Poller[0] Host[303] DS[5472] WARNING: Result from SNMP not valid. Partial Result: U
12/16/2009 09:36:48 PM - CMDPHP: Poller[0] Host[303] DS[5472] WARNING: Result from SNMP not valid. Partial Result: U

We have deployed the transporter plugin. Is there any problem with using it?
--SanG

Post by TheWitness »

Not certain. You should not have that large of a timeout. Bad mojo. I would have to log into the box to know more.

TheWitness

Post by TheWitness »

You should also make certain that you are on 0.8.7e and have properly taken the default XML and script files from the distribution and not from your old install.

TheWitness

Post by sangayya.alagundimath »

I am on 0.8.7e already. Anyhow, I have deleted all the devices and re-added them.

Some Windows graphs for memory and CPU are still not working, and I am again getting the same partial-result SNMP error. Please find the debug log for one such host:

12/17/2009 10:33:23 PM - WEBLOG: Poller[0] CACTI2RRD: /usr/local/rrdtool-1.3.7/bin/rrdtool graph - --imgformat=PNG --start=1261058602 --end=1261060402 --title="file-srv-a - Memory - Usage" --rigid --base=1000 --height=120 --width=500 --alt-autoscale-max --lower-limit=0 COMMENT:"From 2009/12/17 22\:03\:22 To 2009/12/17 22\:33\:22\c" COMMENT:" \n" --vertical-label="megabytes" --slope-mode --font TITLE:12: --font AXIS:8: --font LEGEND:10: --font UNIT:8: DEF:a="/var/www/htdocs/cacti/rra/file-srv-a_mem_used_8114.rrd":mem_total:AVERAGE DEF:b="/var/www/htdocs/cacti/rra/file-srv-a_mem_used_8114.rrd":mem_used:AVERAGE CDEF:cdefa=a,1048576,* CDEF:cdefc=b,1048576,* AREA:cdefa#002A97FF:"Total" GPRINT:cdefa:LAST:"Amount\:%8.2lf%s\n" AREA:cdefc#EA8F00FF:"Used\:" GPRINT:cdefc:LAST:"Current\:%8.2lf%s" GPRINT:cdefc:AVERAGE:"Average\:%8.2lf%s" GPRINT:cdefc:MAX:"Maximum\:%8.2lf%s\n"
12/17/2009 10:33:23 PM - WEBLOG: Poller[0] CACTI2RRD: /usr/local/rrdtool-1.3.7/bin/rrdtool graph - --imgformat=PNG --start=1261058602 --end=1261060402 --title="file-srv-a - CPU Load" --rigid --base=1000 --height=120 --width=500 --upper-limit=100 --lower-limit=0 COMMENT:"From 2009/12/17 22\:03\:22 To 2009/12/17 22\:33\:22\c" COMMENT:" \n" --vertical-label="CPU Load %" --slope-mode --font TITLE:12: --font AXIS:8: --font LEGEND:10: --font UNIT:8: DEF:a="/var/www/htdocs/cacti/rra/file-srv-a_cpu_load_8113.rrd":cpu_load:AVERAGE AREA:a#005199FF:"CPU Load\n" GPRINT:a:LAST:"Current\:%8.2lf%s" GPRINT:a:MIN:"Minimum\:%8.2lf%s" GPRINT:a:AVERAGE:"Average\:%8.2lf%s" GPRINT:a:MAX:"Maximum\:%8.2lf%s"
--SanG

Post by sangayya.alagundimath »

I am getting this error in the Cacti log for memory usage:

12/17/2009 11:31:21 PM - CMDPHP: Poller[0] Host[433] DS[8179] WARNING: Result from CMD not valid. Partial Result: U

In Graph Management, the RRD file has not been created:
RRDTool Command:
/usr/local/rrdtool-1.3.7/bin/rrdtool graph - \
--imgformat=PNG \
--start=-86400 \
--end=-300 \
--title="CIM-PCSDAPP1 - Memory - Usage" \
--rigid \
--base=1000 \
--height=120 \
--width=500 \
--alt-autoscale-max \
--lower-limit=0 \
--vertical-label="megabytes" \
--slope-mode \
--font TITLE:12: \
--font AXIS:8: \
--font LEGEND:10: \
--font UNIT:8: \
DEF:a="/var/www/htdocs/cacti/rra/cim-pcsdapp1_mem_used_8180.rrd":mem_total:AVERAGE \
DEF:b="/var/www/htdocs/cacti/rra/cim-pcsdapp1_mem_used_8180.rrd":mem_used:AVERAGE \
CDEF:cdefa=a,1048576,* \
CDEF:cdefc=b,1048576,* \
AREA:cdefa#002A97FF:"Total" \
GPRINT:cdefa:LAST:"Amount\:%8.2lf%s\n" \
AREA:cdefc#EA8F00FF:"Used\:" \
GPRINT:cdefc:LAST:"Current\:%8.2lf%s" \
GPRINT:cdefc:AVERAGE:"Average\:%8.2lf%s" \
GPRINT:cdefc:MAX:"Maximum\:%8.2lf%s\n"
RRDTool Says:
ERROR: opening '/var/www/htdocs/cacti/rra/cim-pcsdapp1_mem_used_8180.rrd': No such file or directory
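
A quick way to see which RRD files a graph command expects but cannot find is sketched below. This is a minimal Python sketch, not part of Cacti: the names `rrd_paths` and `missing_rrds` are made up for illustration, and it only assumes the `DEF:name="path":ds:CF` form visible in the command above.

```python
import os
import re

# Matches the quoted file path in an rrdtool DEF argument, e.g.
# DEF:a="/path/to/file.rrd":mem_total:AVERAGE
DEF_PATH = re.compile(r'DEF:\w+="([^"]+)"')


def rrd_paths(command):
    """Return the distinct RRD paths referenced by DEF arguments, in order."""
    seen = []
    for path in DEF_PATH.findall(command):
        if path not in seen:
            seen.append(path)
    return seen


def missing_rrds(command):
    """Return the referenced RRD paths that do not exist on this machine."""
    return [p for p in rrd_paths(command) if not os.path.isfile(p)]


if __name__ == "__main__":
    # Two DEF arguments copied from the graph command in this post.
    cmd = (
        'DEF:a="/var/www/htdocs/cacti/rra/cim-pcsdapp1_mem_used_8180.rrd":mem_total:AVERAGE '
        'DEF:b="/var/www/htdocs/cacti/rra/cim-pcsdapp1_mem_used_8180.rrd":mem_used:AVERAGE'
    )
    for path in missing_rrds(cmd):
        print("missing:", path)
```

If a file is reported missing, the poller may never have stored a value for that data source, which would fit the "Partial Result: U" warning for the same host.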
--SanG

Post by TheWitness »

Likely that OID does not exist. Can you confirm by performing a manual walk of the OID?
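
A manual walk can be scripted so the same check is repeatable across hosts. The sketch below is only an illustration under stated assumptions: it assumes the Net-SNMP `snmpwalk` binary is on the PATH, uses its standard `-v`, `-c`, `-t` (timeout in seconds) and `-r` (retries) options, and the function names, the `public` community and the SNMP version are placeholders to replace with your own values.

```python
import subprocess


def snmpwalk_command(host, oid, community="public", version="2c",
                     timeout_s=10, retries=1):
    """Build the argument list for a Net-SNMP snmpwalk call."""
    return [
        "snmpwalk",
        "-v", version,         # SNMP version (placeholder default)
        "-c", community,       # community string (placeholder default)
        "-t", str(timeout_s),  # per-request timeout in seconds
        "-r", str(retries),    # number of retries
        host,
        oid,
    ]


def walk(host, oid, **options):
    """Run snmpwalk and return (ok, stdout_lines, stderr_text)."""
    try:
        proc = subprocess.run(
            snmpwalk_command(host, oid, **options),
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # snmpwalk is not installed or not on the PATH.
        return False, [], "snmpwalk not found"
    return proc.returncode == 0, proc.stdout.splitlines(), proc.stderr.strip()
```

For example, `walk("192.0.2.10", ".1.3.6.1.2.1")` (a documentation address standing in for the real host) returns `ok` as false together with the error text when the agent does not answer, and the returned lines when it does.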

TheWitness
twilliam
Posts: 3
Joined: Mon Jan 11, 2010 4:14 am

Post by twilliam »

I have the exact same problem! My worst host seems to be id 10 (fi91), attached image below. My other hosts seem to work better but still have issues as well.

mysql> select description from host where id = 10;
+-------------+
| description |
+-------------+
| fi91        | 
+-------------+
1 row in set (0.00 sec)

mysql> select count(*) from host;
+----------+
| count(*) |
+----------+
|       10 | 
+----------+
1 row in set (0.00 sec)

What I've already tried to change:

SNMP Timeout: 500 -> 1000
Script and Script Server Timeout Value: 25 -> 40
Maximum Concurrent Poller Processes: 1 -> 2
Balance Process Load: Checked -> Unchecked

01/11/2010 11:12:05 AM - CMDPHP: Poller[0] Host[5] DS[53] WARNING: Result from SERVER not valid. Partial Result: U
01/11/2010 11:12:05 AM - CMDPHP: Poller[0] Host[5] DS[58] WARNING: Result from SERVER not valid. Partial Result: U
01/11/2010 11:12:05 AM - CMDPHP: Poller[0] Host[5] DS[58] WARNING: Result from SERVER not valid. Partial Result: U
01/11/2010 11:12:05 AM - CMDPHP: Poller[0] Host[10] DS[149] WARNING: Result from SERVER not valid. Partial Result: 01/11/2010 11:12:05
01/11/2010 11:14:05 AM - CMDPHP: Poller[0] Host[10] DS[151] WARNING: Result from SERVER not valid. Partial Result: 01/11/2010 11:14:05
01/11/2010 11:16:04 AM - CMDPHP: Poller[0] Host[5] DS[57] WARNING: Result from SERVER not valid. Partial Result: 01/11/2010 11:16:04
01/11/2010 11:18:04 AM - CMDPHP: Poller[0] Host[5] DS[54] WARNING: Result from SERVER not valid. Partial Result: 01/11/2010 11:18:04
01/11/2010 11:18:04 AM - CMDPHP: Poller[0] Host[10] DS[150] WARNING: Result from SERVER not valid. Partial Result: 01/11/2010 11:18:04
01/11/2010 11:21:05 AM - CMDPHP: Poller[0] Host[10] DS[148] WARNING: Result from SERVER not valid. Partial Result: 01/11/2010 11:21:05
01/11/2010 11:22:04 AM - CMDPHP: Poller[0] Host[5] DS[58] WARNING: Result from SERVER not valid. Partial Result: 01/11/2010 11:22:04
01/11/2010 11:22:04 AM - CMDPHP: Poller[0] Host[5] DS[58] WARNING: Result from SERVER not valid. Partial Result: U
01/11/2010 11:22:04 AM - CMDPHP: Poller[0] Host[10] DS[149] WARNING: Result from SERVER not valid. Partial Result: 01/11/2010 11:22:04
[root@m1 log]# grep "Result from SERVER not valid" cacti.log | wc -l
18572
[root@m1 log]# grep "Result from SERVER not valid" cacti.log | grep "Host\[10\]" | wc -l
7517
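
The same counts can be broken down per host and per result source in one pass. This is a rough Python sketch that assumes only the warning format shown in the log lines above; the function name `tally` and the sample lines in the demo are illustrative, and in practice you would pass it the lines of your own cacti.log.

```python
import re
from collections import Counter

# Matches warnings such as:
# ... Host[10] DS[149] WARNING: Result from SERVER not valid. Partial Result: U
WARNING = re.compile(
    r"Host\[(\d+)\] DS\[(\d+)\] WARNING: Result from (\w+) not valid"
)


def tally(lines):
    """Count 'not valid' warnings per host id and per source (SERVER, SNMP, CMD)."""
    by_host = Counter()
    by_source = Counter()
    for line in lines:
        match = WARNING.search(line)
        if match:
            host_id, _ds_id, source = match.groups()
            by_host[int(host_id)] += 1
            by_source[source] += 1
    return by_host, by_source


if __name__ == "__main__":
    # Sample lines copied from this thread; replace with open("cacti.log").
    sample = [
        "01/11/2010 11:12:05 AM - CMDPHP: Poller[0] Host[5] DS[53] WARNING: Result from SERVER not valid. Partial Result: U",
        "01/11/2010 11:12:05 AM - CMDPHP: Poller[0] Host[10] DS[149] WARNING: Result from SERVER not valid. Partial Result: 01/11/2010 11:12:05",
        "12/16/2009 09:36:48 PM - CMDPHP: Poller[0] Host[303] DS[5472] WARNING: Result from SNMP not valid. Partial Result: U",
    ]
    by_host, by_source = tally(sample)
    for host_id, count in by_host.most_common():
        print("Host[%d]: %d" % (host_id, count))
    for source, count in by_source.most_common():
        print("%s: %d" % (source, count))
```

Splitting by source helps separate script server problems (SERVER) from plain SNMP timeouts (SNMP) and script failures (CMD), which may have different causes.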

This is what I've noticed as a difference:

[root@m1 log]# snmpwalk <hidden options> 10.*.*.91 | wc -l
6229
[root@m1 log]# snmpwalk <hidden options> 10.*.*.22 | wc -l
4537

I'll try to debug some more and increase my timeouts even more.

- William

Post by twilliam »

I think I've found my problem - managed to reproduce this error from the CLI:

[root@m1 log]# snmpwalk <hidden options> 10.*.*.91 .1.3.6.1.2.1 | wc -l
Timeout: No Response from 10.*.*.91
10
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

twilliam wrote: I think I've found my problem - managed to reproduce this error from the CLI:

[root@m1 log]# snmpwalk <hidden options> 10.*.*.91 .1.3.6.1.2.1 | wc -l
Timeout: No Response from 10.*.*.91
10

No Response is definitely bad ...
R

Post by TheWitness »

A big problem is that you have a script server script that is not responding properly, and this is causing all sorts of havoc in the script server. Can you validate which script server script is causing the problem by taking the problem hosts and running a simple script server test?

TheWitness

Post by twilliam »

I'll try to. As you said, it seems like this problem only occurs with the probes that use XML files.

<path_cacti>/resource/script_server/host_disk.xml
<path_cacti>/resource/snmp_queries/interface.xml

How can I run this script server test? Any hints except tracing arguments and running php scripts in <path_cacti>/scripts?

Edit: Never mind, found chapter 14 in the documentation.