Graph issue: discontinuous graphs shown

ni5ni6
Posts: 9
Joined: Tue Mar 25, 2008 8:07 am

Post by ni5ni6 »

Hi, my configuration is as follows:

Operating System: Windows 2K3
Webserver: IIS 6.0
Cacti: 0.8.7b
MySQL: 5.0
PHP: 5.2.5
RRDTool (Win32 version): 1.2.15
Net-SNMP: 5.x

I posted a few days ago about rrdtool.exe misbehaving, but I realized that was probably due to a hardware or OS driver problem.

My problem now is that I'm getting scattered, discontinuous graphs, as you can see in the attached image.

I couldn't capture the errors in debug mode because every time I watch for a corrupted poll cycle it behaves perfectly, but the errors pop up when I'm not tracing.
All I have is plenty of:
04/09/2008 08:05:30 AM - CMDPHP: Poller[0] Host[28] DS[194] WARNING: Result from SERVER not valid. Partial Result:
I've taken your remarks into account and installed rrdtool v. 1.2.27, so I will be waiting for new errors.

In the meantime, could anyone tell me if you know of similar cases and what exactly caused them? :roll:

Thanks!
Nikola
Attachment: graph.PNG
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Post by BSOD2600 »

1) Increase the timeout and memory values in your php.ini.
2) Increase the timeout for the device that is having problems.
3) Sometimes, when a Windows box is under load, it has issues reporting SNMP data.
ni5ni6
Posts: 9
Joined: Tue Mar 25, 2008 8:07 am

Post by ni5ni6 »

BSOD2600 wrote: 1) increase the timeout and memory values in your php.ini.
My maximum execution time is set to 60, input time to 120, and memory limit to 128M.
What are the recommended values, given that those Windows boxes are under decent load but I do want to monitor them successfully? Should I double them (except the memory limit, which I think is fair enough)?
2) increase the timeout for that device which is having problems.
Actually, all 14 of the machines I monitor have the same problem. :o

I heard from a Cacti user who monitors Linux machines and had the same problem that the solution is to switch SNMP to 64-bit counters (which, I suppose, means using SNMP version 2).
I changed the version to v2 on some of the machines, but nothing changed. The problem remains. Any thoughts about this?
Thanks, guys!
Syngress
Posts: 26
Joined: Tue Apr 24, 2007 9:17 am

Post by Syngress »

I too am having this same issue, although in my case it is with only 1 machine out of almost 175 devices. The issue started after some major work on our network, which caused Cacti to have trouble reaching a lot of devices for a while.

To prove the device isn't at fault, I added it to Cacti again under a test name and selected the same graphs as the old one. It is being monitored under the test setup with no issues, which leads me to think that Cacti has somehow got confused. I just need to find a way to 'unconfuse' it now :)

Mike
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Can you please re-index your hosts? What does the poller register when you run it? What does your log look like? Are you running any plugins? What poller are you using?

Regards,

TheWitness
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
ni5ni6
Posts: 9
Joined: Tue Mar 25, 2008 8:07 am

Post by ni5ni6 »

Hey,
I re-indexed the hosts, but before that I changed these parameters in php.ini:
max_execution_time = 120
max_input_time = 240
memory_limit = 128M
During the whole day yesterday I had only ONE execution failure. Here is the log file.
Before this, everything seems to be fine...
04/14/2008 11:46:13 PM - CMDPHP: Poller[0] Host[24] DS[150] SERVER: C:\Inetpub\wwwroot\cacti\scripts\ss_host_disk.php ss_host_disk x.x.x.x 24 1:161:1000:public:::MD5::DES: get used 6, output: 177471488
04/14/2008 11:46:13 PM - CMDPHP: Poller[0] Time: 7.9700 s, Theads: N/A, Hosts: 4
04/14/2008 11:51:03 PM - POLLER: Poller[0] Maximum runtime of 298 seconds exceeded. Exiting.
04/14/2008 11:51:03 PM - SYSTEM STATS: Time:298.3831 Method:cmd.php Processes:5 Threads:N/A Hosts:16 HostsPerProcess:4 DataSources:209 RRDsProcessed:97
04/14/2008 11:54:29 PM - CMDPHP: Poller[0] Host[26] DS[160] CMD: C:/Inetpub/wwwroot/cacti/scripts/rdp_connections.pl x.x.x.x 1 cstring 161 500, output: established:0
04/14/2008 11:54:29 PM - CMDPHP: Poller[0] Host[26] DS[161] WARNING: Result from SERVER not valid. Partial Result:
04/14/2008 11:54:29 PM - CMDPHP: Poller[0] Host[26] DS[161] SERVER: C:\Inetpub\wwwroot\cacti\scripts\ss_host_disk.php ss_host_disk x.x.x.x 26 1:161:500:cstring:::MD5::DES: get used 1, output: U
04/14/2008 11:54:29 PM - CMDPHP: Poller[0] Host[26] DS[161] WARNING: Result from SERVER not valid. Partial Result:
...
After this line come plenty of broken, "not valid" log lines,
and then again...

04/14/2008 11:54:30 PM - CMDPHP: Poller[0] Host[28] DS[226] CMD: C:/Inetpub/wwwroot/cacti/scripts/rdp_connections.pl x.x.x.x 2 cstring 161 1000, output: established:0
04/14/2008 11:54:30 PM - CMDPHP: Poller[0] Time: 504.2719 s, Theads: N/A, Hosts: 4
04/14/2008 11:56:04 PM - POLLER: Poller[0] NOTE: Poller Int: '300', Scheduled Task Int: '300', Time Since Last: '600', Max Runtime '298', Poller Runs: '1'
04/14/2008 11:56:04 PM - POLLER: Poller[0] WARNING: Scheduled Task is out of sync with the Poller Interval! The Poller Interval is '300' seconds, with a maximum of a '300' second Scheduled Task, but 600 seconds have passed since the last poll!
04/14/2008 11:56:04 PM - POLLER: Poller[0] WARNING: Poller Output Table not Empty. Potential Data Source Issues for Data Sources: established(DS[160]), hdd_total(DS[161]), hdd_used(DS[161]), hdd_total(DS[162]), hdd_used(DS[162]), hdd_total(DS[163]), hdd_used(DS[163]), cpu(DS[164]), cpu(DS[165]), cpu(DS[166]), cpu(DS[167]), traffic_in(DS[168]), traffic_out(DS[168]), traffic_in(DS[169]), traffic_out(DS[169]), hdd_total(DS[170]), hdd_used(DS[170]), established(DS[179]), hdd_total(DS[180]), hdd_used(DS[180]), hdd_total(DS[181]), hdd_used(DS[181]), hdd_total(DS[182]), hdd_used(DS[182]), cpu(DS[183]), cpu(DS[184]), cpu(DS[185]), cpu(DS[186]), traffic_in(DS[187]), traffic_out(DS[187]), traffic_in(DS[188]), traffic_out(DS[188]), hdd_total(DS[189]), hdd_used(DS[189]), ctx_act_session(DS[190]), ctx_act_session(DS[191]), hdd_total(DS[192]), hdd_used(DS[192]), hdd_total(DS[193]), hdd_used(DS[193]), hdd_total(DS[194]), hdd_used(DS[194]), cpu(DS[195]), cpu(DS[196]), cpu(DS[197]), cpu(DS[198]), traffic_in(DS[199]), traffic_out(DS[199]), traffic_in(DS[200]), traffic_out(DS[200]), hdd_total(DS[201]), hdd_used(DS[201]), established(DS[226])
04/14/2008 11:56:06 PM - PHPSVR: Poller[0] PHP Script Server has Started - Parent is cmd
... and everything runs smoothly again
I have to mention that this happened only once in the past 20 hours. Other polling cycles take approx. 10-14 seconds to finish.
As you can see, I'm using the default cmd.php poller.
I'm not running any plugins.
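
Editor's note: for anyone trying to catch an intermittent failure like this without sitting in debug mode, the log can be scanned after the fact. Below is a minimal Python sketch, assuming the log keeps exactly the line format shown in this post. The names `slow_cycles` and `SAMPLE` and the 298-second default are illustrative choices, not part of Cacti.

```python
import re
from datetime import datetime

# Matches the per-process summary line that cmd.php writes, e.g.
# "04/14/2008 11:54:30 PM - CMDPHP: Poller[0] Time: 504.2719 s, Theads: N/A, Hosts: 4"
# (the "Theads" spelling is copied from the log excerpt in this thread).
TIME_LINE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2} [AP]M) - CMDPHP: "
    r"Poller\[\d+\] Time: (?P<secs>[\d.]+) s"
)


def slow_cycles(lines, limit=298.0):
    """Return (timestamp, seconds) for every cmd.php run longer than `limit`."""
    hits = []
    for line in lines:
        match = TIME_LINE.match(line)
        if match is None:
            continue
        seconds = float(match.group("secs"))
        if seconds > limit:
            stamp = datetime.strptime(match.group("stamp"), "%m/%d/%Y %I:%M:%S %p")
            hits.append((stamp, seconds))
    return hits


# Two summary lines taken from the log excerpt above.
SAMPLE = [
    "04/14/2008 11:46:13 PM - CMDPHP: Poller[0] Time: 7.9700 s, Theads: N/A, Hosts: 4",
    "04/14/2008 11:54:30 PM - CMDPHP: Poller[0] Time: 504.2719 s, Theads: N/A, Hosts: 4",
]

if __name__ == "__main__":
    for stamp, seconds in slow_cycles(SAMPLE):
        print(stamp.isoformat(), seconds)
```

The log lines just before a flagged timestamp are the place to look for the data source that was being polled when the process stalled.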
ni5ni6
Posts: 9
Joined: Tue Mar 25, 2008 8:07 am

Post by ni5ni6 »

So, any clues, guys? I couldn't find the reason or a solution for this issue.

As I see it, the poller behaves like this:
1. The poller hangs after successfully polling a number of devices. It polls, e.g., 10 devices and then hangs (without reporting any error).
2. After the maximum polling period of 298 seconds passes, it stops and starts another cycle. This time it reports that it has no data for the devices it didn't poll last time, which leaves a gap in the graph.
3. The next polling cycle brings everything back to normal.


Please, this is becoming really stressful.

Is there any way to compensate for those missed values? It is just ugly to see a scattered graph. If there were a way to draw a straight line between the last good cycle and the first cycle after the poller hang, it would be wonderful. One or two polling cycles cannot change much considering the whole view of the graph.
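
Editor's note: the "straight line between the last good cycle and the first cycle after the hang" is plain linear interpolation. The sketch below only illustrates the arithmetic on a list of exported values, with `None` standing for a missed poll. It is not a Cacti or RRDtool setting, and the function name `fill_gaps` is hypothetical.

```python
def fill_gaps(values):
    """Linearly interpolate runs of None that lie between two known samples.

    Leading and trailing None entries are left alone, because there is no
    sample on one side to draw the line from.
    """
    filled = list(values)
    last = None  # index of the most recent known sample
    for i, value in enumerate(values):
        if value is None:
            continue
        if last is not None and i - last > 1:
            # Slope of the straight line joining the two known samples.
            step = (value - values[last]) / (i - last)
            for j in range(last + 1, i):
                filled[j] = values[last] + step * (j - last)
        last = i
    return filled


if __name__ == "__main__":
    # Two missed polling cycles between 10.0 and 16.0.
    print(fill_gaps([10.0, None, None, 16.0]))
```

Keep in mind that filling gaps this way hides the fact that polls were missed, which is the symptom that led to the root cause in this thread.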

Hope to hear from you asap...
Thanks!
ni5ni6
Posts: 9
Joined: Tue Mar 25, 2008 8:07 am

Post by ni5ni6 »

Finally, I found out what was causing the gaps in my graphs.
It was a (probably buggy) custom Perl script that I used to track RDP connections to my Windows boxes.
Once I removed the queries from all devices that were using it, the graphs and the log became "smooth" :)

I really need that RDP script, so if anyone knows how to fix it, please see this post: http://forums.cacti.net/viewtopic.php?t=27041
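
Editor's note: one general way to keep a single misbehaving script from stalling a whole polling cycle is to run it behind a wrapper that enforces a time limit. The following is a minimal Python sketch of that idea and not a fix for the Perl script itself. The name `run_with_timeout` and the 10-second default are assumptions for illustration; `U` is the value RRDtool accepts for an unknown sample, and it also appears as a script output in the log earlier in this thread.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=10.0):
    """Run `cmd` and return its stdout, or "U" if it hangs, fails, or prints nothing.

    subprocess.run kills the child process when the timeout expires, so a
    stuck script cannot hold up the caller indefinitely.
    """
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return "U"
    output = done.stdout.strip()
    if done.returncode != 0 or not output:
        return "U"
    return output


if __name__ == "__main__":
    # Stand-in commands using the current Python interpreter; a real data
    # input method would pass the actual script path and its arguments.
    fast = [sys.executable, "-c", "print('established:0')"]
    stuck = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(fast))
    print(run_with_timeout(stuck, timeout_s=0.5))
```

With a wrapper like this, a hung script costs one data source a single sample instead of pushing the poller past its maximum runtime.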
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Bad scripts, written in any number of languages, always cause nasty problems. Sorry to hear about it.

TheWitness