[solved] Memory leak...but cannot isolate the culprit
Operating System: Windows XP (SP3)
Webserver: IIS
Cacti: 0.8.7b
Spine: not installed
MySQL: 5.0.67
PHP: 5.2.6
RRDTool (Cygwin or Win32 version): 1.2.28
Net-SNMP: 5.4.1.2
Cygwin (cygwin1.dll version): 1.5.25-cr-0x5f1
Plugin Architecture: none
I've successfully installed Cacti and have been using it to monitor about 30 different servers. It works great. Initially I had Cacti monitoring only my Windows boxes (about 20), but last week I added our non-Windows boxes to the application. Since then I've noticed that after a day or two, the machine will run out of memory and lock up completely. Rebooting resolves the problem and Cacti works properly again.
I've been searching the forums for anyone having similar problems, but I'm not having any luck. So I've gone ahead and applied every 'critical' Windows update available, yet the problem persists. During the initial install, I also applied all available Cacti patches.
I can try to stop monitoring the non-Windows boxes to see if that resolves my problem, but then I wouldn't know where to go from there. Would the problem be in PHP, Net-SNMP or some other component that Cacti relies on?
Please forgive me if I'm not specific enough... this is my first post.
- streaker69
- Cacti Pro User
- Posts: 712
- Joined: Mon Mar 27, 2006 10:35 am
- Location: Psychic Amish Network Administrator
What does Task Manager show for memory usage per process? You should be able to watch it for a few minutes and see which process is growing in memory usage, and see if there is a solution from there.
Chances are, you have some extraneous service running under WinXP that doesn't need to run that's causing it.
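If watching Task Manager by eye is too tedious, the same comparison can be roughly scripted. The following is only a sketch: it assumes two snapshots saved from `tasklist /fo csv /nh`, with the memory figure in the fifth column, and the sample rows, PIDs and helper names are made up for illustration.

```python
import csv
import io


def parse_tasklist_csv(text):
    """Return {(image_name, pid): mem_kb} from `tasklist /fo csv /nh` output."""
    usage = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 5 or not row[1].isdigit():
            continue  # skip blank lines, malformed lines, or a header row
        image, pid, mem = row[0], int(row[1]), row[4]
        # Mem Usage looks like "20,000 K"; keep only the digits
        digits = "".join(ch for ch in mem if ch.isdigit())
        if digits:
            usage[(image, pid)] = int(digits)
    return usage


def memory_growth(before, after):
    """List (image, pid, delta_kb) for processes in both samples, largest growth first."""
    deltas = [
        (image, pid, after[(image, pid)] - kb)
        for (image, pid), kb in before.items()
        if (image, pid) in after
    ]
    return sorted(deltas, key=lambda d: d[2], reverse=True)


# Hypothetical snapshots taken some minutes apart (values are invented)
SAMPLE_1 = '''"svchost.exe","1012","Console","0","20,000 K"
"mysqld-nt.exe","1460","Console","0","30,000 K"
"php.exe","2200","Console","0","8,000 K"
'''
SAMPLE_2 = '''"svchost.exe","1012","Console","0","26,500 K"
"mysqld-nt.exe","1460","Console","0","30,200 K"
"php.exe","2310","Console","0","8,100 K"
'''

growth = memory_growth(parse_tasklist_csv(SAMPLE_1), parse_tasklist_csv(SAMPLE_2))
for image, pid, delta in growth:
    print(f"{image} (PID {pid}): {delta:+d} KB")
```

Processes are matched on image name and PID together, so a short-lived poller process that exits between samples simply drops out of the comparison rather than being reported as growth.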
[b]Cacti Version[/b] - 0.8.7d
[b]Plugin Architecture[/b] - 2.4
[b]Poller Type[/b] - Cactid v
[b]Server Info[/b] - Linux 2.6.18-128.1.6.el5
[b]Web Server[/b] - Apache/2.2.3 (CentOS)
[b]PHP[/b] - 5.2.9
[b]MySQL[/b] - 5.0.45-log
[b]RRDTool[/b] - 1.3.0
[b]SNMP[/b] - 5.3.2.2
[b]Plugins[/b] - PHP Network Managing v0.6.1, Global Plugin Settings v0.6, thold v0.4.1, XMLPort v0.3.5, CactiCam v0.1.5, NetTools v0.1.5, pollperf v0.32, RRD Cleaner v1.1, sqlqueries v0.2, superlinks v0.8, syslog v0.5.2, update v0.4, discovery v0.9, zond v0.34a, hostinfo v0.2, Bloom v0.6.5, mactrack v1.1, weathermap v0.96a, mobile v0.1
Thanks for the reply...that's what I've been doing and have noticed 2 processes consuming memory...
svchost.exe and mysqld-nt.exe
The svchost MAY be the problem, but this box did not have a memory leak until I got Cacti up and running. As for the MySQL process, I would assume that if this were the problem, a lot more users would be having similar issues.
Still monitoring....will post if anything changes.
I hope I'm onto something here. In the Cacti log there were these events:
09/12/2008 04:25:32 AM - CMDPHP: Poller[0] Host[34] DS[126] WARNING: Result from CMD not valid. Partial Result: 1min: 5min: 10
09/12/2008 04:25:31 AM - CMDPHP: Poller[0] Host[32] DS[120] WARNING: Result from CMD not valid. Partial Result: 1min: 5min: 10
These corresponded to some of the Linux/Unix hosts that I was monitoring, specifically the ones using the ucd/net - CPU Usage, Load Average or Memory Usage MIBs.
I've stopped using those MIBs to monitor my hosts and the Cacti log has cleared up. No more warning events. Also, the website appears to be stable and working well. I should know soon if these were the cause of my memory leak.
Just to clarify, although these warnings were appearing in the log, I was still able to get data from the hosts and successfully graph the information. So if this is the problem, I'd like to determine what would need to be changed so that I can re-enable these MIBs.
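To see which hosts and data sources produce these warnings most often, the log can be tallied with a short script. This is a minimal sketch that assumes the log lines look exactly like the two quoted above; the function name is made up, and other Cacti versions may format the line differently.

```python
import re
from collections import Counter

# Pattern based only on the two warning lines quoted in this thread
WARNING_RE = re.compile(
    r"CMDPHP: Poller\[(\d+)\] Host\[(\d+)\] DS\[(\d+)\] "
    r"WARNING: Result from CMD not valid"
)


def count_invalid_results(log_text):
    """Count 'Result from CMD not valid' warnings per (host_id, ds_id)."""
    counts = Counter()
    for match in WARNING_RE.finditer(log_text):
        host_id = int(match.group(2))
        ds_id = int(match.group(3))
        counts[(host_id, ds_id)] += 1
    return counts


LOG = """\
09/12/2008 04:25:32 AM - CMDPHP: Poller[0] Host[34] DS[126] WARNING: Result from CMD not valid. Partial Result: 1min: 5min: 10
09/12/2008 04:25:31 AM - CMDPHP: Poller[0] Host[32] DS[120] WARNING: Result from CMD not valid. Partial Result: 1min: 5min: 10
"""

counts = count_invalid_results(LOG)
for (host_id, ds_id), n in counts.most_common():
    print(f"Host[{host_id}] DS[{ds_id}]: {n} warning(s)")
```

Running it over a full day's log would show whether the warnings cluster on a few data sources, which helps decide which templates to re-enable first.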
1) What type of memory growth/leak are you talking about for those processes? If you were to graph your Cacti box's memory usage, it might give a better picture of how fast or slow this problem is occurring.
2) Run tasklist /svc from the cmd prompt to determine what services are running under the svchost.exe that is consuming memory.
3) As for your Linux template issue, notice the script was unable to get the 1min cpu time? That is the likely issue. When you manually run the script and/or snmpwalk the box(es) for the various cpu usage times, does it return data?
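For step 2, the tasklist output can also be picked apart by script when there are many svchost.exe instances. The sketch below assumes the output was saved with `tasklist /svc /fo csv` and that the third column holds a comma-separated service list; the sample rows and the function name are invented for illustration.

```python
import csv
import io


def services_by_pid(text, image="svchost.exe"):
    """Map PID -> list of service names for every instance of `image`."""
    result = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 3 or not row[1].isdigit():
            continue  # header row or malformed line
        if row[0].lower() != image.lower():
            continue
        result[int(row[1])] = [s.strip() for s in row[2].split(",") if s.strip()]
    return result


# Hypothetical output; real PIDs and service groupings will differ
SAMPLE = '''"Image Name","PID","Services"
"svchost.exe","1012","DcomLaunch,TermService"
"svchost.exe","1100","RpcSs"
"mysqld-nt.exe","1460","MySQL"
'''

hosted = services_by_pid(SAMPLE)
for pid, services in sorted(hosted.items()):
    print(f"svchost.exe PID {pid}: {', '.join(services)}")
```

Matching the PID of the growing svchost.exe from Task Manager against this mapping narrows the leak down to a handful of services.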
| Scripts: Monitor processes | RFC1213 MIB | DOCSIS Stats | Dell PowerEdge | Speedfan | APC UPS | DOCSIS CMTS | 3ware | Motorola Canopy |
| Guides: Windows Install | [HOWTO] Debug Windows NTFS permission problems |
| Tools: Windows All-in-one Installer |
Per your request, here's some more information.
I've attached a screen print of a perfmon that I ran last week prior to turning off the queries for the problematic MIBs. The machine would run out of memory in just over a day.
Another telltale error in the System event log would also precede the failure:
Event Type: Error
Event Source: DCOM
Event Category: None
Event ID: 10016
Date: 9/11/2008
Time: 11:42:43 AM
User: DELL838\IWAM_DELL838
Computer: DELL838
Description:
The application-specific permission settings do not grant Local Activation permission for the COM Server application with CLSID
{0C0A3666-30C9-11D0-8F20-00805F2CD064}
to the user DELL838\IWAM_DELL838 SID (S-1-5-21-1645522239-1482476501-682003330-1008). This security permission can be modified using the Component Services administrative tool.
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Once this error occurred, the machine would be out of memory within hours. Another note: I'm not using WMI to query any of my servers.
On Friday, I stopped querying any MIB that was continuously creating a warning event in the Cacti log. As of this posting, the machine has remained up and running. Also, the 'svchost' and 'mysqld-nt.exe' processes have remained steady, so I don't believe those are the culprits.
There were about 30 ucd/net graph templates and data queries causing errors/warnings, which I removed in order for the machine to remain stable. Since I'm using the cmd.php poller, which process should I be monitoring so that I can begin to narrow this problem down?
I'm willing to re-enable a few of the ucd/net devices in order to 'cause' the problem again.
Thanks for your assistance.
Attachment: perf.JPG (perfmon screen print)
- TheWitness
- Developer
- Posts: 17061
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Better solution IMHO. I do like Windows for some things. For Cacti, no.
TheWitness
True understanding begins only when we realize how little we truly understand...
Life is an adventure, let yours begin with Cacti!
Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages
For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
So if it wasn't the 'svchost' and 'mysqld-nt.exe' processes sucking down the resources, what was it? *SOMETHING* on that system was, which isn't normal...