Help with poller timeout and overruns

dallenk
Posts: 19
Joined: Sat Nov 26, 2011 3:21 am

Help with poller timeout and overruns

Post by dallenk »

I don't know how this happened, or whether it was due to an upgrade (I'm on the most current git pull of develop). It happened seemingly overnight.

Every poller run on the server now sends error/alert emails about exceeding the 60-second timeout. The weird thing is that all the graphs are updating correctly, with no gaps and no missing data. System load is also very high (3+).

cat /proc/cpuinfo | grep processor | wc -l
4

PHP 8.1.30 (cli) (built: Sep 27 2024 04:07:29) (NTS)

df2d7f96a (HEAD -> develop, origin/develop, origin/HEAD) Merge branch 'develop' of https://github.com/Cacti/cacti into develop

Code: Select all

2024-11-28 08:31:31 - SYSTEM STATS: Time:30.7458 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:30:32 - SYSTEM STATS: Time:31.4839 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:29:33 - SYSTEM STATS: Time:31.9367 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:28:37 - SYSTEM STATS: Time:36.1866 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:27:52 - SYSTEM STATS: Time:23.8041 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:27:00 - SYSTEM STATS: Time:58.9428 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:25:38 - SYSTEM STATS: Time:37.7470 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:24:34 - SYSTEM STATS: Time:32.9448 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:23:37 - SYSTEM STATS: Time:36.1345 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:22:27 - SYSTEM STATS: Time:27.2584 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:21:30 - SYSTEM STATS: Time:27.8365 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:20:36 - SYSTEM STATS: Time:35.4239 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:19:36 - SYSTEM STATS: Time:34.4630 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:19:04 - SYSTEM STATS: Time:25.1903 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:18:40 - SYSTEM STATS: Time:99.0168 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1428
2024-11-28 08:16:36 - SYSTEM STATS: Time:34.6589 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:15:39 - SYSTEM STATS: Time:38.0872 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:14:30 - SYSTEM STATS: Time:29.8767 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:13:31 - SYSTEM STATS: Time:28.9734 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:12:23 - SYSTEM STATS: Time:22.2418 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:11:37 - SYSTEM STATS: Time:33.7727 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:10:34 - SYSTEM STATS: Time:33.0161 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458

The system stats don't show these overruns every minute, but I get an email every minute indicating there is a problem (over 1300 this morning).
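
To compare the stats log against the alert emails, one option is to pull the `Time:` field out of each SYSTEM STATS line and list only the runs that went past the limit. The following is a minimal sketch, not part of Cacti: the function names are made up for illustration, the sample lines are copied from the log above, and the 58-second limit is taken from the "Maximum runtime of 58 seconds exceeded" message quoted in this thread.

```python
import re
from datetime import datetime

# Four lines copied from the log excerpt above (newest first, as posted).
LOG = """\
2024-11-28 08:27:52 - SYSTEM STATS: Time:23.8041 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:27:00 - SYSTEM STATS: Time:58.9428 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
2024-11-28 08:18:40 - SYSTEM STATS: Time:99.0168 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1428
2024-11-28 08:16:36 - SYSTEM STATS: Time:34.6589 Method:spine Processes:4 Threads:15 Hosts:169 HostsPerProcess:43 DataSources:2071 RRDsProcessed:1458
"""

LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) - SYSTEM STATS: (.*)$")


def parse_stats(text):
    """Return (timestamp, runtime_seconds, rrds_processed) tuples, oldest first."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a SYSTEM STATS line
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        # The rest of the line is a list of space-separated Key:Value pairs.
        fields = dict(pair.split(":", 1) for pair in match.group(2).split())
        rows.append((stamp, float(fields["Time"]), int(fields["RRDsProcessed"])))
    return sorted(rows)


def slow_runs(rows, limit=58.0):
    """Return (timestamp, runtime) for runs longer than `limit` seconds."""
    return [(stamp, secs) for stamp, secs, _ in rows if secs > limit]


for stamp, secs in slow_runs(parse_stats(LOG)):
    print(f"{stamp:%H:%M:%S} ran {secs:.1f}s")
```

On the sample above this flags only the 08:18:40 and 08:27:00 runs, which matches the observation that the stats log shows far fewer overruns than the per-minute emails suggest.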

Examples

Code: Select all

08:34
Maximum runtime of 58 seconds exceeded for Poller[Main Poller]. Exiting.
WARNING: There are 1 processes detected as overrunning a polling cycle for Poller[Main Poller], please investigate.
WARNING: There are 1 processes detected as overrunning a polling cycle for Poller[Main Poller], please investigate.
WARNING: Cacti Polling Cycle Exceeded Poller Interval by 39.18 seconds
Maximum runtime of 58 seconds exceeded for Poller[Main Poller]. Exiting.

08:28
WARNING: There are 1 processes detected as overrunning a polling cycle for Poller[Main Poller], please investigate.
WARNING: Cacti Polling Cycle Exceeded Poller Interval by 26.97 seconds

So I decided to move the server to an entirely new machine (192 GB RAM, 16 cores) via a backup and restore, and I'm seeing the same thing. I'm super confused.

Is there anything I can dig into deeper as to why this is happening?
TheWitness
Developer
Posts: 17047
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Re: Help with poller timeout and overruns

Post by TheWitness »

Most importantly, enable boost. Remember that develop is also very dynamic. For example, today I made a commit that requires you to rerun the upgrade script. So if you are running develop, make sure to look for upgrades or changes that force rerunning the upgrade.
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?

Re: Help with poller timeout and overruns

Post by dallenk »

Oh man! I didn't even think about that, and I usually check that and the database upgrade when I'm testing.
/facepalm

I will put a sticky note on my wall. (I disabled boost when it was causing faults and forgot to re-enable it.) I may be back to mark this one resolved as PEBKAC errors...

Re: Help with poller timeout and overruns

Post by TheWitness »

Make sure you give someone a thanksgiving hug today. If they are not close, just pick someone randomly off the street. They, he or she may save your life!

Re: Help with poller timeout and overruns

Post by dallenk »

I'm at a loss here. I restored a backup from an old, dying server onto a brand-new dedicated machine with 32 GB RAM (a DL360-G9). Every host has been disabled, and I still see poller runs going past 60 seconds in the logs.

Code: Select all

Interval	60
Type	SPINE 1.3.0 Copyright 2004-2024 by The Cacti Group
Items	Action[0]: 1,520, Action[1]: 7, Action[2]: 545, Total: 2,072
Concurrent Processes	Name: Main Poller, Procs: 4
Max Threads	Name: Main Poller, Threads: 20
PHP Servers	1
Minimum Connections:	Main Server: Current: 200, Min Required: 184
Script Timeout	10
Max OID	10
Last Run Statistics	Time:60.3172 Method:spine Processes:4 Threads:20 Hosts:0 HostsPerProcess:0 DataSources:33 RRDsProcessed:0

Code: Select all

collation_server = utf8mb4_unicode_ci
character_set_server = utf8mb4
max_heap_table_size = 512M
tmp_table_size = 512M
innodb_buffer_pool_size = 8000M
innodb_doublewrite = OFF
innodb_log_file_size = 1G
innodb_flush_log_at_trx_commit = 1
query_cache_size = 16M
query_cache_limit = 16M
join_buffer_size = 262144
sort_buffer_size = 2097152
read_buffer_size = 4M
read_rnd_buffer_size = 8M
max_connections = 200
key_buffer_size = 24M

I'm using the recommended settings from support.php. The odd thing is that nothing really changed between when it worked great and when it didn't, aside from git pulls.
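
As a quick sanity check on the settings above, a few lines of Python can confirm two things: that `max_connections` clears the "Min Required: 184" figure from the poller summary, and that `tmp_table_size` and `max_heap_table_size` agree (the server limits in-memory temporary tables to the smaller of the two). This is only a sketch; the helper names are invented here, and the config text is a subset copied from the block above.

```python
# Subset of the settings posted above.
CONFIG = """\
max_heap_table_size = 512M
tmp_table_size = 512M
innodb_buffer_pool_size = 8000M
max_connections = 200
"""

UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value):
    """Convert a value such as '512M' or '262144' to an integer byte count."""
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in UNITS:
        return int(value[:-1]) * UNITS[suffix]
    return int(value)


def parse_config(text):
    """Return a dict of 'key = value' settings, skipping comments and blanks."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line or line.lstrip().startswith("#"):
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


settings = parse_config(CONFIG)

MIN_REQUIRED = 184  # "Min Required" value from the poller summary in this post
headroom = int(settings["max_connections"]) - MIN_REQUIRED
print(f"max_connections headroom: {headroom}")

sizes_match = parse_size(settings["tmp_table_size"]) == parse_size(
    settings["max_heap_table_size"]
)
print(f"tmp_table_size matches max_heap_table_size: {sizes_match}")
```

Both checks pass for the posted values, which is consistent with the database settings not being the obvious culprit.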

DB has been updated.
spine 1.3.0
87c9b17bf (HEAD -> develop, origin/develop, origin/HEAD) Fix #5952 - New Form dropdown type drop_icon



I'm confused... (more than usual)

Re: Help with poller timeout and overruns

Post by TheWitness »

So, for so few servers you really only need one process and, say, 20 threads.
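
The arithmetic behind that suggestion can be sketched roughly as follows. This assumes hosts are split evenly across processes and that the thread count applies per process; how spine actually distributes work may differ. The host count of 169 comes from the SYSTEM STATS lines earlier in the thread, and the function name is hypothetical.

```python
import math


def load_per_worker(hosts, processes, threads):
    """Return (hosts per process, hosts per thread), assuming an even split."""
    per_process = math.ceil(hosts / processes)
    per_thread = math.ceil(per_process / threads)
    return per_process, per_thread


HOSTS = 169  # from the SYSTEM STATS lines earlier in the thread

# Original setup: 4 processes x 15 threads = 60 threads in total.
print(load_per_worker(HOSTS, processes=4, threads=15))

# Suggested setup: 1 process x 20 threads = 20 threads in total.
print(load_per_worker(HOSTS, processes=1, threads=20))
```

Under these assumptions the original setup gives 43 hosts per process (matching `HostsPerProcess:43` in the log) and about 3 hosts per thread, while a single process with 20 threads gives about 9 hosts per thread with a third of the total threads.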