cacti with 5000+ DS not updating all data sources

nate450
Posts: 17
Joined: Mon Mar 13, 2006 1:45 pm
Location: Seattle, WA

Post by nate450 »

I've inherited a cacti installation that I don't think is set up very well.

It's running cacti 0.8.6h and cactid 0.8.6h.

Currently there are about 300 devices and about 5800 data sources.

Enabling full debug shows cacti claiming it's updating the data sources for some hosts, but the RRD files haven't been touched in about 18 hours. At first I thought I had resolved this just by increasing the number of poller processes (it was at 4, I increased it to 8), and it went about a week without much issue.

But now it's cropped back up again. Cacti says it "completes" in about 70-80 seconds. There are no errors related to the affected hosts in the logs; it says it's updating them, but it's not. I further increased pollers/threads/etc. to see if that would help, but it hasn't. Load on the system isn't too high, but something is preventing cacti from touching the files. Permissions are all fine.
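
A quick way to see exactly which files have gone stale is to compare their modification times against the current time. Here is a minimal Python sketch of that check. The function name `find_stale_rrds` and the idea of scanning a single directory tree for `.rrd` files are assumptions for illustration; point it at wherever your installation actually keeps its RRD files.

```python
import os
import time


def find_stale_rrds(rra_dir, max_age_seconds, now=None):
    """Return (path, age_in_seconds) for .rrd files older than max_age_seconds.

    rra_dir is whatever directory holds the RRD files (an assumption here,
    it varies per installation). Results are sorted oldest first.
    """
    now = time.time() if now is None else now
    stale = []
    for root, _dirs, files in os.walk(rra_dir):
        for name in files:
            if not name.endswith(".rrd"):
                continue  # ignore anything that is not an RRD file
            path = os.path.join(root, name)
            age = now - os.path.getmtime(path)
            if age > max_age_seconds:
                stale.append((path, age))
    # Largest age first, so the longest-neglected files are at the top
    return sorted(stale, key=lambda item: item[1], reverse=True)
```

Calling something like `find_stale_rrds("/path/to/rra", 3600)` (the path is a placeholder) would list every RRD file that has not been written in the last hour, which makes it easier to tell whether the same hosts are skipped on every run.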

I'm thinking there are just too many data sources, and some sort of internal limit (an internal timeout or something) is preventing cacti from touching some files.

Cacti's log says out of 5850 DSs, it is updating 5613 of them.
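
Working out the gap from those two log numbers (the variable names are just for illustration):

```python
# Figures taken from the log line quoted above
total_ds = 5850
updated_ds = 5613

missed = total_ds - updated_ds
pct_missed = 100 * missed / total_ds

print(f"{missed} data sources missed ({pct_missed:.1f}%)")
# prints: 237 data sources missed (4.1%)
```

So roughly 4% of the data sources are being skipped on each run.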

I plan to redo this installation so there are fewer data sources, but that's probably still a few weeks away. I was wondering if anyone had any thoughts in the meantime.

If I run cactid against one of the problematic hosts specifically, it works fine and updates the poller_output table with no errors. I verified that the data actually shows up in MySQL as well.

Maybe it's something to do with the poller getting data from the poller_output table? I tried looking at the code but couldn't find much that I could adjust to increase debugging further (I'm not familiar with PHP).

thanks

update

Post by nate450 »

I found an error in the debug log now. I'll try to debug it more tomorrow morning. The error is:

04/09/2008 04:32:19 PM - PCOMMAND: Poller[0] ERROR: Poller Command processing timed out after processing 'Array'

Post by nate450 »

Alright, after digging into the code a bit more, it looks like the poller is just timing out after 296 seconds, so there's not much I can do with the way it's currently set up. It seems that while data collection can run in parallel with multiple threads, the back-end processing is serial?

Anyway, no big deal. At least I understand why now, and I can fix it later.
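
To illustrate why a serial back-end with a hard time limit would leave some data sources untouched no matter how many collector threads are running, here is a toy Python model. This is not cacti's actual code; the function name `drain_serially`, the fixed per-item cost, and the numbers in the usage note are all made up for illustration.

```python
def drain_serially(pending, cost_per_item, budget_seconds):
    """Simulate one worker handling queued items one at a time under a deadline.

    pending         -- number of queued results waiting to be written
    cost_per_item   -- simulated seconds spent on each item (hypothetical)
    budget_seconds  -- hard time limit for the whole run
    Returns (processed, left_over).
    """
    elapsed = 0.0
    processed = 0
    for _ in range(pending):
        # Stop as soon as the next item would push us past the deadline
        if elapsed + cost_per_item > budget_seconds:
            break
        elapsed += cost_per_item
        processed += 1
    return processed, pending - processed
```

For example, `drain_serially(10, 1.0, 4.5)` returns `(4, 6)`: four items fit inside the budget and the other six are never reached. Adding more collector threads fills the queue faster but does not change how many items the single serial stage can get through before its deadline, which would match the symptom of raising the process and thread counts without any lasting improvement.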