Strange Poller behaviour (cacti/spine/boost) with >100k D
-
- Posts: 13
- Joined: Wed Nov 19, 2008 4:08 pm
Hi,
I'm confused with the cacti/spine/boost poller behaviour.
Poller behaviour looks like this:
When the poller starts via cron, it "immediately" hits the network heavily for about 20 to 30 seconds, as one would expect. During this time I also see 100-300% mysql CPU activity in top.
Then spine exits (no spine processes in ps anymore), the network becomes quiet, and I see a php and a mysql process together consuming 100% CPU (i.e. running single-threaded) for about 50 to 60 seconds. Does anyone know what happens during this time and whether it's possible to speed it up?
Overall poller time as reported in cacti.log is about 80-90 seconds, both things add up.
Unfortunately I'd like to run a 1 minute poller interval but obviously this doesn't work.
The hard facts:
I'm running a Cacti 0.8.7d / Boost 2.4 / Spine 0.8.7c installation on an 8-core Intel Xeon 3.2GHz machine with 16GB RAM on 64-bit CentOS 5.
There are 110000 data sources on 350 hosts. Spine is configured with 5 processes and 30 threads. Boost is configured with a 4GB memory-resident DB.
The Data Sources are either SNMP interface traffic (bits/s) or SNMP errors from "out of the box"
Best regards
Knallfrosch
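For anyone following along, here is a quick back-of-the-envelope check of the numbers in the post above, written as a small Python sketch. The midpoints (25 s and 55 s) are assumptions picked from the ranges quoted; nothing here is measured, and it only treats each data source as one unit of work.

```python
# Rough throughput estimate from the figures quoted in the post.
DATA_SOURCES = 110_000
SPINE_SECONDS = 25   # assumed midpoint of "20 to 30 seconds"
PHP_SECONDS = 55     # assumed midpoint of "50 to 60 seconds"
POLL_INTERVAL = 60   # the desired 1 minute interval


def rate(items: int, seconds: float) -> float:
    """Items handled per second."""
    return items / seconds


spine_rate = rate(DATA_SOURCES, SPINE_SECONDS)
php_rate = rate(DATA_SOURCES, PHP_SECONDS)
total = SPINE_SECONDS + PHP_SECONDS

# Time left for the php/mysql phase if spine stays as fast as it is now.
php_budget = POLL_INTERVAL - SPINE_SECONDS
required_php_rate = rate(DATA_SOURCES, php_budget)

print(f"spine phase:  {spine_rate:.0f} data sources/s")
print(f"php phase:    {php_rate:.0f} data sources/s")
print(f"total time:   {total} s (interval wanted: {POLL_INTERVAL} s)")
print(f"php phase would need about {required_php_rate:.0f} data sources/s")
```

Under these assumptions the single-threaded phase is the slower of the two, so that is where a speed-up would matter most for a 1 minute interval.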
- TheWitness
- Developer
- Posts: 17062
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Paste the following:
Code: Select all
mysql cacti
show create table poller_output_boost;
show create table poller_output;
Also a screen shot of your Boost Poller Status Screen.
TheWitness
True understanding begins only when we realize how little we truly understand...
Life is an adventure, let yours begin with Cacti!
Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages
For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
-
- Posts: 13
- Joined: Wed Nov 19, 2008 4:08 pm
Answers...
Hello, and thanks for picking this up...
The output in question is (I figure the surrounding blurb isn't really needed):
| poller_output_boost | CREATE TABLE `poller_output_boost` (
`local_data_id` mediumint(8) unsigned NOT NULL default '0',
`rrd_name` varchar(19) NOT NULL default '',
`time` datetime NOT NULL default '0000-00-00 00:00:00',
`output` varchar(512) NOT NULL,
PRIMARY KEY (`local_data_id`,`rrd_name`,`time`),
KEY `time_local_data_id` (`time`,`local_data_id`)
) ENGINE=MEMORY DEFAULT CHARSET=latin1 |
| poller_output | CREATE TABLE `poller_output` (
`local_data_id` mediumint(8) unsigned NOT NULL default '0',
`rrd_name` varchar(19) NOT NULL default '',
`time` datetime NOT NULL default '0000-00-00 00:00:00',
`output` text NOT NULL,
PRIMARY KEY (`local_data_id`,`rrd_name`,`time`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
So the poller_output table isn't using the MEMORY engine, if I understand correctly. Would it help to move it into memory?
The boost status is somewhat... hum... I turned a lot of knobs without really knowing what I was doing until things ran more or less smoothly...
Best regards & Thanks for your help
Knallfrosch
- Attachments
-
- boost status
- booststatus.jpg (48.55 KiB) Viewed 7227 times
- TheWitness
- Developer
- Posts: 17062
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Code: Select all
ALTER TABLE poller_output, ENGINE=MEMORY;
Code: Select all
ALTER TABLE poller_output_boost DROP PRIMARY KEY, ADD PRIMARY KEY USING BTREE (`local_data_id`,`rrd_name`,`time`), ADD INDEX `local_data_id` USING BTREE (`local_data_id`);
-
- Posts: 13
- Joined: Wed Nov 19, 2008 4:08 pm
Hello Witness,
first of all thanks for trying to help.
Code: Select all
mysql> ALTER TABLE poller_output, ENGINE=MEMORY;
ERROR 1163 (42000): The used table type doesn't support BLOB/TEXT columns
The other one did work ok, but with little immediate effect. Generally the behaviour didn't change; I'll keep it running through the night to see if there's a variation in the times the poller takes.
Best Regards
Knallfrosch
-
- Posts: 13
- Joined: Wed Nov 19, 2008 4:08 pm
tried it differently....
Hi Witness,
Hm. I changed the poller_output table so that "output" isn't a text field anymore but a varchar field (I hope 128 is long enough...) and moved the table into memory.
Code: Select all
poller_output | CREATE TABLE `poller_output` (
`local_data_id` mediumint(8) unsigned NOT NULL default '0',
`rrd_name` varchar(19) NOT NULL default '',
`time` datetime NOT NULL default '0000-00-00 00:00:00',
`output` varchar(128) NOT NULL,
PRIMARY KEY (`local_data_id`,`rrd_name`,`time`)
) ENGINE=MEMORY DEFAULT CHARSET=latin1 |
This improves the situation significantly, but spine still takes e.g. 25 seconds while the accumulated polling time is at 70 seconds.
So I see an improvement, but it still takes long...
My assumption about what happens in the polling process with spine but without boost is:
1 - spine retrieves the data and puts it into the poller_output table
2 - poller.php iterates through the poller_output table and updates the rrd files.
I assume that with boost the 2nd step is intercepted and that, instead of updating the rrd files, the data is taken and written into the poller_output_boost table, which has the same structure but different indices.
So essentially poller.php copies the entries from poller_output to poller_output_boost.
Is there some essential reason why spine doesn't write the data into poller_output_boost directly? Or is this just missing code?
Thanks
Knallfrosch
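If the assumption in the post above is right, the copy step can be pictured as batched multi-row inserts. The following is a minimal Python sketch of that idea, not Boost's actual code: the function name `build_insert_batches` and its `max_len` parameter are hypothetical, and the statement shape is modelled on the INSERT ... ON DUPLICATE KEY UPDATE statement that shows up in the error log later in this thread. It only builds SQL strings; it does not talk to a database and does no escaping.

```python
from typing import Iterable, Iterator, List, Tuple

# (local_data_id, rrd_name, time, output)
Row = Tuple[int, str, str, str]

PREFIX = ("INSERT INTO poller_output_boost "
          "(local_data_id, rrd_name, time, output) VALUES ")
SUFFIX = " ON DUPLICATE KEY UPDATE output=VALUES(output)"


def build_insert_batches(rows: Iterable[Row], max_len: int) -> Iterator[str]:
    """Yield multi-row INSERT statements of at most max_len characters.

    A single row that is too long on its own is still emitted alone.
    """
    values: List[str] = []
    length = len(PREFIX) + len(SUFFIX)
    for local_data_id, rrd_name, time, output in rows:
        value = f"('{local_data_id}','{rrd_name}','{time}','{output}')"
        extra = len(value) + (2 if values else 0)  # 2 for the ", " separator
        if values and length + extra > max_len:
            # Current statement is full: flush it and start a new one.
            yield PREFIX + ", ".join(values) + SUFFIX
            values = []
            length = len(PREFIX) + len(SUFFIX)
            extra = len(value)
        values.append(value)
        length += extra
    if values:
        yield PREFIX + ", ".join(values) + SUFFIX


# Hypothetical sample rows, shaped like poller_output entries.
sample = [(i, "traffic_in", "2009-04-07 18:45:02", str(i * 1000))
          for i in range(1, 51)]
batches = list(build_insert_batches(sample, max_len=500))
print(len(batches), "statements for", len(sample), "rows")
```

A larger `max_len` means fewer, bigger statements, so the number of round trips to mysql drops while each statement gets heavier; that trade-off is what a maximum insert size setting would control in this picture.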
- TheWitness
- Developer
- Posts: 17062
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Increasing a few variables on the Boost tab will speed things up a bit. For example, the maximum mysql insert size is 1MByte by default; making it 500k would do good. You will have to test it out.
TheWitness
-
- Posts: 13
- Joined: Wed Nov 19, 2008 4:08 pm
Hi,
sorry for coming back to this after such a long time:
I played around with the variables but I don't see much change.
mysql insert size: tried 500k, 1000k and 2000k; the run time varied by 1-2 seconds.
Am I right that the
maximum records
maximum data sources per pass
maximum argument length
are just for the rrd updates, not relevant for the polling process itself?
Best regards
Knallfrosch
- TheWitness
- Developer
- Posts: 17062
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
The maximum data sources per pass defines the delete size when processing the boost table. If you increase this value, you will see deletes taking place less frequently when viewing the boost status screen under System Utilities. Also, when using a table structure other than MEMORY, setting this too large causes a slowdown in polling.
What is your current situation?
TheWitness
Hi, why is the 'maximum data sources per pass' value on the settings tab limited to a fixed maximum (8000)? Can I increase it manually to 80000? What would the effect be?
TheWitness wrote: The maximum data sources per pass defines the delete size when processing the boost table. If you increase this value, you will see deletes taking place less frequently when viewing the boost status screen under System Utilities. Also, when using a table structure other than MEMORY, setting this too large causes a slowdown in polling.
What is your current situation?
TheWitness
- TheWitness
- Developer
- Posts: 17062
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
That would not be wise. The default of 2000 is pretty good. There are a few cacti installs bordering on 400k data sources that have reported that this is the best setting.
TheWitness
On my DB I have a maximum heap table size of 512 MByte. Sometimes I get an insert error on poller_output_boost. What are the best settings for:
- How Often Should Boost Update All RRD's
- Maximum Records
- Maximum Data Sources Per Pass
- Maximum MySQL Insert String
- Maximum Argument Length
- Memory Limit for Boost and Poller
- Maximum RRD Update Script Run Time
Is there any practical formula to set these properties to gain optimum/maximum performance?
Thanks.
- Attachments
-
- Here is my boost status
- Telkomnet Care - MRTG_boost.png (21.71 KiB) Viewed 6972 times
- TheWitness
- Developer
- Posts: 17062
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
I have a spreadsheet that provides some guidance. Your settings are pretty good, but you might want to consider upping your maximum records to 1M as your boost is running quite frequently.
Beyond that it's all trial and error. I would like to see those error messages, they are very important.
TheWitness
The error is:
Code: Select all
04/07/2009 06:45:34 PM - CMDPHP: Poller[0] ERROR: A DB Exec Failed!, Error:'1114', SQL:"INSERT INTO poller_output_boost (local_data_id, rrd_name, time, output) VALUES ('2398','traffic_in','2009-04-07 18:45:02','1527167155'), ('2400','traffic_out','2009-04-07 18:45:02','1109724258'), ('2400','traffic_in','2009-04-07 18:45:02','2153912020'), ('2402','traffic_out','2009-04-07 18:45:02','1109767842'), ('2402','traffic_in','2009-04-07 18:45:02','2153955604'), ('2404','traffic_out','2009-04-07 18:45:02','11746'), ('2404','traffic_in','2009-04-07 18:45:02','0') ON DUPLICATE KEY UPDATE output=VALUES(output)
But it no longer happens because I have increased the maximum heap table size to 512 MB. But then I changed the table back from the MEMORY engine to MyISAM, because I'm not sure it will still fit if I add more devices and graphs.
Will the next Boost version use multithreading? I think it would be nice if Boost were multithreaded; combined with spine's multithreading, that would increase performance. Thanks.
Btw, would you share the spreadsheet with us? Thanks.
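Error 1114 is MySQL's "table is full" error, which for a MEMORY table generally means it ran into max_heap_table_size. A rough sizing sketch in Python follows. It assumes MEMORY tables store rows at fixed width (so a varchar(512) costs its full length for every row), uses the poller_output_boost definition posted earlier in this thread, and ignores index and per-row overhead, so treat the result as a lower bound rather than a recommendation. The rows per poll, polling interval and boost update interval are placeholder values, not this poster's actual figures.

```python
# Approximate column widths for poller_output_boost (latin1, fixed-width rows).
COLUMN_BYTES = {
    "local_data_id": 3,   # MEDIUMINT
    "rrd_name": 19 + 1,   # VARCHAR(19): data plus 1 length byte
    "time": 8,            # DATETIME in the older MySQL storage format
    "output": 512 + 2,    # VARCHAR(512): data plus 2 length bytes
}
ROW_BYTES = sum(COLUMN_BYTES.values())


def rows_buffered(rows_per_poll: int, poll_interval_s: int,
                  boost_update_s: int) -> int:
    """Rows that pile up between two boost RRD update runs."""
    polls = boost_update_s // poll_interval_s
    return rows_per_poll * polls


def min_heap_bytes(rows_per_poll: int, poll_interval_s: int,
                   boost_update_s: int) -> int:
    """Lower bound on the table's data size, without index overhead."""
    return rows_buffered(rows_per_poll, poll_interval_s,
                         boost_update_s) * ROW_BYTES


# Placeholder numbers: 50,000 rows per poll, 5 minute polling,
# boost updating the RRD files once per hour.
needed = min_heap_bytes(50_000, 300, 3600)
print(f"{ROW_BYTES} bytes/row, at least {needed / 2**20:.0f} MiB of row data")
```

The point of the sketch is that the required size grows linearly with both the number of rows per poll and the time between boost updates, so adding devices or stretching the update interval both push a MEMORY table toward the heap limit.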
- TheWitness
- Developer
- Posts: 17062
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
You should not use MyISAM for such a large install; it will increase poll times significantly.
I am not where I can send you the spreadsheet right now.
TheWitness