I've completed a migration of Cacti from version 0.8.7a (with cactid 0.8.6i) to a brand new machine with Cacti 0.8.7e and Spine 0.8.7e - both servers are SPARC-based and run Solaris 10.
Spine has been compiled against the net-snmp sources provided by Sun (SUNWsmaS package), which is a Sun-packaged version of net-snmp 5.0.9. The database has been dumped and loaded onto the new machine, and Cacti/Spine can successfully connect to it.
A note regarding PHP: it is the version taken from the Sun GlassFish Webstack, which doesn't have SNMP support. However, I don't see this as a showstopper, since I've given Cacti the paths to the snmp binaries and it should be able to snmpwalk/snmpget the hosts just fine.
The existing RRD files (about 7000 in total, roughly 3 GB of data) were first dumped to XML with the rrdtool 1.2.x present on the old machine, and then restored to RRD format on the new one with rrdtool 1.3.
So far so good: the graphs display correctly up to the time of the switch-over.
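For anyone wanting to reproduce the RRD round trip described above, here is a minimal sketch of it. It only wraps the standard `rrdtool dump` and `rrdtool restore` commands; the function names and the choice to keep each XML file next to its RRD are assumptions made for illustration, not the exact procedure that was used.

```python
import subprocess
from pathlib import Path


def xml_name(rrd_file: Path) -> Path:
    """Map foo.rrd to foo.xml in the same directory (hypothetical naming)."""
    return rrd_file.with_suffix(".xml")


def dump_to_xml(rrd_file: Path, rrdtool: str = "rrdtool") -> Path:
    """Old machine: write an XML dump of one RRD file."""
    xml_file = xml_name(rrd_file)
    with open(xml_file, "w") as out:
        # `rrdtool dump <file>` prints the XML on stdout, so redirect it.
        subprocess.run([rrdtool, "dump", str(rrd_file)], stdout=out, check=True)
    return xml_file


def restore_from_xml(xml_file: Path, rrdtool: str = "rrdtool") -> Path:
    """New machine: rebuild the RRD file from its XML dump."""
    rrd_file = xml_file.with_suffix(".rrd")
    subprocess.run([rrdtool, "restore", str(xml_file), str(rrd_file)], check=True)
    return rrd_file
```

Each function handles one file, so it can be called in a loop over the rra directory on each side of the migration.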
However, the big problem is that Spine doesn't poll any of the hosts! I've done everything I could: I compiled Spine with the Solaris 10 privileges option and gave cactiuser access to the ICMP sockets in the Solaris kernel, but I always get loads of "ICMP: Ping timed out" messages, even though it later says "Host responded to SNMP".
What's wrong with it? I'm almost tearing my hair out over this issue, simply because I cannot see any logic behind this behavior. If I run snmpwalk or ping manually against such a host I get a reply, so a network problem can be ruled out.
And then, even if the host is up, it just says "There are XX polling items for this host" and immediately "HOST COMPLETE: About to exit host polling thread function". Why? It doesn't even attempt a single poll.
What could it be? A bug? I can't think of anything else (no offense meant to the developers!).
I've included some Spine debug output below in case anything meaningful can be found in it:
07/17/2009 01:53:49 PM - SPINE: Poller[0] DEBUG: Basic privset is: 'basic'.
07/17/2009 01:53:49 PM - SPINE: Poller[0] DEBUG: Privilege PRIV_NET_ICMPACCESS is: 'Enabled'.
07/17/2009 01:53:49 PM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 23
07/17/2009 01:53:49 PM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 22
07/17/2009 01:53:49 PM - SPINE: Poller[0] DEBUG: Valid Thread to be Created
07/17/2009 01:53:49 PM - SPINE: Poller[0] DEBUG: In Poller, About to Start Polling of Host
07/17/2009 01:53:49 PM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 23
07/17/2009 01:53:49 PM - SPINE: Poller[0] DEBUG: Basic privset is: 'basic'.
07/17/2009 01:53:49 PM - SPINE: Poller[0] DEBUG: Privilege PRIV_NET_ICMPACCESS is: 'Enabled'.
07/17/2009 01:53:49 PM - SPINE: Poller[0] DEBUG: Valid Thread to be Created
07/17/2009 01:53:49 PM - SPINE: Poller[0] DEBUG: In Poller, About to Start Polling of Host
07/17/2009 01:53:49 PM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 24
07/17/2009 01:53:49 PM - SPINE: Poller[0] DEBUG: Basic privset is: 'basic'.
07/17/2009 01:53:49 PM - SPINE: Poller[0] DEBUG: Privilege PRIV_NET_ICMPACCESS is: 'Enabled'.
07/17/2009 01:53:49 PM - SPINE: Poller[0] Host[514] PING Result: ICMP: Ping timed out
07/17/2009 01:53:49 PM - SPINE: Poller[0] Host[514] SNMP Result: Host did not respond to SNMP
07/17/2009 01:53:49 PM - SPINE: Poller[0] Host[501] NOTE: There are '75' Polling Items for this Host
07/17/2009 01:53:49 PM - SPINE: Poller[0] Host[501] DEBUG: HOST COMPLETE: About to Exit Host Polling Thread Function
07/17/2009 01:53:49 PM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 23
07/17/2009 01:53:50 PM - SPINE: Poller[0] DEBUG: Valid Thread to be Created
07/17/2009 01:53:50 PM - SPINE: Poller[0] DEBUG: In Poller, About to Start Polling of Host
07/17/2009 01:53:50 PM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 24
07/17/2009 01:53:50 PM - SPINE: Poller[0] DEBUG: Basic privset is: 'basic'.
07/17/2009 01:53:50 PM - SPINE: Poller[0] DEBUG: Privilege PRIV_NET_ICMPACCESS is: 'Enabled'.
07/17/2009 01:53:50 PM - SPINE: Poller[0] Host[505] NOTE: There are '7' Polling Items for this Host
07/17/2009 01:53:50 PM - SPINE: Poller[0] Host[505] DEBUG: HOST COMPLETE: About to Exit Host Polling Thread Function
07/17/2009 01:53:50 PM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 23
07/17/2009 01:53:50 PM - SPINE: Poller[0] DEBUG: Valid Thread to be Created
07/17/2009 01:53:50 PM - SPINE: Poller[0] DEBUG: In Poller, About to Start Polling of Host
07/17/2009 01:53:50 PM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 24
07/17/2009 01:53:50 PM - SPINE: Poller[0] DEBUG: Basic privset is: 'basic'.
07/17/2009 01:53:50 PM - SPINE: Poller[0] DEBUG: Privilege PRIV_NET_ICMPACCESS is: 'Enabled'.
07/17/2009 01:53:50 PM - SPINE: Poller[0] Host[503] NOTE: There are '62' Polling Items for this Host
07/17/2009 01:53:50 PM - SPINE: Poller[0] Host[503] DEBUG: HOST COMPLETE: About to Exit Host Polling Thread Function
07/17/2009 01:53:50 PM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 23
07/17/2009 01:53:50 PM - SPINE: Poller[0] Host[504] NOTE: There are '62' Polling Items for this Host
07/17/2009 01:53:50 PM - SPINE: Poller[0] Host[504] DEBUG: HOST COMPLETE: About to Exit Host Polling Thread Function
07/17/2009 01:53:50 PM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 22
07/17/2009 01:53:50 PM - SPINE: Poller[0] Host[534] PING Result: ICMP: Ping timed out
07/17/2009 01:53:50 PM - SPINE: Poller[0] Host[534] SNMP Result: Host responded to SNMP
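Since the per-host messages are interleaved across threads, it can help to group them by host ID. Below is a minimal sketch that does this for log lines in the format shown above. The function names and the returned dictionary layout are assumptions of this sketch, not part of Cacti or Spine.

```python
import re

# Matches the "Host[<id>] <message>" part of a Spine log line.
HOST_LINE = re.compile(r"Host\[(\d+)\] (.*)")
ITEMS = re.compile(r"'(\d+)' Polling Items")


def summarize(log_lines):
    """Collect the ping result, SNMP result and polling item count per host ID."""
    hosts = {}
    for line in log_lines:
        match = HOST_LINE.search(line)
        if not match:
            continue  # thread/privilege debug lines carry no host ID
        host, msg = int(match.group(1)), match.group(2)
        if msg.startswith("PING Result:"):
            hosts.setdefault(host, {})["ping"] = msg.split(":", 1)[1].strip()
        elif msg.startswith("SNMP Result:"):
            hosts.setdefault(host, {})["snmp"] = msg.split(":", 1)[1].strip()
        else:
            items = ITEMS.search(msg)
            if items:
                hosts.setdefault(host, {})["items"] = int(items.group(1))
    return hosts


def ping_timeout_but_snmp_ok(hosts):
    """Host IDs whose ICMP ping timed out although SNMP answered."""
    return sorted(
        host
        for host, info in hosts.items()
        if info.get("ping", "").endswith("Ping timed out")
        and info.get("snmp") == "Host responded to SNMP"
    )
```

Feeding it the log file, for example `summarize(open("cacti.log"))` (the file name is only a placeholder), gives one entry per host, and `ping_timeout_but_snmp_ok` then lists the hosts showing the contradictory ping/SNMP results.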
I'm able to provide all of the information needed if any developer would like to dig further into this weird behavior. Apparently the latest version of Spine hasn't been tested very thoroughly on Solaris 10 SPARC; lurking around the forum, I see that most users rely on Linux as their main OS platform. Unfortunately that is not an option for us, as we have received a brand new SunFire T1000 server to host Cacti standalone, and it would be a shameful waste of hardware if we had to fall back on a lousy x86 box already bloated with other services.
PS: we have quite a large Cacti config, around 500 hosts and 10k datasources. Getting the framework working without scrapping the current host/datasource configuration is the only way to go for us. All of the checks I could think of after the migration have already been made (poller/program paths, running ldd against all of the binaries to make sure no libraries were missing, checking PHP extensions, etc.).
PS2: Spine is currently compiled as 64-bit with SunStudio 12u1 CC against the net-snmp/mysql sparcv9 libraries. I've also produced a 32-bit binary, but that doesn't make any difference.
Thanks in advance for any help provided.
Regards
Fabio