Hi,
I've been a Linux hobbyist for several years, and I'm now in my first job that's 100% IT work.
I've got a few dozen remote sites running CentOS 4 servers. They're behind a firewall/NAT, with one port forwarded for SSH to one server and another port forwarded for HTTP to a second server (running Apache). I want to be able to monitor them easily, e.g. is database replication working, is $service running, etc.
None of the stuff I need to monitor is critical. If something critical breaks, the users on site will give us a call. I just want to be proactive and catch possible problems before they come up.
I was working on it this morning, trying to write a Python script that would SSH out from our office, run the diagnostic commands remotely, and record the results. It seemed straightforward, but the remote network and server/service configuration is making it trickier than I expected.
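
A minimal sketch of that pull approach might look like the following. It assumes key-based SSH logins already work from the office and a current Python 3 on the office machine; the host name, port, user, and check command are placeholders, not the real setup.

```python
import subprocess


def build_ssh_command(host, port, user, remote_command):
    """Build the argv list for running one command on a remote site over SSH."""
    return [
        "ssh",
        "-p", str(port),
        "-o", "BatchMode=yes",       # fail instead of prompting for a password
        "-o", "ConnectTimeout=10",   # give up quickly if the site is unreachable
        f"{user}@{host}",
        remote_command,
    ]


def poll_site(host, port, user, remote_command, timeout=60):
    """Run the remote command and return a small result record."""
    cmd = build_ssh_command(host, port, user, remote_command)
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # ssh missing locally, or the connection hung past the timeout
        return {"host": host, "ok": False, "output": str(exc)}
    return {"host": host, "ok": proc.returncode == 0, "output": proc.stdout.strip()}
```

Calling something like `poll_site("site1.example.com", 2222, "monitor", "/sbin/service httpd status")` in a loop over the sites would collect one record per site. Since only one server per site is reachable over SSH, checks on the other servers would have to hop through it, which is part of what makes this approach awkward.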
I'm now considering writing a script that runs on the remote side (probably as a cron job a couple of times a day) and pushes the results to a server here at the office (where I can get a port opened).
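
The push idea could be sketched roughly as below. This is only an illustration under several assumptions: the check list, the site name, and the office URL are made up, and it is written for a current Python 3, whereas the stock Python on CentOS 4 is much older and would need an adapted version.

```python
import json
import subprocess
import urllib.request


def run_check(name, command, timeout=30):
    """Run one diagnostic command locally and record whether it succeeded."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Command not found, not executable, or it hung past the timeout
        return {"name": name, "ok": False, "output": str(exc)}
    return {"name": name, "ok": proc.returncode == 0, "output": proc.stdout.strip()}


def build_report(site, checks):
    """Run every (name, command) pair and bundle the results for one site."""
    results = [run_check(name, command) for name, command in checks]
    return {"site": site, "ok": all(r["ok"] for r in results), "results": results}


def push_report(report, url):
    """POST the report as JSON to the office server (hypothetical endpoint)."""
    data = json.dumps(report).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

A cron entry on each site would then run a small wrapper that calls `build_report("site-01", [("httpd", ["/sbin/service", "httpd", "status"])])` and passes the result to `push_report` with the office URL. A site that stops reporting is itself a signal, so the office side would also want to flag reports that are overdue.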
Before I start on this, I figure I should ask what existing software is out there. Is there a good open source solution for this (that will run on CentOS 4)? Or is this simple enough that most people just roll their own?