Hey all! I am looking for some software to monitor the health status of our servers. Our environment is a pretty typical mix of HP/Dell running Microsoft Server and VMware vSphere. We are currently using WhatsUp Gold and would like to migrate away from that. I have been checking out Dell Management Console and IT Assistant, but it seems troublesome to get all our products monitored with them.
Any other suggestions that would work for a medium-sized environment?
EDIT: I should also note that I would like something that is fairly easy to roll out and doesn't require a lot of rule/custom rule creation or editing.
My personal favorite, and renowned basically everywhere as one of the best monitoring tools: Nagios.
I've yet to find a single device, server, or piece of software I couldn't monitor with it. (Some required writing custom modules, but in the end they were still monitored.)
Edit: There are several "packages" that bundle Nagios with web control panels and such; one of the most popular is Op5. However, that one obviously has a significant cost attached to it. Most of these packages should meet all your needs, including being easy to roll out without much custom rule creation.
InterMapper (commercial) is what I use -- It's very flexible and the company behind it is pretty responsive to enhancement requests and each release of the software has brought a bunch of new (useful) features. There is an annual license agreement, but I don't find the pricing prohibitive, even (especially) for small clients.
InterMapper also has a database backend that lets you do trending/reporting (similar to what you can do with Cacti, etc.), though this isn't very well polished yet.
I'll be the contrarian and say I don't much care for Nagios: partly because I'm not a fan of the "remote plugin execution" model, but mostly due to bad experiences with setups that had an abysmal signal-to-noise ratio and what I find to be a less-than-refined configuration process.
Edit to respond to Questioner's Edit: With InterMapper you'll almost certainly have to run SNMP daemons on the stuff you want to monitor, and you'll probably have to customize some of the thresholds per-server unless your environment is really tight, but it's all done from the GUI & pretty easy.
Stuff like hardware monitoring (drive failures, etc.) usually requires a custom probe, but there are a bunch of them already written (and if the probe you need doesn't exist, implementing it yourself is pretty easy).
GruffTech beat me to it - I was going to recommend Nagios too!
It comes with lots of probes out of the box, and writing new ones is really easy. It's also easy to bolt on different front ends, and there are other types of plug-ins available off the shelf (e.g. Cacti for trend graphing).
(Previously used BMC Patrol, Oracle Grid Control, NetSaint and others - I much prefer Nagios).
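To give an idea of how little a custom probe needs, here is a rough sketch of a Nagios-style check in Python. It follows the usual plugin convention: print one status line (optionally followed by `|` and performance data) and return exit code 0, 1, 2 or 3 for OK, WARNING, CRITICAL or UNKNOWN. The disk-usage check, the function names, and the 80/90 percent thresholds are all made up for illustration; they are not from any shipped plugin.

```python
#!/usr/bin/env python3
"""Sketch of a Nagios-style check: percentage of disk space used on a path."""
import shutil
import sys

# Conventional plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(used_pct, warn, crit):
    """Map a usage percentage onto a plugin state (thresholds are illustrative)."""
    if used_pct >= crit:
        return CRITICAL
    if used_pct >= warn:
        return WARNING
    return OK


def check_disk(path="/", warn=80.0, crit=90.0):
    """Return (state, status_line) for the filesystem containing `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - cannot read {path}: {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero total size"
    used_pct = 100.0 * usage.used / usage.total
    state = classify(used_pct, warn, crit)
    # Text after the pipe is performance data: label=value[UOM];warn;crit
    line = (f"DISK {LABELS[state]} - {path} {used_pct:.1f}% used"
            f" | used_pct={used_pct:.1f}%;{warn};{crit}")
    return state, line


def main(argv):
    """Print the status line and return the exit code for the scheduler."""
    path = argv[1] if len(argv) > 1 else "/"
    state, line = check_disk(path)
    print(line)
    return state

# When saved as a standalone script, finish with: sys.exit(main(sys.argv))
```

Saved as an executable script with that last line uncommented, it would be wired up like any other check command; the monitoring side only looks at the exit code and the first line of output.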
Nagios, OpenNMS, Zabbix are all touted here...
Nagios is the most documented of the three, and while OpenNMS looks awesome, I don't like their website or documentation much at all.
There's another monitoring tool you should "monitor" (heh): Shinken. It's a re-implementation of Nagios (100% config compatible) in Python, re-architected to have more HA and distribution. It's not listed as production ready, but if you're adventurous, you can put it on QA.
Do not forget about Zenoss.
Hyperic HQ Open Source is my favorite
Opsview Community (http://www.opsview.com/community) is a great way to get yourself started with Nagios. Nearly everything can be done through the web UI and you can use all the Nagios plugins. If your setup is to be maintained by people without Unix skills, it is the way to go.
If you have the budget for it, go for Solarwinds NPM. It is a very good product. I would love to test a few of those numerous open-source tools, but so far I haven't had the chance. The good thing about open-source tools is the flexibility you have and the community-driven options that you get. Mostly the result is much better than what a handful of engineers brainstorming a commercial product's features come up with.
I tend to like this combination:
Nagios, NConf, Cacti
They all play nice together and are very easy to set up.
If you want a nice, clean, easy and inexpensive option, I would look at GFI Max Server Monitoring (formerly Hound Dog). I use it for literally hundreds of servers, and it is very simple to manage from any web browser or location. It is fast and secure (runs over ports 80 and 443), with pricing based on what you want to monitor. Highly recommended.