A member used NOREX DirectConnect to query the community
regarding products to consider when replacing their Enterprise
Monitoring and Alerting System.
NOREX peers responded with valuable experience and unbiased reviews:
Member: We are researching replacements for our enterprise monitoring and alerting system. Which products are trending in this space? We are looking at the following products right now; which of these are worth pursuing (and why)? Which should we avoid (and why)? What other options are there in this space that we should consider?
Zabbix, SolarWinds, Nagios/Icinga, HP OneView, LogRhythm, Dynatrace/Rigor/New Relic, ServiceNow's Event Management module
I contributed a guide to the NOREX document library that touches on some of the more generic guidance involved with a SIEM (which is where you are going in terms of technology).
Check Document 10-2266 SIEM SPECIFICATIONS
With the advent of SIEM-as-a-service offerings, the first question is whether
your organization is open to contracting your SIEM out to an external resource
or wants to build on-prem.
After you determine that, then it is a matter of making a list of the goals you
want to achieve.
In my practice, I commonly create a "Must, Should, Could, Will
not" document. I like to call it a MoSCoW document.
In that document, I would outline in concise, mostly independent, bullet
statements the requirements I envision for a solution to be viable.
Always develop this document with a greenfield mindset. Build your vision of
an optimal solution, and then revisit it to prioritize (add weight to) each
requirement.
Development of the MoSCoW document can take a bit of time simply because it
will quickly become your requirements source as you begin to assess solution
capabilities. You don't want to miss anything, so take your time and
collaborate with peers.
Once you are done building the requirements, start shopping the solutions out.
If the solution capabilities cover all of the requirements you specified, then
it comes down to price, aesthetics, resources (think people), and timelines.
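As a rough illustration of how a MoSCoW document can drive the comparison, here is a minimal Python sketch. Everything in it is an assumption made for the example: the requirement names, the candidate solutions, and the weights (3 for "should", 1 for "could") are hypothetical placeholders, not recommendations. It treats every "must" as a pass/fail gate and uses the weighted "should" and "could" items only to rank the solutions that pass.

```python
from dataclasses import dataclass

# Hypothetical weights for the non-mandatory priorities; tune to your own document.
WEIGHTS = {"should": 3, "could": 1}


@dataclass(frozen=True)
class Requirement:
    name: str
    priority: str  # "must", "should", "could", or "wont"


def evaluate(requirements, capabilities):
    """Return (viable, score, missing_musts) for one candidate solution.

    capabilities is the set of requirement names the solution covers.
    A solution is viable only if it covers every "must" requirement.
    """
    missing_musts = [r.name for r in requirements
                     if r.priority == "must" and r.name not in capabilities]
    score = sum(WEIGHTS.get(r.priority, 0) for r in requirements
                if r.name in capabilities)
    return (not missing_musts, score, missing_musts)


# Hypothetical requirements and candidates, for illustration only.
requirements = [
    Requirement("log archiving", "must"),
    Requirement("on-prem deployment", "must"),
    Requirement("rogue device detection", "should"),
    Requirement("custom dashboards", "could"),
    Requirement("built-in ticketing", "wont"),
]
candidates = {
    "Solution A": {"log archiving", "on-prem deployment", "rogue device detection"},
    "Solution B": {"on-prem deployment", "rogue device detection", "custom dashboards"},
}

for name, capabilities in candidates.items():
    viable, score, missing = evaluate(requirements, capabilities)
    print(name, "viable:", viable, "score:", score, "missing musts:", missing)
```

In this made-up data, Solution B collects more "should" and "could" points but misses a "must", so it drops out before price, aesthetics, resources, and timelines are even considered.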
My experience has been with ArcSight, LogRhythm, SolarWinds, and Splunk, and we have recently started developing an ELK solution.
ArcSight - It's been a while since I worked with this one, but I recall it being a pretty good product. In my experience, it required a well-trained technician to implement and manage.
LogRhythm - I liked the interfaces this product provided. At the time I was using it, I didn't like that it did not have any archiving capabilities to sunset log data of systems that were removed from the environment after a period of time. That may have changed since I last used it.
If you go this route, definitely send your technicians to their training program BEFORE they start implementing. LogRhythm is easy to implement, but can be implemented poorly if the technicians do not understand how it is configured. One could say the same about any SIEM, but this one stands out in my memory due to some of the problems my technicians caused by not waiting until they attended training.
SolarWinds - I really like the solutions these guys build. Great interfaces, flexible, decent price point. I really liked how it helped me develop a "rogue device" detection capability on the network, practically right out of the box. The full suite of products is pretty darned nice overall. It does require some technical expertise to really shine, but less than ArcSight IMHO.
Splunk - This is one that I wish I had more time to work with under the enterprise license. I have used their free-license product in the past and liked what I saw, but that license has a data volume limit. It does take some technical acumen to implement properly, but it also has a decent support community behind it. The enterprise license pricing is kinda odd because it is based on data volume. I have not had the luxury of having such a system because my employers feared unexpected costs if we got a large influx of log data in an unplanned or unexpected situation.
ELK - The latest project I have been involved with. We
are building our ELK solution to augment our current SIEM environment, because
we want to do more detailed analytics on the same log data and architecture.
Think of it as a reasonably priced double-check solution for our event
The main seller for us is the price point and support community.
There is still some technical acumen involved to implement and maintain, but we have those resources at this time and plan to develop more skills with this solution.
We just started into this venture, so I don't have much more to share with you at this time. But from the many demos we reviewed, we feel (and hope) ELK has much promise for us.
I tried to summarize my recent experiences with various products below:
I, and another member of my team, have used Ipswitch’s WhatsUpGold (WUG) with great success. The licensing costs were reasonable, and the product was very reliable and extensible. We did spend a good amount of time initially setting up alert thresholds, server tiers, and automated responses, but once set up they were very reliable and easy to extend to additional systems. The historic trending of monitors worked well also. Depending on the size of the team, WUG also has a good offering of professional services to assist with the initial setup. WUG would be my recommendation without knowing any other specifics of the organization.
We are currently using SolarWinds for our monitoring and alerting, and it is working well. The licensing cost grows quickly, though, because each individual item you monitor on a system counts against your overall license. It doesn’t seem quite as easy to configure automated alert responses either. If we didn’t already have other components of the Orion suite, we would have gone with WUG.
Dynatrace is also a great tool, but one that has a more focused role. Dynatrace can automatically establish performance baselines for app functionality and alert on deviation or slowdown. It can also go down to the code level across Web, App, and Database tiers to help identify the root cause of any performance issues. It can add value in the development process as well by ensuring new code does not introduce any performance issues, although it will not perform load testing. It’s a great tool to add to targeted, internally developed systems, whereas SolarWinds and WUG are more focused on enterprise-wide monitoring.
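To make the "baseline and alert on deviation" idea concrete, here is a generic Python sketch. It is not how Dynatrace itself works internally; it only shows one simple way the concept can be expressed, flagging a new measurement when it sits more than a chosen number of standard deviations from the mean of recent samples. The function name, the sample response times, and the default threshold of 3.0 are all assumptions made for the example.

```python
from statistics import mean, stdev


def deviates_from_baseline(history, latest, threshold=3.0):
    """Return True when latest is more than `threshold` sample standard
    deviations away from the mean of `history`."""
    if len(history) < 2:
        raise ValueError("need at least two baseline samples")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any change at all counts as a deviation.
        return latest != mu
    return abs(latest - mu) / sigma > threshold


# Hypothetical response times in milliseconds, for illustration only.
baseline_ms = [120, 118, 125, 122, 119, 121, 123, 120]

print(deviates_from_baseline(baseline_ms, 124))  # close to the baseline
print(deviates_from_baseline(baseline_ms, 180))  # a clear slowdown
```

A real product would likely account for time of day, seasonality, and traffic volume; a fixed mean-and-deviation check like this one is only the simplest starting point.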
I have found Nagios and Zabbix to be very powerful tools, but they take a lot of effort to set up and then ongoing care and feeding afterward. Zabbix is the more user-friendly of the two.
MS System Center has a lot of functionality and we had been using it for a while, but found its alerting to be unreliable with a lot of missed events for unknown reasons, so I would avoid it.
ServiceNow is a high-quality product
and, if it is already being used for ITSM, would likely perform well. It can be
a little cumbersome to set up and use, however, if the team is not already
familiar with it.
There are also a lot of other products in this space that I have not personally had any experience with. The specifics of the environment they are looking to deploy into are very much going to influence what works best for them. If there are any other specific questions I can help with, please feel free to forward them to me.
would be available to talk about our use of Solarwinds.
We use LogRhythm. I would be
happy to speak or have email communication with them.