Level 11

Missing Orion VMAN features (which were in the VMAN Appliance)

I'd like to start gathering a list of features which are missing from Orion VMAN, the replacement for the VMAN Appliance.

1.  Inability to search for datastores

There are really basic features which are missing.   I cannot figure out how to search for a datastore and display its metrics. In the appliance you could just type the datastore name, or a partial name plus wildcard, get a list of datastores, and then drill down into the metrics of the one you want.

2. Inability to display progress of scans.

In the VMAN appliance you could see when scans were being run against ESX hosts / vCenters and it would show up ones with problems.   We can't see this information, it seems to be buried in text-based log files.  This isn't helpful when trying to diagnose whether scans are working correctly.

3. Alerting is slower

Notifications about events seem to coincide with some of the infrequent scans, the ones which occur every 8-12 hours, and thus notifications of events such as snapshots being taken are delayed to such an extent that they cannot be called monitoring, only historical reference.  Unless they arrive quickly they become meaningless.   If the system tells me a snapshot was taken 12 hours ago it has no real use.  How can alerting be sped up to match what Solarwinds offers elsewhere?

I'd like to hear what features other people feel are missing from Orion VMAN.   I welcome the integration versus having to update appliances and collectors.   The new update process is much better, but some very basic features are seriously missing from the Orion VMAN module: features we paid substantial sums for in the appliance, but which have not been brought into Orion VMAN.   Given these are absolutely fundamental features, such as searching for a datastore, and I can't figure out how to do this simplest of things in Orion VMAN, I would go back to the appliance if I could.  But given new features are only now appearing in Orion VMAN we seem to have little choice.

And please, before anybody says, I can't run the appliance as well as the Orion VMAN as it just would not work and I couldn't get Solarwinds support to fix it, so I just gave up trying.

0 Kudos
16 Replies
Product Manager

For this line item, what are you looking to solve?

2. Inability to display progress of scans.

In the VMAN appliance you could see when scans were being run against ESX hosts / vCenters and it would show up ones with problems.   We can't see this information, it seems to be buried in text-based log files.  This isn't helpful when trying to diagnose whether scans are working correctly.

Are you trying to troubleshoot if polling jobs are still occurring? For instance this feature request?

0 Kudos

I'm having all sorts of problems with data not collecting on some hosts - although you'd really have no idea it wasn't working, displays green, but the metrics aren't being collected.

I discovered this because I've gone around to various Riverbed appliances which have ESXi installed - they are WAN accelerators, and some of them have ESXi built in to support a couple of VMs.

This is supposed to save having separate dedicated servers.    Most of these work just fine, some don't.  We've only had these problems since we went to Orion VMAN - we could see them working correctly on the VMAN Appliance - because we had a status display.   Now with Orion VMAN the status is buried in log files and I'm coming across all sorts of failures.

I have a ticket open, Case # 00222577 VMAN is not collecting changed datastore names.

So, I went around to these Riverbed ESX hosts and I changed the datastore name.  By default it is something like, riverbed_000eb6760d48, and I need it to be easier to identify on reports.  I've added a ton of Virtual Datastore custom properties as I have to produce a report for an external entity - they want to know location (we have many), what it's connected to (SAN name), how it's connected (FC/iSCSI), etc.

Anyway, back to the name.  I've gone around and changed them to the name of the ESX server, so something like UKBIRRBESX01 (UK, Birmingham - we don't have an office there, it's for example, RB = Riverbed, ESX= obvious) and _datastore, so I've changed the name to something like UKBIRRBESX01_datastore.

I'm afraid that without something visible indicating that data is being collected and updated that we cannot have confidence in any of the data produced by VMAN.  We know some is being collected, probably most of it, but I know that I have green lights on some boxes and they definitely are not collecting data.  I know some of these datastore names are showing the changed ones, and some ESX hosts are not showing the data.  And I've compared free space on the ESX host with what Solarwinds has, and I know that it is not collecting the disk space metrics either.  I thought it was just not picking up the name change, but it's not polling metrics either.

In Cortex.log I see;

2018-12-11 09:45:17,733 [75] ERROR DataCollection.SolarWinds.CortexPlugin.VMan.Protocols.VMware.Service.VMwareServiceProvider - Unable to get the response from GetAlarm method API call.

System.ServiceModel.FaultException: The session is not authenticated.

I see this a lot. But which logs should I be looking at?

This is why it is essential we have a good, accurate, status indication that things are working.   You might not know but for two large datacenters the VC was not collecting information for 42 days before we discovered the issue, which was some new license appearing in License Manager and until we assigned it to our existing poller it wasn't collecting information.  BUT, we had no warning, and in the end I discovered the problem and fixed it.   There wasn't even an alert or notification that one of our licenses had expired.  In fact the licenses were extended, but one of the existing ones expired, and a new one suddenly appeared but was not assigned.
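Until a proper status view exists, the kind of Cortex.log error quoted above could be tallied with a small script. The following is only a sketch: it assumes other log lines follow the same layout as the single line quoted in this post (timestamp, thread number in brackets, level, logger name, a dash, then the message), and the second timestamped line in the sample is made up for illustration.

```python
import re
from collections import Counter

# Layout assumed from the one Cortex.log line quoted in this thread:
# "<date> <time>,<ms> [<thread>] <LEVEL> <logger> - <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<thread>\d+)\] (?P<level>[A-Z]+)\s+(?P<logger>\S+) - (?P<msg>.*)$"
)

def summarize_errors(lines):
    """Count ERROR entries per (short logger name, message)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("level") == "ERROR":
            # Keep only the last part of the dotted logger name for readability.
            short_logger = m.group("logger").rsplit(".", 1)[-1]
            counts[(short_logger, m.group("msg"))] += 1
    return counts

sample = [
    # Quoted from the post above.
    "2018-12-11 09:45:17,733 [75] ERROR DataCollection.SolarWinds.CortexPlugin."
    "VMan.Protocols.VMware.Service.VMwareServiceProvider - "
    "Unable to get the response from GetAlarm method API call.",
    # Exception detail lines do not match the pattern and are skipped.
    "System.ServiceModel.FaultException: The session is not authenticated.",
    # Hypothetical repeat of the same error, invented for this example.
    "2018-12-11 09:50:17,801 [75] ERROR DataCollection.SolarWinds.CortexPlugin."
    "VMan.Protocols.VMware.Service.VMwareServiceProvider - "
    "Unable to get the response from GetAlarm method API call.",
]

summary = summarize_errors(sample)
for (logger, msg), n in summary.most_common():
    print(f"{n:4d}  {logger}: {msg}")
```

To run it against a real file, the `sample` list could be replaced with the lines read from the log; the path to Cortex.log is not given in this thread, so it is left out here.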

0 Kudos

To your point about "green" status notifications, we've actually been working on a fix to provide greater visibility as to the status of your polling engine. I'd love to confirm one detail with you though, when you hover over the polling source that is having issues, what do you see in the popover?

For instance in the screenshot below, you can see that the vCenter is showing as "Up" so in that case, polling will be up and running.

pastedImage_2.png

0 Kudos

ps.  this is what it displays, shows up, shows connected.  You wouldn't have any idea it wasn't working.

ESX Status: Up
Operational State: Connected
Polling method: VMAN Orion
Product Name: VMware ESXi
Product Version: 5.0.0
Physical Memory: 32.0 Gbytes
Number of VMs: 2 (2 running)
0 Kudos

To fix this issue, did you reboot? How did you get past the polling issue?

0 Kudos

I disabled polling and re-enabled it, but it still wasn't coming back and collecting info.  So I've moved it to a different poller, the primary, where Orion VMAN is installed.

Problem solved.   It has now updated the metrics and picked up the changed datastore name.

So it does not poll correctly when the node is being monitored by a poller other than the primary poller.

Perhaps I have installed them incorrectly; perhaps I should have created another poller with just the additional poller VMAN collector installed, and monitored it via that.

These were all points I raised long ago when trying to figure out how we replace the Appliance and collectors, with VMAN on the primary poller.  In our case we kept it very simple, we ditched the collectors.

Perhaps I did not understand that you cannot monitor a node via an additional poller unless that poller has the VMAN collector installed.

And perhaps by saying this is what I think is wrong / has happened here others will be able to avoid this pitfall.

0 Kudos

What this also highlights is that your monitoring can look ok, but might not be.   You can't rely on a green dot to indicate a device is actually polling; I've shown that to be wrong.  I've proved that it can say up and connected and still not really be monitoring / collecting metrics and information.   This needs to be addressed - we need a status screen.  If there is an entry in a table, somewhere, that says the last time the scan/collection ran successfully, then a failure to collect metrics within a set period should trigger an error.  Better still, we should be able to display this table as a Custom Query, to show the ESX hosts/vCenters and the last time each was successfully polled.  I've done this, but the only timestamp available isn't very helpful.   What we really need is an Orion VMAN timestamp that confirms the polling completed successfully and is overwritten each time polling succeeds.

In just the same way I get an alarm/alert if the polling engine failed to update for more than 10 minutes.
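The check described above can be sketched in a few lines. This is not how Orion VMAN works internally; it assumes a "last successful poll" timestamp per ESX host / vCenter is available from somewhere, which is exactly the value this post says is missing. The host names other than UKBIRRBESX01 and all of the timestamps are hypothetical.

```python
from datetime import datetime, timedelta

def find_stale(last_success, now, max_age):
    """Return (name, age) for every polling source whose last successful
    poll is older than max_age. Sources that never polled get age None."""
    stale = []
    for name, ts in sorted(last_success.items()):
        if ts is None:
            stale.append((name, None))
        elif now - ts > max_age:
            stale.append((name, now - ts))
    return stale

# Hypothetical data: in practice this would come from whatever table
# records the last successful collection for each source.
now = datetime(2018, 12, 11, 10, 0)
last_success = {
    "UKBIRRBESX01": datetime(2018, 12, 11, 9, 55),   # polled 5 minutes ago
    "vcenter-dc1": datetime(2018, 10, 30, 10, 0),    # silent for 42 days
    "esx-never-polled": None,                        # green dot, no data
}

stale = find_stale(last_success, now, timedelta(minutes=10))
for name, age in stale:
    print(name, "never polled" if age is None else f"last success {age} ago")
```

The 10-minute threshold mirrors the polling engine alert mentioned in this post; a real threshold would have to be at least as long as the slowest scan interval.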

0 Kudos

Final point.  I did look on the Additional Poller, which wasn't correctly polling this node.  The poller is in Redhill, Surrey, UK.  It is polling a device in Columbia, South Carolina, US.  The additional poller shows it has NCM 7.8, NPM 12.3 HF 1, NTA 4.4 HF 1, SAM 6.7 and VMAN 8.3.

Primary poller (from the bottom of the web page) shows;

Orion Platform 2018.2 HF5, NCM 7.8, NPM 12.3, DPAIM 11.1.0, NTA 4.4.0, VMAN 8.3.0, SAM 6.7.0, NetPath 1.1.3
I'll be doing an update to the latest this week.  We enter change freeze for Christmas on 18th.
0 Kudos

Could you confirm whether NPM 12.3 HF1 is applied to your main polling engine? It's surprising that your additional polling engine would be updated with a hotfix and your main polling engine would not be.

0 Kudos

Re-ran the installer and it says the current version of NPM is 12.3 HF1, so all good.

Level 11

Ok, found it.  There is a new search feature, and the old one appears to only show a few standard items and has a long list of our Custom Properties.

So I can find it, but I must have missed this new way of searching.   I don't remember reading about it in the release notes.

0 Kudos

The release notes that mentioned search being added are here: Virtualization Manager 8.2.1 Release Notes - SolarWinds Worldwide, LLC. Help and Support 

pastedImage_3.png

0 Kudos

Ok, so I can search for them.  It's not great because what I want to do is select all datastores, then filter down to what I want.  This free-form searches-anything has the disadvantage of bringing a lot of things back which I don't want.

A better way was, as in the original Appliance, to be able to search for all or a partial name, just in the datastores.

For example, how do I bring back a complete list in the search results of ALL datastores?  I can't search with the search criteria field blank, and I can't just type *, because that only brings back interfaces and nodes but no datastores.  The datastores have all sorts of different names, so * or nothing would be the appropriate searches.

So there is a way to find datastores, but right now it isn't a good way - unless I'm missing something.
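To make the request concrete, here is a small sketch of the behaviour being asked for: a search restricted to datastores, where * or a partial name plus wildcard returns matches. It only illustrates the desired semantics with Python's standard `fnmatch` module and says nothing about how Orion search is implemented. The two datastore names are taken from earlier in this thread; the interface name is made up.

```python
from fnmatch import fnmatchcase

def search(entities, pattern="*", entity_type="datastore"):
    """Return sorted names of the given entity type matching a wildcard
    pattern. Matching is case-insensitive; '*' alone returns everything."""
    wanted = pattern.lower()
    return sorted(
        name
        for kind, name in entities
        if kind == entity_type and fnmatchcase(name.lower(), wanted)
    )

# (entity type, name) pairs standing in for the monitored inventory.
entities = [
    ("datastore", "UKBIRRBESX01_datastore"),
    ("datastore", "riverbed_000eb6760d48"),
    ("node", "UKBIRRBESX01"),
    ("interface", "vmnic0"),  # hypothetical interface name
]

all_datastores = search(entities)        # '*' limited to datastores
partial = search(entities, "ukbir*")     # partial name plus wildcard

print(all_datastores)
print(partial)
```

Restricting by entity type before matching is what keeps * from returning interfaces and nodes.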

0 Kudos

on my system I have these datastores

pastedImage_2.png

most of which have an "s" in the name

pastedImage_3.png

as you mentioned, you would see all the results

pastedImage_4.png

but to see the datastores only, utilize the filters on the left hand side

pastedImage_5.png

pastedImage_6.png

There are more filters that you can add that will narrow down the search results even further. Let me know what are your most common filtering pieces that you use or if you'd like to see additional data available in the search results.

0 Kudos

Unfortunately this won't work, there is nothing common to all datastores, we need to be able to return a complete list in search.

This feature was/is available in the appliance.  Also in the appliance we could drill down further and/or export the list.  The filtering looks good, but we should be able to export the list.   The list is also just a list of datastore names; it would be better to include more information like the appliance did.  We need to make sure our investment in Virtualization Manager is maintained by ensuring features that were available in the appliance are correctly transferred to Orion VMAN.  It can't be very difficult to get the development team to go through every feature and make sure it is properly replicated in what you are offering as a replacement.  When something is offered as a replacement you have to replace it like for like, or better, otherwise buyers will be unhappy.

I have a 1-hour call scheduled with your researchers to discuss improvements I want to see in Solarwinds, and I've already outlined quite a few, and I'll include this in that discussion.  The search function is good, I am impressed with the filtering options, but it can be better.  I want to see Solarwinds as the best monitoring product available and with the help of customers it can be.  We can bring our real-world knowledge of how the product is used in business to help your focus on eliminating areas where this product causes us some pain.  It is a good product, getting better, but the VMAN transition still needs work to finish migrating all the features.

As we've discussed before, the diagnostic capability where you can time-travel to see changes over time is missing.  This is an important feature and should be put back in.  There is the option to reduce maintenance costs in recognition that not all features in the appliance have been migrated, but I don't think you want to go down that route.

So would it be possible for you to engage with product managers/development teams to ensure we can search for a list of all datastores, and see if time-travel can be put back in?  Also, we definitely do want to see the status of the scans as before, or there should be an alert when any scanned VM host / vCenter does not scan correctly.  For me either will do.  But recently scanning silently died for 40 days without us being aware, and that is not acceptable.  I could have easily detected that in the appliance but cannot in Orion VMAN.

0 Kudos

I appreciate your transparency and understand your frustrations. However, I would like to give credit to the hard work our developers are doing, so I would like to discuss this point a little further.

"It can't be very difficult to get the development team to go through every feature and make sure it is properly replicated in what you are offering as a replacement." This is in fact a difficult art, and my development team and others on the Orion platform work incredibly hard to deliver excellent features fast in response to customer needs. There are many factors, including new technology stacks and architectures, as well as the evolving needs of our customer base. One example around search is this: on the appliance, due to the technology stack used, we were able to use a third-party search engine with a more advanced search language, which includes the search term "*". On Orion, we tried a similar tactic, which led to the tech preview of Orion Global Search (see "Global Search Technical Preview - Notes & Info"). The result of that technical preview was that the technology options available were not compatible with our architecture and could not provide performance up to our standards. So we scaled back the use cases to provide a minimal first set of value, utilizing the tech that we have available, but still allowing users to get some value and then come back to us with feedback on the most important use cases.

So, your feedback about "*" is noted and I've duly pushed internally on our feature request to see if we can re-evaluate options and see how we can improve this going forward.

It took us years to develop the appliance, and the migration of the features to the new tech stack is taking us many more. I know with your sharp eyes and deep engagement with the product you will be the first to see those gaps, but please know that we are listening, and every time you mention one, we discuss internally how we can move faster on this versus that feature. It will take us time, perhaps even years, to match some of the power of what we were able to accomplish on the appliance, but with the different set of technology that is available on Orion, there are areas where we already outpace the appliance today. An example of that is IPv6 support. We did not have proper IPv6 support on the appliance; however, it is fully supported on the Orion platform. Another example is SAML support, again a feature that we could not provide on the appliance, but which is fully available on the Orion platform.

To your point, " I want to see Solarwinds as the best monitoring product available and with the help of customers it can be.  We can bring our real-world knowledge of how the product is used in business to help your focus on eliminating areas where this product causes us some pain.  It is a good product, getting better, but the VMAN transition still needs work to finish migrating all the features." we do too! We know it's not a done process, and we are not claiming that it is, but it won't happen overnight. You're absolutely right that your input is what is going to make us the best product available. I'd love to listen in on this session that you're having, so I've sent out an email to our UX researchers to see if they can add me in. Please keep the conversation going, because we'll definitely keep making strides in the direction you're hoping for.

0 Kudos