cancel
Showing results for 
Search instead for 
Did you mean: 
Create Post

Data Warehouse for long term interface statistics, reporting, trending, and capacity planning

Data Warehouse for long term interface statistics, reporting, trending, and capacity planning

One of the useful aspects of Orion and NPM is the statistics gathering for performance and capacity planning...until you have to change the IP address for whatever reason or similar events.
Now all of the carefully gathered performance data is gone.

What do you use to trend the old to the new?
How do you know if you have gained any ground or not?

Similar data is gone if you re-index a switch's interfaces (hardware upgrade).

There is also the issue of being able to retain detailed performance data for long period of time...6 months,1 year, 2 years, or longer for reporting and accurate trending.

One of the items Orion is missing is a data warehouse to archive that performance data for long term reporting.

It would not be practical to keep 18 months or more months of performance data in the production database...and then risk losing 6 months of valuable perfromance data after a hardware change. 

So introduce the data warehouse.

It would be a separate out of band database optimized for reporting.  It syncs updates every hour or so from the production database through a scheduled process.
This allows for near time data for reporting.  Thus any reports are not hitting the production database unless it requires recent data within say 4 hours.


Thus the production database can be set to trim at a lower level (days or weeks instead of months) keeping tables smaller and the product more efficient.
Based on this it would be desirable to locate the data warehouse on the SAN as it not as critical for shear speed or for SolarWinds to operate.

This will help product performance by keeping the production database size down.

Now you can retain detailed historical data for longer periods of time and be able to map annual trends/growth during seasonal fluctuations.
An example would be, your company has a spike in network activity related to the Christmas season every year.  How do you trend the last three years to see
if you may be close to saturation?  Without the historical data available you can't.  How does the new switch chassis you put into production 2 mnths ago fit into those trends?

Chances are all the data for the old interfaces prior to the current switch is gone since the new interface layout changed.

By having a data warehouse you would still have the data, but  in an out of band database you can query to your hearts content and not impact production polling thus providing the ability to answer those questions.

This should also apply to any monitored statistic in SAM and WPM as well.  This allows you to trend data for applications and see what changes to the environment have done and whether they
have positively or negatively impacted response time and availability as well as other monitored metrics over time.  This is huge for software upgrades and troubleshooting.

So the biggest wins are long term data points for  capacity planning, out of band reporting reducing overhead on primary database, reduced cost of SAN based database for long term storage, retention of data through hardware and IP address changes, smaller more efficient production database for increased performance via smaller polled metric specific tables.

320 Comments
Level 14

I like it.

Level 16

We now use Orion for our executive system status and roadmap meetings....  Being able to move everything from 1 to 5 years off to a separate database would be terrific and maybe even sped up NPM response time for daily use.

Level 9

We need this.  Great suggestion OP.

Level 15

As an add on or perhaps it should be an additional feature request should be historical bandwidth tracking.  Right now say you have a 100Mb circuit that is 80% utilized.  If you upgrade that to 200Mb all your old charts instantly use the new bandwidth setting (because the system only keeps one) so your prior reports show 40% utilization.  Makes it hard to show value to your business stakeholders if the perception is the utilization is no better than it was before.

That is one of the points I was trying to make... I just didn't say it as well as you did as I was trying to get several aspects of it across.

Level 15

I think I'll log it as a separate request however because while this request is more for long term planning functionality tracking BW changes should be handled in the current product.  We utilize hundreds of MultiLink circuits where if a T1 drops the bandwidth dynamically changes so we see fluctuations like this daily.

Your request still got my vote regardless

Great, great idea.

I wonder if some reimagined version of the FSDB (used for 4.0+ versions of NTA) could be a candidate for housing this data. Data not changing a lot once put in place? Check. Separate from other NPM data? Check!

Just saying.....I'm not a smart person like these SW devs so it might be a nonsensical idea, but the need/use case sounds similar.

Thanks rharland2012, I've been trying to talk this up and mentioned it to several of the UX persons who seemed interested....

For most small shops it's not a big thing, but for larger where we need to do capacity planning and comparative analysis of changes over time it becomes crucial.

Right now calculating bandwidth usage versus line size has to be done outside the product since the day/time of line changed over to a bigger pipe is not an embedded value nor does it retain the old value.

So how do you know if the added expense to upgrade the line was worth it ?

Thanks again and feel free to help spread the word.

Level 16

Comments:

a) the data should stay in the same database server -- this allows you to use some of the MS SQL services to migrate data

b) please, please, read the The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (9780471200246): Ralph Kimbal...

This is because many reporting tools make use of those same design patterns and anticipate the reporting database looks vaguely like what you might expect.

Level 16

this would be taken care of if the interface speed is a type-2 slowly changing dimension (see the work i cite below)