Strategies for Scaling the World’s Largest Networks

shidoshi1000 over 6 years ago 2 minute read time

By Joe Kim, SolarWinds Chief Technology Officer

It can be truly astounding to think about the scale of today’s largest government networks, which are growing larger and more complex every day.

As a public sector IT pro, you may find managing this growing behemoth an impossible challenge. Ever-increasing numbers of network devices, servers, and applications give you less leeway for downtime, hiccups, or problems of any sort.

There is a range of strategies that government IT pros can employ to support network growth and scalability while helping to ensure that all architectural and infrastructural requirements are met, and system failover scenarios are accounted for.

As the IT environment expands, it becomes more important for monitoring and management systems to scale to keep up with growth. Most monitoring systems are built with the following elements, each with its own requirements and challenges to scale:

A server that hosts the monitoring product and polls for status and performance
A database where the polled information is stored for historical data access and reporting
A web console for software management, data visualization, and reporting

Within this environment, three primary variables will affect a system’s scalability:

Infrastructure size: The number of monitored elements (where an element is defined as a single, identifiable node, interface, or volume), or the number of servers and applications that can be monitored.
Polling frequency: The interval at which the monitoring system polls for information. For example, statistics collected every few seconds instead of every minute will make the system work harder, and requirements will increase.
Concurrent users: The number of simultaneous users accessing the monitoring system.
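
To make the interaction between the first two variables concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes every element is polled once per interval and that load is spread evenly over time, which real systems only approximate. The function name and the figures are hypothetical, chosen purely for illustration.

```python
def polls_per_second(elements: int, interval_seconds: float) -> float:
    """Average number of polls the system must complete each second.

    Assumes each monitored element is polled exactly once per interval
    (a simplification; real products may poll different statistics
    at different intervals).
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return elements / interval_seconds


# Hypothetical environment: 12,000 monitored elements.
elements = 12_000
every_minute = polls_per_second(elements, 60)       # 200.0 polls per second
every_ten_seconds = polls_per_second(elements, 10)  # 1200.0 polls per second
```

Under these assumptions, shortening the interval from 60 seconds to 10 seconds multiplies the average polling load by six without adding a single device.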

Those are the basic factors that determine how well a monitoring system can scale. Now, let’s move on to ways to manage that environment.

A command center is particularly well suited to agencies with multiple regions or sites where the number of nodes to be monitored in each region would warrant both localized data collection and storage. It works well for regional teams that are responsible for their own environments and require autonomy over their monitoring platform. While the systems are segregated between regions, all data can still be accessed from the centrally located console.

Additional scalability tips

There are several additional strategies that will help manage an agency’s growing infrastructure:

Add polling engines: Distributing the polling load for the monitoring system among multiple servers will provide scalability for large networks.
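
As a rough planning aid, the idea of spreading polling load can be sketched as a simple balancing exercise. The Python below is an illustration only, not how any particular monitoring product assigns nodes. It assumes load is proportional to the number of elements per site, and the site names, sizes, and function name are hypothetical.

```python
import heapq


def assign_to_engines(site_sizes: dict[str, int], engine_count: int) -> dict[int, list[str]]:
    """Greedy balancing: place the largest site first on the least-loaded engine."""
    if engine_count < 1:
        raise ValueError("engine_count must be at least 1")

    # Min-heap of (current element count, engine index).
    heap = [(0, index) for index in range(engine_count)]
    heapq.heapify(heap)
    assignment: dict[int, list[str]] = {index: [] for index in range(engine_count)}

    # Largest sites first, so big sites do not end up stacked on one engine.
    for site, size in sorted(site_sizes.items(), key=lambda item: item[1], reverse=True):
        load, index = heapq.heappop(heap)
        assignment[index].append(site)
        heapq.heappush(heap, (load + size, index))
    return assignment


# Hypothetical sites and element counts.
sites = {"hq": 5000, "east": 3000, "west": 2000, "lab": 1000}
plan = assign_to_engines(sites, engine_count=2)
```

With these example figures, one engine ends up with 6,000 elements and the other with 5,000. In practice, factors such as network location and polling interval would also influence the placement.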

Add web servers: Additional web servers can help support increasing numbers of concurrent monitoring sessions, helping to ensure that more users have uninterrupted web access to network monitoring software.

Add a failover server: To help ensure the monitoring system is always available, install a failover mechanism that will switch monitoring system operation to a secondary server if the primary server should fail.
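
The switching logic behind a failover mechanism can be sketched in a few lines of Python. This is a simplified illustration, not a description of any product's failover feature. It assumes a periodic health check on the primary server, a hypothetical threshold of consecutive failed checks before switching, and automatic failback once the primary responds again.

```python
from dataclasses import dataclass


@dataclass
class FailoverMonitor:
    """Tracks primary health checks and decides which server is active."""

    failure_threshold: int = 3     # hypothetical: failed checks before switching
    consecutive_failures: int = 0
    active: str = "primary"

    def record_check(self, primary_healthy: bool) -> str:
        """Record one health-check result and return the active server."""
        if primary_healthy:
            # Primary is responding: reset the counter and fail back.
            self.consecutive_failures = 0
            self.active = "primary"
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                self.active = "secondary"
        return self.active


monitor = FailoverMonitor(failure_threshold=2)
history = [monitor.record_check(ok) for ok in (True, False, False, True)]
```

Requiring several consecutive failures before switching is one common way to avoid failing over on a single missed check. The right threshold depends on how much monitoring downtime an agency can tolerate.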

Agency networks will certainly keep getting larger. It's the nature of an increasingly technology-driven government. While it may seem overwhelming, implementing these few tactics will help IT managers embrace the growth and ultimately realize its value.

Find the full article on Government Computer News.

Top Comments

rschroeder over 6 years ago +1

Missing from this topic: Training. Scaling is built on the right practices, and our house of cards is made stronger by great training of those who will design and deploy it. You might have the best Network…

bravert over 5 years ago

I'm confused. We bought 2 web servers having been told that they were able to be deployed in an HA pair. Turns out that is apparently not the case. We then tried DNS round robin load balancing. Every time the web page is refreshed, the connection reverts to the other web server, which requires the user to log in again. I'm also told that SolarWinds does not support F5 load balancing. If the web servers are not session state aware with each other, how can this be a seamless redundancy? Am I missing something or do I have bad information?
Thanks.
tallyrich over 6 years ago

We have a secondary polling engine, but haven't found the need for HA or additional web servers yet. I'm still trying to get others on the team excited about the products so hopefully usage will grow.
jm_sysadmin over 6 years ago

We have added polling engines, we are about to add web servers, and my database gets re-allocated resources about every other month. (always more). Fail over is on the horizon, but not seen as urgent. Yet.
rpdinavahi over 6 years ago

Totally agree with adding a failover server.
tinmann0715 over 6 years ago

Nice article. A nice cookbook when planning for a monitoring solution for large-scale networks.