
Geek Speak


It's Good Friday here in the US and many companies are on holiday today. Some will have Easter Monday as a day off instead. I suspect much of the working world is enjoying at least a three-day weekend.

 

And that got me thinking about how we should all have longer weekends.

 

I'm not suggesting that we only work four days a week (but if my bosses think that's a great idea, then I'll take credit for it). No, what I am thinking about is how often my life has been interrupted during a weekend by someone needing help. As a production DBA for 7+ years, with 5 of those years as the team lead, I lost many, many hours on weekends due to "emergencies" that were anything but. It wore me down, and wore me out.

 

So that's going to be my number one goal here at SolarWinds: to give customers longer weekends. Everything I do here at SolarWinds will be with that one purpose in mind.

 

I want our customers to have what I didn't: the opportunity to spend uninterrupted time with their families when they are away from the office.

 

Enjoy the long weekend!

The Heartbleed bug is a vulnerability that’s compromising Internet applications like Web, email, and instant message communication. However, recent revelations indicate that there’s more to this threat: Heartbleed has also been found to affect connected devices that rely on the OpenSSL encryption library, including network hardware like routers and switches. Networking vendors such as Cisco, Juniper Networks, F5 Networks, and Fortinet have all issued security alerts about this risk.

 

OpenSSL is a widely used open-source implementation of the SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. OpenSSL 1.0.1 before 1.0.1g does not properly handle Heartbeat extension packets, which allows remote attackers to obtain sensitive information from process memory via crafted packets that trigger a buffer over-read. That information can include private keys, usernames and passwords, or the contents of encrypted traffic. This is a severe vulnerability that can give attackers a foothold in even a large, well-defended network.

 

Heartbleed Remediation in Your Network

 

Because OpenSSL is such a widely used implementation of SSL, Heartbleed is difficult to fully remediate. However, your immediate action should be to patch affected systems to the fixed version, 1.0.1g or newer.

 

To remediate Heartbleed in three simple steps (a quick version-check sketch follows the list):

  1. Change passwords for all devices (before & after patching, to be absolutely sure that no attacker sneaks in...)
  2. Patch your network operating system for all perimeter hosts
  3. Purge bad OpenSSL versions from your entire infrastructure
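Step 3 comes down to finding which hosts still report a vulnerable build. Below is a minimal triage sketch in Python, assuming you have already collected "openssl version" output from each host with your own tooling; the host names and banners are made-up placeholders.

```python
# Minimal triage sketch: flag OpenSSL version strings that fall in the
# Heartbleed-affected range (1.0.1 through 1.0.1f). Pre-1.0.1 builds do
# not include the heartbeat extension and are not affected.
AFFECTED = {"1.0.1", "1.0.1a", "1.0.1b", "1.0.1c", "1.0.1d", "1.0.1e", "1.0.1f"}

# Placeholder inventory: host -> output of "openssl version"
inventory = {
    "web-proxy-01": "OpenSSL 1.0.1e 11 Feb 2013",
    "edge-fw-02": "OpenSSL 1.0.1g 7 Apr 2014",
    "legacy-vpn": "OpenSSL 0.9.8y 5 Feb 2013",
}

def is_vulnerable(banner: str) -> bool:
    parts = banner.split()
    version = parts[1] if len(parts) > 1 else ""
    return version in AFFECTED

for host, banner in inventory.items():
    status = "PATCH NOW" if is_vulnerable(banner) else "ok"
    print(f"{host:15} {banner:32} -> {status}")
```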

 

It’s important to contact the vendors of any of your devices that connect to the Internet, find out whether those devices rely on OpenSSL, and ask whether a patch is available. In addition, refrain from using any affected applications or devices, and apply updates as soon as possible.

 

Junos OS affected by OpenSSL "Heartbleed" issue – Juniper


Cisco has also released a list of affected and vulnerable products.

 

For a network with 100s to 1000s of devices, it’s no small task to push network OS and firmware updates/patches in bulk. Using an automated tool to quickly take action and apply software fixes on all devices in the network will definitely save the network admin time, and enable quicker turnaround times (TATs) when addressing sudden vulnerabilities such as Heartbleed.

 

Also, note that most vendors are still working on fixed versions of code for their products, so patching your devices with new updates should stay on your to-do list for quite some time.

 

Note for SolarWinds customers: Please take a look at this table to check the Heartbleed vulnerability against the product(s) you use.

Application performance monitoring (APM) is a broad subject that looks at how businesses use enterprise-level applications. These applications must meet end-user requirements and maintain high availability to ensure optimal performance. Furthermore, organizations that depend on APM technology to scale certain areas of their business must understand that innovation plays a vital role. After all, CIOs and other decision makers will want to see the ROI over time.

      

The Role of APM Software

Organizations with a sizeable IT infrastructure need APM software to manage IT assets effectively, ensuring they last as long as expected and keep delivering from the day they’re set up. For example, say you’re an enterprise with 1,000+ physical and virtual servers. These servers run mission-critical applications that support specific business groups. As an IT pro, it’s your duty to ensure the server hardware stays healthy and the applications are always available without downtime. Managing this manually isn’t an option, since there are too many nuts and bolts to look at, and you also have to support and manage other areas of the environment.

   

Having an APM tool means you can automate availability and performance management for your servers and applications. APM tools offer other benefits as well, such as automatic notification when something goes wrong with servers and apps. Within minutes, an APM tool helps pinpoint where an issue originates, and it can monitor application performance in pre-production phases before an application goes live in production. In turn, you can fix minor issues before end users start pointing them out, and much more.
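To make that concrete, here is a minimal sketch of the kind of synthetic check an APM tool automates at scale. The URL and threshold are hypothetical placeholders; a real product adds history, dashboards, and routing of alerts to the right teams.

```python
import time
import urllib.request

# Hypothetical endpoint and threshold; an APM tool runs many such checks
# on a schedule, stores the history, and raises alerts automatically.
URL = "https://intranet.example.com/orders/health"
WARN_AFTER_SECONDS = 2.0

def check_once(url):
    """Return (is_up, elapsed_seconds) for a single synthetic request."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            ok = 200 <= resp.status < 400
    except Exception:
        ok = False
    return ok, time.monotonic() - start

up, elapsed = check_once(URL)
if not up:
    print(f"ALERT: {URL} is down")
elif elapsed > WARN_AFTER_SECONDS:
    print(f"WARN: {URL} responded in {elapsed:.2f}s (threshold {WARN_AFTER_SECONDS}s)")
else:
    print(f"OK: {URL} responded in {elapsed:.2f}s")
```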

      

Where APM Fits

APM as a technology has evolved beyond monitoring only the set of applications that IT itself uses. Today, it has a significant impact on various groups across an organization: industry-specific users leverage APM to manage their business needs and address everyday challenges, while IT pros look for tools that go deeper to manage the performance of business-critical applications such as Exchange and SQL Server®.

     

Manage Critical Applications: Several times a week, IT pros are asked to help users by unlocking their accounts. Today’s APM tools not only monitor Active Directory® metrics, but also include built-in functionality to manage the logs generated by critical applications.

     

Manage Custom Applications: Industries like healthcare and financial services are largely dependent on APM tools to help improve customer support, help streamline auditing and compliance processes, manage large amounts of customer data, and so on. For example, the banking industry may have poor customer satisfaction when sites are slow to respond to user requests. Monitoring online transactions based on response time, traffic, etc. will help business groups streamline their system.

    

Manage Overall IT Infrastructure: It’s not enough for IT personnel to know only the performance of the network. To really identify the root of an issue, IT needs APM to determine whether the network is at fault, whether hardware resources are inadequate, or whether an application failure is causing end-user issues.

      

Mobile IT Management: Beyond accessing email on a smartphone, IT organizations and business groups want their critical applications at their fingertips. Using an APM solution on the go means instant notifications about troubled components can be routed to the right teams, so issues can be fixed in a matter of minutes.

       

Role of APM in Analytics: An APM tool gives you different types of information on the performance of your servers and hardware, operating systems, critical applications, databases, etc. Making sense of this data is essential to be able to determine problems that may arise. 

       

Manage Web Applications: Getting visibility into the performance of your websites and Web applications can help you quickly pinpoint and resolve the root cause of issues. APM helps you determine if there are constraints on resources, such as Web server, application server, or databases.

          

Manage Virtual Environments: Organizations may have virtual admins managing the health of virtual appliances. Virtual admins also need visibility into how applications running in VMs are performing. APM also allows you to plan capacity management for a given application and its underlying resources.

        

Whether it’s analytics, cloud-based apps, or managing various assets within your IT infrastructure, APM fits well and, more often than not, provides real assistance in managing your IT environment.

One of the questions new users of Patch Manager encounter most often is the purpose and uses of the Managed Computers node in the console.


The Managed Computers node is the collection of all Patch Manager servers, registered WSUS servers, and any machines that have been targeted for an inventory task, regardless of whether the machine was successfully inventoried. As the inventory task obtains the list of machines from the target container, a record is created in the Managed Computers list for that machine. When the inventory is successfully completed, a number of attributes are displayed for the machine in the Managed Computers node.

 

The Managed Computers node is especially useful for accessing basic diagnostic information about the status of computers and the inventory process. In the Computer Details tab for each machine, five state results are provided that describe the results of the inventory connection attempt.

 

When an inventory task is initiated, the Patch Manager server queries the container object that the inventory task has been targeted to. This may be a domain, subdomain, organizational unit, workgroup, WSUS Target Group, or Patch Manager Computer Group. Regardless of the type of container, the Patch Manager server obtains a list of machine names from the identified container.

 

Failed Inventory Connections

An entry in the Managed Computers node with an icon containing a red circle indicates a machine that failed the most recent inventory connection attempt.

 

DNS resolution attempt reports the status of the attempt to resolve the computer name obtained from the container. If the name was resolved, the IP Address is captured and stored in the computer record and the status is reported as “Success”. If the name was not resolvable, the status is reported as “Failed”.

 

ARP resolution attempt reports the status of the attempt to resolve the IP Address obtained from the DNS resolution attempt. If the ARP resolution attempt is successful, the MAC address is captured and stored in the computer record, and the status is reported as “Success”. If the ARP resolution attempt was not successful, the status is reported as “Failed”.

 

ARP is a broadcast-based network technology and generally does not cross broadcast boundaries, which include routers, bridges, gateways and VLANs. As such, when performing ARP resolution for IP Addresses on the other side of a gateway, it’s important to note that the gateway will respond with its own MAC Address, as the purpose of ARP is to identify where network packets should be addressed to get the packet on the correct pathway to its destination. Patch Manager knows whether a MAC address returned is the MAC address of a boundary device or the actual targeted device. When a boundary device is identified as the owner of a resolved MAC Address, Patch Manager will not record that MAC address and will report the ARP resolution as “Failed”. Thus, it is a normal indication for machines on remote networks to have a status of “Failed” for the ARP resolution attempt, except where an Automation Role server is physically present on that remote network. (See Patch Manager Architecture - Deploying Automation Role Servers and How-To: Install and Configure a Patch Manager Automation Role Server for more information about the use of Automation Role servers.)

 

Endpoint Mapper connect attempt reports the status of the attempt to connect to the RPC Endpoint Mapper on port 135. When the status of this event is reported as “Failed”, and the status of DNS and ARP resolution events are reported as “Success”, this is generally the result of an intervening firewall blocking traffic on port 135.

 

File and Printer Sharing connect attempt reports the status of the attempt to establish a file sharing session on port 445 using SMB over IP. When the status of this event is reported as “Failed”, either an intervening firewall is blocking port 445, or the File and Printer Sharing service may not be enabled. Comparing the results of the Endpoint Mapper connect attempt can shed additional light on the situation. It’s also important to note that File and Printer Sharing is only needed to deploy/update the Patch Manager WMI Providers. If the WMI Providers are deployed, a failure here will not negatively impact the completion of the inventory task.

 

WMI connect attempt reports the status of the attempt to establish the WMI session. When this event is reported as “Failed”, you should check firewall configurations as well as the credentials configured in the assigned credential ring. If using a local account to access the machine, the password stored for the credential may not match the password configured on the machine’s local account; also confirm that the chosen credential does have local Administrator privileges on the target machine.
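For readers who want to reproduce a few of these checks outside of Patch Manager, the sketch below approximates the DNS resolution, RPC Endpoint Mapper (port 135), and File and Printer Sharing (port 445) tests in plain Python. The target hostname is a placeholder, and the ARP and WMI steps are omitted because they require platform-specific tooling.

```python
import socket

def check_host(name):
    """Rough approximation of three of the inventory connection checks:
    DNS resolution, RPC Endpoint Mapper (TCP 135), and File and Printer
    Sharing (TCP 445)."""
    results = {"dns": "Failed", "rpc_135": "Failed", "smb_445": "Failed"}
    try:
        ip = socket.gethostbyname(name)
        results["dns"] = f"Success ({ip})"
    except socket.gaierror:
        return results  # nothing else to try if the name does not resolve
    for label, port in (("rpc_135", 135), ("smb_445", 445)):
        try:
            with socket.create_connection((ip, port), timeout=3):
                results[label] = "Success"
        except OSError:
            pass  # port blocked, service not running, or host unreachable
    return results

# Placeholder hostname
print(check_host("workstation-042.example.local"))
```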

 

Partially Successful Inventory Connections

A machine with a yellow triangle icon indicates a machine that was successfully inventoried, but one or more issues occurred while attempting to access a specific datasource or object. The specific objects that were impacted will be listed at the bottom of the Computer Details tab.

 

Tabs & Graphics

There are four tabs provided on the Managed Computers node that display the statistical results of various steps in the inventory collection process. Double-clicking on any graph segment will launch the corresponding report listing the machines affected.

 

The Computer Inventory Details tab shows the specific datasources collected, and the timestamp of the last successful collection.

 

The Connectivity Summary graph shows the number of systems targeted for inventory, the number of systems accessible via WMI, the number of systems presumed to be powered off (or otherwise not reachable), and the number of systems that are reachable, but could not be accessed with WMI.

 

The Connectivity Failure Summary graph shows the number of systems which failed at any of four of the five steps of the connection process: DNS resolution, ARP resolution, RPC Endpoint Mapper connectivity, and File and Printer Sharing connectivity. It also contains results for a NetBIOS connection attempt that is also performed during the inventory connection process.

 

The WMI Failure Summary graph shows the number of systems which failed WMI connectivity for the three most commonly occurring reasons: [1] Access Denied, [2] Firewall blocked or WMI disabled, and [3] other WMI failures not attributable to a known cause.

 

In addition, the Managed Computers node also provides a Discovery Summary graph which shows the number of machines that were accessible on selected ports tested during a Discovery event. Discovery is a process by which devices and machines are identified by IP address, and network accessibility is identified by TCP port availability.

 

For more information about the features of Patch Manager, or to download your own 30-day trial, please visit the Patch Manager product page at SolarWinds.

Now that I’ve had a chance to settle in after five weeks on the road, I wanted to take a moment to write about our visit to Interop in Las Vegas this year. While SolarWinds always exhibits at Cisco Live, including a couple of its international editions, and a host of other events, we hadn’t been to Interop since 2007. In a way it’s an anniversary of sorts for me; my first time staffing our booth was at Interop 2007.
It Was Smaller but Better Than 2007
Interop was a bit smaller than during our previous visit, but the majority of the absent vendors weren’t really missed, putting the focus back on real admins solving issues. After working CeBIT this spring, if I never have to walk past another row of 10x10s featuring exactly the same-looking WICs from vendors I’ve never heard of, I’ll be happy. With many of the single-issue solution vendors gone, most attending were companies like SolarWinds that know what they’re doing, have been around for a while, and solve real problems without vaporware. There were a few startups looking to snag their first whale, but by and large it was a great environment for attendees to take their time and engage vendors in low-pressure conversations about... well, geek stuff, like how products really work rather than the pretty graphics on the booth.

Another outstanding feature of the show was the network. I’ll put this in bold because it’s earned. The Interop Network Team (InteropNet) delivered the greatest, most stable, and highest-performing network of any tradeshow I’ve ever connected a booth to: 60 Mbps down, 12 Mbps up, with <12 ms latency and almost no jitter. DHCP terminated into Fe0/0 on my trusty 2800, done and done. I don’t understand why exhibitors everywhere else have to pay $1,900 for ADSL speeds. That it’s staffed largely by volunteers makes it all the more amazing. Great job, guys!

Mandalay Bay – Most Improved Hotel Internet

Once upon a time the casinos did everything they could to keep you inside, including rumored cell phone interference and certainly poor WiFi. Apparently the word is out that BYOD isn’t just for enterprises, and that humans prefer to hang out where they have good internet. For my last six shows there, Mandalay Bay has been a black hole of connectivity, especially for geek events when we arrive en masse to pound the APs. This year was completely different.

BYOD rule #1: it has to be stupid easy to use. For example, if you own a huge structure hundreds or thousands of feet from other networks, you don’t need passwords on your SSIDs. Great improvement there. Also, use your controllers and thin APs to shut down rogue networks. My Dell, Air, and phone were all online and happy in moments. BYOD rule #2: create an edge where the network is available but access is controlled, as long as it’s still pretty easy to use. In all the areas outside the rooms, including the casino floor, they now offer free 3 Mbps/1 Mbps access with your name and room number. Smarter still, they upsell to even faster speeds. Provide visitors with bandwidth or we go elsewhere; they got the message.

NSX Was (REDACTED), More from Cisco Live

I’m a huge ESX fan and love the idea of a combined console that virtualizes everything. I’ll talk about this more in another column because it’s really important, but I want to chat more at Cisco Live and balance that discussion with Cisco ACI. I’m not saying I was unimpressed with the technology for SMBs, but more research is needed before prognosticating about the future of the non-huge datacenter.
The IT Beast Is Dead, Long Live the IT Beast

With all the great customer conversations, and special sights in the booth, like watching customers spontaneously demo our products to non-customers on the big screen, I'm a bit sad to see our current booth theme retired.  We typically keep a theme about a year and the big, green, betentacled beast has been an eye-catching conversation starter in booths from Cisco Live to RSA.  Interop was his last outing.  His piercing red eyes and sharp teeth will be missed.

However-- we’re bringing all new fun to Cisco Live in San Francisco, so be there to check it out. It’s your first chance to get this year’s new t-shirt before anyone else does.

Users describe call experience as ‘good’, ‘ok’, ‘poor’, ‘bad’, or ‘terrible’, and call experience is defined by the elements of voice quality and the network factors that affect them. The elements of voice quality are loudness, distortion, noise, fading, and crosstalk, whereas the network factors that affect them are latency, jitter, packet loss, voice activity detection, echo, and echo canceller performance.

 

How do these factors affect voice quality?


  • Latency: the delay, or the time it takes for speech to get from one designated point to another. Very long latency results in a delay before the listener hears the speaker at the other end.
  • Jitter: the variance of inter-packet delay. When multiple packets are sent consecutively from source to destination and some are delayed in the network, for example by queuing or by arriving through alternate routes, the variation in arrival delay between packets is the jitter value. For delay-sensitive applications like VoIP, a jitter value of 0 is ideal.
  • Packet loss: occurs when data packets are discarded because a device is overloaded and unable to accept incoming data. Keep packet loss as low as possible; for VoIP, packet loss causes parts of the conversation to be lost.
  • Voice Activity Detection: used in VoIP to reduce bandwidth consumption. When this technology is used, the beginnings and ends of words tend to be clipped off, especially the "T" and "S" sounds at the end of a word.
  • Echo: the sound of the speaker's voice returning to and being heard by the speaker. Echo is a problem of long round-trip delay; the longer the round-trip delay, the more difficult it is for the speaker to ignore the echo.
  • Echo Canceller Performance: the echo canceller remembers the waveform sent out and, for a certain period of time, looks for a returning waveform it can correlate with the original signal. How well the echo is cancelled depends on the quality of the echo canceller. If the return signal (echo) arrives too late, the echo canceller won't be able to correlate and cancel it properly.
  • CODEC: short for coder-decoder, a CODEC converts an audio signal into digital form for transmission and back into an audio signal for replay. CODECs also compress packets to gain maximum efficiency from the network. How well the CODEC converts speech to digital packets and back again affects voice quality. Choosing the right codec for the network depends on the required sound quality, available network bandwidth, and so on. Some networks use more than one codec, but this again may impact call quality.

  

The index to measure the call quality using network data is called the Mean Opinion Score (MOS).

  

Mean Opinion Score (MOS)


MOS is a benchmark used to determine the quality of sound produced by specific codecs; individual opinion scores are averaged to provide the mean for each codec sample, and the resulting score is used to assess the performance of the codecs that compress the audio. It is always preferable to have a MOS of 4 or 5 for your VoIP calls; when the MOS drops to 3.5 or below, users find the voice quality unacceptable.
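To see roughly how network metrics turn into a MOS value, the sketch below uses a commonly cited simplified E-model approximation: derive an R-factor from one-way latency, jitter, and packet loss, then map R to MOS. This is an illustration only, not the exact calculation any particular monitoring product uses.

```python
def estimate_mos(latency_ms, jitter_ms, loss_pct):
    """Simplified E-model approximation: one-way latency, jitter, and
    packet loss -> R-factor -> estimated MOS."""
    effective_latency = latency_ms + 2 * jitter_ms + 10
    if effective_latency < 160:
        r = 93.2 - effective_latency / 40
    else:
        r = 93.2 - (effective_latency - 120) / 10
    r -= 2.5 * loss_pct                 # each percent of loss costs ~2.5 R points
    r = max(0.0, min(100.0, r))
    return 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)

# Example: 80 ms latency, 15 ms jitter, 1% loss -> roughly 4.3 (a good call)
print(round(estimate_mos(80, 15, 1.0), 2))
```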

  

Measuring VoIP performance using MOS


Test Infrastructure Readiness for VoIP Traffic

When implementing VoIP, it is a good practice to test the network for its readiness to carry voice traffic. But how can you accomplish this without spending CAPEX on VoIP infrastructure? Cisco devices support IP SLA, which generates synthetic VoIP traffic and collects data to measure metrics like latency, jitter, packet loss, and MOS. The MOS is an indication of what voice quality to expect. You can start by troubleshooting network devices in the path of VoIP calls, particularly those with a MOS lower than 3. However, configuring IP SLA operations requires good knowledge of the CLI (command-line interface).


Troubleshoot Poor VoIP Performance In the Network

Manually troubleshooting VoIP issues involves collecting performance metrics like jitter, latency, packet loss, etc., from various nodes in the network such as switches, routers, or call managers. But this does not provide a standard for comparing and understanding VoIP call quality across the network; MOS acts as that standard. For a network experiencing poor VoIP performance, you can pinpoint the root cause based on the MOS measured for a specific codec, at a particular time, or for a particular department or location.


In summary, MOS scores are good indicators for troubleshooting VoIP performance issues in the network. Tools like SolarWinds VoIP and Network Quality Manager (VNQM) let you enable IP SLA operations on your devices without knowledge of CLI commands, and help you avoid time-consuming manual configuration on multiple devices.


SolarWinds VNQM also provides automated reports with locations and MOS scores, comparison of MOS scores per codec for each call, comparison of call performance metrics between departments or call managers and notification via email in case of bad calls.


Reduce time needed to evaluate network readiness and troubleshoot VoIP performance issues from hours to minutes!


Learn More:

Network management is good, but some of its benefits are not so apparent at first. It gives you the proof you need when that person calls the help desk and says “the network is slow.” At that point you have to clarify the issue with the user and find out exactly what the problem is. Once you have the details, you can check your NMS to find out if there is an issue, and one of two things is going to happen: either you find a problem and correct it, or you don’t find a problem and can prove the user is mistaken.

 

The thing you won’t have is “I don’t know if there is a problem,” and that is the one that will drive you crazy, because the user is sure there is a problem, and unless you can offer authoritative proof the finger-pointing cycle will begin, and that is a place no one wants to be. You know there is not a problem with the network, but the user needs proof. SolarWinds Orion is what gives you that proof, proof you can take to the client and say: it’s not my opinion that the network is not slow, here are the facts. Circuit xyz has a utilization of 20% and is nowhere near being saturated.
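That 20% figure is just simple arithmetic over interface counters. Here is a minimal sketch of the calculation, with made-up sample numbers:

```python
def utilization_pct(octets_t0, octets_t1, interval_s, if_speed_bps):
    """Classic interface utilization from two octet-counter samples
    (e.g., ifInOctets or ifOutOctets polled via SNMP): bits transferred
    over the interval divided by link capacity."""
    bits = (octets_t1 - octets_t0) * 8
    return 100.0 * bits / (interval_s * if_speed_bps)

# Made-up sample: a 100 Mbps circuit moved ~750 MB in 5 minutes -> 20.0%
print(round(utilization_pct(0, 750_000_000, 300, 100_000_000), 1))
```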

 

I once had a user call into the help desk and claim the network was slow. He placed a ticket saying his mission-critical system was slow and this was causing a production problem. The help desk created a high-priority ticket and sent it over to my team. I looked at the ticket and checked my Orion server; for the problem area that was reported I saw no issues, so I created a report to back up my findings. I called the user and explained that, based on the facts I had obtained, there was no problem.

 

The user told me that there was indeed a problem: he had just moved from building 3 to building 1, and when he ran a speed test from speedtest.net in building 1, the result was five times slower than it had been in his old building. He even showed me some screenshots of his findings.

 

In reality, there was no problem, but if I hadn’t had the facts to back me up, the finger-pointing would have started and spiraled out of control, and at that point nobody would have been happy. Like they always say, “the truth shall set you free,” and it sure did in this case.

 

Orion has saved so much time and frustration, and I sleep better at night knowing that yes, there will still be network problems, but I will be able to answer that question: is there a problem? If yes, then I fix it; if not, I can prove it. I never have to deal with “I don’t know,” and that, my friends, is a wonderful thing. Bye-bye finger pointing, hello productive day. How much is your time worth?

Last week, we had a great conversation on finger-pointing, and some of you shared real-world advice on how to avoid it. Most of the comments described a work environment that was still tied to the stove-piped organizational structure from ten years ago, when network, server, and storage were discrete disciplines with effectively zero relation to one another. This approach, however, is no longer valid.


Virtualization, specifically the abstraction of physical resources, makes isolated engineering teams dysfunctional. It’s not enough to pursue skills that exist exclusively within the confines of network, server, and storage. For example, it’s not surprising to hear that someone has a few VMware certifications and a CCNA. That makes sense, since you can’t do a whole lot with vSphere unless it’s connected to your network. But those of us who have been doing IT work for a long time certainly remember a time when having a Microsoft cert AND a Cisco cert was unheard of.


So, a few questions for you:


  1. Are you part of a siloed team at work? If so, how do you support virtualization (or other technologies that consume resources from multiple teams)?
  2. Do any of you have multiple vendor certifications that extend beyond the network | server | storage silos? How have they helped your career?
  3. Do you think having some primitive coding skills can help engineers in any discipline?
  4. Is there a future for engineers who focus on a single skill-set?


And here's a hint: the answer to number 4 is no. Discuss.

File sharing is common; file sharing is critical; and file sharing is sometimes complex. We perform file sharing all the time in the organization, either peer-to-peer via email, instant message, internal shared drives, etc., or by using FTP transfers and cloud services. File sharing has always been a simple concept, but when it comes to security of the data in transit and in storage, that’s when the doubt seeps in. What could happen during the transfer process? Can someone else intercept it and steal or modify the data? It is possible, and it happens often, even in large organizations, and not just in file transfers within the organization but also in transfers to parties outside the corporate network. Security is not guaranteed in any of these forms of file transfer. Then there’s the issue of process complexity.

 

Secure File Transfer

There’s an alternative: secure file sharing using a managed file transfer (MFT) server secured by FTPS/SFTP/HTTPS and whatever other security policies and permissions your organization requires. With data storage and the FTP server infrastructure inside the perimeter of your own network, you can transfer files safely and securely between FTP clients (including both computers and handheld devices). That takes the security concern away. Now, to address complexity.

 

Ad Hoc File Sharing

What you want in a file sharing process is the simplicity of sending and receiving files without complicated processes and manual labor. Enter ad hoc file sharing: whenever-you-want, wherever-you-want file transfer is just a few clicks away!

  • Sending a File: When a user wants to send a file, all they need to do is upload the file to a secure FTP server and email the link (with or without password protection) to the recipient. The recipient (whether inside or outside the enterprise network) receives the email with the link to download the file. If password security is enabled, the recipient must enter the password to open the file. Sending a file over secure FTP cannot get any simpler (a bare-bones upload sketch follows this list).
  • Requesting & Receiving a File: If you’d like to request a file, just use the FTP client in the MFT server interface to send an email with a secure upload link to the sender (with or without password protection). Once the sender receives this link in their email, they can use it to upload the file to the FTP server, and you get an email notification of the completed transfer. Now you, the recipient, can simply click the link in your email to download the file, or use an FTP client interface to do the same thing.
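For the upload half of that workflow, here is a bare-bones SFTP upload sketch using the Python paramiko library. The host, credentials, and paths are placeholders; an MFT server layers the link generation, password protection, and email notifications described above on top of a transfer like this.

```python
import paramiko

# Placeholder connection details; a real deployment would use key-based
# authentication or the MFT server's own credential management.
HOST, PORT = "files.example.com", 22
USERNAME, PASSWORD = "adhoc-user", "change-me"

def upload(local_path, remote_path):
    """Upload one file over SFTP."""
    transport = paramiko.Transport((HOST, PORT))
    try:
        transport.connect(username=USERNAME, password=PASSWORD)
        sftp = paramiko.SFTPClient.from_transport(transport)
        sftp.put(local_path, remote_path)
        sftp.close()
    finally:
        transport.close()

upload("quarterly-report.pdf", "/uploads/quarterly-report.pdf")
```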


Managed File Transfer

In addition to security and the ease of file sharing, automation and control of the file transfer process will result in more simplicity and operational efficiency. Employing a third-party MFT server will help you gain additional features like event-driven automation, scheduled file transfer, multiple file transfer, large file transfer, file synchronization, side-by-side and drag-and-drop transfer, reporting, virtual folder access, AD sync for permissions, multi-factor authentication, more intuitive FTP Clients and more.

 

Alongside simplifying and securing your file transfer process, you can also make it powerful and robust enough to support your organization’s growing file transfer needs. Indeed, file transfer is fun when you have the right tool to facilitate it. Just remember to play it safe!

According to Forrester, the SaaS application and software market is expected to reach $75 billion in 2014. Forrester goes on to note that the “browser-based access model for SaaS products works better for collaboration among internal and external participants than behind-the-firewall deployments.” When you think about it, today’s users spend most of their time accessing various “smart applications.” Whether it’s Office 365 or Salesforce, the user base accessing and using these applications is increasing tremendously.

      

Monitoring the performance of these applications makes a huge difference as more and more users adopt SaaS and cloud-based applications. Monitoring server load, user experience, and bottlenecks is crucial to optimizing overall performance, whether the application is hosted on premises, in a public cloud, or in a hybrid approach. If your organization uses several SaaS-based applications, the following considerations can guide how you monitor their performance and availability.

         

Monitor User Experience: Since users are going to be accessing the application extensively, you should monitor overall user experience and users’ interaction with the application. This allows you to analyze performance from the end user’s perspective. Slow page load times or image-matching issues can be a first indication that there’s an issue with the application. By drilling deeper, you can determine whether the problem is related to a specific page or location. Ultimately, monitoring user experience allows you to improve and optimize application performance, which results in improved conversion.
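As a rough illustration, a synthetic check can time each page of a typical user transaction. The URLs below are hypothetical, and a full user-experience monitor also renders pages and performs image matching rather than just timing HTTP fetches.

```python
import time
import urllib.request

# Hypothetical transaction: the pages a user hits in order.
STEPS = [
    ("login",     "https://app.example.com/login"),
    ("dashboard", "https://app.example.com/dashboard"),
    ("report",    "https://app.example.com/reports/monthly"),
]

def time_step(url):
    """Fetch one page and return the elapsed time in seconds."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=15) as resp:
        resp.read()          # pull the full body, as a browser would
    return time.monotonic() - start

for name, url in STEPS:
    try:
        print(f"{name:10} {time_step(url):6.2f}s")
    except Exception as exc:
        print(f"{name:10} FAILED ({exc})")
```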

     

You could also look at this in two ways: from the perspective of the service provider, and from the perspective of the service consumer.

     

Service providers need to focus on:

  1. User experience: It’s likely service providers have SLAs with end users and they need to demonstrate they are meeting uptime and other SLA considerations.
  2. Infrastructure: There are many factors that can cause a service failure, therefore all aspects of the infrastructure must be monitored. These aspects include applications, servers, virtual servers, storage, network performance, etc.
  3. Integration services (web services): The services provided may depend on other SaaS providers or internal apps.

        

Service consumers need to focus on: 

  1. User experience: If part of your web application consumes web services, this can be the first indication of a problem.
  2. Web service failures: This can help identify a failure in communication.

     

Focusing on these aspects is essential when you’re monitoring SaaS applications. These key considerations help IT admins take proactive measures to ensure applications don’t suffer downtime during crucial business hours. At the same time, continuous monitoring keeps each application optimized, improving overall efficiency.

            

Check out the online demo of Web Performance Monitor!

TiffanyNels

Countdown To FINAL ROUND

Posted by TiffanyNels Apr 9, 2014

Ok, fellow thwackers... we are getting down to the wire.  The next 12 hours of voting will determine which games go on to compete to win the HIGH SCORE or go GAME OVER.

 

If you haven't voted for your favorite game, you still have time on the clock. The final four round closes at MIDNIGHT tonight.  Call of Duty is facing Grand Theft Auto in a first-person shooter match up, while the other side of the bracket is a match up between two iconic games which many of you cut your gaming teeth on, Zelda v. Doom. 

 

Get out there and campaign for your favorites, vote early (but not often). There is still time on the clock to change the outcome of this round.

 

So, get to it. We will see you for the final boss battle.

 

Game on, gamers!

“Why is my database slow?” This is a very common predicament that most SQL developers and DBAs face in their day-to-day database encounters, regardless of the relational database platform being used. A database can be slow for many reasons, and one of the hardest to isolate is slow query processing and long wait times.

 

Reasons for Slow Database Performance

  • Network: There could be network connection issues
  • Server: The workload on the server running the database could be high, which slows database processing
  • Database/Query: There may be redundant query lines, complex or looping syntax, query deadlocks, lack of proper indexing, improper partitioning of database tables, etc.
  • Storage: Slow storage I/O operations, data striping issues with RAID

 

While network issues and server workload can be easily measured with typical network monitoring and server monitoring tools, the real complexity lies in answering the following database- and query-related questions:

  • What query is slow?
  • What is the query wait time?
  • Why is the query slow?
  • What was the time of the day/week of the performance impact?
  • What should I do to resolve the issue?

  

Query response time analysis is the process of answering the above questions by monitoring and analyzing query processing time and wait time, and exploring the query syntax to understand what makes the query complex. We can break query response time down into two parts:

  1. Query processing time – the actual time the database takes to run the query. This includes measuring all the steps involved in the query operation and analyzing which step is causing processing delay.
  2. Query wait time – the time a database session spends waiting for resources to become available, such as a lock, a log file, or any of hundreds of other wait events or wait types.

 

Response Time = Processing Time + Waiting Time


Query wait time is determined with the help of a wait metric called a wait type or wait event, which indicates the amount of time sessions spend waiting for each database resource.

  • In SQL Server®, wait types represent the discrete steps in query processing, where a query waits for resources as the instance completes the request. Check out this blog to view the list of common SQL Server wait types.
  • In Oracle®, queries pass through hundreds of internal database operations called Oracle wait events, which help you understand the performance of SQL query operations. Check out this blog to view the list of common Oracle wait events. (A small sketch for pulling SQL Server wait statistics follows this list.)
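For a quick, instance-level look at where a SQL Server spends its wait time, you can query the sys.dm_os_wait_stats DMV; the sketch below does so via pyodbc. The connection string is a placeholder, and the DMV aggregates waits since the last instance restart, so this is far coarser than the per-query, per-session analysis described above.

```python
import pyodbc

# Placeholder connection string; adjust driver, server, and auth to your
# environment.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=db01.example.com;"
    "DATABASE=master;Trusted_Connection=yes;"
)

TOP_WAITS = """
SELECT TOP 10 wait_type,
       waiting_tasks_count,
       wait_time_ms,
       wait_time_ms - signal_wait_time_ms AS resource_wait_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;
"""

# Each row shows how long the instance has spent on a given wait type
# since the last restart (or since the stats were cleared).
for row in conn.execute(TOP_WAITS):
    print(f"{row.wait_type:40} waits={row.waiting_tasks_count:>10} "
          f"wait_ms={row.wait_time_ms:>12} resource_ms={row.resource_wait_ms:>12}")
```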

 

A multi-vendor database performance monitoring tool such as SolarWinds Database Performance Analyzer will help you monitor all your database sessions and capture query processing and wait times to be able to pinpoint bottlenecks for slow database response time. You can view detailed query analysis metrics alongside physical server and virtual server workload and performance for correlating database and server issues. There are also out-of-the-box database tuning advisors to help you fix common database issues.

A recent survey commissioned by Avaya reveals that network vulnerabilities are causing more business impacts than most realize, resulting in revenue and job loss.


  • 80% of companies lose revenue when the network goes down; on average, companies lost $140,000 USD as a result of network outages
  • 1 in 5 companies fired an IT employee as a result of network downtime

 

And.....


  • 82% of those surveyed experienced some type of network downtime caused by IT personnel making errors when configuring changes to the core of the network
  • In fact, the survey found that one-fifth of all network downtime in 2013 was caused by core errors

 

Cases of Device Misconfigurations Leading to Network Downtime


Real-world scenario 1: Company Websites Down, Reason Unknown

Soon after a software giant launched a big advertising campaign, with major incoming Web traffic expected, its websites went down. Because the team was unable to pinpoint the actual cause of the downtime as a configuration change made earlier, the websites remained unreachable for a few hours. With so much time needed to identify the issue and re-establish connectivity, the organization suffered huge losses on the millions of dollars spent on the promotional campaign.


Troubleshooting: Given the situation, all thoughts pointed to a core router failure or a DoS attack. After checking and confirming that all critical devices were ‘Up’, the next assumption was that the network was the victim of a DoS attack. But with no traffic flood visible on the network, the root cause had to be something else. After hours of troubleshooting and individually checking core and edge device configurations, it was finally found that the WAN router had a wrong configuration. The admin who made the configuration change, instead of blocking access to a specific internal IP subnet on port 80, ended up blocking port 80 for a wider subnet that also included the public Web servers. This completely cut off Web server connectivity to inbound traffic, a typo that cost the company millions!


Real-world scenario 2: Poor VoIP Performance, Hours of Deployment Efforts Wasted


A large trading company uses voice and video for inter-branch and customer communication. To prioritize voice and video traffic and ensure quality at all times, QoS policies are configured across all edge devices over a weekend. However, following the change, the VoIP application begins to experience very poor performance.

Troubleshooting: QoS monitoring suggests that VoIP and video have been allocated lower priority than required. Instead of marking VoIP traffic with EF (Expedited Forwarding) priority, the administrator ended up marking VoIP packets with DF (default forwarding), resulting in poor performance of VoIP and video traffic. Correcting the VoIP marking to EF on all edge devices meant many more hours of poor performance and lost business.


Remediation


The network downtime in the above two cases could have been avoided via simple change notification and approval systems.


In the first case, notifying other stakeholders about the change would have helped correlate and identify the recent change as a possible cause of the issue. Troubleshooting would have been faster and normalcy restored by quickly rolling back the erroneous change.


In the second case, a huge change involving critical edge devices should have gone through an approval process. Having the configuration approved by a senior administrator before deployment can help identify and prevent errors that can bring the network down.


Both cases reflect poorly on the administrators. Bringing down the network was clearly not intentional!


Human errors are expected to occur in daily network administration. However, considering the impact a bad change can have on both the company and the person, it’s imperative to put NCCM processes in place. To reduce human error and network downtime, use a tool that supports NCCM processes such as change notification and approvals.
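Even without a full NCCM product, the notification half of the idea is easy to picture: snapshot the running config, diff it against the last approved baseline, and flag any difference for review. The file paths below are placeholders; a real tool adds scheduled config pulls, approval workflows, and rollback.

```python
import difflib
from pathlib import Path

# Placeholder snapshot files; an NCCM tool would pull these from the
# devices on a schedule and route the diff through an approval workflow.
baseline = Path("configs/wan-router.approved.cfg").read_text().splitlines()
current = Path("configs/wan-router.running.cfg").read_text().splitlines()

diff = list(difflib.unified_diff(
    baseline, current, fromfile="approved", tofile="running", lineterm=""))

if diff:
    print("Unapproved change detected on wan-router:")
    print("\n".join(diff))
    # ...notify a senior admin / open a change ticket here...
else:
    print("Running config matches the approved baseline.")
```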


Check out this paper for more tips on reducing configuration errors in your network.


It's amazing,


I used to think my network was running pretty well: a few hiccups now and then, but by and large I got by and thought everything was business as usual. The boss would ask me, "Craig, how is the network today?" and I would say, "It's fine, boss." Flash ahead to fiscal year's end: there is some money left in the budget, and I am given fifteen minutes to decide what tools to buy, otherwise the money doesn't get spent. Most people wouldn't be prepared for this, but I was ready. I pulled out my wish list and said I want SolarWinds Orion NPM. After a few frantic calls to the vendor of choice, it was all set. I'm thinking, this is great that I got this, but I probably won't use it that much, because we have no problems... and it's going to take forever to install, and the learning curve will be immense. In reality, the software was really easy and intuitive to install: point here, click here, answer a few questions, and it was done. But why was there so much red everywhere? This must be a software bug, because my network has no problems... I spent the rest of the day tweaking and twiddling about, and I have to say, it was like turning on the light in a dark room. I was able to solve a lot of long-standing problems, some of which I didn't even know I had.

 

There was the switch with a bad blade that always had intermittent problems but never failed outright, so no alarm was ever tripped. After being alerted to this, I had the blade replaced and things began to run cleanly. Sometimes the network was slow, but I could never attribute it to any single cause; it usually coincided with a home game at the local ballpark. It turns out a lot of non-work-related web streaming was going on, and some other folks were enjoying Netflix.

There was the router with redundant power supplies that went down: no one ever noticed when the first supply failed, but people sure noticed when the second one did. I set up an alert to monitor this and several other things. The major cost of IT where I work is not so much the hardware or software; it's the cost of scheduling time with the union, paying for the lift truck, and so on. The logistics were mind-boggling, and nobody wanted downtime. I am now able to easily automate and monitor my network, and do a lot more proactive monitoring and forecasting. I am just as busy as I was before; the difference is that now I have a better view of what is going on with the network, and I can act proactively instead of reactively. I have a lot less stress. I have lost fifty pounds and I have a corner office... lol, just kidding... but I do get to sleep through the weekends without the pager going off at 3am, and I still go to the same number of meetings, but now they are more about future planning instead of postmortems.

 

What about you guys? Can anyone share a general process of things you might monitor and proactively forecast? Any tips and tricks pertaining to procedure are greatly appreciated!

I am now a believer in network performance management. It has really paid for itself many times over.
