Geek Speak

Career management is one of my favorite topics to write and talk about, because I can directly help people. Something I notice as a consultant going into many organizations is that many IT professionals aren't thinking proactively about their careers, especially those who work in support roles (supporting an underlying business rather than directly contributing to revenue, as a consulting firm or software development organization would). One key thing to think about is how your job role fits into your organization. That is a cold, hard fact that took me a while to figure out.

 

Let's use myself as an example. I was a DBA at a $5B/yr medical device company that didn't have tremendous dependencies on data or databases. The company needed someone in my slot, but frankly it did not matter how good that person was at their job beyond a certain point; any competent admin would have sufficed. I knew there was a pretty low ceiling on how far my salary and personal success could go at that company. So I moved to a very large cable company. They weren't a technology company per se, but they were a large enough organization that high-level technologist roles were available, and I got onto a cross-platform architecture team that was treated really well.

 

I see a lot of tweets from folks who seem frustrated in their regular jobs. The unemployment rate in database roles is exceedingly low, especially for folks like you who are actively reading and staying on top of technology. Don't be scared to explore the job market; you might be pleasantly surprised.

Guys, I'm really excited to be part of the Thwack community.

 

My first post, "THERE MUST BE A BETTER WAY TO MANAGE VIRTUALIZED SYSTEMS," reached more than 2,550 people, and my second post reached over 800, plus a bunch of people who actively participated. This is a great sign and shows that you all enjoy being part of this community.

 

In my previous posts, we covered how you are managing your virtualized systems and which features you are using most.

 

In today's post, I would like to discuss one particular part of your virtual infrastructure: your Virtual Desktop Infrastructure (VDI). Common questions I come across are: How do I right-size my VDI infrastructure? How many IOPS will my users generate? Should I use VMware View ThinApp or Citrix XenApp? I found the VDI Calculator by Andre Leibovici very helpful.

 

[Screenshot: VDI Calculator]

 

Additionally, I found LoginVSI to be a great tool for VDI storage benchmarking and for finding out how many VMs you can actually host on your system. I doubt that many of you are using it, since it isn't cheap, but if you are using it or have used it in the past, you know what I'm talking about. This tool fully simulates real VDI workloads, not just some artificial vdbench/sqlio/fio load. Also, VMware View Planner is supposed to be a great tool for benchmarking and right-sizing your environment, but I haven't touched it just yet. Have you?

The last tool in my VDI repository is one created by the VMware Technical Marketing Group: the VMware OS Optimization Tool. I am not going into too much detail here; just click on the VMware OS Optimization Tool link and read my blog post about it. It is a great tool that can be used to create your golden VDI image.

 

If you know of other useful VDI tools, or you have used the tools I've mentioned above, please comment and share your experience with us. Let's make this post a great resource for all VDI admins out there.

Fault Management (FM) and Performance Management (PM) are two important elements of OAM (Operations, Administration, and Maintenance) in layer 2 and layer 3 networks.

 

FM covers fault management related to the connectivity and communication of end stations, while PM involves monitoring the performance of a link using statistics such as packet loss, latency, and delay variation (also called jitter).

 

Here we need to differentiate between layer 2 and layer 3 networks.

 

For layer 2 networks, FM is usually done using continuity check messages (CCMs) as defined in IEEE 802.1ag, while PM is done using ITU-T Y.1731, which can measure all of the parameters mentioned above.

 

For layer 3 networks, ping and traceroute are the primary FM tools and by far the most widely used tools for troubleshooting, while IP SLA is one of the PM tools available on Cisco devices. IP SLA can monitor all of these stats, including loss, latency, and delay variation, at the IP layer (it can also do so at layer 2), in addition to helpful VoIP stats like MOS scores. (Note that Cisco uses the term IP SLA for both layer 3 and layer 2 links, even though the layer 2 stats are measured at the Ethernet layer.)

 

Coming from a carrier Ethernet background in my last job, I can say, looking back, that these tools, especially the PM tools at layer 2, were not used very often. That may be because many people were not aware of them, or because the pass/fail thresholds for performance measurements were not well defined. Recently, the Metro Ethernet Forum (MEF) has done a great job of standardizing thresholds and limits for jitter, delay, and packet loss. As a result, PM tools have started gaining acceptance industry-wide and are being rolled out more actively in layer 2 service provider networks.

 

However, I am quite curious about how often OAM tools are used in IP networks.

 

Fault management tools like ping and traceroute are the bread and butter of an IP engineer when it comes to troubleshooting networks, but I am especially interested in learning more about IP SLA and how it is used in your networks.

 

So my questions to you would be:

 

  • How often do you use IP SLA (or any similar tool) in your network? Do you use it for specific applications like VoIP?
  • Do you use it for both layer 2 and layer 3 networks, and in enterprise as well as service provider environments?
  • Are the PM thresholds (delay, jitter, and packet loss) well defined, either by Cisco or by any standards body?

 

Would love to hear your opinion here!

cxi

Trust... But Verify

Posted by cxi May 18, 2015

Let me start by saying wow, and thank you to everyone who maintains such high activity in this community. While I may occasionally share some jibber-jabber with all of you, you are the real champions of this community, and I cannot thank you enough for your contributions, feedback, and more!

 

This leads me to my segment this week... one where I welcome your contributions, as always...

 

Trust... But Verify!

 


 

This line of thought isn't limited to authentication, but authentication certainly shines as a major element of a trust model.

How many times are we put in the position of: "Oh yeah, it's all good, no one has access to our systems without two-factor authentication!" "What about service accounts?" "...crickets."

 

I've been there. My account to log in and look at files, personal email, etc. has such a high level of restraint and restriction that it requires everything under the sun: username, password, secret PIN, blood sample, DNA matrix... Yet the admins themselves, either directly through elevated accounts or indirectly through service accounts and other credentials, are 'secured' by a simple password. "Oh, we can't change the password on that account because it takes too long, so it goes unchanged for 60, 90, 180 days... or never?"

 

Now, not every organization operates this way. I remember having tokens back in the 90s for authentication and connectivity to Unix systems, but that kind of setup is truly few and far between.

 

I won't even go into the model whereby people 'verify' and 'validate' the individual who is hired to protect and operate the network, as that's very much outside the scope of this little blog, but it begs the question... how far do we go?

 

What do you feel is an appropriate authentication strategy? One factor (a password), two factors (a password plus something else), or an even more complex mix of multiple methods?

And forget what we 'think' should happen; tell us what you actually see implemented.

 

What do you prefer? Love, Hate, Other!



 

FEED ME SEYMOUR! That's what I hear when anyone jumps to the conclusion that the database needs more CPU, memory, or faster disks. Why? Because I'm a DBA who has seen this too many times. The database seems to be the bottleneck, and there's no shortage of people suggesting more system resources. I usually caution them: don't mistake being busy with being productive! Are we sure the workload is cleverly organized? Are we applying smart leverage with proper indexing, partitioning, and carefully designed access patterns, or are we simply pouring data into a container, doing brute-force heavy lifting, and running into concurrency issues?

 

Now, if you're a SysAdmin, you might be thinking: this is DBA stuff, why do I care? The reason you should care is that I've seen, too many times, resources get added only to produce a bigger monster!

 

So for crying out loud, please don't feed the monsters.  THEY BITE!

 

To demonstrate my point, I'll tell you about an interesting discovery I explored with Tom LaRock AKA @SQLRockstar while creating demo data for the release of Database Performance Analyzer 9.2. I set out to do the opposite of my usual job. I set out to create problems instead of solving them.  It was fun!  :-)

[Screenshots: Database Performance Analyzer wait types and LATCH_EX tip]

My primary goal was to generate specific wait types for:

  1. Memory/CPU - Not a real wait type. This goes in the wait type field when a session is working rather than waiting, because the only thing it is "waiting" on is the CPU and memory to complete whatever task it was given.
  2. ASYNC_NETWORK_IO - Ironically, this is seldom truly a network problem, but it could be and may be interesting to a SysAdmin.
  3. PAGEIOLATCH_XX - These are significant signs that you're waiting on storage.
  4. LCK_M_X - This is a locking wait type and locking can harm performance in ways that adding system resources can't help.

 

I knew that table scans cause all sorts of pressure, so I created one process that used explicit transactions to insert batches into a table in a loop, while four processes ran SELECT queries in infinite loops. To maximize the pain, I ensured that they would always force full table scans on the same table by using a LIKE comparison in the WHERE clause against a string with wildcards; there's no index in the world that can help that! In each pass of their respective loops, the readers wait a different amount of time between table scans: 1, 2, 3, and 4 seconds respectively. Three of the processes use the NOLOCK hint while one does not. This created a pattern of alternating conflicts for the database to resolve.
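
As a rough illustration, here is a minimal sketch of what one of those reader loops might look like in T-SQL; the table and column names (dbo.DemoNotes, NoteText) are hypothetical stand-ins, not the actual demo schema:

WHILE 1 = 1
BEGIN
    -- The leading wildcard defeats any index and forces a full table scan.
    SELECT COUNT(*)
    FROM dbo.DemoNotes WITH (NOLOCK)
    WHERE NoteText LIKE '%monster%';

    -- Each reader pauses a different interval (1 to 4 seconds) between scans.
    WAITFOR DELAY '00:00:02';
END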


So I got the wait types I targeted, but LATCH_EX just sort of happened. And I'm glad it did, because I also noticed how many signal waits I'd generated while the CPU was only at 50%. If signal waits accounting for more than 20% of our waits is cause for concern (and it is), then why does the server say the CPU is only around 50% utilized? I found very little online to explain this directly, so I couldn't help myself. I dug in!
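
As a point of reference, the overall signal-wait ratio can be checked with a quick query against sys.dm_os_wait_stats, along these lines (a more careful version would filter out benign idle wait types first):

SELECT SUM(signal_wait_time_ms) AS signal_wait_ms,
       SUM(wait_time_ms - signal_wait_time_ms) AS resource_wait_ms,
       100.0 * SUM(signal_wait_time_ms) / SUM(wait_time_ms) AS signal_wait_pct
FROM sys.dm_os_wait_stats;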


My first suspect was the LATCH_EX waits, because I'd produced an abundance of them and the queries generating them had the most wait time. But I wasn't sure why this would cause signal waits, because having high signal waits is like having more customers calling in than staff to answer the phones. I really didn't have much running, so I was puzzled.

 

The theory I developed was that significant LATCH_EX contention might require SQL Server to spawn additional threads to manage the overhead, which might contribute to signal waits. So I asked some colleagues with lots of SQL Server experience and connections to other experienced SQL Server pros. One of them had a contact deep within Microsoft who was able to say with confidence that my guess was wrong. Back to the guessing game…



With my first hypothesis dead on arrival, I turned back to Google to brush up on LATCH_EX. I found this Stack Exchange post, where the chosen correct answer stated that,

 

There are many reasons that can lead to exhaustion of worker threads:

  • Extensive long blocking chains causing SQL Server to run out of worker threads.
  • Extensive parallelism also leading to exhaustion of worker threads.
  • Extensive wait for any type of "lock" - spinlocks, latches. An orphaned spinlock is an example.


Well, I didn't have any long blocking chains, and I didn't see any CXPACKET waits. But I did see latches! So I developed hope that I wasn't crazy about this connection between latches and signal waits. I kept searching…

I found this sqlserverfaq.net link. It provided the query I used to identify my latch wait class, which was ACCESS_METHODS_DATASET_PARENT. It also broke latches down into three categories and identified that mine was a non-buffer latch. So I had a new keyword and a new search phrase: ACCESS_METHODS_DATASET_PARENT and "non-buffer latch".


SELECT latch_class,
       wait_time_ms / 1000.0 AS [Wait In sec],
       waiting_requests_count AS [Count of wait],
       100.0 * wait_time_ms / SUM(wait_time_ms) OVER() AS Percentage
FROM sys.dm_os_latch_stats
WHERE latch_class NOT IN ('BUFFER')
AND wait_time_ms > 0

Then I found this MSDN post. About halfway in, the author writes this about ACCESS_METHODS_DATASET_PARENT: "Although CXPacket waits are perhaps not our main concern, knowing our main latch class is used to synchronize child dataset access to the parent dataset during parallel operations, we can see we are facing a parallelism issue".


Then I found another blog post not only supporting the new theory, but also referencing a post by Paul Randal of SQLskills.com, one of the most reputable organizations when it comes to SQL Server performance. It states: "ACCESS_METHODS_DATASET_PARENT... This particular wait is created by parallelism...."


And for the icing on the cake, I found this tweet from SQLskills.com.  It may have been posted by Paul Randal himself.



So now I know that LATCH_EX shows up when SQL Server parallelizes table scans. Instead of one thread doing a table scan, I had several threads working together on each scan, and it started to make sense. I had ruled out parallelism because I didn't see any CXPACKET waits, which many DBAs think of as THE parallelism wait. Now THIS DBA (me) knows it's not the only parallelism wait! #LearnSomethingNewEveryDay
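
One quick way to test that explanation, sketched here against the same hypothetical table as above, is to force one of the readers to run serially; if the ACCESS_METHODS_DATASET_PARENT latch waits disappear while CXPACKET stays absent, parallel scans really were the source:

SELECT COUNT(*)
FROM dbo.DemoNotes WITH (NOLOCK)
WHERE NoteText LIKE '%monster%'
OPTION (MAXDOP 1);  -- run this scan on a single thread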

 

So now I feel confident I can explain how an abundance of LATCH_EX waits can result in high CPU signal waits.  But I'm still left wondering why signal waits can be over 20% and the CPU is only showing 50% utilization.  I'd like to tell you that I have an answer, or even a theory, but for now, I have a couple of hypotheses.

 

  1. It may be similar to comparing bandwidth and latency. Server CPU utilization is like bandwidth (how much work can be done versus what is getting done), while signal waits are like latency (how long a piece of work waits before it begins). Both contribute to throughput, but in very different ways. If this is true, then perhaps the CPU workload for a query with LATCH_EX waits is not so much hard work as it is time-consuming and annoying, like answering kids in the back seat who continually ask, "Are we there yet?" Not hard, just annoying enough to make me miss my exit.

  2. It may simply be that I had so little load on the server that the small amount of signal wait accounted for a larger percentage of the total. In other words, I may have had 8 threads experiencing signal wait at any time; not a lot, but 8 of 35 threads is over 20%. In that case, "these are not the droids I was looking for."

 

Maybe you have a hypothesis?  Or maybe you know that one or both of mine are wrong.  I welcome the discussion and I think other Thwack users would love to hear from you as well.


 

 

Related resources:

 

Article: Hardware or code? SQL Server Performance Examined — Most database performance issues result not from hardware constraints, but rather from poorly written queries and inefficiently designed indexes. In this article, database experts share their thoughts on the true cause of most database performance issues.

 

Whitepaper: Stop Throwing Hardware at SQL Server Performance — In this paper, Microsoft MVP Jason Strate and colleagues from Pragmatic Works discuss some ways to identify and improve performance problems without adding new CPUs, memory or storage.

 

Infographic: 8 Tips for Faster SQL Server Performance — Learn 8 things you can do to speed SQL Server performance without provisioning new hardware.

 


In a network, whether small or large, spread over one location or many, there are network administrators, system administrators, or network engineers who frequently access the IP address store. While many organizations still use spreadsheets, database programs, and other manual methods for IP address management, the same document or software is accessed and updated by multiple people. Network administrators take on the role of assigning IPs in small networks, as well as when they add new network devices or reconfigure existing ones. The system administrator takes care of assigning IPs to new users who join the network and adding new devices like printers, servers, VMs, and DHCP and DNS services. Larger networks spread over multiple locations sometimes have a dedicated person assigned specifically to manage planning, provisioning, and allocation of IP space for the organization, who also takes care of research, design, and deployment of IPv6 in the network. Delegating IP management tasks to specific groups based on expertise or operations (the network and systems teams) allows teams to work independently of each other and meet IP requirements faster.

 

Then again, if the central IP address repository is maintained by a single person, the problem becomes the delay in meeting these IP address requests. Furthermore, there are the human errors, and the grievances from teams stuck waiting, unable to complete their tasks.

 

What Could Go Wrong When Multiple Users Access the Same Spreadsheet?


Spreadsheets are an easily available and inexpensive option for maintaining IP address data. But they come with their own downsides when multiple users access the same spreadsheet. Typically, users save a copy to their local drive, and then finding the most recently updated version becomes another task! You end up with multiple worksheets, each with different data, and there is no way to track who changed what. Ultimately, this means no accountability for misassignments or IP changes.

 

In short, this method is bound to produce errors and obsolete data, and it lacks security controls. There will be situations where an administrator changes the status of an IP address but forgets to communicate it to the team or person who handles DHCP or DNS services. In turn, the chances are higher that duplicate IP addresses get assigned to a large group of users, causing IP conflicts and downtime.

 

With all that said, the questions that remain are: Can organizations afford the network downtime? And are the dollars saved by not investing in a good IP address management solution more than those lost to reduced productivity? This post discussed the problems of using manual methods for IP address management. In my next blog, we will look at the associated issues and at best practices for roles and permissions that enable task delegation across teams.

 

Do you face similar difficulties with your IP administration? If yes, how are you tackling them?

vinod.mohan

Doing IT Remotely!

Posted by vinod.mohan May 14, 2015

Often, as organizations grow and expand, the job gets harder for IT teams. The IT infrastructure may become larger and more complicated, and be distributed across various sites and locations. The end-users to support could be onsite, offsite, or even on the road travelling. There may not be enough admins in all locations, and remote IT management becomes essential.

  

Even in smaller businesses and start-ups where office space and IT infrastructure is not quite ready yet, and employees are telecommuting from home and elsewhere, the need for remote IT surfaces. A single IT pro wearing a dozen different IT hats will have to make do and support end-users wherever they may be.

  

Remote IT is generally defined in different ways by solution providers based on the solution they offer. In this blog, I am attempting to cover as many scenarios as possible that could be called remote IT.

 

SO, WHAT IS REMOTE IT?

  • IT pros in one location managing the infrastructure (network, systems, security, etc.) in a remote location
  • IT pros in one location supporting end-users in a remote location
  • IT pros within the network supporting end-users outside the network
  • IT pros monitoring and troubleshooting infrastructure issues while on the go, on vacation, or after office hours
  • Monitoring the health of remote servers, applications, and infrastructure in the cloud
  • Remote monitoring and management (RMM) used by IT service providers to manage the IT infrastructure of their clients
  • User experience monitoring of websites and web applications, both real user monitoring and synthetic user monitoring
  • Site-to-site WAN monitoring to track the performance of devices from the perspective of remote locations
  • Mobile device management (MDM) policies that include remote wiping of data on lost or stolen BYOD devices containing confidential corporate information

 

This may not be a comprehensive list. Please do add, in the comments below, what else you think fits in the realm of remote IT.

 

The primary need behind remote IT is that, without having to physically visit a remote site or user, we have to make IT work: monitor performance, diagnose faults, troubleshoot issues, support end-users, and so on. And this should be done in a way that is both cost-effective and effective in its results for the business.

 

Just as we need a phone or a computer (a tool, basically) to communicate with a person situated remotely, making remote IT work comes down to using remote IT tools. When you're equipped with the right tools and gear to manage IT remotely, you gain greater control and simplicity to work your IT mojo wherever the infrastructure is, wherever the user is, or wherever you, the IT pro, are.

 

Also, share with us what tools you use for doing IT remotely.

In my last post, "THERE MUST BE A BETTER WAY TO MANAGE VIRTUALIZED SYSTEMS," we talked about what systems are out there and which ones everyone is using. Ecklerwr1 posted a nice chart from VMware that compares VMware vRealize Operations to SolarWinds Virtualization Manager and a few others.


[Chart: comparison of VMware vRealize Operations, SolarWinds Virtualization Manager, and others]

 

Based on the discussion, it seems like many people are using some kind of software to get things sorted in their virtual environment. In my previous job, I was responsible for parts of the lab infrastructure. We hosted 100+ VMs for customer support, so our employees could reproduce customer issues or use them for training.

 

While managing the lab and making sure we always had enough resources available, I found it difficult to identify which VMs were actively being used and which had been idle for some time. Another day-to-day activity was hunting down snapshots that consumed a massive amount of space.

Back then, we wrote some vSphere CLI scripts to get the job done. Not really efficient, but done. Using SolarWinds Virtualization Manager now, I see how easy my life could have been.

 

My favorite features are the ability to view idle VMs and to monitor VM snapshot disk usage. Both features could have saved me many hours in my previous job.

I am curious to know which features save you time on a regular basis. Or are there features we are all missing but just don't know it yet? As Jfrazier mentioned, maybe virtual reality glasses?

If you are an Oracle DBA reading this, I am assuming all of your instances run on *nix and you are a shell scripting ninja. For my good friends in the SQL Server community: if you haven't gotten up to speed on PowerShell, you really need to this time. Last week, Microsoft introduced the latest preview of Windows Server 2016, and it does not come with a GUI by default. It's not a matter of clicking one thing to get a GUI; it's more like running through a complex set of steps on each server before you eventually get a graphical interface. Additionally, Microsoft has introduced an extremely minimal server OS called Nano Server, which will be ideal for high-performing workloads that want to minimize OS resources.

 

One other thing to consider is automation and cloud computing: if you live in a Microsoft shop, this is all done through PowerShell, or maybe DOS (yes, some of us still use DOS for certain tasks). So my question for you is: how are you learning scripting? In a smaller shop the opportunities can be limited; I highly recommend the Scripting Guy's blog. Also, doing small local operating system tasks via the command line is a great way to get started.

I was watching a recent webcast titled "Protecting AD Domain Admins with Logon Restrictions and Windows Security Log" with Randy Franklin Smith, where he talked about (and demonstrated) at length techniques for protecting and keeping an eye on admin credential usage. As he rightfully pointed out, no matter how many policies and compensating controls you put into place, at some point you really are trusting your fellow IT admins to do their job, and no more than their job, with the level of access we grant and entrust to them.

 

However, there's a huge catch-22: as an IT admin, I want to know you trust me to do my job, but I also have a level of access that could do real damage (like the San Francisco admin who changed critical device passwords before he left). On top of that, the tools that help me and my fellow admins do our jobs can be turned into tools that help attackers access my network, like the jump box in Randy's example from the webcast.

 

Now that I’ve got you all paranoid about your fellow admins (which is part of my job responsibilities as a security person), let’s talk techniques. The name of the game is: “trust, but verify.”

 

  1. Separation of duties: a classic technique which really sets you up for success down the road. Use dedicated domain admin/root access accounts separate from your normal everyday logon. In addition, use jump boxes and portals rather than flat out providing remote access to sensitive resources.
  2. Change management: our recent survey of federal IT admins showed that the more senior you are, the more you crave change management. Use maintenance windows, create and enforce change approval processes, and leave a “paper” trail of what’s changing.
  3. Monitor, monitor, monitor: here’s your opportunity to “verify.” You’ve got event and system logs, use them! Watch for potential misuse of your separation of duties (accidental OR malicious), unexpected access to your privileged accounts, maintenance outside of expected windows, and changes performed that don’t follow procedure.

 

The age-old battle of security vs. ease of use rages on, but in the real world, it's crucial to find a middle ground that helps us get our jobs done while still respecting the risks at hand.

 

How do you handle the challenge of dealing with admin privileges in your environment?

 

Recommended Resources

 

REVIEW - UltimateWindowsSecurity review of Log & Event Manager by Randy Franklin Smith

 

VIDEO – Actively Defending Your Network with SolarWinds Log & Event Manager

 

RECOMMENDED DOWNLOAD – Log & Event Manager

In previous blog posts, I talked about thin provisioning, approaches for moving from fat to thin, and the practice of over-committing: how they work, their advantages and disadvantages, methodology, drawbacks, and so on. I also talked about constant monitoring of your storage as the solution to many of those drawbacks. This article talks about how to apply a storage monitoring tool to your infrastructure to monitor your storage devices; when you select a tool, make sure you select one that has alerting options too. I will walk you through SolarWinds Storage Resource Monitor (SRM for short), one such storage monitoring tool, and along the way I will talk about the features any storage monitoring tool needs in order to overcome the weaknesses of thin provisioning.

 

Introduction to SRM:

SRM is SolarWinds' storage monitoring product. It monitors, reports, and alerts on SAN and NAS devices from vendors like Dell, EMC, NetApp, and so on (for a detailed list, check here). In addition, SRM helps you manage and troubleshoot storage performance and capacity problems.

You can download SRM from the link below:

Storage Resource Monitor

Once you have installed SRM, you will need to add your storage devices. The steps differ by vendor; visit the page below for instructions on adding storage devices from different vendors.

How to add storage devices

 

Once you have installed SRM and added your storage devices, you will have instant visibility into all storage layers, extending to virtualization and applications with the Application Stack Environment Dashboard. With SRM, troubleshooting storage problems across your application infrastructure is a cakewalk. Let's start with SRM's dashboard.

 

[Screenshot: SRM dashboard]

 

The dashboard gives you a bird's-eye view of any issues in your storage infrastructure. It displays all storage devices monitored by SRM, classified by product, along with the status of each storage layer, such as storage arrays, storage pools, and LUNs.

 

SRM and Thin Provisioning:

Moving on to thin provisioning: SRM allows you to manage thin-provisioned LUNs more effectively. When thin provisioning is managed and monitored accurately, over-provisioning (over-committing) can be done efficiently. SRM helps you view, analyze, and plan thin provisioning deployments by collecting and reporting detailed information about virtual disks, so you can manage the level of over-commitment on your datastores.

 

[Screenshot: thin-provisioned LUNs resource]

 

This resource presents a grid of all LUNs using thin provisioning in the environment.

The columns are:

  • LUN: Shows the name of the LUN and its status
  • Storage Pool: Shows which storage pool the LUN belongs to
  • Associated Endpoint: The server volume or the datastore using the LUN
  • Total Size: The total user size of the LUN
  • Provisioned Capacity: Amount of capacity currently provisioned

There are also columns that show the provisioned percentage, File System Used Capacity, and File System Used Capacity percentage for each LUN.

 

A tooltip appears when you hover over a LUN or storage pool, giving you a quick snapshot of performance and capacity so you can decide whether to take action. When you hover over a storage pool, the tooltip also shows the pool's usable capacity summary: total usable capacity (the amount of storage capacity a user can actually consume), remaining capacity (the storage still available to be occupied), and over-subscribed capacity (how far the pool's provisioned capacity exceeds its usable capacity).
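
As an illustrative example (the numbers are made up): if a pool has 10 TB of usable capacity and 14 TB has been provisioned to thin LUNs, the pool is over-subscribed by 4 TB (140% subscribed), and the remaining capacity is whatever portion of the 10 TB the LUNs have not yet physically consumed.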

 

[Screenshot: storage pool tooltip]

 

Drilling down into a specific storage pool presents the important key/value details for that pool, including detailed information on:

  • Total Usable Capacity
  • Total Subscribed Capacity
  • Over-Subscribed Capacity
  • Provisioned Capacity
  • Projected Run-Out Time: the approximate time it will take to fully utilize this storage pool

 

[Screenshot: storage pool drill-down]

 

In addition, Active Alerts displays the alerts related to this storage pool: the alert name, a short alert message, the name of the LUN for which the alert was triggered, and the time it fired.

Learn how to create an alert in SRM.

 

Alerting helps proactive monitoring:

Storage performance issues can happen at any time, and you cannot literally watch how your storage is performing every second. This is why you need alerts: they warn you before a problem occurs. By setting up alerts based on well-chosen criteria, you gain continuous visibility into your storage. Set up alerts that anticipate the particular situations that can cause storage performance issues.

 

[Screenshot: All Active Alerts]

 

Below is a list of example alerts you can use for LUNs when thin provisioning:

  • Alert when usable space in the LUN goes below a particular percentage (e.g., 20%)
  • Alert when usable space in a storage pool goes below a particular percentage
  • Alert when the storage pool's over-subscribed percentage goes higher than a particular value (e.g., 10%)

The percentage values can only be decided by you, as they will differ based on your infrastructure. Some shops can add more storage within days, whereas in many organizations it can take months to get approval for additional storage. Therefore, only you can set the right thresholds.

 

Once you have alerts in place, you can sit back, relax, and spend the time you used to spend watching thin provisioning and over-commitment on other endeavors.


Well, my last blog generated quite a lot of interest and discussion about the use of the CLI for box configuration.

 

As a follow-up, I want to write on a related topic. It may generate some difference of opinion here, but my goal is exactly that: a wider discussion on this topic.

 

OK, in my last post I said that the CLI is cumbersome: it takes a while to get used to, and the worst thing is that if something goes wrong, the troubleshooting can take ages.

I also said that technologies like NETCONF and YANG would make configuration easier and more intelligent, move the focus from box configuration to network configuration, and make configuration GUI-friendly.

 

I want to bring a new dimension to this discussion.

 

Let’s see if Cisco would really like to give you a better user interface and a better configuration tool.

 

Although I write Cisco here, it could be any vendor that offers a CLI experience, for example Juniper; I specifically mean any CLI that is proprietary and vendor-specific.

 

OK, to start with, let's agree that using the CLI is a skill, indeed an expert skill. This skill is required to configure a box and, additionally, to troubleshoot networking issues. Not only do you need to know how to move around the CLI, you should be able to do it with speed. Isn't that so?

 

This skill requires training and certification. If someone has an expert-level certification, it means they are not only intelligent but also a command guru. Correct?

 

Cisco certification is a big money-making industry. If not a billion dollars, it must generate hundreds of millions of dollars of revenue for Cisco (I contacted Cisco to get real figures, but it seems these are not public). Cisco makes money by charging for exams and selling training. Then there is a whole ecosystem of Cisco learning partners, through which Cisco makes money by bundling its products with training services and selling through them.

 

It costs money to get expert-level certifications. There is a cost if you pass, and there is more cost if you fail.

 

An engineer may end up spending thousands of dollars on training and exams. We are talking about huge profits for Cisco here, simply because of the popularity of certifications. There is one for everyone, from beginner to expert, from operations staff to architects.

 

Besides creating experts, Cisco is winning here from three angles:

 

  1. It gets its customers used to the CLI, since customers feel at home using the commands they are trained on.
  2. It creates loyal users and customers, who recommend the products they already know very well.
  3. It generates big revenue (and big margins, since it is a service).

 

For sure, it is a win-win for Cisco here.

 

From my perspective, therefore, a difficult-to-operate switch or router is in Cisco's direct interest: Cisco needs experts to run its products, and the experts need certifications.

Cisco, therefore, has little incentive to make networks easy to operate and configure. I have seen the GUI of one of Cisco's products, and it simply sucks. It seems to me it is not one of their focuses.

 

This raises an important question:

 

Why would Cisco take steps to make the network more programmable and easier to operate with newer tools, and take the CLI out of its central focus? Wouldn't it rather stick with difficult-to-operate products and keep making more money?

 

Would you agree with me?

 

I would like to hear from you whether you agree or disagree, and why.

 

UPDATE:

 

After publishing this article, I noticed the majority of comments focused only on CLI versus GUI. For sure a GUI is more user-friendly, but the CLI has done well mainly because, until now, it has had no good competition from either GUIs or SNMP.

 

However, the main message was about the "vendor-specific CLI," NOT the command line in general. In the programmable age, tools like NETCONF and YANG offer a standard way to configure network elements. Whether you drive them from a GUI or from a command line, the benefits far exceed those of a vendor CLI. NETCONF/YANG is a standard way to configure any vendor's equipment; it leaves it to the vendor to determine how, and in what order, configuration instructions are applied within their devices. This puts pressure on vendors to do additional development so their products execute the user's configuration in whatever order it was given, and it removes the pressure on users to learn configuration for multiple vendors and multiple CLIs. This is the future, NOT the CLI.

cxi

The IT Approach to Security

Posted by cxi May 11, 2015

Hello again! Welcome to my next installment, featuring various slides I've stolen from my own presentations delivered at conferences.

If you read last week's installment, Checkbox vs. Checkbook Security, you probably know by now that security is an area that is personally important to me.

 

With that said, let's dive a little deeper into what is often the IT Approach to security...


How many times have you heard someone say, "I'm not a big enough target"? Heck, maybe you've even heard yourself say it.

Certainly, we live in a heavily targeted world: one where threat actors strike to stop you from publishing what was otherwise a horrible movie (Sony), where credit card and customer data are stolen for monetary gain or other uses (JPMC/Chase), and where hundreds of millions of dollars are stolen from hundreds of banks (too many sources to count).

 

Against that landscape, sure, it is easy to think, "I'm not a big enough target, why would anyone bother with me?"

 

Let's not forget, though, that the security landscape is not hard and fast... attack scripts and threat engines are indiscriminate in their assault at times. A perfect example comes from the old war-dialing days: just as we'd dial entire banks of phone numbers looking for modems to connect to, there are attackers who will cycle through entire IP ranges trying to exploit the latest zero-day on the horizon. Most WordPress sites that get hacked on a regular basis are hacked not because they were targeted, but because they were vulnerable.

 

Or, if this analogy helps: more people are likely to take something from a car with its windows open or its top down than from one that is all locked up.

 

 

What is it that makes us, irrespective of size, a target?

[Screenshot: threat map]

I included this image here from my own threatmap to give you a sense of just what kinds of things can and do happen.

So the question arises: what exactly makes something 'targetable'?


You are a target if:


  • You are connected to a network
  • You run a service that is accessible via a network protocol (TCP, IP, UDP, ICMP, Token Ring... ;))
  • You run an application, server, or service that has a vulnerability in it, whether known or unknown
    • I just want to mention for a moment... Shellshock, the Bash vulnerability disclosed on 24 Sep 2014, had been exploitable since September 1989; just food for thought

 

So you're pretty much a target if you... exist, right? Wow, that leaves us all warm and fuzzy, I imagine...

But it doesn't have to be that way! You don't have to run in terror and shut everything down for fear of being hacked. In the same breath, though, we need not stick our heads in the sand, assuming we are invincible and invulnerable because no one would ever attack us, or steal our data, or whatever other lies we tell ourselves to sleep at night.

 

Do you see a future with fewer zero-day attacks, or more critical ones: some that existed for 25 years before being discovered (à la Shellshock), and some introduced in the recent past, such as Heartbleed?


You know I love your insight! So you tell me... how are you a target, or NOT a target? What other ways do you see people being targeted? (I haven't even touched the mobile landscape...)

 

I look forward to your thoughts on this matter Thwack Community!

Microsoft Ignite 2015 concluded its inaugural event with 20,000+ attendees. The SolarWinds team united in the Windy City, Chicago, to share the single point of truth in IT monitoring for the continuous delivery and integration era with Ignite attendees. SolarWinds also teamed up with Lifeboat Distribution to host a Partner Meet and Greet during Microsoft Ignite at Chicago's Smith & Wollensky, covering steaks and application stack management. Ignite lit up IT from start to finish.

 

Microsoft Announcements at Ignite

Microsoft made plenty of announcements, and they've been covered extensively, especially on Microsoft's Channel 9 program. The announcements revolved around SoCoMo (social, cloud, and mobility), with Office 365, Azure, and the Windows OS taking front-and-center roles. Edge beat out the Project Spartan name by a brow...ser to become Internet Explorer's named successor. Other notable news included Windows 10 being the last version of Windows, and showcase demos of some of the "software defined" roles of Windows Server 2016, aka Windows Server Technical Preview 2, especially Active Directory, Docker containers, RMS, and Hyper-V. And there was something about Office 365 and its E3 subscription, which includes the core Office application suite plus cloud-based Exchange, SharePoint, and Skype for Business. Exchange, SharePoint, and Unified Communications admins were put on notice, and the consensus was that they have to broaden and deepen their skills in other areas, especially cloud.

 

From the Expo Floor

The SolarWinds booth saw non-stop traffic throughout Ignite. The conversations ranged from the latest and greatest Microsoft announcements to solutions two or three generations behind. Regardless of the environment, whether on-premises, colo, private/public cloud, or hybrid, it was clear that the application was on the minds of IT Ops, and that it required monitoring, along with database, security, log, and patch management. Conversations also included a healthy dose of the Dev side of the DevOps equation. And yes, Devs need monitoring as well; without baselines and trends, there can be no truth about what "good" should be.

 

Enjoy some of the moments from SolarWinds' Microsoft Ignite booth.

 

[Photos from the SolarWinds booth: booth, booth presentation, demo, swag bag, geeking out, and the thwack hammer]

 

Thank you Ignite

Thank you, Ignite attendees, for the conversations, from those of us who attended and represented the SolarWinds family! Fantastic job, SolarWinds team! See you next year.

[Photo: the SolarWinds booth staff]

Pictured: 1st row - Ryan Albert Donovan, Brian Flynn, Troy Lehman, Danielle Higgins, Aaron Searle, Wendy Abbott. 2nd row - Matthew Diotte, Kong Yang, Mario Gomez, Patrick Hubbard, Michael Thompson. 3rd row - Dan Balcauski, Karlo Zatylny, Cara Prystowsky, Ash Recksiedler. Not pictured (because of flight times): Thomas LaRock, Jennifer Kuvlesky, Jon Peters.

Leon Adato

Convention Season

Posted by Leon Adato Expert May 8, 2015

Convention season is upon us. I know that conventions happen throughout the year, but it seems like April is when things kick into high gear.

 

As anyone who has been in IT for more than a month can tell you, there are so many incredible opportunities to get out there and network, learn, and see what is heading down the pipeline. It can really be overwhelming both to the senses and the budget.

 

The Head Geeks try very hard to find opportunities to meet up with customers, fellow thwack-izens, and like-minded IT professionals. But, like you, we have only so many days in a month and dollars in the budget.

 

I took a quick poll of the other Geeks to find out:

 

  1. Which shows we are GOING to be attending this year.
  2. Which ones we know we SHOULD be attending, but can’t due to other constraints.
  3. Which ones we WISH we could attend, even if it’s a little off the beaten path.

 

Here’s what I’d like from you: In the comments, let us know which shows YOU are going to be attending, and which ones you would like to see US attend next year. That will help us justify our decisions (and budget!) and (hopefully) meet up with you!

 

Attending:

Tom: PASS Summit, VMworld, Ignite

Kong: MS Ignite, VMworld, SpiceWorld Austin, Philadelphia VMUG USERCON, and Carolina VMUG USERCON

Patrick: Cisco Live, Ignite

Leon: Cisco Live

 

Should Attend:

Tom: Spiceworks, VMworld (Barcelona)

Kong: “Are you insane?!?! Did you see what I’m already going to?”

Patrick: VMworld

Leon: Interop, Ignite, SpiceWorld

 

Wish We Could Attend:

Tom: SXSW, AWS re:Invent 2015

Kong: AWS re:Invent

Patrick: RSA, AWS re:Invent 2015

Leon: Interop, DefCon, RSA

 

Like I said, let us know in comments where YOU are going to be, and we’ll start to make plans to be there the next time around.
