
This post will be the final one in the series on Infra-As-Code, and I will keep it short, as we have covered a good bit over the past few weeks. I will only do a quick overview of the methodologies and processes we have covered over this series, and I want to make sure we finish by reinforcing what we have covered.

Remember, we will be adopting new ways of delivering services to our customers by taking a more programmatic approach, one that should also involve testing and development phases (something new to most, right?). At the initial phase of a new request, we should have an open discussion with everyone who should be involved for the duration of the new implementation. This open discussion should cover the specifics of a test plan and when a reasonable timeline should be reached. We should also be leveraging version control for our configurations and/or code, which we accomplish by using Git repos. To further benefit our version control, we should adopt a continuous integration/delivery solution such as Jenkins. By using a CI/CD solution we are able to automate the testing of our configuration changes and receive the results of those tests. This not only saves us manual tasks that can be somewhat time consuming, but also ensures consistency in our testing. When we are ready for sign-off to move into production, we should leverage a code-review system and peer review. Code review adds an additional layer of checks against the configuration changes we intend to make, and peer review gives us a follow-up open discussion covering the results of our testing and development phases with those who were involved in our initial open discussion. Once we have final sign-off and all parties are in agreement on the deliverables, we can leverage the same CI/CD solution as we did in our test/dev phases to ensure consistency.

Hopefully the series of posts over the past few weeks has been beneficial to others. I am hopeful that they have either confirmed your thoughts, confirmed your methodologies, or maybe even opened your eyes to the idea of heading down the Infrastructure as Code journey. With all of the content that has been covered, you may be wondering what the next steps are to continue this journey, and that is exactly what we will cover in this post.

Up until this post we have focused on the network being treated as code, versus the stagnant methodologies and processes under which networks continue to be treated as dark magic. We covered in the previous posts how we can begin leveraging newer methodologies and processes, as well as relearning our culture as an organization. If you follow those, as well as develop your own, you can begin extending these same principles into other areas of infrastructure. This could include application and server deployments, along with their respective lifecycle management. But maybe these specific areas have already been addressed and the network is the next phase of your journey. If that is the case, then you can continue to grow each specific area of infrastructure in order to get a complete overall strategy for each area. You should use this journey as a way to begin breaking down the silos between teams and come together as one in order to deliver services in a much more holistic manner. But if your focus is only network infrastructure, then you should continue to implement and practice the steps outlined in these posts, as well as any additional practices that you adopt in each implementation phase going forward.

Keeping all discussions and methodologies out in the open across teams will allow for the culture shift that can and should occur going forward. This will only strengthen the relationships that teams and organizations have in the future. And I realize that DevOps (see…I finally said it :)) is on many organizations' minds, but do not fall victim to the "we HAVE a DevOps team" mentality. DevOps is not a team, it is not automation, and it is not Infrastructure as Code; at its core, it is culture. Culture, together with the areas that we have touched on over this series, will only enhance all services being delivered as an organization. But even if the only thing your journey achieves is a more stable, repeatable, and consistent delivery method that allows you to reach the desired state, you will be in a much better place.

And with all of this, I will end this post for now; the final post in the series is up next, which will be an overview of what has been covered. Hope you have enjoyed this series, and keep the comments coming.

Over the past three posts we have covered what it means, how to start, and what is required to begin down the Infra-As-Code journey. Hopefully things have been making sense and you have found them to be of some use. Obviously the main goal is simply bringing awareness to what is involved and helping start the discussions around this journey. In the last post I also mentioned that I had created a Vagrant lab for learning and testing out some of the tooling. If you have not checked it out, you can do so by heading over to here. This lab is great for mocking up test scenarios and learning the methodologies involved.

 

In this post we will take what we have covered in the previous posts and mock up an example to see how the workflow might look. For our mock example we will be making configuration changes to a Cisco ACI network environment, using Ansible to bring the configuration to our desired state.

 

The details below show what our workflow looks like for this mock-up.

  • Developer – writes Ansible playbooks and submits the code to Gerrit
  • Gerrit – Git repository and code review (both master and dev branches)
  • Code reviewer – either signs off on changes or pushes them back
  • Jenkins – CI/CD – monitors the master/dev branches on the Git repository (Gerrit)
  • Jenkins – initiates the workflow when a change is detected on the master/dev branches

 

And below is what our mock-up example entails, starting from a newly received request.

 

Change request:

  • Create a new tenant for the example environment, which will consist of some web servers and DB servers. The web servers will need to communicate with the DB servers over tcp/1433 for MS SQL.
  • Bring all of the respective teams together to discuss the request in detail and identify each object which must be defined, configured, and made available for this request to be successful. (Below is what was gathered based on the open discussion.)
    • Tenant:
      • Name: Example1
      • Context name(s) (VRF):
        • Name: Example1-VRF
      • Bridge domains:
        • Name: Example1-BD
        • Subnet: 10.0.0.0/24
      • Application network profile:
        • Name: Example1-ANP
      • Filters:
        • Name: Example1-web-filter
          • Entries:
            • Name: Example1-web-filter-entry-80
              • Proto: tcp
              • Port: 80
            • Name: Example1-web-filter-entry-443
              • Proto: tcp
              • Port: 443
        • Name: Example1-db-filter
          • Entries:
            • Name: Example1-db-filter-entry-1433
              • Proto: tcp
              • Port: 1433
      • Contracts:
        • Name: Example1-web-contract
          • Filters:
            • Name: Example1-web-filter
          • Subjects:
            • Name: Example1-web-contract-subject
        • Name: Example1-db-contract
          • Filters:
            • Name: Example1-db-filter
          • Subjects:
            • Name: Example1-db-contract-subject

Open discussion:

 

So, based on the open discussion, we have come up with the above details on what is required from a Cisco ACI configuration perspective in order to deliver the request as defined. We will use this information to begin creating our Ansible playbook to implement the new request.

 

Development:

We are now ready for the development phase: creating an Ansible playbook that delivers the environment from the request. Knowing that Gerrit is used for our version control/code repository, we need to ensure that we are continually committing our changes to a new dev branch on our Ansible-ACI Git repository as we develop our playbook.

 

**Note – Never make changes directly to the master branch. Always create/use a different branch to develop your changes and then merge those into master.

 

Now we need to pull down our Ansible-ACI Git repository to begin our development.

$mkdir -p ~/Git_Projects/Gerrit

$cd ~/Git_Projects/Gerrit

$git clone ssh://git@gerrit:29418/Ansible-ACI.git

$cd Ansible-ACI

$git checkout -b dev

 

We are now in our dev branch and can now begin our coding.

 

We now create our new Ansible playbook.

$vi playbook.yml

 

And as we create our playbook, we can begin committing changes as we go. (Follow the steps below for every change you want to commit.)

$git add playbook.yml

$git commit -sm "Added ACI tenants, contracts, etc."

$git push

 

In the example above we used -sm as part of our git commit. The -s adds a Signed-off-by line for the user making the changes, and the -m designates the message that we are adding as part of our commit. You can also use just -s, and your editor will open for you to enter your message details.

 

So we end up with the following playbook, which we can now proceed to test in our test environment.

---
- name: Manages Cisco ACI
  hosts: apic
  connection: local
  gather_facts: no
  vars:
    aci_application_network_profiles:
      - name: Example1-ANP
        description: Example1 App Network Profile
        tenant: Example1
        state: present
    aci_bridge_domains:
      - name: Example1-BD
        description: Example1 Bridge Domain
        tenant: Example1
        subnet: 10.0.0.0/24
        context: Example1-VRF
        state: present
    aci_contexts:
      - name: Example1-VRF
        description: Example1 Context
        tenant: Example1
        state: present
    aci_contract_subjects:
      - name: Example1-web-contract-subject
        description: Example1 Web Contract Subject
        tenant: Example1
        contract: Example1-web-contract
        filters: Example1-web-filter
        state: present
      - name: Example1-db-contract-subject
        description: Example1 DB Contract Subject
        tenant: Example1
        contract: Example1-db-contract
        filters: Example1-db-filter
        state: present
    aci_contracts:
      - name: Example1-web-contract
        description: Example1 Web Contract
        tenant: Example1
        state: present
      - name: Example1-db-contract
        description: Example1 DB Contract
        tenant: Example1
        state: present
    aci_filter_entries:
      - name: Example1-web-filter-entry-80
        description: Example1 Web Filter Entry http
        tenant: Example1
        filter: Example1-web-filter  # defined in aci_filters
        proto: tcp
        dest_to_port: 80
        state: present
      - name: Example1-web-filter-entry-443
        description: Example1 Web Filter Entry https
        tenant: Example1
        filter: Example1-web-filter  # defined in aci_filters
        proto: tcp
        dest_to_port: 443
        state: present
      - name: Example1-db-filter-entry-1433
        description: Example1 DB Filter MS-SQL
        tenant: Example1
        filter: Example1-db-filter
        proto: tcp
        dest_to_port: 1433
        state: present
    aci_filters:
      - name: Example1-web-filter
        description: Example1 Web Filter
        tenant: Example1
        state: present
      - name: Example1-db-filter
        description: Example1 DB Filter
        tenant: Example1
        state: present
    aci_tenants:
      - name: Example1
        description: Example1 Tenant
        state: present
  vars_prompt:  # Prompts for the info below upon execution
    - name: "aci_apic_host"
      prompt: "Enter ACI APIC host"
      private: no
      default: "127.0.0.1"
    - name: "aci_username"
      prompt: "Enter ACI username"
      private: no
    - name: "aci_password"
      prompt: "Enter ACI password"
      private: yes
  tasks:
    - name: manages aci tenant(s)
      aci_tenant:
        name: "{{ item.name }}"
        descr: "{{ item.description|default(omit) }}"
        state: "{{ item.state }}"
        host: "{{ aci_apic_host }}"
        username: "{{ aci_username }}"
        password: "{{ aci_password }}"
      tags:
        - aci-tenants
      with_items: "{{ aci_tenants }}"

    - name: manages aci context(s)
      aci_context:
        name: "{{ item.name }}"
        descr: "{{ item.description|default(omit) }}"
        tenant: "{{ item.tenant }}"
        state: "{{ item.state }}"
        host: "{{ aci_apic_host }}"
        username: "{{ aci_username }}"
        password: "{{ aci_password }}"
      tags:
        - aci-contexts
      with_items: "{{ aci_contexts }}"

    - name: manages aci bridge domain(s)
      aci_bridge_domain:
        name: "{{ item.name }}"
        descr: "{{ item.description|default(omit) }}"
        context: "{{ item.context }}"
        tenant: "{{ item.tenant }}"
        subnet: "{{ item.subnet }}"
        state: "{{ item.state }}"
        host: "{{ aci_apic_host }}"
        username: "{{ aci_username }}"
        password: "{{ aci_password }}"
      tags:
        - aci-bridge-domains
      with_items: "{{ aci_bridge_domains }}"

    - name: manages aci application network profile(s)
      aci_anp:
        name: "{{ item.name }}"
        descr: "{{ item.description|default(omit) }}"
        tenant: "{{ item.tenant }}"
        state: "{{ item.state }}"
        host: "{{ aci_apic_host }}"
        username: "{{ aci_username }}"
        password: "{{ aci_password }}"
      tags:
        - aci-application-network-profiles
      with_items: "{{ aci_application_network_profiles }}"

    - name: manages aci filter(s)
      aci_filter:
        name: "{{ item.name }}"
        descr: "{{ item.description|default(omit) }}"
        tenant: "{{ item.tenant }}"
        state: "{{ item.state }}"
        host: "{{ aci_apic_host }}"
        username: "{{ aci_username }}"
        password: "{{ aci_password }}"
      tags:
        - aci-filters
      with_items: "{{ aci_filters }}"

    - name: manages aci filter entries
      aci_filter_entry:
        name: "{{ item.name }}"
        descr: "{{ item.description|default(omit) }}"
        tenant: "{{ item.tenant }}"
        filter: "{{ item.filter }}"
        proto: "{{ item.proto }}"
        dest_to_port: "{{ item.dest_to_port }}"
        state: "{{ item.state }}"
        host: "{{ aci_apic_host }}"
        username: "{{ aci_username }}"
        password: "{{ aci_password }}"
      tags:
        - aci-filter-entries
      with_items: "{{ aci_filter_entries }}"

    - name: manages aci contract(s)
      aci_contract:
        name: "{{ item.name }}"
        descr: "{{ item.description|default(omit) }}"
        tenant: "{{ item.tenant }}"
        scope: "{{ item.scope|default(omit) }}"
        prio: "{{ item.prio|default(omit) }}"
        state: "{{ item.state }}"
        host: "{{ aci_apic_host }}"
        username: "{{ aci_username }}"
        password: "{{ aci_password }}"
      tags:
        - aci-contracts
      with_items: "{{ aci_contracts }}"

    - name: manages aci contract subject(s)
      aci_contract_subject:
        name: "{{ item.name }}"
        descr: "{{ item.description|default(omit) }}"
        tenant: "{{ item.tenant }}"
        contract: "{{ item.contract }}"
        filters: "{{ item.filters }}"
        apply_both_directions: "{{ item.apply_both_directions|default('True') }}"
        prio: "{{ item.prio|default(omit) }}"
        state: "{{ item.state }}"
        host: "{{ aci_apic_host }}"
        username: "{{ aci_username }}"
        password: "{{ aci_password }}"
      tags:
        - aci-contract-subjects
      with_items: "{{ aci_contract_subjects }}"
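
Before handing this off to Jenkins, it can be worth a quick manual run against the test APIC. A minimal invocation might look like the following, assuming an inventory file named hosts that defines the apic group (both names are my assumptions, not part of the mock-up), and noting that check mode only helps if the modules support it:

$ansible-playbook -i hosts playbook.yml --check   #dry run: report what would change without touching the APIC (module support permitting)

$ansible-playbook -i hosts playbook.yml   #apply the configuration for real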

 

 

Testing:

The assumption here is that we have already configured our Jenkins job to do the following as part of the workflow for our test environment:

  • Monitor the dev branch on ssh://git@gerrit:29418/Ansible-ACI.git for changes.
  • Trigger a backup of the existing Cisco ACI environment (read the KB on this here).
  • Execute the playbook.yml Ansible playbook against our Cisco ACI test gear and report back on the status via email as well as through the Jenkins job report (ensuring that our test APIC controller is specified as the host).

 

Now, assuming that all of our testing has been successful and we have validated that the appropriate Cisco ACI changes were implemented, we are ready to push our new configuration changes up to the master branch for code review.

 

Code-Review:

We are now ready to merge our dev branch into our master branch and commit for review. Remember that you should not be the one who also signs off on the code review, and the person who does should have knowledge of the change being implemented. We will assume that both are true for this mock-up.

 

So we can now merge the dev branch with our master branch.

$git checkout master

$git merge dev

 

Now we can push our code up for review.

$git review

 

Our new code changes are now staged on our Gerrit server, ready for someone to either sign off on the change and merge it into our master branch, or push it back for additional information. But before we proceed with the sign-off, we need to engage our peer-review phase, covered in the next section.
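
As an aside, git review comes from the git-review helper package; it pushes to Gerrit's special ref for you, so the equivalent plain Git command is:

$git push origin HEAD:refs/for/master   #Gerrit turns a push to refs/for/<branch> into a pending change for review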

 

Peer-Review:

We now should re-engage the original teams, discuss the testing phase results and the actual changes to be made, and ensure that there is absolutely nothing missing from the implementation. This is also a good stage to include the person who will be signing off on the change as part of the discussion. Doing so ensures that they are fully aware of the changes being implemented and have a better understanding on which to base the decision to proceed or not.

 

After a successful peer review, the person in charge of signing off on the code review should be ready to make the call. For this mock-up we will assume that all is a go: they proceed with signing off, and the changes get merged into our master branch. Those changes are now ready for Jenkins to pick up and implement in production.

 

Implementation:

So, now that all of our configurations have been defined in an Ansible playbook, all testing phases have been successful, and our code review has been signed off on, we are ready to enter the implementation phase in our production environment.

 

Our production Jenkins workflow should look identical to our testing phase setup, so this should be an easy one to set up. The only difference should be the APIC controller, which here is the one configured for our production environment. Our workflow should look similar to the following:

  • Monitor the master branch on ssh://git@gerrit:29418/Ansible-ACI.git for changes.
  • Trigger a backup of the existing Cisco ACI environment (read the KB on this here).
  • Execute the playbook.yml Ansible playbook against our Cisco ACI production gear and report back on the status via email as well as through the Jenkins job report (ensuring that our production APIC controller is specified as the host).

 

And again, assuming that our Jenkins workflow ran successfully, all changes should have been implemented in production.

 

Final thoughts

 

I hope you found the above useful and informative on what a typical Infra-As-Code change might look like. There are additional methodologies that you may want to implement as part of your workflow beyond what we did with this mock-up, such as additional automation steps and/or end-user capabilities. We will cover some of those items in the next post, which covers the next steps in our journey.

In the previous post we discussed ways to get started down the Infra-As-Code journey. However, one thing that was pointed out was that I missed the backup process for devices. So I want to address that here at the beginning of this post, and I very much appreciate those who brought it to my attention.

 

So how do you currently handle backups of your network devices? Do you not back them up? Do you back them up to a TFTP/FTP server? To accomplish backup tasks for Infra-As-Code, we need to do our backups a little differently: we need to ensure that our backups are written to the same filename at the destination on every backup occurrence. You may be thinking this seems rather unhelpful, correct? Actually, it is very useful and exactly what we want. The reason is that we want to store our backups in our version control system, and to benefit from this, the backup filename needs to be the same every time it is committed. Doing so allows us to see a configuration diff between backups: if any configuration change occurs, our version control system will show the diff between the previous backup and the current one, allowing us to track exact changes over time. This makes it easy to identify a change that may have caused an issue, or even an unauthorized change that was not implemented using the change methodologies that hopefully are in place.

One last thing in regards to backups: we do need to ensure that our backup strategy is followed, at the very least as part of the implementation phase from the previous post, with backups running and validated prior to the actual implementation of a change. Below is a quick sketch of the idea, and with backups covered we will then continue on to what this post was meant to be about: what is required for Infra-As-Code.
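
Here is a minimal sketch of what a nightly backup job might do. The hostname, paths, and the method of pulling the config are all hypothetical, so substitute whatever your platform and tooling provide:

$scp admin@core-sw1:running-config configs/core-sw1.cfg   #fetch the running config to a stable filename (method varies by platform)

$cd configs

$git add core-sw1.cfg

$git commit -m "Nightly backup: core-sw1"   #only produces a new commit when the config actually changed

$git diff HEAD~1 HEAD -- core-sw1.cfg   #shows exactly what changed between the last two backups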

 

What do I mean by what is required for Infra-As-Code? I mean the tools that need to be in place to get us started on this new journey. I will only be covering a very small subset of the tools that can get us started, as there are far too many to cover. The tools I mention here are the same ones that I use, so I may be a bit partial, but in no way should they be considered the best.

 

Let's start with a version control system first, because we need a repository to store all of our configurations, variables, and automation tooling. Now, I am sure you have heard of GitHub as a Git repository for users to share their code and/or contribute to other users' repositories. Well, we are not going to be using GitHub, as it is a public service and we want a repository on-site. You could use GitHub for a private repository if you so choose, but be very careful about what you are committing and ensure that the repository really is private. There are others, such as Bitbucket, that allow the creation of free private repositories, whereas GitHub charges for private repositories. So what do I use for an on-site Git repository for version control? I use GitLab CE (GitLab Community Edition), a free and open-source Git repository system with a very good WebUI and other nice features. GitLab also offers a paid enterprise version which adds functionality such as HA. Having a good version control system is absolutely crucial because it allows us to create branches within our repos to designate which is master (golden) and maybe others such as staging, test, and dev. Having these different branches is what allows us to do different levels of automated testing as we work through our change workflows.

 

Let's now touch on code review. Remember, our code-review system is what allows sign-off on code/configuration changes before they are applied to our network devices. Code review also enforces accountability for changes. The recommended method is that whoever signs off on code changes has a complete understanding of the underlying network devices as well as of the configuration/code changes themselves. This ensures that everyone is in the know on what is actually changing and can take the proper measures before a production change brings down your entire network. And luckily, if that were to happen, you did have a proper backup of each configuration, right? So what do I use for code review? I use Gerrit, which is developed by Google. Gerrit is a very well-known and widely used code-review system throughout the developer ecosystem. One thing to note is that Gerrit can ALSO be used as a Git repository in addition to code review. This works well for those who want a single system for their Git repositories and code review. The integration is very tight, but the WebUI is not so pretty, to say the least. So choosing between a single system, or Gerrit for only code review and GitLab for Git repositories, is a matter of preference. If you choose not to use a single system, it adds a few steps to your methodologies.

 

Now we will touch on automation tooling. There are MANY different automation tools to leverage, and each one is more a matter of personal preference and/or cost. Some of the different automation tools include Chef, Puppet, Salt, Ansible, and many others. My preferred automation tool is Ansible. Why Ansible? I spent a good bit of time with many of the others, and when I finally discovered Ansible I was sold on its relatively easy learning curve and power. Ansible is built on Python, a very rich and powerful programming language that is also easy to learn. Using Ansible you can program network devices via an API (if one exists), raw commands (via SSH), SNMPv3, and other methods. There are also projects such as NAPALM that integrate with Ansible, which I highly recommend checking out. So definitely check out the different automation tools and find the one that works best for you.
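
As a taste of how approachable Ansible is, here is an ad-hoc run using the raw module over SSH; the inventory file name and device group are my own hypothetical examples:

$ansible switches -i hosts -u admin -k -m raw -a "show version"   #prompts for the SSH password, then runs the command on every host in the switches group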

 

And the final tool to discuss in this post covers our CI/CD (Continuous Integration/Continuous Delivery) automation of workflows and changes. There are, again, many different tools to choose from, including Go-CD, Rundeck, Travis-CI, Bamboo, Jenkins, and many more. My tool of choice is Jenkins for the most part, but I also leverage several of the others listed above for one-off type deployments. Jenkins is a widely adopted CI platform with a massive number of plugins developed for tying other tooling into your workflows, such as Ansible playbooks and ad-hoc Ansible tasks. Jenkins allows us to stitch together our complete workflow from top to bottom if desired, meaning we could leverage Jenkins to kick off a backup, pull a specific branch from our Git repository with configuration changes, run an Ansible playbook against a set of network devices, report back on the status, and continue throughout our workflow pipelines. Jenkins is the tool that takes many manual processes and automates those tasks for us. So think of Jenkins as the brains of our Infra-As-Code.
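
To make that concrete, the shell build step of a simple Jenkins freestyle job, triggered by a change on a dev branch, might boil down to something like this. The Gerrit URL mirrors the repository used in the mock-up in this series, while the backup playbook name and inventory file are hypothetical:

$git clone -b dev ssh://git@gerrit:29418/Ansible-ACI.git

$cd Ansible-ACI

$ansible-playbook -i hosts backup.yml   #hypothetical playbook that backs up the current device configs first

$ansible-playbook -i hosts playbook.yml   #apply the changes under test and report status back through Jenkins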

 

Now, I know this is a lot to learn and digest with so many new tools and methodologies, but hopefully I can help you get your head around these tools. To that end, I have created a Vagrant lab that you can spin up and begin learning each of these tools with. I will be leveraging this lab in the follow-up posts over the next few weeks, but I wanted to share it here now so you can start getting your head around these tools too. So if you head over to here, you can begin your journey as well.
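
Once you have cloned the lab repository, bringing the environment up is standard Vagrant usage:

$vagrant up   #provision and boot the lab VMs defined in the Vagrantfile

$vagrant status   #list the lab machines and their state

$vagrant destroy   #tear the lab down when you are finished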

 

And with that, this post comes to an end; up next we will walk through some examples of our new methodologies.

In the last post we discussed a little bit about what Infrastructure-As-Code (Infra-As-Code or IAC) means. But we obviously didn't go very deep into how to begin this journey, and that is exactly what we will be covering in this post.

 

Let's assume that in most environments today you have some sort of process that is followed to implement change on your network, be it a brand-new change or a change required to resolve a specific issue. For the sake of conversation, let's assume that our change is to move from manually added static routes to dynamic routing using OSPF. Me being the good network admin/architect that I may be, I bring this up in a weekly meeting and state that we need to change our routing methodology. The reason may be that our network is getting too large and difficult to manage by adding static routes manually everywhere. Your manager and peers listen to your reasoning, decide that it makes sense, and ask you to come up with a plan to implement this change.

So you go back and write up some documentation (including a drawing) and add all of the steps required to make this significant change. The following week you present this to your manager and peers once again and receive the go-ahead to proceed. So you put in a change control for approvals and attach the document that you spent all that time writing up. Now let's assume that your manager is not so technical and really doesn't understand the complexity of this change, but proceeds with approving it, and you are set for two weeks out. You now go and write up some scripts to execute the changes based on the documentation you spent all that time writing (better than copy/paste, right?). You soon realize that you will be making changes across 50 individual devices, but you are not worried because you have scripted it all (50 scripts?). So at the next meeting you assure everyone that you have all of your scripts written and all should be good to go. Again, your manager and your peers are good with the process you have put together for this massive change, and all agree that it will occur in one week.

Now, let's stop right here and evaluate what just occurred. Does this sound like your environment, or environments that you are accustomed to? Are we now doing Infra-As-Code? Not even close. So where did we fall short on this so-called Infra-As-Code journey? Everywhere, it could easily be said.

 

So what was missed in the above scenario that would allow us to start on this new journey? Let's start with the most obvious. Where was the code review and peer review? Didn't we cover peer review by discussing this in the meeting(s) with your peers and manager? Nope. Our peer review would have included our team as well as all other teams which could be affected by this change. Remember, you only know what you know, and other teams will have different perspectives and uncover additional questions or concerns that differ from your own. This is a great example of beginning to tear down the silos of communication around implementing change. Up next would be code review. Who else reviewed your scripts for typos, missing steps, or possibly even something added that could break your network? And if you did employ code review, did the reviewer understand the actual change? Also, where was the testing process, and where are the results of those tests to review? This doesn't even cover version control of your scripts, or the fact that scripts themselves may not even be the correct automation tooling for this.

 

So how do we start this journey? It should actually be quite easy, but it does require a completely different mentality than what you may have been used to in the past. Our process should look something like the following.

 

  • New change request – change from manual static routing to dynamic OSPF for all networks
    • Open discussion
      • Involve all teams and organizations and discuss the specifics and reasons why this change should be embraced
      • Entertain input from others on their questions and/or concerns
        • May also surface additional benefits which you may not have thought of
      • Discuss the specifics of a testing plan and what a reasonable timeline should be
    • Development phase
      • Obtain a baseline of the state of the current routing topology
      • Document the processes required to get you to the desired state, including drawings
      • Create your automation tasks and begin to version control them
        • Use automation templates vs. scripts
        • Version control will allow you to document all change phases and ensure that your tasks are repeatable and consistent, as well as allow you to roll back if needed
        • Leverage APIs (if available) and/or configuration management tooling
    • Testing phase
      • Ensure that you are testing in a like environment
        • You can leverage virtual environments which mimic your current production design
        • You can also use your configuration management tool to run dry runs of the configurations against production systems (no changes made, but results are reported back; see the example after this list)
      • Leverage automated integration testing tools to actually perform configuration changes in your test environment (CI/CD)
        • Builds the testing phase documentation and results which will be used in later discussions
        • Maintains a history of the changes and when or where a failed configuration was experienced
        • Allows for a more readily available way to go back and look over all steps performed
      • Finalize the automation tasks required to reach the desired state of the change requested
    • Code review
      • Leverage a code-review system
        • Ensures that the automation tasks are solid, leaving no room for error
        • Ensures accountability
        • Provides the ability to send the change back to the testing phase to correct or add tasks which may have been missed
    • Peer review
      • Re-engage all teams and organizations which were involved in the open discussion phase
      • Discuss the documentation and drawings of the current baseline established
      • Discuss the processes and documentation on how to reach the desired state of the change request
      • Discuss the findings and results of the testing phase
      • Share the phases of testing and present the details of each
      • Discuss any further questions or concerns
      • Agree on whether or not to proceed on the original timeline
    • Implementation phase
      • Leverage the exact same automation integration tools and methodologies to bring your production environment to the desired state
        • This ensures that the process is exact, keeps room for error to a minimum, and remains repeatable
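
The dry-run idea mentioned in the testing phase maps directly to Ansible's check mode, for modules that support it; the inventory and playbook names here are hypothetical:

$ansible-playbook -i production ospf-migration.yml --check --diff   #report what would change, and show the diffs, without touching any devices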

 

As you can see from the above, a fair amount of mentality change has to occur in order to begin this journey. But remember, as each new implementation follows these same methodologies, it will become much easier going forward, and it will provide a much more consistent and repeatable process for each implementation.

 

Hope you enjoyed this post and that it has proven to be of some use. Up next we will discuss some of the tooling required to provide the abilities discussed in the above phases.

Will this be the year of Infrastructure-As-Code (Infra-As-Code or IAC) becoming more mainstream? Or is this just a wishful journey that will never catch on? Obviously, this is not a new thing, but how many companies have adopted this methodology? Or better yet, how many companies have even begun the discussions about adopting it? Do you currently write scripts, save them somewhere, and think, "Hey, I/we are doing Infra-As-Code already"? If so, you are not correct. "But why?" you might be thinking. Infra-As-Code is much larger and more dynamic than just writing scripts in a traditional, static method. But if you do currently utilize scripts for infrastructure-related tasks and configurations, then you are much better off than those who have not begun this journey at all. The reason is that taking an automated and more programmatic approach to configuring your infrastructure, instead of a manual, error-prone approach, is a much more predictable and consistent method of configuring your infrastructure components. These components can be of numerous types, such as servers, network routers or switches, storage components, and much more. But for this series of posts we will be focusing only on the network components and how we can begin the journey toward Infra-As-Code.

 

Below are some of the topics that we will be covering over the series of posts:

  • What Infra-As-Code really means
  • How to begin the journey
  • What is required (the tooling)
  • A mock-up example of the new methodologies
  • Next steps and a series wrap-up

 

 

So what does Infra-As-Code really mean? Let's go ahead and address that in this post and build a good foundation on what it really does mean.

 

If you begin to treat your infrastructure as code, you can begin to develop processes that allow you to deliver a fully tested, repeatable, consistent, and deliverable methodology for configuration state in your environment. In doing this, you can begin looking at these items as a pipeline of continuous delivery, allowing automation tooling to consistently bring your infrastructure to the desired state that has been defined. As you begin this journey, you will start with a baseline and then a desired state. Adopting automation tooling for your environment, along with version control, code review, and peer review, will allow for much more stable and speedy deployments, as well as easier rollbacks in the off chance that something does not go as planned. But the chance of having to roll back should be minimal, assuming that proper testing of configurations has been followed throughout your delivery pipeline.

 

I know this all sounds great (on paper) and feels a little unrealistic in many aspects, but in the next post we will discover how we can get started. And hopefully by the end of this series you will have a much better understanding and a realistic view of how you too can begin the Infra-As-Code journey!

Based on some of the responses to the last post, it is good to see that there are some who take backing up their network configurations seriously. I would like to build on that last post and discuss some ideas around network configuration management, specifically solutions and automation to handle some of the tasks required. I see that several of you stated that you use SolarWinds NCM, which I personally use in my environment. SolarWinds NCM is extremely easy to set up and configure to "save your bacon," as one comment stated. NCM is a perfect example of a solution with the ability to track changes, roll back changes, and provide reporting and auditing, among many other use cases of the product. There are also many open-source products that I have used previously, including the routerconfigs plugin for Cacti, simple TFTP jobs set up on each router/switch to back up nightly, and numerous other solutions including rConfig, another open-source product that is a very solid solution. However, you may have the solution in place, but how do you make sure that each and every network switch/router is in a consistent state? Or is configured to have nightly scheduled backups of its configs? Do you still do this manually? Or do you use automation tools such as Ansible, Chef, or Puppet to stamp out configurations? I have personally begun the journey of using Ansible to build playbooks that stamp out configurations in a consistent manner, as well as to create dynamic configurations using templates. This is also a great way to ensure that when a network device fails, you have a solid way to start rebuilding it from a reasonably consistent state. I would definitely still leverage a nightly backup to restore configurations which may have changed since the automation tool was deployed, but hopefully, as changes are made, your automation deployment configurations are also modified to reflect them.

Do you treat network configuration management with the same care and love as you do server backups in your environment? Or are both equally unattended? No, this is not a post about backups; that would be too boring, right? Well, I would be willing to bet that some environments treat network configuration management as low down on the priority list as backups. Let's start with something as simple as backing up the configuration of a network switch or router. Are you thinking, "Well now, why would I ever do that?" Let's hope that you are not one of those organizations, but rather one that can answer, "We back up all of our routers and switches nightly." If you do not back up the configurations of your network devices, how would you recover in the event of a device failure? Hopefully not by trying to remember what the configuration was and rebuilding a replacement network device from scratch; that would be far too time consuming and practically guaranteed to fail. Now, what about tracking changes to your network devices? Is this something that you keep track of in some sort of change management solution, or at the very least send to a central syslog repository? That way you could be alerted when a change occurs, or have the ability to identify a change that may have been the cause of a network outage. Having a good change management solution in place for network devices is absolutely crucial for any environment. Without one, you will be left wishing that you had one WHEN disaster strikes. Now, how do you handle change management in your environments?

Judging from the responses to my previous post, the consensus on IP address management is what I would have expected. I myself have experienced these same tedious tasks when assigning IP addresses on the network, which I must say is VERY tedious. So how do we get around these tedious tasks to ensure consistency and a source of truth on the network? As a few comments mentioned, Infoblox, Windows 2012 IPAM management, and the SolarWinds IPAM module are possible solutions. I have personally used each of these except Infoblox. I have looked into the Infoblox solution and it seems to be pretty good; however, for the most part it is overkill, in my opinion. I spent a good bit of time setting up and testing Windows 2012 IPAM management, but it was a bit complicated and cumbersome for the most part. It seemed to me it should be a little easier to set up and configure, but then again, maybe it was just a lack of configuration ease on my part. I have also set up and used the SolarWinds IPAM module, and it is very solid and easy to set up and configure. Once you have your DNS/DHCP servers added as managed nodes in SolarWinds and discovered as DNS/DHCP servers, you can head over to the IPAM module and start the integration pieces. You have the ability to create subnets and supernets, and the IPAM module will then auto-discover via network scans and DNS/DHCP zone lookups and begin filling in the zone information. Now, if you have the luxury of assigning IP addresses using DHCP, then this is pretty easy to keep track of, obviously. But if you still have to assign IP addresses statically, you can easily drill into the subnet within the IPAM module, find an available IP that has not already been assigned, and assign it and create your DNS name there. This ensures that no one else tries to assign the same address to another device, assuming that everyone goes to the one single source of truth for assignments. This solution makes a tedious, time-consuming task much simpler and more consistent. So I challenge you to at least give it a shot and see if one of these solutions does not make your IP address management much easier and free up your time for other tasks of the day.

How many times have you asked for, or been asked for, an IP address on your network and heard the famous words "just ping until you find an available address" or "we have an IPAM solution, but not everyone enters information correctly"? Oh, the joys of IP address management. It comes with such joy sometimes, doesn't it? In this day and age, it is amazing to me that organizations still function in this manner, especially when everyone seems to want dynamic and elastic environments, correct? Why is it that just getting an IP address is such a tedious effort involving so many hoops to jump through? Is it because your organization doesn't currently have a good policy on these sorts of tasks? Or is it because there are too many manual processes in place to accomplish this simple task? So why, then, would you not want to implement a streamlined, automated process to handle the workflow of assigning IP addresses, removing all of the middleman processes involved as well? "Are you serious? If we were to do that, it would mean slowly removing tasks and procedures that we are responsible for" is what you may be thinking. Ah, the famous "worried about being replaced by automation" response. So how do you handle IP address management in your environments?
