PKI for Network Engineers Ep 10: Cisco IOS CA introduction

Greetings programs!

In the next few PKI for network engineers posts, I’m going to cover Cisco IOS CA. If you’re studying for the CCIE security lab or you’re operating a DMVPN or FlexVPN network, and you’d like to use Digital certificates for authentication, then this series could be very useful for you.

Introduction

IOS-CA is the Certification Authority that is built into Cisco IOS. While not a full featured enterprise PKI, for the purposes of issuing certificates to routers and firewalls for authenticating VPN connections it’s a fine solution. It’s very easy to configure and supports a variety of deployment options.

Key points

  • Comes with Cisco IOS
  • Supports enrollment over SCEP (Simple Certificate Enrollment Protocol)
  • RSA based certificates only
  • Easy to configure
  • Network team maintains control over the CA
  • Solution can scale from tiny to very large networks

Deployment Options

IOS-CA has the flexibility to support a wide variety of designs and requirements. The main factors to consider are the size of the network, what kind of transport is involved, how the network is dispersed geographically, and the security needs of the organization. Let’s briefly touch on a few common scenarios to explore the options.

Single issuing Root CA

For a smaller network with a single datacenter or a active/passive datacenter design, a single issuing root may make the most sense. It’s the most basic configuration and it’s easy to administer.

This solution would be appropriate when the sole purpose of the CA is to issue certificates for the purpose of authenticating VPN tunnels for a smaller network of approximately a hundred routers or less. Depending on the precautions taken and the amount of instrumentation on the network, the blast radius of this CA compromise would be relatively small and you could spin up a new CA an enroll the routers to it fairly quickly.

The primary consideration for this option is CA placement. A good choice would be a virtual machine on a protected network with access control lists limiting who can attempt to enroll. The CA is relatively safe from being probed and scanned, and it’s easily backed up.


Fig 1. Single issuing root

Offline Root plus Issuing Subordinate CAs

For a network that has multiple datacenters and/or across multiple continents it would make more sense to create a Root CA, then place a subordinate Issuing CA each datacenter that contains VPN hubs. By Default IOS trusts a subordinate CA, meaning the root CA’s certificate and CRL need not be made available to the endpoints to prevent chaining failure.

As in the case with any online/Issuing CA, steps should be taken to use access controls to limit access to HTTP/SCEP

Fig 2. Offline Root

Issuing Root with Registration Authority

Fig 3. Single root with Registration authority

In this design, there is still a single issuing root, however the enrollment requests are handled by an RA (Registration Authority). An RA acts as a proxy between the CA and the endpoint. This allows for the CA to have strict access controls yet still be able to process enrollment requests and CRL downloads.


Offline Root & Issuing Subordinate CAs w/RA

The final variation is a multi-level PKI that uses the RA to process enrollment and CRL downloads. This design provides the best combination of security, scaling, and flexibility, but it is also the most complex.

Bootstrapping remote routers

Consider a situation where you’re turning up a remote router and you need to bring up your transport tunnel to the Datacenter. The Certification authority lives in the Datacenter on a limited access network behind a firewall. In order for the remote router to enroll in-band using SCEP, it would need a VPN connection. But our VPN uses digital certificates.

There are three options for solving this:

Sideband tunnel w/pre-shared key

This method involves setting up a separate VPN tunnel that uses a pre-shared key in order to provide connectivity for enrollment. This could be a temporary tunnel that’s removed when enrollment is complete, or it could be shut down and left in place for use at a later time for other management tasks.

A sideband tunnel for performing management tasks that may bring down the production tunnel(s) is useful, making this a good option. It’s main drawback is the amount of configuration work required on the remote router. It also depends on some expertise on the part of the installer, a shortcoming shared with the manual enrollment method.

Registration Authority

A Registration Authority is a proxy that relays requests between the Certification Authority and the device requesting enrollment. In this method an RA is enabled on the untrusted network for long enough to process the enrollment request. Once the remote router has been enrolled the the transport tunnels will come up and bootstrapping is complete.

Using a proxy allows in-band enrollment with a minimum amount of configuration on the remote router, making it less burdensome for the on-site field technician. The tradeoff is we’re shifting some of that work to the head end. Because the hub site staff is likely to possess more expertise, this is an attractive trade-off.

Manual/Terminal enrollment

In this method, the endpoint produces a Enrollment request on the terminal formatted as base64 PEM (Privacy Encrypted Mail) Blob. The text is copied and pasted into the terminal of the CA. The CA processes the request and outputs the certificate as a PEM file, which is then pasted into the terminal of the Client router.

While this does have the advantage of not requiring network connectivity between the CA and the enrolling router, it does have a couple of drawbacks. Besides being labor intensive and not straightforward for a field technician to work with, endpoints enrolled with the terminal method cannot use the auto-rollover feature, which allows the routers to renew certificates automatically prior to expiration. The author regards this is an option of last resort.

CRL download on spokes problem

The issue here is when the spoke router needs to download the certificate revocation list (CRL) but the CA that has the CRL is reachable only over a VPN tunnel, that cannot come up because the spoke can’t talk to the CA to download a unexpired copy of the CRL in order to validate the certificate of the head end router.

This is actually a pretty easy problem to solve. Disable CRL checking on the spoke routers, but leave it enabled on the hub routers. This way the administrator can revoke a router certificate and that router will not be allowed to join the network because the hub router will see that it’s certificate has been revoked.

Wrap-up

Ok, so there are the basics. In the next installment, we’ll step through a minimum working configuration.


The DevOps Chronicles part 1: Why I’m studying python for 5 hours a week.

So, I’m committed to the study of coding in python for 250 hours a year. And I’m about 12 weeks in.

I have no ambitions of becoming a professional developer. I like being an infrastructure guy. So why in the heck am I making this commitment as part of my professional development?

Here’s a fundamental problem infrastructure designers & engineers of all stripes are faced with today:

The architecture, complexity, and feature velocity of modern applications overwhelms traditional infrastructure service models. The longer we wait to face this head on, the larger the technical debt will grow, and worse, we become a liability to the businesses that depend on us.

Allow me to briefly put some context around this. I’m going use some loose terminology and keep things simple and high level so we can quickly set the stage. Then I’ll explain where python fits in.

Application Components

Most all software applications have three major components:

  1. data store
  2. data processing
  3. user (or application) interface

Overall, the differences in complexity come down to component placement and whether or not each function supports multiple instances for scaling and availability.

Application Architectures

A stage 1 or basic client/server application will place all components on a single host. Users access the application using some sort of lightweight client or terminal that provides input/ouput and display operations. If you buy something at Home Depot and glance at the point of sale terminal behind the checkout person, you’re probably looking at this kind of app.

A stage 2 Client/server application will have a database back end, with the processing and user interface running as a monolithic binary running on the user’s computer. This is the bread and butter of custom small business applications. Classic example of this would the program you see the receptionist using at a doctor or dentist’s office when they’re scheduling your next appointment. Suprisingly, this design pattern is still commonly used to develop new applications even today, becuase it’s simple and easy to build.

A stage 3 Client/Server application splits the 3 major components up so that they can run on different compute instances. Each layer can scale independently of the other. In older designs, the user interface is a binary that runs on the end user’s computer. Newer designs used web based user interfaces. These tend to be applications with scaling requirements in the hundreds to tens of thousand of concurrent sessions. These apps tend to be constrained by the database design. A typical SQL database is single master and often the choke point of the app. There are many of examples in manufacturing and ERP systems of 3 layer designs where the ui and data processing is scale-out, backed by a monster SQL Server failover cluster .

All of these design patterns are based on the presumption of fixed, always available compute resources underpinning them. They’re also generally vendor centric technology stacks with long life cycles. They all depend on redundancy underpinning the compute storage and network components for resiliency, with the 3 tier being the most robust as 2 of the layers can tolerate failures and maintain service.

Traditional infrastructure service models are built around supporting these types of application designs and it works just fine.

Then came cloud computing, and the game changed.

Stage 4: Cloud native applications.

Cloud native applications operate as an elastic confederation of resilient services. Let’s unpack that.

Elastic means service instances spin up or down dynamically based upon demand. when an instance has no work to do, it’s destroyed. When additional capacity is required, new instances are dynamically instantiated to accommodate demand

Resilient means the individual services instances can fail and the application will continue running by simply destroying the failed service instance and instantiating a healthy replacement.

Confederation means a collection of services are designed to be callable via discoverable APIs. This means if an application needs a given function that can be offered by an existing service, the developer can simply have his application consume it. This means components of an application can live anywhere, as long as they’re reachable and discoverable over the network.

Because of this modular design, it’s easy to quickly iterate software, adding additional functions and features to make it more valuable to it’s users.

Great! but as an infrastructure person how do we support something like this? The fact is, this is where traditional vendor-centric shrink wrap tool approach breaks down.

Infrastructure delivery and support

Here are the main problems with shrink wrap monolithic tools in context of cloud native application design patterns:

  1. Tool sprawl (ever more tools and interfaces to get tasks done)
  2. Feature velocity quickly renders tool out of date
  3. Graphical tools surface a small subset of a product’s capabilities
  4. Difficult to automate/orchestrate/manage across technology domains

Figure 1 is what the simple task of instantiating a workload looks like on 4 different public cloud providers. Project this sprawl across all the componentry required to turn up an application instance and the scope of the problem starts to come into focus.

Fig 1: Creating a virtual machine on different cloud providers.

So how in the heck do you deal with this? The answer is surprisingly simple: infrastructure tools that mirror the applications the infrastructure needs to support.

TL;DR Learn how to use cloud native tools and toolchains. And this brings us to the title of the blog post.

Python as a baseline competency

Python is the glue that lets you assemble your own collection of services in a way that’s easy for your organization’s developers and support people to consume. The better you are with python, the easier it is to consume new tools and create tool chains. Imagine that you’re at a restaurant and you’re trying to order for yourself and your family. If you can communicate fluently, it’s much easier to get what you want than if you don’t speak the language, and are reduced to pointing at pictures and pantomiming.

Interestingly, I’ve found that many of the principles of good network and systems design are directly applicable to writing a good program. encapsulation, separation, information hiding, discreet building blocks, etc.

Use Case: Creating a Virtual Machine

Let’s take the example of creating a virtual machine.

For creating a VM, we could create a class or classes in python that allows someone to spin up, shut down and check the status of virtual machines on 4 different public cloud providers in a manner that abstracts the specifics of each provider and gives them to the user as a generic service. Then we or anyone else on the team can use it to spin up workloads with a simple call, without having to know the implementation details of each provider.

The power of making tools like this is:

  1. we only have to solve the problem of doing a thing once
  2. we encapsulate it
  3. we make it available to other code

In the next installment of my DevOps journey, I’ll take a stab at writing the python Virtual Machine class described and we’ll see how well it works.

Best regards,

-s