Evolving to Zero Trust Architecture (ZTA) – Part 1
I have a deep interest in cybersecurity, and to keep up with the latest threats, policies and security practices, I became a member of ACT-IAC and joined its Cybersecurity Community of Interest. This is where I got the opportunity to volunteer on the Zero Trust Architecture Phase 2 project, and I would like to share what I have learned about ZTA strategy and principles. I am planning to break this blog into a four-part series based on how the project progresses.
- What is ZTA?
- Real world deployment scenarios
- ZTA core capabilities
- Vendors providing ZTA capabilities
What is ZTA and how did it come into existence?
Traditionally, perimeter-based security has been used to protect the network infrastructure behind a firewall: once users were authenticated, they could access all the resources behind the firewall, because every user and device on the network was assumed to be trustworthy. This led to many security breaches across the globe in which attackers moved laterally and exploited resources they were not authorized to use. The attackers only had to get through the firewall and could then crawl across any resource available in the network, causing data loss and financial damage, for example through ransomware attacks.
Today, an enterprise’s infrastructure spans several networks: cloud-based services, remote users connecting from their own networks on enterprise-owned or personal devices (laptops, mobile devices), and network locations that change depending on where the users and devices connect from (e.g. public Wi-Fi or internal enterprise networks). These complex use cases drove the move away from perimeter-based security to “perimeter-less” security (not confined to one network infrastructure), which led to a new concept called “Zero Trust”, where you “never trust, always verify”. The ZT approach is primarily focused on data protection, but it can be applied across other enterprise assets like users, devices, applications and infrastructure.
ZTA is an enterprise cybersecurity strategy that aims to prevent data breaches and limit lateral movement within the network infrastructure. It assumes that any internal or external agent (user, device, application, infrastructure) that wants to access an enterprise resource (on the internal network or externally in the cloud) is not trustworthy and must be verified on each request before access is granted.
What does Zero Trust mean in a ZTA?
In the above diagram, the user who is trying to access the resource must go through the Policy Decision Point/Policy Enforcement Point (PDP/PEP). The PDP/PEP decides whether to grant the request based on enterprise policies (data/access/risk), user identity, device profile, location of the user, time of request and any other attributes needed to gain enough confidence. Once granted, the user is in an “Implicit Trust Zone” where they can access all the resources that the network infrastructure design allows. The “Implicit Trust Zone” is like the boarding area in an airport, where all the passengers are considered trustworthy once they have verified themselves through the immigration/security check.
You can still limit access to certain resources in the network using a concept called “Micro-Segmentation”. For example, after getting through the security check and reaching the boarding area, passengers are checked again at the boarding gate to make sure they are boarding the flight they are authorized for. This is what “Micro-Segmentation” means: resources are isolated into segments, and access requests to each segment are verified separately, in addition to the PDP/PEP check.
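The two checks in the airport analogy can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the attribute names, the policy tables (ALLOWED_LOCATIONS, BUSINESS_HOURS, RESOURCE_SEGMENT, USER_SEGMENTS) and the users are all hypothetical, and a real PDP would weigh many more signals.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class AccessRequest:
    user: str
    mfa_passed: bool
    device_compliant: bool
    location: str
    request_time: time
    resource: str


# Hypothetical enterprise policy data, for illustration only.
ALLOWED_LOCATIONS = {"US", "CA"}
BUSINESS_HOURS = (time(6, 0), time(20, 0))
RESOURCE_SEGMENT = {"payroll-db": "finance", "wiki": "general"}
USER_SEGMENTS = {"alice": {"finance", "general"}, "bob": {"general"}}


def pdp_decide(req: AccessRequest) -> bool:
    """Security-check step: identity, device, location and time of request."""
    start, end = BUSINESS_HOURS
    return (
        req.mfa_passed
        and req.device_compliant
        and req.location in ALLOWED_LOCATIONS
        and start <= req.request_time <= end
    )


def segment_allows(req: AccessRequest) -> bool:
    """Boarding-gate step: the resource's segment must be one the user may enter."""
    segment = RESOURCE_SEGMENT.get(req.resource)
    return segment is not None and segment in USER_SEGMENTS.get(req.user, set())


def authorize(req: AccessRequest) -> bool:
    """A request must pass both the PDP check and the segment check."""
    return pdp_decide(req) and segment_allows(req)


alice = AccessRequest("alice", True, True, "US", time(9, 30), "payroll-db")
bob = AccessRequest("bob", True, True, "US", time(9, 30), "payroll-db")
print(authorize(alice))  # True: passes both checks
print(authorize(bob))    # False: verified user, but not entitled to the finance segment
```

Keeping the two checks as separate functions mirrors the analogy: passing the security check does not by itself get a passenger onto any particular flight.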
Tenets of ZTA: (As per NIST SP 800-207 publication)
All resources, whether data or services, should communicate securely irrespective of their network location. Each individual access request is verified before access to any resource is granted, based on the client’s identity, the device making the request, the type of application used, location coordinates and other behavioral attributes. Authentication and authorization for each access request are dynamic and strictly enforced. In addition, the enterprise should collect all activity information, log decisions, keep audit logs and monitor the network infrastructure to improve the overall security posture.
What are the logical components of ZTA?
Policy Engine (PE): Responsible for making and logging the decision to grant or deny a request, based on enterprise policy and inputs from external sources (Continuous Diagnostics and Mitigation (CDM), threat intelligence etc.).
Policy Administrator (PA): Responsible for establishing or shutting down the communication path between the subject and the enterprise resource based on the decision made by the PE. It can generate authentication tokens for the client to access the resource. The PA communicates with the PEP via the control plane.
Policy Enforcement Point (PEP): Responsible for enabling, monitoring and terminating communication between the subject and the enterprise resource. It can be deployed as a single logical component or split into two components: a client agent and a resource gateway that controls access. Beyond the PEP is the “Implicit Trust Zone” that hosts the enterprise resources.
Control Plane/Data Plane: The control plane is made up of components that receive and process requests from data plane components that wish to access network resources. The control and data planes are more like zones in the ZTA. Every resource, device, and user within the network can have its own control plane component to decide whether data should be routed further or not. In this diagram, the split is used only to explain how the control plane works for data plane components. The data plane simply passes packets around, and the control plane routes them appropriately based on the decisions made.
Note: The dotted line that you see in the image above is the hidden network that is used for communication between the various logical components.
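To make the division of labor between the three components more concrete, here is a small Python sketch. It is a toy model under stated assumptions: the class and method names are invented for illustration, the "policy" is just a set of allowed (subject, resource) pairs, and the token is a random string rather than a real credential.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class PolicyEngine:
    """Makes and logs grant/deny decisions (stand-in for enterprise policy)."""
    allowed: set
    decision_log: list = field(default_factory=list)

    def decide(self, subject: str, resource: str) -> bool:
        granted = (subject, resource) in self.allowed
        self.decision_log.append((subject, resource, granted))
        return granted


@dataclass
class PolicyEnforcementPoint:
    """Data plane gate: only traffic with an open session is forwarded."""
    sessions: dict = field(default_factory=dict)  # token -> (subject, resource)

    def open_session(self, token: str, subject: str, resource: str) -> None:
        self.sessions[token] = (subject, resource)

    def close_session(self, token: str) -> None:
        self.sessions.pop(token, None)

    def forward(self, token: str, resource: str) -> bool:
        session = self.sessions.get(token)
        return session is not None and session[1] == resource


@dataclass
class PolicyAdministrator:
    """Control plane: acts on the PE's decision by configuring the PEP."""
    engine: PolicyEngine
    pep: PolicyEnforcementPoint

    def request_access(self, subject: str, resource: str):
        if not self.engine.decide(subject, resource):
            return None
        token = secrets.token_hex(16)  # hypothetical session token
        self.pep.open_session(token, subject, resource)
        return token

    def revoke(self, token: str) -> None:
        self.pep.close_session(token)


engine = PolicyEngine(allowed={("alice", "payroll-db")})
pep = PolicyEnforcementPoint()
pa = PolicyAdministrator(engine, pep)

token = pa.request_access("alice", "payroll-db")
print(pep.forward(token, "payroll-db"))  # True while the session is open
pa.revoke(token)
print(pep.forward(token, "payroll-db"))  # False once the PA closes the path
```

Note that the PEP never evaluates policy itself in this sketch; it only honors sessions that the PA has opened, which is the separation the component descriptions above are getting at.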
Why should organizations adopt ZTA?
When adopting a ZTA, organizations must weigh all the potential benefits, risks, costs, and ROI. Core ZT outcomes should be focused on creating secure networks, securing data that travels within the network or at rest, reducing impacts during breaches, improving compliance and visibility, reducing cybersecurity costs and improving the overall security posture of an organization.
Lost or stolen data, ransomware attacks, and network and application layer breaches cost organizations dearly in both money and market reputation. It takes a lot of time and money for an organization to return to normal after a severe security breach. ZT adoption can help organizations avoid such breaches, which is key to surviving in today’s world, where state-funded hackers are always ahead of the game.
As with all technology changes, the biggest challenge in demonstrating higher ROI and lower cybersecurity costs is the time needed to deliver the desired results. Organizations should consider the following:
- Assess what components of ZTA pillars they currently have in their infrastructure. Integration of components with existing tools can reduce the overall investment needed to adopt ZTA.
- Consider including costs or impacts associated with risk levels and occurrences when doing ROI calculations.
- ZT adoption should simplify, and not complicate, the overall security strategy to reduce costs.
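One common way to bring risk levels and occurrences into an ROI calculation is to estimate an annualized loss (the cost of a single incident multiplied by how often it is expected per year) before and after adoption. The sketch below shows the arithmetic; the function names and every figure in it are hypothetical and chosen only to keep the numbers easy to follow.

```python
def annualized_loss(single_loss: float, yearly_rate: float) -> float:
    """Expected yearly loss: cost of one incident times incidents per year."""
    return single_loss * yearly_rate


def risk_adjusted_roi(loss_before: float, loss_after: float, annual_cost: float) -> float:
    """Yearly risk reduction, net of the yearly ZTA cost, relative to that cost."""
    return (loss_before - loss_after - annual_cost) / annual_cost


# Hypothetical figures: a $2M breach expected once every four years today,
# versus a contained $400k breach expected once every eight years with ZTA.
before = annualized_loss(2_000_000, 0.25)   # 500000.0
after = annualized_loss(400_000, 0.125)     # 50000.0
roi = risk_adjusted_roi(before, after, annual_cost=300_000)
print(f"Risk-adjusted ROI: {roi:.0%}")      # Risk-adjusted ROI: 50%
```

The estimate is only as good as its inputs, so it is worth rerunning it with pessimistic and optimistic values for incident cost and frequency.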
What are the threats to ZTA?
ZTA can reduce the overall risk exposure in an enterprise but there are some threats that can still occur in a ZTA environment.
- A misconfigured PE or PA could disrupt users trying to access resources. Misconfiguration by the security administrator could also let through access requests that would previously have been denied, so that attackers or subjects could access resources from which they were restricted before.
- Denial-of-service attacks on the PA/PEP can disrupt enterprise operations. Access decisions made by the PE are carried out by the PA and enforced by the PEP, so every connection between a device and a resource depends on them. If a DoS attack floods the PA with requests, the service becomes unavailable and no subject can get access.
- Attackers could compromise an active user account through social engineering, phishing or other means and impersonate the subject to access resources. Adaptive MFA may reduce the likelihood of such attacks, but in any enterprise, with or without ZTA, an attacker might still be able to access the resources to which the compromised user has access. Micro-segmentation may protect resources against these attacks by isolating or segmenting them using technologies like next-generation firewalls (NGFW) and software-defined perimeters (SDP).
- Enterprise network traffic is inspected and analyzed by policy administrators via PEPs, but traffic from non-enterprise-owned assets can’t always be monitored passively. Since that traffic is encrypted and deep packet inspection is difficult, an attack could reach the network from non-enterprise-owned devices. ML/AI tools and techniques can help analyze traffic to find anomalies and remediate them quickly.
- Vendors or ZT solution providers could cause interoperability issues if they don’t follow certain standards or protocols when interacting. If one provider has a security issue or disruption, it could potentially disrupt enterprise operations due to service unavailability or the time taken to switch to another provider which can be very costly. Such disruptions can affect core business functions of an enterprise when working in a ZTA environment.
[ACT-IAC] American Council for Technology and Industry Advisory Council (2019) Zero Trust Cybersecurity Current Trends. Available at https://www.actiac.org/zero-trust-cybersecurity-current-trends
Draft (2nd) NIST Special Publication 800-207. Available at https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-207-draft2.pdf
NIST Zero Trust Architecture Release: https://www.nccoe.nist.gov/projects/building-blocks/zero-trust-architecture
Most Agile transformation efforts in the government begin with the Scrum process. However, many agencies feel that they have reached a plateau and are ready to move through to the next logical steps. Improving digital services delivery and getting working software into the users’ hands shouldn’t stop with just Scrum. As agencies progress in their Agile transformation, they begin to see the value of adding the Agile engineering practices, such as Test-Driven Development and Continuous Integration to improve code quality and the downstream delivery of fully functional and tested software. And what about the challenges of scaling Agile for very large projects? What might a strategic progression for Agile transformation look like? This will be the focus of our ninth Agile in Government workshop, Agile Engineering, SAFe and DevOps: A Roadmap to Adoption at the Potomac Forum, Willard InterContinental Hotel on Thursday June 14, 2018.
Full Agenda and Registration can be found here.
Is your business undergoing an Agile Transformation? Are you wondering how DevOps fits into that transformation and what a DevOps roadmap looks like?
Check out a webinar we offered recently, and send us any questions you might have!
Recently, I was part of a successful implementation of a project at a big financial institution. The project was the center of attention within the organization mainly because of its value addition to the line of business and their operations.
The project was essentially a migration project and the team partnered with the product vendor to implement it. At the very core of this project was a batch process that integrated with several other external systems. These multiple integration points with the external systems and the timely coordination with all the other implementation partners made this project even more challenging.
I joined the project as a Technical Consultant at a rather critical juncture where there were only a few batch cycles that we could run in the test regions before deploying it into production. Having worked on Agile/Scrum/XP projects in the past and with experience working on DevOps projects, I identified a few areas where we could improve to either enhance the existing development environment or to streamline the builds and releases. Like with most projects, as the release deadline approaches, the team’s focus almost always revolves around ‘implementing functionality’ while everything else gets pushed to the backburner. This project was no different in that sense.
When the time had finally come to deploy the application into production, it was quite challenging in itself because it was a four-day continuous effort with the team working multiple shifts to support the deployment. At the end of it, needless to say, the whole team breathed a huge sigh of relief when we deployed the application rather uneventfully, even a few hours earlier than what we had originally anticipated.
Once the application was deployed to production, ensuring the stability of the batch process became the team’s highest priority. It was during this time that I felt the resistance to any new change or enhancement. Even fixes to non-critical production issues were delayed because of the fear that they could potentially jeopardize the stability of the batch.
The team dreaded deployments.
I felt it was time for me to build my case to have the team reassess the development, build and deployment processes in a way that would improve confidence in any new change being introduced. During one of my meetings with my client manager, I discussed a few areas where we could improve in this regard. My client manager was quickly on board with some of the ideas and he suggested I summarize my observations and recommendations. Here are a few at a high level:
It’s common for these suggestions to fall through the cracks while building application functionality. In my experience, I have noticed they don’t get as much attention because they are not considered ‘project work’. What project teams, especially the stakeholders, fail to realize is the value in implementing some of the above suggestions. Project teams should not consider this as additional work but rather treat it as part of the project and include the tasks in their estimations for a better, cleaner end product.
In Daniel H. Pink’s book, Drive: The Surprising Truth About What Motivates Us, he discusses the motivations of knowledge workers. He makes the case that knowledge workers are driven by intrinsic factors and not the extrinsic factors of punishment and money. As he states, “Carrots & Sticks are so last Century. Drive says for 21st century work, we need to upgrade to autonomy, mastery, and purpose.” A great video covering his work is viewable at https://youtu.be/u6XAPnuFjJc. For more research on extrinsic and intrinsic motivation, start with the work of Edward Deci from the 1970s.
Here is an explanation of the three types of motivation:
Autonomy
This is the granting of control over their own work to those doing the work. Guidance is fine, but too much of it becomes micro-management, which can be detrimental to motivation. Valuable feedback, performance metrics, and boundaries can be all that is needed.
Mastery
This is an innate desire to get better at doing some task. If it is too easy, workers may get bored. If it is too hard and little progress is made, workers often get frustrated and give up. So tasks must be challenging, yet doable. And fostering an environment of continuous learning will add to motivation.
Purpose
This is tying the work to a cause larger than themselves. Workers who believe in that cause feel that the outcome of the work matters beyond just their own accomplishment.
According to Wikipedia:
DevOps is a software development method that stresses communication, collaboration, integration, automation, and measurement of cooperation between software developers and other information-technology (IT) professionals.
That certainly sounds like knowledge work to me. But are the three motivations the same for software developers and operations staff? And what might they be in a DevOps team? Let’s take a look in the chart below:
The management challenge then is to create a supportive culture where DevOps can flourish and the knowledge workers will be highly motivated by having aligned motivations of Autonomy, Mastery, and Purpose.
According to James P. Womack and Daniel T. Jones, “Lean Thinking is a business methodology which aims to provide a new way to think about how to organize human activities to deliver more benefits to society and value to individuals while eliminating waste.” In my opinion, Continuous Delivery and DevOps are the application of Lean Thinking to a part of the software development lifecycle. In particular, they apply to the processes that occur from the planning of software development to the deployment of the software into production.
But how do you know if you need Continuous Delivery and DevOps? Well, here are some typical candidates for applying Lean Thinking in your organization via Continuous Delivery (CD) and DevOps.
Long cycle time or lead time
A couple of lean metrics are important measures of the amount of time it takes to deliver value to the end user. One of them starts from the moment the feature is identified (lead time). The clock starts on the other when development of a feature begins (cycle time). DevOps is more applicable to the cycle time. If your organization has long cycle times, then CD and DevOps are a great approach to reveal where those delays are and start to eliminate them.
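As a rough illustration of the two clocks, the sketch below computes both metrics for a single feature. The Feature record and its dates are made up, and it assumes both clocks stop when the feature is deployed to production.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Feature:
    identified: datetime    # feature is first identified
    dev_started: datetime   # development begins
    deployed: datetime      # feature reaches production (assumed end of both clocks)


def lead_time_days(feature: Feature) -> int:
    """Days from identifying the feature to deploying it."""
    return (feature.deployed - feature.identified).days


def cycle_time_days(feature: Feature) -> int:
    """Days from the start of development to deployment."""
    return (feature.deployed - feature.dev_started).days


# Hypothetical feature history.
feature = Feature(
    identified=datetime(2015, 1, 5),
    dev_started=datetime(2015, 3, 2),
    deployed=datetime(2015, 4, 13),
)
print(lead_time_days(feature))   # 98
print(cycle_time_days(feature))  # 42
```

The gap between the two numbers (56 days here) is time the feature spent waiting before development started; the cycle time is the part that CD and DevOps practices act on most directly.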
If your development team is using Scrum, then long cycle times may already be very transparent. The development team should quickly be completing high value features regularly. But that doesn’t mean those features are getting deployed quickly.
Mistakes during deployment
When deployments do occur in your organization, are you experiencing technical issues? Ever have a roll back? These mistakes can be the result of many different causes. Generally, they come about because of the large number of manual processes during deployment. Problems occur because of communication issues between teams and a lack of familiarity with the steps necessary to deploy. Does each deployment seem new every time?
Another symptom is an organization where there is a “hero mentality” regarding deployment. What I mean by this is that there is an expectation that there will be lots of problems during deployment and some individual or small team will rescue the deployments by putting in lots of late hours, consuming multiple cans of soda, and eating pizza. This mentality is often even more entrenched when a particular individual becomes the “go to” person (the hero) for the deployments because only they know or can figure out how to do them. Sometimes the hero team or individual embraces this role, but often times they really don’t want the stress and constraints that it entails.
Large overhead in deployment process
Usually, as a direct result of the bad deployments mentioned above, organizations start to place much heavier governance on the deployment processes in an attempt to prevent mistakes. This can be manifested via complex change control processes. Usually, these are manual and include a change control board and heavy documentation (readiness reviews, traceability, etc.). As a result, deployments are slowed down even more. Sometimes so much overhead can cause a rush to get things done at the end by those who do the deployments, which leads to more issues. Also, because more people are involved with the communication of what needs to be done, there are further chances of errors occurring.
User impact from deployment
Does your organization take systems offline during deployments? How long are those downtimes? And once systems are brought back online, do users need to learn a large new feature set? Downtime can often have a negative impact on your mission to your users and drives a lot of organizations to deploy very infrequently. But the infrequency of deployment means that a lot of changes to the system are introduced during those deployments. Impact to users can be substantial and takes away from the value you are delivering to them.
Resistance from operations staff
Are your operations teams resistant to performing deployments? Would they rather see systems never change and just support the status quo? If so, then it is probably due to the complaints directed toward them because of the issues described above. Often they have little control to resolve those issues and feel blindsided by what they are getting from the development teams. I can assure you that it is a rare individual who enjoys the stress of a deployment gone wrong. Clearly DevOps can help with this.
Of course, there are other measurements and policies that can be used to assess if you need to make changes to a non-DevOps environment or even improvements in your DevOps environment. Do you have more ideas or want to know more about assessing your need for DevOps, Continuous Delivery, or Continuous Deployment? Leave a comment below or contact us.
As most of you who operate in the Federal space are probably aware at this point, many Federal agencies are now utilizing Agile methods such as Scrum to manage their software development efforts. The goal for most of them is to reduce risk and accelerate system delivery to their end users. By using Scrum with the development team they have achieved part of their goal. But major risks and speedbumps still exist after the software is developed. These are encountered during deployment by the operations groups and are normally outside the purview of the development team.
The de facto approach to this issue in the private sector is Continuous Delivery and DevOps. That same approach is now being successfully applied to the public sector. Just how well is the government doing in its attempts to adopt this private sector best practice? On November 18th Dr. David Patton, Federal Practice Director, and Ashok Komaragiri, Senior Technical Consultant, both with CC Pace, will be joined by Joshua Seckel and Jaya Kathuria from the Department of Homeland Security, Tina M. Donbeck from the U.S. Patent & Trademark Office and John D. Murphy, with the National Geospatial-Intelligence Agency, to take an in-depth look at the state of DevOps in the Federal government.
For additional information visit: http://www.potomacforum.org/content/agile-development-government-training-workshop-vi-devops-%E2%80%93-taking-agility-government-new
In my last post, I talked about the interesting Agile 2015 sessions on team building that I’d attended. This time we’ll take a look at some sessions on DevOps and Craftsmanship.
On the DevOps side, Seth Vargo’s The 10 Myths of DevOps was by far the most interesting and useful presentation that I attended. Vargo’s contention is that the DevOps concept has been over-hyped (like so many other things) and people are soon going to become disenchanted with it (the graphic below shows where Vargo believes DevOps stands on the Gartner Hype Cycle right now). I might quibble about whether we’ve passed the cusp of inflated expectations yet or not, but this seems just about right to me. It’s only recently that I’ve heard a lot of chatter about DevOps and seen more and more offerings, and that’s probably a good indication that people are trying to take advantage of those inflated expectations. Vargo also says that many organizations either mistake the DevOps concept for just plain operations or use the title to try to hire SysAdmins under the more trendy title of DevOps. Vargo didn’t speak to it, but I’d also guess that a lot of individuals are claiming to be experienced in DevOps when they were SysAdmins who didn’t try to collaborate with other groups in their organizations.
The other really interesting myth in Vargo’s presentation was the idea that DevOps is just between engineers and operators. Although that’s certainly one place to start, Vargo’s contention is that DevOps should be “unilaterally applied across the organization.” This was characteristic of everything in Vargo’s presentation: just good common sense and collaboration.
Abigail Bangser was also focused on common sense and collaboration in Team Practices Applied to How We Deploy, Not Just What, but from a narrower perspective. Her pain point seems to have been that technical stories weren’t well defined and were treated differently than business stories. Her prescription was to extend the Three Amigos practice to technical stories and generally treat technical stories like any other story. This was all fine, but I found myself wondering why that kind of collaboration wasn’t happening anyway. It seems like one should do one’s best to understand a story and deliver the best value regardless of whether the story is a business or a technical one. Alas, Bangser didn’t go into how they’d gotten to that state to start with.
On the craftsmanship side, Brian Randell’s Science of Technical Debt helped us come to a reasonably concise definition of technical debt and used Martin Fowler’s Technical Debt Quadrant to distinguish between different types of technical debt: prudent vs. reckless, and deliberate vs. inadvertent. He also spent a fair amount of time demonstrating SonarQube and explaining how it had been integrated into the .NET ecosystem. SonarQube seemed fairly similar to NDepend, which I’ve used for some years now, with one really useful addition: both NDepend and SonarQube evaluate your codebase compared to various configurable design criteria, but SonarQube also provides an estimated time to fix all the issues that it found with your codebase. Although it feels a little gimmicky, I think it would be more useful than just having the number of instances of failed rules in explaining to Product Owners the costs that they are incurring.
I also attended two divergent presentations on improving our quality as developers. Carlos Sirias presented Growing a Craftsman through Innovation & Apprenticeship. Obviously, Sirias advocates for an apprenticeship model, a la blacksmiths and cobblers, to help improve developer quality. The way I remember the presentation, Sirias’ company, Pernix, essentially hired people specifically as apprentices and assigned them to their “lab” projects, which are done at low-cost for startups and small entrepreneurs. The apprenticeship aspect came from their senior people devoting 20% of their time to the lab projects. I’m now somewhat perplexed, though, because the Pernix website says that “Pernix apprentices learn from others; they don’t work on projects” and the online PDF of the presentation doesn’t have any text in it, so I can’t double check my notes. Perhaps the website is just saying that the apprentices don’t work as consultants on the full-price projects, and I do remember Sirias saying that he didn’t feel good about charging clients for the apprentices. On the other hand, I can’t imagine that the “lab” projects, which are free for NGOs and can be financed by micro-equity or actual money, don’t get cross-subsidised by the normal projects. I feel like just making sure that junior people are always pairing and get a fair chance to pair with people they can learn from, which isn’t always “senior” people, is a better apprenticeship model than the one that Sirias presented.
The final craftsmanship presentation I attended, Steve Ropa’s Agile Craftsmanship and Technical Excellence, How to Get There, was both the most exciting and the most challenging presentation for me. Ropa recommends “micro-certifications,” which he likens to Boy Scout merit badges, to help people improve their technical abilities. It’s challenging to me for two reasons. First, I’m just not a great believer in credentialism because I don’t find that credentials really tell me anything when I’m trying to evaluate a person’s skills. What Ropa said about using internally controlled micro-certifications to show actual competence in various skill areas makes a lot of sense, though, since you know exactly what it takes to get one. That brings me to the second challenge: the combination of defining a decent set of micro-certifications, including what it takes to get each certification, and a fair way of administering such a system. For the most part, the first part of this concern just takes work. There are some obvious areas to start with: TDD, refactoring, continuous integration, C#/Java/Python skills, etc., that can be evaluated fairly objectively. After that, there are some softer areas that would be more difficult to figure out certifications for, though. How, for example, do you grade skills in keeping a code base continually releasable? It seems like an all-or-nothing kind of thing. And how does one objectively certify a person’s ability to take baby steps or pair program?
Administering such a program also presents me with a real challenge: even given a full set of objective criteria for each micro-certification, I worry that the certifications could become diluted through cronyism or that the people doing the evaluations wouldn’t be truly competent to do so. Perhaps this is just me being overly pessimistic, but any organization has some amount of favoritism and I suspect that the sort of organizations that would benefit most from micro-certifications are the ones where that kind of behavior has already done the most damage. On the other hand, I’ve never been a boy scout and these concerns may just reflect my lack of experience with such things. For all that, the concept of micro-certifications seems like one worth pursuing and I’ll be giving more thought to how to successfully implement such a system over the coming months.