Tuesday, October 24, 2017

U.S. warns public about attacks on energy - Reuters

The U.S. government issued a rare public warning that sophisticated hackers are targeting energy and industrial firms, the latest sign that cyber attacks present an increasing threat to the power industry and other public infrastructure.



Friday, August 18, 2017

Virtualization in the CIP Environment Drives Discussion of Applicable Systems Classifications

Here's a draft of a Whitepaper I intend to submit to NERC.

Alt Title:

More Problems With EACMS

Summary:

One of the topics entwined in the NERC CIP virtualization discussion is the risk posed by virtual management consoles. This is related to consolidated interfaces and automation, where management console access can change or delete entire infrastructures including virtual servers, networks, and storage. A shorthand phrase has been coined for this: the “Fewer, Bigger Buttons” problem.

Because this is a valid concern that is not really addressed at all in NERC CIP, the scope of discussion quickly grew to address similar Centralized Management Systems (CMS) in the physical arena as well, and from there to all kinds of systems which pose, or appear to pose similar risks.

Statement of the Issues:

The 2016-02 SDT has been discussing several options publicly. One option is to classify CMS (a heretofore undefined term in NERC CIP, and not an industry standard definition in Cyber Security either) as applicable systems and apply specific security requirements to them.

Another option the SDT has discussed is including CMS into the existing EACMS definition. This is more or less by default the approach taken to date with all of those legacy enterprise automation tools. Obviously, this capability and risk has existed, largely unacknowledged in CIP standards, for a long time without being confined to virtualized environments. HP Openview, IBM Tivoli, Solarwinds Orion, and CiscoWorks (to name just a few enterprise automation tools) have had the ability to affect the entire enterprise all at once for literally decades now.

Anti-malware and software patching systems pose the same issue. If a system in my network operates via a system account with administrative privileges that allow it to modify the configuration of a BCS, isn't that tool both a target and a potential attack vector? And if such a system inside my network is an EACMS, then aren’t Microsoft’s Software Update Services site, Ubuntu’s Linux repository, and Symantec’s or McAfee’s antivirus signature update sites on the Internet as much EACMS as any SCCM or anti-virus server inside my network?

However, this perpetuates and exacerbates an issue where EACMS has become a "catch-all" category of CIP-related Cyber Assets with one-size-fits-all requirements regardless of the degree of risk or technical constraints posed by the particular system.

An example is the Intermediate System. In order to make the IS subject to CIP requirements it has to be categorized somehow as a type of applicable system. To function as an intermediate in practical terms (and by definition per the NERC Glossary) it has to be outside the ESP. Apparently the IS has therefore been categorized as an EACMS simply because that is the only category currently available that allows for applicable systems to be subject to CIP requirements outside the ESP.

Thus, the otherwise-unconnected phrase “This includes Intermediate Systems” was tacked onto the end of the EACMS definition. It is notable that no other examples had previously been given.

The Definition of Intermediate Systems:

“A Cyber Asset or collection of Cyber Assets performing access control to restrict Interactive Remote Access to only authorized users. The Intermediate System must not be located inside the Electronic Security Perimeter.”

Here we see a presumably audit-able CIP requirement set in the definition of Intermediate System rather than in a table of requirements or in a security objective. We see a cyber security function (Authentication and Authorization) defined as a specific type of applicable device, and we see the security benefit of such a function truncated to apply only to users who interactively access Cyber Assets inside the ESP from outside the ESP rather than applying the benefit of robust authentication, authorization and accounting to all remote access.

Another example of the problem with the one-size-fits-all approach to compliance requirements for EACMS is what is known as the “hall of mirrors” effect. Specifically, there may be some types of cyber security systems that should be required to be protected behind a firewall. However, that requirement can’t exist for all EACMS without defining a new category, because a firewall is itself an EACMS. Without a new category, every EACMS would need to be inside an ESP and protected by another EACMS, which creates a recursive "hall of mirrors" effect without end.
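The recursion is easy to see if the rule is written down as a toy model. The sketch below is only an illustration: the `Rule` class, `protection_chain`, and the depth cut-off are hypothetical names invented here, and the "must sit behind a firewall" rule is an assumed requirement, not text from any CIP standard. The category labels EACS and EAG anticipate the split proposed later in this paper.

```python
from dataclasses import dataclass

MAX_DEPTH = 5  # arbitrary cut-off so the sketch terminates


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: which categories must sit behind a firewall."""
    must_be_behind_firewall: frozenset
    firewall_category: str  # the category the firewall itself belongs to


def protection_chain(category: str, rule: Rule) -> list:
    """List the chain of devices needed to protect one asset of `category`."""
    chain = [category]
    current = category
    while current in rule.must_be_behind_firewall:
        if len(chain) > MAX_DEPTH:
            chain.append("...")  # the requirement never bottoms out
            break
        current = rule.firewall_category
        chain.append(current)
    return chain


# One catch-all category: the firewall is itself an EACMS.
catch_all = Rule(frozenset({"EACMS"}), firewall_category="EACMS")

# Split categories: access control systems sit behind a gateway,
# but the gateway is not required to sit behind another gateway.
split = Rule(frozenset({"EACS"}), firewall_category="EAG")

print(protection_chain("EACMS", catch_all))
print(protection_chain("EACS", split))
```

With the single catch-all category the chain only stops because of the artificial cut-off. With separate categories it stops after one gateway.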

In addition to the catch-all and recursive problems I've just noted, there is also a missing component: risk-based assessment and mitigation. For example, a system that only monitors and logs access (such as monitoring systems like Splunk, Tripwire, etc.) does not pose the same level of risk as a management console for a large virtualized Control Center infrastructure. In addition, the technical controls to mitigate the risk may differ. A SIEM system presents a risk of leaking BES Cyber System Information; an electronic access control (AAA) system presents a risk of unauthorized access to or modification of a BES Cyber System’s operational parameters; and a Centralized Management Console (physical or virtual) presents an infrastructure reliability risk.

Risk-based assessments and mitigation would help with this. It would allow for acknowledging that there are a number of systems that both monitor and provide part of the solution of controlling access, but which do not actually control traffic at the point of entry. These devices or systems may or may not benefit from being inside a protected boundary, or they may form part of the strategy that protects BES Cyber Assets. The technical means of implementing some multi-part systems may require that components be outside or that they span the ESP.

All of this appears to be an unfortunate artifact of the single-level security mindset inherent to the ESP approach (hardened perimeter defense) and the “All-In” nature of CIP Applicable Systems. Part of this problem (creeping scope of applicability to increasingly peripheral systems) stems from using the ESP to define scope of applicability, rather than using risk to BES Cyber Systems and security objectives to define the scope of necessary controls. FERC does not allow a Registered Entity to assess and accept their own risks due to the interconnected nature of risk to the BES. At the same time, NERC and the Regions have zero incentive to accept risks on behalf of Entities. NERC and the Regions bear none of the cost of mitigation, and would receive the lion's share of criticism in the case of a failure of reliability or security breach. It is very difficult to create standards that are effective, comprehensible, and inclusive of different technical capabilities, based upon the existing definition of EACMS or even new definitions structured with the same ESP-as-hardened-perimeter mindset. ESP is easy to visualize. Drawing a “red dotted line” around assets needing protection is convenient. It’s simply not sufficient. In contrast, a modern defense-in-depth, systems- rather than device-based approach doesn’t require torturing definitions.

With these examples in mind, the issues caused by having EACMS as a “catch all” category are obvious. Some applicable systems were defined into existence for compliance purposes, in ways that are incompatible with broader standardized cyber security best practices, and one-size-fits-all requirements are then applied to the whole artificial group.

Recommendations:

So what do we do about it?
  1. The audit process needs to envision an approach more concerned with meeting an objective than a performance requirement. Then it doesn’t matter where something resides, as long as the applied combination of protections achieves the objective. It doesn’t matter much what is providing the control, as long as the assets needing protection receive the control. For example, the implied security objective of CIP 5 is not to “have inbound and outbound rules”. That is a limited method of achieving the real objective, which is to protect the BCS from unauthorized and potentially malicious traffic. If you have a better method, you should be allowed to use it.
  2. It might be necessary to break up the EACMS category of applicable systems into discrete functions (bullet points below) so that the appropriate security objective and requirements for each can be derived and applied whether the systems in question are physical or virtual.
  • SIEM - Security Incident & Event Monitoring systems.
This would subtract the “M” for “Monitoring” from EACMS. These are systems that strictly monitor and collect information about the ESP and BES Cyber System electronic communications or status but do not control access. A great deal of literature, discussion, guidance, and best practice is published across a broad range of industries on how to securely implement SIEM. The risk presented by compromise of these systems revolves around the information (such as configurations and event logs) they contain. The crux of the concept here is that the protections already defined for BES Cyber System Information (BCSI) are adequate and effective at providing protection for this information, and it is the information that needs protecting, not necessarily the SIEM system.


A rather large issue with EACMS is CIP-004 and its applicability of most personnel-oriented controls (training, background checks, etc.) to anyone with potential access to an EACMS. This kills sharing any service whatsoever (AAA, SIEM, etc.) because anyone with any form of “access” to an EACMS gets sucked into CIP-004. Have an account on an EACMS AAA server but NO access to any BCS? Too bad, you still must have a CIP background check and be trained on the CIP program (beyond your ‘need to know’). It’s a symptom of the “all-in” issue.

Splitting these monitoring systems out and adjusting the requirements may allow entities to more easily use outsourced managed security service providers or global/enterprise-wide SIEM systems, and to correlate event information in their CIP operational environments with that in their non-CIP environments to provide increased security and reliability benefits. The concern is that under current standards the CIP program and device-level CIP audits might be deemed to encompass Cyber Assets in the wider enterprise network or at the service provider which do not actually affect the reliable operation of the BES, and could therefore dilute attention from BES Cyber Security functions in favor of paperwork exercises.
  • EACS - Electronic Access Control System. The fundamental defining characteristic of an Electronic Access Control system is that it performs authorization of traffic or users. This is the gatekeeper function- the classic Authentication and Authorization functions of standard AAA.

In many cases these systems do not perform any active filtering of the traffic passing through any particular interface. The primary duty of EACS is to authenticate and authorize. Additional components of electronic access security strategies are accounting (logging) systems and gateways which actually pass or drop traffic. In comparison to the SIEM discussion above, EACS & EAG move beyond the risk of unauthorized access to meta-information about an environment to unauthorized access to and modification of operational parameters of the actual BES Cyber Systems.

In contrast, theoretically an application level IPS could send a control signal from anywhere in the enterprise to anywhere in the enterprise telling a specific host not to respond to a given type of packet, with a certain payload, from a particular address- and it could do this dynamically based upon heuristics rather than signatures, but this outstanding security solution would not be a compliant solution under our current regime.

Or a network level intrusion protection system combined with dynamic firewall rules may send a control message to a firewall instructing it to dynamically change an Access Control List (ACL) in response to traffic patterns indicating a threat. The IPS does not itself filter traffic in this scenario, but it is involved in controlling access. An Active Directory server may enforce a lock-out on a user account after hours or subsequent to a number of incorrect password attempts. It is involved in controlling access (authentication and authorization), but it is not blocking traffic at layer 3 and from the current NERC CIP ESP perspective is therefore irrelevant in boundary protection. All boundary protection requires access controls, but not all access control is boundary protection.
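The split between deciding and enforcing can be sketched in a few lines. This is a toy model under stated assumptions, not any vendor's API: the `Firewall` and `IntrusionPreventionSystem` classes, the `threshold` of three suspicious events, and the documentation-range addresses are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Firewall:
    """Enforcement point: the only component that passes or drops traffic."""
    denied_sources: set = field(default_factory=set)

    def add_deny(self, source: str) -> None:
        self.denied_sources.add(source)

    def permits(self, source: str) -> bool:
        return source not in self.denied_sources


@dataclass
class IntrusionPreventionSystem:
    """Decision point: watches events but never touches packets itself."""
    firewall: Firewall
    threshold: int = 3  # assumed number of suspicious events before blocking
    counts: dict = field(default_factory=dict)

    def observe(self, source: str, suspicious: bool) -> None:
        if not suspicious:
            return
        self.counts[source] = self.counts.get(source, 0) + 1
        if self.counts[source] >= self.threshold:
            # Control message: ask the gateway to change its ACL.
            self.firewall.add_deny(source)


fw = Firewall()
ips = IntrusionPreventionSystem(firewall=fw)
for _ in range(3):
    ips.observe("203.0.113.7", suspicious=True)
ips.observe("198.51.100.2", suspicious=True)

print(fw.permits("203.0.113.7"))   # False: blocked after three events
print(fw.permits("198.51.100.2"))  # True: below the threshold
```

The IPS object here controls access without ever sitting in the traffic path, which is exactly the kind of component a perimeter-only view has no category for.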

A metaphor for Access Control Systems that do not reside in or on an ESP/ESZ is: a pair of military units with interlocking fields of fire supporting each other against frontal assaults and flanking movements. Defense relies upon being positioned to assist one another, not on “being inside a fence”. Electronic Access Control Systems (AAA) can work to protect Cyber Assets inside the ESZ from anywhere.
  • EAG - Electronic Access Gateway. The fundamental defining characteristics of an EAG are that it hosts the EAP and performs the active function of filtering or forwarding traffic at the demarcation point (boundary protection). Primarily it is firewalls and routers that perform gateway functions at the layer 3 ESP boundary demarcation point. Virtual firewalls and virtual routers inside a hypervisor perform the same function in the same manner. However, hypervisors themselves may not be EAGs if they are not configured with a virtual firewall or virtual router function to provide a gateway function.
Modern security methods typically employ a defense in depth strategy using distributed AAA (Authentication, Authorization, and Accounting) systems to authenticate and authorize access to the Electronic Security Zone based upon characteristics of the user and traffic, while the EAG subscribes to the AAA service for user permissions and filters (permits or denies) traffic based on the source, destination, and port or protocol. Further, Electronic Access Control strategies often employ multiple devices, each containing a part of the AAA solution, such that compromise of one element of AAA does not result in the entire system failing. Vendor and platform diversity within a defense-in-depth systems-based approach to Access Control are generally an element of securing the entire system from vulnerabilities common to specific classes of devices (e.g. all-Windows or all-Linux environments may have common configuration or malware vulnerabilities). Often the Accounting (logging) function is used to determine (and formulate a strategy to correct) any failures in Authentication and Authorization. 
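A minimal sketch of that division of labor follows, assuming a gateway that checks the source address itself and defers the user decision to a separate AAA service. All class names, hosts, users, and the plain-text secret are hypothetical and for illustration only; a real deployment would use a proper AAA protocol rather than an in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class AAAService:
    """Authenticates, authorizes, and keeps an accounting log."""
    credentials: dict  # user -> shared secret (illustration only)
    permissions: dict  # user -> set of (destination, port) pairs
    log: list = field(default_factory=list)

    def authorize(self, user, secret, destination, port) -> bool:
        authenticated = user in self.credentials and self.credentials[user] == secret
        allowed = (destination, port) in self.permissions.get(user, set())
        decision = authenticated and allowed
        self.log.append((user, destination, port, decision))  # accounting
        return decision


@dataclass
class Gateway:
    """Filters at the boundary; subscribes to AAA for user permissions."""
    aaa: AAAService
    allowed_sources: set

    def forward(self, source, user, secret, destination, port) -> bool:
        if source not in self.allowed_sources:
            return False  # dropped on address alone; AAA is never consulted
        return self.aaa.authorize(user, secret, destination, port)


aaa = AAAService(
    credentials={"operator": "s3cret"},
    permissions={"operator": {("ems-01", 22)}},
)
gw = Gateway(aaa=aaa, allowed_sources={"10.0.0.5"})

print(gw.forward("10.0.0.5", "operator", "s3cret", "ems-01", 22))    # True
print(gw.forward("10.0.0.5", "operator", "s3cret", "ems-01", 3389))  # False
print(gw.forward("192.0.2.9", "operator", "s3cret", "ems-01", 22))   # False
print(len(aaa.log))  # 2: only the requests that reached AAA were logged
```

Neither object is a complete solution by itself: the gateway knows nothing about users, and the AAA service never sees a packet.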

Although some EAG devices are also capable of performing various levels of functions to authenticate and authorize traffic, many are not capable of complete AAA solutions in themselves and therefore differ enough from EACS to warrant different technical control measures. Requirements that acknowledge the difference and allow for handling them differently will prevent any "hall of mirrors" effects as described above.

For conceptual discussion purposes, an EAG acts somewhat like the legacy ESP: a logical demarcation point that delineates PCA and BCS from non-CIP-applicable Cyber Assets. It may even be useful to replace ESP completely, using ESZ with EAGs as the demarcation points.
  • CMS - Centralized Management System. System using an elevated privilege account either on behalf of an interactive user or in an automated fashion, allowing mass modification of BES Cyber Systems.
As discussed, these systems are the ones driving the SDT's apparent thought process. The risk posed by these systems is not just unauthorized access to information or BES Cyber Systems, but the ability to modify or destroy the infrastructure the BCS rely upon, or the BCS themselves. These will have unique requirements over and above the others. It would obviously not be beneficial to simply create a reclassification and documentation exercise for entities who would not see sufficient benefit.

Proposed definitions:

AAA: Authentication, Authorization and Accounting systems are Cyber Systems that control ‘Gatekeeper’ functions (electronic access methods and permissions, e.g. authentication of users or control messages that alter dynamic ACLs) for Cyber Systems.
CMS: Centralized Management Systems are Cyber Systems that perform automated management tasks and mass configuration of BCS whether scheduled or on demand using a dedicated service account credential.
EACMS: Deprecated.
EAG: An Electronic Access Gateway is a Cyber Asset that performs active electronic traffic control (filtering and/or forwarding) for ESZ boundary protection based upon the criteria given to it.
EAP: An Electronic Access Point is the logical interface on an EAG where traffic filtering operations take place. 
ESZ: An Electronic Security Zone is the logical container providing separation or isolation from threats or attack vectors to grouped Cyber Assets, said Cyber Assets being characterized by similar operational criticality, or sensitivity to compromised data confidentiality and integrity, as well as needing similar access controls, audit logging and/or monitoring requirements.
SIEM: Security Incident & Event Monitoring is a Cyber System that performs electronic monitoring of Electronic Security Zone(s) or Cyber Systems. 
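
To show how function-based classification might work in practice, here is a rough sketch that assigns the proposed categories from what a system does rather than where it sits. The `Function` flags, the `classify` helper, and the three example systems are hypothetical, and the risk strings are a paraphrase of the discussion above rather than proposed standard language.

```python
from enum import Flag, auto


class Function(Flag):
    """Functions a system performs (names are illustrative)."""
    MONITORS = auto()
    AUTHENTICATES_OR_AUTHORIZES = auto()
    FILTERS_AT_BOUNDARY = auto()
    MASS_CONFIGURES_BCS = auto()


# Primary risk per category, paraphrased from the discussion above.
PRIMARY_RISK = {
    "AAA": "unauthorized access to or modification of BES Cyber Systems",
    "CMS": "modification or destruction of the BCS or their infrastructure",
    "EAG": "unauthorized access to or modification of BES Cyber Systems",
    "SIEM": "disclosure of information such as configurations and event logs",
}


def classify(functions: Function) -> list:
    """Return every proposed category whose defining function is present."""
    if functions == Function.MONITORS:
        return ["SIEM"]  # strictly monitors, controls nothing
    categories = []
    if Function.AUTHENTICATES_OR_AUTHORIZES in functions:
        categories.append("AAA")
    if Function.MASS_CONFIGURES_BCS in functions:
        categories.append("CMS")
    if Function.FILTERS_AT_BOUNDARY in functions:
        categories.append("EAG")
    return categories


systems = {
    "log collector": Function.MONITORS,
    "directory server": Function.AUTHENTICATES_OR_AUTHORIZES | Function.MASS_CONFIGURES_BCS,
    "firewall": Function.FILTERS_AT_BOUNDARY | Function.MONITORS,
}

for name, functions in systems.items():
    for category in classify(functions):
        print(f"{name}: {category} ({PRIMARY_RISK[category]})")
```

A system that performs more than one function lands in more than one category, so it would pick up the requirements for each instead of a single one-size-fits-all set.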

Friday, May 5, 2017

The problem with EACMS

The NERC CIP Glossary is foundational to the (let us not forget "mandatory and enforceable") CIP Standards.

One of the terms defined there is Electronic Access Control or Monitoring System (EACMS):

"Cyber Assets that perform electronic access control or electronic access monitoring of the Electronic Security Perimeter(s) or BES Cyber Systems. This includes Intermediate Systems."

Disregard the last sentence for a moment. There are a few examples of throw-away statements like this added in NERC CIP for convenience, or exceptions where no one could think of wording that would be universally applicable.

Focus on the meat of the definition: it's an access control and access monitoring system. Outside of NERC CIP Standards, this is generally known as AAA: Authentication, Authorization, and Accounting, which actually captures the steps involved in granting and monitoring access much better than the EACMS definitions. There is also tons of guidance and information on implementing AAA in the broader IT Security realm.

But ignore that for a moment too.

The real problem with NERC CIP Standards and the applicable systems that they list, is that some systems that fall into the EACMS category (plus a number which don't) actually pose a much more significant risk than simple access management. They actually perform configuration management via "service accounts" with elevated privileges.

So for example, an Active Directory system is an EACMS, yet it not only controls access, it also controls configurations. But there is no requirement in NERC CIP to monitor the configuration changes made during a session, only access to the system: specifically, the failed and successful login attempts.

SCCM is not specifically an EACMS, even though it has an agent installed on Windows devices, and has an elevated privilege service account with Domain Administrator equivalent permissions. But it doesn't control or monitor access attempts.

There is no requirement to protect these systems any differently than any other user-accessible system, like perhaps a data portal on a web server. There is no requirement to separate user or system access within an ESP based upon roles or impact levels (another problem with NERC CIP is that impact level is based upon the facility's physical impact on the Bulk Electric System, not the Cyber System's impact on operations). Once you're in, you're in, and there is no requirement and not even any explicit security objective to do more than guard the perimeter. As I've mentioned before, this is the "hard crunchy shell, soft gooey center" model from 20 years ago.

On top of that, there's an exemption to remote access requirements for machine-to-machine communications. A management system located outside the Electronic Security Perimeter isn't even required to have encryption, and has no special requirements above and beyond the simple baseline, change management, and logging requirements applied to any system used to support the BES.

Weird.

Monday, May 1, 2017

P,P, & T

Security is people, processes, and technology. Of these, there's a reason why technology is listed last.

Gadgets simply don't suffice without people driving them in a consistent, smart manner.

I bring this up because of the focus on device-level controls, measures, and impact criteria in NERC CIP. It doesn't get much more technology-oriented than expecting a security solution to be all-in-one on a particular box, rather than based upon a combination of technical controls dispersed across the network, in combination with process controls and people at the helm monitoring.

Friday, April 28, 2017

GAAP... GASP...

Tom Alrich says that at the RF CIP workshop, Lew Folkerth pointed out that:
"the key to being able to audit non-prescriptive requirements is for the entity to have to demonstrate that the measures they took were effective."

A lesson can be taken here from the principles of GAAP: "Generally Accepted Accounting Principles".

Paraphrasing GAAP: There is no absolute, perfect accounting during an audit. There are only sliding scales of better and worse practices. You must depart from [the accepted practice] if following it would lead to a material [mistake]. In the departure you must disclose, if practical, the reasons why compliance with the principle would result in a [mistake].

I know that the electric utility industry doesn't like outside influence, and has a severe "not-invented-here" allergy. But we really do need to move toward "Generally Accepted Security Practices" (GASP, to flippantly coin an acronym.)

So the take-away is that an entity should be able to demonstrate either that the novel approach they took is effective via testing (such as penetration testing), OR that it is widely "accepted" as a security best practice. Making every entity extensively, intrusively pen test every single product, software update, and configuration is counterproductive, and outside the core competency of the entity. Vendors and security firms test these things and make recommendations. Other than due diligence in researching a solution and verifying the provenance and integrity of software or firmware to be installed, an entity shouldn't have any particular obligation to "prove" perfect security of a commercial software offering or hardware device, because it's a distraction from the real issue.

The real issue is providing overlapping defenses in depth.
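
As an aside on the due-diligence step mentioned above, checking the integrity of a downloaded update against a vendor-published digest takes only a few lines. This is a minimal sketch, assuming the vendor publishes a SHA-256 digest through a channel you already trust; the function names and any file path passed in are hypothetical.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large firmware images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, published_hex: str) -> bool:
    """Compare the file's digest with the value the vendor published."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of(path), expected)
```

A matching digest only shows the file was not altered after the digest was published. Establishing provenance additionally requires verifying the vendor's digital signature, which is outside this sketch.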

Thursday, April 27, 2017

Where does "Resilience" rank in NERC CIP?

The NERC CIP discussion has an oddly blank spot when it comes to the Reliability discussion. Reliability seems to mean "always available with no outage" to most people.

I'll agree, that's a good goal for the Bulk Electric System. It's just not a very achievable goal for every single Cyber Asset in the BES.

In the world of Disaster Preparedness/Disaster Response, IT security, Business Continuity Planning, and the remainder of the Critical Infrastructure sectors, the main conversation is around Resilience, not perfection. 

We know bad guys will win sometimes. We know mistakes will be made sometimes. It's important to be able to recover quickly with minimal residual damage when these things happen. We have to look at Risk Management in terms of 5 strategies:


  • Avoidance
  • Transference
  • Reduction
  • Mitigation
  • Acceptance


    Avoiding risk is difficult when you're a large, stationary, tempting target. Transference is not practical for Critical Infrastructure- there's nobody outside the system to insure you against damage.

    So we try to reduce risk by reducing or eliminating vulnerabilities. We can't do anything about reducing threats. Many of our threats are either criminal elements or nation-state actors. These threats are for law enforcement and military force to deter. We're stuck trying to close off attack vectors (vulnerabilities to particular types of threats). We try to reduce the probability of exploit by narrowing the windows of opportunity for threat actors to exploit those vulnerabilities with things like timely updates of security patches, malware signatures, and periodic tests of security mechanisms and logging. The problem is we can do a good job of whack-a-mole and still fail on a zero-day exploit (which is a combination of a threat and a vulnerability) that either didn't exist yesterday, or was unknown to us.

    FERC has told Responsible Entities that we're not allowed to accept risk on our own, because our entanglement in the Bulk Electric System means we'd be accepting risk for the whole system, not just ourselves. The Regions who audit us have permission to accept risk on our behalf but no incentive to do so because they bear none of the cost of mitigation for those risks, and would be exposed to criticism and worse if it came to light that a risk deemed "acceptable" had been exploited.

    So, mitigation.  It sort of overlaps with risk reduction in common conversation; people often use the term mitigation to mean reducing vulnerabilities. This isn't really precise from a Risk Manager's perspective. That's actually risk reduction. Mitigation is more about controlling the amount of damage that can be done if an exploit is successful.

    One way of limiting the damage is having a means to quickly restore vital capabilities. Another way is to have a means of rapidly addressing vulnerabilities once they are exploited. And we can also spread our eggs out into different baskets so any particular attack can only get to some of them. Anything that limits the scope, impact, or duration of damage is a mitigation strategy.

    Mitigation plans are the heart of Resilience. When Reliability efforts reach the point of diminishing returns, we need to start talking about contingencies, and that means Resilience. How many times does this concept appear in the NERC CIP standards? Without losing the emphasis on risk reduction, we need to start including resilience strategies in our planning.

    Wednesday, April 26, 2017

    Rapidly evolving threats & slow-moving regulatory standards

    Over at the Anfield Group Blog, Chris Humphreys posts: 

    "The DOE’s Quadrennial Energy Review Report states that:
    “The current cybersecurity landscape is characterized by rapidly evolving threats and vulnerabilities juxtaposed against the slower-moving prioritization and deployment of defense measures.” I lump regulatory standards and requirements into the “slower-moving prioritization and deployment of defense measures” as one of the key components to preventing a truly proactive stance on cybersecurity. Additional focus on recovery and resiliency needs to be a foundational element of any cybersecurity program because the idea that an organization can combat against 100% of cyber intrusions is false. What becomes critical is the recovery of the system if/when a successful cyberattack occurs."


    I couldn't agree more. We will never eliminate all risk. So it behooves us to have a backup plan: resilient recovery strategies. NERC CIP's specific language around redundancy doesn't dismiss the importance of redundancy, but a lot of NERC CIP compliance folks do. The language says one cannot exclude a Cyber Asset from scope of CIP simply because the system is redundant. Fair enough. Redundancy doesn't protect from software vulnerabilities, malware, or mis-configuration. But too many people seem to think that this means redundancy doesn't matter, and in fact, there doesn't appear to be any requirement to have redundancy for Cyber Assets.

    Something that NERC CIP doesn't do well: make clear that assessing technical controls for high availability at the systems level rather than at the device level can provide a more accurate perspective on real cyber security, and this high availability is achieved through redundancy of underlying infrastructure (perhaps switching and virtual network systems, or hypervisor infrastructure) that has little or nothing to do with BES functions, BES Information etc. Building resiliency in and eliminating reliance upon single devices (or as I like to call them, "single points of failure") is a key part of virtualization's benefit.

    The entire mindset behind and promoted by the NERC Glossary and the definition of BES Cyber Asset is to blame for this lack. Add that to the prescriptive requirements, device-centric example measures, and the device-oriented Severity Level tables, and you get a self-reinforcing echo chamber about how to achieve reliability that makes it difficult to look outside the way it has always been done.

    Tuesday, April 25, 2017

    Monday, April 24, 2017

    A Tale of Two Viewpoints: Mixed Trust vs Shared Infrastructure



    In NERC CIP Standards Drafting efforts, industry chatter, and auditing, there has been quite a bit of talk about “mixed trust”, meaning an environment that has both BES Cyber Systems and Cyber Assets not subject to CIP standards. Non-CIP Cyber Assets may be systems that are under the Responsible Entity’s control, and may even be providing functions related to Grid Operation, but not functions which “if rendered unavailable, degraded, or misused for 15 minutes will affect the reliability of the Bulk Electric System”. For example, corporate business systems are not BES Cyber Assets, even though they are under the control of the same Responsible Entity as the BES Cyber Systems.


    Here’s the thing; “mixed trust” is not the right term. Mixed trust would imply that Cyber Assets of different trust levels can access each other in an uncontrolled fashion. Nobody is proposing a relaxation of security controls between CIP and non-CIP assets. What is being proposed is “shared infrastructure”. At some level we all have shared infrastructure- the same building, the same power, the same Internet connection. Shared Infrastructure doesn’t mean “mixed trust”. Shared Infrastructure can have logical controls and isolation involved.


    A few days back, I wrote about Streetlight Effect in relation to Lew Folkerth's "Lighthouse" article in the March-April issue of Reliability First's newsletter. In it he talks about “Zones of Authority” in relation to the audit process for NERC compliance. At the end of the article he makes an assertion about Virtualization being a bad idea not because of actual security concerns, but because of the auditor’s inability to look at things outside of the designated Electronic Security Perimeter (compliance concerns) required for BES Cyber Assets. While I respect Lew's experience and appreciate the viewpoint, I disagreed with that approach pretty strenuously.


    Tom Alrich has another perspective on the article. He makes a point about security being enhanced if the RE’s entire network were in scope for NERC CIP Compliance audits. I don’t think I can agree with that either (although he does acknowledge that the average compliance specialist would rather repeatedly hit themselves in the head with a hammer than take this approach because of the burden of paperwork involved, so it doesn’t appear to be a serious suggestion.) Such an approach would only be a net gain if compliance evidence production didn’t overwhelm the efforts to secure things in the first place. And if all of the compliance requirements were strictly security requirements and not just designed to make audits easier.


    But realistically, bringing security mechanisms that exist outside the ESP (Electronic Security Perimeter) into scope of auditing doesn’t require making the entire corporate network subject to inspection by the Regions. Here's the problem with CIP-005 and ESP. Most NERC CIP standards require you to have a program or process to accomplish X. CIP-005, on the other hand, requires everything peripherally related to BES Cyber Systems to reside inside the ESP. What we have here is the "hard crunchy shell" concept from 20 years ago. It provides “bright line” criteria for audit and categorizing assets as either in-scope or out, because the auditor can require a diagram with the asset shown inside a neatly drawn “dotted red line” (inside joke for NERC CIP compliance specialists), but it doesn’t provide good security on its own.

    Some of this difficulty is definitional. A PCA ("Protected Cyber Asset") is any cyber asset "associated" with a BES Cyber System and can exist "in or on" an ESP, but this is only made specific in the NERC Glossary. A BES Cyber Asset must be contained in an ESP (CIP-005). An Intermediate System must reside outside the ESP. An EACMS can be in or out.


    Documenting a mechanism that provides a security function doesn’t require that everything else on that network be in scope for auditing. Authentication, Authorization, and Accounting (AAA) may be partly provided by an RSA Token system. That AAA system could reside inside or outside of the security zone where clients of the system exist. The documented controls for CIP AAA could bring that RSA server into scope without touching other, unrelated non-CIP servers on its network segment.


    The key to providing good security for BES Cyber Assets doesn’t rely solely upon whether a device is inside a particular perimeter. It depends upon the controls applied to accessing that device. A layered defense of the device includes boundary identification & control, but it doesn’t stop at the outer boundary.

    Let's address a couple specific points:


    "Lew says (or implies) that auditors are having differences of opinion with entities on the security of mixed-trust switches. It seems these entities have switches that implement both ESP and non-ESP VLANs. When the auditors tell them this isn’t secure, the entities point out that the non-ESP VLANs have just as good security as the ESPs do. So why aren’t they safe?"


    Let's examine a scenario with two VLANs, one named "CIP" and one named "non-CIP". The main point really isn't that the security of "non-CIP" is as good as "CIP". The most important thing is that network isolation is maintained between them. The configuration of devices in one VLAN is irrelevant to devices in the other VLAN. If it's just a Layer 2 VLAN, there is no "mixed trust" because the traffic from "non-CIP" doesn't mix with that of "CIP". If it's a Layer 3 switch, and routing takes place between the two VLANs, then an EAP must be identified (probably the VLAN's logical or virtual interface) and inbound/outbound access controls applied at that point. If it is a Hypervisor situation, then the same principle applies. vSwitch "non-CIP" doesn't have a traffic path that includes vSwitch "CIP" and the Guests cannot communicate with each other. If you create an EAP between them, then you apply the inbound and outbound access controls at that point.
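
    The two-VLAN scenario can be sketched as a toy model in Python. This is only an illustration of the reasoning, not any vendor's configuration model: the Switch and Rule classes, the can_communicate method, and the port numbers are all hypothetical names chosen for the example. A pure Layer 2 switch has no path between VLANs at all, while a routed (Layer 3) switch passes only what the access point explicitly permits.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """One permit entry enforced at the access point (hypothetical format)."""
    src_vlan: str
    dst_vlan: str
    port: int


@dataclass
class Switch:
    """Toy model of a shared switch carrying several VLANs."""
    vlans: set
    routed: bool = False                      # False = Layer 2 only
    acl: list = field(default_factory=list)   # permit rules, default deny

    def can_communicate(self, src_vlan, dst_vlan, port):
        if src_vlan not in self.vlans or dst_vlan not in self.vlans:
            raise ValueError("unknown VLAN")
        if src_vlan == dst_vlan:
            return True        # same VLAN, same broadcast domain
        if not self.routed:
            return False       # Layer 2 only: no inter-VLAN path exists
        # Layer 3: the routed interface is the access point; default deny.
        return Rule(src_vlan, dst_vlan, port) in self.acl


l2 = Switch(vlans={"CIP", "non-CIP"})
l3 = Switch(
    vlans={"CIP", "non-CIP"},
    routed=True,
    acl=[Rule("non-CIP", "CIP", 443)],
)

print(l2.can_communicate("non-CIP", "CIP", 443))  # False: isolated
print(l3.can_communicate("non-CIP", "CIP", 443))  # True: explicitly permitted
print(l3.can_communicate("non-CIP", "CIP", 22))   # False: not permitted
```

    In neither case does the configuration of the devices sitting in "non-CIP" enter into the decision; only the switch's own configuration does.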

    And lest we forget, L2 VLAN is not the only way to have a shared switch infrastructure. Software Defined Networks and Network Overlays exist on a shared hardware infrastructure, but provide network isolation between the security zones as defined in configuration. This isolation, like that of the VLAN example above, is provided as a baseline function of the device, implemented in the control plane by the device's code base. Access to modify the configuration and code base is confined to a management plane interface. Traffic transiting the device doesn't have access to this function; only administrators accessing the device in the management plane do.


    "Anything else, such as a VLAN that isn’t an ESP, is completely out of his or her purview; the auditor just has to assume these are completely insecure, and thus shouldn’t be found on the same switch as ESP VLANs."


    Auditors aren't forced to assume anything. The configuration of the switch is in scope, because it provides isolation to the "CIP" VLAN. (It doesn't provide an ESP under the current model, because it doesn't provide an EAP with inbound/outbound ACLs. It simply provides isolation, which is one flaw in CIP-005's prescriptive approach. CIP-005 would benefit from being changed to a security objective of isolating more critical assets and traffic from less critical.) The configuration of devices in the "non-CIP" VLAN is out of scope. If something is out of scope, they are not required to render an opinion on it.


    "the auditors can’t look at what isn’t in an ESP, which limits the kinds of evidence an entity can show them. "


    Currently, auditors can examine EACMS that may exist outside the ESP. They can examine Intermediate Systems which are required to be outside the ESP. I don't think this is a valid assertion. It's an auditing approach preference with deliberate blinders on.


    "most entities, if told they could force CIP auditors to consider security controls they have implemented for non-ESP networks, but only in return for having at least some of the cyber assets on those non-ESP networks fall into scope for CIP, would say “Thanks but no thanks. We’ll leave things as they are.”


    Perhaps the entity might choose that. But the issue isn't "security controls implemented for non-ESP networks." It is "security controls that exist outside the ESP, that are implemented to protect CIP Assets". ESP is a fundamentally limited construct, and must be considered fatally flawed when used without other layered defenses.

    Friday, April 21, 2017

    Cost: FREE
     
    Date:           April 25-26, 2017
    Upcoming Location:     Denver, CO
     

    https://aws.amazon.com/government-education/events/automating-security-cloud-workshop/
     

    Thursday, April 20, 2017

    From the EnergySec website:

    "EnergySec published comments on the Transmission Owners Control Center and Virtualizations white papers the Standards Drafting Team is seeking comments on. Since EnergySec does not formally vote on SDT ballot issues, we are not formally submitting these comments to the SDT. However, we are making them available for our members to use or revise as they see fit. The documents are available in the Members section of the EnergySec Community wiki."


    Not only are they not formally submitted, it would appear that almost no one on the SDT or the NERC staff associated with it actually has access to read them. You would think that something like this would be of more use to their membership if it were not behind a paywall.

    Tuesday, April 18, 2017

    CIP 9 and Hypervisors

    Carlo asked a question about CIP 9 in comments to a previous post.

    I hope this answers it well enough:

    Backup solutions are going to be specific to your Hypervisor choice. If you're working with a Type I Hypervisor (Bare Metal), your backup solution (for the Hypervisor itself) may very well be “install fresh from vendor media”, because there is very little specific data to be restored, and which Host the Guest resides upon is transparent to the Guest and to users of the system. If you have a Type II Hypervisor that includes an operating system, and possibly performs functions other than Hypervisor duties (which I would strongly recommend AGAINST), then your backup solution may be more complex.

    In most cases, backups for Hypervisors are not urgent because you plan and implement spare capacity into your infrastructure. You should have more Hypervisor capacity than is necessary for the Guests in each cluster. This allows for maintenance of Hypervisors (patching, upgrades) without taking any functionality offline. You simply move Guests from the Hypervisor being taken offline to other Hypervisors for the duration of the procedure. In an unplanned outage, the same process is used. Guests don’t rely on any specific Hypervisor; they simply need a Hypervisor.
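
    As a rough illustration of that capacity planning, the Python sketch below checks whether the Guests on any one Hypervisor would fit on the remaining ones. It is a minimal sketch under loud assumptions: RAM is the only resource considered, placement uses a simple first-fit heuristic, and the host names, guest names, and function names are hypothetical. It does not represent how any particular cluster manager makes placement decisions.

```python
def can_evacuate(hosts, offline):
    """Return True if every Guest on `offline` fits on the remaining Hosts.

    `hosts` maps host name -> {"ram_gb": total, "guests": {guest: ram_gb}}.
    First-fit on RAM only (an assumption). Because this is a heuristic, a
    False result means this simple packing failed, not that no placement
    could ever exist.
    """
    free = {
        name: h["ram_gb"] - sum(h["guests"].values())
        for name, h in hosts.items()
        if name != offline
    }
    # Place the largest Guests first.
    to_move = sorted(hosts[offline]["guests"].items(), key=lambda kv: -kv[1])
    for _guest, need in to_move:
        target = next((n for n, f in free.items() if f >= need), None)
        if target is None:
            return False
        free[target] -= need
    return True


def tolerates_any_single_outage(hosts):
    """True if any one Hypervisor can go offline, planned or unplanned."""
    return all(can_evacuate(hosts, name) for name in hosts)


cluster = {
    "hv1": {"ram_gb": 64, "guests": {"ems": 16, "historian": 16}},
    "hv2": {"ram_gb": 64, "guests": {"hmi": 32}},
    "hv3": {"ram_gb": 64, "guests": {"ad": 8}},
}
overloaded = {
    "hv1": {"ram_gb": 64, "guests": {"ems": 60}},
    "hv2": {"ram_gb": 64, "guests": {"hmi": 60}},
}

print(tolerates_any_single_outage(cluster))     # True
print(tolerates_any_single_outage(overloaded))  # False
```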

    Backup solutions for Guests can work exactly the same as your traditional network-based backup management software. I wouldn't do it that way, but it's possible.

    Most Virtual Cluster Management consoles allow for "snapshots" of the running state of your Guest. This is much more complete than a backup of files and directories, and requires merely rebooting to a previous state rather than a process of restoring files and settings to a base image.

    The advantage here is that if your target to be restored is participating in a security domain such as Active Directory, you're merely restoring a previous state, not deleting and recreating AD objects with unique and possibly conflicting GUIDs. It's more complicated if it is an AD Domain Controller; this may require authoritative or non-authoritative restore procedures (depending on the state of the AD domain and how corrupted it may be).

    Depending on your Hypervisor choice, the Guest’s configuration data (number of processor cores, RAM, storage targets etc.) may be contained in the image of the Guest or in your private cloud cluster manager. This configuration data is rather static and rarely changes. In most cases, the cluster manager is a virtual machine itself and can be rebooted from a snapshot. It could also be backed up across the network using a traditional backup application.

    If you’re using standalone, unmanaged Hypervisors (perhaps for cost reasons) then you have more manual planning to do. You have to make sure that you have a process to identify target destination Hosts with adequate resources and you should also maintain functional redundancy or security zone separation for manual moves of Guests during planned or unplanned outages. For automating Guest movements, this is managed by setting affinity of certain Guests to targeted Hosts.
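
    For the manual case, the checklist can be written down as a small selection routine. The Python below is a hedged sketch, not a tool recommendation: it assumes, purely for illustration, that each Host is dedicated to one security zone, that RAM is the limiting resource, and that redundant Guests must not land on the same Host. All field names and the pick_target function are hypothetical.

```python
def pick_target(guest, hosts):
    """Pick a destination Host name for a manual Guest move, or None.

    guest: {"name", "ram_gb", "zone", "redundant_with": set of Guest names}
    hosts: list of {"name", "free_ram_gb", "zone", "guests": set of names}
    """
    candidates = [
        h for h in hosts
        if h["zone"] == guest["zone"]                     # zone separation
        and h["free_ram_gb"] >= guest["ram_gb"]           # adequate resources
        and not (h["guests"] & guest["redundant_with"])   # keep redundancy
    ]
    if not candidates:
        return None
    # Prefer the Host with the most headroom left.
    return max(candidates, key=lambda h: h["free_ram_gb"])["name"]


hosts = [
    {"name": "hv-a", "free_ram_gb": 48, "zone": "CIP", "guests": {"ems-2"}},
    {"name": "hv-b", "free_ram_gb": 24, "zone": "CIP", "guests": {"hist-1"}},
    {"name": "hv-c", "free_ram_gb": 96, "zone": "corp", "guests": set()},
]
ems_1 = {"name": "ems-1", "ram_gb": 16, "zone": "CIP",
         "redundant_with": {"ems-2"}}

print(pick_target(ems_1, hosts))  # hv-b
```

    Here hv-a is skipped because it already runs the redundant partner, and hv-c is skipped because it sits in a different zone, even though it has the most free memory.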

    For CIP-009-6 specifically, nothing changes until CIP-009-6 R1.3. Here we need to note that the Hypervisor doesn’t have any BES Cyber System Information. The Hypervisor is just a container, and doesn’t interact with the code base of the BES Cyber Systems themselves. So “processes for the backup and storage of information required to recover BES Cyber System functionality” {emphasis added} apply more strictly to the Guests individually (Cyber Asset) and to the Hypervisors as a general function (Cyber System).

    R1.4 and 1.5 are also going to have to be applied at a Cyber Systems level rather than Cyber Asset.
     

     
     

    Virtualization Webinar Today (NERC CIP Standards Drafting)


     

     
    Industry Webinar
    Project 2016-02 Modifications to CIP Standards Virtualization in the CIP Environment

    April 18, 2017 | 3:00 – 4:30 p.m. Eastern
     
    Dial-in: 1-415-655-0002 | Access code: 731 913 110

    Background
    In addition to the Order No. 822 directives, the Project 2016-02 Modifications to CIP Standards Drafting Team (SDT) is addressing four issues identified by the CIP Version 5 Transition Advisory Group (V5 TAG). These issues are:
    · Cyber Asset and BES Cyber Asset Definitions;
    · Network and Externally Accessible Devices;
    · Transmission Owner (TO) Control Centers Performing Transmission Operator (TOP) Obligations; and
    · Virtualization
     
    Virtualization of Cyber Assets provides advantages for the availability, resiliency, and reliability of applications and functions hosted in such an environment when implemented in a secure manner. The SDT is offering a series of webinars in an effort to provide a resource for technical information related to several concepts relevant to virtualization. During the first webinar, held on March 21, 2017, the SDT discussed logical isolation and Centralized Management Systems (CMSs), in addition to introducing storage virtualization.
     
    Webinar Objectives
    This second webinar will discuss the Hypervisor with a special focus on template considerations, as well as multi-tenancy and the concepts of underlay hardware and Electronic Security Zones. This webinar will also expound on the introduction to storage virtualization provided in the first webinar, to include discussion of storage area networks and the manner in which underlying virtualization concepts are similar to server virtualization. Finally, this webinar will discuss storage virtualization scaling and data leakage.
     
    For more information or assistance, contact Katherine Street (via email) or at (404) 446-9702 or Mat Bunch (via email) or at (404) 446-9785
    3353 Peachtree Road NE
    Suite 600, North Tower
    Atlanta, GA 30326
    404-446-2560 | www.nerc.com

     

    Monday, April 17, 2017

    Definitions matter

    There is a need to revise CIP language to clarify “programmable” due to differences between the current NERC CIP definition of Cyber Asset, the language in Section 215 of the Energy Policy Act of 2005 that discusses Cyber Assets as Electronic Programmable Devices, and commonly understood security standards and definitions of computers or cyber devices.

    There is a perception that the particular wording of “Cyber Asset” is a deliberate, well-thought-out, and legally binding definition. However, there are multiple inconsistencies between NERC CIP Standards, the Energy Policy Act, and FERC Orders which have not been legally challenged and have not prevented progress from being made to security standards. This being so, there is no practical benefit in objecting to modifications based on a presumption of precision in the original wording.

    These varying definitions have caused some confusion in categorizing Cyber Assets as in-scope. There may be gains to be achieved by modifying the definition to be more consistent both internally and with cross-sector IT security practices that are more technically in line with the way devices are designed by vendors and intended to be operated. When the parsing of the grammar becomes too circular, the utility of the definition is lost. The main goal of NERC CIP standards MUST be usefulness of the standard.

    Some commenters have made the point that NIST does not use the term “Cyber Asset” and recommend using the term “computer”. However “computer” also has connotations of server/workstation to many people and is not inclusive of other information processing devices such as network and security appliances, cyber-physical industrial control system devices, etc. “Cyber Asset” is a workable, comprehensible and inclusive term that provides benefit to the security discussion and therefore should be retained.

    Cyber Assets are platforms which can accept variable sets of encoded instructions known as operating systems and software programs. They use these instructions to manipulate data inputs to create outputs in the form of processed data or in the case of Cyber-Physical devices, control signals. This programming is stored in either volatile or non-volatile memory, and may reside in the device or on other devices in the overall Cyber System that provides storage services to the device.

    Conversely, dedicated devices which perform a function defined purely by the physical configuration of the device (dip switches, jumper connectors, or EEPROM) and not in a changeable, encoded set of logic-based instructions are not generally considered to be Cyber Assets, but rather microprocessors. The modification of that dedicated function (control plane logic) is not programmable via a human or network-accessible communications interface (management plane) that can be interacted with logically by other Cyber Assets. Re-programming requires physical modifications to the microprocessor device, often by a vendor technician at a factory using tools that change the physical or electrical properties of the device. These devices are not in any practical way “programmable” by the user, and the risk of them being re-programmed maliciously or covertly is mitigated by physical access controls.


    Additionally, devices which have stored firmware that is not accessible unless installed in another device (such as, but not limited to, internal/external hard drives, flash drives, Ethernet or wireless NICs, USB security dongles, serial adapters, etc.) are not Cyber Assets in themselves because they are not capable of being re-programmed or executing code without being installed (permanently or temporarily) in a Cyber Asset. These types of devices are peripheral components of a Cyber Asset or removable media. While these devices may pose a risk of carrying malware, the means of mitigating that risk is separately covered by removable media controls and supply chain requirements.
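
    The three cases above can be summarized as a simple decision procedure. The Python below is only a sketch of the distinctions proposed in this post; it is not the NERC Glossary definition, and the attribute names, category labels, and example devices are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    # Runs an operating system or software that can be replaced or changed.
    accepts_encoded_instructions: bool
    # Can be re-programmed through a human or network-accessible interface.
    has_management_interface: bool
    # Can execute code on its own, without being installed in another device.
    executes_code_standalone: bool


def classify(device):
    """Apply the three-way distinction proposed above (a sketch only)."""
    if not device.executes_code_standalone:
        return "peripheral component or removable media"
    if device.accepts_encoded_instructions and device.has_management_interface:
        return "Cyber Asset"
    return "dedicated device, not a Cyber Asset"


examples = [
    Device("network switch", True, True, True),
    Device("jumper-configured timer", False, False, True),
    Device("USB flash drive", True, False, False),
]
for d in examples:
    print(f"{d.name}: {classify(d)}")
```

    Writing the rule out this way makes the ordering of the questions explicit: whether the device can execute code on its own is asked first, and programmability only matters after that.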

    Friday, April 14, 2017

    Do you really need a requirement for that?

    Over at the Anfield Blog, CEO Chris Humphreys says:


    "3. Does your organization really need a formal standard to tell you that you should be testing any/all third party software/hardware before deploying it within your operational environment? 
    This is the most alarming concern I have. If you answered “yes” to the above question, the state of security within our industry is in horrible shape. Nothing gets me more fired up than when I speak to a security “expert” at a utility he says: “There’s no NERC requirement for me to do that.” I’m sure that’s exactly what the Iranians said before they installed those PLCs in their nuclear reactor."
    {my own emphasis added in the second paragraph.}

    I really can't add anything to that, other than it's not just a Supply Chain issue. It sort of places some other people's opinions about the benefit of "Mandatory and Enforceable Standards" in context.

    Thursday, April 13, 2017

    Cyber Assets vs Cyber Systems Confusion in the Virtual Environment

    I'm concerned to still see discussion in the Nerc-isphere about how to categorize a VM in the most simple of clear-cut examples: the virtual server/hypervisor combination. Some folks still disagree that it is necessary to treat each virtual machine and hypervisor as separate Cyber Assets. They think:


    "The hypervisor (parent) is the device or software which runs the virtual machine (child). The virtual machine (VM) cannot operate without the hypervisor. This shared relationship means that neither can be separate Cyber Assets. For example, if a VM has been identified as a BES Cyber Asset (BCA); the hypervisor that runs the VM is also a BCA; which also applies to PACS, EACMS, and PCA’s

    Treating the VM and hypervisor as separate Cyber Assets can cause mixed-trust virtual environments; the hypervisor runs CIP and corporate VM’s. CIP controls are only being applied to the CIP VM and not the hypervisor; even though the hypervisor “if rendered unavailable, degraded, or misused” can impact the CIP and corporate VM’s."


     Allow me to counter:

    The hypervisor (parent) is the device or software which runs the virtual machine (child).

    The Hypervisor or Host merely provides a container or environment for the virtual machine. In this respect the Hypervisor operates certain control plane functions on behalf of the Guest. However, the Hypervisor is not involved in the control plane¹ decisions made by the Guest. The Hypervisor does not need to (and should not be configured to) interact in the data plane of the Guests. They make computing decisions entirely independent of each other, including their reactions to inputs and malware. If an RE decided to treat Host and Guest as one Cyber Asset for compliance reasons, there would be no logically consistent framework to require the long-standing and well-known security best practice of separating the management and data planes of the Hypervisor and Guests. The Guest should be completely unaware of and unable to interact with the Host in a secure virtual environment. The Host and Guest more often than not run different operating systems, and it would be difficult to categorize that as one Cyber Asset for any practical purpose.

    The virtual machine (VM) cannot operate without the hypervisor.

    This is not strictly correct. More precisely, the VM cannot operate without a hypervisor. It is not dependent upon any specific hypervisor (rather, it can exist on any Hypervisor in the cluster and this is one of its biggest advantages). Therefore treating them as one Cyber Asset is inappropriate because this approach would not require distinct vulnerability assessments, patching, baseline configuration etc.

    This shared relationship means that neither can be separate Cyber Assets.

    The assumption in this statement doesn’t work in both directions. Presuming that a Guest could not be a separate (meaning independent) Cyber Asset does not preclude a Hypervisor from being an independent Cyber Asset. After all, the Hypervisor is a complete hardware, operating system, (and potentially software) stack in itself. Depending on whether it is a Type I or Type II, it might even be the same operating system as the Guests, with Hypervisor function software merely installed on top of a generic operating system. It exists with a hostname, an address, a particular set of open ports/APIs, and a network connection, and it responds to network traffic. The Hypervisor can exist and operate without a single Guest inside.

    For example, if a VM has been identified as a BES Cyber Asset (BCA); the hypervisor that runs the VM is also a BCA; which also applies to PACS, EACMS, and PCA’s

    This may be true; it doesn’t negate the necessity of treating the Host and Guest as distinct Cyber Assets (for multiple reasons). At most it makes the entire assemblage a BCS.

    Treating the VM and hypervisor as separate Cyber Assets can cause mixed-trust virtual environments; the hypervisor runs CIP and corporate VM’s.

    This is known as argumentum ad consequentiam (appeal to consequences) and is a known logical fallacy. Whether or not shared infrastructure is allowed and whatever the consequences of doing so, it does not change the distinct character of the Guest vs the Host. It also presumes that sharing infrastructure between CIP-applicable systems and “Corporate” VMs is impossible to secure and must be prohibited. This remains to be proven.

    CIP controls are only being applied to the CIP VM and not the hypervisor; even though the hypervisor “if rendered unavailable, degraded, or misused” can impact the CIP and corporate VM’s.

    Again, this is an unproven assumption. Treating the Hypervisor and the Guest as separate Cyber Assets does not require a difference in controls. If both are BES Cyber Assets then the same requirements apply to both. Additionally, the technical configuration controls applied to a Hypervisor are generally different from those applied to a Guest (malware protection, for example). Again, the Host and Guest may run different operating systems with different open ports, APIs, software vulnerabilities, and baseline capabilities.
    The question is whether a BCA Hypervisor can host a PCA or non-CIP Applicable System safely. The advisability of this approach depends upon whether or not sufficient security controls can be put in place to render negligible the risk of unavailability, degradation, or misuse of the BCA Hypervisor and associated BCA Guests. Risk assessment and acceptance are highly subjective and specific questions that depend more upon the overall architecture and defense in depth posture, than upon a simplistic question of "to virtualize, or not to virtualize."


    (1) Control plane functions are not directly accessible by users of the system; they are embedded in the logic of the code base and generally require modification to the code base to change them. Virtualizing a server involves abstracting the hardware interactions much like the Hardware Abstraction Layer (HAL) does in Windows. The difference is that the Guest operating system simply sends its hardware access requests to the Hypervisor rather than to firmware. This has a three-fold security benefit.


    1. The users and software accessing the Guest in the Data plane cannot substitute malware in place of authorized device drivers. Since device drivers are one of the worst vectors for malware after phishing attacks, this narrows the attack surface of the Guest OS.
    2. The Hypervisor can be a different operating system than the Guest. Attacks in the Data plane aimed at the Guest's firmware will be ineffectual against the Guest, because it has no real device drivers and no direct access to hardware. They will also be ineffective against the Hypervisor, because there is no Data plane access between the two, and the device drivers that actually do interact with hardware are for a different operating system than the one visible to the attacker.
    3. Drivers that actually interact with the hardware are generally a smaller subset of better-vetted drivers approved by the Hypervisor vendor, as long as you make the smart choice and go with a Type I Bare Metal Hypervisor.