MUSC Information Security Guidelines: Risk Management

Author: Richard Gadsden
Contact: gadsden@musc.edu
Version: 0.9
Date: 09 Jan 2006
Status: DRAFT

1   Purpose and Scope

These guidelines are intended to help MUSC System Owners to meet the risk management responsibilities that are assigned to them by MUSC's information security policies. The guidelines explain how to conduct a risk assessment, how to oversee the execution of a security plan, and how to monitor and evaluate the effectiveness of a system's security controls.

These guidelines apply to all MUSC faculty, students and staff who serve in system ownership roles, in all of the entities that comprise the MUSC enterprise.

2   Applicable MUSC Policies

3   Applicable MUSC Standards

4   Risk Management Cycle

Exposure to some risk is a normal part of doing business. Likewise, managing risk is a normal and necessary part of managing any business effectively. In any well-run business, if you're responsible for risk management, then you'll be expected to follow a process that looks something like this, no matter what kind of risks you're charged with managing:

[Figure: graph showing the assess-plan-implement-measure cycle]

Figure 1: Risk Management Cycle

To manage your risks effectively, first you have to identify the sources of risk, and their potential impacts. Then you can choose a cost-effective and practical set of security controls (safeguards) to help you manage those risks, and develop a plan for their implementation. After these security controls have been implemented, their effectiveness has to be monitored and evaluated. The findings are then fed back into the next iteration of the risk management cycle.

MUSC's information security policies require that a documented risk management process must be followed throughout the life of any information system that contains protected information. This requirement also applies to certain other systems that are considered mission critical, even though they may not contain protected information.

If you are the designated owner of an information system, then you have overall responsibility for the system's risk management process.

For any system within a large and complex enterprise like MUSC, risk management is a team effort, requiring contributions from appropriate management, IT personnel, and key users of a system. The system owner is the leader of the system's risk management team, and ensures that risks are identified and understood, that appropriate security controls (safeguards) are selected and implemented, and that the effectiveness of the controls is monitored and evaluated, throughout the life of the system.

5   Risk Assessment Concepts

The risk assessment process comprises the first two major steps in the overall risk management cycle:

  • Identify issues and analyze risks
  • Prioritize and select security controls

5.1   Risk Components

An information security risk can arise from anything that threatens the availability, integrity or confidentiality of information. Risks are a function of the specific threats and vulnerabilities that potentially affect your system, the likelihood (probability) of their affecting your system, and the potential impacts on your system, on the MUSC enterprise, and on individuals if they do occur.

5.2   Goals of the Risk Assessment Process

All information security risk assessments at MUSC serve the same basic purpose: to select a rational set of security controls or safeguards that will minimize the total cost of information security to the MUSC enterprise, while meeting all of MUSC's ethical obligations, regulatory requirements, and accreditation standards. The total cost of security includes both the cost of security controls, and the cost of security breaches.

The fundamental goal of information security is to protect against threats to the confidentiality, integrity, and availability of information. This goal can be expressed in terms of meeting three basic types of security objectives:

  • Confidentiality: Preserving authorized restrictions on information access and disclosure, including means for protecting personal privacy and proprietary information.
  • Integrity: Guarding against improper information modification or destruction, including ensuring information non-repudiation and authenticity.
  • Availability: Ensuring timely and reliable access to and use of information.

When dealing with information protection, it is important to recognize that there is no such thing as perfect security. And while too little security leads to unacceptable risks, trying to impose too much security (or the wrong kinds of security) on the users, operators or administrators of a system is a waste of money, and isn't likely to be effective anyway.

The goal of the risk assessment process is to determine the right amounts and the right kinds of security, needed to achieve a reasonable, appropriate, and responsible degree of protection, at the lowest possible total cost.

Risk assessment has two other important goals:

  • Regulatory compliance
  • Maintaining public confidence in MUSC

State and federal regulators, and the public at large, expect MUSC's senior management to meet a standard of due care in assuring the protection of all the sensitive and critical information that is entrusted to MUSC's stewardship. Senior management in turn expects us to deliver appropriate information protection, and to deliver it efficiently. We cannot meet these expectations if we fail to understand our risks, if we fail to develop plans to manage our risks efficiently and effectively, or if we fail to execute those plans. This is what risk assessment is all about, and why it is such a critical piece of MUSC's information security program.

5.3   Cost of Security

At a very high level, two components contribute to the total cost of security for a system. The first component is the summed cost of all the system's security controls themselves: for example, the costs of administering user accounts and passwords, and the costs of setting up and operating routine data backup and recovery procedures. The other major component is the expected cost (damages) of security breaches: for example, the costs of lawsuits, fines and reputation damage that would be incurred if a system were compromised and sensitive and/or critical information about patients or other customers were destroyed, or exposed to the wrong people.

As a rule, the more we invest in security controls for a system (as long as we invest our money rationally), the less we expect to spend on damages from security breaches, and vice versa. This general principle is illustrated in the following figure:

[Figure: graph showing optimal level of security at minimal cost]

Figure 2: Optimal Level of Security

An effective risk assessment process for a given system is one that enables the owner of the system to find the optimal level of security. The assessment process should lead not only to the right amount of security for a system, but also to the right types of security for the system. The latter objective can be achieved only by first identifying the most significant risks that affect the system, and then by selecting and implementing the most cost-effective controls for managing those risks. The most significant risks are those that contribute the most to the expected cost of security breaches.
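The trade-off between the two cost components can be sketched numerically. The following Python fragment is only an illustration: the candidate security levels and every dollar figure in it are hypothetical, invented to show how the total cost is minimized, and are not MUSC data.

```python
# Hypothetical candidate security levels:
# (label, annual cost of controls, expected annual cost of breaches).
# All figures are invented for illustration only.
candidates = [
    ("minimal", 10_000, 200_000),
    ("basic", 40_000, 90_000),
    ("enhanced", 80_000, 30_000),
    ("maximal", 200_000, 20_000),
]


def total_cost(candidate):
    """Total cost of security = cost of controls + expected cost of breaches."""
    _label, control_cost, breach_cost = candidate
    return control_cost + breach_cost


# The optimal level is the one with the lowest total cost.
optimal = min(candidates, key=total_cost)
print(optimal[0], total_cost(optimal))
```

With these made-up numbers, spending more on controls pays off only up to a point; beyond it, the added control cost exceeds the reduction in expected breach cost.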

5.4   Risk Assessment Teams

The assessment process requires us to identify the most significant risks to a system, to understand the various techniques that can be used to control these risks, to understand the organization's ability and capacity to implement these controls, and last but not least, to be aware of any minimum security control standards that must be met as a result of security policies or regulations that apply to the system.

Given the depth and breadth of knowledge and skills required, it should not be surprising that an effective risk assessment normally requires a multi-disciplinary team effort. The more risks that potentially affect a system, and the more people within the MUSC enterprise that the system touches, the more important it is for you, as the system owner, to assemble a knowledgeable and skilled risk assessment team.

6   Steps in the Risk Assessment Process

The risk assessment process consists of the following steps:

  • Identify potential security issues
  • Analyze the risks that these issues create
  • Evaluate possible controls (safeguards) to manage these risks
  • Select and prioritize appropriate (cost-effective) controls
  • Document and communicate a security plan

6.1   System Identification

Before they can identify potential threats, vulnerabilities and other security issues, the members of your assessment team will need to have a solid understanding of how your system is put together (or how you expect it to be put together, if it is still in the design or development stage).

Network diagrams, information flow diagrams, and tables of hardware and software components are essential tools for understanding a system. The boundaries of your system, and its interfaces with external systems, network infrastructure components, and end-user devices, must also be clearly understood by the members of your assessment team. If you haven't already done so, you will need to develop the appropriate diagrams and tables, and familiarize your assessment team members with the architecture of your system, before going any further.

You are encouraged to use MUSC's System Identification template to help you document this information for your risk assessment team.

6.2   Identifying Security Issues (Threat-Vulnerability Pairs)

A threat is defined as the potential for a "threat-source" to intentionally exploit or accidentally trigger a specific vulnerability. In a somewhat circular fashion, a vulnerability is defined as a weakness in a system's security procedures, design, implementation, or internal controls, that could be accidentally triggered or intentionally exploited, and result in a violation of the system's security policies. For a violation (breach) to occur, both a threat and a vulnerability have to exist.

A specific type of security breach will result from a specific threat acting upon a specific vulnerability. In other words, a particular threat-vulnerability pair will, in a sense, define a particular type of security breach. For all practical purposes, the terms security breach and threat-vulnerability pair are logically equivalent.

The first step in assessing the risks to your system is to start developing a list of all the threat-vulnerability pairs that could affect your system. We strongly recommend that you use the Risk Analysis Worksheet to record the list of threat-vulnerability pairs that your assessment team generates.

The Reference section of these guidelines has additional information on the kinds of threats and vulnerabilities that your assessment team should consider, along with some examples of threat-vulnerability pairs (security issues) that often come up in risk assessments.

The Reference section also lists some vulnerability assessment tools that can be used to identify potential technical vulnerabilities that may affect one or more of your system's components. If your assessment team needs help with these or any other vulnerability assessment tools, please contact the ISO.

Developing a list of potential threat-vulnerability pairs requires a broad range of knowledge and skills, and it is best done as a brainstorming exercise involving the entire assessment team. Try to keep the team members focused on threats and vulnerabilities that they can reasonably anticipate, that have a non-negligible likelihood (probability) of occurring, and that would have a non-negligible impact if they did occur.

Your list of security issues doesn't need to be 100% complete before you can move on to the next step in your assessment; your team can always add issues to the list later, if they realize that they have overlooked a significant threat or vulnerability.

6.3   Identifying Other Security Issues (Compliance Issues)

The Risk Analysis Worksheet is also the appropriate place to document any additional (external) security compliance requirements that apply to your system, including requirements imposed by MUSC policies, state and federal regulations, and any accreditation standards that apply to your system.

At MUSC, you can use the Policy Compliance Checklist for System Owners to help you identify these kinds of compliance issues.

You should document each unmet compliance requirement from the checklist as a Security Issue in your Risk Analysis Worksheet. For these issues, you should leave the Likelihood and Impact columns blank, because they're not relevant to compliance issues, and you should rate the Risk Level as High to reflect the fact that this issue represents a mandatory compliance requirement. You will then evaluate, select, and document controls to address these compliance issues, just as you would any other Security Issue with a High associated Risk Level.

6.3.1   Security Issues at Post-Implementation Stage (Existing Systems)

If you are performing a risk assessment for an existing system, then obviously your assessment team should be thoroughly familiar not only with the system's current architecture, but also with its current state. The assessment team needs to be aware of the results of any evaluations of security controls that have already been implemented. If no recent evaluation has taken place, then an overall evaluation of the system's current security controls should be completed before attempting a Post-Implementation risk assessment. The Policy Compliance Checklist for System Owners can be a useful tool for guiding an evaluation of existing controls.

If the system's existing security controls are known to be working effectively, then the risk assessment team can focus all of its attention on assessing the "new" risk issues that arise from the specific environmental, operational, regulatory, or policy change(s) that motivated this Post-Implementation risk assessment in the first place. In other words, if all of the "old" risk issues are still under control, then the Post-Implementation assessment team does not really need to worry about them.

On the other hand, if recent evaluation has shown that any of the existing controls have not proven to be effective, then the assessment team needs to broaden the scope of its review to include all of the "old" security issues (threat-vulnerability pairs and/or compliance issues) that originally motivated the implementation of these "old" controls which turned out to be ineffective. Post-Implementation risk assessment teams do need to worry about any "old" security issues that are not currently under control.

6.4   Quantitative vs. Qualitative Analysis

The type of risk posed by a particular type of potential security breach is defined by the threat and the vulnerability that together create the risk of the breach and its adverse effects. The level of risk depends on two additional factors:

  • the likelihood that the threat will act upon the vulnerability
  • the potential impact of the breach that would result

For the purposes of risk assessment, we usually express the likelihood or probability of an event in terms of its expected frequency of occurrence.

There are two basic approaches to risk assessment: quantitative and qualitative.

In a quantitative assessment, your objective is to express the level of risk in cold, hard, monetary terms. To do this, you must first document a numerical value for the probability of occurrence of each type of potential security breach; you'll have to use whatever valid historical frequency data you can find that would be applicable to your system. To quantify potential impacts, you need to compute an actual monetary value for the total expected losses from each type of security breach; you'll have to do a detailed and thorough financial impact analysis to compute this quantity.

In principle, a quantitative risk assessment is ideal for risk-based decision-making, because it allows the risk assessment team to precisely compute the level of risk from each type of potential breach, and to express the level of risk in cold, hard, monetary terms. This is usually done using the formula for Annualized Loss Expectancy (ALE), in which the expected annual loss from each type of potential breach is simply its frequency (expected number of breaches of that type per year) multiplied by its impact (total expected losses from each breach of that type, in dollars):

Quantitative Risk Formula

ALE(x) = Frequency(x) × Impact(x)

This formula just says that for a given type of security breach x, the annualized loss expectancy due to x is simply the product of two factors: the expected frequency of x (the expected number of occurrences of x per year), and the potential impact of each occurrence of x (the total expected damages and losses from each occurrence, in dollars).

By computing an ALE for each type of security issue, the risk assessment team can assign a dollar value to each issue's level of risk. Then, simply by sorting the security issues by their risk levels (their ALEs), the assessment team can assign each issue its appropriate priority for risk management. If the estimates of frequency and impact values used in the risk calculations were accurate and reliable, it would be hard to improve on this as a method for establishing priorities.
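As a minimal sketch of the quantitative approach, the following Python fragment computes an ALE for a few security issues and sorts them by it. The issue names, frequencies, and dollar impacts are hypothetical examples, not measured data.

```python
def ale(frequency_per_year, impact_dollars):
    """Annualized Loss Expectancy: expected breaches per year x loss per breach."""
    return frequency_per_year * impact_dollars


# Hypothetical issues: name -> (expected frequency per year, impact per breach in dollars).
issues = {
    "stolen laptop holding unencrypted data": (0.5, 100_000),
    "virus outbreak on workstations": (4, 5_000),
    "flood in the server room": (0.01, 1_000_000),
}

# Sort from highest to lowest ALE to obtain risk management priorities.
ranked = sorted(issues, key=lambda name: ale(*issues[name]), reverse=True)

for name in ranked:
    print(f"{name}: ${ale(*issues[name]):,.0f} per year")
```

The ranking is only as trustworthy as the frequency and impact estimates that feed it.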

However, a significant problem with quantitative risk analysis is that most threat probabilities and impacts are extremely difficult to estimate reliably. Security breaches occur relatively rarely, and organizations tend not to publicize them; as a result, most sources of incidence information are essentially anecdotal, and cannot be used to support the development of reliable probability or frequency estimates. Likewise, the total expected cost of the potential losses from a given type of breach is hard to estimate; it could depend on factors such as length of downtime, amount of adverse publicity, and other such factors that are highly variable and inherently difficult to predict without a great deal of uncertainty.

Because of these difficulties with quantitative analysis, we recommend a qualitative approach to risk assessment. In a qualitative assessment, you determine relative risk rather than absolute risk. Although you may give up some precision, you simplify your analysis significantly. In a qualitative analysis, you don't attempt to quantify risk precisely. Instead, you try to produce good, but rough, estimates of risk levels. For most assessments, a simple scale with just three levels of risk (Low, Moderate, and High) is good enough to allow your risk assessment team to identify the most significant risks, and to assign mitigation priorities with a reasonable degree of confidence that all significant risks will be addressed.

6.5   Assessing Likelihood (Frequency)

Recall the two factors that contribute to the level of risk created by a given threat-vulnerability combination:

  • the likelihood that the threat will act upon the vulnerability
  • the potential impact of the breach that would result

In risk analysis, we usually express the likelihood of an event in terms of its expected frequency of occurrence. At MUSC, in keeping with our qualitative approach, we recommend using a simple 3-level scale, as shown in the following table, for rating the estimated likelihood of a given type of breach:

Rating Scale for Likelihood (Frequency)

Rating      Estimated Frequency of Occurrence
High        > 12 times per year
Moderate    1 - 11 times per year
Low         < 1 time per year

Remember, the whole point of a qualitative risk assessment is to assess relative risks, and to determine which risks are the most significant and therefore should be assigned the highest priority for management. If using the frequency ratings suggested above results in all, or nearly all, of your potential breaches being assigned the same likelihood, then you should consider adjusting your frequency thresholds, in order to get some useful separation; otherwise, you'll ultimately find it harder to identify the threat-vulnerability pairs that pose the highest risks.
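The rating scale, with adjustable thresholds, can be sketched in Python as follows. The function and parameter names are hypothetical. Note one assumption: the scale as written does not say how to rate a frequency of exactly 12 times per year, so this sketch treats 12 or more as High.

```python
def rate_likelihood(times_per_year, moderate_min=1, high_min=12):
    """Map an estimated yearly frequency to a Low/Moderate/High rating.

    Assumption (not stated in the rating scale): frequencies at or above
    high_min are rated High, so exactly 12 times per year counts as High.
    """
    if times_per_year >= high_min:
        return "High"
    if times_per_year >= moderate_min:
        return "Moderate"
    return "Low"


print(rate_likelihood(0.2))  # less than once per year
print(rate_likelihood(6))    # several times per year
print(rate_likelihood(52))   # about weekly

# If nearly every issue lands in the same rating, the thresholds can be
# adjusted to get more separation, for example:
print(rate_likelihood(0.2, moderate_min=0.1, high_min=1))
```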

We recommend that you use your Risk Analysis Worksheet to record the assessed Likelihood (Frequency) rating for each threat-vulnerability pair that your assessment team identified.

6.5.1   Assessing Likelihood at Post-Implementation Stage (Existing Systems)

If you are performing a Post-Implementation risk assessment, then your assessment team should generally rate likelihoods or frequencies of occurrence with respect to their understanding of the system's current state, as derived from the most recent available evaluation of the effectiveness of the system's current security controls.

For example, if the system already has one or more effective security controls in place that have reduced the expected frequency of a potential security issue from High (the likelihood of the issue if no controls were already in place) to Low (with the current controls in place), then your team should rate the Likelihood (Frequency) of that particular issue as Low.

6.6   Assessing Impacts

The second factor that determines the level of risk created by a given threat-vulnerability pair is the potential impact of the breach that would result if that specific threat did act against that specific vulnerability.

The impact of a breach depends on the effects that the breach would have on MUSC operations, on MUSC assets, and on individuals - typically students, patients, or other customers or stakeholders. Depending on the particular threat-vulnerability pair and the particulars of your system, the breach could result in one or more undesirable types of outcomes, including:

  • Disclosure or unauthorized viewing of confidential information
  • Unauthorized modification of sensitive information
  • Loss or destruction of important information
  • Interruption in availability or service

Undesirable outcomes like these can potentially affect MUSC and its customers in many different areas, including the following:

  • Life, health and well-being of MUSC's students
  • Life, health and well-being of MUSC's patients
  • Life, health and well-being of other MUSC customers/stakeholders
  • Life, health and well-being of MUSC's faculty and/or employees
  • Damage to MUSC's reputation and loss of customer confidence
  • Interference with MUSC's ability to meet its mission obligations
  • Civil/criminal penalties, fines, damages, settlements, and other legal costs

To characterize the overall impact of a particular type of breach affecting a particular system, you should consider potential impacts in all of these areas, and in any other areas that may be relevant to your system. The overall rating that you give to the impact of a security breach should be the "high water mark" of its impact across all areas. For example, if the impact in all areas is low, then rate the overall impact as Low. Likewise, if the highest impact in any area is moderate, then rate the overall impact as Moderate. And finally, if the impact in any area is high, then rate the overall impact as High.
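The "high water mark" rule amounts to taking the maximum rating across all impact areas. A minimal Python sketch follows; the area names and ratings are hypothetical examples for a single breach type.

```python
# Ratings in increasing order of severity.
LEVELS = ["Low", "Moderate", "High"]


def overall_impact(area_ratings):
    """Return the highest rating found across all impact areas."""
    return max(area_ratings.values(), key=LEVELS.index)


# Hypothetical per-area ratings for one type of breach.
ratings = {
    "patient well-being": "Low",
    "reputation": "Moderate",
    "legal costs": "Low",
}
print(overall_impact(ratings))
```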

The NIST Standards for Security Categorization of Federal Information and Information Systems were published as Federal Information Processing Standards Publication 199 (FIPS 199) in February 2004. This document established a simple impact rating standard, that can be used equally well inside and outside the federal government. FIPS 199 defines a simple, three-level rating system for potential impacts. The three standard impact ratings (Low, Moderate, and High) are defined as follows:

  • The potential impact is Low if: The loss of confidentiality, integrity, or availability could be expected to have a limited adverse effect on organizational operations, organizational assets, or individuals.
  • The potential impact is Moderate if: The loss of confidentiality, integrity, or availability could be expected to have a serious adverse effect on organizational operations, organizational assets, or individuals.
  • The potential impact is High if: The loss of confidentiality, integrity, or availability could be expected to have a severe or catastrophic adverse effect on organizational operations, organizational assets, or individuals.

Obviously, the terms limited, serious, and severe or catastrophic are subject to interpretation, but the FIPS 199 standard does provide a recommended interpretation (amplification) of these terms. See the sidebar on Interpreting FIPS 199 Terms.

We recommend that you use your Risk Analysis Worksheet to record the assessed Impact rating for each threat-vulnerability pair that your assessment team generated.

6.7   Calculating Risk

Recall that for a given type of security breach or threat-vulnerability combination, the level of risk that it poses is the product of two factors: its likelihood, and its potential impact.

In a qualitative risk assessment, we do not attempt to literally calculate levels of risk, in the same sense that we would calculate an Annualized Loss Expectancy or ALE in a quantitative assessment. But we do still need to be able to "calculate" different levels of risks so that we can compare them, even though the two basic risk factors (likelihood and impact) are expressed in non-numeric, qualitative terms.

If you use the three-level FIPS 199 scale for assessing your two risk factors, as recommended in these guidelines, then you can use the following multiplication table to determine a qualitative Risk level for each threat-vulnerability pair. In the following table, the row headings denote your rated potential Impact of a particular breach, and the column headings denote your rated Likelihood. The entries within the table then give you a qualitative rating of the overall Risk level, which is "computed" as the "product" of Impact and Likelihood:

Qualitative Risk Multiplication Table

                        Likelihood
Impact      Low         Moderate    High
Low         Low         Low         Moderate
Moderate    Low         Moderate    High
High        Moderate    High        High

For example, if the rated Impact is Low, then you should rate the Risk-level as Low if the rated Likelihood is Low or Moderate, and you should rate the Risk-level as Moderate if the Likelihood is rated High, and so on.
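The multiplication table can be expressed as a simple lookup. The following Python sketch encodes the table above, with Impact as the outer key and Likelihood as the inner key; the names RISK_TABLE and risk_level are hypothetical.

```python
# Outer key: rated Impact (table rows). Inner key: rated Likelihood (table columns).
RISK_TABLE = {
    "Low":      {"Low": "Low",      "Moderate": "Low",      "High": "Moderate"},
    "Moderate": {"Low": "Low",      "Moderate": "Moderate", "High": "High"},
    "High":     {"Low": "Moderate", "Moderate": "High",     "High": "High"},
}


def risk_level(impact, likelihood):
    """Look up the qualitative Risk level for an Impact/Likelihood pair."""
    return RISK_TABLE[impact][likelihood]


print(risk_level("Low", "High"))
print(risk_level("High", "Moderate"))
```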

There is an important caveat to keep in mind when rating risk levels, no matter what methodology or formula you use. Not all "equally rated" risks should necessarily be viewed or treated equally.

To illustrate why this is the case, consider the fact that a high-impact, low-probability breach, and a low-impact, high-probability breach will both end up having the same Moderate risk-level rating when you use the formula for "calculating" the Risk-level. But the impact of the first breach would be considered severe or catastrophic, even if it is unlikely to actually occur, while the impact of the second breach would be considered relatively minor, even if it is expected to occur often. Which type of breach poses the more significant risk? The devil, as they say, is in the details. Your team might well conclude that the first type of breach is much more important to protect against, because it could literally put MUSC out of business if it ever does occur, and therefore assign it a higher priority when selecting and prioritizing security controls.
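One possible way to reflect this caveat when ordering a list of issues is to break ties between equally rated risks by their impact rating. The Python sketch below shows that idea; it is an assumption made for illustration, not a rule prescribed by these guidelines, and the issue labels are hypothetical. Your team's judgment still takes precedence over any mechanical ordering.

```python
ORDER = {"Low": 0, "Moderate": 1, "High": 2}

# Two hypothetical issues that both receive a Moderate risk rating.
issues = [
    {"issue": "frequent minor breach (hypothetical)",
     "impact": "Low", "likelihood": "High", "risk": "Moderate"},
    {"issue": "rare catastrophic breach (hypothetical)",
     "impact": "High", "likelihood": "Low", "risk": "Moderate"},
]

# Sort by risk level first, then by impact, highest first.
ranked = sorted(
    issues,
    key=lambda i: (ORDER[i["risk"]], ORDER[i["impact"]]),
    reverse=True,
)

for i in ranked:
    print(i["risk"], i["impact"], i["issue"])
```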

There are limitations in any formal risk assessment methodology. You can and should use your assessments of relative risk levels, whether derived quantitatively or qualitatively, to help you select and prioritize the security controls for your system, as we'll discuss in the next section. But you also need to exercise good judgment, and be aware of the limitations of whatever formal risk assessment methodology you use.

We recommend that you use your Risk Analysis Worksheet to record the assessed Risk level rating for each threat-vulnerability pair that your assessment team generated.

Tip

Remember, all compliance issues should be arbitrarily assigned a High Risk rating. The Likelihood and Impact columns do not apply to compliance issues, and can be left blank.
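The handling of compliance issues alongside threat-vulnerability pairs can be sketched as a single record type. The class and field names below are hypothetical and do not reflect the actual layout of the Risk Analysis Worksheet; only the rating rules are taken from these guidelines.

```python
from dataclasses import dataclass
from typing import Optional

# Keys are (impact, likelihood), following the qualitative risk table.
RISK = {
    ("Low", "Low"): "Low",
    ("Low", "Moderate"): "Low",
    ("Low", "High"): "Moderate",
    ("Moderate", "Low"): "Low",
    ("Moderate", "Moderate"): "Moderate",
    ("Moderate", "High"): "High",
    ("High", "Low"): "Moderate",
    ("High", "Moderate"): "High",
    ("High", "High"): "High",
}


@dataclass
class WorksheetRow:
    """One security issue, as it might be recorded on a worksheet (hypothetical)."""
    security_issue: str
    is_compliance_issue: bool = False
    likelihood: Optional[str] = None  # left blank for compliance issues
    impact: Optional[str] = None      # left blank for compliance issues

    @property
    def risk_level(self) -> str:
        # Compliance issues are always rated High.
        if self.is_compliance_issue:
            return "High"
        return RISK[(self.impact, self.likelihood)]


row = WorksheetRow("unmet policy requirement (hypothetical)", is_compliance_issue=True)
print(row.risk_level)
```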

6.8   Selecting and Prioritizing Security Controls

At this point, your team has identified all threat-vulnerability pairs, estimated values (ratings) for their likelihood and their potential impacts, and calculated at least a qualitative level of risk for each threat-vulnerability pair. Your team may also have identified some compliance issues that need to be addressed (and automatically assigned a High level of risk to them).

Now your team needs to decide (or in some cases, help your senior management to decide) how to appropriately deal with each of the risks that have been identified.

The goal now is to specify a rational set of security controls to appropriately manage these documented risks.

Selecting a rational (optimal) set of controls requires a broad range of knowledge and skills. If the members of your risk assessment team do not have a sound, collective understanding of all the different techniques that can be used to control information security risks, and of the overall ability and capacity of the people within the MUSC enterprise to implement the various types of technical, operational, and administrative controls that should be considered, then your team will not be able to select and recommend an optimal set of controls.

For each threat-vulnerability pair, and for each compliance requirement, your assessment team will need to discuss and evaluate control options. There may be many different control options for addressing each issue, and some control options may address multiple issues. The goal is to select an optimal set of controls -- one that meets all compliance requirements, and addresses all known risks in an acceptable manner, at the lowest overall cost to MUSC, and with the least overall impact on the MUSC enterprise.

A cost-benefit analysis is generally needed to compare the various control alternatives, or in many cases, various alternative combinations of controls. Selecting an optimal set of controls requires your team to consider many different alternatives, and to carefully evaluate the alternatives in terms of their expected effectiveness, feasibility, and overall cost and impact.

Note

Your assessment team should not develop its control recommendations in isolation. The MUSC Office of the CIO (OCIO) in general, and the Information Security Office (ISO) in particular, can help your assessment team to select controls for your system that will work in harmony with other systems within the MUSC enterprise. Encourage your assessment team to coordinate their selection of controls with the ISO.

The end result of your assessment team's evaluation should be a prioritized list of recommended security controls. The priority that your assessment team assigns to each recommended control should be based on the importance of the issue(s) that the control would address. Issues with the highest rated risks, including any issues that arise from a policy/regulatory requirement, should all be reflected in the recommended control priorities. Your control recommendations should document which issue(s) would be addressed by each control.
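As a rough sketch of how control priorities can follow from issue risk levels, the Python fragment below gives each control the highest risk level among the issues it addresses, and then sorts the controls by that level. The issue identifiers, control names, and the rule of using the highest addressed risk are all assumptions made for illustration.

```python
ORDER = {"Low": 0, "Moderate": 1, "High": 2}

# Hypothetical issues and their assessed risk levels.
issue_risk = {"I1": "High", "I2": "Low", "I3": "Moderate"}

# Hypothetical controls, each documented with the issue(s) it would address.
controls = {
    "control A": ["I2"],
    "control B": ["I1", "I3"],
    "control C": ["I3"],
}


def control_priority(issue_ids):
    """A control inherits the highest risk level among the issues it addresses."""
    return max((issue_risk[i] for i in issue_ids), key=ORDER.get)


# Highest-priority controls first.
prioritized = sorted(
    controls,
    key=lambda name: ORDER[control_priority(controls[name])],
    reverse=True,
)

for name in prioritized:
    print(name, control_priority(controls[name]), controls[name])
```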

If it is warranted by your security plan's overall cost or its potential organizational impact, then you should review the proposed plan with the appropriate senior management. If senior management questions your plan, then you may be asked to provide more details. You may be asked to show the details of your risk analysis, or the details of the cost-benefit analysis that led to your selection of controls.

We recommend that you use your Risk Analysis Worksheet to record and prioritize the selected controls for each threat-vulnerability pair that your assessment team generated.

6.8.1   Selecting Controls at Post-Implementation Stage (Existing Systems)

If you are performing a risk assessment for an existing system, then your assessment team's evaluation and selection of new or modified security controls should be informed by their understanding of the system's current state, as gleaned from the most recent available evaluation of the effectiveness of the system's existing security controls.

For example, the assessment team should generally not select a new or modified control that has already been tried and has proven ineffective, unless the reasons that it failed are well-understood and can be avoided the next time around.

Similarly, the assessment team should generally avoid selecting an entirely new and different type of control to address an issue if a minor tweak or extension to an existing, successful control would do the job just as well.

6.9   Documenting the Security Plan

Now that a set of prioritized control recommendations has been documented, and if necessary approved by senior management, you need to create a security plan for your system. Bear in mind that anyone who is expected to help execute your plan should generally be afforded the opportunity to participate in its development.

The form and substance of a good security plan, and how the security plan should be documented, depends on which system life cycle stage the plan applies to.

6.9.1   Initiation Stage

If your (proposed) system is still in the Initiation stage, then your security plan simply needs to be reflected in the business plan for the system. (If you don't have a written business plan for your system at this stage, then you probably won't be developing a written security plan at this stage either.)

6.9.2   Development/Procurement Stage

If your (proposed) system is in the Development/Procurement stage, then your security plan should be reflected in any written specifications for the system's development and/or procurement, including any RFP(s) associated with the system. In other words, if any of your expected security controls need to be designed into the system, or are expected to be procured as part of the system, then requirements and/or specifications for those controls should be included in the written requirements and/or specifications for the system.

6.9.3   Implementation Stage

If your system is in the Implementation stage, then your security plan should be incorporated into your overall system implementation plan. For each of your prioritized, recommended controls, your system implementation plan should identify who is responsible for implementing, testing, and verifying the control, and it should document the specific resources needed to implement, test and verify each control. The implementation plan should document the planned time frame for the implementation of each control (including testing and verification). If security controls will require on-going operation and maintenance, then those resource requirements should also be documented in the implementation plan. The implementation plan should also schedule time for a final review of all of the system's security controls by appropriate compliance official(s); the plan should also allow for the possibility of needing time and resources for remedial action prior to go-live.

During the Implementation stage, we recommend that you use the Security Plan Summary to record who is responsible for implementing, testing and verifying each security control, to document the expected time frame for completion, and to note any on-going operational and/or maintenance requirements.
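For teams that track this information electronically, one entry in such a summary might be represented along the following lines. This is a sketch only: the field names, the example people, and the dates are hypothetical and are not taken from the actual Security Plan Summary template.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PlanEntry:
    """One row of a security plan summary (field names are hypothetical)."""
    control: str
    implementer: str
    tester: str
    verifier: str
    target_date: date                # planned completion, including testing and verification
    ongoing_requirements: str = ""   # on-going operation/maintenance needs, if any


def overdue(entries, today):
    """Return the names of controls whose planned completion date has passed."""
    return [entry.control for entry in entries if entry.target_date < today]


plan = [
    PlanEntry("disk-encryption", "A. Admin", "B. Tester", "C. Reviewer", date(2006, 3, 1)),
    PlanEntry("central-logging", "A. Admin", "B. Tester", "C. Reviewer", date(2006, 6, 1),
              ongoing_requirements="weekly log review"),
]
print(overdue(plan, today=date(2006, 4, 1)))
```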

6.9.4   Post-Implementation Stage

If your system is in the Post-Implementation stage, then your security plan should consist of an implementation plan for any new or modified security controls that are being recommended now, as a result of the current risk assessment. For each new or modified control, the implementation plan should identify who is responsible for implementing, testing, and verifying the control, and it should document the specific resources needed to implement, test and verify each control. The implementation plan should document the planned time frame for the implementation of each new or modified control (including testing and verification). If the new or modified security controls will require on-going operation and maintenance, then those resource requirements should also be documented in the implementation plan.

During the Post-Implementation stage, we recommend that you use the Security Plan Summary template to document the implementation plan for your system's new and/or modified security controls.

6.10   Communicating the Security Plan

If warranted by your security plan's overall cost or its potential organizational impact, you should present your plan to the appropriate senior management, and obtain their approval.

Ultimately, your security plan must be communicated to everyone who is expected to participate in its implementation. If you have allowed for appropriate participation in the plan's development by those who will be affected by the plan (or their representatives), then there should be no surprises when you communicate your plan.

If you have used the System Identification template, the Risk Analysis Worksheet, and the Security Plan Summary to develop and document your security plan, then these documents are good tools for communicating your security plan to the people who need to approve it, review it, and/or participate in its implementation.

6.11   Risk Assessment Report

The preceding sections explained each of the steps in the risk assessment process. Your Risk Assessment Report serves to document your completed risk assessment. The content of a Risk Assessment Report depends on the system life cycle stage at which the assessment is performed.

6.11.1   Initiation Stage

When a risk assessment is performed during the system's Initiation stage, in most cases no separate Risk Assessment Report is produced. Instead, the business plan for the system should include the following information:

  • Identification and classification of the system (see System Identification template)
  • Listing of all significant information security issues (threats, vulnerabilities and compliance issues) expected to affect the system
  • Listing of all significant security controls expected to be required, given anticipated threats, vulnerabilities, and compliance requirements
  • Projected costs of implementing and maintaining security controls, including any required extensions or changes to existing enterprise security controls

6.11.2   Development/Procurement Stage

When a risk assessment is performed during the system's Development/Procurement stage, a formal Risk Assessment Report should be produced. Keep in mind that the primary goal of the risk assessment process, at this particular stage in the system life cycle, is to develop specifications for those security controls that will need to be designed into or procured with the system. The steps in the risk assessment process, documented in the previous section, should be followed with this specific goal in mind.

The Risk Assessment Report should include the following components:

Because the system is still in the planning stages, and its architectural and operational features may not yet be known in detail, it may not be possible to complete the risk assessment process at this stage in the system's life cycle with as much accuracy and detail as is possible during later stages.

Nevertheless, it is imperative to conduct the assessment, and to document the findings, with as much accuracy and detail as possible. It is often difficult, usually more expensive, and sometimes nearly impossible, to "add" major security features at the later stages in a system's life cycle. It is important to identify the system's major security requirements as early in the design, development and/or procurement process as possible.

6.11.3   Implementation Stage

When a risk assessment is performed during the system's Implementation stage, a formal Risk Assessment Report should be produced, including the following components:

In addition, the system's security plan should be incorporated into the system's overall implementation project plan.

6.11.4   Post-Implementation Stage

When a risk assessment is performed during the system's Post-Implementation stage, a formal Risk Assessment Report should be produced, including the following components:

7   Executing the Security Plan

As a system owner, your risk management responsibilities do not end after the system security plan has been approved by the appropriate level(s) of management, and the risk assessment report has been completed. You are also responsible for the next major step in the risk management cycle: ensuring that the security plan is executed.

If your system is moving from the Initiation stage to the Development/Procurement stage of its life cycle, then as the owner, you must ensure that the system's security features or requirements (that have been expressed or implied in the business plan for the system) do not fall by the wayside. For example, if the need for the system to have strong access controls has been documented in your business plan, then you need to ensure that this general requirement gets translated into specific technical and operational requirements in the system's actual specifications.

During the Development/Procurement stage, as the owner, you need to ensure that your system's security specifications are met during the development and/or procurement processes. Sign-offs on development or procurement of a system component must be contingent on its documented security requirements and specifications having been met.

As your system moves into the Implementation stage, your primary responsibility is to ensure that a security plan for your system is developed and approved in a timely manner (through the risk assessment process), and that this security plan is incorporated into the system's overall implementation plan. The overall implementation plan for your system cannot be considered complete if it does not incorporate your system's approved security plan. The implementation, testing and verification of all controls specified in your system's security plan must be completed during your system's implementation.

Whenever a new security plan is developed as part of a Post-Implementation risk assessment, you are responsible for ensuring that all new or modified security controls specified in the security plan are actually implemented according to the plan, and that all documented resource requirements for on-going operation and maintenance are met.

If changes or revisions to a security plan are made during its implementation, then you should ensure that these changes are documented.

8   Monitoring and Evaluation

9   Reference

9.1   Threats

A threat is defined as the potential for a "threat-source" to intentionally exploit or accidentally trigger a specific vulnerability. Threats and vulnerabilities combine to create potential security issues, and their associated risks.

To assess information security risks, we are concerned with threats to the availability, integrity, and confidentiality of information, and information systems.

Threats to information security can arise from any of these four general categories:

  1. Deliberate actions of people
  2. Accidental actions of people
  3. System (technology) problems
  4. Other (environmental) problems

To identify the threats to an information system, we must be able to anticipate what kinds of actions by people, either inside or outside our organization, committed either intentionally or accidentally, can have a negative effect on our system's availability, integrity, or confidentiality. We need a good understanding of the kinds of acts of malice or negligence that can threaten our system, and what kinds of people can commit those acts. We need to think not only about people outside our organization, but about insider threats as well. Some of the people who could threaten our system might include:

  • activists
  • consultants
  • contractors
  • customers
  • deranged people
  • extortionists
  • hackers
  • industrial spies
  • insiders
  • maintenance people
  • organized crime
  • private investigators
  • terrorists
  • thieves
  • vandals

We also need a good understanding of what kinds of system (technology) problems can affect our system. Some of these might include:

  • hardware failures
  • software failures
  • failures of related systems
  • malicious code

And finally, we need to understand what kinds of other (environmental) problems can affect our system, including things like:

  • power outages
  • natural disasters
  • building environment control problems
  • water damage (man-made sources)

STRIDE is a classification for threat modeling that was developed by Microsoft. It is a way of classifying threats, primarily attack threats (those that arise from the deliberate actions of people), according to the goals and purposes of the attack. The STRIDE threat categories are:

S spoofing

T tampering

R repudiation

I information disclosure

D denial-of-service

E elevation-of-privilege

Spoofing Identity

Spoofing identity threats include anything done to illegally obtain or access and use another person's authentication information, such as a user name or password. This category of threat includes man-in-the-middle attacks, and impersonation of a trusted host by an untrusted host.

Tampering with Data

Tampering with data threats involve the malicious modification of data. Examples include unauthorized changes made to persistent data (such as defacement of a Web site), information held in a database, or data as it flows between two computers on an open network. One specific threat in this category is session hijacking, where the attacker captures or takes over a network session after the regular user has been authenticated and authorized by a server.

Repudiation

Repudiation threats involve users who deny that they performed an action when other parties have no way to prove otherwise. An example of this type of threat would be a user performing a prohibited operation in a system that lacks the ability to trace the prohibited operation. Nonrepudiation refers to the ability of a system to counter repudiation threats. Signatures (either traditional or digital) are the standard tool for achieving nonrepudiation in all types of business processes.

Information Disclosure

Information disclosure threats involve the exposure of information to individuals who are not supposed to have access to it. Examples include the ability of users to read files to which they were not granted access, and the ability of an intruder to read data that is in transit between two computers. Threats in this category include unauthorized connections and network sniffing.

Denial of Service

Denial of service attacks are directed attacks against a specific host or network. One common form of attack is to send more traffic to a host or router than it can handle within a given time, which results in an inability of the network to handle the traffic and thereby disrupts the legitimate flow of traffic. Denial of service can also result from accidents, for example a user who deletes an important file by mistake, or a system administrator who makes a configuration change that causes a server to crash. Denial of service can also occur as the result of natural disasters, fires, floods, etc.

Elevation of Privilege

This type of threat allows an unprivileged user to gain privileged access that enables them to compromise or damage an entire system. An example would be a user with an account on an e-mail server, who exploits a flaw in the software on the server to gain an unauthorized, higher level of access, that allows him to read, tamper with, and even delete e-mail messages that belong to other users.
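An assessment team that records threats in a spreadsheet or script may find it convenient to tag each one with its STRIDE category. The sketch below shows one way to do that; the `Stride` enumeration and the `EXAMPLES` table are illustrative names, and the sample threats are simply the ones mentioned in the category descriptions above.

```python
from enum import Enum


class Stride(Enum):
    SPOOFING = "S"
    TAMPERING = "T"
    REPUDIATION = "R"
    INFORMATION_DISCLOSURE = "I"
    DENIAL_OF_SERVICE = "D"
    ELEVATION_OF_PRIVILEGE = "E"


# Example threats named in the descriptions above, tagged by category.
EXAMPLES = {
    "man-in-the-middle attack": Stride.SPOOFING,
    "session hijacking": Stride.TAMPERING,
    "web site defacement": Stride.TAMPERING,
    "network sniffing": Stride.INFORMATION_DISCLOSURE,
    "traffic flood against a host or router": Stride.DENIAL_OF_SERVICE,
    "exploiting a server flaw to gain higher access": Stride.ELEVATION_OF_PRIVILEGE,
}


def by_category(category):
    """Return the example threats tagged with the given category, sorted by name."""
    return sorted(name for name, cat in EXAMPLES.items() if cat is category)


# The category initials, in definition order, spell out the acronym.
acronym = "".join(category.value for category in Stride)
print(acronym, by_category(Stride.TAMPERING))
```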

9.2   Vulnerabilities

By themselves, threats don't result in security issues. In order for a potential threat to create a non-negligible security issue for a specific information system, there must be one or more vulnerabilities present, that would allow that specific threat to affect that specific system. A vulnerability is a weakness or a flaw in a system's security procedures, design, implementation, or internal controls, that could be accidentally triggered or intentionally exploited, when a threat is manifested.

Vulnerabilities can fall into any of the following categories:

Technical vulnerabilities
flaws in the design, implementation and/or configuration of software and/or hardware components
Human resource vulnerabilities
key person dependencies, gaps in awareness and training, gaps in discipline, improper termination of access
Physical and environmental vulnerabilities
insufficient physical access controls, poor siting of equipment, inadequate temperature/humidity controls, inadequately conditioned electrical power
Operational vulnerabilities
lack of change management, inadequate separation of duties, lack of control over software installation, lack of control over media handling and storage, lack of control over system communications, inadequate access control or weaknesses in access control procedures, inadequate recording and/or review of system activity records, inadequate control over encryption keys, inadequate reporting, handling and/or resolution of security incidents, inadequate monitoring and evaluation of the effectiveness of security controls
Business continuity and compliance vulnerabilities
misplaced, missing or inadequate processes for appropriate management of business risks; inadequate business continuity/contingency planning; inadequate monitoring and evaluation for compliance with governing policies and regulations

9.3   Controls

A security control, sometimes called a safeguard or a countermeasure, is a mechanism for protecting an information system against one or more specific security issues (risks). For any given issue that potentially affects one of our systems, there are usually many different control options, and often a bewildering number of them. The goal of a risk assessment team is to select, from this universe of all possible security controls, the most cost-effective set of controls that will control all anticipated risks to an acceptable degree. This sounds like a tall order... and make no mistake, it often is!

At the highest level, there are essentially three basic types of security controls: those that contribute to a prevention strategy, those that contribute to a detection strategy, and those that contribute to a recovery strategy. Very often, a control or safeguard will contribute to more than one protection strategy. For example, maintaining records of system activity (system event logs) is generally considered to be a preventive control, but without good logs, detection of intrusions and other types of security incidents may not even be possible. Having good logs can contribute to efficient recovery as well.

We might also consider the timeframe in which a given control is active: before, during, or after a security incident or issue. By their very nature, all preventive controls act before an incident. Detective and recovery controls can act either during or after an incident.

In one of the standard taxonomies of security controls (GASSP), the term "real-time" is used to describe a detective or recovery control that operates during an incident, and "non-real-time" to describe one that operates after the incident has already occurred. Here is the complete GASSP control taxonomy:

  • Prevention
    • Avoidance
    • Transfer
    • Reduction of Threat
    • Reduction of Vulnerability
    • Event Recording
  • Detection
    • Real-time Detection
    • Non-real-time Detection
  • Recovery
    • Reduction of Impact
    • Real-time Recovery
    • Non-real-time Recovery

Avoidance (A)

Avoiding or completely bypassing a risk. For example, deciding to stop processing certain data, or not to begin processing it; i.e., removing it from the system under review and not processing it on any of our other IT systems.

Transfer (T)

Transferring an asset or a risk outside the boundary of our system. For example, a highly sensitive file could be removed and run on an entirely separate IT system, thus transferring the risk to the other system. Alternatively, the risk could be transferred outside the boundary of MUSC itself, say through insurance.

Reduction of Threat (RT)

Building or positioning a 'barrier', between a threat and a vulnerable asset, or between a threat and the entry point into our system that it targets. For example, an RT safeguard could counter:

  1. accidental threat to a physical asset, such as prohibiting the storage of flammable material near IT equipment.
  2. accidental threat to a data asset, such as better training of staff to reduce the likelihood of human error.
  3. deliberate internal threat, such as paying staff more money to reduce the motivation for willful damage caused by disgruntlement, or strictly enforced personnel disciplinary procedures, or ensuring that non-security functions to be performed in the security administration role are limited to those essential to performing that role effectively.
  4. deliberate external threat, such as removing existing dial-up lines or declining to install new ones, or using only private circuits.

Reduction of Vulnerability (RV)

Making it more difficult for a specific vulnerability to be exploited by a threat. Examples of RV safeguards could include not installing all assets in the same physical location (thus reducing the vulnerability to the loss of all the system assets, say in a major fire), fixing exploitable bugs in software, and hardening systems to resist network attacks.

Reduction of Impact (RI)

Lessening the overall effect of an impact on an asset, especially the damage to the asset. Examples of RI safeguards are daily backups of files for later use in recovery, and the use of fire suppression systems.

Event Recording (ER)

Recording system and application events in sufficient detail to enable detection and recovery from possible security violations or incidents. Event recording (logging) is an important enabling mechanism for detection and recovery. The choices of what to record, how, and when, will often dictate the degree of detection and recovery that is possible during or after a security event. Examples of ER controls include operating system logs, application logs, database logs, handwritten operator logs, and video surveillance.

Real-time Detection (RD)

Detecting the occurrence of an event as it happens, or evidence that an event has just occurred. Examples of RD safeguards include smoke detectors to detect that a fire may be present, and network intrusion detection system (IDS) sensors to detect the unauthorized use of network services. Both real-time and non-real-time detection strategies are usually coupled with one or more prevention or recovery strategies. In the case of the smoke detector, the smoke detector may set off halon to try to prevent the loss (RV), set off sprinklers to prevent catastrophic loss (RI), prevent the loss of life by sounding audible alarms (RT), automatically contact the fire department (RI), or any combination of these. In the IDS example, the network sensor may send a message to a network administrator's beeper (RT), it may trigger the creation of a special firewall rule to block the anomalous traffic (RV), or it may send a message to a special hardened log server to ensure an unmodifiable audit record of the suspected penetration attempt (NRR).

Non-Real-time Detection (NRD)

Discovering the evidence of an event after the fact. Examples of NRD controls include the analysis of audit trails, error logs, and database journals. Security audits of a system and its operating procedures are a type of NRD control.

Real-time Recovery (RR)

Lessening the overall impact of an event, especially the loss of service caused by the event. Examples include: keeping on-site replacement parts, checkpointing to facilitate a quick return to service following a system failure or system corruption, and using error correction codes to correct and validate corrupted or erroneous data in real-time.

Non-Real-time Recovery (NRR)

Returning a system or data to a desired state after a failure. Examples include recovering a user's data from backup following corruption or accidental deletion, rebuilding a system after a compromise, and moving to a disaster recovery site and resuming operations following a major disaster.
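The taxonomy above lends itself to a simple lookup table. The following sketch, using the abbreviations given in the definitions above, shows how a team might check whether a proposed set of controls leaves any of the three protection strategies uncovered. The helper names `strategy_of` and `uncovered_strategies` are hypothetical, introduced only for this illustration.

```python
# GASSP control types, grouped by protection strategy as listed above.
GASSP = {
    "Prevention": {
        "A": "Avoidance",
        "T": "Transfer",
        "RT": "Reduction of Threat",
        "RV": "Reduction of Vulnerability",
        "ER": "Event Recording",
    },
    "Detection": {
        "RD": "Real-time Detection",
        "NRD": "Non-real-time Detection",
    },
    "Recovery": {
        "RI": "Reduction of Impact",
        "RR": "Real-time Recovery",
        "NRR": "Non-real-time Recovery",
    },
}


def strategy_of(code):
    """Return the protection strategy that a control-type abbreviation belongs to."""
    for strategy, types in GASSP.items():
        if code in types:
            return strategy
    raise KeyError(code)


def uncovered_strategies(control_codes):
    """Return the strategies (in taxonomy order) with no selected control type."""
    covered = {strategy_of(code) for code in control_codes}
    return [strategy for strategy in GASSP if strategy not in covered]


# A plan with only hardening (RV) and backups (RI) has no detection control.
print(uncovered_strategies(["RV", "RI"]))
```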

While it is instructive to think about security controls in terms of the GASSP protection strategies, there are other useful ways to categorize security controls. For example, the HIPAA Security Rule organizes controls into three broad categories or types: administrative, physical, and technical.

Another practical way to classify security controls uses the following three broad categories:

Administrative and Management controls - are those security measures that focus on the management of the system and the management of risk. These measures include:

  • Security reviews and assessments
  • Risk assessments
  • Rules of behavior

Operational controls - are those operational procedures, personnel and physical security measures established to provide an acceptable level of protection for computing resources and include:

  • Security awareness and training
  • Disaster recovery, contingency, and emergency plans
  • Background investigations
  • Security reviews and audits
  • Separation of duties
  • Physical security controls

Technical controls - are those safeguards incorporated into computer hardware, software or firmware and include:

  • Access control mechanisms
  • Antivirus software
  • Identification & Authentication mechanisms
  • Firewalls
  • Encryption
  • Audit trails
  • Backups
  • Intrusion detection systems

Regardless of what approach our assessment teams use for identifying candidate security controls, there are some general principles that teams should keep in mind when selecting controls for further analysis:

Think prevention first

An ounce of prevention and all that. But beware that prevention of some issues may be too difficult or too expensive, and so the detection and recovery aspects of system protection still have to be addressed. Even when preventive controls are practical and affordable, we still have to recognize that they won't always work. Think prevention first, but don't stop there.

Detection is required for recovery

This should be self-explanatory. A compromise that goes undetected can turn a (relatively) minor incident into a major disaster. Damage control and recovery can't start until after a problem has been detected.

Timeliness matters

The requirements for timeliness in a recovery situation will usually be fairly obvious. Less obvious, but no less important, is the need for timeliness when it comes to prevention and detection. For example, access to timely information on the latest vulnerabilities and on newly emerging threats is essential to one's ability to keep prevention and detection controls up to date and effective.

Integration of controls is essential

Information security is most efficient when controls are implemented in a coordinated manner across an organization. Controls are also more effective when coordinated with each other; some controls naturally complement each other, while other combinations do nothing for each other, or worse, may even work against each other.

Defense in depth is highly desirable

Controls should be commensurate with the level of risk. High risk issues should generally warrant protection by multiple layers of defense, so that a failure of one layer does not result in the loss of all protection.
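A simple check along these lines can flag high-risk issues that rely on a single layer of defense. This is a sketch under stated assumptions: the threshold of two controls, the "high"/"medium"/"low" labels, and the issue and control names are all hypothetical, and counting controls is only a crude stand-in for judging whether the layers are truly independent.

```python
MIN_LAYERS_FOR_HIGH_RISK = 2  # assumption: "multiple layers" means at least two


def lacking_depth(issue_risk, controls_by_issue):
    """Return high-risk issues protected by fewer than the minimum number of controls."""
    return sorted(
        issue
        for issue, risk in issue_risk.items()
        if risk == "high"
        and len(controls_by_issue.get(issue, [])) < MIN_LAYERS_FOR_HIGH_RISK
    )


# Hypothetical risk levels and control assignments.
issue_risk = {"stolen-laptop": "high", "weak-passwords": "high", "defaced-page": "low"}
controls_by_issue = {
    "stolen-laptop": ["disk-encryption", "remote-wipe"],
    "weak-passwords": ["password-policy"],
    "defaced-page": [],
}
print(lacking_depth(issue_risk, controls_by_issue))
```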

More often than not, the security controls that a risk assessment team considers will be ones that fit into a well-established pattern of known good practice. Chances are, if your assessment team needs a defense against a particular type of threat, there is likely to be a "standard" set of defenses to choose from.

Information security professionals are familiar with the patterns of good practices. If your assessment team needs help selecting controls, you can get help from the professional(s) in MUSC's Information Security Office.

9.4   Putting It All Together: Examples

Threat-Vulnerability Pairs (Potential Security Breaches or Issues)

Security Issue: An authorized employee uses the system in an unauthorized manner.

  • Threat: Deliberate misuse of system by insider.
  • Vulnerability: Inadequate training (the employee doesn't know better), or inadequate audit controls (the employee believes his misuse won't be detected), or lack of effective disciplinary process (employee believes there won't be any sanctions even if his misuse is detected).

Security Issue: An intruder gains control of the system by exploiting a software vulnerability.

  • Threat: Deliberate intrusion into system (could be either an insider or an outside attacker).
  • Vulnerability: The software environment used to support the system, or the application software itself, has a flaw in its design, implementation or configuration that can be exploited to gain control of the system and/or application.

Security Issue: Database corruption occurs as a result of operator error.

  • Threat: Accident committed by insider.
  • Vulnerability: Insufficient training, or a possible flaw in the system's design that makes it too easy for operator errors to affect the system's integrity.

Security Issue: Users frequently make errors as a result of a poorly-designed data input screen.

  • Threat: Accident committed by insider.
  • Vulnerability: Flaw in software design or implementation.

Security Issue: The only IT support person who really understands how the system is configured suddenly becomes unavailable.

  • Threat: Unavailability of critical resource.
  • Vulnerability: Over-dependence on a key person and inadequate contingency planning.

Security Issue: A laptop or PDA or thumb drive containing sensitive system information is stolen from a faculty member's car, and the data was not encrypted.

  • Threat: Theft of a hardware device and the information stored on it.
  • Vulnerability: Storage of sensitive information on a mobile device, which is exposed to a high risk of loss or theft, without encryption.

Security Issue: A disgruntled employee who believes he was wrongly terminated succeeds in sabotaging the system because his account was not promptly disabled.

  • Threat: Deliberate sabotage by insider.
  • Vulnerability: Improper termination of access.

Security Issue: A former system administrator accesses the system improperly through a hidden back-door he created.

  • Threat: Deliberate unauthorized access by (former) insider.
  • Vulnerability: Flaws in system administration procedures allowing employees to create back-doors without being detected.

Security Issue: The system is down for an extended period due to equipment damage caused by a natural disaster such as an earthquake or severe hurricane.

  • Threat: Natural disaster.
  • Vulnerability: Inadequate business continuity/contingency planning.

Security Issue: Sensitive information from the system is left on a surplus disk drive that is later purchased at a state auction of surplus computers, and the purchaser seeks notoriety by reporting it to the media.

  • Threat: Deliberate unauthorized access by outsider.
  • Vulnerability: Lack of control over the handling and disposal of storage media that has contained sensitive information.

Security Issue: A keystroke logger is installed by malicious software that is introduced onto a workstation that is used to access the system; user passwords are collected by criminals, who use them to login and extract sensitive information from the system.

  • Threat: Deliberate unauthorized access.
  • Vulnerability: Access to system from a compromised workstation.

Security Issue: A user of the system sees evidence that someone else has used her account, but fails to report it to anyone.

  • Threat: Insider (end-user) failing to report a suspected intrusion.
  • Vulnerability: Gap in end-user's awareness of her duty to recognize and report security incidents.

Security Issue: A serious intrusion into the system occurs, but investigators are unable to pinpoint the vulnerabilities that allowed it to occur, because insufficient records of system activity were being kept.

  • Threat: Deliberate unauthorized access.
  • Vulnerability: Whatever vulnerability(ies) contributed to the original intrusion, compounded by inadequate recording of system activity.

Security Issue: A serious, on-going system compromise is not discovered until too late because nobody was checking up on the person who was assigned to review the system activity records that would have revealed the compromise.

  • Threat: Deliberate unauthorized access.
  • Vulnerability: Whatever vulnerability(ies) contributed to the original intrusion, compounded by inadequate monitoring and evaluation of the effectiveness of the system's audit controls.

Security Issue: System backup tapes containing sensitive information disappear en route to an off-site storage facility and their disappearance is not discovered until months later due to inadequate procedural controls.

  • Threat: Accidental loss (or was it theft?) of storage media.
  • Vulnerability: Lack of control over media handling and/or storage.