Books and Handbooks


Handbooks

Handbooks are made freely available to the community as the definitive guides to understanding and implementing STPA and CAST.

STPA handbook

The STPA Handbook explains in detail each step of the STPA process for proactively analyzing potential hazards and future loss scenarios. Examples are provided from a number of industries.

Outline of the STPA Handbook

  1. Introduction:
    • Overview of STPA and its theoretical foundations.
    • Advantages over traditional risk analysis methods, including the ability to handle complex systems and integrate software and human elements.
  2. How to do Basic STPA:
    • Step-by-step guide on performing STPA:
      • Defining the purpose of analysis
      • Creating control structures
      • Identifying Unsafe Control Actions
      • Building scenarios
      • Developing requirements, constraints, and other solutions throughout each of these steps
  3. Integration into System Engineering:
    • Approaches to incorporate STPA into the broader system engineering lifecycle, enhancing both safety and operational efficiency.
  4. Workplace Safety Applications:
    • Application of STPA in improving workplace safety through systemic hazard analysis and control.
  5. Organizational and Social Analysis:
    • Using STPA for analyzing organizational structures and social factors that may impact system safety.
  6. Identifying Leading Indicators of Risk:
    • Techniques for identifying and managing leading indicators to preemptively address potential safety issues.
  7. Designing Safety Management Systems:
    • Guidance on developing effective safety management systems that are robust and adaptive to changes in operational environments.
  8. Implementation in Large Organizations:
    • Challenges and strategies for deploying STPA within large organizational structures without disrupting existing processes.
  9. Appendices
    • Practical examples, additional guidelines for safety management system design, and basic engineering concepts for non-engineers.
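
The "Identifying Unsafe Control Actions" step in the outline above can be sketched in code. The Python below is a minimal illustration and is not taken from the handbook: the class names, the example brake controller, and the prompt wording are all hypothetical. It assumes the four categories of unsafe control action commonly used in STPA, and pairs each category with every control action to produce a worklist of questions for analysts to review.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class UCAType(Enum):
    """Four ways a control action can be unsafe (assumed categories)."""
    NOT_PROVIDED = "is not provided"
    PROVIDED = "is provided"
    WRONG_TIMING = "is provided too early, too late, or out of order"
    WRONG_DURATION = "is stopped too soon or applied too long"


@dataclass(frozen=True)
class ControlAction:
    """One control action taken from a control structure (hypothetical model)."""
    controller: str
    action: str


def uca_prompts(actions):
    """Pair every control action with every UCA category for analyst review."""
    return [
        f"Is it hazardous if '{ca.action}' from {ca.controller} {kind.value}?"
        for ca, kind in product(actions, UCAType)
    ]


# Hypothetical control actions, for illustration only.
actions = [
    ControlAction("Brake Controller", "apply brakes"),
    ControlAction("Brake Controller", "release brakes"),
]

for prompt in uca_prompts(actions):
    print(prompt)
```

The script only generates questions. Deciding whether each combination is actually hazardous, and in what context, remains an analyst's judgment and is the substance of the step.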

Download STPA Handbook

CAST handbook

The CAST Handbook explains in detail how to apply CAST after an incident or accident to maximize learning.

Outline of the CAST Handbook

  1. Introduction:
    • Need for a new accident analysis tool
    • Goals and structure of the handbook
    • Introduction to CAST and its relationship to STPA
  2. Basic terminology
  3. Why do current accident analysis methods fall short?
    • Root Cause Seduction and Oversimplification of Causality
    • Hindsight Bias
    • Unrealistic Views of Human Error
    • Blame is the Enemy of Safety
    • Use of Inappropriate Accident Causality Models
    • Goals for an Improved Accident Analysis Approach
  4. Performing a CAST Analysis:
    • Steps for conducting a thorough CAST analysis
    • Modeling the safety control structure (aka the Safety Management System)
    • Analyzing the Control Structure
    • Reporting conclusions and generating actionable recommendations
  5. Workplace Safety
  6. Using CAST for Analyzing Social Losses
  7. Appendices
    • Links to Published CAST Examples for Real Accidents
    • Background Information and Summary CAST Analysis of the Shell Moerdijk Loss
    • The “Bad Apple” Theory of Accident Causation
    • Factors to Consider when Evaluating the Role of the Safety Control Structure in the Loss
    • Basic Engineering and Control Concepts for Non-Engineers

Download CAST Handbook

Healthcare

This handbook is written for those in the healthcare community looking to adopt the CAST method. The method is the same; only the way it is explained has been adapted for a healthcare audience.

Books

These books provide detailed information about the underlying concepts, theoretical underpinnings, examples, and advantages/disadvantages of different approaches to safety.

Introduction to System Safety Engineering

This textbook by Nancy Leveson explains the modern principles and practices of system safety engineering. The book addresses modern challenges such as the growing complexity of automation and the increasing use of safety-critical software. Key topics covered include risk management, accident causation models, the role of software in safety, and human factors in system design. It reviews the most popular techniques for hazard analysis, discussing their main purposes, strengths, and limitations. Additional topics include managing safety in operations and strategies for integrating safety into system engineering processes.

Unlike other materials on this page, this book is not solely focused on STAMP, STPA, and CAST. It is aimed at students, professionals, and academics who need a comprehensive resource covering the general field of system safety. The book is particularly useful for engineers and safety professionals who may have entered the field without formal academic preparation in safety engineering.

Outline

  • Preface
  • 1. Historical and Industrial Perspectives on Safety Engineering
    • Differences between Workplace Safety and Product/System Safety
    • A Brief Legal View of the History of Safety
    • A Technical View of the History of Safety
    • Workplace Safety Today: An Engineer's View
    • Product/System Safety Today
      • Commercial Aviation
      • Nuclear Power
      • The Chemical Industry
      • Defense and "System Safety"
      • SUBSAFE: The US Nuclear Submarine Safety Program
      • Astronautics and Space
      • Healthcare/Hospital Safety
    • Summary
    • Exercises
  • 2. Risk in Modern Society
    • Changing Attitudes toward Risk
    • Changing Risk Factors
      • The Appearance of New Hazards
      • Increasing Complexity
      • Increasing Exposure
      • Increasing Amounts of Energy
      • Increasing Automation of Manual Operations
      • Increasing Centralization and Scale
      • Increasing Pace of Technological Change
    • How Safe Is Safe Enough?
    • Exercises
  • 3. Fundamental Concepts and Definitions
    • Definitions of Safety and Risk
    • Hazards and Hazard Analysis
    • Defining Safety Requirements and Constraints
    • Safety versus Reliability
    • What Is a System?
    • Defining Complexity
    • Approaches to Dealing with Complexity
    • Summary
    • Exercises
  • 4. Why Accidents Occur
    • The Traditional Conception of Causality
    • Subjectivity in Ascribing Causality
    • Oversimplification in Determining Causality
    • Multifactorial Explanations of Accidents
    • Systemic Causes of Accidents
    • Summary
    • Exercises
  • 5. The Role of Software in Safety
    • The Use of Software in Systems Today
    • Understanding the Problem
    • Why Does Software Present Unique Difficulties?
    • Summary
    • Exercises
  • 6. The Role of Humans in Safety
    • Why Replace Humans with Machines?
    • Do Human Operators Cause Most Accidents?
    • The Need for Humans in Automated Systems
    • Human Error as Human-Task Mismatch
    • The Role of Mental Models in Safety
    • What Is the Appropriate Role for Humans in Complex Systems?
    • Conclusions
    • Exercises
  • 7. Accident Causality Models
    • Energy Models
    • Linear Chain-of-Failure Events Models
    • Epidemiological Models
    • More Sophisticated Models of Causality
    • The STAMP Model of Causality
    • Looking Ahead
    • Exercises
  • 8. Accident Analysis and Learning from Events
    • Why Are We Not Learning Enough from Accidents?
    • Goals for Improved Accident Analysis
    • Example: The Zeebrugge Ferry Accident
    • Generating Recommendations
    • Implementing Long-Term Learning
    • The Cost of Thorough Accident Investigation
    • Summary
    • Exercises
  • 9. Hazard Analysis: Basic Concepts
    • What Is Hazard Analysis?
    • The Hazard Analysis Process
    • Types of System Models
    • General Types of Analysis
    • Who Should Do Hazard Analysis?
    • Limitations and Criticisms of Hazard Analysis
    • Analysis versus Assessment
    • Exercises
  • 10. Hazard Analysis Techniques
    • Energy Model Techniques: Hazard Indices
    • Techniques Based on the Chain-of-Failure-Events Causality Model
    • STPA: A Technique Based on STAMP
    • Task and Human Error Analysis Techniques
    • Conclusions
    • Exercises
  • 11. Design for Safety
    • The Design Process
    • Types of Design Techniques and Precedence
    • Hazard Elimination
    • Hazard Occurrence Reduction
    • Hazard Control
    • Damage Reduction
    • Design Modification and Maintenance
    • Exercises
  • 12. Human Factors in System Design
    • Determining What Should Be Automated
    • The Need for Wide Participation in Design Activities
    • Safety vs. Usability and Other Common Goals
    • Reducing Safety-Critical Human Errors through System Design
    • Training and Maintaining Skills
    • Exercises
  • 13. Assurance, Assessment, and Certification
    • Assurance of Safety
    • Hazard and Risk Assessment
    • Certification
    • Some General Conclusions
    • Exercises
  • 14. Designing a Safety Management System
    • Social Dynamics and Organizational Culture
    • Organizational Structure
    • Management of Safety-Critical System Development
    • Management of Operational Processes and Practices
    • Creating an Effective Safety Information System
    • Summary
    • Exercises
  • Epilogue: Looking Forward
  • Appendices
    • Appendix A. Medical Devices: The Therac-25
    • Appendix B. Space: The Challenger and Columbia Space Shuttle Losses
    • Appendix C. Petrochemicals: Seveso, Flixborough, Bhopal, Texas City, and Deepwater Horizon
    • Appendix D. Nuclear Power: Three Mile Island, Chernobyl, and Fukushima Daiichi
  • References
  • Index

How to Obtain

The book is published by MIT Press.

Engineering a Safer World

Engineering a Safer World by Nancy Leveson introduces STAMP, STPA, and CAST for those who may be used to other approaches to safety engineering. The book outlines the most significant challenges and gaps that exist in safety engineering and how concepts from Systems Theory can be used to address them. The STAMP accident causation model, based on Systems Theory, is introduced along with examples of the additional complex causal factors that must be considered by modern system safety efforts. STPA and CAST are introduced along with other topics such as safety-guided design, integrating safety in system engineering, controlling safety during operations, and managing a safety culture.

While the handbooks provide the most detailed and up-to-date guidance on how to apply STPA and CAST, Engineering a Safer World provides additional background information about Systems Theory and STAMP, the underlying foundations of STPA and CAST, and adjacent topics like managing a safety culture.

Outline

  • I FOUNDATIONS
    • Why Do We Need Something Different?
    • Questioning the Foundations of Traditional Safety Engineering
      • Confusing Safety with Reliability
      • Modeling Accident Causation as Event Chains
        • Direct Causality
        • Subjectivity in Selecting Events
        • Subjectivity in Selecting the Chaining Conditions
        • Discounting Systemic Factors
        • Including Systems Factors in Accident Models
      • Limitations of Probabilistic Risk Assessment
      • The Role of Operators in Accidents
        • Do Operators Cause Most Accidents?
        • Hindsight Bias
        • The Impact of System Design on Human Error
        • The Role of Mental Models
        • An Alternative View of Human Error
      • The Role of Software in Accidents
      • Static versus Dynamic Views of Systems
      • The Focus on Determining Blame
      • Goals for a New Accident Model
    • Systems Theory and Its Relationship to Safety
      • An Introduction to Systems Theory
      • Emergence and Hierarchy
      • Communication and Control
      • Using Systems Theory to Understand Accidents
      • Systems Engineering and Safety
      • Building Safety into the System Design
  • II STAMP: AN ACCIDENT MODEL BASED ON SYSTEMS THEORY
    • A Systems-Theoretic View of Causality
      • Safety Constraints
      • The Hierarchical Safety Control Structure
      • Process Models
      • STAMP
      • A General Classification of Accident Causes
        • Controller Operation
        • Actuators and Controlled Processes
        • Coordination and Communication among Controllers and Decision Makers
        • Context and Environment
      • Applying the New Model
    • A Friendly Fire Accident
      • Background
      • The Hierarchical Safety Control Structure to Prevent Friendly Fire Accidents
      • The Accident Analysis Using STAMP
        • Proximate Events
        • Physical Process Failures and Dysfunctional Interactions
        • The Controllers of the Aircraft and Weapons
        • The ACE and Mission Director
        • The AWACS Operators
        • The Higher Levels of Control
      • Conclusions from the Friendly Fire Example
  • III USING STAMP
    • Engineering and Operating Safer Systems Using STAMP
      • Why Are Safety Efforts Sometimes Not Cost-Effective?
      • The Role of System Engineering in Safety
      • A System Safety Engineering Process
        • Management
        • Engineering Development
        • Operations
    • Fundamentals
      • Defining Accidents and Unacceptable Losses
      • System Hazards
        • Drawing the System Boundaries
        • Identifying the High-Level System Hazards
      • System Safety Requirements and Constraints
      • The Safety Control Structure
        • The Safety Control Structure for a Technical System
        • Safety Control Structures in Social Systems
    • STPA: A New Hazard Analysis Technique
      • Goals for a New Hazard Analysis Technique
      • The STPA Process
      • Identifying Potentially Hazardous Control Actions (Step 1)
      • Determining How Unsafe Control Actions Could Occur (Step 2)
        • Identifying Causal Scenarios
        • Considering the Degradation of Controls over Time
      • Human Controllers
      • Using STPA on Organizational Components of the Safety Control Structure
        • Programmatic and Organizational Risk Analysis
        • Gap Analysis
        • Hazard Analysis to Identify Organizational and Programmatic Risks
        • Use of the Analysis and Potential Extensions
        • Comparisons with Traditional Programmatic Risk Analysis Techniques
      • Reengineering a Sociotechnical System: Pharmaceutical Safety and the Vioxx Tragedy
        • The Events Surrounding the Approval and Withdrawal of Vioxx
        • Analysis of the Vioxx Case
      • Comparison of STPA with Traditional Hazard Analysis Techniques
      • Summary
    • Safety-Guided Design
      • The Safety-Guided Design Process
      • An Example of Safety-Guided Design for an Industrial Robot
        • Controlled Process and Physical Component Design
        • Functional Design of the Control Algorithm
      • Designing for Safety
        • Special Considerations in Designing for Human Controllers
          • Easy but Ineffective Approaches
          • The Role of Humans in Control Systems
          • Human Error Fundamentals
          • Providing Control Options
          • Matching Tasks to Human Characteristics
          • Designing to Reduce Common Human Errors
          • Support in Creating and Maintaining Accurate Process Models
          • Providing Information and Feedback
      • Summary
    • Integrating Safety into System Engineering
      • The Role of Specifications and the Safety Information System
      • Intent Specifications
      • An Integrated System and Safety Engineering Process
        • Establishing the Goals for the System
        • Defining Accidents
        • Identifying the System Hazards
        • Integrating Safety into Architecture Selection and System Trade Studies
        • Documenting Environmental Assumptions
        • System-Level Requirements Generation
        • Identifying High-Level Design and Safety Constraints
        • System Design and Analysis
        • Documenting System Limitations
        • System Certification, Maintenance, and Evolution
    • Analyzing Accidents and Incidents (CAST)
      • The General Process of Applying STAMP to Accident Analysis
      • Creating the Proximal Event Chain
      • Defining the System(s) and Hazards Involved in the Loss
      • Documenting the Safety Control Structure
      • Analyzing the Physical Process
      • Analyzing the Higher Levels of the Safety Control Structure
      • A Few Words about Hindsight Bias and Examples
      • Coordination and Communication
      • Dynamics and Migration to a High-Risk State
      • Generating Recommendations from the CAST Analysis
      • Experimental Comparisons of CAST with Traditional Accident Analysis
      • Summary
    • Controlling Safety during Operations
      • Operations Based on STAMP
      • Detecting Development Process Flaws during Operations
      • Managing or Controlling Change
        • Planned Changes
        • Unplanned Changes
      • Feedback Channels
        • Audits and Performance Assessments
        • Anomaly, Incident, and Accident Investigation
        • Reporting Systems
      • Using the Feedback
      • Education and Training
      • Creating an Operations Safety Management Plan
      • Applying STAMP to Occupational Safety
    • Managing Safety and the Safety Culture
      • Why Should Managers Care about and Invest in Safety?
      • General Requirements for Achieving Safety Goals
        • Management Commitment and Leadership
        • Corporate Safety Policy
        • Communication and Risk Awareness
        • Controls on System Migration toward Higher Risk
        • Safety, Culture, and Blame
        • Creating an Effective Safety Control Structure
        • The Safety Information System
        • Continual Improvement and Learning
        • Education, Training, and Capability Development
      • Final Thoughts
    • SUBSAFE: An Example of a Successful Safety Program
      • History
      • SUBSAFE Goals and Requirements
      • SUBSAFE Risk Management Fundamentals
      • Separation of Powers
      • Certification
        • Initial Certification
        • Maintaining Certification
      • Audit Procedures and Approach
      • Problem Reporting and Critiques
      • Challenges
      • Continual Training and Education
      • Execution and Compliance over the Life of a Submarine
      • Lessons to Be Learned from SUBSAFE
  • APPENDIXES
    • Definitions
    • The Loss of a Satellite
    • A Bacterial Contamination of a Public Water Supply
    • A Brief Introduction to System Dynamics Modeling

How to Obtain

The book is published by MIT Press.

Safeware

Safeware by Nancy Leveson is a comprehensive textbook on system safety engineering circa 1995. Although it is outdated, it is remarkable how many of the traditional methods it describes are still attempted today. Meanwhile, the software and other technologies being introduced into safety-critical systems have changed, and their complexity has increased dramatically. The rate of technological innovation has far outpaced the rate of innovation in safety engineering. This book is a good resource for understanding the historical context of system safety engineering and the limitations of traditional methods. Those limitations apply just as much when the same methods are used today as they did in 1995.

Outline

  • PART 1: The Nature of Risk
    • Is there a problem?
    • How safe is safe enough?
    • The role of computers in accidents
    • Software myths
    • Why software engineering is hard
    • Problems in ascribing causality
    • A hierarchical model of causality
    • Root causes of accidents
    • Do humans cause most accidents?
    • The need for and role of humans in automated systems
  • PART 2: Introduction to System Safety
    • Foundations of system safety (systems theory and systems engineering)
    • Historical development
    • Basic concepts (hazard analysis, design for safety, management)
    • Software system safety
    • Cost and effectiveness of system safety
    • Other approaches to safety (industrial engineering, reliability engineering)
  • PART 3: Definitions and Models
    • Terminology
    • Accident models
    • Human task and error models
  • PART 4: Elements of a Safeware Program
    • Managing safety (the role of management, setting policy, communication channels, setting up a system safety organization, place in the organizational structure, documentation)
    • The system and software safety process (general tasks, real examples)
    • Hazard analysis (what it is, how to do it, types of models, types of analysis, current models and techniques, limitations, evaluations)
    • Software hazard analysis and requirements analysis
    • Designing for safety
    • Design of the human-machine interface
    • Verification of safety (testing, software fault tree analysis)
  • APPENDICES
    • Medical Devices: The Therac-25 story
    • Aerospace: The civil aviation approach to safety, Apollo 13, DC-10, and Challenger
    • The Chemical Industry: The chemical process industry approach to safety, Seveso, Flixborough, and Bhopal
    • Nuclear Power: How a nuclear power plant works, The nuclear power approach to safety, Windscale, Three Mile Island, and Chernobyl
  • References

How to Obtain




