8. Preparing for Business Continuity

Adding Redundancy

  • Examples of redundancy
    • Disk Redundancy with RAID
    • Server Redundancy with Failover Clusters
    • Power Redundancy with UPS/Generator
    • Site Redundancy with hot/cold/warm sites
  • Identify Single Point of Failure
    • If it breaks, will something take its place, or does everything go down?
    • Disk – Will a system crash without this disk? Will the data be lost forever?
    • Server – Will the service stop if this goes down? What else relies on that service?
    • Power – If there’s a power outage, what takes its place?
  • Disk Redundancy – see RAID above
  • Server Redundancy
    • 99.999% availability = five nines.
    • Less than six minutes of downtime a year (0.001% of 525,600 minutes is about 5.26 minutes)
    • Expensive, but could be justified depending on costs of downtime.
    • Failover Clusters
      • Two or more servers as nodes in a cluster
      • At least one inactive node
      • If active node fails, inactive node takes over.
    • Load Balancers for High Availability
      • Distributes traffic and data loads across system devices
      • Allows for scalability
      • Load balancing also detects failed devices
    • Power Redundancy
      • UPS – Provides short-term power until any one of three things happens
        • Systems have had enough time to shut down cleanly
        • Generators have had enough time to power up and stabilize
        • Commercial power returns
      • Generators
        • Expensive to run, but cheaper than failure
        • Should be able to run for a long time
      • Protecting Data with Backups
        • Ensure that when data is lost or corrupted, it can be retrieved
        • Redundancy does not remove the need for backups
        • Tapes
          • Full Backup
          • Differential Backup – backs up all changes since the last full backup
          • Incremental Backup – backs up all changes since the last full or incremental backup
            • Each incremental tape needs to be kept because the backups are not cumulative
          • Which scheme you go with depends on the backup window you have during the week and how much data loss and restore time you can accept
          • Restorations take longer with incremental backups, because the full backup and every incremental since it must be restored in order
        • It’s important to test backups by actually restoring from them, because it sucks to find out they fail when you need them
        • Protecting Backups
          • Label media clearly, and use physical security in storage to protect backups from theft
          • Protect them well if they’re being moved from one location to another.
          • Destroy backups when they’re no longer needed.
        • Backup Policies and Plans
          • Identifies Data to Back Up
          • Requires Off-site Backups
          • Requires Labeling Media
          • Mandates Testing of Backups
          • Identifies Retention Requirements
            • Note related laws on how long data needs to be stored
            • Also note that anything you retain can be requested if you end up in court, so decide how much you want available
          • Designates Frequency of Backups
          • Protects Backups
          • Identifies Acceptable Disposal Methods
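The difference between differential and incremental tapes shows up most clearly at restore time. Here is a minimal sketch of which tapes a restore needs, assuming a differential holds every change since the last full backup and an incremental holds only the changes since the previous backup. The Backup class, the restore_set function, and the day numbering are all made up for illustration, not taken from any backup product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    day: int   # day the backup was taken (hypothetical numbering, 0 = day of the full)
    kind: str  # "full", "differential", or "incremental"


def restore_set(backups, target_day):
    """Return the backups to restore, in order, to recover data as of target_day."""
    usable = sorted((b for b in backups if b.day <= target_day), key=lambda b: b.day)

    # Every restore starts from the most recent full backup.
    full_positions = [i for i, b in enumerate(usable) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup on or before the target day")
    chain = [usable[full_positions[-1]]]
    later = usable[full_positions[-1] + 1:]

    # A differential holds every change since the full backup, so only the
    # newest one is needed and anything taken before it can be skipped.
    diff_positions = [i for i, b in enumerate(later) if b.kind == "differential"]
    if diff_positions:
        chain.append(later[diff_positions[-1]])
        later = later[diff_positions[-1] + 1:]

    # Incrementals are not cumulative, so every remaining one is required.
    chain.extend(b for b in later if b.kind == "incremental")
    return chain


full = Backup(0, "full")
incremental_week = [full] + [Backup(d, "incremental") for d in range(1, 5)]
differential_week = [full] + [Backup(d, "differential") for d in range(1, 5)]

print(len(restore_set(incremental_week, 4)))   # the full plus all four incrementals
print(len(restore_set(differential_week, 4)))  # the full plus the latest differential
```

With the incremental schedule, losing any one tape in the chain breaks the restore, which is why every incremental has to be kept. The differential schedule trades larger nightly backups for a two-tape restore.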

Comparing Business Continuity Elements

  • Disasters can come from
    • Fires
    • Attacks
    • Power Outages
    • Data loss
    • Hardware/software Failure
    • Natural disasters
  • Business Continuity Planning (BCP) follows these steps
    • Complete a Business Impact Analysis
    • Develop Recovery Strategies
    • Develop Recovery Plans
    • Test Recovery Plans
    • Update Plans
  • Business Impact Analysis (BIA)
    • Some functions can tolerate a delay, like loan processing, but others, like accessing and withdrawing account funds, need to be always live
    • A business must decide:
      • What are critical systems and functions?
      • Are there dependencies related to those systems?
      • What is the maximum downtime of those systems?
      • What scenarios would most likely affect those systems?
      • What is the potential loss from these scenarios?
    • You might decide that your maximum downtime is five hours, so now you need to plan how you would recover from any disaster in less than that time.
    • You might recognize that losing data from a secure server could cost you millions, so now you know you should be willing to spend a lot to make sure you never lose data from it.
    • Recovery Time Objective (RTO)
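      • Maximum length of time a system can be down before the outage becomes unacceptable.
      • Might have different RTOs for different systems
    • Recovery Point Objective (RPO)
      • The point in time you need to be able to recover data to, i.e. how much recent data you can afford to lose. This determines how often you need to back up.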
    • Continuity of Operations Planning (COOP)
      • Setting up an alternate location that can run things if things go nuts, like in a hurricane.
      • Hot site – when you need ops in 60 minutes.
      • Cold site – when you have a few days.
      • Mobile site – set up and tear down for when a company doesn’t want a permanent alternate site. Could be in a semi trailer.
      • Mirrored site – 100% identical to the primary location including real-time data transfer.
    • Disaster Recovery Plans (DRP)
      • Includes a hierarchical list of critical systems indicating the order in which to restore them
      • Common phases of the plan
        • Activate Disaster Recovery Plan
        • Implement Contingencies
          • Backup sites/systems, etc
        • Recover Critical Systems
        • Test Recovered Systems
        • Document and Review
      • Planning for Communications
        • War Room – conference room where people get their updates and report in
        • You must be able to communicate with these people even if cell lines are down:
          • Disaster Response Team Members
          • Employees
          • Customers
          • Suppliers
          • Media
            • Get a PR agency, don’t let a tech talk to the press
          • Regulatory Agencies
        • IT Contingency Planning
          • Focused only on IT, rather than full business
        • Succession Planning
          • Who takes over when…?
          • Who has say when…?
          • Someone needs authority, but it can’t be just anyone
        • BCP and DRP Testing
          • Tabletop and functional exercises
            • Backups
            • Server Restoration
            • Server Redundancy
            • Alternate Sites
          • Testing Controls
            • Try turning stuff off and see what breaks or what takes over
          • Escape Plans, Escape Routes, Drills
        • Implementing Environmental Controls
          • Heating, Ventilation and AC
            • If HVAC fails, it can fry your servers
            • Sometimes it’s worth shutting down the systems if the HVAC fails or can’t keep up with the load
          • Hot and Cold Aisles
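            • Hot aisles collect the hot air servers exhale, cold aisles supply the cold air they pull in
            • Make the backs of two rows of racks face each other to form a hot aisle, so the fronts face each other across a cold aisle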
          • HVAC and Fire
            • HVAC is often tied into the fire alarm system, because if it keeps pumping oxygen into a fire, the fire goes nuts
            • Alternatively, a well-designed HVAC can help fight the fire by closing dampers
          • Fail-safe vs Fail-open
            • Does it fail to be most secure, or most safe for people?
            • Exit doors should fail open so people can get out; firewalls should fail closed so nothing gets through unfiltered.
          • Fire Suppression
            • Remove the heat with water or chemical fire extinguishers
            • Remove the oxygen by displacing it with a gas such as CO2
            • Remove the fuel
            • Disrupt the chain reaction with chemicals
            • Four Classes of Fires
              • Class A – Ordinary Combustibles – wood, paper, cloth, rubber, trash, and plastic
              • Class B – Flammable Liquids and Gases – gasoline, propane, solvents, oil, paint, etc
              • Class C – Electrical Equipment – Computers, wiring, etc. Don’t throw water on it.
              • Class D – Combustible Metals – magnesium, lithium, titanium, sodium.
            • Environmental Monitoring
              • Includes Temp/Humidity sensors
              • Shielding – protects from EMI
                • If data radiates outside a cable through EMI, it can be stolen
                • Shielding Cables
                • Protected Distribution of Cabling
                  • Planning where you route cables so an attacker can’t splice on an RJ45 or fiber connector and tap into your network
                • Faraday Cages
                  • Room that prevents signal radiation past the barrier
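Going back to the RTO and RPO targets from the Business Impact Analysis, the two checks can be written down as simple comparisons. This is only a sketch under stated assumptions: the SystemPlan fields, the plan_gaps function, and every number below are hypothetical, and it assumes the worst-case data loss equals one full backup interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemPlan:
    name: str
    rto_hours: float              # longest the system may be down
    rpo_hours: float              # largest window of data the business can lose
    backup_interval_hours: float  # time between backups
    tested_restore_hours: float   # restore time measured in a recovery test


def plan_gaps(plan):
    """List the ways a system's current setup misses its RTO or RPO."""
    gaps = []
    # Worst case, the failure hits just before the next backup, so a full
    # backup interval of data is lost.
    if plan.backup_interval_hours > plan.rpo_hours:
        gaps.append(f"{plan.name}: back up at least every {plan.rpo_hours:g} h to meet the RPO")
    if plan.tested_restore_hours > plan.rto_hours:
        gaps.append(f"{plan.name}: restore takes {plan.tested_restore_hours:g} h, RTO is {plan.rto_hours:g} h")
    return gaps


# Illustrative numbers only, loosely following the bank example in the notes.
plans = [
    SystemPlan("account withdrawals", rto_hours=1, rpo_hours=0.25,
               backup_interval_hours=24, tested_restore_hours=0.5),
    SystemPlan("loan processing", rto_hours=72, rpo_hours=24,
               backup_interval_hours=24, tested_restore_hours=8),
]
for plan in plans:
    for gap in plan_gaps(plan):
        print(gap)
```

In this made-up example the nightly backup is fine for loan processing but misses the RPO for account withdrawals, which is the kind of gap that pushes a system toward replication or a mirrored site.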