• Accidental DBA
  • Posts
  • SQL Server Reliability: Clearing the Fuzzy Thinking

SQL Server Reliability: Clearing the Fuzzy Thinking

The terminology around reliability is a mess

If you’ve ever said, “We’re covered, it’s replicated,” you’re in good company.

SQL Server is a massive, 35+ year-old product that has changed, renamed, and re-imagined its features so many times that even seasoned IT pros get tripped up. Between log shipping, mirroring, replication, snapshots, Availability Groups, clustering, and VM backups, it’s no wonder managers end up with fuzzy expectations about what keeps their data safe.

This isn’t a failure of leadership, it’s the side effect of a product that’s both deep and deceptively familiar. You’ve got “copies,” “replicas,” and “backups” everywhere, but they don’t all mean the same thing.

Let’s clear the fog.

Replication Isn’t Disaster Recovery

Replication was born for scale, not survival. It’s great for reporting, distributing data to remote sites, and keeping subsets of tables up to date. But if the publisher goes down? There’s no built-in recovery process. You’ll spend more time re-initializing and troubleshooting than restoring from a backup.

In addition, the subscribers are read-write. This means someone down stream can delete a row, causing updates at the publisher to not get moved, which can cause a huge issue.

Replication moves data, not databases. It’s not a safety net, it’s a delivery truck.

Log Shipping Is DR, But Not HA

Log shipping is a solid, dependable form of disaster recovery. It keeps a warm copy ready for manual failover which is perfect for a secondary data center or a standby VM in the cloud.

But it’s not high availability. There’s no automatic failover, no continuous read/write capability, and your secondary is always behind by at least one transaction log.

Think of log shipping as a lifeboat, not a bridge. It’ll get you safely across the storm, but only after you climb in.

Snapshots Aren’t Backups

A SQL Server database snapshot freezes a view of a database at a moment in time - handy for quick rollbacks or comparing changes. A VM snapshot, on the other hand, captures the state of a virtual machine. Neither one replaces a backup that can be restored independently to a new server.

If you are snapshotting the entire server...how does that help you restore to a point in time for one database without bringing ALL of them backwards. That’s a really awkward conversation to have with the CEO.

Snapshots are shortcuts, not insurance policies.

Availability Groups Don’t Replace Backups

Even the shiniest Always On Availability Group can’t save you from data corruption, accidental deletes, or bad deployments that replicate instantly across every node.
Availability is not durability. You still need backups — tested, verified, and stored safely elsewhere.

An AG keeps you online, not immune.

Why It Gets Fuzzy

SQL Server grew up in an enterprise world, full of legacy features that overlap and evolve. Microsoft rarely deprecates things outright, so we’re left with a toolbox that has five ways to move data and ten ways to misunderstand them.

Don’t feel bad. SQL Server’s feature naming team must have had a bet going to see how many words they could reuse. These doesn’t help:

  • “Replication” sounds like “redundancy.”

  • “Availability” sounds like “recovery.”

  • “Snapshots” sound like “backups.”

That’s why it takes a DBA’s mindset. Understanding and transalating these terms takes intentionality and years of exposure to them.

What can you do?

  1. Inventory your protection layers. List what you’re using (replication, AGs, log shipping, backups) and define what each one really does.

  2. Map to business needs. High availability ≠ disaster recovery ≠ data protection. You probably need all three, in different forms.

  3. Test the ugly scenarios. Shut down primaries, take databases offline, pull power cords (in dev). You’ll learn fast which “solution” is just a hope.

  4. Document the real RTO/RPO. Your CFO doesn’t care if it’s log shipping or AGs - they care how long you’ll be down and how much data you’ll lose.

 

The Bottom Line

Fuzzy thinking is normal, but it’s dangerous when disaster strikes.

Replication moves data. Log shipping recovers data. Availability Groups keep you online.

Only backups and testing keep you safe.

When you know what’s really protecting you, you can make confident, budget-friendly choices, not guesses wrapped in acronyms.

 

Free Reliability Check

We’ll focus on your server’s potential downtime and data-loss

SQL TidBit:

Want to see which databases are actually protected?
This quick check shows the database name, recovery model, last full backup, and last log backup times:

SELECT
    d.name AS [Database],
    d.recovery_model_desc AS [RecoveryModel],
    MAX(CASE WHEN bs.type = 'D' THEN bs.backup_finish_date END) AS [LastFullBackup],
    MAX(CASE WHEN bs.type = 'L' THEN bs.backup_finish_date END) AS [LastLogBackup]
FROM sys.databases d
LEFT JOIN msdb.dbo.backupset bs
    ON bs.database_name = d.name
WHERE d.name <> 'tempdb'
GROUP BY d.name, d.recovery_model_desc
ORDER BY d.name;

 Note: If a database is in FULL recovery model but has no recent log backup, your log file is probably growing, and could eventually be a disk or file full, causing an outage.

Please share with your manager to help them understand the differences in SQL Server HA, DR and Distributed computing. They deserve a “cheat sheet”

Reply

or to participate.