Saturday, February 21, 2015

Basics of Database Transnational Log File Functionality and Understanding the SQL Server Crash Recovery Process

Hello all I am going to talk about the (Crash) Recovery Process in SQL Server. My intention is to make below four bullet points understandable.  
  • The importance of the Transnational Log file for database
  • T Log's functionality when the database is online
  • T Log's role in crash recovery
  • The how the Recovery process works (very high level)
I used to have lot of confusion with these things, I am hoping I am bit clear now. 

Every SQL Serve database consists of two files known as master data file (.mdf) and Log data file (.ldf). Log data file usually know as Transnational log (T Log). Transnational log plays a key role in maintaining the ACID (Atomicity, Consistency, Isolation, Durability). Transnational log file is the key part of the the database during the crash recovery.  And crash recovery is only happens when the transnational log file is undamaged. If the T log is damaged the so called crash recovery will fail and the respective database will go into Suspect mode and it needs different approach to recover database which may involve data loss and etc. 

What is Recovery?
So we keep talking about Crash recovery and T log is important for SQL Server to do the crash recovery. The obvious question that came to my mind is what is recovery?
Before we start understanding the recovery process, we need to know the concept of Logging and Checkpoint. Logging is important function of the T Log when the database is Online.

Understanding Logging: Logging is the process that T Log uses to keep track of every change (structural, DDLs, DMLs, etc.) in the Buffer Pool related to the database using the concept of LSN (Log Sequence Number). A LSN number is an ever-increasing, three-part number that uniquely defines the position of a log record (DML log, DDL log any change log) within the transaction log.

Understanding Checkpoints: A checkpoint is periodic operation that flushes the dirty pages (Committed transactions or Committed modifications from DDLs, DMLs and other operations and uncommitted transactions) from buffer pool to disk. And it also updates the database page with last LSN of the last log record that affected the page. 

Note: Every database page have a field in their page header (a 96-byte portion of the 8192-byte page that contains metadata about the page) that contains the LSN of the last log record that affected the page. 

Recovery: Recovery uses LSNs from T log and LSN of the last log record that effected DB page to do the Crash Recovery. Recovery is helpful to bring the DB back to the point where DB is transactions wise consistent after Crash. Please look at below how the crash recovery uses the LSNs from T Log and LSN of the last log record on page header to do recovery. 

If LSN of last Log record on page header > = Committed LSN in T Log ---> Recovery will do nothing, as that transaction already written to page on disk.
If LSN of last Log record on page header < Committed LSN in T Log ---> Recovery will do REDO (Reply or Roll Forward) as that transaction not written to page yet on disk.
If LSN of last Log record on page header <= Uncommitted LSN in T Log ---> Recovery will do UNDO(Roll back) as that transaction is not needed to written to page on disk to make sure this effects are not persisted.
If LSN of last Log record on page header > Uncommitted LSN in T Log ---> Recovery will do REDO (Roll Forward) as that transaction is needed to written to page on disk.

The whole point of the Recovery is recovering the Database to a consistent state, if the effects (LSNs) of a partially complete transaction were not rolled back, the database would be left in an inconsistent state (possibly even structurally corrupt, depending on what the transaction was in the middle of doing).

Note regarding the CheckpointsCheckpoints will reduce the pages (LSNs) that changed but not written to disk because of committed transaction. This leads to reduction in the number of pages that need to have REDO recovery applied during crash recovery.

What is a Committed transaction?
Transactions in SQL Server are either explicit or implicit.
In case of Explicit transaction issuing the COMMIT TRAN after BEGIN TRAN makes that particular transaction committed. In case of Implicit transaction user or application does not explicitly issue a BEGIN TRANSACTION statement before issuing a T-SQL statement. However the Storage Engine will automatically start a transaction under the covers. and automatically commits the transaction that it started to wrap around the user's statement. When Storage Engine commits the transaction, it will become a committed transaction.

What is an Uncommitted Transaction?
Any Explicit transaction or statement that's not issued with external COMMIT TRAN after BEGIN TRAN is Uncommitted transaction.In case Implicit transaction if the transaction is still running and making changes and not reached the state where Storage Engine Commit(s) it is a uncommitted transaction.

The differences and similarities between the three recovery models of SQL Server: 



I hope this post will be helpful to someone :). 

I have gone through Paul Randal's article (to TechNet magazine) and using the same as reference for this post. 
Here is the link to the article for more details:
https://technet.microsoft.com/en-us/magazine/2009.02.logging.aspx


No comments:

Post a Comment