Saturday, February 21, 2015

Basics of Database Transnational Log File Functionality and Understanding the SQL Server Crash Recovery Process

Hello all I am going to talk about the (Crash) Recovery Process in SQL Server. My intention is to make below four bullet points understandable.  
  • The importance of the Transnational Log file for database
  • T Log's functionality when the database is online
  • T Log's role in crash recovery
  • The how the Recovery process works (very high level)
I used to have lot of confusion with these things, I am hoping I am bit clear now. 

Every SQL Serve database consists of two files known as master data file (.mdf) and Log data file (.ldf). Log data file usually know as Transnational log (T Log). Transnational log plays a key role in maintaining the ACID (Atomicity, Consistency, Isolation, Durability). Transnational log file is the key part of the the database during the crash recovery.  And crash recovery is only happens when the transnational log file is undamaged. If the T log is damaged the so called crash recovery will fail and the respective database will go into Suspect mode and it needs different approach to recover database which may involve data loss and etc. 

What is Recovery?
So we keep talking about Crash recovery and T log is important for SQL Server to do the crash recovery. The obvious question that came to my mind is what is recovery?
Before we start understanding the recovery process, we need to know the concept of Logging and Checkpoint. Logging is important function of the T Log when the database is Online.

Understanding Logging: Logging is the process that T Log uses to keep track of every change (structural, DDLs, DMLs, etc.) in the Buffer Pool related to the database using the concept of LSN (Log Sequence Number). A LSN number is an ever-increasing, three-part number that uniquely defines the position of a log record (DML log, DDL log any change log) within the transaction log.

Understanding Checkpoints: A checkpoint is periodic operation that flushes the dirty pages (Committed transactions or Committed modifications from DDLs, DMLs and other operations and uncommitted transactions) from buffer pool to disk. And it also updates the database page with last LSN of the last log record that affected the page. 

Note: Every database page have a field in their page header (a 96-byte portion of the 8192-byte page that contains metadata about the page) that contains the LSN of the last log record that affected the page. 

Recovery: Recovery uses LSNs from T log and LSN of the last log record that effected DB page to do the Crash Recovery. Recovery is helpful to bring the DB back to the point where DB is transactions wise consistent after Crash. Please look at below how the crash recovery uses the LSNs from T Log and LSN of the last log record on page header to do recovery. 

If LSN of last Log record on page header > = Committed LSN in T Log ---> Recovery will do nothing, as that transaction already written to page on disk.
If LSN of last Log record on page header < Committed LSN in T Log ---> Recovery will do REDO (Reply or Roll Forward) as that transaction not written to page yet on disk.
If LSN of last Log record on page header <= Uncommitted LSN in T Log ---> Recovery will do UNDO(Roll back) as that transaction is not needed to written to page on disk to make sure this effects are not persisted.
If LSN of last Log record on page header > Uncommitted LSN in T Log ---> Recovery will do REDO (Roll Forward) as that transaction is needed to written to page on disk.

The whole point of the Recovery is recovering the Database to a consistent state, if the effects (LSNs) of a partially complete transaction were not rolled back, the database would be left in an inconsistent state (possibly even structurally corrupt, depending on what the transaction was in the middle of doing).

Note regarding the CheckpointsCheckpoints will reduce the pages (LSNs) that changed but not written to disk because of committed transaction. This leads to reduction in the number of pages that need to have REDO recovery applied during crash recovery.

What is a Committed transaction?
Transactions in SQL Server are either explicit or implicit.
In case of Explicit transaction issuing the COMMIT TRAN after BEGIN TRAN makes that particular transaction committed. In case of Implicit transaction user or application does not explicitly issue a BEGIN TRANSACTION statement before issuing a T-SQL statement. However the Storage Engine will automatically start a transaction under the covers. and automatically commits the transaction that it started to wrap around the user's statement. When Storage Engine commits the transaction, it will become a committed transaction.

What is an Uncommitted Transaction?
Any Explicit transaction or statement that's not issued with external COMMIT TRAN after BEGIN TRAN is Uncommitted transaction.In case Implicit transaction if the transaction is still running and making changes and not reached the state where Storage Engine Commit(s) it is a uncommitted transaction.

The differences and similarities between the three recovery models of SQL Server: 



I hope this post will be helpful to someone :). 

I have gone through Paul Randal's article (to TechNet magazine) and using the same as reference for this post. 
Here is the link to the article for more details:
https://technet.microsoft.com/en-us/magazine/2009.02.logging.aspx


RESTORING A TEST\DEV DB USING A PROD BKP COPY....THE PROD DB PERMISSIONS GETS CARRIED OVER

Hi Guys!! I am going to discuss about a pretty common issue that every DBA face when they refresh a development\test database with the corresponding production database. Usually I used to fix the Orphaned users and let the developers (requester) know that they are good to use the development\test database now...But I always got a call back from the requester saying that they are not able to do the things on the newly restored development\test database that they used to do before the refresh.

Here is the problem... In our environment we give the different level of permissions to the users\developers\testes\application owners on the development\test databases than the corresponding production databases. The production database permissions are carried over and overlaid on the development\test database when I did the refresh the development\test database, this is the reason why developers are not able to do the things that they used do before refresh.

I know, Especially on databases with lot of users it is hard to fix the permissions on restored database user by user by asking them what permissions they used to have before refresh. 

The solution is script out the all the users and their permissions on the development\test database before starting the restoring process. Here I am attaching the script that generates the users\permissions in T-SQL, All we need to do is save the output and run it on restored database. This script generates individual, database roles and object level permissions.

Note: This script works with SQL Servers 2005 and above.


Output of the above script looks something like below:



Got It!!: Make sure you fix all the Orphaned users if any exists after you execute the script generated from the result . 


Hope this is what someone looking for. :)

Friday, February 20, 2015

SQL SERVER TO ORACLE REPLICATION...ORA-00001: UNIQUE CONSTRAINT VIOLATED ERROR

When I was working on a heterogeneous replication between SQL Server as publisher and Oracle as subscriber, one day I seethe  replication monitor throwing an error as below.
  • ORA-00001: unique constraint (Schema.TabbleName) violated (Source: MSSQL_REPL_ORACLE, Error number: 1) Get help: http://help/
I realized that the above error as the similar error (Data Inconsistency error message) that we usually get with SQL to SQL replication, Which we can fix by switching the Subscriber agent profile from Default agent profile to Continue on data consistency errors. BUT that fix didn't really helped me to fix this inconsistency error on SQL to Oracle replication. When I tried to force that fix it said " Replication Monitor could not set default agent profile for this type of agents." .

My fellow DBA helped me in fixing this issue. we created a subscriber User profile by taking the Continue on data consistency errors profile as template and changed the error code to 1 on Skip Errors parameter as shown below. Here we set the error_code to 1 as the error_code is 1 from the table MSrepl_Errors on distributor for the above error (Image 4).  And give some new name to the newly create profile and force the Oracle subscriber to use that profile.

Note: Please be sure to switch the profile back to Default profile after the replication passes the problem records.




Images 1 to 3 will help creating and editing the User profile for the Oracle subscriber.
Image 4 is MS_Replerrors output showing the error_Code info.


Got It!!: As the error codes are different for Oracle subscription for inconsistency that is the reason why replication monitor doesn't allowed us to use the existing Continue on data consistency errors profile for this error on SQL to Oracle replication...And Same procedure (FIX) will apply for any other heterogeneous replications with SQL Server as publisher...


Hope this helps!!!

Thursday, February 19, 2015

MY BLOG INTRODUCTION POST

This is my first post. I want to share the STUFF that I know about Microsoft SQL Server. I will try to be as much accurate as I can when I blog something here. Hopefully you can find some help with SQL Server related issues\Concepts\Bugs from here. ;)