Friday, May 29, 2009

Concurrency Handling

The DataAdapter and optimistic concurrency

Before we get into some of the methods for handling concurrency issues in updating a database with ADO.NET, let's talk a little about the DataAdapter.

When you are in Visual Studio .NET, you can use the Data Adapter Configuration Wizard to generate all the Command objects in the DataAdapter. This simplifies operations, although you are free to generate these commands yourself. In addition, you can also use the Commands generated by the wizards as the basis for customized Stored Procedures, which is hopefully the direction you would want to take things in the first place.

What is often not obvious to developers is that, by default, the Data Adapter Configuration Wizard generates SQL statements that check for optimistic concurrency violations. Optimistic concurrency means that you assume no one will be making edits to a record while you are making your edits. Because you are "optimistic" that two people will not edit the same record simultaneously, you do not apply a lock to the record as soon as a user starts to edit it. Instead, you apply the lock only when the actual update is attempted. This results in a much shorter lock, and in database terms, this is good. Locks are smart; but long locks are not smart.

You can see this behavior by starting a new project in VS.NET and dragging a SqlDataAdapter or OleDbDataAdapter onto a designer surface. If you step through the wizard and choose to use SQL statements, the next page of the wizard will allow you to type in the Select statement. This page has a button labeled "Advanced Options". Clicking this button opens the Advanced SQL Generation Options dialog box, and by default, the option for using optimistic concurrency is turned on.

To check for optimistic concurrency violations, the Data Adapter Configuration Wizard writes SQL statements that verify the record you are about to update or delete has not changed since you originally loaded it into your DataSet. The Wizard does this by adding a rather large WHERE clause to the SQL statement to verify that it is updating or deleting an exact match of what was downloaded and placed in the DataSet. However, this also relies on another aspect of the DataSet: DataTables can store multiple versions of each row. Each DataTable in a DataSet is made up of DataRow objects. Each record in the table can actually exist multiple times, where each record has a different DataRowVersion. The ones that come into play with optimistic concurrency are Current and Original.

When the DataAdapter first populates your DataSet with records, each DataRow is given a DataRowVersion of Original. If you then edit a record, the changes are stored in a new version of the record with a DataRowVersion of Current. Now, you have both the Original version (from the database) and the Current version (the changes you want to make). It is called the Current version because it is the current version of the data in your DataSet, but, of course, it is not yet the current version in the underlying database.

The Four Basic Concurrency Handling Options

At this point, you have four basic optimistic concurrency options in SQL update queries:

1) Include Only the Primary Key Columns

You can include only the primary columns in the SQL UPDATE and DELETE queries. This creates a "last in wins" updating scenario. Both update attempts will succeed. Obviously, the database is not going to maintain both sets of changes. The changes made by the last update will override the previous changes.

The "last in wins" scenario is not appropriate if you want to prevent users from unwittingly overwriting other users' changes.

The CommandBuilder object does not offer this optimistic concurrency option; the Visual Studio.Net Data Adapter Configuration Wizard does. On the Advanced Options tab, you would deselect the "Use Optimistic Concurrency" check box.

2) Include All Columns in the WHERE Clause

As mentioned above, the default behavior of both the CommandBuilder and the Data Adapter Configuration Wizard is to include all the columns in the WHERE clause. This prevents your code from overwriting changes made by other users between the time your code retrieves the row and the time your code attempts to submit the pending change in the row.

Because the value of another user's updated column for a row of data has changed in the database, no row in the table satisfies all the criteria in the query's WHERE clause. Thus, the database does not modify the row. The DataAdapter queries the database to determine how many rows the query modified, discovers that the query did not successfully update the desired row, and marks the DataRow accordingly.

This is the concurrency option that the CommandBuilder object uses, and as explained, the Data Adapter Configuration Wizard uses this concurrency option by default.

You control how the DataAdapter responds to a failed update attempt by using the ContinueUpdateOnError property. By default, this property is set to False, which means that the DataAdapter will throw a DBConcurrencyException when it encounters a failed update attempt. If you want the DataAdapter to attempt to submit the remaining changes, you would set its ContinueUpdateOnError property to True.

If you set this property to True and one or more of the update attempts fail, the DataAdapter will not throw an exception. When the DataAdapter encounters a failed update attempt, it will set the HasErrors property of the corresponding DataRow object to True and set the RowError property of the DataRow to the concurrency error message. You can then check the HasErrors property of your DataSet or DataTable after calling DataAdapter.Update to determine whether any of the update attempts failed, and decide how your application should handle this business logic for the user.

Some developers will want to submit changes in a transaction and commit the changes only if all update attempts succeed. In such scenarios, you'll probably want to leave the ContinueUpdateOnError property set to its default value of False and roll back the transaction if the Update method throws an exception.

3) Include the Primary Key and Timestamp Columns

You can simplify the WHERE clause of your query-based updates by relying on timestamp columns. The SQL Server timestamp column does not actually contain date and time information. Instead, it contains binary data that's unique within the database.
You can define a timestamp column on your SQL Server table, and any time the contents of a row changes, SQL Server will modify the value of the timestamp column for that row. We could add a timestamp column to the Northwind Customers table and make our query look like this:
UPDATE Customers SET CustomerID = 'ALFKI', CompanyName = 'Original Company Name', ContactName = 'New Contact', Phone = '901-345-9856' WHERE CustomerID = 'ALFKI' AND TimestampColumn = 0x00000000000000CC Because the server will generate a new value for the timestamp column each time it updates a row, you can use a combination of the primary key and timestamp columns in the WHERE clause of your query-based updates to ensure that you don't overwrite another user's changes.
Neither the CommandBuilder nor the Data Adapter Configuration Wizard supports generating updating logic using this optimistic concurrency strategy.
NOTE: As of SQL Server 2000, "rowversion" is now synonymous with the timestamp data type. The SQL Server documentation recommends using the "rowversion" keyword instead of timestamp.
This option provides a much simpler updating logic, and the database has fewer columns to examine per update attempt.

4) Include the Primary Key Columns and Modified Columns

By default, the ADO cursor engine includes only the primary key columns and the original values of modified columns in the WHERE clause of its query-based updates. The cursor engine also includes only the modified columns in the SET clause of UPDATE queries.

Let's look at a multi-user example using this updating strategy. Let's say that User A and User B retrieve the same row of customer data at the same time. They each modify a different column of data -- User A changes the CompanyName column, and User B changes the ContactName column. User B submits the pending change to the ContactName column first. User B's UPDATE query looks like this:

UPDATE Customers SET ContactName = 'New Contact' WHERE CustomerID = 'ALFKI' AND ContactName = 'Original Contact'
User A then submits the pending change to the CompanyName column using the following UPDATE query:

UPDATE Customers SET CompanyName = 'New Company Name' WHERE CustomerID = 'ALFKI' AND CompanyName = 'Original Company Name'
The contents of the row will change from :

CustomerID CompanyName ContactName ---------- --------------------- ---------------- ALFKI Original Company Name Original Contact
to

CustomerID CompanyName ContactName ---------- --------------------- ---------------- ALFKI Original Company Name New Contact
and finally to

CustomerID CompanyName ContactName ---------- --------------------- ---------------- ALFKI New Company Name New Contact
Both updates will succeed, and the change made by User A will not overwrite changes made by User B.

The structure of the ADO.NET DataAdapter does not lend itself to this updating strategy because it requires that you change the structure of the query based on the columns that have been modified in the row that contains the pending change. The DataAdapter supplies values for the parameters in its query-based updates on a row-by-row basis, but it does not modify the actual structure of the parameterized query.

You could theoretically write code to dynamically change the structure of the appropriate Command object and use that code while handling the DataAdapter object's RowUpdating event. However, while this updating strategy has benefits, the costs are too high.

Submitting Updates in Transactions

What if you want to submit all of your updates as a single unit so that either all of the updates succeed or none of them do? The simple answer is to wrap your updates in a transaction. But, there is a problem -- the DataAdapter does not expose a Transaction property.

The DataAdapter does not actually submit the updates. It hands the work off to the Command objects in its UpdateCommand, InsertCommand, and DeleteCommand properties. The Command objects expose a Transaction property, so in order to submit the changes using the DataAdapter, you must set the Transaction property of the Command objects that the DataAdapter will use.

The following code shows one way to accomplish this task:

string strConn, strSQL; strConn = "Provider=SQLOLEDB;Data Source=(local);Initial Catalog=Northwind;Trusted_Connection=Yes;"; strSQL = "SELECT OrderID, ProductID, Quantity, UnitPrice FROM [Order Details] WHERE OrderID = 11511" + " ORDER BY ProductID"; DataTable tbl = new DataTable(); OleDbConnection cn = new OleDbConnection(strConn); OleDbDataAdapter da = new OleDbDataAdapter(strSQL, cn); cn.Open(); da.Fill(tbl); //Modify the contents of the DataTable. //Create a new transaction. OleDbTransaction txn = cn.BeginTransaction(); //Set the Transaction property of the DataAdapter's Commands. da.UpdateCommand.Transaction = txn; da.InsertCommand.Transaction = txn; da.DeleteCommand.Transaction = txn; //Submit the changes. da.Update(tbl); //Commit the changes and close the connection. txn.Commit(); cn.Close();

It's more difficult to submit changes in a transaction if you're relying on the CommandBuilder object to generate your updating logic. The CommandBuilder does not actually generate the updating logic when you instantiate it. If you instantiate a CommandBuilder object and later call DataAdapter.Update , the CommandBuilder will not actually build the updating logic until you call the DataAdapter object's Update method. This behavior poses a slight problem if you want to use the CommandBuilder to submit changes in a transaction.

If you add the following line of code just before the call to the DataAdapter object's Update method, your code will succeed:

da.SelectCommand.Transaction = txn

So this summarizes the 4 major approaches to handling concurrency in a mult-user database update scenario using ADO.NET, specifically focusing on the DataAdapter. Hopefully this will provide you with the basis for further study, and some additional ideas on how you may want to structure your database updating logic for your application.

No comments:

Post a Comment