Core C# and .NET
< Day Day Up > |
The ADO.NET disconnected model is based on using a DataSet object as an in-memory cache. A DataAdapter serves as the intermediary between the DataSet and the data source that loads the cache with data. After it has completed its task, the DataAdapter returns the connection object to the pool, thus disconnecting the data from the data source. Interestingly, the DataAdapter is actually a wrapper around a data provider's DataReader, which performs the actual data loading. The DataSet Class
In many ways, a DataSet object plays the role of an in-memory database. Its Tables property exposes a collection of DataTables that contain data and a data schema describing the data. The Relations property returns a collection of DataRelation objects that define how tables are interrelated. In addition, DataSet methods are available to Copy, Merge, and Clear the contents of the DataSet. Keep in mind that the DataSet and DataTable are core parts of ADO.NET and unlike the Connection, DataReader, and DataAdapter they are not tied to a specific data provider. An application can create, define, and populate a DataSet with data from any source. Besides tables and their relations, a DataSet can also contain custom information defined by the application. A look at Figure 11-6 shows the major collection classes in the DataSet hierarchy. Among these is PropertyCollection, which is a set of custom properties stored in a hash table and exposed through the DataSet.ExtendedProperties property. It is often used to hold a time stamp or descriptive information such as column validation requirements for tables in the data set. Figure 11-6. DataSet class hierarchy
The discussion of the DataSet class begins with its most important member the DataTable collection. DataTables
One step below the DataSet in the disconnected model hierarchy is the DataTable collection. This collection accessed through the DataSet.Tables property stores data in a row-column format that mimics tables in a relational database. The DataTable class has a rich set of properties and methods that make it useful as a stand-alone data source or as part of a table collection in a DataSet. The most important of these are the Columns and Rows properties, which define the layout and content of a table. DataColumns
The DataTable.Columns property exposes a collection of DataColumn objects that represent each data field in the DataTable. Taken together, the column properties produce the data schema for the table. Table 11-5 summarizes the most important properties.
DataTable columns are created automatically when the table is filled with the results of a database query or from reading an XML file. However, for applications that fill a table dynamically such as from user input or real-time data acquisition it may be necessary to write the code that defines the table structure. It's a worthwhile exercise in its own right that enhances a developer's understanding of the DataSet hierarchy. The following segment creates a DataTable object, creates DataColumn objects, assigns property values to the columns, and adds them to the DataTable. To make things interesting, a calculated column is included. DataTable tb = new DataTable("Order"); DataColumn dCol = new DataColumn("ID", Type.GetType("System.Int16")); dCol.Unique = true; // ID must be unique for each data row dCol.AllowDBNull = false; tb.Columns.Add(dCol); dCol= new DataColumn("Price", Type.GetType("System.Decimal")); tb.Columns.Add(dCol); dCol=new DataColumn("Quan",Type.GetType("System.Int16")); tb.Columns.Add(dCol); dCol= new DataColumn("Total",Type.GetType("System.Decimal")); dCol.Expression= "Price * Quan"; tb.Columns.Add(dCol); // List column names and data type foreach (DataColumn dc in tb.Columns) { Console.WriteLine(dc.ColumnName); Console.WriteLine(dc.DataType.ToString()); }
Note that the ID column is defined to contain unique values. This constraint qualifies the column to be used as a key field in establishing a parent-child relationship with another table in a DataSet. To qualify, the key must be unique as in this case or defined as a primary key for the table. You assign a primary key to a table by setting its PrimaryKey field to the value of an array containing the column(s) to be used as the key. Here is an example that specifies the ID field a primary key: DataColumn[] col = {tb.Columns["ID"]}; tb.PrimaryKey = col;
We'll see how to use a primary key to create table relationships and merge data later in this section. Core Note
DataRows
Data is added to a table by creating a new DataRow object, filling it with column data, and adding the row to the table's DataRow collection. Here is an example that places data in the table created in the preceding example. DataRow row; row = tb.NewRow(); // Create DataRow row["Title"] = "Casablanca"; row["Price"] = 22.95; row["Quan"] = 2; row["ID"] = 12001; tb.Rows.Add(row); // Add row to Rows collection Console.WriteLine(tb.Rows[0]["Total"].ToString()); // 45.90 A DataTable has methods that allow it to commit and roll back changes made to the table. In order to do this, it keeps the status of each row in the DataRow.RowState property. This property is set to one of five DataRowState enumeration values: Added, Deleted, Detached, Modifed, or Unchanged. Let's extend the preceding example to demonstrate how these values are set: tb.Rows.Add(row); // Added tb.AcceptChanges(); // ...Commit changes Console.Write(row.RowState); // Unchanged tb.Rows[0].Delete(); // Deleted // Undo deletion tb.RejectChanges(); // ...Roll back Console.Write(tb.Rows[0].RowState); // Unchanged DataRow myRow; MyRow = tb.NewRow(); // Detached
The two DataTable methods AcceptChanges and RejectChanges are equivalent to the commit and rollback operations in a database. These apply to all changes made from the time the table was loaded or since AcceptChanges was previously invoked. In this example, we are able to restore a deleted row because the deletion is not committed before RejectChanges is called. Note that the changes are to the data table not the original data source. For each column value in a row, ADO.NET maintains a current and original value. When RejectChanges is called, the current values are set to the original values. The opposite occurs if AcceptChanges is called. The two sets of values can be accessed concurrently through the DataRowVersion enumerations Current and Original: DataRow r = tb.Rows[0]; r["Price"]= 14.95; r.AcceptChanges(); r["Price"]= 16.95; Console.WriteLine("Current: {0} Original: {1} ", r["Price",DataRowVersion.Current], r["Price",DataRowVersion.Original]); // output: Current: 16.95 Original: 14.95 Keeping track of table row changes takes on added importance when the purpose is to update an underlying data source. We'll see later in this section how the DataAdapter updates database tables with changes made to DataTable rows. Loading Data into a DataSet
Now that we have seen how to construct a DataTable and punch data into it row-by-row, let's look at how data and a data schema can be automatically loaded from a relational database into tables in a DataSet. For details on loading XML data, refer to Section 11.5, "XML and ADO.NET," on page 533. Using the DataReader to Load Data into a DataSet
A DataReader object can be used in conjunction with a DataSet or DataTable to fill a table with the rows generated by a query. This requires creating a DataReader object and passing it as a parameter to the DataTable.Load method: cmd.CommandText = "SELECT * FROM movies WHERE movie_Year < 1945"; DBDataReader rdr = cmd.ExecuteReader(CommandBehavior.CloseConnection); DataTable dt = new DataTable("movies"); dt.Load(rdr); // Load data and schema into table Console.WriteLine(rdr.IsClosed); // True
The DataReader is closed automatically after all of the rows have been loaded. The CloseConnection parameter ensures that the connection is also closed. If the table already contains data, the Load method merges the new data with the existing rows of data. Merging occurs only if rows share a primary key. If no primary key is defined, rows are appended. An overloaded version of Load takes a second parameter that defines how rows are combined. This parameter is a LoadOption enumeration type having one of three values: OverwriteRow, PreserveCurrentValues, or UpdateCurrentValues. These options specify whether the merge operation overwrites the entire row, original values only, or current values only. This code segment illustrates how data is merged with existing rows to overwrite the current column values: cmd.CommandText = "SELECT * FROM movies WHERE movie_Year < 1945"; DBDataReader rdr = cmd.ExecuteReader( ); DataTable dt = new DataTable("movies"); dt.Load(rdr); // Load rows into table Console.Write(dt.Rows[0]["movie_Title"]); // Casablanca // Assign primary key so rows can be merged DataColumn[] col = new DataColumn[1]; col[0] = dt.Columns["movie_ID"]; dt.PrimaryKey = col; DataRow r = dt.Rows[0]; // Get first row of data r["movie_Title"] = "new title"; // Change current column value // Since reader is closed, must fill reader again rdr = cmd.ExecuteReader(CommandBehavior.CloseConnection); // Merge data with current rows. Overwrites current values dt.Load(rdr, LoadOption.UpdateCurrentValues ); // Updated value has been overwritten Console.Write(dt.Rows[0]["movie_Title"]); // Casablanca Using the DataAdapter to Load Data into a DataSet
A DataAdapter object can be used to fill an existing table, or create and fill a new table, with the results from a query. The first step in this process is to create an instance of the DataAdapter for a specific data provider. As the following code shows, several constructor overloads are available: // (1) The easiest: a query and connection string as arguments String sql = "SELECT * FROM movies"; SqlDataAdapter da = new SqlDataAdapter(sql, connStr); // (2) Assign a command object to the SelectCommand property SqlDataAdapter da = new SqlDataAdapter(); SqlConnection conn = new SqlConnection(connStr); da.SelectCommand = new SqlCommand(sql,conn); // (3) Pass in a query string and connection object SqlConnection conn = new SqlConnection(connStr); SqlDataAdapter da = new SqlDataAdapter(sql, conn);
Of these, the first version is the simplest. It accepts two strings containing the query and connection. From these, it constructs a SqlCommand object that is assigned internally to its SelectCommand property. Unlike the other constructors, there is no need to write code that explicitly creates a SqlCommand or SqlConnection object. In the overloads that accept a connection object as a parameter, the opening and closing of the connection is left to the DataAdapter. If you add a statement to explicitly open the connection, you must also include code to close it. Otherwise, the DataAdapter leaves it open, which locks the data in the database. After the DataAdapter object is created, its Fill method is executed to load data into a new or existing table. In this example, a new table is created and assigned the default name Table: DataSet ds = new DataSet(); // Create DataTable, load data, and add to DataSet // Could use da.Fill(ds,"movies") to specify table name. int numrecs = da.Fill(ds); // Returns number of records loaded
For an existing table, the behavior of the Fill command depends on whether the table has a primary key. If it does, those rows having a key that matches the key of the incoming data are replaced. Incoming rows that do not match an existing row are appended to the DataTable. Using the DataAdapter to Update a Database
After a DataAdapter has loaded data into a table, the underlying connection is closed, and subsequent changes made to the data are reflected only in the DataSet not the underlying data source. To apply changes to the data source, a DataAdapter is used to restore the connection and send the changed rows to the database. The same DataAdapter used to fill the DataSet can be used to perform this task. The DataAdapter has three properties InsertCommand, DeleteCommand, and UpdateCommand that are assigned the actual SQL commands to perform the tasks that correspond to the property name. These commands are executed when the Upate method of the DataAdapter is invoked. The challenge lies in creating the SQL commands that post the changes and assigning them to the appropriate DataAdapter properties. Fortunately, each data provider implements a CommandBuilder class that can be used to handle this task automatically. The CommandBuilder Object
A CommandBuilder object generates the commands necessary to update a data source with changes made to a DataSet. It's amazingly self-sufficient. You create an instance of it by passing the related DataAdapter object to its constructor; then, when the DataAdapter.Update method is called, the SQL commands are generated and executed. The following segment shows how changes to a DataTable are flushed to the database associated with the DataAdapter: DataTable dt= ds.Tables["movies"]; // Shortcut to reference table // (1) Use command builder to generate update commands SqlCommandBuilder sb = new SqlCommandBuilder(da); // (2) Add movie to table DataRow drow = dt.NewRow(); drow["movie_Title"] = "Taxi Driver"; drow["movie_Year"] = "1976"; dt.Rows.Add(drow); // (3) Delete row from table dt.Rows[4].Delete(); // (4) Edit Column value dt.Rows[5]["movie_Year"] = "1944"; // (5) Update underlying Sql Server table int updates = da.Update(ds, "movies"); MessageBox.Show("Rows Changed: " +updates.ToString()); // 3
There are a couple of restrictions to be aware of when using the CommandBuilder: The Select command associated with the DataAdapter must refer to a single table, and the source table in the database must include a primary key or a column that contains unique values. This column (or columns) must be included in the original Select command. Core Note
Synchronizing the DataSet and the DataBase
As demonstrated in this example, the use of a DataAdapter simplifies and automates the process of updating a database or any data store. However, there is a rock in this snowball: the problem of multi-user updates. The disconnected model is based on optimistic concurrency, an approach in which the rows of the underlying data source are not locked between the time they are read and the time updates are applied to the data source. During this interval, another user may update the data source. Fortunately, the Update method recognizes if changes have occurred since the previous read and fails to apply changes to a row that has been altered. There are two basic strategies for dealing with a concurrency error when multiple updates are being applied: roll back all changes if a violation occurs, or apply the updates that do not cause an error and identify the ones that do so they can be reprocessed. Using Transactions to Roll Back Multiple Updates
When the DataAdapter.ContinueUpdateonErrors property is set to false, an exception is thrown when a row update cannot be completed. This prevents subsequent updates from being attempted, but does not affect updates that occurred prior to the exception. Because updates may be interdependent, applications often require an all-or-none strategy. The easiest way to implement this strategy is to create a .NET transaction in which all of the update commands execute. To do so, create a SqlTransaction object and associate it with the SqlDataAdapater.SelectCommand by passing it to its constructor. If an exception occurs, the transaction's Rollback method is used to undo any changes; if no exceptions occur, the Commit method is executed to apply all the update commands. Listing 11-4 is an example that wraps the updates inside a transaction. Listing 11-4. Using Transaction to Roll Back Database Updates
SqlDataAdapter da = new SqlDataAdapter(); SqlCommandBuilder sb = new SqlCommandBuilder(da); SqlTransaction tran; SqlConnection conn = new SqlConnection(connStr); conn.Open(); // Must open to use with transaction // (1) Create a transaction SqlTransaction tran = conn.BeginTransaction(); // (2) Associate the SelectCommand with the transaction da.SelectCommand = new SqlCommand(sql, conn, tran); DataSet ds = new DataSet(); da.Fill(ds, "movies"); // // Code in this section makes updates to DataSet rows try { int updates = da.Update(ds, "movies"); MessageBox.Show("Updates: "+updates.ToString()); } // (3) If exception occurs, roll back all updates in transaction catch (Exception ex) { MessageBox.Show(ex.Message); // Error updating if (tran != null) { tran.Rollback(); // Roll back any updates tran = null; MessageBox.Show("All updates rolled back."); } } finally { // (4) If no errors, commit all updates if (tran != null) { tran.Commit(); MessageBox.Show("All updates successful. "); tran = null; } } conn.Close();
Identifying Rows That Cause Update Errors
When DataAdapter.ContinueUpdateonErrors is set to TRue, processing does not halt if a row cannot be updated. Instead, the DataAdapter updates all rows that do not cause an error. It is then up to the programmer to identify the rows that failed and determine how to reprocess them. Rows that fail to update are easily identified by their DataRowState property (discussed earlier in the description of DataRows). Rows whose update succeeds have a value of Unchanged; rows that fail have their original Added, Deleted, or Modified value. A simple code segment demonstrates how to loop through the rows and identify those that are not updated (see Listing 11-5). Listing 11-5. Identify Attempts to Update a Database That Fails
// SqlDataAdapter da loads movies table da.ContinueUpdateOnError = true; DataSet ds = new DataSet(); try { da.Fill(ds, "movies"); DataTable dt = ds.Tables["movies"]; SqlCommandBuilder sb = new SqlCommandBuilder(da); // ... Sample Update operations dt.Rows[29].Delete(); // Delete dt.Rows[30]["movie_Year"] = "1933"; // Update dt.Rows[30]["movie_Title"] = "King Kong"; // Update dt.Rows[31]["movie_Title"] = "Fantasia"; // Update DataRow drow = dt.NewRow(); drow["movie_Title"] = "M*A*S*H"; drow["movie_Year"] = "1970"; dt.Rows.Add(drow); // insert // Submit updates int updates = da.Update(ds, "movies"); // Following is true if any update failed if (ds.HasChanges()) { // Load rows that failed into a DataSet DataSet failures = ds.GetChanges(); int rowsFailed = failures.Rows.Count; Console.WriteLine("Update Failures: "+rowsFailed); foreach (DataRow r in failures.Tables[0].Rows ) { string state = r.RowState.ToString()); // Have to reject changes to show deleted row if (r.RowState == DataRowState.Deleted) r.RejectChanges(); string ID= ((int)r["movie_ID"]).ToString(); string msg= state + " Movie ID: "+ID; Console.WriteLine(msg); } } Note that even though the delete occurs first, it does not affect the other operations. The SQL statement that deletes or updates a row is based on a row's primary key value not relative position. Also, be aware that updates on the same row are combined and counted as a single row update by the Update method. In this example, updates to row 30 count as one update. Handling concurrency issues is not a simple task. After you identify the failures, the next step how to respond to the failures is less clear, and depends on the application. Often times, it is necessary to re-read the rows from the database and compare them with the rows that failed in order to determine how to respond. The ability to recognize RowState and the current and original values of rows is the key to developing code that resolves update conflicts. Defining Relationships Between Tables in a DataSet
A DataRelation is a parent/child relationship between two DataTables. It is defined on matching columns in the two tables. The columns must be the same DataType, and the column in the parent table must have unique values. The syntax for its constructor is public DataRelation( string relationName, DataColumn parentColumn, DataColumn childColumn)
A DataSet has a Relations property that provides access to the collection of DataRelations defined for tables contained in the DataSet. Use the Relations.Add method to place relations in the collection. Listing 11-6 illustrates these ideas. It contains code to set up a parent/child relationship between the directors and movies tables in order to list movies by each director. Listing 11-6. Create a Relationship Between the Directors and Movies Tables
DataSet ds = new DataSet(); // (1) Fill table with movies string sql = "SELECT movie_ID,movie_Title,movie_DirectorID, movie_Year FROM movies"; SqlConnection conn = new SqlConnection(connStr); SqlCommand cmd = new SqlCommand(); SqlDataAdapter da = new SqlDataAdapter(sql, conn); da.Fill(ds, "movies"); // (2) Fill table with directors sql = "SELECT director_id,(first_name + ' '+ last_name) AS fullname FROM directors"; da.SelectCommand.CommandText = sql; da.Fill(ds, "directors"); // (3) Define relationship between directors and movies DataTable parent = ds.Tables["directors"]; DataTable child = ds.Tables["movies"]; DataRelation relation = new DataRelation("directormovies", parent.Columns["director_ID"], child.Columns["movie_DirectorID"]); // (4) Add relation to DataSet ds.Relations.Add(relation); // (5) List each director and his or her movies foreach (DataRow r in parent.Rows) { Console.WriteLine(r["fullname"]; // Director name foreach (DataRow rc in r.GetChildRows("directormovies")) { Console.WriteLine(" "+rc["movie_title"]); } } /* Sample Output: David Lean Lawrence of Arabia Bridge on the River Kwai, The Victor Fleming Gone with the Wind Wizard of Oz, The */
Relations and Constraints
When a relationship is defined between two tables, it has the effect of adding a ForeignKeyConstraint to the Constraints collections of the child DataTable. This constraint determines how the child table is affected when rows in a parent table are changed or deleted. In practical terms, this means that if you delete a row in the parent table, you can have the related child row(s) deleted or optionally, have their key value set to null. Similarly, if a key value is changed in the parent table, the related rows in the child can have their key value changed or set to null. The rule in effect is determined by the value of the DeleteRule and UpdateRule properties of the constraint. These can take one of four Rule enumeration values:
This code segment illustrates how constraints affect the capability to add a row to a child table and delete or change a row in the parent table. The tables from the preceding example are used. // (1) Try to add row with new key to child table DataRow row = child.NewRow(); row["movie_directorID"] = 999; child.Rows.Add(row); // Fails 999 does not exist in parent // (2) Delete row in parent table row = parent.Rows[0]; row.Delete(); // Deletes rows in child having this key // (3) Relax constraints and retry adding row ds.EnforceConstraints = false; row["movie_directorID"] = 999; child.Rows.Add(row); // Succeeds ds.EnforceConstraints = true; // Turn back on // (4) Change constraint to set rows to null if parent changed ((ForeignKeyConstraint)child.Constraints[0]).DeleteRule = Rule.SetNull ; Note that setting the EnforceConstraints property to false turns off all constraints which in database terms eliminates the check for referential integrity. [3] This allows a movie to be added even though its movie_DirectorID column (foreign key) does not have a corresponding row in the directors table. It also permits a director to be deleted even though a movie by that director exists in the movies table. This clearly compromises the integrity of the database and should be used only when testing or populating individual tables in a database. [3] The foreign key in any referencing table must always refer to a valid row in the referenced table. Choosing Between the Connected and Disconnected Model
The DataReader and DataSet offer different approaches to processing data each with its advantages and disadvantages. The DataReader provides forward-only, read-only access to data. By processing a row at a time, it minimizes memory requirements. A DataSet, on the other hand, offers read/write access to data, but requires enough memory to hold a copy of the data fetched from a data source. From this, you can derive a couple of general rules: If the application does not require the capability to update the data source and is used merely for display and selection purposes, a DataReader should be the first consideration; if the application requires updating data, a DataSet should be considered. Of course, the general rules have to be weighed against other factors. If the data source contains a large number of records, a DataSet may require too many resources; or if the data requires only a few updates, the combination of DataReader and Command object to execute updates may make more sense. Despite the gray areas, there are many situations where one is clearly preferable to the other. A DataSet is a good choice when the following apply:
A DataReader is a good choice when the following apply:
|
< Day Day Up > |