DataSets and DataAdapters

ADO.NET’s SqlDataReader and OleDbDataReader classes provide stream-based access to the results of database queries. Streaming access is fast and efficient, but it’s also read-only and forward-only. You can’t, for example, back up and reread the previous record with a DataReader or change the results and write them back to the database. That’s why ADO.NET supports set-based data access as well as stream-based data access. Set-based accesses capture an entire query in memory and support backward and forward traversal through the result set. They also let you edit the data obtained through database queries, propagate the changes back to the data source, and much, much more.

Set-based data accesses revolve around two classes: DataSet, which is the equivalent of an in-memory database and is defined in the System.Data namespace, and DataAdapter, which serves as a bridge between DataSets and physical data sources. DataAdapter is actually two classes in one because it comes in provider-specific versions: SqlDataAdapter and OleDbDataAdapter. Learning about DataSet and DataAdapter unlocks the door to a whole new style of data access that further abstracts the SQL data model and lends itself extraordinarily well to data binding and caching.

The DataSet Class

If DataSet isn’t the most important class in ADO.NET, it’s certainly the one that gets the most attention. Think of a DataSet as an in-memory database. (See Figure 12-1.) The actual data is stored in DataTable objects, which are analogous to tables in a database. The DataSet.Tables property exposes a list of the DataTables in a DataSet. Records in a DataTable are represented by DataRow objects, and fields are represented by instances of DataColumn. DataTable properties named Rows and Columns expose the collections of DataRows and DataColumns that make up the table. DataTable also features a property named Constraints that permits constraints to be applied to individual columns. Ensuring that all the values in a column are unique, for example, is as simple as creating a UniqueConstraint object identifying the DataColumn and adding it to the table’s Constraints collection. DataSets also support data relations. The DataSet class’s Relations property holds a collection of DataRelation objects, each of which defines a relationship between two tables. DataTable, DataRow, DataColumn, UniqueConstraint, and DataRelation are all members of the System.Data namespace and are not provider-specific.

Figure 12-1

The DataSet object.

DataSets are ideal for capturing the results of database queries and storing them in memory for the purpose of examining and perhaps modifying the data. Unlike a DataReader, which supports forward-only access to the data that it encapsulates, a DataSet supports random access. You can also modify the contents of a DataSet and propagate the changes back to the database that provided the data in the first place. In addition, DataSets are great for caching, especially in Web applications. Rather than physically query a database every time a page is hit, for example, you can query the database once, capture the results in a DataSet, stuff the DataSet into ASP.NET’s application cache, and satisfy subsequent requests without touching the database. Of course, you would also implement a refresh policy to prevent the data in the cache from becoming stale.

DataSets vs. DataReaders

One of the most common questions that developers ask about ADO.NET is which is best, DataSets or DataReaders? The answer is: it depends. If your intention is simply to query a database and read through the records one at a time until you find the one you’re looking for, then DataReader is the right tool for the job. DataReaders, unlike DataSets, retrieve only the data that you actually use, and they don’t consume memory by storing every record that you read. If, however, you intend to use all the query results (perhaps because you’re displaying them in a table), you need the ability to iterate backward and forward through a result set, or you want to cache the result set in memory, use a DataSet.

As a corollary to the DataSet vs. DataReader debate, realize that many controls that support data binding to DataSets are perfectly capable of binding to DataReaders as well. Many examples in the .NET Framework SDK and elsewhere that demonstrate data binding to ASP.NET server controls show controls binding to DataSets:

DataSet?ds?=?new?DataSet?();
//?TODO:?Initialize?the?DataSet
MyDataGrid.DataSource?=?ds;
MyDataGrid.DataBind?();

Oftentimes, the same code can be implemented more efficiently with a Data-Reader:

SqlDataReader?reader?=?cmd.ExecuteReader?();
MyDataGrid.DataSource?=?reader;
MyDataGrid.DataBind?();

Binding to DataReaders is more efficient for the simple reason that it doesn’t leave the result set lying around in memory for the garbage collector to clean up.

DataAdapter Classes

While it’s perfectly possible to build DataSets in memory, from scratch, without relying on external data sources, in practice DataSets are almost always initialized from database queries or XML documents. DataSets don’t interact with databases directly; instead, they let DataAdapters do the dirty work. DataAdapter’s purpose is to perform database queries and create DataTables containing the query results. It’s also capable of writing changes made to the DataTables back to the database. Figure 12-2 diagrams the relationship between DataSets, DataAdapters, and databases. The DataAdapter acts as a go-between, providing a layer of abstraction between the DataSet and the physical data source.

Figure 12-2

The role of DataAdapter.

As mentioned earlier, DataAdapter is a provider-specific class that comes in two varieties: SqlDataAdapter, which interacts with databases using the SQL Server .NET provider, and OleDbDataAdapter, which uses the Framework’s OLE DB .NET provider. Both derive from a common base class—System.Data.Common.DbDataAdapter—and both feature a common set of methods and properties that control their behavior. Chief among a DataAdapter’s methods are Fill and Update. The former queries a database and initializes a DataSet (actually, a DataTable) with the results. The latter propagates changes back to the database. Fill is introduced in the next section. You’ll learn all about the Update method a few sections hence.

Initializing a DataSet from a Database: DataAdapter.Fill

The following code sample is representative of the numerous DataSet examples found in the .NET Framework SDK and on Web sites that cater to .NET developers:

SqlDataAdapter?adapter?=?new?SqlDataAdapter?("select?*?from?titles",
??? "server=localhost;database=pubs;uid=sa;pwd=");
DataSet?ds?=?new?DataSet?();
adapter.Fill?(ds, "Titles");

What does this code do? A lot. Here’s a blow-by-blow of what happens inside the call to Fill:

Fill opens a connection to the Pubs database using the connection string passed to SqlDataAdapter’s constructor.
It performs a query on the Pubs database using the query string passed to SqlDataAdapter’s constructor.
It creates a DataTable named “Titles” in the DataSet.
It initializes the DataTable with a schema that matches that of the “Titles” table in the database.
It retrieves all the records produced by the query and writes them to the DataTable.
It closes the connection to the database.

Under the hood, the DataAdapter uses a DataReader to fetch the query results, but that’s merely an implementation detail. What matters is that one simple call fills a DataSet with data from a database. Furthermore, you control what goes into the DataSet because you provide the command that makes up the query. That query can be as complex as you like—whatever it takes to initialize the DataSet the way you want it initialized. If you want, you can call Fill multiple times with the same DataSet but with different table names to populate the DataSet with several DataTables. And you can use DataSet.Clear to clear out old DataTables if you’ve finished with them but want to reuse the DataSet.

The preceding code would change only slightly if reconfigured to use the OLE DB .NET provider:

OleDbDataAdapter?adapter?=
????new?OleDbDataAdapter?("select?*?from?titles",
??? "provider=sqloledb;server=localhost;database=pubs;uid=sa;pwd=");
DataSet?ds?=?new?DataSet?();
adapter.Fill?(ds);

Once the DataSet is initialized, it doesn’t matter which provider you used to initialize it because DataSets are provider agnostic.

DataTable and Friends

The DataSet class gets all the press, but DataTable and friends are the unsung heroes of set-based data access. A DataSet is first and foremost a collection of DataTables, which are often created by DataAdapters. To examine the data returned by a DataAdapter, you browse the DataTable that the DataAdapter created. If you want to edit the data, once more the DataTable will be the focus of your actions.

The following code iterates through the Tables collection of a DataSet named ds and outputs the name of each table that it encounters:

foreach?(DataTable?table?in?ds.Tables)
????Console.WriteLine?(table.TableName);

Individual DataTables in a DataSet can be referenced by name or 0-based index. The next example retrieves the first DataTable from a DataSet and writes the value of the first column in every row to a console window:

DataTable?table?=?ds.Tables[0];
foreach?(DataRow?row?in?table.Rows)
????Console.WriteLine?(row[0]);

Columns, too, can be referenced by name as well as numeric index. Thus, if the name assigned to the first column in the DataTable is “Account_ID,” the preceding example could be rewritten this way:

DataTable?table?=?ds.Tables[0];
foreach?(DataRow?row?in?table.Rows)
????Console.WriteLine?(row["account_id"]);

Enumerating a DataTable’s columns is equally simple:

DataTable?table?=?ds.Tables[0];
foreach?(DataColumn?col?in?table.Columns)
????Console.WriteLine?("Name={0},?Type={1}",
????????col.ColumnName,?col.DataType);

A quick and easy way to display a DataTable in a Web form is to bind it to a DataGrid control, as demonstrated in Chapter 9.

Inserting Records into a DataTable

One way to insert records into a database is to call ExecuteNonQuery on a Command object wrapping an INSERT command, as demonstrated in the first half of this chapter. You can also insert records into databases using DataSets. The general approach is to perform a query with DataAdapter.Fill, add records to the resulting DataTable, and write the changes to the database. You already know how to call Fill. Let’s talk about adding records to a DataTable.

The following example adds a record to a DataTable created from the Pubs database’s “Titles” table:

SqlDataAdapter?adapter?=?new?SqlDataAdapter?("select?*?from?titles",
??? "server=localhost;database=pubs;uid=sa;pwd=");

DataSet?ds?=?new?DataSet?();
adapter.Fill?(ds, "Titles");

//?Create?a?new?DataRow
DataTable?table?=?ds.Tables["Titles"];
DataRow?row?=?table.NewRow?();

//?Initialize?the?DataRow
row["title_id"]?= "JP1001";
row["title"]?= "Programming?Microsoft?.NET";
row["price"]?= "59.99";
row["ytd_sales"]?= "1000000";
row["type"]?= "business";
row["pubdate"]?= "May?2002";

//?Add?the?DataRow?to?the?DataTable
table.Rows.Add?(row);

You begin by creating a new DataRow representing the record to be added. Rather than simply new up a DataRow, you call the DataTable’s NewRow method so that the DataTable can initialize the DataRow with a schema that matches its own. You then assign values to the DataRow’s columns. This example takes advantage of the facts that the columns are addressable by the names of the fields retrieved from the database and that nullable columns don’t have to be initialized at all. (The Pubs database’s “Titles” table contains 10 columns. This example initializes just six of them; the others are set to null.) Once the DataRow is initialized, you add it to the DataTable by calling Add on the table’s Rows collection. Repeat this process to add as many records as you like.

Incidentally, DataRow is happy to convert string values into the actual data types associated with each column, but you can make your code slightly more efficient by using strong types yourself, as shown here:

row["title_id"]?= "JP1001";
row["title"]?= "Programming?Microsoft?.NET";
row["price"]?=?59.99m;
row["ytd_sales"]?=?1000000;
row["type"]?= "business";
row["pubdate"]?=?new?DateTime?(2002,?5,?1);

In the revised code, the values assigned to the “Price,” “Ytd_sales,” and “Pubdate” fields are a decimal, an integer, and a DateTime, respectively.

Selecting Records in a DataTable

Inserting records into a DataTable is easy enough. So are updating and deleting. But before you go updating or deleting, you have to find the records targeted for update or deletion. One way to do this is to iterate through the DataRows searching for the record or records you want. A smarter way to find the records is to use the DataTable.Select method.

As its name implies, Select selects one or more records in a DataSet. It returns an array of DataRow objects representing the DataRows selected. Applied to the DataTable in the previous section, the following statement returns an array containing a single DataRow—the one added to the table with Add:

DataRow[]?rows?=?table.Select?("title_id?=?'JP1001'");

This statement selects (returns) all DataRows whose “Price” field contains a value less than 10:

DataRow[]?rows?=?table.Select?("price?<?10.00");

And this one selects records whose “Pubdate” fields hold dates on or after January 1, 2000:

DataRow[]?rows?=?table.Select?("pubdate?>=?'#1/1/2000#'");

If you want to know how many rows Select returned, read the array’s Length property.

How complex can the filter expressions passed to DataTable.Select be? The syntax is documented in the .NET Framework SDK, but here are some of the highlights. The following comparison operators are supported: <, <=, =, >=, >, and <>. You can also use IN and LIKE, as in the following:

//?Return?all?rows?where "State" equals?CA,?TN,?or?WA
DataRow[]?rows?=?table.Select?("state?in?('ca',?'tn',?'wa')");

//?Return?all?rows?where "State" begins?with?CA
DataRow[]?rows?=?table.Select?("state?like?'ca*'");

There’s also a handful of functions you can use in Select clauses. The next example uses the ISNULL function to select all the records in the DataTable with null “State” values:

DataRow[]?rows?=?table.Select?("isnull?(state,?0)?=?0");

AND, OR, and NOT work, too:

DataRow[]?rows?=?table.Select?("state?=?'tn'?and?zip?like?'37*'");

You can create complex Boolean expressions by grouping clauses with parentheses.

Updating Records in a DataTable

Once you’ve identified a record that you want to update in a DataTable, performing the update is easy: just replace the values of one or more of the record’s fields with values of your own. The following example selects all the records in the Pubs database’s “Title” table with year-to-date sales of 10,000 and adds $10.00 to their price:

SqlDataAdapter?adapter?=?new?SqlDataAdapter?("select?*?from?titles",
??? "server=localhost;database=pubs;uid=sa;pwd=");

DataSet?ds?=?new?DataSet?();
adapter.Fill?(ds, "Titles");

DataRow[]?rows?=?table.Select?("ytd_sales?>?10000");
foreach?(DataRow?row?in?rows)
????row["price"]?=?(decimal)?row["price"]?+?10.00m;

Deleting Records from a DataTable

Deleting records from a DataTable is a simple matter of calling Delete on each DataRow that you want to remove. The next example deletes all rows whose year-to-date sales are less than 10,000 or equal to null:

SqlDataAdapter?adapter?=?new?SqlDataAdapter?("select?*?from?titles",
??? "server=localhost;database=pubs;uid=sa;pwd=");

DataSet?ds?=?new?DataSet?();
adapter.Fill?(ds, "Titles");

DataRow[]?rows?=
????table.Select?("ytd_sales?<?10000?OR?isnull?(ytd_sales,?0)?=?0");
foreach?(DataRow?row?in?rows)
????row.Delete?();

Propagating Changes Back to a Database: DataAdapter.Update

Inserts, updates, and deletes performed on a DataTable do not automatically propagate back to the database. If you want changes written back to the database, you have to take matters into your own hands. Fortunately, the DataAdapter.Fill method makes your job incredibly simple.

Here’s a code sample demonstrating how to make changes to a database using a DataSet and a DataAdapter:

SqlDataAdapter?adapter?=
????new?SqlDataAdapter?("select?*?from?titles",
??? "server=localhost;database=pubs;uid=sa;pwd=");

SqlCommandBuilder?builder?=?new?SqlCommandBuilder?(adapter);
DataSet?ds?=?new?DataSet?();
adapter.Fill?(ds, "Titles");

//?Insert?a?record
DataTable?table?=?ds.Tables["Titles"];
DataRow?row?=?table.NewRow?();
row["title_id"]?= "JP1001";
row["title"]?= "Programming?Microsoft?.NET";
row["price"]?=?59.99m;
row["ytd_sales"]?=?1000000;
row["type"]?= "business";
row["pubdate"]?=?new?DateTime?(2002,?5,?1);
table.Rows.Add?(row);

//?Update?the?database
adapter.Update?(table);

The DataAdapter’s Update method examines each row in the table passed to it and writes rows that were inserted, updated, or deleted since the last update (or since the last time the table’s AcceptChanges method was called) to the database. If a DataSet contains multiple DataTables that underwent modification, pass the entire DataSet to Update and all the changes will be propagated at once.

Many samples demonstrating how to use DataAdapter.Update call a method named GetChanges to create a DataSet or DataTable containing only rows that were inserted, modified, or deleted. They then pass the “delta” DataSet or DataTable to Update, as shown here:

//?Update?the?database
DataTable?delta?=?table.GetChanges?();
adapter.Update?(delta);

This approach works, but it isn’t necessary. Update is smart enough to ignore rows that weren’t changed in a DataTable containing a mixture of modified and unmodified rows. Where GetChanges becomes interesting is when you want to control the order in which changes are propagated back to the database. If you want to make sure DELETEs are performed before INSERTs to avoid duplicate key errors, for example, you can do this:

//?Update?the?database
DataTable?deletes?=?table.GetChanges?(DataRowState.Deleted);
adapter.Update?(deletes);
DataTable?inserts?=?table.GetChanges?(DataRowState.Added);
adapter.Update?(inserts);

Another use for GetChanges is to minimize the amount of data passed between machines when the update won’t be performed locally. Passing a DataSet or DataTable containing just the rows that changed is more efficient than passing a DataSet or DataTable containing both modified and unmodified rows.

CommandBuilder Classes

Now ask yourself a question. How does Update physically update the database? The short answer is that it executes SQL INSERT commands for rows added to a DataTable, UPDATE commands for rows that were modified, and DELETE commands for rows that were deleted. But where do the INSERT, UPDATE, and DELETE commands come from? Are they manufactured out of thin air?

Close. They’re manufactured by a SqlCommandBuilder object. Note the following statement from the previous code sample:

SqlCommandBuilder?builder?=?new?SqlCommandBuilder?(adapter);

If you omit this statement, Update throws an exception. The reason? A DataAdapter has four very important properties that control how it communicates with a database:

SelectCommand, which encapsulates the command the DataAdapter uses to perform queries
InsertCommand, which encapsulates the command the DataAdapter uses to insert rows
UpdateCommand, which encapsulates the command the DataAdapter uses to update rows
DeleteCommand, which encapsulates the command the DataAdapter uses to delete rows

When you create a DataAdapter this way:

SqlDataAdapter?adapter?=
????new?SqlDataAdapter?("select?*?from?titles",
??? "server=localhost;database=pubs;uid=sa;pwd=");

the constructor initializes SelectCommand with a SqlCommand object wrapping the query string, but it leaves InsertCommand, UpdateCommand, and DeleteCommand set to null. When Update is called and it finds these properties still equal to null, it asks the SqlCommandBuilder to provide it with the commands it needs. If there is no SqlCommandBuilder, Update is powerless to update the database and indicates as much by throwing an exception.

SqlCommandBuilder and its OLE DB counterpart, OleDbCommandBuilder, generate INSERT, UPDATE, and DELETE commands on the fly based on information inferred from the DataAdapter’s SelectCommand. The commands that they generate are simple dynamic SQL commands. You can see these commands for yourself by calling the command builder’s GetInsertCommand, GetUpdateCommand, and GetDeleteCommand methods and inspecting the command text found inside the returned command objects:

string?insert?=?builder.GetInsertCommand?().CommandText;
string?update?=?builder.GetUpdateCommand?().CommandText;
string?delete?=?builder.GetDeleteCommand?().CommandText;

In the vast majority of cases, a builder’s auto-generated commands work just fine. However, if you do a lot of database updating with DataAdapters, you might achieve a performance boost by coding your own INSERT, UPDATE, and DELETE commands in stored procedures, wrapping the stored procedures in SqlCommand or OleDbCommand objects, and assigning those objects to the adapter’s InsertCommand, UpdateCommand, and DeleteCommand properties. The DataAdapter will respond by using your stored procedures to do its updating.

The DataView Class

Database programmers are familiar with the concept of views. A view is a logical table containing rows from one or more physical tables. Views can be used to sort and filter data and also to create fictitious tables that combine data from other tables.

ADO.NET also supports the concept of views. ADO.NET views are represented by instances of System.Data.DataView. They support sorting and filtering and are often used to customize the content displayed in controls through data binding. The following ASPX file displays the contents of the Pubs database’s “Titles” table in a DataGrid and sorts the output on the “Title” column:

<%@?Import?Namespace="System.Data" %>
<%@?Import?Namespace="System.Data.SqlClient" %>

<html>
??<body>
????<form?runat="server">
??????<asp:DataGrid?ID="MyDataGrid" RunAt="server" />
????</form>
??</body>
</html>

<script?language="C#" runat="server">
??void?Page_Load?(object?sender,?EventArgs?e)
??{
??????SqlDataAdapter?adapter?=
??????????new?SqlDataAdapter?("select?*?from?titles",
????????? "server=localhost;database=pubs;uid=sa;pwd=");

??????DataSet?ds?=?new?DataSet?();
??????adapter.Fill?(ds, "Titles");

??????DataView?view?=?new?DataView?(ds.Tables["Titles"]);
??????view.Sort?= "title?ASC";
??????MyDataGrid.DataSource?=?view;
??????MyDataGrid.DataBind?();
??}
</script>

The view’s Sort property contains the expression that defines the sort. You can also use a view’s RowFilter property to filter data from the view. For example, adding the following statement displays only those records whose “Price” field contains a value greater than or equal to 10:

view.RowFilter?= "price?>=?10";

DataView also contains a property named RowStateFilter that you can use to filter content based on row state—that is, based on which rows have been added, modified, and deleted from the DataTable.

It might seem odd that a chapter on ADO.NET doesn’t have a large section describing data binding. However, data binding to ASP.NET server controls is discussed at length in Chapter 6. Data binding can be used with Windows Forms controls too. Combined with views, data binding is a powerful concept that vastly simplifies the task of querying databases and displaying the results in GUI applications.