Returning Many Results with Paged Searches

In large directories, it is easy to specify a search that might match many items. However, Active Directory and ADAM impose a policy on LDAP queries that limits the maximum number of results that may be returned in a single search page. This allows the directory to conserve machine resources and helps prevent LDAP queries from becoming (accidentally or otherwise) denial-of-service attacks against the server.

This query policy is defined by the MaxPageSize setting. By default, it is set to 1,000 objects for both Active Directory and ADAM.

Active Directory and ADAM directories frequently contain thousands if not millions of objects, though. Given that many important scenarios would not be possible if we could not retrieve more than 1,000 objects, we need a way to get around this limitation.

Active Directory and ADAM solve this dilemma by offering a search technique called paging. Paging splits the entire result set of a query into smaller subsets called, appropriately enough, pages. The client continues requesting search pages from the server until all results within the query scope have been found.

Paging using the LDAP API actually requires specific code to retrieve individual pages and is much more complex to implement than a nonpaged search. However, the ADSI IDirectorySearch interface abstracts away all of this complexity and makes paging seamless to the client. DirectorySearcher piggybacks on top of this, making paged searches seamless to .NET clients as well.
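To give a sense of the extra work that DirectorySearcher hides, here is a minimal sketch of a manual paging loop. It assumes the System.DirectoryServices.Protocols namespace rather than the raw LDAP C API, and the server name dc01.domain.com and the base DN are hypothetical placeholders. It is an illustration of the general pattern, not the code that ADSI itself runs.

```csharp
using System;
using System.DirectoryServices.Protocols;

class ManualPagingSketch
{
    static void Main()
    {
        // Hypothetical server name; uses the current Windows credentials.
        using (LdapConnection connection = new LdapConnection("dc01.domain.com"))
        {
            connection.SessionOptions.ProtocolVersion = 3;

            SearchRequest request = new SearchRequest(
                "dc=domain,dc=com", // hypothetical base DN
                "(&(objectCategory=person)(objectClass=user))",
                SearchScope.Subtree,
                "distinguishedName"
                );

            // Ask the server for at most 1,000 entries per page.
            PageResultRequestControl pageControl =
                new PageResultRequestControl(1000);
            request.Controls.Add(pageControl);

            int total = 0;
            while (true)
            {
                SearchResponse response =
                    (SearchResponse)connection.SendRequest(request);
                total += response.Entries.Count;

                // Find the paging control the server sent back.
                PageResultResponseControl pageResponse = null;
                foreach (DirectoryControl control in response.Controls)
                {
                    pageResponse = control as PageResultResponseControl;
                    if (pageResponse != null) break;
                }

                // An empty cookie means there are no more pages.
                if (pageResponse == null || pageResponse.Cookie.Length == 0)
                    break;

                // Hand the cookie back to request the next page.
                pageControl.Cookie = pageResponse.Cookie;
            }

            Console.WriteLine("Retrieved {0} entries", total);
        }
    }
}
```

The client must track the server's cookie and resend the request for every page. This is the bookkeeping that IDirectorySearch and DirectorySearcher take care of for us.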

Warning: Do Not Change the MaxPageSize Policy

Active Directory and ADAM offer the ability to change the MaxPageSize policy and they will even accept enormous values. Do not do this! MaxPageSize is there to help protect our servers. Paging is so easy in .NET and all other ADSI clients that there is no compelling reason to change this anyway.

Also, what would we change the policy to? Our goal is probably to return all matches to a query, but we must choose an actual number. What if we choose a number based on the maximum number of objects in the directory and the directory adds more objects later? We end up in an escalating cycle that never ends.

Enabling Paging

Paging is controlled by the PageSize property on the DirectorySearcher. Setting PageSize to a value greater than zero will change the behavior of the search by enabling paging.

The rest of our code changes very little. We enumerate the SearchResultCollection as before, but this time every match is returned rather than only the first MaxPageSize results. Listing 4.13 shows an example of how to use paging in SDS.

Listing 4.13. Enabling Paging Support

using System;
using System.DirectoryServices;

string adsPath = "LDAP://dc=domain,dc=com";

//Explicitly create our SearchRoot
DirectoryEntry searchRoot = new DirectoryEntry(
    adsPath,
    null,
    null,
    AuthenticationTypes.Secure
    );

using (searchRoot) //we are responsible for Disposing
{
    DirectorySearcher ds = new DirectorySearcher(
        searchRoot,
        "(&(objectCategory=person)(objectClass=user))" //any user
        );

    //enable paging support
    ds.PageSize = 1000;

    //wait a maximum of 2 seconds per page
    ds.ServerPageTimeLimit = TimeSpan.FromSeconds(2);

    using (SearchResultCollection src = ds.FindAll())
    {
        Console.WriteLine("Returning {0}", src.Count);

        foreach (SearchResult sr in src)
        {
            Console.WriteLine(sr.Path);
        }
    }
}

Choosing an Appropriate Page Size

Listing 4.13 shows the page size set to 1,000. In general, it is a good idea to set this value to the maximum page size for simple searches. By setting it to the maximum value, we are minimizing the network roundtrips necessary to retrieve each page, which tends to be the more expensive operation for simple searches.

While it is possible to specify a PageSize greater than the system's MaxPageSize, the server will ignore it and use the MaxPageSize instead. No exception will be generated in this case.

In some cases, we may need to specify a smaller page size to avoid timeouts or overtaxing the server. Some queries are especially expensive, so limiting the number of results in a single page can help avoid this.

Using the ServerPageTimeLimit

Besides capping the number of results per page, we can also limit the time that the server spends retrieving any one page. As Listing 4.13 also demonstrates, when used in conjunction with the PageSize property, the ServerPageTimeLimit specifies a TimeSpan that represents the maximum time that a server will spend retrieving a page of results before returning what it has. We can use this with more complex and expensive queries so that the server stays free to process other requests.

As an example, suppose a search matched 20,000 objects. We could set our PageSize to the MaxPageSize value of 1,000 and set our ServerPageTimeLimit to 10 seconds. Each page would then contain either 1,000 results or as many as the server could retrieve in 10 seconds, whichever comes first, so the full result set would take at least 20 pages. Each subsequent page follows the same logic until the entire result set is returned. All of this happens seamlessly to the client, so it may not be noticeable.

Finally, using the ServerPageTimeLimit without also setting the PageSize has no effect, as paging support must be enabled by setting a nonzero PageSize before this setting takes effect.

Caching Result Sets

We find, on the DirectorySearcher class, an interesting property called CacheResults. If we read relevant ADSI documentation, we gather that manipulating this property will set an underlying ADSI feature that determines whether the search results are cached locally at the client. It turns out that in .NET, this property does very little for us. Internally, since the SearchResultCollection holds every SearchResult in an ArrayList, we never hit this unmanaged ADSI client cache when enumerating our collection repeatedly. As such, it is safe to turn this off and reduce client-side memory usage. We will not notice the reduced memory usage by viewing our .NET program's memory profile. Instead, turning off the ADSI caching feature will reduce our unmanaged memory requirements, so keep this in mind.
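As a minimal sketch of the setting described above, the following turns off the ADSI client cache for a paged search and enumerates the results once. The LDAP path is a hypothetical placeholder, and the filter is only an example.

```csharp
using System;
using System.DirectoryServices;

class CacheResultsSketch
{
    static void Main()
    {
        // Hypothetical path; substitute a real naming context.
        using (DirectoryEntry searchRoot =
            new DirectoryEntry("LDAP://dc=domain,dc=com"))
        using (DirectorySearcher ds = new DirectorySearcher(
            searchRoot,
            "(&(objectCategory=person)(objectClass=user))"))
        {
            //enable paging support
            ds.PageSize = 1000;

            //skip the unmanaged ADSI client-side cache
            ds.CacheResults = false;

            int count = 0;
            using (SearchResultCollection src = ds.FindAll())
            {
                foreach (SearchResult sr in src)
                {
                    count++;
                }
            }

            Console.WriteLine("Retrieved {0} results", count);
        }
    }
}
```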

Sorting Search Results
