
5.3. Comparing Strings

The most efficient way to determine if two string variables are equal is to see if they refer to the same memory address. We did this earlier using the ReferenceEquals method. If two variables do not share the same memory address, it is necessary to perform a character-by-character comparison of the respective values to determine their equality. This takes longer than comparing addresses, but is often unavoidable.

.NET attempts to optimize the process by providing the String.Equals method that performs both reference and value comparisons automatically. We can describe its operation in the following pseudo-code:


If string1 and string2 reference the same memory location
   Then strings must be equal
Else
   Compare strings character by character to determine equality
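The pseudo-code can be sketched in C# roughly as follows. This is only an illustration of the idea, not the actual String.Equals source; the class EqualsSketch and the helper ValueEquals are hypothetical names introduced for this example.

```csharp
using System;

public static class EqualsSketch
{
    // Hypothetical helper that mirrors the pseudo-code above.
    public static bool ValueEquals(string a, string b)
    {
        // Same memory location (or both null): the strings must be equal.
        if (object.ReferenceEquals(a, b)) return true;

        // Exactly one is null, or the lengths differ: they cannot be equal.
        if (a == null || b == null || a.Length != b.Length) return false;

        // Otherwise compare character by character.
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i]) return false;
        }
        return true;
    }

    public static void Main()
    {
        string poem1 = "Kubla Khan";
        // Build a second object that holds the same characters.
        string poem2 = new string(poem1.ToCharArray());

        Console.WriteLine(object.ReferenceEquals(poem1, poem2)); // False
        Console.WriteLine(ValueEquals(poem1, poem2));            // True
        Console.WriteLine(ValueEquals(poem1, "kubla khan"));     // False
    }
}
```

The reference check comes first because it is cheap; the character loop runs only when the two variables point at different objects of the same length.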


The following code segment demonstrates the static and instance forms of the Equals method:


string poem1 = "Kubla Khan";
string poem2 = "Kubla Khan";
string poem3 = String.Copy(poem2);   // same value, separate object
string poem4 = "kubla khan";

Console.WriteLine(String.Equals(poem1, poem2));  // True (static form)
Console.WriteLine(poem1.Equals(poem3));          // True (instance form)
Console.WriteLine(poem1 == poem3);               // True; equivalent to Equals
Console.WriteLine(poem1 == poem4);               // False; case differs


Note that the == operator, which calls the Equals method underneath, is a more convenient way of expressing the comparison.

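Although the Equals method satisfies most comparison needs, the forms shown here always perform a case-sensitive, culture-independent comparison, and they report only whether two strings are equal, not how they are ordered. (Beginning with .NET 2.0, Equals also has overloads that accept a StringComparison value.) When case, culture, or relative order must be taken into account, the String class provides the Compare method.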

Using String.Compare

String.Compare is a flexible comparison method that is used when culture or case must be taken into account. Its many overloads accept culture and case-sensitivity parameters, and also support substring comparisons.

Syntax:


int Compare (string str1, string str2)
int Compare (string str1, string str2, bool IgnoreCase)
int Compare (string str1, string str2, bool IgnoreCase,
             CultureInfo ci)
int Compare (string str1, int index1, string str2, int index2,
             int len)


Parameters:

str1 and str2

Specify strings to be compared.

IgnoreCase

Set true to make comparison case-insensitive (default is false).

index1 and index2

Starting position in str1 and str2.

len

Maximum number of characters to compare.

ci

A CultureInfo object indicating the culture to be used.


Compare returns an integer value that indicates the results of the comparison. If the two strings are equal, a value of 0 is returned; if the first string is less than the second, a value less than zero is returned; if the first string is greater than the second, a value greater than zero is returned.
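As a minimal sketch of how the return value is typically interpreted, the example below tests only the sign of the result and also exercises the substring overload. The class CompareSketch and the helper Describe are hypothetical names introduced for illustration.

```csharp
using System;

public static class CompareSketch
{
    // Hypothetical helper: turns Compare's integer result into words.
    // Only the sign of the result is examined, never its magnitude.
    public static string Describe(int result)
    {
        if (result < 0) return "less than";
        if (result > 0) return "greater than";
        return "equal to";
    }

    public static void Main()
    {
        string a = "red apple";
        string b = "big apricot";

        // Compare 2 characters starting at index 4 of each string: "ap" vs "ap".
        int two = string.Compare(a, 4, b, 4, 2);
        Console.WriteLine("\"ap\" is " + Describe(two) + " \"ap\"");     // equal to

        // Compare 3 characters starting at index 4: "app" vs "apr".
        int three = string.Compare(a, 4, b, 4, 3);
        Console.WriteLine("\"app\" is " + Describe(three) + " \"apr\""); // less than
    }
}
```

Testing the sign rather than looking for the specific values -1 and 1 keeps the code correct even when a comparison method returns some other negative or positive number.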

The following segment shows how to use Compare to make case-insensitive and case-sensitive comparisons:


int result;
string stringUpper = "AUTUMN";
string stringLower = "autumn";
// (1) Lexical comparison: "A" is greater than "a"
result = string.Compare(stringUpper, stringLower);        // 1
// (2) IgnoreCase set to false
result = string.Compare(stringUpper, stringLower, false); // 1
// (3) Perform case-insensitive comparison
result = string.Compare(stringUpper, stringLower, true);  // 0


Perhaps even more important than case is the potential effect of culture information on a comparison operation. .NET contains a set of comparison rules for each culture that it supports. When the Compare method is executed, the CLR checks the culture associated with the current thread and applies that culture's rules. As a result, two strings may compare differently on a computer set to a US culture than on one set to a Japanese culture. There are cases where it may be important to override the current culture to ensure that the program behaves the same for all users. For example, it may be crucial that a sort operation order items exactly the same no matter where the application is run.

By default, the Compare method uses culture information based on the Thread.CurrentThread.CurrentCulture property. To override the default, supply a CultureInfo object as a parameter to the method. This statement shows how to create an object that represents the German language as used in Germany:


CultureInfo ci = new CultureInfo("de-DE");  // German culture


To explicitly specify the default culture or a culture-independent comparison, the CultureInfo class has two static properties that can be passed as parameters: CurrentCulture, which tells a method to use the culture of the current thread, and InvariantCulture, which tells a method to use culture-neutral rules that are the same on every system.
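The relationship between the thread's culture and the two-argument Compare can be sketched as follows. The class DefaultCultureSketch and the helper CompareUnder are hypothetical names; the sketch temporarily replaces the current thread's culture and then restores it.

```csharp
using System;
using System.Globalization;
using System.Threading;

public static class DefaultCultureSketch
{
    // Hypothetical helper: runs the two-argument Compare while the current
    // thread's culture is temporarily replaced by the given one.
    public static int CompareUnder(CultureInfo culture, string a, string b)
    {
        CultureInfo saved = Thread.CurrentThread.CurrentCulture;
        Thread.CurrentThread.CurrentCulture = culture;
        try
        {
            return string.Compare(a, b);   // picks up the thread's culture
        }
        finally
        {
            Thread.CurrentThread.CurrentCulture = saved;   // always restore
        }
    }

    public static void Main()
    {
        // Implicit: the culture comes from the current thread.
        int viaThread = CompareUnder(CultureInfo.InvariantCulture,
                                     "circle", "chair");
        // Explicit: the culture is passed as a parameter.
        int viaParameter = string.Compare("circle", "chair",
                                          false, CultureInfo.InvariantCulture);

        // Both calls apply the same rules, so the signs agree.
        Console.WriteLine(Math.Sign(viaThread) == Math.Sign(viaParameter)); // True
    }
}
```

Passing the CultureInfo directly is usually the simpler choice, because it leaves the thread's settings untouched.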

Let's look at a concrete example of how culture differences affect the results of a Compare() operation.


using System.Globalization;   // Required for CultureInfo

// Perform case-insensitive comparisons (IgnoreCase = true)
string s1 = "circle";
string s2 = "chair";
int result;
// Use the culture of the current thread (US English here)
result = string.Compare(s1, s2,
         true, CultureInfo.CurrentCulture);        //  1
// Use the invariant culture
result = string.Compare(s1, s2,
         true, CultureInfo.InvariantCulture);      //  1
// Use the Czech culture
result = string.Compare(s1, s2,
         true, new CultureInfo("cs-CZ"));          //  -1


The string values "circle" and "chair" are compared using the current culture (US English in this case), the invariant culture, and the Czech culture. The first two comparisons return a value indicating that "circle" > "chair", which is what you expect. However, the result using the Czech culture is the opposite of that obtained from the other comparisons. This is because one of the rules of the Czech language specifies that "ch" is to be treated as a single letter that sorts after "h", and therefore after any word beginning with "ci".

Core Recommendation

When writing an application that takes culture into account, it is good practice to include an explicit CultureInfo parameter in those methods that accept such a parameter. This provides a measure of self-documentation that clarifies whether the specific method is subject to culture variation.
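One way to follow this recommendation when sorting is sketched below. It relies on StringComparer.Create, which builds a comparer from a CultureInfo and an ignore-case flag; the class ExplicitCultureSort and the helper SortIgnoringCase are hypothetical names introduced for the example.

```csharp
using System;
using System.Globalization;

public static class ExplicitCultureSort
{
    // Hypothetical helper: returns a sorted copy of the input, using the
    // rules of the culture that the caller names explicitly.
    public static string[] SortIgnoringCase(string[] items, CultureInfo culture)
    {
        string[] copy = (string[])items.Clone();
        Array.Sort(copy, StringComparer.Create(culture, true)); // true = ignore case
        return copy;
    }

    public static void Main()
    {
        string[] fruit = { "pear", "apple", "Banana" };

        // The culture is visible at the call site, so a reader can tell at a
        // glance that this sort does not depend on the user's settings.
        string[] sorted = SortIgnoringCase(fruit, CultureInfo.InvariantCulture);

        Console.WriteLine(string.Join(", ", sorted)); // apple, Banana, pear
    }
}
```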


Using String.CompareOrdinal

To perform a comparison that is based strictly on the ordinal value of characters, use String.CompareOrdinal. Its simple algorithm compares the numeric Unicode values of the corresponding characters in two strings and returns a value less than zero if the first string is less than the second; a value of zero if the strings are equal; and a value greater than zero if the first string is greater than the second. This code shows the difference between it and the Compare method:


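string stringUpper = "AUTUMN";
string stringLower = "autumn";
int result;
// Culture-based (lexical) comparison
result = string.Compare(stringUpper, stringLower,
         false, CultureInfo.InvariantCulture);             // 1
// Ordinal comparison of Unicode values: 'A' (65) - 'a' (97)
result = string.CompareOrdinal(stringUpper, stringLower);  // -32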


Compare performs a lexical comparison that regards the uppercase string as greater than the lowercase one. CompareOrdinal examines the underlying Unicode values. Because A (U+0041) is less than a (U+0061), the first string is less than the second.
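The ordinal algorithm can be sketched as a simple loop over character codes. This is an illustration of the idea rather than the library's actual implementation; the class OrdinalSketch and the helper OrdinalCompare are hypothetical names.

```csharp
using System;

public static class OrdinalSketch
{
    // Hypothetical helper: compares two non-null strings by the numeric
    // Unicode value of each character, ignoring culture entirely.
    public static int OrdinalCompare(string a, string b)
    {
        int n = Math.Min(a.Length, b.Length);
        for (int i = 0; i < n; i++)
        {
            // The first differing pair of characters decides the result.
            if (a[i] != b[i]) return a[i] - b[i];
        }
        // All shared positions match: the shorter string sorts first.
        return a.Length - b.Length;
    }

    public static void Main()
    {
        // 'A' is 65 (U+0041) and 'a' is 97 (U+0061): 65 - 97 = -32.
        Console.WriteLine(OrdinalCompare("AUTUMN", "autumn"));   // -32

        // The sign agrees with the library method.
        Console.WriteLine(
            Math.Sign(OrdinalCompare("AUTUMN", "autumn")) ==
            Math.Sign(string.CompareOrdinal("AUTUMN", "autumn"))); // True
    }
}
```

As with Compare, callers should rely only on the sign of the result.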
