Chapter 11. Strings

This chapter presents the string types of the C++ standard library. It describes the basic template class basic_string<> and its standard specializations string and wstring.

Strings can be a source of confusion. This is because it is not clear what is meant by the term string. Does it mean an ordinary character array of type char* (with or without the const qualifier), or an instance of class string, or is it a general name for objects that are kind of strings? In this chapter I use the term string for objects of one of the string types in the C++ standard library (whether it is string or wstring). For "ordinary strings" of type char* or const char*, I use the term C-string.

Note that the type of string literals (such as "hello") was changed into const char*. However, to provide backward compatibility there is an implicit but deprecated conversion to char* for them.

11.1 Motivation

The string classes of the C++ standard library enable you to use strings as normal types that cause no problems for the user. Thus, you can copy, assign, and compare strings as fundamental types without worrying or bothering about whether there is enough memory or for how long the internal memory is valid. You simply use operators, such as assignment by using =, comparison by using ==, and concatenation by using +. In short, the string types of the C++ standard library are designed in such a way that they behave as if they were a kind of fundamental data type that does not cause any trouble (at least in principle). Modern data processing is mostly string processing, so this is an important step for programmers coming from C, Fortran, or similar languages in which strings are a source of trouble.

The following sections offer two examples that demonstrate the abilities and uses of the string classes. They aren't very useful because they are written only for demonstration purposes.

11.1.1 A First Example: Extracting a Temporary File Name

The first example program uses command-line arguments to generate temporary file names. For example, if you start the program as

   string1 prog.dat mydir hello. oops.tmp end.dat

the output is

   prog.dat => prog.tmp
   mydir => mydir.tmp
   hello. => hello.tmp
   oops.tmp => oops.xxx
   end.dat => end.tmp

Usually, the generated file name has the extension .tmp. If the name already has the extension .tmp, however, the temporary file name gets the extension .xxx instead.

The program is written in the following way:

   //string/string1.cpp

   #include <iostream>
   #include <string>
   using namespace std;

   int main (int argc, char* argv[])
   {

       string filename, basename, extname, tmpname;
       const string suffix("tmp");

       /*for each command-line argument
        *(which is an ordinary C-string)
        */
       for (int i=1; i<argc; ++i) {
           //process actual argument as file name
           filename = argv[i];

           //search period in file name
           string::size_type idx = filename.find('.');
           if (idx == string::npos) {
               //file name does not contain any period
               tmpname = filename + '.' + suffix;
           }
           else {
               /* split file name into base name and extension
                * - base name contains all characters before the period
                * - extension contains all characters after the period
                */
               basename = filename.substr(0, idx);
               extname = filename.substr(idx+1);
               if (extname.empty()) {
                   //contains period but no extension: append tmp
                   tmpname = filename;
                   tmpname += suffix;
               }
               else if (extname == suffix) {
                   //replace extension tmp with xxx
                   tmpname = filename;
                   tmpname.replace (idx+1, extname.size(), "xxx");
               }
               else {
                   //replace any extension with tmp
                   tmpname = filename;
                   tmpname.replace (idx+1, string::npos, suffix);
               }
           }

           //print file name and temporary name
           cout << filename << " => " << tmpname << endl;
       }
   }

At first,

   #include <string>

includes the header file for the C++ standard string classes. As usual, these classes are declared in namespace std.

The declaration

   string filename, basename, extname, tmpname;

creates four string variables. No argument is passed, so for their initialization the default constructor for string is called. The default constructor initializes them as empty strings.

The declaration

   const string suffix("tmp");

creates a constant string suffix that is used in the program as the normal suffix for temporary file names. The string is initialized by an ordinary C-string, so it has the value tmp. Note that C-strings can be combined with objects of class string in almost any situation in which two strings can be combined. In particular, in the entire program every occurrence of suffix could be replaced with "tmp" so that a C-string is used directly.

In each iteration of the for loop, the statement

   filename = argv[i];

assigns a new value to the string variable filename. In this case, the new value is an ordinary C-string. However, it could also be another object of class string or a single character that has type char.

The statement

   string::size_type idx = filename.find('.');

searches for the first occurrence of a period inside the string filename. The find() function is one of several functions that search for something inside strings. You could also search backward, for substrings, only in a part of a string, or for more than one character simultaneously. All these find functions return the index of the first matching position. Yes, the return value is an integer and not an iterator. The usual interface for strings is not based on the concept of the STL. However, some iterator support for strings is provided (see Section 11.2.13). The return type of all find functions is string::size_type, an unsigned integral type that is defined inside the string class.[1] As usual, the index of the first character is the value 0. The index of the last character is the value "numberOfCharacters-1." Note that "numberOfCharacters" is not a valid index. Unlike C-strings, objects of class string have no special character '\0' at the end of the string.

If the search fails, a special value is needed to return the failure. That value is npos, which is also defined by the string class. Thus, the line

   if (idx == string::npos)

checks whether the search for the period failed.

The type and value of npos are a big pitfall for the use of strings. Be very careful that you always use string::size_type and not int or unsigned for the return type when you want to check the return value of a find function. Otherwise, the comparison with string::npos might not work. See Section 11.2.12 for details.
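
The following sketch illustrates the pitfall. The helper names has_period() and npos_survives_unsigned() are hypothetical and are not part of the example program. The second function shows what can go wrong on platforms where unsigned is narrower than string::size_type: the truncated value no longer compares equal to npos.

```cpp
#include <string>

// Hypothetical helper: reports whether name contains a period.
bool has_period(const std::string& name)
{
    // store the result in string::size_type so that it can hold npos exactly
    std::string::size_type idx = name.find('.');
    return idx != std::string::npos;
}

// Hypothetical helper: squeezes npos into an unsigned and compares again.
// The result is true only if unsigned is as wide as string::size_type.
bool npos_survives_unsigned()
{
    unsigned truncated = static_cast<unsigned>(std::string::npos);
    return truncated == std::string::npos;
}
```

Whether the second function yields true depends on the platform, which is exactly why the comparison "might not work."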

If the search for the period fails in this example, the file name has no extension. In this case, the temporary file name is the concatenation of the original file name, the period character, and the previously defined extension for temporary files:

   tmpname = filename + '.' + suffix;

Thus, you can simply use operator + to concatenate two strings. It is also possible to concatenate strings with ordinary C-strings and single characters.

If the period is found, the else part is used. Here, the index of the period is used to split the file name into a base part and the extension. This is done by the substr() member function:

   basename = filename.substr(0, idx);
   extname = filename.substr(idx+1);

The first parameter of the substr() function is the starting index. The optional second argument is the number of characters (not the end index). If the second argument is not used, all remaining characters of the string are returned as a substring.

At all places where an index and a length are used as arguments, strings behave according to the following two rules:

  1. An argument specifying the index must have a valid value. That value must be less than the number of characters of the string (as usual, the index of the first character is 0). In addition, the index of the position after the last character could be used to specify the end.

    In most cases, any use of an index greater than the actual number of characters throws out_of_range. However, all functions that search for a character or a position (all find functions) allow any index. If the index exceeds the number of characters these functions simply return string::npos ("not found").

  2. An argument specifying the number of characters could have any value. If the size is greater than the remaining number of characters, all remaining characters are used. In particular, string::npos always works as a synonym for "all remaining characters."

Thus, the following expression throws an exception if the period is not found:

   filename.substr(filename.find('.'))

But, the following expression does not throw an exception:

   filename.substr(0, filename.find('.'))

If the period is not found, it results in the whole file name.
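
The two rules can be checked with a small sketch. The helper names substr_throws() and base_part() are hypothetical and are introduced here only for illustration.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: returns true if s.substr(idx) throws out_of_range.
bool substr_throws(const std::string& s, std::string::size_type idx)
{
    try {
        (void)s.substr(idx);       // index rule: idx must not exceed s.size()
    }
    catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// Hypothetical helper: returns everything before the first period,
// or the whole name if there is no period.
std::string base_part(const std::string& filename)
{
    // length rule: npos as a length means "all remaining characters"
    return filename.substr(0, filename.find('.'));
}
```

Passing the number of characters as the index is still valid and yields an empty substring. Only a larger index throws.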

Even if the period is found, the extension that is returned by substr() might be empty because there are no more characters after the period. This is checked by

   if (extname.empty())

If this condition yields true, the generated temporary file name becomes the ordinary file name that has the normal extension appended:

   tmpname = filename;
   tmpname += suffix;

Here, operator += is used to append the extension.

The file name might already have the extension for temporary files. To check this, operator == is used to compare two strings:

   if (extname == suffix)

If this comparison yields true the normal extension for temporary files is replaced by the extension xxx:

   tmpname = filename;
   tmpname.replace (idx+1, extname.size(), "xxx");

Here,

   extname.size()

returns the number of characters of the string extname. Instead of size() you could use length(), which does exactly the same thing. So, both size() and length() return the number of characters. In particular, size() has nothing to do with the memory that the string uses.[2]

Next, after all special conditions are considered, normal processing takes place. The program replaces the whole extension by the ordinary extension for temporary file names:

   tmpname = filename;
   tmpname.replace (idx+1, string::npos, suffix);

Here, string::npos is used as a synonym for "all remaining characters." Thus, all remaining characters after the period are replaced with suffix. This replacement would also work if the file name contained a period but no extension. It would just replace "nothing" with suffix.

The statement that writes the original file name and the generated temporary file name shows that you can print the strings by using the usual output operators of streams (surprise, surprise):

   cout << filename << " => " << tmpname << endl;
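
As a compact recap, the logic of the program can be packaged into a single function. This is only a sketch: the name make_tmp_name() is hypothetical, and the function folds the "period but no extension" case into the general replacement, which the text above notes would also work.

```cpp
#include <string>

// Hypothetical helper: returns the temporary file name for filename.
std::string make_tmp_name(const std::string& filename)
{
    const std::string suffix("tmp");

    std::string::size_type idx = filename.find('.');
    if (idx == std::string::npos) {
        // no period: append ".tmp"
        return filename + '.' + suffix;
    }

    std::string tmpname = filename;
    if (filename.substr(idx+1) == suffix) {
        // extension is already tmp: replace it with xxx
        tmpname.replace(idx+1, std::string::npos, "xxx");
    }
    else {
        // any other extension (including an empty one): replace it with tmp
        tmpname.replace(idx+1, std::string::npos, suffix);
    }
    return tmpname;
}
```

For the command-line arguments shown at the beginning of this section, the function yields the same names as the program.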

11.1.2 A Second Example: Extracting Words and Printing Them Backward

The second example extracts single words from standard input and prints the characters of each word in reverse order. The words are separated by the usual whitespaces (newline, space, and tab), and by commas, periods, or semicolons.

   //string/string2.cpp

    #include <iostream>
    #include <string>
    using namespace std;

    int main (int argc, char** argv)
    {

       const string delims(" \t,.;");
       string line;
       //for every line read successfully
       while (getline(cin,line)) {
           string::size_type begIdx, endIdx;

           //search beginning of the first word
           begIdx = line.find_first_not_of(delims);

           //while beginning of a word found
           while (begIdx != string::npos) {
               //search end of the actual word
               endIdx = line.find_first_of (delims, begIdx);
               if (endIdx == string::npos) {
                   //end of word is end of line
                   endIdx = line.length();
               }

               //print characters in reverse order
               for (int i=endIdx-1; i>=static_cast<int>(begIdx); --i) {
                   cout << line [i];
               }
               cout << ' ';

               //search beginning of the next word
               begIdx = line.find_first_not_of (delims, endIdx);
           }
           cout << endl;
       }
    }

In this program, all characters used as word separators are defined in a special string constant:

   const string delims(" \t,.;");

The newline is also used as a delimiter. However, no special processing is necessary for it because the program reads line-by-line.

The outer loop runs as far as a line can be read into the string line:

   string line;
   while (getline(cin,line)) {
        ...
   }

The function getline() is a special function to read input from streams into a string. It reads every character up to the next end-of-line, which by default is the newline character. The line delimiter itself is extracted but not appended. By passing your special line delimiter as an optional third character argument, you can use getline() to read token-by-token, where the tokens are separated by that special delimiter.
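
Reading token-by-token might look as follows. This is a minimal sketch, not part of the example program: the function name split() is hypothetical, and a string stream stands in for cin so that the input is fixed.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits text into tokens separated by delim
// by calling getline() with an explicit delimiter.
std::vector<std::string> split(const std::string& text, char delim)
{
    std::vector<std::string> tokens;
    std::istringstream input(text);
    std::string token;
    // each call extracts the delimiter but does not store it in token
    while (std::getline(input, token, delim)) {
        tokens.push_back(token);
    }
    return tokens;
}
```

Note that two adjacent delimiters produce an empty token, because getline() does not skip delimiters.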

Inside the outer loop, the individual words are searched and printed. The first statement

   begIdx = line.find_first_not_of(delims);

searches for the beginning of the first word. The find_first_not_of() function returns the first index of a character that is not part of the passed string argument. Thus, this function returns the index of the first character that is not one of the separators in delims. As usual for find functions, if no matching index is found, string::npos is returned.

The inner loop iterates as long as the beginning of a word can be found:

   while (begIdx != string::npos) {
        ...
   }

The first statement of the inner loop searches for the end of the actual word:

   endIdx = line.find_first_of (delims, begIdx);

The find_first_of() function searches for the first occurrence of one of the characters passed as the first argument. In this case, an optional second argument is used that specifies where to start the search in the string. Thus, the first delimiter after the beginning of the word is searched.

If no such character is found, the end-of-line is used:

   if (endIdx == string::npos) {
        endIdx = line.length();
    }

Here, length() is used, which does the same thing as size(): It returns the number of characters.
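
The three steps discussed so far (find the beginning of a word, find its end, fall back to the end of the line) can be tried in isolation. The helper first_word() is hypothetical and serves only as an illustration.

```cpp
#include <string>

// Hypothetical helper: returns the first word of line, or an empty
// string if line contains only delimiters.
std::string first_word(const std::string& line, const std::string& delims)
{
    // index of the first character that is not a delimiter
    std::string::size_type begIdx = line.find_first_not_of(delims);
    if (begIdx == std::string::npos) {
        return std::string();
    }

    // index of the first delimiter at or after begIdx
    std::string::size_type endIdx = line.find_first_of(delims, begIdx);
    if (endIdx == std::string::npos) {
        // end of word is end of line
        endIdx = line.length();
    }
    return line.substr(begIdx, endIdx - begIdx);
}
```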

In the next statement, all characters of the word are printed in reverse order:

   for (int i=endIdx-1; i>=static_cast<int>(begIdx); --i) {
        cout << line[i];
    }

Accessing a single character of the string is done with operator [ ]. Note that this operator does not check whether the index of the string is valid. Thus, you have to ensure that the index is valid (as was done here). A safer way to access a character is to use the at() member function. However, such a check costs runtime, so the check is not provided for the usual accessing of characters of a string.

Another nasty problem results from using the index of the string. That is, if you omit the cast of begIdx to int, this program might run in an endless loop or might crash. Similar to the first example program, the problem is that string::size_type is an unsigned integral type. Without the cast, the signed value i is converted automatically into an unsigned value because it is compared with an unsigned type. In this case, the expression

   i>=begIdx

always yields true if the actual word starts at the beginning of the line. This is because begIdx is then zero and any unsigned value is greater than or equal to zero. So, an endless loop results that might get stopped by a crash due to an illegal memory access.

For this reason, I really don't like the concept of string::size_type and string::npos. See Section 11.2.12 for a workaround that is safer (but not perfect).
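
One possible way to avoid the cast altogether is to count down with an unsigned index that always stays one position past the character being processed. The following sketch shows this alternative. It is not necessarily the workaround that Section 11.2.12 describes, and the function name reverse_words() is hypothetical. To keep it testable, it collects the output in a string instead of writing to cout.

```cpp
#include <string>

// Hypothetical helper: returns line with the characters of each word
// reversed. Every word is followed by one space, as in string2.cpp.
std::string reverse_words(const std::string& line)
{
    const std::string delims(" \t,.;");
    std::string result;

    std::string::size_type begIdx = line.find_first_not_of(delims);
    while (begIdx != std::string::npos) {
        std::string::size_type endIdx = line.find_first_of(delims, begIdx);
        if (endIdx == std::string::npos) {
            endIdx = line.length();
        }

        // i is one past the character to append, so it never has to
        // drop below zero and the comparison stays purely unsigned
        for (std::string::size_type i = endIdx; i > begIdx; --i) {
            result += line[i-1];
        }
        result += ' ';

        begIdx = line.find_first_not_of(delims, endIdx);
    }
    return result;
}
```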

The last statement of the inner loop reinitializes begIdx to the beginning of the next word, if any:

   begIdx = line.find_first_not_of (delims, endIdx);

Unlike with the first call of find_first_not_of() in the example, here the end of the previous word is passed as the starting index for the search. If the previous word was the rest of the line, endIdx is the index of the end of the line. This simply means that the search starts from the end of the string, which returns string::npos.

Let's try this "useful and important" program. Here is some possible input:

   pots & pans
   I saw a reed

The output for this input is as follows:

   stop & snap
   I was a deer

I'd appreciate other examples of input for the next edition of this book.

11.2 Description of the String Classes

11.2.1 String Types

Header File

All types and functions for strings are defined in the header file <string>:

   #include <string>

As usual, it defines all identifiers in namespace std.

Template Class basic_string<>

Inside <string>, the type basic_string<> is defined as a basic template class for all string types:

   namespace std {
        template<class charT,
                 class traits = char_traits<charT>,
                 class Allocator = allocator<charT> >
        class basic_string;
    }

It is parameterized by the character type, the traits of the character type, and the memory model.

Types string and wstring

Two specializations of class basic_string<> are provided by the C++ standard library:

  1. string is the predefined specialization of that template for characters of type char:

       namespace std {
           typedef basic_string<char> string;
       }

  2. wstring is the predefined specialization of that template for characters of type wchar_t:

       namespace std {
           typedef basic_string<wchar_t> wstring;
       }

    Thus, you can use strings that use wider character sets, such as Unicode or some Asian character sets (see Chapter 14 for details about internationalization).

In the following sections no distinction is made between these different kinds of strings. The usage and the problems are the same because all string classes have the same interface. So, "string" means any string type, such as string and wstring. The examples in this book usually use type string because the European and Anglo-American environment is the common environment for software development.

11.2.2 Operation Overview

Table 11.1 lists all operations that are provided for strings.

Table 11.1. String Operations

Operation                        Effect
constructors                     Create or copy a string
destructor                       Destroys a string
=, assign()                      Assign a new value
swap()                           Swaps values between two strings
+=, append(), push_back()        Append characters
insert()                         Inserts characters
erase()                          Deletes characters
clear()                          Removes all characters (makes it empty)
resize()                         Changes the number of characters (deletes or appends characters at the end)
replace()                        Replaces characters
+                                Concatenates strings
==, !=, <, <=, >, >=, compare()  Compare strings
size(), length()                 Return the number of characters
max_size()                       Returns the maximum possible number of characters
empty()                          Returns whether the string is empty
capacity()                       Returns the number of characters that can be held without reallocation
[], at()                         Access a character
>>, getline()                    Read the value from a stream
<<                               Writes the value to a stream
copy()                           Copies or writes the contents to a C-string
c_str()                          Returns the value as C-string
data()                           Returns the value as character array
substr()                         Returns a certain substring
find functions                   Search for a certain substring or character
begin(), end()                   Provide normal iterator support
rbegin(), rend()                 Provide reverse iterator support
get_allocator()                  Returns the allocator

String Operation Arguments

Many operations are provided to manipulate strings. In particular, the operations that manipulate the value of a string have several overloaded versions that specify the new value with one, two, or three arguments. All these operations use the argument scheme of Table 11.2.

Table 11.2. Scheme of String Operation Arguments

Arguments                                         Interpretation
const string& str                                 The whole string str
const string& str, size_type idx, size_type num   At most, the first num characters of str starting with index idx
const char* cstr                                  The whole C-string cstr
const char* chars, size_type len                  len characters of the character array chars
char c                                            The character c
size_type num, char c                             num occurrences of the character c
iterator beg, iterator end                        All characters in the range [beg,end)

Note that only the single-argument const char* version handles the character '\0' as a special character that terminates the string. In all other cases '\0' is not a special character:

    std::string s1("nico");        //initializes s1 with: 'n' 'i' 'c' 'o'
    std::string s2("nico",5) ;     //initializes s2 with: 'n' 'i' 'c' 'o' '\0'
    std::string s3(5,'\0');        //initializes s3 with: '\0' '\0' '\0' '\0' '\0'

    s1.length()                     //yields 4
    s2.length()                     //yields 5
    s3.length()                     //yields 5

Thus, in general a string might contain any character. In particular, a string might contain the contents of a binary file.

See Table 11.3 for an overview of which operation uses which kind of arguments. All operators can only handle objects as single values. Therefore, to assign, compare, or append a part of a string or C-string, you must use the function that has the corresponding name.

Operations that Are Not Provided

The string classes of the C++ standard library do not solve every possible string problem. In fact, they do not provide direct solutions for

Word processing, however, is not a big problem. See Section 11.2.13 for some examples.

Table 11.3. Available Operations that Have String Parameters
  Full String Part of String C-string (char*) char Array Single char num chars Iterator Range
constructors Yes Yes Yes Yes Yes Yes
= Yes Yes Yes
assign() Yes Yes Yes Yes Yes Yes
+= Yes Yes Yes
append( ) Yes Yes Yes Yes Yes Yes
push_back() Yes
insert(), index version Yes Yes Yes Yes Yes
insert(), iterator version Yes Yes Yes
replace(), index version Yes Yes Yes Yes Yes Yes
replace(), iterator vers. Yes Yes Yes Yes
find functions Yes Yes Yes Yes
+ Yes Yes Yes
==, !=, <, <=, >, >= Yes Yes
compare() Yes Yes Yes Yes

11.2.3 Constructors and Destructors

Table 11.4 lists all constructors and the destructor for strings. These are described in this section. The initialization by a range that is specified by iterators is described in Section 11.2.13.

Table 11.4. Constructors and Destructor of Strings

Expression                      Effect
string s                        Creates the empty string s
string s(str)                   Creates a string as a copy of the existing string str
string s(str, stridx)           Creates a string s that is initialized by the characters of string str starting with index stridx
string s(str, stridx, strlen)   Creates a string s that is initialized by, at most, strlen characters of string str starting with index stridx
string s(cstr)                  Creates a string s that is initialized by the C-string cstr
string s(chars, chars_len)      Creates a string s that is initialized by chars_len characters of the character array chars
string s(num, c)                Creates a string that has num occurrences of character c
string s(beg, end)              Creates a string that is initialized by all characters of the range [beg, end)
s.~string()                     Destroys all characters and frees the memory

You can't initialize a string with a single character. Instead, you must use its address or an additional number of occurrences:

   std::string s('x');      //ERROR
   std::string s(1, 'x');   //OK, creates a string that has one character 'x'

This means that there is an automatic type conversion from type const char* but not from type char to type string.

11.2.4 Strings and C-Strings

In standard C++ the type of string literals was changed from char* to const char*. However, to provide backward compatibility there is an implicit but deprecated conversion to char* for them. However, because string literals don't have type string, there is a strong relationship between "new" string class objects and ordinary C-strings: You can use ordinary C-strings in almost every situation where strings are combined with other string-like objects (comparing, appending, inserting, etc.). In particular, there is an automatic type conversion from const char* into strings. However, there is no automatic type conversion from a string object to a C-string. This is for safety reasons to prevent unintended type conversions that result in strange behavior (type char* often has strange behavior) and ambiguities (for example, in an expression that combines a string and a C-string it would be possible to convert string into char* and vice versa). Instead, there are several ways to create or write/copy in a C-string. In particular, c_str() is provided to generate the value of a string as a C-string (as a character array that has '\0' as its last character). By using copy(), you can copy or write the value to an existing C-string or character array.

Note that strings do not provide a special meaning for the character '\0', which is used as special character in an ordinary C-string to mark the end of the string. The character '\0' may be part of a string just like every other character.

Note also that you must not use a null pointer (NULL) instead of a char* parameter. Doing so results in strange behavior. This is because NULL has an integral type and is interpreted as the number zero or the character with value 0 if the operation is overloaded for a single integral type.

There are three possible ways to convert the contents of the string into a raw array of characters or C-string:

  1. data()

    Returns the contents of the string as an array of characters. Note that the returned array is not a valid C-string because no '\0' character gets appended.

  2. c_str()

    Returns the contents of the string as a C-string. Thus, the '\0' character is appended.

  3. copy()

    Copies the contents of the string into a character array provided by the caller. A '\0' character is not appended.

Note that data() and c_str() return an array that is owned by the string. Thus, the caller must not modify or free the memory. For example:

    std::string s("12345");


    atoi(s.c_str())               //convert string into integer
    f(s.data(), s.length())       //call function for a character array
                                  //and the number of characters


    char buffer [100];
    s.copy (buffer, 100) ;        //copy at most 100 characters of s into buffer
    s.copy (buffer, 100,2) ;      //copy at most 100 characters of s into buffer
                                  //starting with the third character of s

You usually should use strings in the whole program and convert them into C-strings or character arrays only immediately before you need the contents as type char*. Note that the return value of c_str() and data() is valid only until the next call of a nonconstant member function for the same string:

    std::string s;

    ...
    foo (s.c_str());       //s.c_str() is valid during the whole statement


    const char* p;
    p = s.c_str() ;        //p refers to the contents of s as a C-string
    foo (p);               //OK(p is still valid)
    s += " ext" ;          //invalidates p
    foo (p);              //ERROR: argument p is not valid
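
Because copy() does not append '\0', a caller who needs a C-string in its own buffer has to add the terminator. The following sketch shows one way to do this. The helper name copy_to_cstring() is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies s into buffer, which has room for bufsize
// characters, and adds the terminating '\0' that copy() leaves out.
// Returns the number of characters copied, not counting the terminator.
std::size_t copy_to_cstring(const std::string& s, char* buffer, std::size_t bufsize)
{
    if (bufsize == 0) {
        return 0;
    }
    // leave room for the terminator; copy() returns the number of
    // characters that were actually copied
    std::string::size_type n = s.copy(buffer, bufsize - 1);
    buffer[n] = '\0';
    return n;
}
```

Unlike the pointer returned by c_str(), the contents of the buffer stay valid when the string is modified afterward.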

11.2.5 Size and Capacity

To use strings effectively and correctly you need to understand how the size and capacity of strings cooperate. For strings, three "sizes" exist:

  1. size() and length()

    Return the actual number of characters of the string. Both functions are equivalent.[4]

    The empty() member function is a shortcut for checking whether the number of characters is zero. Thus, it checks whether the string is empty. You should use it instead of length() or size() because it might be faster.

  2. max_size()

    Returns the maximum number of characters that a string may contain. A string typically contains all characters in a single block of memory, so there might be relevant restrictions on PCs. Otherwise, this value usually is the maximum value of the type of the index less one. It is "less one" for two reasons: (a) The maximum value itself is npos and (b) an implementation might append '\0' internally at the end of the internal buffer so that it simply returns that buffer when the string is used as a C-string (for example, by c_str()). Whenever an operation results in a string that has a length greater than max_size(), the class throws length_error.

  3. capacity()

    Returns the number of characters that a string could contain without having to reallocate its internal memory.

Having sufficient capacity is important for two reasons:

  1. Reallocation invalidates all references, pointers, and iterators that refer to characters of the string.

  2. Reallocation takes time.

Thus, the capacity must be taken into account if a program uses pointers, references, or iterators that refer to a string or to characters of a string, or if speed is a goal.

The member function reserve() is provided to avoid reallocations. reserve() lets you reserve a certain capacity before you really need it to ensure that references are valid as long as the capacity is not exceeded:

   std::string s;      //create empty string
   s.reserve(80);      //reserve memory for 80 characters

The concept of capacity for strings is, in principle, the same as for vector containers (see Section 6.2.1); however, there is one big difference: Unlike vectors, you can call reserve() for strings to shrink the capacity. Calling reserve() with an argument that is less than the current capacity is, in effect, a nonbinding shrink request. If the argument is less than the current number of characters, it is a nonbinding shrink-to-fit request. Thus, although you might want to shrink the capacity, it is not guaranteed to happen. The default value of reserve() for string is 0. So, a call of reserve() without any argument is always a nonbinding shrink-to-fit request:

   s.reserve();        //"would like to shrink capacity to fit the current size"

The call to shrink capacity is nonbinding because how to reach an optimal performance is implementation-defined. Implementations of the string class might have different design approaches with respect to speed and memory usage. Therefore, implementations might increase capacity in larger steps and might never shrink the capacity.

The standard, however, specifies that capacity may shrink only because of a call of reserve(). Thus, it is guaranteed that references, pointers, and iterators remain valid even when characters are deleted or changed, provided they refer to characters that have a position that is before the manipulated characters.
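
The relationship among the three "sizes" can be checked with a short sketch. The function name sizes_are_consistent() is hypothetical. The exact capacity after reserve() depends on the implementation, so the sketch checks only the lower bound.

```cpp
#include <string>

// Hypothetical helper: checks the relationships among size(),
// capacity(), and max_size() after a call of reserve().
bool sizes_are_consistent()
{
    std::string s;
    s.reserve(80);                 // request room for at least 80 characters
    s += "hello";                  // fits into the reserved memory

    return s.size() == 5
        && s.length() == s.size()
        && s.capacity() >= 80
        && s.capacity() <= s.max_size();
}
```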

11.2.6 Element Access

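A string allows you to have read or write access to the characters it contains. You can access a single character via either of two methods: the subscript operator [] and the at() member function. Both return the character at the position of the passed index. As usual, the first character has index 0 and the last character has index length()-1. However, note the following differences:

  1. Operator [] does not check whether the index passed as the argument is valid, whereas at() does. Calling at() with an invalid index throws out_of_range. Calling operator [] with an invalid index results in undefined behavior.

  2. For the constant version of operator [], the position after the last character is valid: For the index length(), the operator yields '\0'. In all other cases (the nonconstant version of operator [] and both versions of at()), length() is not a valid index.

For example: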

    const std::string cs("nico");      //cs contains: 'n' 'i' 'c' 'o'
    std::string s("abcde");            //s contains: 'a' 'b' 'c' 'd' 'e'


    s[2]                               //yields 'c'
    s.at(2)                            //yields 'c'


    s[100]                             //ERROR: undefined behavior
    s.at(100)                          //throws out_of_range


    s[s.length()]                      //ERROR: undefined behavior
    cs[cs.length()]                    //yields '\0'
    s.at(s.length())                   //throws out_of_range
    cs.at(cs.length())                 //throws out_of_range
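
When an index comes from an untrusted source, the exception thrown by at() can be handled instead of risking undefined behavior. A minimal sketch, with the hypothetical helper name char_or():

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the character at idx, or fallback
// if idx is not a valid index for s.
char char_or(const std::string& s, std::string::size_type idx, char fallback)
{
    try {
        return s.at(idx);          // checked access
    }
    catch (const std::out_of_range&) {
        return fallback;
    }
}
```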

To enable you to modify a character of a string, the nonconstant versions of [] and at() return a character reference. Note that this reference becomes invalid on reallocation:

    std::string s("abcde");        //s contains: 'a' 'b' 'c' 'd' 'e'


    char& r = s[2];                //reference to third character
    char* p = &s[3];               //pointer to fourth character


    r = 'X';                       //OK, s contains: 'a' 'b' 'X' 'd' 'e'
    *p = 'Y';                      //OK, s contains: 'a' 'b' 'X' 'Y' 'e'


    s = "new long value";          //reallocation invalidates r and p


    r = 'X';                       //ERROR: undefined behavior
    *p = 'Y';                      //ERROR: undefined behavior

Here, to avoid runtime errors, you would have had to reserve() enough capacity before r and p were initialized.
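Another way to stay clear of dangling references is to keep an index instead of a reference or pointer and to resolve it again after the string has changed. This is only a sketch; the helper name markBeforeAndAfter() is hypothetical.

```cpp
#include <string>

// hypothetical helper: modify the character at position idx before and
// after an assignment without holding a reference across the assignment
std::string markBeforeAndAfter(std::string s, const std::string& newValue,
                               std::string::size_type idx)
{
    if (idx < s.size()) {
        s[idx] = 'X';       // OK: index is valid for the current value
    }
    s = newValue;           // may reallocate; references obtained before would dangle
    if (idx < s.size()) {
        s[idx] = 'Y';       // OK: the index is resolved against the new value
    }
    return s;
}
```

An index survives reallocation; it only has to be checked against the new size.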

References and pointers that refer to characters of a string may be invalidated by the following operations:

The same applies to iterators (see Section 11.2.13).

11.2.7 Comparisons

The usual comparison operators are provided for strings. The operands may be strings or C-strings:

    std::string s1, s2;
    ...


    s1 == s2       //returns true if s1 and s2 contain the same characters
    s1 < "hello"   //returns whether s1 is less than the C-string "hello"

If strings are compared by <, <=, >, or >=, their characters are compared lexicographically according to the current character traits. For example, all of the following comparisons yield true:

    std::string("aaaa") < std::string("bbbb")
    std::string("aaaa") < std::string("abba")
    std::string("aaaa") < std::string("aaaaaa")

By using the compare() member functions you can compare substrings. The compare() member functions can process more than one argument for each string so that you can specify a substring by its index and by its length. Note that compare() returns an integral value rather than a Boolean value. This return value has the following meaning: 0 means equal, a value less than zero means less than, and a value greater than zero means greater than. For example:

    std::string s("abcd");


    s.compare("abcd")          //returns 0
    s.compare("dcba")          //returns a value < 0 (s is less)
    s.compare("ab")            //returns a value > 0 (s is greater)


    s.compare(s)               //returns 0 (s is equal to s)
    s.compare(0,2,s,2,2)       //returns a value < 0 ("ab" is less than "cd")
    s.compare(1,2,"bcx",2)     //returns 0 ("bc" is equal to "bc")
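A typical use of the substring form of compare() is a prefix or suffix test. The following sketch assumes two hypothetical helper names, startsWith() and endsWith(); they are not members of the string class described here.

```cpp
#include <string>

// hypothetical helper: does s begin with prefix?
bool startsWith(const std::string& s, const std::string& prefix)
{
    return s.size() >= prefix.size() &&
           s.compare(0, prefix.size(), prefix) == 0;    // compare leading substring
}

// hypothetical helper: does s end with suffix?
bool endsWith(const std::string& s, const std::string& suffix)
{
    // the size check also keeps the start index from underflowing
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

Note that the result of compare() is tested against 0; using it directly as a Boolean value would invert the meaning.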

To use a different comparison criterion you can define your own comparison criterion and use STL comparison algorithms (see Section 11.2.13, for an example), or you can use special character traits that make comparisons on a case-insensitive basis. However, because a string type that has a special traits class is a different data type, you cannot combine or process these strings with objects of type string. See Section 11.2.14, for an example.

In programs for the international market it might be necessary to compare strings according to a specific locale. Class locale provides the parenthesis operator as a convenient way to do this (see page 703). It uses the string collation facet, which is provided to compare strings for sorting according to some locale conventions. See Section 14.4.5, for details.

11.2.8 Modifiers

You can modify strings by using different member functions and operators.

Assignments

To modify a string you can use operator = to assign a new value. The new value may be a string, a C-string, or a single character. In addition, you can use the assign() member functions to assign strings when more than one argument is needed to describe the new value. For example:

    const std::string aString("othello");
    std::string s;


    s = aString;                //assign "othello"
    s = "two\nlines";           //assign a C-string
    s = ' ';                    //assign a single character


    s.assign(aString);          //assign "othello" (equivalent to operator =)
    s.assign(aString,1,3);      //assign "the"
    s.assign(aString,2,std::string::npos);    //assign "hello"


    s.assign("two\nlines");     //assign a C-string (equivalent to operator =)
    s.assign("nico",5);         //assign the character array: 'n' 'i' 'c' 'o' '\0'
    s.assign(5,'x');            //assign five characters: 'x' 'x' 'x' 'x' 'x'

You also can assign a range of characters that is defined by two iterators. See Section 11.2.13, for details.

Swapping Values

As with many nontrivial types, the string type provides a specialization of the swap() function, which swaps the contents of two strings (the global swap() function was introduced in Section 4.4.2). The specialization of swap() for strings guarantees constant complexity. So you should use it to swap the value of strings and to assign strings if you don't need the assigned string after the assignment.
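The advice to use swap() as a cheap "assignment" when the source is no longer needed might look like the following sketch. The helper name takeContents() is hypothetical.

```cpp
#include <string>

// hypothetical helper: hand over the value of source in constant time;
// afterward source is empty
std::string takeContents(std::string& source)
{
    std::string result;         // empty string
    result.swap(source);        // exchange the values of the two strings
    return result;
}
```

After the call, the returned string has the former value of the argument, and the argument holds the empty value that result had before the swap.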

Making Strings Empty

To remove all characters in a string, you have several possibilities. For example:

    std::string s;


    s = "";          // assign the empty string
    s.clear();       // clear contents
    s.erase();       // erase all characters

Inserting and Removing Characters

There are a lot of member functions to insert, remove, replace, and erase characters of a string. To append characters, you can use operator +=, append(), and push_back(). For example:

    const std::string aString("othello");
    std::string s;


    s += aString;            //append "othello"
    s += "two\nlines";       //append C-string
    s += '\n';               //append single character


    s.append(aString);       //append "othello" (equivalent to operator +=)
    s.append(aString,1,3);   //append "the"
    s.append(aString,2,std::string::npos);    //append "hello"


    s.append("two\nlines");  //append C-string (equivalent to operator +=)
    s.append("nico",5);      //append character array: 'n' 'i' 'c' 'o' '\0'
    s.append(5,'x');         //append five characters: 'x' 'x' 'x' 'x' 'x'


    s.push_back('\n');       //append single character (equivalent to operator +=)

Operator += appends single-argument values; append() lets you specify the appended value by using multiple arguments. One additional version of append() lets you append a range of characters specified by two iterators (see Section 11.2.13). The push_back() member function is provided for back inserters so that STL algorithms are able to append characters to a string (see Section 7.4.2, for details about back inserters and Section 11.2.13, for an example of their use with strings).

Similar to append(), several insert() member functions enable you to insert characters. They require the index at which the new characters are inserted; that is, the new characters are placed in front of the character that currently has that index:

    const std::string aString("age");
    std::string s("p");


    s.insert(1,aString);        //s: page
    s.insert(1, "ersifl");      //s: persiflage

Note that no insert() member function is provided to pass the index and a single character. Thus you must pass a string or an additional number:

    s.insert(0,' ');     //ERROR
    s.insert(0," ");     //OK

You might also try

    s.insert(0,1, ' ');   //ERROR: ambiguous

However, this results in a nasty ambiguity because insert() is overloaded for the following signatures:

    insert (size_type idx, size_type num, charT c); //position is index
    insert (iterator  pos, size_type num, charT c); //position is iterator

For type string, size_type is usually defined as unsigned and iterator is often defined as char*. In this case, the first argument 0 has two equivalent conversions. So, to get the correct behavior you have to write:

    s.insert((string::size_type)0,1,' ');  //OK

The second interpretation of the ambiguity described here is an example of the use of iterators to insert characters. If you wish to specify the insert position as an iterator, you can do it in three ways: insert a single character, insert a certain number of the same character, and insert a range of characters specified by two iterators (see Section 11.2.13).
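One way to avoid the cast is to let a variable or parameter of type size_type carry the index, as in the following sketch. The helper names insertChar() and insertCharAtFront() are hypothetical.

```cpp
#include <string>

// hypothetical helper: insert the single character c at index idx
void insertChar(std::string& s, std::string::size_type idx, char c)
{
    // idx already has type size_type, so the index version
    // insert(idx,num,c) is selected without any cast
    s.insert(idx, 1, c);
}

// hypothetical helper: the iterator version for a single character
void insertCharAtFront(std::string& s, char c)
{
    s.insert(s.begin(), c);     // insert(pos,c): position is an iterator
}
```

The ambiguity arises only for a literal 0 as first argument; a typed index or an iterator returned by begin() leaves no doubt about which overload is meant.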

Similar to append() and insert(), several erase() functions remove characters, and several replace() functions replace characters. For example:

    std::string s = "i18n";                     //s: i18n
    s.replace(1,2, "nternationalizatio");       //s: internationalization
    s.erase(13);                                //s: international
    s.erase(7,5);                               //s: internal
    s.replace(0,2, "ex");                       //s: external

resize() lets you change the number of characters. If the new size that is passed as an argument is less than the actual number of characters, characters are removed from the end. If the new size is greater than the actual number of characters, characters are appended at the end. You can pass the character that is appended if the size of the string grows. If you don't, the default constructor for the character type is used (which is the '\0' character for type char).
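As a small sketch of both directions of resize(), the following hypothetical helper fitToWidth() truncates or pads a string to a given number of characters:

```cpp
#include <string>

// hypothetical helper: make s exactly width characters long
std::string fitToWidth(std::string s, std::string::size_type width, char fill)
{
    s.resize(width, fill);      // removes characters at the end or appends fill
    return s;
}
```

If resize() is called with only the new size, the appended characters are '\0' characters for type char, which count as ordinary characters of the string.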

11.2.9 Substrings and String Concatenation

You can extract a substring from any string by using the substr() member function. For example:

    std::string s("interchangeability");


    s.substr()                      //returns a copy of s
    s.substr(11)                    //returns string("ability")
    s.substr(5,6)                   //returns string ("change")
    s.substr(s.find('c'))           //returns string ("changeability")

You can concatenate two strings or C-strings, or one of those with single characters by using operator +. For example, the statements

    std::string s1("enter");
    std::string s2("nation");
    std::string i18n;


    i18n = 'i' + s1.substr(1) + s2 + "aliz" + s2.substr(1);
    std::cout << "i18n means: " + i18n << std::endl;

have the following output:

    i18n means: internationalization
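Note that operator + needs at least one string operand; an expression such as "hello" + "world" adds two C-strings and does not compile. The following sketch combines substr() and operator + to exchange the extension of a file name. The helper name withExtension() is hypothetical.

```cpp
#include <string>

// hypothetical helper: replace (or append) the extension of a file name
std::string withExtension(const std::string& filename, const std::string& ext)
{
    // cut off an existing extension, if any
    const std::string::size_type dot = filename.rfind('.');
    const std::string base = filename.substr(0, dot);   // npos as length means "up to the end"

    // base is a string, so the characters and strings can be chained with +
    return base + '.' + ext;
}
```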

11.2.10 Input/Output Operators

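The usual I/O operators are defined for strings:

  • Operator >> reads a string from an input stream.

  • Operator << writes a string to an output stream.

These operators behave as they do for ordinary C-strings. In particular, operator >> operates as follows: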

  1. It skips leading whitespaces if the skipws flag (see Section 13.7.7) is set.

  2. It reads all characters until any of the following happens:

    • The next character is a whitespace

    • The stream is no longer in a good state (for example due to end-of-file)

    • The actual width() of the stream (see Section 13.7.3) is greater than zero and width() characters are read

    • max_size() characters are read

  3. It sets width() of the stream to 0.

Thus, in general, the input operator reads the next word while skipping leading whitespaces. A whitespace is any character for which isspace(c,strm.getloc()) is true (isspace() is explained in Section 14.4.4).

The output operator also takes the width() of the stream into consideration. That is, if width() is greater than 0, operator << writes at least width() characters.
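The behavior of both operators can be tried out with string streams instead of the standard channels. The following is a minimal sketch; the helper names readWords() and writeWithWidth() are hypothetical.

```cpp
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// hypothetical helper: read all whitespace-separated words with operator >>
std::vector<std::string> readWords(const std::string& text)
{
    std::istringstream in(text);
    std::vector<std::string> words;
    std::string word;
    while (in >> word) {            // skips leading whitespace, stops at the next one
        words.push_back(word);
    }
    return words;
}

// hypothetical helper: write s with the given field width
std::string writeWithWidth(const std::string& s, int width)
{
    std::ostringstream out;
    out << std::setw(width) << s;   // pads with the fill character, never truncates
    return out.str();
}
```

With the default settings, writeWithWidth("ab",5) yields "ab" preceded by three spaces, whereas a string that is longer than the field width is written completely.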

The string classes also provide a special function in namespace std for reading line-by-line: std::getline(). This function reads all characters (including leading whitespaces) until the line delimiter or end-of-file is reached. The line delimiter is extracted but not appended. By default, the line delimiter is the newline character, but you can pass your own "line" delimiter as an optional argument:[5]

    std::string s;


    while (getline(std::cin,s)) {       //for each line read from cin
        ...

    }


    while (getline(std::cin,s,':')) {   //for each token separated by ':'
        ...

    }

Note that if you read token-by-token, the newline character is not a special character. In this case, the tokens might contain a newline character.
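The effect of a user-defined delimiter can be sketched with a string stream as input. The helper name splitAt() is hypothetical.

```cpp
#include <sstream>
#include <string>
#include <vector>

// hypothetical helper: split text into tokens separated by delim
std::vector<std::string> splitAt(const std::string& text, char delim)
{
    std::istringstream in(text);
    std::vector<std::string> tokens;
    std::string token;
    while (std::getline(in, token, delim)) {    // delimiter is extracted but not stored
        tokens.push_back(token);
    }
    return tokens;
}
```

For the input "a:b\nc:d" and the delimiter ':', the second token is "b\nc": the newline character is part of the token because it no longer serves as delimiter.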

11.2.11 Searching and Finding

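Strings provide a lot of functions to search and find characters or substrings.[6] You can search

  • For a single character, a character sequence (substring), or one of a certain set of characters

  • Forward and backward

  • Starting from any position at the beginning or inside the string

In addition, all search algorithms of the STL can be called when iterators are used.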

All search functions have the word find inside their name. They try to find a character position given a value that is passed as an argument. How the search proceeds depends on the exact name of the find function. Table 11.5 lists all of the search functions for strings.

Table 11.5. Search Functions for Strings

    String Function         Effect
    ---------------         ------
    find()                  Finds the first occurrence of value
    rfind()                 Finds the last occurrence of value (reverse find)
    find_first_of()         Finds the first character that is part of value
    find_last_of()          Finds the last character that is part of value
    find_first_not_of()     Finds the first character that is not part of value
    find_last_not_of()      Finds the last character that is not part of value

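All search functions return the index of the first character of the character sequence that matches the search. If the search fails, they return npos. The search functions use the following argument scheme:

  • The first argument is always the value that is searched for.

  • The second, optional argument is the index at which to start the search in the string.

  • The third, optional argument is the number of characters of the value to search for.

Unfortunately, this argument scheme differs from that of the other string functions. With the other string functions, the starting index is the first argument, and the value and its length are adjacent arguments. In particular, each search function is overloaded with the following set of arguments:

  • const string& value: searches for the characters of the string value

  • const string& value, size_type idx: searches for the characters of the string value, starting with index idx

  • const char* value: searches for the characters of the C-string value

  • const char* value, size_type idx: searches for the characters of the C-string value, starting with index idx

  • const char* value, size_type idx, size_type value_len: searches for value_len characters of the character array value, starting with index idx

  • const char value: searches for the character value

  • const char value, size_type idx: searches for the character value, starting with index idx

For example: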

    std::string s("Hi Bill, I'm ill, so please pay the bill");


    s.find("il")                    //returns 4 (first substring "il")
    s.find("il",10)                 //returns 13 (first substring "il" starting from s[10])
    s.rfind("il")                   //returns 37 (last substring "il")
    s.find_first_of("il")           //returns 1 (first char 'i' or 'l')
    s.find_last_of("il")            //returns 39 (last char 'i' or 'l')
    s.find_first_not_of("il")       //returns 0 (first char neither 'i' nor 'l')
    s.find_last_not_of("il")        //returns 36 (last char neither 'i' nor 'l')
    s.find("hi")                    //returns npos

You could also use STL algorithms to find characters or substrings in strings. They allow you to use your own comparison criterion (see Section 11.2.13, for an example). However, note that the naming scheme of the STL search algorithms differs from the naming scheme for string search functions (see Section 9.2.2, for details).
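Two common uses of the search functions are counting matches by restarting the search behind each hit, and trimming by searching for the first and last character that is not a blank. The following sketch uses the hypothetical helper names countOccurrences() and trim().

```cpp
#include <string>

// hypothetical helper: count the nonoverlapping occurrences of value in s
std::string::size_type countOccurrences(const std::string& s,
                                        const std::string& value)
{
    if (value.empty()) {
        return 0;                       // avoid an endless loop
    }
    std::string::size_type count = 0;
    std::string::size_type idx = s.find(value);
    while (idx != std::string::npos) {
        ++count;
        idx = s.find(value, idx + value.size());    // continue behind the match
    }
    return count;
}

// hypothetical helper: remove leading and trailing characters that are part of blanks
std::string trim(const std::string& s, const std::string& blanks = " \t\n")
{
    const std::string::size_type first = s.find_first_not_of(blanks);
    if (first == std::string::npos) {
        return std::string();           // s contains only blanks
    }
    const std::string::size_type last = s.find_last_not_of(blanks);
    return s.substr(first, last - first + 1);
}
```

In both functions the index has type string::size_type and is compared directly with string::npos.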

11.2.12 The Value npos

If a search function fails, it returns string::npos. Consider the following example:

    std::string s;
    std::string::size_type idx;         //be careful: don't use any other type!
    ...


    idx = s.find("substring");
    if (idx == std::string::npos) {
       ...
    }

The condition of the if statement yields true if and only if "substring" is not part of string s.

Be very careful when using the string value npos and its type. When you want to check the return value always use string::size_type and not int or unsigned for the type of the return value; otherwise, the comparison of the return value with string::npos might not work.

This behavior is the result of the design decision that npos is defined as -1:

    namespace std {
        template<class charT,
                 class traits = char_traits<charT>,
                 class Allocator = allocator<charT> >
        class basic_string {
          public:
                typedef typename Allocator::size_type size_type;
                ...
                static const size_type npos = -1;
                ...
        };
    }

Unfortunately, size_type (which is defined by the allocator of the string) must be an unsigned integral type. The default allocator, allocator, uses type size_t as size_type (see Section 15.3). Because -1 is converted into an unsigned integral type, npos is the maximum unsigned value of its type. However, the exact value depends on the exact definition of type size_type. Unfortunately, these maximum values differ. In fact, (unsigned long)-1 differs from (unsigned short)-1 (provided the size of the types differ). Thus, the comparison

    idx == std::string::npos

might yield false, if idx has the value -1 and idx and string::npos have different types:

    std::string s;
    ...


    int idx = s.find("not found");     //assume it returns npos
    if (idx == std::string::npos) {    //ERROR: comparison might not work
        ...
    }

One way to avoid this error is to check whether the search fails directly:

    if (s.find("hi") == std::string::npos) {
        ...
    }

However, often you need the index of the matching character position. Thus, another simple solution is to define your own signed value for npos:

    const int NPOS = -1;

Now the comparison looks a bit different (and even more convenient):

    if (idx == NPOS) {     //works almost always
        ...
    }

Unfortunately, this solution is not perfect because the comparison fails if either idx has type unsigned short or the index is greater than the maximum value of int (because of these problems the standard did not define it that way). However, because both might happen very rarely, the solution works in most situations. To write portable code, however, you should always use string::size_type for any index of your string type. For a perfect solution you'd need some overloaded functions that consider the exact type of string::size_type. I hope the standard will provide a better solution in the future.
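One portable pattern that keeps npos out of the calling code is to report success as a Boolean value and to deliver the index separately. This is only a sketch under that assumption; the helper name findIndex() is hypothetical.

```cpp
#include <string>

// hypothetical helper: returns whether value is part of s;
// on success idx is set to the index of the first match
bool findIndex(const std::string& s, const std::string& value,
               std::string::size_type& idx)
{
    const std::string::size_type pos = s.find(value);
    if (pos == std::string::npos) {     // both operands have type size_type
        return false;                   // idx is left unchanged
    }
    idx = pos;
    return true;
}
```

The comparison with npos happens in one place and with the correct type, so callers cannot accidentally store the result in an int first.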

11.2.13 Iterator Support for Strings

A string is an ordered collection of characters. As a consequence, the C++ standard library provides an interface for strings that lets you use strings as STL containers.[7]

In particular, you can call the usual member functions to get iterators that iterate over the characters of a string. If you are not familiar with iterators, consider them as something that can refer to a single character inside a string, just as ordinary pointers do for C-strings. By using these objects, you can iterate over all characters of a string by calling several algorithms that either are provided by the C++ standard library or that are user defined. For example, you can sort the characters of a string, reverse the order, or find the character that has the maximum value.

String iterators are random access iterators. This means that they provide random access and that you can use all algorithms (see Section 5.3.2, and Section 7.2, for a discussion about iterator categories). As usual, the types of string iterators (iterator, const_iterator, and so on) are defined by the string class itself. The exact type is implementation defined, but usually string iterators are defined simply as ordinary pointers. See Section 7.2.6, for a discussion of a nasty difference between iterators that are implemented as pointers and iterators that are implemented as classes.

Iterators are invalidated when reallocation occurs or when certain changes are made to the values to which they refer. See Section 11.2.6, for details.

Iterator Functions for Strings

Table 11.6 shows all of the member functions that strings provide for iterators. As usual, the range specified by beg and end is a half-open range that includes beg but excludes end (often written as [beg,end), see Section 5.3).

Table 11.6. Iterator Operations of Strings

    Expression                          Effect
    ----------                          ------
    s.begin()                           Returns a random access iterator for the first character
    s.end()                             Returns a random access iterator for the position after the last character
    s.rbegin()                          Returns a reverse iterator for the first character of a reverse iteration (thus, for the last character)
    s.rend()                            Returns a reverse iterator for the position after the last character of a reverse iteration (thus, the position before the first character)
    string s(beg,end)                   Creates a string that is initialized by all characters of the range [beg,end)
    s.append(beg,end)                   Appends all characters of the range [beg,end)
    s.assign(beg,end)                   Assigns all characters of the range [beg,end)
    s.insert(pos,c)                     Inserts the character c at iterator position pos and returns the iterator position of the new character
    s.insert(pos,num,c)                 Inserts num occurrences of the character c at iterator position pos and returns the iterator position of the first new character
    s.insert(pos,beg,end)               Inserts all characters of the range [beg,end) at iterator position pos
    s.erase(pos)                        Deletes the character to which iterator pos refers and returns the position of the next character
    s.erase(beg,end)                    Deletes all characters of the range [beg,end) and returns the position of the next character
    s.replace(beg,end,str)              Replaces all characters of the range [beg,end) with the characters of string str
    s.replace(beg,end,cstr)             Replaces all characters of the range [beg,end) with the characters of the C-string cstr
    s.replace(beg,end,cstr,len)         Replaces all characters of the range [beg,end) with len characters of the character array cstr
    s.replace(beg,end,num,c)            Replaces all characters of the range [beg,end) with num occurrences of the character c
    s.replace(beg,end,newBeg,newEnd)    Replaces all characters of the range [beg,end) with all characters of the range [newBeg,newEnd)

To support the use of back inserters for string, the push_back() function is defined. See Section 7.4.2, for details about back inserters and page 502 for an example of their use with strings.

Example of Using String Iterators

A very useful thing that you can do with string iterators is to make all characters of a string lowercase or uppercase via a single statement. For example:

    //string/iter1.cpp

    #include <string>
    #include <iostream>
    #include <algorithm>
    #include <cctype>
    using namespace std;


    int main()
    {
        //create a string
        string s("The zip code of Hondelage in Germany is 38108");
        cout << "original: " << s << endl;


        //lowercase all characters
        //-the cast selects the C function tolower(int) in case other
        // overloads of tolower() are also visible
        transform (s.begin(), s.end(),                  //source
                   s.begin(),                           //destination
                   static_cast<int(*)(int)>(tolower));  //operation
        cout << "lowered:  " << s << endl;


        //uppercase all characters
        transform (s.begin(), s.end(),                  //source
                   s.begin(),                           //destination
                   static_cast<int(*)(int)>(toupper));  //operation
        cout << "uppered:  " << s << endl;
    }

The output of the program is as follows:

    original: The zip code of Hondelage in Germany is 38108
    lowered:  the zip code of hondelage in germany is 38108
    uppered:  THE ZIP CODE OF HONDELAGE IN GERMANY IS 38108

Note that tolower() and toupper() are old C functions that use the global locale. If you have a different locale or more than one locale in your program, you should use the new form of tolower() and toupper(). See Section 14.4.4, for details.
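As a sketch of the locale-dependent form, the following helpers pass a locale object to the two-argument versions of tolower() and toupper() that are declared in <locale>. The helper names toLowerCopy() and toUpperCopy() are hypothetical, and the classic "C" locale is chosen as default argument only to make the sketch deterministic.

```cpp
#include <locale>
#include <string>

// hypothetical helper: lowercase all characters according to locale loc
std::string toLowerCopy(std::string s,
                        const std::locale& loc = std::locale::classic())
{
    for (std::string::iterator pos = s.begin(); pos != s.end(); ++pos) {
        *pos = std::tolower(*pos, loc);     // two-argument version from <locale>
    }
    return s;
}

// hypothetical helper: uppercase all characters according to locale loc
std::string toUpperCopy(std::string s,
                        const std::locale& loc = std::locale::classic())
{
    for (std::string::iterator pos = s.begin(); pos != s.end(); ++pos) {
        *pos = std::toupper(*pos, loc);     // two-argument version from <locale>
    }
    return s;
}
```

Because the locale is an explicit argument, different parts of a program can convert strings according to different locales without touching the global locale.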

The following example demonstrates how the STL enables you to use your own search and sort criteria. It compares and searches strings in a case-insensitive way:

    //string/iter2.cpp

    #include <string>
    #include <iostream>
    #include <algorithm>
    #include <cctype>
    using namespace std;


    bool nocase_compare (char c1, char c2)
    {
        return toupper(c1) == toupper(c2);
    }


    int main()
    {
        string s1("This is a string");
        string s2("STRING");


        //compare case insensitive
        if (s1.size() == s2.size() &&        //ensure same sizes
            equal (s1.begin(),s1.end(),      //first source string
                   s2.begin(),               //second source string
                   nocase_compare)) {        //comparison criterion
            cout << "the strings are equal" << endl;
        }
        else {
            cout << "the strings are not equal" << endl;
        }


        //search case insensitive
        string::iterator pos;
        pos = search (s1.begin() ,s1.end(),  //source string in which to search
                      s2.begin(), s2.end(),  //substring to search
                      nocase_compare);       //comparison criterion
        if (pos == s1.end()) {
            cout << "s2 is not a substring of s1" << endl;
        }
        else {
            cout << '"' << s2 << "\" is a substring of \""
                 << s1 << "\" (at index " << pos - s1.begin() << ")"
                 << endl;
        }
    }

Note that the caller of equal() has to ensure that the second range has at least as many elements/characters as the first range. Thus, comparing the string size is necessary; otherwise, the behavior will be undefined.

In the last output statement you can process the difference of two string iterators to get the index of the character position:

    pos - s1.begin()

This is because string iterators are random access iterators. Similarly, to convert an index into an iterator position, you can simply add the value of the index to the iterator returned by begin().

In this example the user-defined auxiliary function nocase_compare() is provided to compare two strings in a case-insensitive way. Instead, you can also use a combination of some function adapters and replace the expression nocase_compare with the following expression:

    compose_f_gx_hy(equal_to<int>(),
                     ptr_fun(toupper),
                     ptr_fun(toupper))

See page 309 and page 318 for further details.

If you use strings in sets or maps, you might need a special sorting criterion to let the collections sort the string in a case-insensitive way. See page 213 for an example that demonstrates how to do this.
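Such a sorting criterion might look like the following sketch, which is not the example referred to above. It assumes a hypothetical function object NoCaseLess that compares two strings character by character with lexicographical_compare(), and it uses the C function toupper(), so it depends on the global locale.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// compare two characters without regard to case
inline bool nocaseCharLess(char c1, char c2)
{
    return std::toupper(static_cast<unsigned char>(c1)) <
           std::toupper(static_cast<unsigned char>(c2));
}

// hypothetical sorting criterion for sets and maps of strings
struct NoCaseLess {
    bool operator() (const std::string& s1, const std::string& s2) const
    {
        return std::lexicographical_compare(s1.begin(), s1.end(),
                                            s2.begin(), s2.end(),
                                            nocaseCharLess);
    }
};

// a set that considers "Hello" and "hello" to be the same element
typedef std::set<std::string, NoCaseLess> NoCaseStringSet;
```

The elements are still ordinary objects of type string; only the collection treats them in a case-insensitive way.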

The following program demonstrates other examples of strings using iterator functions:

    //string/iter3.cpp

    #include <string>
    #include <iostream>
    #include <algorithm>
    using namespace std;


    int main()
    {
        //create constant string
        const string hello("Hello, how are you?");


        //initialize string s with all characters of string hello
        string s(hello.begin(),hello.end());


        //iterate through all of the characters
        string::iterator pos;
        for (pos = s.begin(); pos != s.end(); ++pos) {
            cout << *pos;
        }
        cout << endl;


        //reverse the order of all characters inside the string
        reverse (s.begin(), s.end());
        cout << "reverse:       " << s << endl;


        //sort all characters inside the string
        sort (s.begin(), s.end());
        cout << "ordered:       " << s << endl;


        /*remove adjacent duplicates
         *-unique() reorders and returns new end
         *-erase() shrinks accordingly
         */
        s.erase (unique(s.begin(),
                        s.end()),
                 s.end());
        cout << "no duplicates: " << s << endl;
    }

The program has the following output:

    Hello, how are you?
    reverse:       ?uoy era woh ,olleH
    ordered:          ,?Haeehlloooruwy
    no duplicates:  ,?Haehloruwy

The following example uses back inserters to read the standard input into a string:

    //string/unique.cpp

    #include <iostream>
    #include <string>
    #include <algorithm>
    #include <iterator>
    #include <locale>
    using namespace std;


    class bothWhiteSpaces {
      private:
        const locale& loc; //locale
      public:
        /*constructor
         *-save the locale object
         */
        bothWhiteSpaces (const locale& l) : loc(l) {
        }
        /*function call
         *-returns whether both characters are whitespaces
         */
        bool operator() (char elem1, char elem2) {
            return isspace(elem1,loc) && isspace(elem2,loc);
        }
    };


    int main()
    {
        string contents;


        //don't skip leading whitespaces
        cin.unsetf (ios::skipws);


        //read all characters while compressing whitespaces
        unique_copy(istream_iterator<char>(cin),       //beginning of source
                    istream_iterator<char>(),          //end of source
                    back_inserter(contents),           //destination
                    bothWhiteSpaces(cin.getloc()));    //criterion for removing


        //process contents
        //-here: write it to the standard output
        cout << contents;
    }

By using the unique_copy() algorithm (see Section 9.7.2), all characters are read from the input stream cin and inserted into the string contents. The bothWhiteSpaces function object is used to check whether two consecutive characters are both whitespaces. To do this, it is initialized by the locale of cin and calls isspace(), which checks whether a character is a whitespace character (see Section 14.4.4, for a discussion of isspace()). unique_copy() uses the criterion bothWhiteSpaces to remove adjacent duplicate whitespaces. You can find a similar example in the reference section about unique_copy() on page 385.
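The same technique can be applied to a string that is already in memory, which makes it easy to try out. The following is a sketch only: the names BothWhitespace and compressWhitespace() are hypothetical, the locale is stored by value, and the classic "C" locale is assumed.

```cpp
#include <algorithm>
#include <iterator>
#include <locale>
#include <string>

// hypothetical criterion: are both characters whitespaces?
class BothWhitespace {
  private:
    std::locale loc;    //copy of the locale
  public:
    explicit BothWhitespace (const std::locale& l) : loc(l) {
    }
    bool operator() (char elem1, char elem2) const {
        return std::isspace(elem1,loc) && std::isspace(elem2,loc);
    }
};

// hypothetical helper: copy text while keeping only the first
// of adjacent whitespace characters
std::string compressWhitespace(const std::string& text)
{
    std::string result;
    std::unique_copy(text.begin(), text.end(),      //source
                     std::back_inserter(result),    //destination
                     BothWhitespace(std::locale::classic()));
    return result;
}
```

Note that of a group of adjacent whitespaces the first one is kept, whatever kind of whitespace it is.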

11.2.14 Internationalization

As mentioned in the introduction of the string class (see Section 11.2.1), the template string class basic_string<> is parameterized by the character type, the traits of the character type, and the memory model. Type string is the specialization for characters of type char, and type wstring is the specialization for characters of type wchar_t.

The character traits are provided to specify the details of how to deal with aspects depending on the representation of a character type. An additional class is necessary because you can't change the interface of built-in types (such as char and wchar_t), and the same character type may have different traits. The details about the traits classes are described in Section 14.1.2.

The following code defines a special traits class for strings so that they operate in a case-insensitive way:

    //string/icstring.hpp

    #include <string>
    #include <iostream>
    #include <cctype>


    /* replace functions of the standard char_traits<char>
     * so that strings behave in a case-insensitive way
     */
    struct ignorecase_traits : public std::char_traits<char> {
        //return whether c1 and c2 are equal
        static bool eq(const char& c1, const char& c2) {
            return std::toupper(c1)==std::toupper(c2);
        }
        //return whether c1 is less than c2
        static bool lt(const char& c1, const char& c2){
            return std::toupper(c1)<std::toupper(c2);
        }
        //compare up to n characters of s1 and s2
        static int compare(const char* s1, const char* s2, size_t n) {
            for (size_t i=0; i<n; ++i) {
                if (!eq(s1[i],s2[i])) {
                    return lt(s1[i],s2[i])?-1:1;
                }
            }
            return 0;
        }
        //search c in s
        static const char* find(const char* s, size_t n,
                                const char& c) {
            for (size_t i=0; i<n; ++i) {
                 if (eq(s[i],c)) {
                     return &(s[i]);
                 }
            }
            return 0;
        }
    };


    //define a special type for such strings
    typedef std::basic_string<char,ignorecase_traits> icstring;


    /*define an output operator
     *because the traits type is different than that for std::ostream
     */
    inline
    std::ostream& operator << (std::ostream& strm, const icstring& s)
    {
        //simply convert the icstring into a normal string
        return strm << std::string(s.data(), s.length());
    }

The definition of the output operator is necessary because the standard only defines I/O operators for streams that use the same character and traits type. But here, the traits type differs, so we have to define our own output operator. For input operators the same problem occurs.
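An input operator for such a string type can follow the same idea: read an ordinary string and convert it. The following sketch repeats a reduced traits class so that it is self-contained; the names nocase_traits, nocase_string, and firstWord() are hypothetical, and find() is omitted from the traits for brevity.

```cpp
#include <cctype>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// reduced case-insensitive traits (hypothetical; find() omitted for brevity)
struct nocase_traits : public std::char_traits<char> {
    static bool eq(char c1, char c2) {
        return std::toupper(static_cast<unsigned char>(c1)) ==
               std::toupper(static_cast<unsigned char>(c2));
    }
    static bool lt(char c1, char c2) {
        return std::toupper(static_cast<unsigned char>(c1)) <
               std::toupper(static_cast<unsigned char>(c2));
    }
    static int compare(const char* s1, const char* s2, std::size_t n) {
        for (std::size_t i=0; i<n; ++i) {
            if (!eq(s1[i],s2[i])) {
                return lt(s1[i],s2[i]) ? -1 : 1;
            }
        }
        return 0;
    }
};
typedef std::basic_string<char,nocase_traits> nocase_string;

// hypothetical input operator: read a word as std::string, then convert it
inline std::istream& operator >> (std::istream& strm, nocase_string& s)
{
    std::string word;
    if (strm >> word) {                     //s is modified only on success
        s.assign(word.data(), word.length());
    }
    return strm;
}

// hypothetical helper for trying out the operator: read the first word of text
inline nocase_string firstWord(const std::string& text)
{
    std::istringstream in(text);
    nocase_string s;
    in >> s;
    return s;
}
```

The conversion costs an additional copy of the characters, which is the price for reusing the standard input operator.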

The following program demonstrates how to use these special kinds of strings:

    //string/icstring1.cpp

    #include "icstring.hpp"


    int main()
    {
        using std::cout;
        using std::endl;


        icstring s1("hallo");
        icstring s2("otto");
        icstring s3("hALLo");


        cout << std::boolalpha;
        cout << s1 << " == " << s2 << " : " << (s1==s2) << endl;
        cout << s1 << " == " << s3 << " : " << (s1==s3) << endl;


        icstring::size_type idx = s1.find("All");
        if (idx != icstring::npos) {
            cout << "index of \"All\" in \"" << s1 << "\": "
                 << idx << endl;
        }
        else {
            cout << "\"All\" not found in \"" << s1 << "\"" << endl;
        }
    }

The program has the following output:

    hallo == otto : false
    hallo == hALLo : true
    index of "All" in "hallo": 1

See Chapter 14 for more details about internationalization.

11.2.15 Performance

The standard does not specify how the string class is to be implemented. It only specifies the interface. There may be important differences in speed and memory usage depending on the concept and priorities of the implementation.

If you prefer better speed, make sure that your string class uses a concept such as reference counting. Reference counting makes copies and assignments faster because the implementation only copies and assigns references instead of the contents of a string (see Section 6.8 for a smart pointer class that enables reference counting for any type). By using reference counting you might not even need to pass strings by constant reference; however, to maintain flexibility and portability, you always should.
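
As a small illustration of that advice, the following sketch passes its string argument by constant reference, so no copy is made regardless of how the string class is implemented. The function count_char() is a hypothetical helper, not a library function.

```cpp
#include <cstddef>
#include <string>

// hypothetical helper: count the occurrences of c in s
// - s is passed by constant reference, so the string is never copied
std::size_t count_char(const std::string& s, char c)
{
    std::size_t count = 0;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (s[i] == c) {
            ++count;
        }
    }
    return count;
}
```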

11.2.16 Strings and Vectors

Strings and vectors behave similarly. This is not a surprise because both are containers that are typically implemented as dynamic arrays. Thus, you could consider a string as a special kind of a vector that has characters as elements. In fact, you can use a string as an STL container. This is covered by Section 11.2.13. However, considering a string as a special kind of vector is dangerous because there are many fundamental differences between the two. Chief of these are their two primary goals: vectors are designed to handle and manipulate the elements of the container rather than the container as a whole, whereas strings are designed to handle and manipulate the container (the string) as a whole.

These different goals typically result in completely different implementations. For example, strings are often implemented by using reference counting; vectors never are. Nevertheless, you can also use vectors as ordinary C-strings. See Section 6.2.3 for details.
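
The following sketch shows one way a vector might serve as a C-string buffer. It assumes the vector holds a trailing '\0' so that &buf[0] can be handed to C functions. Both helper names are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// hypothetical helper: copy the characters of s into a vector
// and append '\0' so the vector can be used as a C-string
std::vector<char> to_cstring_buffer(const std::string& s)
{
    std::vector<char> buf(s.begin(), s.end());
    buf.push_back('\0');
    return buf;
}

// hypothetical helper: pass the vector's elements to a C function
// - requires a nonempty, '\0'-terminated vector
std::size_t c_length(const std::vector<char>& buf)
{
    return std::strlen(&buf[0]);
}
```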

11.3 String Class in Detail

In this section string means the actual string class. It might be string, wstring, or any other specialization of class basic_string<>. Type char means the actual character type, which is char for string and wchar_t for wstring. Other types and values that are in italic type have definitions that depend on individual definitions of the character type or traits class. The details about traits classes are provided in Section 14.1.2.

11.3.1 Type Definitions and Static Values

string::traits_type

string::value_type

string::size_type

string::difference_type

string::reference

string::const_reference

string::pointer

string::const_pointer

string::iterator

string::const_iterator

string::reverse_iterator

string::const_reverse_iterator

static const size_type string::npos

11.3.2 Create, Copy, and Destroy Operations

string::string ()

string::string (const string& str)

string::string (const string& str, size_type str_idx)

string::string (const string& str, size_type str_idx, size_type str_num)

string::string (const char* cstr)

string::string (const char* chars, size_type chars_len)

string::string (size_type num, char c)

string::string (InputIterator beg, InputIterator end)

string::~string ()

Most constructors allow you to pass an allocator as an additional argument (see Section 11.3.12).
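
To make the differences between these constructors concrete, the following sketch creates one string with each form and collects the results. The function constructor_samples() and the sample values are only illustrative assumptions.

```cpp
#include <string>
#include <vector>

// hypothetical helper: one string per constructor form listed above
std::vector<std::string> constructor_samples()
{
    const std::string src("interchangeability");
    const char raw[] = { 'a', 'b', '\0', 'c' };   // character array with embedded '\0'

    std::vector<std::string> out;
    out.push_back(std::string());             // empty string
    out.push_back(std::string(src));          // copy of src
    out.push_back(std::string(src, 5));       // from index 5 to the end
    out.push_back(std::string(src, 5, 6));    // at most 6 characters from index 5
    out.push_back(std::string("C-string"));   // up to, but excluding, '\0'
    out.push_back(std::string(raw, 4));       // exactly 4 characters, including '\0'
    out.push_back(std::string(3, 'x'));       // 3 occurrences of 'x'
    out.push_back(std::string(src.begin(), src.begin() + 5));  // range [beg,end)
    return out;
}
```

Note that only the form taking a length keeps the embedded '\0' as an ordinary character.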

11.3.3 Operations for Size and Capacity

Size Operations

size_type string::size () const

size_type string::length () const

bool string::empty () const

size_type string::max_size () const

Capacity Operations

size_type string::capacity () const

void string::reserve ()

void string::reserve (size_type num)
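
As a sketch of how the size and capacity operations are typically combined, the following hypothetical helper reserves memory once before appending repeatedly. Whether this actually avoids reallocations depends on the implementation.

```cpp
#include <string>

// hypothetical helper: concatenate pattern the given number of times
std::string repeat_pattern(const std::string& pattern,
                           std::string::size_type times)
{
    std::string result;
    // request enough memory up front (a nonbinding request)
    result.reserve(pattern.size() * times);
    for (std::string::size_type i = 0; i < times; ++i) {
        result += pattern;
    }
    return result;
}
```

For the result, size() and length() yield the same value, and capacity() is never less than size().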

11.3.4 Comparisons

bool comparison (const string& str1, const string& str2)

bool comparison (const string& str, const char* cstr)

bool comparison (const char* cstr, const string& str)

int string::compare (const string& str) const

int string::compare (size_type idx, size_type len, const string& str) const

int string::compare (size_type idx, size_type len, const string& str, size_type str_idx, size_type str_len) const

int string::compare (const char* cstr) const

int string::compare (size_type idx, size_type len, const char* cstr) const

int string::compare (size_type idx, size_type len, const char* chars, size_type chars_len) const
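
The substring forms of compare() can be used, for example, to test for a prefix or a suffix without creating temporary strings. The following is only a sketch; the helper names are my own.

```cpp
#include <string>

// hypothetical helper: does s begin with prefix?
bool starts_with(const std::string& s, const std::string& prefix)
{
    // compare at most prefix.size() characters of s, starting at index 0
    return s.size() >= prefix.size() &&
           s.compare(0, prefix.size(), prefix) == 0;
}

// hypothetical helper: does s end with suffix?
bool ends_with(const std::string& s, const std::string& suffix)
{
    // the size check keeps the start index valid
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```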

11.3.5 Character Access

char& string::operator [] (size_type idx)

char string::operator [] (size_type idx) const

char& string::at (size_type idx)

const char& string::at (size_type idx) const
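
The difference between the two kinds of access is that at() checks the index and throws std::out_of_range for an invalid one, whereas operator [] does not check. A minimal sketch, using a hypothetical helper:

```cpp
#include <stdexcept>
#include <string>

// hypothetical helper: return s[idx], or fallback if idx is not a valid index
char char_at_or(const std::string& s, std::string::size_type idx,
                char fallback)
{
    try {
        return s.at(idx);       // checked access
    }
    catch (const std::out_of_range&) {
        return fallback;        // idx >= s.size()
    }
}
```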

11.3.6 Generating C-Strings and Character Arrays

const char* string::c_str () const

const char* string::data () const

size_type string::copy (char* buf, size_type buf_size) const

size_type string::copy (char* buf, size_type buf_size, size_type idx) const
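
Because copy() does not append a '\0' character, code that needs a C-string in its own buffer has to terminate it explicitly. The following sketch shows one way to do so; both helpers are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// hypothetical helper: copy at most buf_size-1 characters of s into buf
// and terminate the result with '\0'
// - returns the number of characters copied (without the '\0')
std::size_t copy_to_buffer(const std::string& s, char* buf,
                           std::size_t buf_size)
{
    if (buf_size == 0) {
        return 0;               // no room, not even for '\0'
    }
    std::size_t n = s.copy(buf, buf_size - 1);
    buf[n] = '\0';              // copy() does not append '\0'
    return n;
}

// hypothetical helper: c_str() yields a '\0'-terminated C-string
std::size_t c_string_length(const std::string& s)
{
    return std::strlen(s.c_str());
}
```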

11.3.7 Modifying Operations

Assignments

string& string::operator = (const string& str)

string& string::assign (const string& str)

string& string::assign (const string& str, size_type str_idx, size_type str_num)

string& string::operator = (const char* cstr)

string& string::assign (const char* cstr)

string& string::assign (const char* chars, size_type chars_len)

string& string::operator = (char c)

string& string::assign (size_type num, char c)

void string::swap (string& str)

void swap (string& str1, string& str2)

Appending Characters

string& string::operator += (const string& str)

string& string::append (const string& str)

string& string::append (const string& str, size_type str_idx, size_type str_num)

string& string::operator += (const char* cstr)

string& string::append (const char* cstr)

string& string::append (const char* chars, size_type chars_len)

string& string::append (size_type num, char c)

string& string::operator += (char c)

void string::push_back (char c)

string& string::append (InputIterator beg, InputIterator end)
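
As a brief sketch of the append operations, the following hypothetical helper joins several strings with a separator, using both operator += and append():

```cpp
#include <string>
#include <vector>

// hypothetical helper: concatenate all parts, separated by sep
std::string join(const std::vector<std::string>& parts,
                 const std::string& sep)
{
    std::string result;
    for (std::vector<std::string>::size_type i = 0; i < parts.size(); ++i) {
        if (i > 0) {
            result += sep;          // operator += (const string&)
        }
        result.append(parts[i]);    // append (const string&)
    }
    return result;
}
```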

Inserting Characters

string& string::insert (size_type idx, const string& str)

string& string::insert (size_type idx, const string& str, size_type str_idx, size_type str_num)

string& string::insert (size_type idx, const char* cstr)

string& string::insert (size_type idx, const char* chars, size_type chars_len)

string& string::insert (size_type idx, size_type num, char c)

void string::insert (iterator pos, size_type num, char c)

iterator string::insert (iterator pos, char c)

void string::insert (iterator pos, InputIterator beg, InputIterator end)

Erasing Characters

void string::clear ()

string& string::erase ()

string& string::erase (size_type idx)

string& string::erase (size_type idx, size_type len)

iterator string::erase (iterator pos)

iterator string::erase (iterator beg, iterator end)
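
The index-based forms of erase() might be used as in the following sketch. Both helpers are hypothetical and work on a copy of the passed string.

```cpp
#include <string>

// hypothetical helper: remove every occurrence of c from s
std::string remove_char(std::string s, char c)
{
    std::string::size_type idx = 0;
    while ((idx = s.find(c, idx)) != std::string::npos) {
        s.erase(idx, 1);        // erase one character at index idx
    }
    return s;
}

// hypothetical helper: keep only the first n characters of s
std::string keep_prefix(std::string s, std::string::size_type n)
{
    if (n < s.size()) {
        s.erase(n);             // erase all characters from index n on
    }
    return s;
}
```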

Changing the Size

void string::resize (size_type num)

void string::resize (size_type num, char c)

Replacing Characters

string& string::replace (size_type idx, size_type len, const string& str)

string& string::replace (iterator beg, iterator end, const string& str)

string& string::replace (size_type idx, size_type len, const string& str, size_type str_idx, size_type str_num)

string& string::replace (size_type idx, size_type len, const char* cstr)

string& string::replace (iterator beg, iterator end, const char* cstr)

string& string::replace (size_type idx, size_type len, const char* chars, size_type chars_len)

string& string::replace (iterator beg, iterator end, const char* chars, size_type chars_len)

string& string::replace (size_type idx, size_type len, size_type num, char c)

string& string::replace (iterator beg, iterator end, size_type num, char c)

string& string::replace (iterator beg, iterator end, InputIterator newBeg, InputIterator newEnd)
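
A common use of replace() is to substitute every occurrence of a substring. The following is a minimal sketch under the assumption that matches are processed from left to right and that replaced text is not searched again. The function replace_all() is hypothetical, not a member of the string class.

```cpp
#include <string>

// hypothetical helper: replace every occurrence of from in s by to
std::string replace_all(std::string s, const std::string& from,
                        const std::string& to)
{
    if (from.empty()) {
        return s;               // nothing sensible to search for
    }
    std::string::size_type idx = 0;
    while ((idx = s.find(from, idx)) != std::string::npos) {
        s.replace(idx, from.length(), to);
        idx += to.length();     // continue behind the inserted characters
    }
    return s;
}
```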

11.3.8 Searching and Finding

Find a Character

size_type string::find (char c) const

size_type string::find (char c, size_type idx) const

size_type string::rfind (char c) const

size_type string::rfind (char c, size_type idx) const

Find a Substring

size_type string::find (const string& str) const

size_type string::find (const string& str, size_type idx) const

size_type string::rfind (const string& str) const

size_type string::rfind (const string& str, size_type idx) const

size_type string::find (const char* cstr) const

size_type string::find (const char* cstr, size_type idx) const

size_type string::rfind (const char* cstr) const

size_type string::rfind (const char* cstr, size_type idx) const

size_type string::find (const char* chars, size_type idx, size_type chars_len) const

size_type string::rfind (const char* chars, size_type idx, size_type chars_len) const
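
The forms that take a start index make it possible to continue a search behind the previous match. As a sketch, the following hypothetical helper counts the nonoverlapping occurrences of a substring:

```cpp
#include <cstddef>
#include <string>

// hypothetical helper: count nonoverlapping occurrences of sub in s
std::size_t count_substr(const std::string& s, const std::string& sub)
{
    if (sub.empty()) {
        return 0;
    }
    std::size_t count = 0;
    std::string::size_type idx = s.find(sub);
    while (idx != std::string::npos) {
        ++count;
        // continue the search behind the current match
        idx = s.find(sub, idx + sub.length());
    }
    return count;
}
```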

Find First of Different Characters

size_type string::find_first_of (const string& str) const

size_type string::find_first_of (const string& str, size_type idx) const

size_type string::find_first_not_of (const string& str) const

size_type string::find_first_not_of (const string& str, size_type idx) const

size_type string::find_first_of (const char* cstr) const

size_type string::find_first_of (const char* cstr, size_type idx) const

size_type string::find_first_not_of (const char* cstr) const

size_type string::find_first_not_of (const char* cstr, size_type idx) const

size_type string::find_first_of (const char* chars, size_type idx, size_type chars_len) const

size_type string::find_first_not_of (const char* chars, size_type idx, size_type chars_len) const

size_type string::find_first_of (char c) const

size_type string::find_first_of (char c, size_type idx) const

size_type string::find_first_not_of (char c) const

size_type string::find_first_not_of (char c, size_type idx) const

Find Last of Different Characters

size_type string::find_last_of (const string& str) const

size_type string::find_last_of (const string& str, size_type idx) const

size_type string::find_last_not_of (const string& str) const

size_type string::find_last_not_of (const string& str, size_type idx) const

size_type string::find_last_of (const char* cstr) const

size_type string::find_last_of (const char* cstr, size_type idx) const

size_type string::find_last_not_of (const char* cstr) const

size_type string::find_last_not_of (const char* cstr, size_type idx) const

size_type string::find_last_of (const char* chars, size_type idx, size_type chars_len) const

size_type string::find_last_not_of (const char* chars, size_type idx, size_type chars_len) const

size_type string::find_last_of (char c) const

size_type string::find_last_of (char c, size_type idx) const

size_type string::find_last_not_of (char c) const

size_type string::find_last_not_of (char c, size_type idx) const
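
The "not of" forms are handy for skipping a set of characters. The following sketch strips leading and trailing whitespace; the function trim() and its default set of whitespace characters are assumptions made for illustration.

```cpp
#include <string>

// hypothetical helper: remove leading and trailing characters of ws from s
std::string trim(const std::string& s, const std::string& ws = " \t\n")
{
    // index of the first character that is not in ws
    std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) {
        return std::string();   // s is empty or contains only ws characters
    }
    // index of the last character that is not in ws
    std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}
```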

11.3.9 Substrings and String Concatenation

string string::substr () const

string string::substr (size_type idx) const

string string::substr (size_type idx, size_type len) const

string operator + (const string& str1, const string& str2)

string operator + (const string& str, const char* cstr)

string operator + (const char* cstr, const string& str)

string operator + (const string& str, char c)

string operator + (char c, const string& str)
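
In the spirit of the file name example at the beginning of this chapter, the following sketch combines substr() and operator + to exchange the extension of a file name. The function with_extension() is a hypothetical helper.

```cpp
#include <string>

// hypothetical helper: replace the extension of filename by ext
// - appends '.' and ext if filename has no period
std::string with_extension(const std::string& filename,
                           const std::string& ext)
{
    std::string::size_type idx = filename.rfind('.');
    if (idx == std::string::npos) {
        return filename + '.' + ext;        // string + char + string
    }
    // keep everything up to and including the last period
    return filename.substr(0, idx + 1) + ext;
}
```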

11.3.10 Input/Output Functions

ostream& operator << (ostream& strm, const string& str)

istream& operator >> (istream& strm, string& str)

istream& getline (istream& strm, string& str)

istream& getline (istream& strm, string& str, char delim)
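
As a sketch of the form of getline() that takes a delimiter, the following hypothetical helpers read all fields of a stream and, for convenience, of a string via a string stream:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// hypothetical helper: read all fields separated by delim from strm
std::vector<std::string> read_fields(std::istream& strm, char delim)
{
    std::vector<std::string> fields;
    std::string field;
    // getline() extracts the delimiter but does not store it
    while (std::getline(strm, field, delim)) {
        fields.push_back(field);
    }
    return fields;
}

// hypothetical helper: split text at each occurrence of delim
std::vector<std::string> split_text(const std::string& text, char delim)
{
    std::istringstream in(text);
    return read_fields(in, delim);
}
```

Note that two adjacent delimiters yield an empty field, whereas a delimiter at the very end does not.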

11.3.11 Generating Iterators

iterator string::begin ()

const_iterator string::begin () const

iterator string::end ()

const_iterator string::end () const

reverse_iterator string::rbegin ()

const_reverse_iterator string::rbegin () const

reverse_iterator string::rend ()

const_reverse_iterator string::rend () const
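
Because strings provide iterators, STL algorithms can operate on their characters. The following sketch uses hypothetical helpers to convert a string to uppercase and to create a reversed copy:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// hypothetical helper: uppercase a single character
char to_upper_char(char c)
{
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

// hypothetical helper: uppercase all characters via begin() and end()
std::string to_upper(std::string s)
{
    std::transform(s.begin(), s.end(),   // source range
                   s.begin(),            // destination
                   to_upper_char);       // operation
    return s;
}

// hypothetical helper: reversed copy via rbegin() and rend()
std::string reversed(const std::string& s)
{
    return std::string(s.rbegin(), s.rend());
}
```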

11.3.12 Allocator Support

Strings provide the usual members of classes with allocator support.

string::allocator_type

allocator_type string::get_allocator () const

Strings also provide all constructors with optional allocator arguments. The following are all of the string constructors, including their optional allocator arguments, according to the standard:

    namespace std {
        template<class charT,
                 class traits = char_traits<charT>,
                 class Allocator = allocator<charT> >
        class basic_string {
          public:
            //default constructor
            explicit basic_string(const Allocator& a = Allocator());


            //copy constructor and substrings
            basic_string(const basic_string& str,
                         size_type str_idx = 0,
                         size_type str_num = npos);
            basic_string(const basic_string& str,
                         size_type str_idx, size_type str_nnm_placeholder,
                         const Allocator&);


            //constructor for C-strings
            basic_string(const charT* cstr,
                         const Allocator& a = Allocator());


            //constructor for character arrays
            basic_string(const charT* chars, size_type chars_len,
                         const Allocator& a = Allocator());


            //constructor for num occurrences of a character
            basic_string(size_type num, charT c,
                         const Allocator& a = Allocator());


            // constructor for a range of characters
            template<class InputIterator>
            basic_string(InputIterator beg, InputIterator end,
                         const Allocator& a = Allocator());
            ...

        };

    }

These constructors behave as described in Section 11.3.2, with the additional ability that you can pass your own memory model object. If the string is initialized by another string, the allocator also gets copied.[10] See Chapter 15 for more details about allocators.
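
As a minimal sketch of these members, the following hypothetical helper creates a new string that uses a copy of the allocator of an existing string. With the default allocator this makes no observable difference, so the example only illustrates the interface.

```cpp
#include <memory>
#include <string>

// hypothetical helper: create a string from text that uses
// a copy of the allocator of model
std::string make_with_allocator_of(const std::string& model,
                                   const char* text)
{
    std::string::allocator_type alloc = model.get_allocator();
    return std::string(text, alloc);    // C-string constructor with allocator
}
```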

[1]  In particular, the size_type of a string depends on the memory model of the string class. See Section 11.3.12 for details.

[2]  In this case, two member functions do the same with respect to the two different design approaches that are merged here. length() returns the length of the string as strlen() does for ordinary C-strings, whereas size() is the common member function for the number of elements according to the concept of the STL.

[3]  In systems that do not support default template parameters, the third argument is usually missing.

[4]  In this case, two member functions do the same thing because length() returns the length of the string, as strlen() does for ordinary C-strings, whereas size() is the common member function for the number of elements according to the concept of the STL.

[5]  You don't have to qualify getline() with std:: because "Koenig lookup" will always consider the namespace where the class of an argument was defined when calling a function (see page 17).

[6]  Don't be confused because I write about searching "and" finding. They are (almost) synonymous. The search functions use "find" in their name. However, unfortunately they don't guarantee to find anything. In fact, they "search" for something or "try to find" something. So I use the term search for the behavior of these functions and find with respect to their name.

[7]  The STL is introduced in Chapter 5.

[8]  The standard specifies the behavior of this form of compare() differently: It states that cstr is not considered a C-string but a character array, and passes npos as its length (in fact, it calls the following form of compare() by using npos as an additional parameter). This is a bug in the standard (it would always throw a length_error exception).

[9]  The standard specifies that the second form of this function returns the position after end. This is a bug in the standard.

[10]  The original standard states that the default allocator is used when a string gets copied. However, this does not make much sense, so this is the proposed resolution to fix this behavior.
