Effective C++, Third Edition: 55 Specific Ways to Improve Your Programs and Designs

2017-07-07 02:10:07

When it comes to multiple inheritance (MI), the C++ community largely breaks into two basic camps. One camp believes that if single inheritance (SI) is good, multiple inheritance must be better. The other camp argues that single inheritance is good, but multiple inheritance isn't worth the trouble. In this Item, our primary goal is to understand both perspectives on the MI question.

One of the first things to recognize is that when MI enters the designscape, it becomes possible to inherit the same name (e.g., function, typedef, etc.) from more than one base class. That leads to new opportunities for ambiguity. For example:

class BorrowableItem {             // something a library lets you borrow
public:
  void checkOut();                 // check the item out from the library
  ...
};

class ElectronicGadget {
private:
  bool checkOut() const;           // perform self-test, return whether
  ...                              // test succeeds
};

class MP3Player:                   // note MI here
  public BorrowableItem,           // (some libraries loan MP3 players)
  public ElectronicGadget
{ ... };                           // class definition is unimportant

MP3Player mp;

mp.checkOut();                     // ambiguous! which checkOut?

Note that in this example, the call to checkOut is ambiguous, even though only one of the two functions is accessible. (checkOut is public in BorrowableItem but private in ElectronicGadget.) That's in accord with the C++ rules for resolving a function call: C++ first looks up the name, then identifies the best-matching function among the declarations it found, and checks accessibility only after that. In this case, looking up checkOut in MP3Player finds declarations in two different base classes, so the name lookup itself is ambiguous and the process stops there; neither overload resolution nor best-match determination takes place. The accessibility of ElectronicGadget::checkOut is therefore never examined.

To resolve the ambiguity, you must specify which base class's function to call:

mp.BorrowableItem::checkOut(); // ah, that checkOut...

You could try to explicitly call ElectronicGadget::checkOut, too, of course, but then the ambiguity error would be replaced with a "you're trying to call a private member function" error.
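The following is a minimal, compilable sketch of the situation just described. The class names come from the example above; the function bodies, the `checkedOut` flag, and the helper `borrowPlayer` are assumptions added only so that the effect of the qualified call can be observed.

```cpp
#include <cassert>

class BorrowableItem {                        // something a library lets you borrow
public:
    void checkOut() { checkedOut = true; }    // body is an assumption for illustration
    bool isCheckedOut() const { return checkedOut; }
private:
    bool checkedOut = false;
};

class ElectronicGadget {
private:
    bool checkOut() const { return true; }    // self-test; private, yet still found by lookup
};

class MP3Player : public BorrowableItem, public ElectronicGadget {};

// Hypothetical helper: returns true if the qualified call reached
// BorrowableItem::checkOut.
bool borrowPlayer() {
    MP3Player mp;
    // mp.checkOut();                    // would not compile: checkOut is ambiguous
    mp.BorrowableItem::checkOut();       // OK: the base class is named explicitly
    // mp.ElectronicGadget::checkOut();  // would not compile: checkOut is private there
    return mp.isCheckedOut();
}
```

A caller could verify the behavior with something like `assert(borrowPlayer());`. Uncommenting either of the two disabled calls should reproduce the corresponding compiler error.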

Multiple inheritance just means inheriting from more than one base class, but it is not uncommon for MI to be found in hierarchies that have higher-level base classes, too. That can lead to what is sometimes known as the "deadly MI diamond":

class File { ... };
class InputFile: public File { ... };
class OutputFile: public File { ... };
class IOFile: public InputFile,
              public OutputFile
{ ... };

Any time you have an inheritance hierarchy with more than one path between a base class and a derived class (such as between File and IOFile above, which has paths through both InputFile and OutputFile), you must confront the question of whether you want the data members in the base class to be replicated for each of the paths. For example, suppose that the File class has a data member, fileName. How many copies of this field should IOFile have? On the one hand, it inherits a copy from each of its base classes, so that suggests that IOFile should have two fileName data members. On the other hand, simple logic says that an IOFile object has only one file name, so the fileName field it inherits through its two base classes should not be replicated.

C++ takes no position on this debate. It happily supports both options, though its default is to perform the replication. If that's not what you want, you must make the class with the data (i.e., File) a virtual base class. To do that, you have all classes that immediately inherit from it use virtual inheritance:

class File { ... };
class InputFile: virtual public File { ... };
class OutputFile: virtual public File { ... };
class IOFile: public InputFile,
              public OutputFile
{ ... };

The standard C++ library contains an MI hierarchy just like this one, except the classes are class templates, and the names are basic_ios, basic_istream, basic_ostream, and basic_iostream instead of File, InputFile, OutputFile, and IOFile.
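To make the difference between the two hierarchies concrete, here is a small sketch that places both in one translation unit. The `V`-prefixed class names and the two helper functions are assumptions introduced for this illustration; only the shape of the hierarchy and the fileName member come from the text.

```cpp
#include <cassert>
#include <string>

// Non-virtual version: each inheritance path carries its own File subobject.
struct File       { std::string fileName; };
struct InputFile  : public File {};
struct OutputFile : public File {};
struct IOFile     : public InputFile, public OutputFile {};

// Virtual version: both paths share a single VFile subobject.
// The V prefix exists only so both hierarchies can coexist in this sketch.
struct VFile       { std::string fileName; };
struct VInputFile  : virtual public VFile {};
struct VOutputFile : virtual public VFile {};
struct VIOFile     : public VInputFile, public VOutputFile {};

// True if the two paths lead to two distinct fileName members.
bool hasTwoFileNames() {
    IOFile f;
    // f.fileName = "x";      // would not compile: fileName is ambiguous
    InputFile&  in  = f;
    OutputFile& out = f;
    in.fileName  = "in.txt";
    out.fileName = "out.txt"; // does not overwrite in.fileName
    return &in.fileName != &out.fileName && in.fileName == "in.txt";
}

// True if both paths lead to the same, single fileName member.
bool hasOneFileName() {
    VIOFile f;
    f.fileName = "data.txt";  // unambiguous: only one fileName exists
    VInputFile&  in  = f;
    VOutputFile& out = f;
    return &in.fileName == &out.fileName;
}
```

Both helpers are expected to return true, which a caller can check with `assert(hasTwoFileNames() && hasOneFileName());`.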

From the viewpoint of correct behavior, public inheritance should always be virtual. If that were the only point of view, the rule would be simple: anytime you use public inheritance, use virtual public inheritance. Alas, correctness is not the only perspective. Avoiding the replication of inherited fields requires some behind-the-scenes legerdemain on the part of compilers, and the result is that objects created from classes using virtual inheritance are generally larger than they would be without virtual inheritance. Access to data members in virtual base classes is also slower than to those in non-virtual base classes. The details vary from compiler to compiler, but the basic thrust is clear: virtual inheritance costs.

It costs in other ways, too. The rules governing the initialization of virtual base classes are more complicated and less intuitive than are those for non-virtual bases. The responsibility for initializing a virtual base is borne by the most derived class in the hierarchy. Implications of this rule include (1) classes derived from virtual bases that require initialization must be aware of their virtual bases, no matter how far distant the bases are, and (2) when a new derived class is added to the hierarchy, it must assume initialization responsibilities for its virtual bases (both direct and indirect).
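The initialization rule can be seen in a short sketch. The constructors, the "input:" and "output:" prefixes, and the two helper functions are assumptions chosen to make it visible which initializer actually runs; they are not part of the original File hierarchy.

```cpp
#include <cassert>
#include <string>
#include <utility>

class File {
public:
    explicit File(std::string name) : fileName(std::move(name)) {}
    const std::string& name() const { return fileName; }
private:
    std::string fileName;
};

class InputFile : virtual public File {
public:
    // This File initializer is used only when InputFile is the most derived class.
    explicit InputFile(const std::string& n) : File("input:" + n) {}
};

class OutputFile : virtual public File {
public:
    explicit OutputFile(const std::string& n) : File("output:" + n) {}
};

class IOFile : public InputFile, public OutputFile {
public:
    // IOFile is the most derived class here, so it must initialize File itself;
    // the File initializers in InputFile and OutputFile are ignored.
    explicit IOFile(const std::string& n) : File(n), InputFile(n), OutputFile(n) {}
};

// Hypothetical helpers that report which File initializer took effect.
std::string nameSeenByIOFile()    { return IOFile("log.txt").name(); }
std::string nameSeenByInputFile() { return InputFile("log.txt").name(); }
```

Under these assumptions, `nameSeenByIOFile()` yields "log.txt" while `nameSeenByInputFile()` yields "input:log.txt". Because File has no default constructor in this sketch, removing `File(n)` from IOFile's initializer list should be rejected by the compiler, which is exactly the awareness burden described above.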

My advice on virtual base classes (i.e., on virtual inheritance) is simple. First, don't use virtual bases unless you need to. By default, use non-virtual inheritance. Second, if you must use virtual base classes, try to avoid putting data in them. That way you won't have to worry about oddities in the initialization (and, as it turns out, assignment) rules for such classes. It's worth noting that Interfaces in Java and .NET, which are in many ways comparable to virtual base classes in C++, are not allowed to contain any data.

Let us now turn to the following C++ Interface class (see Item 31) for modeling persons:

class IPerson {
public:
  virtual ~IPerson();

  virtual std::string name() const = 0;
  virtual std::string birthDate() const = 0;
};

IPerson clients must program in terms of IPerson pointers and references, because abstract classes cannot be instantiated. To create objects that can be manipulated as IPerson objects, clients of IPerson use factory functions (again, see Item 31) to instantiate concrete classes derived from IPerson:

// factory function to create a Person object from a unique database ID;
// see Item 18 for why the return type isn't a raw pointer
std::tr1::shared_ptr<IPerson> makePerson(DatabaseID personIdentifier);

// function to get a database ID from the user
DatabaseID askUserForDatabaseID();

DatabaseID id(askUserForDatabaseID());
std::tr1::shared_ptr<IPerson> pp(makePerson(id));    // create an object
                                                     // supporting the
                                                     // IPerson interface

...                                                  // manipulate *pp via
                                                     // IPerson's member
                                                     // functions

But how does makePerson create the objects to which it returns pointers? Clearly, there must be some concrete class derived from IPerson that makePerson can instantiate.

Suppose this class is called CPerson. As a concrete class, CPerson must provide implementations for the pure virtual functions it inherits from IPerson. It could write these from scratch, but it would be better to take advantage of existing components that do most or all of what's necessary. For example, suppose an old database-specific class PersonInfo offers the essence of what CPerson needs:

class PersonInfo {
public:
  explicit PersonInfo(DatabaseID pid);
  virtual ~PersonInfo();

  virtual const char * theName() const;
  virtual const char * theBirthDate() const;
  ...

private:
  virtual const char * valueDelimOpen() const;      // see
  virtual const char * valueDelimClose() const;     // below
  ...
};

You can tell this is an old class, because the member functions return const char*s instead of string objects. Still, if the shoe fits, why not wear it? The names of this class's member functions suggest that the result is likely to be pretty comfortable.

You come to discover that PersonInfo was designed to facilitate printing database fields in various formats, with the beginning and end of each field value delimited by special strings. By default, the opening and closing delimiters for field values are square brackets, so the field value "Ring-tailed Lemur" would be formatted this way:

[Ring-tailed Lemur]

In recognition of the fact that square brackets are not universally desired by clients of PersonInfo, the virtual functions valueDelimOpen and valueDelimClose allow derived classes to specify their own opening and closing delimiter strings. The implementations of PersonInfo's member functions call these virtual functions to add the appropriate delimiters to the values they return. Using PersonInfo::theName as an example, the code looks like this:

const char * PersonInfo::valueDelimOpen() const
{
  return "[";                       // default opening delimiter
}

const char * PersonInfo::valueDelimClose() const
{
  return "]";                       // default closing delimiter
}

const char * PersonInfo::theName() const
{
  // reserve buffer for return value; because this is
  // static, it's automatically initialized to all zeros
  static char value[Max_Formatted_Field_Value_Length];

  // write opening delimiter
  std::strcpy(value, valueDelimOpen());

  // (pseudocode) append to the string in value this object's
  // name field, being careful to avoid buffer overruns!

  // write closing delimiter
  std::strcat(value, valueDelimClose());

  return value;
}

One might question the antiquated design of PersonInfo::theName (especially the use of a fixed-size static buffer, something that's ripe for both overrun and threading problems; see also Item 21), but set such questions aside and focus instead on this: theName calls valueDelimOpen to generate the opening delimiter of the string it will return, then it generates the name value itself, then it calls valueDelimClose.

Because valueDelimOpen and valueDelimClose are virtual functions, the result returned by theName is dependent not only on PersonInfo but also on the classes derived from PersonInfo.
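A simplified sketch shows how a derived class can change what theName returns without touching theName itself. Everything here is an assumption made for illustration: the class names `PersonInfoSketch` and `AngleBracketInfo`, the constructor that takes the name directly instead of a database ID, and the use of std::string in place of the static character buffer.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Simplified stand-in for PersonInfo (not the real class).
class PersonInfoSketch {
public:
    explicit PersonInfoSketch(std::string name) : name_(std::move(name)) {}
    virtual ~PersonInfoSketch() = default;

    // The formatting steps are fixed; the delimiters are the customization points.
    std::string theName() const {
        return std::string(valueDelimOpen()) + name_ + valueDelimClose();
    }

private:
    virtual const char* valueDelimOpen() const  { return "["; }
    virtual const char* valueDelimClose() const { return "]"; }

    std::string name_;
};

// Hypothetical derived class that prefers angle brackets.
class AngleBracketInfo : public PersonInfoSketch {
public:
    using PersonInfoSketch::PersonInfoSketch;   // inherit the constructor
private:
    const char* valueDelimOpen() const override  { return "<"; }
    const char* valueDelimClose() const override { return ">"; }
};

std::string formatDefault() { return PersonInfoSketch("Ring-tailed Lemur").theName(); }
std::string formatAngle()   { return AngleBracketInfo("Ring-tailed Lemur").theName(); }
```

With these definitions, `formatDefault()` produces "[Ring-tailed Lemur]" and `formatAngle()` produces "<Ring-tailed Lemur>", even though both go through the same theName code.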

As the implementer of CPerson, that's good news, because while perusing the fine print in the IPerson documentation, you discover that name and birthDate are required to return unadorned values, i.e., no delimiters are allowed. That is, if a person is named Homer, a call to that person's name function should return "Homer", not "[Homer]".

The relationship between CPerson and PersonInfo is that PersonInfo happens to have some functions that would make CPerson easier to implement. That's all. Their relationship is thus is-implemented-in-terms-of, and we know that can be represented in two ways: via composition (see Item 38) and via private inheritance (see Item 39). Item 39 points out that composition is the generally preferred approach, but inheritance is necessary if virtual functions are to be redefined. In this case, CPerson needs to redefine valueDelimOpen and valueDelimClose, so simple composition won't do. The most straightforward solution is to have CPerson privately inherit from PersonInfo, though Item 39 explains that with a bit more work, CPerson could also use a combination of composition and inheritance to effectively redefine PersonInfo's virtuals. Here, we'll use private inheritance.
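For comparison, the composition-plus-inheritance alternative mentioned above might look roughly like the sketch below. It is only an approximation under stated assumptions: `CPersonComposed`, `PlainInfo`, and `composedName` are hypothetical names, `DatabaseID` is a placeholder struct, IPerson is reduced to a single function, and PersonInfo is a stub that returns canned data from a per-object buffer.

```cpp
#include <cassert>
#include <string>

struct DatabaseID { int value; };            // hypothetical stand-in

class IPerson {                              // reduced to one function for brevity
public:
    virtual ~IPerson() = default;
    virtual std::string name() const = 0;
};

// Stub with canned data; only theName is modeled.
class PersonInfo {
public:
    explicit PersonInfo(DatabaseID) {}
    virtual ~PersonInfo() = default;
    virtual const char* theName() const {
        buffer_ = std::string(valueDelimOpen()) + "Homer" + valueDelimClose();
        return buffer_.c_str();
    }
private:
    virtual const char* valueDelimOpen() const  { return "["; }
    virtual const char* valueDelimClose() const { return "]"; }
    mutable std::string buffer_;
};

class CPersonComposed : public IPerson {     // single inheritance only
public:
    explicit CPersonComposed(DatabaseID pid) : info_(pid) {}
    std::string name() const override { return info_.theName(); }
private:
    // The nested class exists solely to redefine the delimiter virtuals.
    class PlainInfo : public PersonInfo {
    public:
        explicit PlainInfo(DatabaseID pid) : PersonInfo(pid) {}
    private:
        const char* valueDelimOpen() const override  { return ""; }
        const char* valueDelimClose() const override { return ""; }
    };

    PlainInfo info_;                         // composition
};

std::string composedName() { return CPersonComposed(DatabaseID{1}).name(); }
```

Here `composedName()` yields "Homer" with no delimiters. The price is an extra nested class, which is the "bit more work" the text refers to.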

But CPerson must also implement the IPerson interface, and that calls for public inheritance. This leads to one reasonable application of multiple inheritance: combine public inheritance of an interface with private inheritance of an implementation:

class IPerson {                            // this class specifies the
public:                                    // interface to be implemented
  virtual ~IPerson();

  virtual std::string name() const = 0;
  virtual std::string birthDate() const = 0;
};

class DatabaseID { ... };                  // used below; details are
                                           // unimportant

class PersonInfo {                         // this class has functions
public:                                    // useful in implementing
  explicit PersonInfo(DatabaseID pid);     // the IPerson interface
  virtual ~PersonInfo();

  virtual const char * theName() const;
  virtual const char * theBirthDate() const;

  virtual const char * valueDelimOpen() const;
  virtual const char * valueDelimClose() const;
  ...
};

class CPerson: public IPerson, private PersonInfo {     // note use of MI
public:
  explicit CPerson(DatabaseID pid): PersonInfo(pid) {}

  virtual std::string name() const                      // implementations
  { return PersonInfo::theName(); }                     // of the required
                                                        // IPerson member
  virtual std::string birthDate() const                 // functions
  { return PersonInfo::theBirthDate(); }

private:                                                // redefinitions of
  const char * valueDelimOpen() const { return ""; }    // inherited virtual
  const char * valueDelimClose() const { return ""; }   // delimiter
};                                                      // functions

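In UML, the design looks like this:

[Figure: UML class diagram in which CPerson inherits publicly from IPerson and privately from PersonInfo.]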

This example demonstrates that MI can be both useful and comprehensible.
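To try the design end to end, the listing can be filled in along the following lines. This is a sketch under several assumptions, not the original implementation: `DatabaseID` is a placeholder struct, PersonInfo is a stub that returns canned values ("Homer" and an arbitrary placeholder date) from a per-object buffer rather than a static array, and std::shared_ptr with std::make_shared stands in for std::tr1::shared_ptr.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct DatabaseID { int value; };   // hypothetical; the text leaves it unspecified

class IPerson {
public:
    virtual ~IPerson() = default;
    virtual std::string name() const = 0;
    virtual std::string birthDate() const = 0;
};

// Stub PersonInfo returning canned, delimited values (no real database).
class PersonInfo {
public:
    explicit PersonInfo(DatabaseID) {}
    virtual ~PersonInfo() = default;

    virtual const char* theName() const      { return format("Homer"); }
    virtual const char* theBirthDate() const { return format("1970-01-01"); }

    virtual const char* valueDelimOpen() const  { return "["; }
    virtual const char* valueDelimClose() const { return "]"; }

private:
    const char* format(const char* field) const {
        buffer_ = std::string(valueDelimOpen()) + field + valueDelimClose();
        return buffer_.c_str();
    }
    mutable std::string buffer_;    // per-object buffer instead of a static array
};

class CPerson : public IPerson, private PersonInfo {    // note use of MI
public:
    explicit CPerson(DatabaseID pid) : PersonInfo(pid) {}

    std::string name() const override      { return PersonInfo::theName(); }
    std::string birthDate() const override { return PersonInfo::theBirthDate(); }

private:
    const char* valueDelimOpen() const override  { return ""; }
    const char* valueDelimClose() const override { return ""; }
};

// Factory function: clients see only the IPerson interface.
std::shared_ptr<IPerson> makePerson(DatabaseID pid) {
    return std::make_shared<CPerson>(pid);
}
```

Under these assumptions, `makePerson(DatabaseID{7})->name()` yields "Homer", whereas calling theName on a plain PersonInfo yields "[Homer]". Because the PersonInfo base is private, clients holding a CPerson cannot reach theName or the delimiter functions through it.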

At the end of the day, multiple inheritance is just another tool in the object-oriented toolbox. Compared to single inheritance, it's typically more complicated to use and more complicated to understand, so if you've got an SI design that's more or less equivalent to an MI design, the SI design is almost certainly preferable. If the only design you can come up with involves MI, you should think a little harder: there's almost certainly some way to make SI work. At the same time, MI is sometimes the clearest, most maintainable, most reasonable way to get the job done. When that's the case, don't be afraid to use it. Just be sure to use it judiciously.

Things to Remember

Multiple inheritance is more complex than single inheritance. It can lead to new ambiguity issues and to the need for virtual inheritance.

Virtual inheritance imposes costs in size, speed, and complexity of initialization and assignment. It's most practical when virtual base classes have no data.

Multiple inheritance does have legitimate uses. One scenario involves combining public inheritance from an Interface class with private inheritance from a class that helps with implementation.
