Effective C++, Third Edition: 55 Specific Ways to Improve Your Programs and Designs

Item 22: Declare data members private

Okay, here's the plan. First, we're going to see why data members shouldn't be public. Then we'll see that all the arguments against public data members apply equally to protected ones. That will lead to the conclusion that data members should be private, and at that point, we'll be done.

So, public data members. Why not?

Let's begin with syntactic consistency (see also Item 18). If data members aren't public, the only way for clients to access an object is via member functions. If everything in the public interface is a function, clients won't have to scratch their heads trying to remember whether to use parentheses when they want to access a member of the class. They'll just do it, because everything is a function. Over the course of a lifetime, that can save a lot of head scratching.

But maybe you don't find the consistency argument compelling. How about the fact that using functions gives you much more precise control over the accessibility of data members? If you make a data member public, everybody has read-write access to it, but if you use functions to get or set its value, you can implement no access, read-only access, and read-write access. Heck, you can even implement write-only access if you want to:

    class AccessLevels {
    public:
      ...
      int getReadOnly() const { return readOnly; }
      void setReadWrite(int value) { readWrite = value; }
      int getReadWrite() const { return readWrite; }
      void setWriteOnly(int value) { writeOnly = value; }

    private:
      int noAccess;     // no access to this int
      int readOnly;     // read-only access to this int
      int readWrite;    // read-write access to this int
      int writeOnly;    // write-only access to this int
    };

Such fine-grained access control is important, because many data members should be hidden. Rarely does every data member need a getter and setter.
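To see the access levels from a client's point of view, here is a compilable sketch of AccessLevels. The constructor, its starting values, the use of noAccess as an internal write counter, and the clientDemo function are all assumptions added for illustration; the original elides those parts.

```cpp
class AccessLevels {
public:
    // Assumption: the original elides the constructor ("..."); these
    // starting values are arbitrary and exist only so the sketch compiles.
    AccessLevels() : noAccess(0), readOnly(42), readWrite(0), writeOnly(0) {}

    int  getReadOnly() const     { return readOnly; }
    void setReadWrite(int value) { readWrite = value; ++noAccess; }
    int  getReadWrite() const    { return readWrite; }
    void setWriteOnly(int value) { writeOnly = value; ++noAccess; }

private:
    int noAccess;   // no accessor; used here as an internal write counter (assumption)
    int readOnly;   // readable through getReadOnly only
    int readWrite;  // readable and writable through its getter/setter pair
    int writeOnly;  // writable through setWriteOnly only
};

// Hypothetical client code: every access goes through a function call.
int clientDemo() {
    AccessLevels a;
    a.setReadWrite(7);
    a.setWriteOnly(99);
    // a.readOnly = 1;      // would not compile: readOnly is private
    // int n = a.noAccess;  // would not compile: noAccess is private
    return a.getReadOnly() + a.getReadWrite();
}
```

The two commented-out lines are the point: the compiler, not a coding convention, enforces which members clients may read or write.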

Still not convinced? Then it's time to bring out the big gun: encapsulation. If you implement access to a data member through a function, you can later replace the data member with a computation, and nobody using your class will be any the wiser.

For example, suppose you are writing an application in which automated equipment is monitoring the speed of passing cars. As each car passes, its speed is computed and the value added to a collection of all the speed data collected so far:

    class SpeedDataCollection {
      ...
    public:
      void addValue(int speed);       // add a new data value
      double averageSoFar() const;    // return average speed
      ...
    };

Now consider the implementation of the member function averageSoFar. One way to implement it is to have a data member in the class that is a running average of all the speed data so far collected. Whenever averageSoFar is called, it just returns the value of that data member. A different approach is to have averageSoFar compute its value anew each time it's called, something it could do by examining each data value in the collection.

The first approach (keeping a running average) makes each SpeedDataCollection object bigger, because you have to allocate space for the data members holding the running average, the accumulated total, and the number of data points. However, averageSoFar can be implemented very efficiently; it's just an inline function (see Item 30) that returns the value of the running average. Conversely, computing the average whenever it's requested will make averageSoFar run slower, but each SpeedDataCollection object will be smaller.

Who's to say which is best? On a machine where memory is tight (e.g., an embedded roadside device), and in an application where averages are needed only infrequently, computing the average each time is probably a better solution. In an application where averages are needed frequently, speed is of the essence, and memory is not an issue, keeping a running average will typically be preferable. The important point is that by accessing the average through a member function (i.e., by encapsulating it), you can interchange these different implementations (as well as any others you might think of), and clients will, at most, only have to recompile. (You can eliminate even that inconvenience by following the techniques described in Item 31.)
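The two strategies can be sketched side by side. This is one possible rendering, not the only one: the namespaces, the std::vector storage, the data member names, the choice to return 0.0 for an empty collection, and the averageOfSample helper are all assumptions made so that both versions fit in one compilable block.

```cpp
#include <cstddef>
#include <vector>

namespace running_average {            // hypothetical namespace
class SpeedDataCollection {
public:
    SpeedDataCollection() : total(0), count(0), average(0.0) {}

    void addValue(int speed) {
        speeds.push_back(speed);
        total += speed;
        ++count;
        average = static_cast<double>(total) / count;  // updated on every insert
    }
    double averageSoFar() const { return average; }    // just returns the cached value

private:
    std::vector<int> speeds;
    long total;            // extra members make each object bigger
    std::size_t count;
    double average;
};
}  // namespace running_average

namespace computed_on_demand {         // hypothetical namespace
class SpeedDataCollection {
public:
    void addValue(int speed) { speeds.push_back(speed); }

    double averageSoFar() const {      // slower: walks the data on every call
        if (speeds.empty()) return 0.0;                // assumption: empty means 0.0
        long total = 0;
        for (std::size_t i = 0; i < speeds.size(); ++i) total += speeds[i];
        return static_cast<double>(total) / speeds.size();
    }

private:
    std::vector<int> speeds;           // no cached members: smaller object
};
}  // namespace computed_on_demand

// Hypothetical client: identical source text works with either implementation.
template <typename Collection>
double averageOfSample() {
    Collection c;
    c.addValue(50);
    c.addValue(60);
    c.addValue(70);
    return c.averageSoFar();
}
```

Because the client code touches only addValue and averageSoFar, swapping one class for the other requires no change to it. Had the running average been a public data member, the second implementation would have had nothing to offer in its place.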

Hiding data members behind functional interfaces can offer all kinds of implementation flexibility. For example, it makes it easy to notify other objects when data members are read or written, to verify class invariants and function pre- and postconditions, to perform synchronization in threaded environments, etc. Programmers coming to C++ from languages like Delphi and C# will recognize such capabilities as the equivalent of "properties" in these other languages, albeit with the need to type an extra set of parentheses.
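As a rough illustration of that flexibility, the sketch below puts three of those capabilities behind one setter: a precondition check, a lock, and a change notification. The SpeedLimit class and everything in it are hypothetical, and the code assumes C++11 or later for std::mutex and std::function.

```cpp
#include <functional>
#include <mutex>
#include <stdexcept>
#include <utility>

class SpeedLimit {                       // hypothetical class
public:
    typedef std::function<void(int)> Listener;

    explicit SpeedLimit(int kph) : kph_(checked(kph)) {}

    int get() const {
        std::lock_guard<std::mutex> lock(mutex_);   // synchronized read
        return kph_;
    }

    void set(int kph) {
        const int value = checked(kph);             // precondition check
        Listener toNotify;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            kph_ = value;
            toNotify = listener_;                   // copy so we can call it unlocked
        }
        if (toNotify) toNotify(value);              // notification on write
    }

    void onChange(Listener listener) {
        std::lock_guard<std::mutex> lock(mutex_);
        listener_ = std::move(listener);
    }

private:
    // Invariant maintained by every write path: kph_ is always positive.
    static int checked(int kph) {
        if (kph <= 0) throw std::invalid_argument("speed limit must be positive");
        return kph;
    }

    mutable std::mutex mutex_;
    int kph_;
    Listener listener_;
};
```

None of this could be added after the fact if kph_ were a public data member, because clients would already be reading and writing it directly.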

The point about encapsulation is more important than it might initially appear. If you hide your data members from your clients (i.e., encapsulate them), you can ensure that class invariants are always maintained, because only member functions can affect them. Furthermore, you reserve the right to change your implementation decisions later. If you don't hide such decisions, you'll soon find that even if you own the source code to a class, your ability to change anything public is extremely restricted, because too much client code will be broken. Public means unencapsulated, and practically speaking, unencapsulated means unchangeable, especially for classes that are widely used. Yet widely used classes are most in need of encapsulation, because they are the ones that can most benefit from the ability to replace one implementation with a better one.

The argument against protected data members is similar. In fact, it's identical, though it may not seem that way at first. The reasoning about syntactic consistency and fine-grained access control is clearly as applicable to protected data as to public, but what about encapsulation? Aren't protected data members more encapsulated than public ones? Practically speaking, the surprising answer is that they are not.

Item 23 explains that something's encapsulation is inversely proportional to the amount of code that might be broken if that something changes. The encapsulatedness of a data member, then, is inversely proportional to the amount of code that might be broken if that data member changes, e.g., if it's removed from the class (possibly in favor of a computation, as in averageSoFar, above).

Suppose we have a public data member, and we eliminate it. How much code might be broken? All the client code that uses it, which is generally an unknowably large amount. Public data members are thus completely unencapsulated. But suppose we have a protected data member, and we eliminate it. How much code might be broken now? All the derived classes that use it, which is, again, typically an unknowably large amount of code. Protected data members are thus as unencapsulated as public ones, because in both cases, if the data members are changed, an unknowably large amount of client code is broken. This is unintuitive, but as experienced library implementers will tell you, it's still true. Once you've declared a data member public or protected and clients have started using it, it's very hard to change anything about that data member. Too much code has to be rewritten, retested, redocumented, or recompiled. From an encapsulation point of view, there are really only two access levels: private (which offers encapsulation) and everything else (which doesn't).
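One way to give derived classes what they need without exposing the data is to keep the member private and offer a protected member function instead. The sketch below assumes hypothetical SpeedLog and SpeedReport classes; only the idea of replacing a stored average with a computation comes from the discussion above.

```cpp
#include <cstddef>
#include <vector>

class SpeedLog {                         // hypothetical base class
public:
    virtual ~SpeedLog() {}
    void addValue(int speed) { speeds.push_back(speed); }

protected:
    // Derived classes depend on this function, not on how the value is stored.
    double average() const {
        if (speeds.empty()) return 0.0;  // assumption: empty log averages to 0.0
        long total = 0;
        for (std::size_t i = 0; i < speeds.size(); ++i) total += speeds[i];
        return static_cast<double>(total) / speeds.size();
    }

private:
    std::vector<int> speeds;             // free to change without touching derived classes
};

class SpeedReport : public SpeedLog {    // hypothetical derived class
public:
    bool exceeds(double limit) const { return average() > limit; }
};
```

If SpeedLog had instead declared a protected double holding the average, SpeedReport::exceeds would read it directly, and removing that member in favor of a computation would break SpeedReport along with every other derived class that did the same.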

Things to Remember

  • Declare data members private. It gives clients syntactically uniform access to data, affords fine-grained access control, allows invariants to be enforced, and offers class authors implementation flexibility.

  • protected is no more encapsulated than public.
