

Last month we saw, among other things, how we can give a struct well defined values by using constructors, and how C++ exceptions aid in error handling. This month we'll look at classes and make a more careful study of object lifetime, especially in the light of exceptions. The stack example from last month will be improved a fair bit too.

A class
The class is the C++ construct for encapsulation. Encapsulation means publishing an interface through which you make things happen, and hiding the implementation and data necessary to do the job. A class is used to hide data, and publish operations on the data, at the same time. Let's look at the "Range" example from last month, but this time make it a class. The only operation that we allowed on the range last month was that of construction, and we left the data visible for anyone to use or abuse. What operations do we want to allow for a Range class? I decide that 4 operations are desirable:

• Construction (same as last month.)
• find lower bound.
• find upper bound.
• ask if a value is within the range.

The first thing to ask when wishing for a function is what it's supposed to do. The second is in what ways things can go wrong when calling it, and what to do when that happens. For the three query functions, I don't see how anything can go wrong, so it's easy. We promise that the functions will not throw C++ exceptions by writing an empty exception specifier. I'll explain this class by simply writing the public interface of it:

    struct BoundsError {};
    class Range
    {
    public:
      Range(int upper_bound = 0, int lower_bound = 0)
        throw (BoundsError);
      // Precondition: upper_bound >= lower_bound
      // Postconditions:
      //   lower == lower_bound
      //   upper == upper_bound

      int lowerBound() throw ();
      int upperBound() throw ();
      int includes(int aValue) throw ();
    private:
      // implementation details.
    };

This means that a class named "Range" is declared to have a constructor, behaving exactly like the constructor for the "Range" struct from last month, and three member functions (also often called methods,) called "lowerBound", "upperBound" and "includes". The keyword "public," on the fourth line from the top, tells that the constructor and the three member functions are reachable by anyone using instances of the Range class. The keyword "private" on the 3rd line from the bottom, says that whatever comes after is a secret to anyone but the "Range" class itself. We'll soon see more of that, but first an example (ignoring error handling) of how to use the "Range" class:

    int main(void)
    {
      Range r(5);

      cout << "r is a range from " << r.lowerBound() << " to "
           << r.upperBound() << endl;
      int i;
      for (;;)
      {
        cout << "Enter a value (0 to stop) :";
        cin >> i;
        if (i == 0)
          break;
        cout << endl << i << " is " << "with"
             << (r.includes(i) ? "in" : "out") << " the range"
             << endl;
      }
      return 0;
    }

A test drive might look like this:

    [d:\cppintro\lesson2]rexample.exe
    r is a range from 0 to 5
    Enter a value (0 to stop) :5
    5 is within the range
    Enter a value (0 to stop) :7
    7 is without the range
    Enter a value (0 to stop) :3
    3 is within the range
    Enter a value (0 to stop) :2
    2 is within the range
    Enter a value (0 to stop) :1
    1 is within the range
    Enter a value (0 to stop) :0

Does this seem understandable? The member functions "lowerBound", "upperBound" and "includes" are, and behave just like, functions that in some way are tied to instances of the class Range. You refer to them just like you do member variables in a struct, but since they're functions, you call them (by using what in C++ lingo is named the function call operator "()".)

Now to look at the magic making this happen by filling in the private part, and writing the implementation:

    struct BoundsError {};
    class Range
    {
    public:
      Range(int upper_bound = 0, int lower_bound = 0)
        throw (BoundsError);
      // Precondition: upper_bound >= lower_bound
      // Postconditions:
      //   lower == lower_bound
      //   upper == upper_bound

      int lowerBound() throw ();
      int upperBound() throw ();
      int includes(int aValue) throw ();
    private:
      int lower;
      int upper;
    };

    Range::Range(int upper_bound, int lower_bound)
      throw (BoundsError)
      : lower(lower_bound),  /***/
        upper(upper_bound)   /***/
    {
      // Preconditions.
      if (upper_bound < lower_bound) throw BoundsError();

      // Postconditions.
      if (lower != lower_bound) throw BoundsError();
      if (upper != upper_bound) throw BoundsError();
    }

    int Range::lowerBound() throw ()
    {
      return lower; /***/
    }

    int Range::upperBound() throw ()
    {
      return upper; /***/
    }

    int Range::includes(int aValue) throw ()
    {
      return aValue >= lower && aValue <= upper; /***/
    }

First, you see that the constructor is identical to that of the struct from last month. This is no coincidence. It does the same thing and constructors are constructors. You also see that "lowerBound", "upperBound" and "includes", look just like normal functions, except for the "Range::" thing. It's the "Range::" that ties the function to the class called Range, just like it is for the constructor.

The lines marked /***/ are a bit special. They make use of the member variables "lower" and "upper." How does this work? To begin with, the member functions are tied to instances of the class. You cannot call any of these member functions without having an instance to call them on, and the member functions use the member variables of that instance. Say for example we use two Range instances, like this:

    Range r1(5,2);
    Range r2(20,10);

Then r1.lowerBound() is 2, r1.upperBound() is 5, r2.lowerBound() is 10 and r2.upperBound() is 20.

So how come the member functions are allowed to use the member data, when it's declared private? Private, in C++, means secret for anyone except whatever belongs to the class itself. In this case, it means it's secret to anyone using the class, but the member functions belong to the class, so they can use it.

So, where is the advantage of doing this, compared to the struct from last month? Hiding data is always a good thing. For example, if we, for whatever reason, find out that it's cleverer to represent ranges as the lower bound, plus the number of valid values between the lower bound and upper bound, we can do this, without anyone knowing or suffering from it.
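For readers who want to try the two-instance example on a current compiler, here is a minimal sketch rather than the article's exact code. It makes a few assumptions: "noexcept" stands in for the empty "throw ()" (dynamic exception specifications were later removed from the language), the query functions are marked "const", and "describe" and "isRejected" are hypothetical helpers added only for illustration.

```cpp
#include <sstream>
#include <string>

struct BoundsError {};

class Range
{
public:
  // Same parameter order as in the article: upper bound first.
  Range(int upper_bound = 0, int lower_bound = 0)
    : lower(lower_bound), upper(upper_bound)
  {
    if (upper_bound < lower_bound) throw BoundsError();
  }
  // noexcept stands in for the article's empty "throw ()" (assumption).
  int lowerBound() const noexcept { return lower; }
  int upperBound() const noexcept { return upper; }
  int includes(int aValue) const noexcept
  {
    return aValue >= lower && aValue <= upper;
  }
private:
  int lower;
  int upper;
};

// Hypothetical helper: formats a range as "lower..upper".
std::string describe(const Range& r)
{
  std::ostringstream os;
  os << r.lowerBound() << ".." << r.upperBound();
  return os.str();
}

// Hypothetical helper: tells whether constructing Range(upper, lower)
// is rejected with BoundsError.
bool isRejected(int upper_bound, int lower_bound)
{
  try
  {
    Range r(upper_bound, lower_bound);
    (void)r;
    return false;
  }
  catch (const BoundsError&)
  {
    return true;
  }
}
```

With these, describe(Range(5, 2)) gives "2..5" and describe(Range(20, 10)) gives "10..20", since each call works on the member variables of the instance it is called on. Writing to "lower" or "upper" from outside the class does not compile, because both are private.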
All we do is to change the private section of the class to:

    private:
      int lower;
      int nrOfValues;

And the implementation of the constructor to:

    Range::Range(int upper_bound, int lower_bound)
      throw (BoundsError)
      : lower(lower_bound),                   /***/
        nrOfValues(upper_bound-lower_bound)   /***/
    ...

And finally the implementations of "upperBound" and "includes" to:

    int Range::upperBound() throw ()
    {
      return lower+nrOfValues;
    }

    int Range::includes(int aValue) throw ()
    {
      return aValue >= lower && aValue <= lower+nrOfValues;

    }

We also have another, and usually more important, benefit, a promise of integrity. Already with the struct, there was a promise that the member variable "upper" would have a value greater than or equal to that of the member variable "lower". How much was that promise worth with the struct? This much:

    Range r(5, 2);
    r.lower = 25; // Oops! Now r.lower > r.upper!!!

Try this with the class. It won't work. The only one allowed to make changes to the member variables are functions belonging to the class, and those we can control.

Destructor
Just as you can control construction of an object by writing constructors, you can control destruction by writing a destructor. A destructor is executed when an instance of an object dies, either by going out of scope, or when removed from the heap with the delete operator. A destructor has the same name as the class, but prepended with the ~ character, and it never accepts any parameters. We can use this to write a simple trace class, that helps us find out the life time of objects.

    #include <iostream.h>

    class Tracer
    {
    public:
      Tracer(const char* tracestring = "too lazy, eh?");
      ~Tracer(); // destructor
    private:
      const char* string;
    };

    Tracer::Tracer(const char* tracestring)
      : string(tracestring)
    {
      cout << "+ " << string << endl;
    }

    Tracer::~Tracer()
    {
      cout << "- " << string << endl;
    }

What this simple class does is to write its own parameter string, prepended with a "+" character, when constructed, and the same string, prepended by a "-" character, when destroyed. Let's toy with it!

    int main(void)
    {
      Tracer t1("t1");
      Tracer t2("t2");
      Tracer t3;

      for (unsigned u = 0; u < 3; ++u)
      {
        Tracer inLoop("inLoop");
      }
      Tracer* tp = 0;
      {
        Tracer t1("Local t1");
        Tracer* t2 = new Tracer("leaky");
        tp = new Tracer("on heap");
      }
      delete tp;
      return 0;
    }

When run, I get this behaviour (and so should you, unless you have a buggy compiler):

    [d:\cppintro\lesson2]tracer.exe
    + t1
    + t2
    + too lazy, eh?
    + inLoop
    - inLoop
    + inLoop
    - inLoop
    + inLoop
    - inLoop
    + Local t1
    + leaky
    + on heap
    - Local t1
    - on heap
    - too lazy, eh?
    - t2
    - t1

What conclusions can be drawn from this? With one exception, the object on heap, objects are destroyed in the reversed order of creation (have a careful look, it's true, and it's always true.) We also see that the object, instantiated with the string "leaky" is never destroyed.

What happens with classes containing classes then? Must be tried, right?

    class SuperTracer
    {
    public:
      SuperTracer(const char* tracestring);
      ~SuperTracer();
    private:
      Tracer t;
    };

    SuperTracer::SuperTracer(const char* tracestring)
      : t(tracestring)
    {
      cout << "SuperTracer(" << tracestring << ")" << endl;
    }

    SuperTracer::~SuperTracer()
    {
      cout << "~SuperTracer" << endl;
    }

    int main(void)
    {
      SuperTracer t1("t1");
      SuperTracer t2("t2");
      return 0;
    }

What's your guess?

    [d:\cppintro\lesson2]stracer.exe
    + t1
    SuperTracer(t1)
    + t2
    SuperTracer(t2)
    ~SuperTracer
    - t2
    ~SuperTracer
    - t1

This means that the contained object ("Tracer") within "SuperTracer" is constructed before the "SuperTracer" object itself is. This is perhaps not very surprising, looking at how the constructor is written, with a call to the "Tracer"

class constructor in the initialiser list. Perhaps a bit surprising is the fact that the "SuperTracer" object's destructor is called before that of the contained "Tracer", but there is a good reason for this. Superficially, the reason might appear to be that of symmetry, destruction always in the reversed order of construction, but it's a bit deeper than that. It's not unlikely that the member data is useful in some way to the destructor, and what if the member data is destroyed when the destructor starts running? At best a destructor would then be totally worthless, but more likely, we'd have serious problems properly destroying our no longer needed objects.

So, the curious wonders, what about C++ exceptions? Now here we get into an interesting subject indeed! Let's look at two alternatives, one where the constructor of "SuperTracer" throws, and one where the destructor throws. We'll control this by an extra integer parameter, zero for throwing in the constructor, and non-zero for throwing in the destructor. Here's the new "SuperTracer" along with an interesting "main" function.

    class SuperTracer
    {
    public:
      SuperTracer(int i, const char* tracestring)
        throw (const char*);
      ~SuperTracer() throw (const char*);
    private:
      Tracer t;
      int destructorThrow;
    };

    SuperTracer::SuperTracer(int i, const char* tracestring)
      throw (const char*)
      : t(tracestring),
        destructorThrow(i)
    {
      cout << "SuperTracer(" << tracestring << ")" << endl;
      if (!destructorThrow)
        throw (const char*)"SuperTracer::SuperTracer";
    }

    SuperTracer::~SuperTracer() throw (const char*)
    {
      cout << "~SuperTracer" << endl;
      if (destructorThrow)
        throw (const char*)"SuperTracer::~SuperTracer";
    }

    int main(void)
    {
      try {
        SuperTracer t1(0, "throw in constructor");
      }
      catch (const char* p) {
        cout << "Caught " << p << endl;
      }
      try {
        SuperTracer t1(1, "throw in destructor");
      }
      catch (const char* p) {
        cout << "Caught " << p << endl;
      }
      try {
        cout << "Let the fun begin" << endl;
        SuperTracer t1(1, "throw in destructor");
        SuperTracer t2(0, "throw in constructor");
      }
      catch (const char* p) {

        cout << "Caught " << p << endl;
      }
      return 0;
    }

Here we can study different bugs in different compilers. Both GCC and VisualAge C++ have theirs. What bugs does your compiler have? Here's the result when running with GCC. Comments about the bug found are below the result:

    [d:\cppintro\lesson2]s2tracer.exe
    + throw in constructor
    SuperTracer(throw in constructor)
    - throw in constructor
    Caught SuperTracer::SuperTracer
    + throw in destructor
    SuperTracer(throw in destructor)
    ~SuperTracer
    Caught SuperTracer::~SuperTracer
    Let the fun begin
    + throw in destructor
    SuperTracer(throw in destructor)
    + throw in constructor
    SuperTracer(throw in constructor)
    - throw in constructor
    ~SuperTracer
    Abnormal program termination
    core dumped

The first 4 lines tell that when an exception is thrown in a constructor, all member variables constructed so far are destroyed, through a call to their destructors, but the destructor for the object itself is never run. Why? Well, after all, how do you destroy something that was never constructed?

The next four lines reveal the GCC bug. As can be seen, the exception is thrown in the destructor, however, the member Tracer variable is not destroyed as it should be (VisualAge C++ handles this one correctly.)

Next we see the interesting case. What happens here is that an object is created that throws on destruction, and then an object is created that throws at once. This means that the first object will be destroyed because an exception is in the air, and when destroyed it will throw another one. The correct result can be seen in the execution above. Program execution must stop, at once, and this is done by a call to the function "terminate". The bug in VisualAge C++ is that it destroys the contained Tracer object before calling terminate.

What's the lesson learned from this? To begin with that it's difficult to find a compiler that correctly handles exceptions thrown in destructors. More important, though, if you throw an exception because an exception is in the air, your program will terminate very quickly. If you have a bleeding edge compiler, you can control this by calling the function "uncaught_exception()" (which tells if an exception is in the air,) and from there decide what to do, but think carefully about the consequences. So, before allowing a destructor to throw exceptions, think *very* carefully.

An improved stack
The stack from last month was in many ways better than a corresponding C implementation, but it was far from adequate. An easy, C-ish way of improving it, though, is to implement it as an abstract data type, where functions push, pop, and whatever's needed is available to the users. The C++ way is, not surprisingly, to write a stack class. Before going into that, however, some thinking is needed regarding what the stack should do.

Minimum for a stack is functionality to push new elements onto it, and to pop the top element from it. The pop function is a classical headache, though, because it both changes the state of the stack (removes the top element from it) and returns whatever was the top element. This behaviour is dangerous in terms of errors, because you can easily lose data. What if something fails while removing the top element? Should you return the top element value? If you do, does that indicate that it has been removed? It's better to make two functions of it, one that returns the top element, and one that removes it. The one that removes it either returns or throws an exception (remember, there's no middle way, either a function fails, or does what it's supposed to do. If it fails, it exits through an exception, otherwise it returns.)

OK, we can see a class that, on the surface, looks something like this:

    class intstack
    {
    public:

      intstack();  // initialise empty stack
      ~intstack(); // free memory by popping all elements
      void push(int aValue);
      void pop();  // remove top element
      int top();   // retrieve value of top element
    private:
      // implementation details
    };

Normally copying and assignment (a = b) would be implemented too, but we'll wait with that until next month, or this article will grow far too long.

Now let's look at what can go wrong in the different operations.

• construction. Nothing really.
• destruction. If the stack is in a bad state, it might be indestructible. Tough one.
• push. Out of memory.
• pop. What if the stack is empty? It mustn't be.
• top. Again, what if the stack is empty?

Since top and pop requires that the stack isn't empty, we must allow the user to check if the stack is empty, otherwise we don't leave them a chance, so another function is needed.

• isEmpty. I don't see how anything can go wrong in here.

So, with the problems identified, let's think about what to do when they occur.

• Out of memory on push. Throw exception and leave stack unchanged.
• top and pop on empty stack. throw exception, stack remains empty.
• invalid stack state in destruction? Can we find out if we have them? I don't think we can, without adding significant control data, that probably increases the likelihood of exactly the kind of errors we want to avoid. I *think* the best solution for this problem is to just be careful with the coding, and hope it doesn't happen.

This leaves us with two different errors: Stack underflow (pop or top on empty stack), and out of memory. We also found, rather easily, the preconditions for operations pop and top (!isEmpty().)

Now to think of post conditions. What's the post conditions for the different operations?

Construction (from nothing): nrOfElements() == 0.
Destruction? Nothing. There's no object left to check the post condition on! We can state a post condition that all memory allocated by the stack object is deallocated, but we can't check it (try to think of a method to do that, and tell me if you find one.)
push(anInt): The stack can't be empty after that (post conditions always reflect successful completion, not failure.) Also top() == anInt.
pop(): Currently no way to say.

This looks fair, but let's change things a bit. Instead of having the method isEmpty() we add the method nrOfElements(). Thus, nrOfElements() will then be one less after pop.

top(): nrOfElements() same after as before.

So, now we can write the public interface of the stack:

    struct bad_alloc {}; // *1*
    struct stack_underflow {};

    class intstack
    {
    public:
      intstack() throw ();
      // Preconditions: -

      ~intstack() throw ();
      // Preconditions: -
      // Postconditions:
      //   The memory allocated by the object is deallocated

      unsigned nrOfElements() throw ();
      // Preconditions: -
      // Postconditions:
      //   nrOfElements() == 0 || top() == old top() // *2*

      void push(int anInt) throw (bad_alloc);
      // Preconditions: -
      // Postconditions:

      //   nrOfElements() > 0
      //   top() == anInt
      // Behaviour on exception:
      //   Stack unchanged.

      void pop(void) throw (stack_underflow);
      // Preconditions:
      //   nrOfElements() > 0
      // Postconditions:
      //   nrOfElements() + 1 == old nrOfElements()
      // Behaviour on exception:
      //   Stack unchanged.

      int top(void) throw (stack_underflow);
      // Preconditions:
      //   nrOfElements() > 0
      // Postconditions:
      //   nrOfElements() == old nrOfElements()
      // Behaviour on exception:
      //   Stack unchanged.
    private:
      intstack& operator=(const intstack& is); // *3*
      intstack(const intstack& is);
      // implementation details
    };

*1*: the structs stack_underflow and bad_alloc are empty, nothing more is needed, we just throw them, and use the struct itself as the information. For really new compilers, the new operator throws a pre-defined class called bad_alloc. If you have such a compiler, remove the declaration of it above, if included.

*2*: This looks odd, perhaps, but what this means is that if there are elements on the stack, the top elements must be the same. Or literally as it says in the code comment, either the stack is empty, or the top elements are equal.

*3*: This is how the assignment operator looks like, and below it, the copy constructor (constructing a new stack by copying the contents of an old one.) I said we wouldn't implement these this month, and ironically that is why they are declared private. The reason is that if you don't declare a copy constructor and assignment operator, the C++ compiler will do it for you, and unfortunately, the compiler generated ones are usually not the ones you'd want. I'll talk more about this next month. By declaring them private, however, copying and assignment is explicitly illegal, so it's not a problem. You'll get used to this reversed looking logic.

The promise to always leave the stack unchanged when exceptions occur means that we must guarantee that whatever internal data structures we're dealing with must always be destructible. This requirement is also implied by our destructor guaranteeing not to throw anything. This is tricky, but it can be done.

So, how do we implement this then? Why not like the one from last month, with the old "stack_element" as a nested struct within the class, but with an additional element counter? I think that's a perfectly reasonable approach. Here comes the complete class declaration.

    struct bad_alloc {}; // *1*
    struct stack_underflow {};
    struct pc_error {}; // used for post condition violations.

    class intstack
    {
    public:
      intstack() throw ();
      // Preconditions: -

      ~intstack() throw ();
      // Preconditions: -
      // Postconditions:
      //   The memory allocated by the object is deallocated

      unsigned nrOfElements() throw (pc_error);
      // Preconditions: -

      // Postconditions:
      //   nrOfElements() == 0 || top() == old top()

      void push(int anInt) throw (bad_alloc, pc_error);
      // Preconditions: -
      // Postconditions:
      //   nrOfElements() == 1 + old nrOfElements()
      //   top() == anInt
      // Behaviour on exception:
      //   Stack unchanged.

      void pop(void) throw (stack_underflow, pc_error);
      // Preconditions:
      //   nrOfElements() > 0
      // Postconditions:
      //   nrOfElements() + 1 == old nrOfElements()
      // Behaviour on exception:
      //   Stack unchanged.

      int top(void) throw (stack_underflow, pc_error);
      // Preconditions:
      //   nrOfElements() > 0
      // Postconditions:
      //   nrOfElements() == old nrOfElements()
      // Behaviour on exception:
      //   Stack unchanged.
    private:
      intstack(const intstack& is); // hidden!!
      intstack& operator=(const intstack& is); // hidden !!

      struct stack_element
      {
        stack_element(int aValue, stack_element* p) throw ()
          : value(aValue), pNext(p) {};
        int value;
        stack_element* pNext;
      };
      stack_element* pTop;
      unsigned elements;
    };

The only peculiarity here is that the constructor for the nested struct "stack_element" is defined in line (i.e. at the point of declaration.) As a rule of thumb, this should be avoided, but it's OK for trivial member functions, like this constructor, which only copies values.

So let's look at the implementation, bit by bit.

    intstack::intstack() throw ()
      : pTop(0),
        elements(0)
    {
      // Preconditions: -
    }

    intstack::~intstack() throw ()
    {
      // Preconditions: -
      // Postconditions:
      //   The memory allocated by the object is deallocated
      while (pTop != 0)
      {
        stack_element* p = pTop->pNext;
        delete pTop; // guaranteed not to throw.
        pTop = p;

      }
    }

These are rather straight forward. The guarantee that "delete pTop" doesn't throw comes from the fact that the destructor for "stack_element" can't throw (which is because we haven't written anything that can throw, and the contents of "stack_element" itself can't throw since it's fundamental data types only.)

    unsigned intstack::nrOfElements() throw (pc_error)
    {
      // Preconditions: -
      return elements;
      // Postconditions:
      //   nrOfElements() == 0 || top() == old top()
      // no need to check anything with this
      // implementation as it's trivially
      // obvious that nothing will change.
    }

Here I admit to being a bit lazy. Strictly speaking, the post condition should be checked, but since all that is done is to return a value, it is obvious that the stack cannot change from this. I leave the post condition, though, as a comment, and an explanation for my laziness, since it's valuable to others reading the sources. It's also valuable if, for some reason, the implementation is changed so that it is not obvious. Then the check should be implemented.

    void intstack::push(int anInt) throw (bad_alloc, pc_error)
    {
      // Preconditions: -
      unsigned old_nrOfElements = nrOfElements();
      stack_element* pOld = pTop;
      stack_element* pTmp = new stack_element(anInt, pTop);
      if (pTmp == 0)
        throw bad_alloc();
      // the above either throws or succeeds. If it throws,
      // the memory is deallocated and we're leaving the
      // function with the exception (before assigning to
      // pTop, so the stack remains unchanged.)
      try {
        pTop = pTmp;
        ++elements;
        // Postconditions:
        //   nrOfElements() == 1 + old nrOfElements()
        //   top() == anInt
        if (nrOfElements() != 1 + old_nrOfElements
            || top() != anInt)
        {
          throw pc_error();
        }
      }
      catch (...) {
        // Behaviour on exception:
        //   Stack unchanged.
        delete pTop; // get rid of the new top element
        pTop = pOld;
        elements = old_nrOfElements;
        throw;
      }
    }

This is not trivial, and also contains some news. Let's start from the beginning. The call to "nrOfElements" could throw "pc_error". This is harmless since we haven't done anything to the stack yet. "old_nrOfElements" is used both for the post condition check that the number of elements after the push is increased by one, and also when restoring the stack should an exception be thrown.

On the next line we store the top of stack as it was before the push. This is used solely for restoring the stack in the case of exceptions. This assignment cannot throw since "pOld" and "pTop" are fundamental types (pointers).

On the next line a new stack element is created on the heap. Here there are three possibilities. Either the creation succeeds as expected, in which case everything is fine, or we're out of memory (the only possible error cause here since the constructor for "stack_element" cannot throw.) For most of you, an out of memory situation will mean that the return value stored in "pTmp" is 0. That case is taken care of on the next two lines. If you have a brand new compiler, on the other hand, operator new itself throws "bad_alloc" when we're out of memory. If so, the exception passes "push" and to the caller since we're not catching it. If you have such a compiler, it'll most probably complain about the next two lines. If it does, just remove them, since they're unnecessary in that case. In either case, if we're out of memory here, "bad_alloc" will be thrown and the stack will be unchanged.

Next we start doing things that change the stack, and since we promise the stack won't be changed in the case of exceptions, things that do change the stack go into a "try" block. Setting the new stack top and incrementing the element counter is not hard to understand. The post condition check is interesting, though. Here we have three situations in which an exception results. The call to "nrOfElements" may throw, the call to "top" may throw, and the post condition check itself might fail, in which case we throw ourselves.

All these three situations are handled in the catch block. "catch (...)" will catch anything thrown from within the try block above. What we do when catching something, is to free the just allocated memory (which won't throw for the same reason as for the destructor.) We also restore the old stack top and the element counter. Thus the stack is restored to the state it had before entering "push", without having leaked memory. Then, what we must do, is to pass the error on to the caller of "push", and that is what the empty "throw;" does. An empty "throw;" means to re throw whatever it was that was caught. A throw like this is only legal within a catch block (use it elsewhere and see your program terminate rather quickly.)

    int intstack::top(void) throw (stack_underflow, pc_error)
    {
      // Preconditions:
      //   nrOfElements() > 0
      if (nrOfElements() == 0 || pTop == 0)
      {
        throw stack_underflow();
      }
      return pTop->value;
      // Postconditions:
      //   nrOfElements() == old nrOfElements()
      // No need to check with this implementation!
    }

This is not so difficult. If we have no elements on the stack, we throw, otherwise return the top value. As with "nrOfElements", I'm lazy with the post condition check, but careful to document the behaviour should the implementation for some reason change into something less obvious.

    void intstack::pop(void) throw (stack_underflow, pc_error)
    {
      // Preconditions:
      //   nrOfElements() > 0
      if (nrOfElements() == 0 || pTop == 0)
      {
        throw stack_underflow();
      }
      unsigned old_elements = nrOfElements();
      stack_element* pOld = pTop;

      try {
        pTop = pTop->pNext;
        --elements;
        // Postconditions:
        //   nrOfElements() + 1 == old nrOfElements()
        if (nrOfElements() + 1 != old_elements)
        {
          throw pc_error();
        }
      }
      catch (...) {
        // Behaviour on exception:
        //   Stack unchanged.
        elements = old_elements;
        pTop = pOld;
        throw;
      }
      delete pOld; // guaranteed not to throw.
    }

The exception protection of "pop" works almost exactly the same way as with "push". The thing worth mentioning here, though, is why "delete pOld" is located after the "catch" block and not within the "try" block. Suppose the deletion did, despite its promise, throw something. If it did, it would be caught, and the top of stack would be left to point to something that has been deleted. As it is now, if it breaks its promise, we too break our promise not to alter the stack when leaving on an exception, but we at least make sure the stack is in a usable (and most notably, destructible) state.

After having spent this much time on writing this class, it's time to have a little fun and play with it, don't you think?

    #include <iostream.h>

    int main(void)
    {
      try {
        cout << "Constructing empty stack is1" << endl;
        intstack is1;
        cout << "is1.nrOfElements() = " << is1.nrOfElements() << endl;

        cout << "is1.push(5)" << endl;
        is1.push(5);
        cout << "is1.top() = " << is1.top() << endl;
        cout << "is1.nrOfElements() = " << is1.nrOfElements() << endl;

        cout << "is1.push(32)" << endl;
        is1.push(32);
        cout << "is1.top() = " << is1.top() << endl;
        cout << "is1.nrOfElements() = " << is1.nrOfElements() << endl;

        cout << "is1.pop()" << endl;
        is1.pop();
        cout << "is1.top() = " << is1.top() << endl;
        cout << "is1.nrOfElements() = " << is1.nrOfElements() << endl;

        cout << "is1.pop()" << endl;
        is1.pop();
        cout << "is1.nrOfElements() = " << is1.nrOfElements() << endl;

      }
      catch (bad_alloc&) {
        cout << "Out of memory" << endl;
      }
      catch (stack_underflow&) {
        cout << "Stack underflow" << endl;
      }
      catch (pc_error&) {
        cout << "Post condition violation" << endl;
      }
      return 0;
    }

I'm staying within the limits of the allowed here, to ensure that the program won't die all of a sudden, but please make changes to the test program, and the stack implementation, to break the rules and see what happens. It should either work, or say why it fails.

Um, yes, by the way, on the catch clauses I don't bind the exception instance caught to any named parameter. The reason is simply that I don't use it. The knowledge that something of that type has been caught is, in this case, enough.

Now I will break a promise from last month. I won't go into more details with references, because this is where I end this month's lesson. That'll be dealt with later.

Recap
Again a month filled with news.

• You have seen how classes can be used to encapsulate internals and publish interfaces with the aid of the access specifiers "public:" and "private:"
• Member functions are always called together with instances (objects) of the class, and thus always have access to the member data of the class.
• A member function can access private parts of a class.
• Destruction of objects is done in the reversed order of construction, except when the objects are allocated on the heap, in which case they're destroyed when the delete operator is called for a pointer to them.
• We have seen that throwing exceptions in destructors can be lethal. (This is not to say that it should never ever be done, but that a lot of thought is required before doing so.)
• You can now iterate your way to a good design by thinking of
  1. What the function should do.
  2. What can go wrong.
  3. What should happen when something goes wrong.
  4. How can a user of the class prevent things from going wrong.
  When you have satisfactory answers to all four questions for all functionality of your class, you have a safe design.
• You have seen how it is possible to, by carefully crafting your code, make your member functions "exception safe" without being bloated with too many special cases ("catch(...)" and "throw;" helps considerably here.)

Exercise
1. Something that is badly missing in the stack implementation above, is a good check for the integrity of the stack object itself. For example, what if we somehow manage to get "elements" to non-zero, while "pTop" is 0? That's a terrible error that must not occur, and if it does, it must not go undetected. What I'd like you to do, is to see what kind of "internal state" tests that can be done, and to implement them. Please discuss your ideas with me over e-mail (this month, however, please don't feel ignored, if I take a long time in responding. I'll be net-less most of August.)
2. It's generally considered to be a bad idea to have public data in a class. Can you think of why? Mail me your reasons.

Next
Next month there will most probably be a break, since I'll be on a well needed vacation. After that, we'll have a look at copy construction and assignment (together with a C++ idiom often referred to as the "orthodox canonical form.") I promise to explain the references in more detail too.

Please, Write me! I want opinions and questions. If you think I'm wrong about things, going too fast, too slow, teaching the wrong things, whatever, tell me, ask me. I need your constant feedback.

[NOTE: Here is a link to a zip of the source code for this article. Ed.]

So far we've covered error handling through exceptions, and encapsulation with classes. The feedback from part 2 tells me I forgot a rather fundamental thing: what exactly is encapsulation, what should we make classes of, what's the meaning of a class? I will get to this, but first let's finally have a look at the promised references.

References

C++ introduces an indirect kind of type that C does not have, the reference. A reference is in itself not a type, it always is a reference of some type, just like arrays are arrays of something and pointers are pointers to something. A reference is a means of indirection. It's a way of reaching another variable. This may sound a lot like a pointer, but don't confuse the two, they're very different. See a reference more as an alias for another variable. References are denoted with an unary "&", just the same way as pointers are denoted with an unary "*". Let's have a look at an example:

    int main(void)
    {
      int i = 0;
      int& r = i; // r now refers to i.
      ++r;        // now i == 1, r still refers to i.
      if (&r != &i) throw "Broken compiler";
      return 0;
    }

Some details about references:

• References must be initialized to refer to something. Writing just "int& x;" is an error, since it would be an unbound reference. There is no such thing as a 0 reference.
• Once bound to a variable there is no way to make the reference refer to something else.
• There is no such thing as "reference arithmetic."
• You cannot get the address of a reference. You can try to, but what you get is the address of the variable referred to.

From this, one may wonder, what on earth are references for? Why would anyone want them? Well, for one, they offer a certain security over pointers; it's so easy to get a pointer referring to something that doesn't exist, or something else than the intended. They're also handy as a short-cut into long nested structs and arrays. Here's an example:

    struct A {
      int b[5];
      int x;
    };

    struct C {
      A* p[10];
      int q;
      char d;
    };

    C* pc;

    // and somewhere else, pc is given a value and is
    // here used.

    A& ra = *pc->p[2];
    ra.b[3]=5; // pc->p[2]->b[3] = 5;
    ra.b[4]=2; // pc->p[2]->b[4] = 2;

The reference in this case just makes life easier.

In parts 1 and 2, I used references when catching exceptions. If we look at the exceptions, it means that instead of getting a local copy of the thing thrown, we get a reference to the thing thrown, and we can manipulate it if we want to, instead of manipulating our own copy. The same goes for parameters to functions.

References are also often used for parameters to functions. Passing an object by reference instead of by value can sometimes be necessary, and sometimes beneficial in terms of performance. The reason for the performance benefit is that when passing a parameter by value, the object is copied; the function uses a local object of its own, that is identical to the one you passed. If copying the object is an expensive operation, which in some cases it is, then passing by reference is cheaper. However, passing parameters by reference can be dangerous. When you do, the function has access to the very object you pass, and if the function modifies it, the caller better be prepared for that.
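To make the difference between passing by value and passing by reference concrete, here is a minimal sketch. It is written for a present-day compiler, and the function names ("incrementCopy", "incrementInPlace", "refersTo", "demoPassing") are made up for this illustration only; they do not appear elsewhere in the course.

```cpp
#include <cassert>

// Hypothetical helpers, named only for this illustration.

// Receives a copy: changes stay local to the function.
int incrementCopy(int value)
{
    ++value;
    return value;
}

// Receives a reference: changes reach the caller's variable.
void incrementInPlace(int& value)
{
    ++value;
}

// Returns true if the reference is an alias for the given variable,
// using the fact that taking the address of a reference yields the
// address of the variable referred to.
bool refersTo(const int& ref, const int& target)
{
    return &ref == &target;
}

// Small demonstration: returns the caller's value after both calls.
int demoPassing()
{
    int i = 0;
    incrementCopy(i);    // i is still 0, only the local copy changed
    incrementInPlace(i); // i is now 1, the caller's variable changed
    return i;
}
```

The caller of "incrementInPlace" has to be prepared for its variable changing, which is exactly the danger described above.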

A commonly used way around this is to declare the parameter as a "const" reference. This means that you get a reference, as before, but the reference is treated as if it was a constant, and because of this all attempts to change its value will cause a compile time error. Here's an example of passing a parameter by "const" reference. It uses the "intstack" from the previous lesson:

    // put the declaration of intstack here.

    void workWithStack(const intstack& is)
    {
      // work with is
      is.pop(); // Error, attempting to alter
                // const reference.
    }

    int main(void)
    {
      intstack i;
      i.push(5);
      workWithStack(i);
      return 0;
    }

Since the "intstack" class does not have a copy constructor (it was declared private, remember?) it is impossible to pass instances of it to functions in other ways than by reference (or by pointer.)

There are situations when a reference is dangerous, though. One such trap, that I think all C++ programmers fall into at least once, is returning a reference to a local variable. Have a look at this:

    #include <iostream.h>

    int& ir(void) // function returning reference to int
    {
      int i = 5;
      return i;
    }

    int main(void)
    {
      cout << ir() << endl;
      return 0;
    }

What will this program print? It's hard to tell, if it prints at all. It might just crash. It could be anything. Why? The function "ir" returns a reference to an "int", that the "main" function prints. So far so good, but what does the reference returned refer to? It refers to the local variable "i" in "ir". If you remember the "tracer" examples from the previous lesson, you remember that the variable ceases to exist when exiting the function. In other words, the reference returned refers to a variable that no longer exists! Don't do this! Or rather, do it now, at once, just to have your one time mistake over with :-)

What's a class?

Now for the theoretical biggie. What, exactly, is the meaning of a class? When should you write a class, what should the class allow you to do, and what's a good name for a class? What's the relation between classes and objects?

When you write programs in Object Oriented Programming Languages, be it C++, Objective-C, SmallTalk, Eiffel, Modula-3, Java or whatever, you write classes. A class is, as I mentioned in part 2, a method of encapsulation, but more importantly, a class is a type. When you define a class, you add a new type to the language. C++ comes with a set of built in types like "int", "unsigned" and "double". In the previous lesson, when we wrote the class "intstack", the stack of integers, we introduced a new type to the language, which programs could use.

The member functions of the class describe the semantics of the type. With the built in integral types, we have operations like adding two instances of the type, yielding a third instance, whose value happens to be the sum of the values of the other two. We can increment the value of instances of the type with operations like ++, and so on. With the "intstack", we had the operations "pop", "top", "push" and "nrOfElements", in addition to well defined construction and destruction of instances.

So, how can you know what classes to make? Classes are, as a rule of thumb, descriptions of ideas. "Bicycle" for example, is a classic example of a class. The idea "Bicycle" that is, not my particular bicycle. My bicycle is a physical entity that is currently getting wet in the rain.

The idea of bicycle is a very good candidate for a class. What my bicycle is, on the other hand, is a good candidate for an instance of the class "Bicycle."

So, if you can say "The X...", "An X...", or "A kind of X...", when thinking of the problem you want to solve, you might have a good candidate for a class X. Usually instances of the class have a state (for example, the state of a stack is the elements in it, and their order.) Having state means that the same member function can give different results depending on what has been done to the object before calling the member function (again, with a stack, the value returned by "top()" or "nrOfElements()" depends on the history of "push()" and "pop()" calls.) The class has member data to represent state.

Given this, is a mathematical function a class that you'd like to have instances of to toy with in your program? According to the rule, the answer would, in general, be no, since a mathematical function is state less. In most situations, it is not. There are, however, exceptions to this rule of thumb. If you design a program for use by electronics engineers when designing their gadgets, you better have amplifiers (multiplication,) adders, subtractors and so on, or they won't use your program.

So, what member functions should a class have? This is even harder to say, because there are so many ways to solve every problem. Again, the things that you need to do, when solving your problem, must in one way or the other be expressible through the classes. If I need to ride my bicycle, it can be that the class "Bicycle" should have the member function "beRiddenBy" accepting an instance of class "Human", but it might also be that class "Human" should have the member function "ride" accepting an instance of class "Bicycle" as its parameter. If the starting point or destination are important, they probably should be parameters to the member functions. If the road itself is important, you probably need a class "Road", which you want to pass an instance of to the member function of either "Bicycle::beRiddenBy" or "Human::ride".

Objects are run-time entities. Note that objects don't exist when you write your program. When you write your program, what exists are types, descriptions of how instances of types can be used, and descriptions of semantics and state representation. A class represents the idea, like "Bicycle" or "intstack", and the functions that represent the semantics. When your program executes, the identifiers (like "pc" in the reference example above) are replaced by bit-patterns representing objects. The objects are the instances of types (yes, they need not be instances of classes, an instance of type "float" is also an object.)

Given this, you might start to feel like someone's been fooling you. This Object Oriented Programming thing is a hoax! What it's all about, after all, is class oriented programming. The objects are, then, just the run time instances of the classes.

The Orthodox Canonical Form

The basic operations you should, in general, be able to do with objects of any class are construction from scratch, construction by copying another instance, assignment and destruction. The "intstack" guaranteed that no matter what happened, an instance was always destructible. The Orthodox Canonical Form poses the additional requirement that an instance must always be copyable. This places a slightly heavier burden on us, though, compared to the work with the "intstack." Normally this extra burden is light, but there are tricky cases.

Construction from scratch we've seen. Construction by copying is done by the copy constructor. Given a class named C, the copy constructor looks like this:

    class C
    {
    public:
      C(const C& c); // copy constructor
      // other necessary member functions.
    };

The job of the copy constructor is to create an object that is identical to another object. It is your job to make sure it does this. This does not mean that every member variable of the newly constructed object must have values identical to the ones in the original. On the contrary, they often differ. What's important, however, is that they're semantically identical (i.e. given the same input to member functions, they give the same response.) The "intstack" for example must make its own copy of the stack representation in the copy constructor. This means that the base pointer will differ, but as far as you can see through the "push", "pop" and "top" member functions, there is no difference between the copy and the original.

Next in line is copy assignment. Again, given a class C, the copy assignment operator looks like this:

    class C
    {
    public:
      C& operator=(const C&);
      // other necessary member functions.
    };
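Since the two operations are easy to mix up, here is a small sketch of when the compiler picks the copy constructor and when it picks the copy assignment operator. The class "Counted" and the function "demoCopyVersusAssign" are hypothetical, invented for this illustration, and the code assumes a present-day compiler.

```cpp
#include <cassert>

// Hypothetical class, used only to show which operation the compiler picks.
struct Counted
{
    static int copies;       // number of copy constructions so far
    static int assignments;  // number of copy assignments so far
    int value;

    explicit Counted(int v) : value(v) {}

    // Copy constructor: creates a new object from an existing one.
    Counted(const Counted& other) : value(other.value) { ++copies; }

    // Copy assignment: overwrites an object that already exists.
    Counted& operator=(const Counted& other)
    {
        value = other.value;
        ++assignments;
        return *this;
    }
};

int Counted::copies = 0;
int Counted::assignments = 0;

void demoCopyVersusAssign()
{
    Counted a(1);
    Counted b(a);   // new object: copy constructor
    Counted c = a;  // still a new object: copy constructor, despite the "="
    Counted d(7);
    d = a;          // d already exists: copy assignment operator
    (void)b;        // silence "unused variable" warnings
    (void)c;
}
```

The thing to notice is that "Counted c = a;" is a construction, not an assignment. Only "d = a;", where "d" already exists, goes through "operator=".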

Writing the copy assignment operator is more difficult than writing the copy constructor. Not only does the copy assignment operator need to make the object equal to its parameter, it also needs to cleanly get rid of whatever resources it might have held when being called (The copy constructor does not have this problem since it creates a new object that cannot have held any data since before.)

The return value of an assignment operator is (by tradition, not by necessity) a reference to the object just assigned to. When inside a member function (the assignment operator as defined above is a member function) the object can be reached through a pointer named "this," which is a pointer to the class type. For the class C, above, the type of "this" is "C* const". This means that it's a pointer to type C, and the pointer itself is a constant (i.e. you cannot make "this" point to anything else than the current instance.) The reference to the object is obtained by dereferencing the "this" pointer, so the last statement of an assignment operator is almost always "return *this;"

The difficulty of writing a good copy constructor and copy assignment operator is best shown through a classical error:

    class bad
    {
    public:
      bad(void);                  // default constructor
      bad(const bad&);            // copy constructor
      ~bad(void);                 // destructor
      bad& operator=(const bad&); // copy assignment
    private:
      int* pi;
    };

    bad::bad(void)
      : pi(new int(5)) // allocate new int on heap and
                       // give it the value 5.
    {
    }

    bad::bad(const bad& b)
      : pi(b.pi) // initialize pi with the value of b's pi
    {            // This is very bad, as you will soon see
    }

    bad::~bad(void)
    {
      delete pi;
    }

    bad& bad::operator=(const bad& b)
    {
      pi = b.pi; // This seemingly logical and simple
                 // assignment is also disastrous.
      return *this;
    }

    int main(void)
    {
      bad b1;
      {
        bad b2(b1); // b2.pi is now the same as b1.pi.
      } // Here b2 is destroyed, and b2's destructor is
        // called. This means that the memory area
        // pointed to by b2.pi (and hence also b1.pi) is
        // no longer valid
      bad b3(b1); // Make b3.pi point to the same invalid
                  // memory area!
      bad b4;
      bad b5;
      b5 = b4; // The memory allocated by b5 was never
               // deallocated. We have a memory leak!
      return 0;
    } // The destructors of b4 and b5 both attempt
      // to deallocate the same memory (b5 first, and then b4,
      // which deallocates already deallocated memory.)
      // The destructors of b1 and b3 attempt to deallocate
      // the memory already deallocated by the destructor
      // of b2.

OK, so from the example it is pretty clear that it's more work than this. The copy constructor should allocate its own memory, and initialise that memory with the same value as that pointed to by the original. This goes for the copy assignment operator too, but it also needs to discard the pointer it already had. By doing so, we guarantee that the pointers owned by the objects are truly theirs, and their destructor can safely deallocate them. A version of the program fixing the above issues can show you what is meant by that:

    // class declaration, default constructor and
    // destructor are identical and because of that not
    // shown here.

    bad::bad(const bad& b)
      : pi(new int(*b.pi)) // initialize pi as a new int
                           // with the value of b's pi
    { // This guarantees that both the new object and the
    } // original are destructible.

    bad& bad::operator=(const bad& b)
    {
      delete pi;           // No more memory leak
      pi = new int(*b.pi); // Get a new pointer and
                           // initialise just as in
                           // the copy constructor.
      return *this;
    }

    // Can you spot the problem with this one?

    int main(void)
    {
      bad b1;
      {
        bad b2(b1); // b2.pi now points to its own area.
      } // Here b2 is destroyed, and b2's destructor is
        // called. This means that the memory area
        // pointed to by b2.pi is no longer valid,
        // correctly so. b1.pi is still valid, though.
      bad b3;  // Allocate new memory
      b3=b1;   // Deallocate, and allocate new again
      b3=b3;   // Whoa!! b3.pi is first deallocated, then
               // b3.pi is allocated again and initialised
               // with the value no longer available!!!
      return 0;
    } // all OK, all objects refer to their own
      // memory, so deallocation is not a
      // problem.

We do, however, have yet a problem to deal with, that of self assignment. OK, so assigning an object to itself is perhaps not the most frequently done operation in a program, but that doesn't mean it's allowed to crash, right? So, how can we make the copy assignment operator safe from self assignment? Here are two alternatives:

    bad& bad::operator=(const bad& b)
    {
      if (pi != b.pi) {
        delete pi;
        pi = new int(*b.pi);
      }
      return *this;
    }

    bad& bad::operator=(const bad& b)
    {
      if (this != &b) {
        delete pi;
        pi = new int(*b.pi);
      }
      return *this;
    }

Common to both is that they check if the right hand side (parameter b) is the same object. If it is, the assignment is simply not done. The first alternative does this by comparing the "pi" pointer. This is of course hard, because normally classes have more than one member variable to check for. The second does it by comparing the pointers to the objects themselves. The latter perhaps feels a bit harder to understand, but it's actually the one most frequently seen.

With these changes done, the class deserves a name change. It is no longer bad.

One last thing before wrapping up. In the previous lesson, the copy constructor and copy assignment operator were declared private, to prevent copying and assignment. The reason is that a C++ compiler automatically generates a copy constructor and copy assignment operator for you if you don't declare them. The auto-generated copy constructor and assignment operator, however, will just copy/assign the member variables, one by one. In some cases this is perfectly OK. The "Range" class from the previous lesson, for example, does fine with this auto-generated copy constructor and copy assignment operator. The "intstack" on the other hand does not, since then both the original and the copy would share the same representation (and have exactly the same problem as described in the above "bad" example!)

Note that if your class only has member variables of types for which copying the values does not lead to problems, the tests above are not necessary. If you decide that for your class, the auto generated copy constructor and/or copy assignment operator is OK, leave a comment in the class declaration saying so, so that readers of the source code know what you're thinking. Otherwise they might easily think you've simply forgotten to write them.

Const Correctness

When talking about passing parameters to functions by reference, I mentioned the const reference as a way to ensure that the parameter won't get modified, since the const reference treats whatever it refers to as a constant and thus won't allow you to do things that would modify it. The question is, how does the compiler know if something you do to an object will modify it? Does "pop" modify the "intstack?" Yes, it does. It removes the top element. Does "top" modify the stack? No. So, it should be allowed to call "top" for a constant stack, right? The problem is that the compiler doesn't know which member functions modify the objects, and which don't (and assumes they do, just to be on the safe side) unless you tell it differently. Since, by default, a member function is assumed to alter the object, you are, by default, not allowed to do anything at all to a constant object. Fortunately we can tell the compiler differently. We can change "top" to be declared as follows:

    class intstack
    {
    public:
      // misc member functions
      int top(void) const throw(stack_underflow,pc_error);
      // misc other member functions
    };

It's the word "const" after the parameter list that tells the compiler that this member function will not modify the object and can safely be called for constant objects.

As a matter of fact, now when we know about references, we can do even better by writing two member functions "top", one "const" and one not, with the non-const version returning a non-const reference to the element instead. This has a tremendous advantage: For constant stack objects, we can get the value of the top element; for non-constant stack objects, we can alter the value of the top element by writing like this:

    intstack is;
    is.push(5);   // top is now 5.
    is.top() = 3; // change value of top element!

There is no magic involved in this. Just as I mentioned in part one, functions can be overloaded if their parameter lists differ. Member functions can be overloaded on "constness." The "const" member function is called for constant objects (or const references or pointers, since they both treat the object referred to as if it was a constant.) The non-const member function is called for non-constant objects.
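The overloading on "constness" can be tried out in isolation with a sketch like the one below. The class "Holder" is a hypothetical one-element stand-in, not the "intstack" of this course, and the helper functions "readTop" and "writeTop" are likewise invented for the illustration. The exception specifications are left out to keep the sketch short and acceptable to present-day compilers.

```cpp
#include <cassert>

// Hypothetical one-element holder standing in for the stack.
class Holder
{
public:
    explicit Holder(int v) : value_(v) {}

    // Chosen for const objects and const references: returns a copy.
    int top() const { return value_; }

    // Chosen for non-const objects: returns a reference to the element.
    int& top() { return value_; }

private:
    int value_;
};

// Reads through a const reference, so the const overload is used.
int readTop(const Holder& h)
{
    return h.top();
}

// Writes through a non-const reference, so the non-const overload is used.
void writeTop(Holder& h, int v)
{
    h.top() = v;
}
```

Trying to write "h.top() = v;" inside "readTop" would not compile, since the const overload returns a plain value and not a reference to the element.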

Note that it is only member functions you can do this "const" overloading on. You cannot declare non-member functions "const." Our overloaded "top" member functions can be declared like this:

    class intstack
    {
    public:
      // misc member functions
      int top(void) const throw(stack_underflow,pc_error);
      int& top(void) throw (stack_underflow, pc_error);
      // misc other member functions
    };

This is getting too much without concrete examples. Here's a version of "intstack" with copy constructor, copy assignment operator, const version of "top" and "nrOfElements", and a non-const version of "top" (just as above.) Only the new and changed functions are included here. You'll find a zip file with the complete sources at the top.

    class intstack
    {
    public:
      // the previous member functions

      intstack(const intstack& is) throw (bad_alloc);
      // Preconditions:

      intstack& operator=(const intstack& is) throw (bad_alloc);
      // Preconditions:

      unsigned nrOfElements() const throw (pc_error);
      // Preconditions:
      // Postconditions:
      //   nrOfElements() == 0 || top() == old top()

      int top(void) const throw(stack_underflow,pc_error);
      // Preconditions:
      //   nrOfElements() > 0
      // Postconditions:
      //   nrOfElements() == old nrOfElements()
      // Behaviour on exception:
      //   Stack unchanged.

      int& top(void) throw (stack_underflow, pc_error);
      // Preconditions:
      //   nrOfElements() > 0
      // Postconditions:
      //   nrOfElements() == old nrOfElements()
      // Behaviour on exception:
      //   Stack unchanged.

    private:
      // helper functions for copy constructor, copy
      // assignment and destructor.
      stack_element* copy(void) const throw (bad_alloc);
      void destroyAll(void) throw();
    };

Since copying elements of a stack is the same when doing copy assignment and copy construction, I have a private helper function that does the job. The same goes for deallocation of the stack. It is needed both in copy assignment and destructor. This is not necessary by any means, but it means I won't have identical code in two places. After all, if ever you need to change the code, you can bet you'll forget to update one of them otherwise, and you have a subtle bug that may be hard to find. With only one place to update, that mistake is hard to make.

Since these helper functions "copy" and "destroyAll" are purely intended as an aid when implementing copy assignment, copy constructor and destructor, they're declared private. Just as a private member variable can only be accessed from the member functions of a class, and not by anyone else, member functions declared private can only be accessed from member functions of the same class.

They have nothing what so ever to do with how the stack works, just how it's implemented.

Here comes the new implementation of "nrOfElements." Can you see what's different from the previous lesson?

    unsigned intstack::nrOfElements() const throw (pc_error)
    {
      // Preconditions:

      return elements;

      // Postconditions:
      //   nrOfElements() == 0 || top() == old top()
      // no need to check anything with this
      // implementation as it's trivially
      // obvious that nothing will change.
    }

There isn't anything at all that differs from the previous version of "nrOfElements", other than that it's declared to be "const." Had we, in this member function (or any other member function declared as "const") attempted to modify any member variable, the compiler would give an error, saying that we're attempting to break our promise not to modify the object. "const" methods are thus good also as a way of preventing you from making mistakes, in addition to making those member functions callable for constant objects (or constant references or pointers.) Whenever you have a member function that does not modify any member variable, declare it "const" so that errors can be caught by the compiler. It saves you debug time. Note that declaring a member function "const" does not mean it's only for constant objects, it just means that it's callable on constant objects too.

Next in turn is "top", or rather the two versions of "top":

    int intstack::top(void) const throw (stack_underflow, pc_error)
    {
      // Preconditions:
      //   nrOfElements() > 0
      if (nrOfElements() == 0 || pTop == 0) {
        throw stack_underflow();
      }
      return pTop->value;

      // Postconditions:
      //   nrOfElements() == old nrOfElements()
      // No need to check with this implementation!
    }

    int& intstack::top(void) throw (stack_underflow, pc_error)
    {
      // Preconditions:
      //   nrOfElements() > 0
      if (nrOfElements() == 0 || pTop == 0) {
        throw stack_underflow();
      }
      return pTop->value;

      // Postconditions:
      //   nrOfElements() == old nrOfElements()
      // No need to check with this implementation!
    }

As can be seen, not much differs between the two variants of "top." The implementation is in fact identical for both, but the first one returns a value and is declared const, the other one is not declared const and returns a reference.

So why do we have two identical implementations here, when I earlier mentioned that this is always undesirable? The reason is simply that although the implementation is identical, neither can be expressed in terms of the other. The non-const version cannot be implemented with the aid of the const version, since we'd then return a reference to a local value. This is always bad, does not have the desired effect, and is quite likely to cause unpredictable run-time behaviour. The "const" version could be implemented in terms of the non-const version, if it wasn't for the fact that the non-const version is not declared "const." The implementation of a const member function is not allowed to alter the object, and is, as a consequence of this, not allowed to call non-const member functions for the same object.

Remember that a reference really isn't an object on its own? You cannot distinguish it in any way from the object it refers to. In this case it means that what's returned from the non-const version of "top" is the top element itself, not a local copy of it. Since it is the element itself, it can be modified. Note that there is a danger in this too: What about this example?

    intstack is;
    is.push(45);
    int& i=is.top();  // i now refers to the top element
    i=32;             // modify top element.
    int val=is.top(); // val is 32.
    is.pop();
    // what does i refer to now?
    i=45;             // what happens here?

The answer to the last two questions is that "i" refers to a variable that no longer exists and that when assigning to it, or getting a value from it, anything can happen. If you're lucky, your program will crash right away, if you're out of luck, it'll start behaving randomly erratically!

Now for the copy constructor. With the help of the "copy" member function, it's really simple!

    intstack::intstack(const intstack& i) throw (bad_alloc)
      : pTop(i.copy()),
        elements(i.elements)
    {
      // Preconditions:
      // Postconditions:
    }

The "pTop" member of the instance being created is initialized with the value from "i.copy()". The "copy" helper function creates a new copy of i's representation ("pTop" and whatever it points to) on the heap and returns the pointer to its base. If "i" is an empty stack, "copy" returns 0. If we run out of memory when "copy" is working, whatever was allocated will be deallocated, and "bad_alloc" thrown. Since we're not catching "bad_alloc", it'll flow off to the caller. It also means that "bad_alloc" will be thrown before "pTop" is initialized, and thus the new object will never be constructed.

The copy assignment operator is a little bit trickier.

    intstack& intstack::operator=(const intstack& i) throw (bad_alloc)
    {
      if (this != &i) {
        stack_element* pTmp = i.copy(); // can throw bad_alloc
        destroyAll(); // guaranteed not to throw!
        pTop=pTmp;
        elements=i.elements;
      }
      return *this;
    }

Seemingly simple, and yet both efficient and exception safe. The difficulty lies in being careful with the order in which to do things. Here a temporary pointer "pTmp" is first set to refer to the copy of "i's" representation. If the copying fails, and "bad_alloc" is thrown, the member variables are not altered, and the object remains unchanged. Since "bad_alloc" is not caught in the function, it flows out of the function as intended. If copying is successful, we can safely destroy whatever we have and then change the "pTop" member variable. Since we've promised that "destroyAll" won't throw anything (a promise we could make, since we've promised that our destructors won't throw) the rest is guaranteed to work.

This is very important from an exception handling point of view. Suppose we first destroyed the contents and then tried to get a copy, but the copying threw "bad_alloc." In this case, our own "pTop" would point to something illegal, and thus our promise to always stay destructible would be broken. Again, first getting the copy is essential. Try to leave objects in an unaltered state in the presence of exceptions (whenever possible,) and always leave them destructible, and copyable whenever resources allow.

Also, since we first get a local copy of the object assigning from, and after that destroy our own representation, the self assignment guard ("if (this != &i)") is not necessary. It's a pure performance boost by making sure we do nothing at all instead of duplicating the representation, just to destroy the original.

With the aid of the "destroyAll" helper function, the destructor becomes trivial:

    intstack::~intstack(void)
    {
      destroyAll();
    }

So, how is this magic "destroyAll" helper function implemented? It's actually identical with the old version of the destructor.

    void intstack::destroyAll(void) throw ()
    {
      while (pTop != 0) {
        stack_element* p = pTop->pNext;
        delete pTop; // guaranteed not to throw.
        pTop = p;
      }
    }

Now the only thing yet untold is how the helper function "copy" is implemented. It's the by far trickiest function of them all.

    intstack::stack_element* intstack::copy(void) const throw (bad_alloc)
    {
      stack_element* pFirst = 0; // used in catch block.
      stack_element* pPrevious = 0;
      try {
        stack_element* p = pTop;
        if (p != 0) {
          // take care of first element here.
          pFirst = new stack_element(p->value,0);
          // Cannot throw anything except bad_alloc
          if (pFirst == 0) //**1
            throw bad_alloc();
          pPrevious = pFirst;

          // Here we take care of the remaining elements.
          while ((p = p->pNext) != 0) //**2
          {
            pPrevious->pNext = new stack_element(p->value,0);
            // cannot throw except bad_alloc
            pPrevious = pPrevious->pNext;
            if (pPrevious == 0) //**1
              throw bad_alloc();
          }
        }
        return pFirst;
      }
      catch (...) // If anything went wrong, deallocate all
      {           // and rethrow!
        while (pFirst != 0) {
          stack_element* pTmp = pFirst->pNext;
          delete pFirst; // guaranteed not to throw.
          pFirst = pTmp;
        }
        throw;
      }
    }

To begin with, the return type is "intstack::stack_element*". The type "stack_element" is only known within "intstack," so whenever used outside of "intstack" it must be explicitly stated that it is the "stack_element" type that is defined in "intstack." As long as we're "in the header" of a member function, nested types must be explicitly stated. Well within the function, it is no longer needed, since it is then known what class the type belongs to.

The whole copying is in a "try" block, so we can deallocate things if something goes wrong. The local variable "pFirst", used to point to the first element of the copy, is defined outside of the "try" block, so it can be used inside the "catch" block. If we didn't leave this for the "catch" block, there would be no way it could find the memory to deallocate. If "pTop" is non-zero, the whole structure that "pTop" refers to is copied. At the places where a "stack_element" is allocated, it is important that the "pNext" member variable is given the value 0, since it is always put at the end of the stack. If it was not set to 0, it would not be possible to know that it was the last element, and our program would behave erratically. It's not until we have successfully created another element to append to the stack, that the "pNext" member variable is given a value other than 0.

There are two details worth mentioning here.
• The "if" statements marked //**1 are only needed for older compilers. New compilers automatically throw "bad_alloc" when they're out of memory. Old compilers, however, return 0.
• The "while" statement marked //**2 might look odd. What happens is that the variable "p" is given the value of "p->pNext", and that value is compared against zero. Remember that assignment is an expression, and that expression can be used, for example, for comparisons. The assignment "p=p->pNext" must be in parentheses for this to work. The precedence rules are such that assignment has lower precedence than comparison, so if we left out the parentheses, the effect would be to assign "p" the value of "p->pNext" compared to 0, which would not be what we intended.

Now, it's up to you to toy with the "intstack". Whenever you have a need for a stack of integers, here you have one.

Exercises
• When is guarding against self assignment necessary? When is it desirable?
• How can you disallow assignment for instances of a class?
• The non-const version of "top" returns a reference to data internal to the class. Mail me your reasons for why this can be a bad idea (it can, and usually even is!) Can it be bad in this case?
• When can returning references be dangerous? When is it not?
• Mail me an exhaustive list of reasons when assignment or construction can be allowed to fail under the Orthodox Canonical Form.
• When is it OK to use the auto-generated copy constructor and copy assignment operator?

Recap
This month, yet more news has been introduced to you.
• You have seen how C++ references work.
• You have learned about "const", and how it works for objects.
• You have seen how you can make member functions callable for "const" objects by declaring them as "const", and seen that member functions declared "const" are callable for non-const objects as well.
• You have found out how you can overload member functions on "constness" to get different behaviour for const objects and non-const objects.
• You have learned about the "Orthodox Canonical Form", which always gives you construction from nothing, construction by copying, assignment and destruction.
• You have learned that your objects should always be in a destructible and copyable state, no matter what happens.
• You have seen how you can implement common behaviour in private member functions. These member functions are then only callable from within member functions of that class.

Coming up
Next month I hope to introduce you to components of the C++ standard library. Most compilers available today do not have this library, but fortunately it is available and downloadable for free from a number of sources. Knowing this library, and how to use it, will be very beneficial for you, partly because it is standard, and partly because it's remarkably powerful.

As usual, you are the ones who can make this course the ultimate C++ course for you, as coming C++ programmers. Send me e-mail at once, stating your opinions, desires, questions and (of course) answers to this month's exercises!

Chapter 4

[NOTE: Here is a link to a zip of the source code for this article. Ed.]

So I've done it again, promised something I cannot keep. I wanted to introduce you to the C++ standard library, but there are problems with the available implementations and Watcom compilers, and I don't own a Watcom compiler to work around it with. Sigh. This month, we'll look at another very important aspect of C++, that of templates. Templates are the foundation of the standard C++ library too, so getting to know them beforehand might not be so bad anyway.

Why templates
The last two articles made some effort in perfecting a stack of integers. Today, I want a stack of doubles. What do I do? Rewrite it all and call it doublestack? It's one alternative. Then I want a stack of char* (another rewrite) and a stack of bicycles (yet a rewrite, and a bizarre view). And then, we end up with 4 versions of stack, all with more or less identical pieces of code, just one data member in the internal struct with different type for all of them. Yuck. Just think of this little nightmare. When you find a bug (when, not if), you have to correct it in as many places. There's of course the C kind of solution; always make it a stack of void*, and cast to whatever you want (and just hope you won't cast to the wrong type). No, the latter alternative isn't an alternative in my book. Type safety is essential for writing solid programs (Smalltalk programmers disagree). The former alternative isn't an alternative either.

OK, so I guess I've explained what templates are for. They're the solution to the above problem. A template is a way of loosening your dependency on a type, without sacrificing type safety (this will get clear later). But how?

Function templates
In the first C++ article, I wrote a set of overloaded functions called "print", which printed parameters of different types. The code in each of those functions was exactly the same (exactly the kind of redundancy that's always to be avoided). This is an ideal place for a template. Here's what a template function for printing something can look like:

template <class T>
void print(const T& t)
{
  cout << "t=" << t << endl;
}

Weird? OK, time for some demystifying. The keyword "template" says we're dealing with a template. When declaring/defining a template, there's always a template parameter list, enclosed in a '<', '>' pair. The template parameter for this template is "class T". This means that the template deals with a type, some type, called T. The name "T" is of course arbitrarily chosen. It could be any name. Despite the keyword "class", T does not have to be a class, it can be any of the built in types, enumerations, structs, and so on (if you have a modern compiler, it will accept the keyword "typename" instead of "class", although "class" will still work). After this comes the function definition, where T is used just as if it was a legal type. OK, that's really all there is for writing a template function, at least.

The code for the template, however, is not a function. It's a function template, something which the compiler uses to create functions. This is very much like a cookie cutter. Once you have a cookie cutter, you can make cookies with the shape of the cutter. More or less any kind of cookie can be made with that cutter. Here are some examples using it:

int main(void)
{
  print(5);        // print<int>
  print(3.141592); // print<double>
  print("cool");   // print<const char*>
  print(2);        // print<int> again.
  return 0;
}

When the compiler reads the function template, it does pretty much nothing at all, other than to remember that there's a function template with one template parameter and the name "print". When the compiler reaches "main", and sees the call to "print(5)", it looks for a function "print" taking an "int" as a parameter. There is none, of course, so it expands the template function "print" with the type "int", and actually makes a new function. Note that this is done by the compiler at compile time. The same happens for the other types. Later, "print(2)" uses the same function as "print(5)" does, rather than creating yet another copy of it. The compiler always first checks if there is a function available, and if there isn't, it tries to create it by expanding the function template. This order of things is necessary to avoid unnecessary duplication. Let's compile and run:

[c:\desktop]icc /Q temp_test.cpp
[c:\desktop]temp_test.exe
t=5
t=3.14159
t=cool
t=2
[c:\desktop]

Please note the different meanings of the terms "function template", and "template function". The "function template" is what you write. It's a template from which the compiler generates functions. The compiler generated functions are the template functions. The "function template" is the cookie cutter, while the "template function" is the cookie. One example of a template function above, is "print<int>()" (i.e. the "int" version of print). Note that a function is not generated from a function template until a call is seen (the compiler cannot know what types to generate the function for before that).

Although it does not seem like it, type safety is by no means compromised here. It's just seen in a somewhat different way. For the function template "print", there's only one requirement on the type T; it must be possible to print it with the "<<" operator to "cout". If the type cannot be printed, a compilation error occurs. To test it, here's what GCC says when trying to print the "intstack" from last month:

[c:\desktop]gcc temp_test2.cpp -fhandle-exceptions -lstdcpp
temp_test2.cpp: In function `void print(const class intstack &)':
temp_test2.cpp:285: no match for `operator <<(class ostream, class intstack)'

GCC delivered a compilation error, which is correct, since the type "intstack" cannot be printed with "<<" on "cout" (the error message says "ostream". We'll deal with that later this fall/early winter, in one or a few articles on C++ I/O). Here the compiler generated a new function, called a template function, where every occurrence of "T" (only one, in the function parameter list) is replaced with "intstack". After having generated the function, it compiled it, and noticed the error.

Templates and exceptions
As you may have noticed, I didn't write an exception specifier for the "print" function template. This was not a mistake, nor was it sloppiness. The drawback with templates is that they make writing exception specifiers a bit difficult. The problem is that I cannot know what kind of type T will be, and I cannot know if operator<< on that type can throw an exception or not. I could try to make the promise that the function "print" does not throw exceptions, by adding the exception specifier "throw ()", but that would not be wise. What if it does? If so, and the function template had an empty exception specifier list, "unexpected" would be called, and the program would terminate. Not nice. Note that not writing an exception specifier means that any exception may be thrown. I wish there was a way to say "The exceptions that might be thrown are the ones from operator<< on T" but there is no way to say that, other than as a comment. This problem is something I strongly dislike about C++, but this is how it works, and there's not much to do about it.

Class templates
Just as you can write functions that are independent of type (and yet type safe!) we can write classes that are independent of type. In a sense, class templates exist as builtins in C and C++. You have arrays and pointers (and references) that all act on a type, some type. The type they act on does not change their behaviour; they're still arrays, pointers and references, but of different types. Let's explore writing a simple class template, by improving the old "Range" class from lesson 2. In case you don't remember, the original "Range" looks like this:

struct BoundsError {};

class Range
{
public:
  Range(int upper_bound = 0, int lower_bound = 0) throw (BoundsError);
  // Precondition: upper_bound >= lower_bound
  // Postconditions:
  //   lower == lower_bound
  //   upper == upper_bound

  int lowerBound() throw ();
  int upperBound() throw ();
  int includes(int aValue) throw ();
private:
  int lower;
  int upper;
};

This class is a range of int. There's no reason, however, why it shouldn't be a range of any type. Writing a class template is in many ways similar to writing a function template:

struct BoundsError {};

template <class T>
class Range
{
public:
  Range(const T& upper_bound = 0, const T& lower_bound = 0);
  // Precondition: upper_bound >= lower_bound
  // Postconditions:
  //   lower == lower_bound
  //   upper == upper_bound
  // Throws:
  //   Bounds error on precondition violation
  //   Whatever T's copy constructor throws.
  //   Whatever operator < on T throws.
  const T& lowerBound() throw ();
  const T& upperBound() throw ();
  int includes(const T& aValue);
  // Throws: Whatever operator>= and operator <= on T
  // throws.
private:
  T lower;
  T upper;
};

As can be seen, on line 3, after "template <class T>", T is used just as if it was a type existing in the language. I've changed the constructor so that it accepts the parameters as const reference instead of by value. The reason is performance if T is a large type (if passed by value, the parameters must be copied and the copying may be an expensive operation). I've also removed the exception specifier, and instead used a comment, since after all, there is no way to know if T throws anything. "lowerBound", "upperBound" and "includes" use const T& instead of value, for the same reason as the constructor does. "lowerBound" and "upperBound" can safely have empty exception specifiers, since those member functions do not do anything with the T's. They just return a reference to one of them. "includes" on the other hand, does need the unfortunate comment.

Time for the implementation, which will include some news:

template <class T>
Range<T>::Range(const T& upper_bound, const T& lower_bound)
  : lower(lower_bound), // copy constructor
    upper(upper_bound)  // copy constructor
{
  if (upper < lower) throw BoundsError();
}

template <class T>
const T& Range<T>::lowerBound() throw ()
{
  return lower;
}

template <class T>
const T& Range<T>::upperBound() throw ()
{
  return upper;
}

template <class T>
int Range<T>::includes(const T& aValue)
{
  return aValue >= lower && aValue <= upper;
}

Take a careful look at the syntax here. The syntax for member functions is very much the same as that for function templates. The only difference is that we must refer to the class (of course), and we must specify that it's the template version of the class, by adding "<T>" after the class name. We must also precede every member function with "template <class T>". The reason is that we're not dealing with a complete class, but with a class template, and we must be explicit about that. There isn't much more to say about this. Let's have a look at how it's used:

#include <iostream.h>

int main(void)
{
  Range<int> ri(100,10);
  Range<double> rd(3.141592,-3.141592);
  if (ri.includes(55))
  {
    cout << "[" << ri.lowerBound() << ", " << ri.upperBound() << "] includes 55" << endl;
  }
  if (!rd.includes(62))
  {
    cout << "[" << rd.lowerBound() << ", " << rd.upperBound() << "] does not include 62" << endl;
  }
  return 0;
}

To use a class template, you must explicitly state what type it is for. There is unfortunately no way to say "Range(5,10)" and have the compiler automatically understand that you mean "Range<int>(5,10)". As with function templates, a class template is expanded when it's referred to, so when the compiler first sees "Range<int>", it creates the class, by expanding whatever is needed. The compiler will also treat every member function just as any template function, i.e. the code will not be expanded until it is called from somewhere. The above code calls all members of "Range", but had "includes" not been called, it would not have been expanded. One unfortunate side of this is that "includes" could actually contain errors, and this would be unnoticed by the compiler, until "includes" was called.

Advanced Templates
Now that the basics are covered, we should have a look at some power usage (this section is abstract, so it may require a number of readings). One unusually clever place for templates, is as something called "traits classes". A traits class is never instantiated, and doesn't contain any data. It just tells things about other classes, that is its sole purpose. The name "traits class" is odd. Originally they were called "baggage classes", since they're useless on their own, and belong to something else, but for some reason some people didn't like the name, so it was changed.

My intention is to write a traits class, which tells the name of the type it is specialized for (explanation of that comes below), a function template, which is used to create ranges, without needing to specify the type, and finally, to write a special template print, which prints ranges looking like the constructor call for the range. When done, I will be able to write:

print(make_range(10,4));

and when executed, see:

Range<int>(10, 4)

Magic? No, just templates! Here we go.

The traits class needed here, is one that tells the name of a type. A traits class, is a simple class template. The class template just looks like this:

template <class T>
class type_name
{
public:
  static const char* as_string();
};

That is, it holds no data, and the member function is declared as "static". No data, and only static member functions. This is the way traits classes usually look.

A member function declared static, is different from normal member functions, in that it does not belong to an instance, but belongs to the class itself, and as such cannot access any member data, since member data belongs to objects. Here's an example:

class A
{
private:
  int data;
public:
  void f(void);
  static void g(void);
  static void h(void);
};

void A::f(void)
{
  cout << data << endl;
}

void A::g(void)
{
  cout << data << endl; // error! Cannot access data.
}

void A::h(void)
{
  cout << "A::h" << endl;
}

int main(void)
{
  A a;
  a.f();  // prints something.
  a.h();  // prints "A::h"
  A::h(); // also prints "A::h"
  A::f(); // Error, f is bound to an object, and must be
          // called on an object.
  return 0;
}

"A::g()" is in error, because it's declared static, and thus not bound to any object. Calling "A::f()" is an error, since it is not static. This means it belongs to an object, and must be called on an object (through the "." operator). Since "h" is not tied to an object, it can be called through the class scope operator "A::", which means it's the "h" belonging to the class named "A". The calls "a.h()" and "A::h()" are synonymous.

Now back to traits classes. The whole idea for traits classes is one of "specialization". The class template is the general way of doing things, but if you want the class to take some special care for a certain type, you can do what's called a specialization. A member function specialization is usually not declared, just defined, like this:

const char* type_name<char>::as_string()
{
  return "char";
}

Doesn't seem too tricky, now does it? There actually is no catch in this, though (for all except the absolutely newest compilers, that is). If you have a top modern compiler, you'll get a compilation error. The syntax has changed, so compilers very much up to date with the standardization require you to write like this:

template <>
const char* type_name<char>::as_string()
{
  return "char";
}

A minor, understandable difference. The "template <>" part clarifies that it's a template we're dealing with, but the template parameter list is empty, since we're specializing for known types.

You can of course make any specializations you like. Please add all the fundamental types. Now, we can use the "type_name" traits class for "char" as follows:

cout << type_name<char>::as_string() << endl;

If we try for a type we haven't specialized, such as "double", we'll get an error when compiling.

This is how traits classes usually look. They have a template interface, which declares a number of static member functions. Those member functions are intended to tell something about some other class. Normally, the template member functions are not defined, instead specializations are. Their purpose is only to tell something about other classes, nothing else. For being such an incredibly simple construct, the traits classes are unbelievably useful.

Now over to the print template. It's supposed to accept an instance of a "Range", and print it, just as the constructor call for the "Range" was done. Piece of cake:

template <class T>
void print(const Range<T>& r)
{
  cout << "Range<" << type_name<T>::as_string() << ">("
       << r.upperBound() << ", " << r.lowerBound() << ")" << endl;
}

Here we see two new things. The parameter for the function template need not be the template parameter itself. It needs to be something that makes use of the template parameter, in a sense (all template parameters must be used in the parameter list for the function). The other new thing is how elegantly the "type_name" traits class blends with the function template. Note also that this means we cannot print ranges of types for which the "type_name" traits class is not specialized. We can now write:

print(Range<int>(10,5));

And it will work (if we specialize "type_name<int>::as_string()").

Now we're almost there. Now for the last detail, the function template that creates "Range" instances, which with the above seems fairly simple.

template <class T>
Range<T> make_range(const T& t1, const T& t2)
{
  return Range<T>(t1, t2);
}

The function template is by the compiler translated to a template function, using the types of the parameters. When the type is known, it will know what kind of "Range" to create and return. If the types differ in a call, the compiler will give you an error message. We can now write:

print(make_range(10,5));

just as we planned to. Neat, eh?

If you want to learn more about traits classes, have a look at Nathan Myers' traits article from the June '95 issue of C++ Report.
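Put together, the three pieces can be tried in one file. This is a minimal sketch for a current compiler, not the article's exact code: the "Range" class is trimmed to what the example needs, the exception specifiers are left out (they are no longer accepted), and "describe" is a hypothetical variant of "print" that returns the text as a string so the result is easy to check.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct BoundsError {};

// Trimmed-down version of the Range class template.
template <class T>
class Range {
public:
    Range(const T& upper_bound, const T& lower_bound)
        : lower(lower_bound), upper(upper_bound) {
        if (upper < lower) throw BoundsError();
    }
    const T& lowerBound() const { return lower; }
    const T& upperBound() const { return upper; }
private:
    T lower;
    T upper;
};

// Traits class: declared for every T, defined only through specializations.
template <class T>
struct type_name {
    static const char* as_string();
};
template <> const char* type_name<int>::as_string() { return "int"; }
template <> const char* type_name<double>::as_string() { return "double"; }

// Deduces T from the arguments, so the caller never spells it out.
template <class T>
Range<T> make_range(const T& t1, const T& t2) {
    return Range<T>(t1, t2);
}

// Hypothetical helper: same text as "print" would write, returned as a string.
template <class T>
std::string describe(const Range<T>& r) {
    std::ostringstream os;
    os << "Range<" << type_name<T>::as_string() << ">("
       << r.upperBound() << ", " << r.lowerBound() << ")";
    return os.str();
}
```

Calling "describe(make_range(10, 4))" yields the text "Range<int>(10, 4)". Asking for a range of a type without a "type_name" specialization is still caught before the program runs, typically as an unresolved symbol at link time, since "as_string" is declared but never defined for that type.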

Exercises
• Biggie: Rewrite last month's "intstack" as a class template, "stack<T>". What are the requirements on the type parameter of the templatized "stack<T>"?
• What happens if the copy constructor, operator== or destructor of T throws exceptions?
• When can you, and when can you not use exception specifiers for templates?
• What are the requirements on the type parameter of the templatized "Range"? Can you use a range of "intstack"?

Recap
Quite a lot of news this month. You've learned:
• how to write type independent functions with templates, without sacrificing type safety.
• how the compiler generates the template functions from your function template.
• about template classes, which can contain data of a type not known at the time of writing.
• that templates restrict the usefulness of exception specifiers.
• how to specialize class templates for known types.
• how to write and use traits classes.

Coming up
If the standard library's up and running on Watcom, that'll be next month's topic, otherwise it'll be C++ I/O streams.

As always, send me e-mail at once, stating your opinions, desires, questions and (of course) answers to this month's exercises!

Introduction
We've seen how the fundamental types of C++ can be written to the screen with "cout << value" and read from standard input with "cin >> variable". This month, you will learn how you can do the same for your own classes and structs. It's surprisingly easy to do.

Exploring I/O of fundamental types
Formatted I/O is not part of the language proper in C++ (or in C for that matter.) It's handled by an I/O library, that's implemented in the language. (If you're familiar with Pascal, try to implement something like Write and WriteLn in Pascal. You can't, the language doesn't allow it, that's why it's built into the language itself.)

We've seen a number of times how we can print something with "cout << value". How can this be expressed in the language? To begin with, the syntax is legal only because you can overload operators in C++. You've already seen that with operator=. Let's see what actually happens when we use operator=.

class X
{
public:
  X& operator=(int i);
  ...
};

X x;

x=5; //**

At the last line of the example, what actually happens is that operator= is called for the object named "x". Another way of expressing this is:

x.operator=(5);

In fact, this syntax is legal, and it generates identical code. This is important, because this is how the compiler will treat the more human-readable form "x=5". As we can see then, an operator overloaded in a class, is just like any other member function, it's just called in a peculiar form.

Let's go back to printing again. "cout" is an object of some class, which has operator<<(T) overloaded, where T is any of the fundamental types of C++. The relevant section of the class definition looks as follows:

class ostream
{
  ...
public:
  ...
  ostream& operator<<(char);
  ostream& operator<<(signed char);
  ostream& operator<<(unsigned char);
  ostream& operator<<(short);
  ostream& operator<<(unsigned short);
  ostream& operator<<(int);
  ostream& operator<<(unsigned int);
  ostream& operator<<(long);
  ostream& operator<<(unsigned long);
  ostream& operator<<(float);
  ostream& operator<<(double);
  ostream& operator<<(long double);
  ostream& operator<<(const char*);
  ...
};

The value returned by each of these is the stream object itself (i.e. if you call "operator<<(char)" on "cout", the return value will be a reference to "cout" itself.) The only difference for reading is that the class is called "istream" instead, and that the operator used is operator>>().

As I wrote above, "a << b" is identical with "a.operator<<(b)". With the above in mind, we can see that writing

int i;
double d;
cout << i << d;

is synonymous with

int i;
double d;
(cout.operator<<(i)).operator<<(d);

I/O with our own types
The most important thing to recognise is that our own types (classes and structs) always consist of fundamental types. The C++ I/O library only supports I/O of the fundamental types, so if our own data types consisted of something completely different, I/O would be very difficult indeed.

So, how do we make sure we can do I/O on ranges and stacks (from the earlier lessons?) What about extending our own class with the members operator<< and operator>>? This would, sort of, work, but the syntax would change. If we add operators << and >> to our class, we'll require our object on the left hand side, and the stream to print on/read from, on the right hand side, and that's not what we want. Another possible way of doing this is to edit the ostream and istream class to contain operator<< and operator>> for our own classes. Does that seem like a good idea to you? It doesn't to me.

The solution does yet again lie in operator overloading, but this time in a somewhat different way. We just saw how we can overload an operator for a class, such that the operator becomes a member function for that class.
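The equivalence between operator syntax and explicit member function calls is easy to check on a current compiler. The sketch below makes a few assumptions of its own: "X" is filled in with a hypothetical "value" member and "get" accessor, the helper names "with_operators" and "with_calls" are made up, and a string stream is used instead of "cout" so the text can be compared. Only "int" and "double" are printed, since in today's standard library the inserters for characters and C strings are nonmember functions and cannot be called as "os.operator<<(...)".

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical completion of the article's class X.
class X {
public:
    X() : value(0) {}
    X& operator=(int i) { value = i; return *this; }
    int get() const { return value; }
private:
    int value;
};

// Prints with the usual operator syntax.
std::string with_operators(int i, double d) {
    std::ostringstream os;
    os << i << d;
    return os.str();
}

// Prints with explicit member function calls; each call returns the
// stream itself, which is what makes the chaining possible.
std::string with_calls(int i, double d) {
    std::ostringstream os;
    (os.operator<<(i)).operator<<(d);
    return os.str();
}
```

Both helpers produce the same text for the same arguments, and "x = 5" and "x.operator=(5)" leave an "X" in the same state.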

It's also possible to overload operators, such that the operator becomes a function that is not a member of any class. Most operators can be defined as a nonmember function accepting two parameters, provided that at least one of the parameters to the operator is not a built-in type. Such is the case for our new friends operator<< and operator>>.

Let's revisit our old friend, the class "Range." This is the definition of "Range", for those who do not have old issues handy (I've added "const" on the member functions, now that you know what it's for. See part 3 for details if you've forgotten):

struct BoundsError {};

class Range
{
public:
  Range(int upper_bound = 0, int lower_bound = 0) throw (BoundsError);
  // Precondition: upper_bound >= lower_bound
  // Postconditions:
  //   lower == lower_bound
  //   upper == upper_bound
  int lowerBound() const throw ();
  int upperBound() const throw ();
  int includes(int aValue) const throw ();
private:
  int lower;
  int upper;
};

How should this thing be printed and read? Here's a wishlist. We'll reduce it a little bit, to be more realistic, later.

1. The syntax and semantics for printing must be the same as for the fundamental types of C++.
2. Full commit or roll back, that is, either we print all there is to be printed, or we print nothing at all.
3. The print must be in a form distinguishable from, say, two integers separated by a comma.
4. Full type safety
5. Encapsulation not violated.
6. No unnecessary computations.
7. We want printing and reading synchronized (i.e. if we read something, then print a range, then read something again, we want the reading to complete before printing, and we want it all printed before reading again.) Since both reading and writing is normally buffered, it is not at all obvious that this will occur.

All of these are possible, but #2, #6 and #7 are usually skipped. I'll skip #2 for now.

What's the appearance we want of a range when printed, and what format should we accept when reading? A golden rule in I/O (and not just in C++) is to be very strict in your output, but very liberal in what you accept as input. The format I chose is "[upper,lower]", no spaces anywhere. On input however, white space is allowed before the first bracket, and between any of the tokens (the tokens here are '[', number, ',' and ']'). Normally, the C++ I/O library handles just exactly this for you.

OK, so now we have a pretty good picture on what to do, now, how? Overloading operator<< as a global function. The signature becomes:

ostream& operator<<(ostream&, const Range&);

This declares a function, which has the syntax of a left shift operator. If we have code like:

Range r;
int i;
cout << r << i;

The compiler will treat it as

operator<<(cout, r).operator<<(i);

This even works for more complex expressions, like:

Range r;
int i;
int j;
cout << i << r << j;

Which the compiler interprets as:

operator<<(cout.operator<<(i), r).operator<<(j);

Study these examples carefully, to make sure you understand what's going on.

Now, given the facts known this far (more is needed, as you will see further down,) it's fairly easy to get down to work with implementing the operator<< function.

ostream& operator<<(ostream& os, const Range& r)
{
  os << '[' << r.upperBound() << ',' << r.lowerBound() << ']';
  return os;
}

Here "r" is passed as const reference, since the function does not alter "r" in any way (and promises it won't.) The stream, "os", however, is passed by non-const reference. This is essential. Printing does alter a stream. It will not be the same after printing as it was before printing. It is not possible to pass it by value, since passing by value means copying, and copying a stream doesn't make much sense (think about it.) Inside the function, we're printing known types, char and int, so the operator<< provided by the I/O class library suits just fine.

How well does this suit the 7 points above? The syntax is correct, and the semantics are too. The format is distinct enough, we have type safety and encapsulation is not violated. We do not have full commit or rollback, but I mentioned already in the beginning that we'll skip that for now. However, we do make some unnecessary computations if the stream is bad in one way or the other. Say, for example, we have a detached process. Detached processes do not have standard output and standard input (unless redirected) and as such printing will always fail. Why then, even try? We also do not synchronize our output with input.

This is as far as most books on C++ cover when it comes to printing your own types. The check and synchronization is simple to make, but oddly enough not mentioned in most C++ books. I don't know why it's just about always skipped, since it isn't more difficult than this to avoid unnecessary computations and synchronize input with output.

ostream& operator<<(ostream& os, const Range& r)
{
  if (!os.opfx())
    return os;
  os << '[' << r.upperBound() << ',' << r.lowerBound() << ']';
  os.osfx();
  return os;
}

The "prefix" ("opfx" means "output prefix") function checks for a valid stream, and also synchronizes output with input. The "suffix" ("osfx" means "output suffix") signals end of output, so that synchronized input streams can begin accepting input again. I dare you to find this in a C++ book (I know of one book.)

That was printing, how about reading? The signature and general appearance of the function is pretty much clear from the above discussions. Let's make a try:

istream& operator>>(istream& is, Range& r)
{
  if (!is.ipfx())
    return is;
  char c;
  is >> c;
  if (c != '[')
  {
    // signal error somehow and roll back stream.
  }
  int upper;
  is >> upper;
  is >> c;
  if (c != ',')
  {
    // signal error somehow and roll back stream.
  }
  int lower;
  is >> lower;
  is >> c;
  if (c != ']')
  {
    // signal error and roll back stream.
  }

      r=Range(upper,lower);
      is.isfx(); // ERROR! Does not exist!
      return is;
    }

Hmm. OK, so reading wasn't as easy. There are three issues above that need to be resolved. How to signal error, how to roll back the stream, and how to deal with the suffix function.

Let's begin from the easy end, the suffix function. The solution is that there isn't one, since the guess "is.isfx()" was wrong. The problem is fixed by removing the faulty line (don't you just love bugs that you fix solely by removing code!)

Rolling back the stream is interesting indeed, since it's very difficult to do. In fact, it's almost impossible, so we needn't even try. We can put back a character. One character, that is all that is guaranteed to work. Putting back a character is done with "istream::putback(char)". It's also absolutely necessary that the character put back is the same as the last one read, otherwise the behaviour is undefined (which literally means all bets are off, in theory, the program may do *anything*, but in practice it means you cannot know if it just backs a position, or actually changes the character.) In other words, our only chance is if the first character read is not right.

The obvious solution to signalling an error, to throw an exception, is wrong. The reason is conceptual. Use exceptions to signal exceptional situations, and other means to handle the expected. Remember you're dealing with input generated by human beings here. The wrong input *is* expected. Erroneous user input is expected, and thus not exceptional, and thus not to be handled with exceptions. Sure you can demand that the users of your program enter the exact right data in the exact correct format every time, but you won't be very popular among them, and soon will have none.

So, what we should do if we read something unexpected, is to set the stream state to "fail." How then? A stream object has an error state consisting of three orthogonal failure flags, "bad", "eof" and "fail". "bad" is something we hope to never see, since it means the stream is really out of touch with reality and we cannot trust anything from it (I've only seen this one once, and it was due to a bug in a library!) I guess we can expect "bad" if reading from a file, and hit a bad sector. "eof" is used to signal end of file, a not too unusual situation (as a matter of fact, a situation most programs rely on,) but if it occurs in the middle of reading something, it's usually a failure. "fail" is the one we're interested in here, it's used to signal that we received something that was not what we expected, but the stream itself is OK.

Setting the state is done with the oddly named member function "clear(int)". "clear" sets the status bits of the stream to the pattern of the integer parameter (which defaults to 0, so if nothing is passed, the name makes sense.) The bits we can set are "ios::badbit", "ios::failbit" and "ios::eofbit". We can get the current status bits by calling "is.rdstate()", and usually we want to do that when setting or resetting a status bit, since we want to affect only that bit, and leave the other bits as they were before the call. The status bits can also be checked with the calls "is.bad()", "is.fail()" and "is.eof()" (which return 0 if the bit they represent is not set, and non-zero otherwise.) A fourth call "is.good()" returns non-zero if no error state bits are set, and 0 otherwise.

Now with the above in mind, let's make another try:

    istream& operator>>(istream& is, Range& r)
    {
      if (!is.ipfx())
        return is;
      char c;
      is >> c;
      if (c != '[')
      {
        is.putback(c);
        is.clear(ios::failbit|is.rdstate());
        return is;
      }
      int upper;
      is >> upper >> c;
      if (c != ',')
      {
        is.clear(ios::failbit|is.rdstate());
        return is;
      }
      int lower;
      is >> lower >> c;
      if (c != ']')
      {
        is.clear(ios::failbit|is.rdstate());
      }
      if (is.good())
      {
        if (upper >= lower)
          r=Range(upper,lower);
        else
          is.clear(ios::failbit|is.rdstate());
      }
      return is;
    }

This actually solves the problem as far as is possible. The call to "is.ipfx()" not only synchronizes the input stream with output streams, but also checks for error conditions and reads past leading white space. Note that operator>> for built in types skips leading whitespace, so we needn't work on that at all. If the first character read is not a '[', we put the character back and set the fail bit (the order is important, "putback" is not guaranteed to work if the stream is in error,) and return. After this we read the upper limit of the range, and the separator. If the separator is not ",", we set the stream state to failed and return. Then read the lower limit and the terminator. If the terminator is not ']', mark the stream as failed. If reading of either upper limit or lower limit failed, the stream is set to fail state, and other reads will not do anything at all (not even alter the stream error state,) thus the check near the end for "is.good()" is enough to know if all parts were read as we expected. If they were, all we need to do is to check that the upper limit indeed is at or above the lower limit (precondition for the range) and if so set "r" (since we haven't declared an assignment operator, the compiler did it for us, so the call is valid,) otherwise set the fail error state.

How well do we match the 7 item wish list? You check and judge. I think we're doing fine, and in fact better than what can be found in most books on the subject.

Formatting
There are a number of ways in which the output format of the fundamental types of C++ can be altered, and a few ways in which the requirements on the input format can be altered. For example, a field width can be set, and alignment within that field. For integral types, the base can be set (decimal, octal, hexadecimal). For floating point types the format can be fixed point or scientific. All of these, and then some, are controlled with a few formatting flags, and a little data. I think they're difficult to use, but fortunately there are easier ways of achieving the same effect, and we'll visit those later.

All flags are set or cleared with the member functions "os.setf()" and "os.unsetf()". The base for integral output is altered with a call to "os.setf(v, ios::basefield)", where v is one of "ios::hex", "ios::dec" or "ios::oct". As a small example, consider:

    #include <iostream.h>
    int main(void)
    {
      int i=19;
      cout << i << endl;
      cout.setf(ios::hex, ios::basefield);
      cout << i << endl;
      cout.setf(ios::oct, ios::basefield);
      cout << i << endl;
      cout.setf(ios::dec, ios::basefield);
      cout << i << endl;
      return 0;
    }

The result of running this program is:

    19
    13
    23
    19

The base is converted as expected, but there is no way to see what base it is. This can be improved with the formatting flag ios::showbase, so let's set that one too.

    int main(void)
    {
      int i=19;
      cout.setf(ios::showbase);
      cout << i << endl;
      cout.setf(ios::hex, ios::basefield);
      cout << i << endl;
      cout.setf(ios::oct, ios::basefield);
      cout << i << endl;
      cout.setf(ios::dec, ios::basefield);
      cout << i << endl;
      return 0;
    }
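For readers whose compiler only ships the standard headers, the same experiment can be sketched against a string stream, which also makes the result easy to check. The helper name "show_bases" is hypothetical, and the use of "std::" names and <sstream> is an assumption about a standard-conforming library rather than the <iostream.h> used in this article:

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper (not from the article): formats `value` four times,
// switching the integral base between the prints, and returns the text.
std::string show_bases(int value)
{
  std::ostringstream os;
  os.setf(std::ios::showbase);                  // ask for 0x / 0 prefixes
  os << value << ' ';                           // decimal, no prefix
  os.setf(std::ios::hex, std::ios::basefield);  // clears dec/oct, sets hex
  os << value << ' ';
  os.setf(std::ios::oct, std::ios::basefield);  // clears hex, sets oct
  os << value << ' ';
  os.setf(std::ios::dec, std::ios::basefield);  // back to decimal
  os << value;
  return os.str();
}
```

Called as show_bases(19), this should produce the text "19 0x13 023 19" on one line.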

it's not a very good idea (yields undefined behaviour. cout << '[' << -55 << ']' << endl. and the mask is "ios::basefield". for sure. and leaves the others unchanged (in other words. and the second parameter is "ios::adjustfield". The field width is set with "os. cout. the three alignment forms are mutually exclusive. but if there's extra room." Simple enough.setf(ios::hex)". is potentially dangerous (what if "ios::oct" was already set? Then you'd end up with both being set. The second form.setf(ios::hex. field width and alignment. cout << i << endl.) Now you begin to see why this is messy. Now. If the masked version is called. Alignment is set with the two parameter version of "os. All the formatting flags of the iostreams are represented as bits in an integer. or the field width set is smaller than that necessary to represent the value to be printed. let's try it out: #include <iostream. now for something that's common to all types. and the version with the mask clears the bits represented by the mask.h> int main() . cout << -55 << ']' << endl. then "ios::oct" and "ios::dec" will be cleared.width(int)".setf()". so a call to "os. the one accepting only one parameter. cout. and the width is reset after printing the first thing that uses it. return 0.) The second parameter "ios::basefield" guarantees that if you set "ios::hex". ios::basefield). or "ios::internal". cout. except those explicitly set by the first parameters. } The output of this program is 19 0x13 023 19 That's more like it. If the field width is not set.width(void). cout << '[' << -55 << ']' << endl. ios::basefield). where the first parameter is one os "ios::left". #include <iostream.width(10). the only formatting flags of the stream that will be affected are "ios::hex" or "ios::dec" or "ios::oct". While it's possible to set two. cout << i << endl. the width set does not affect the printing separate characters.h> int main() { cout << '[' << -55 << ']' << endl. the other one a full set of flags only. 
ios::basefield). alignment doesn't matter. This is not very intuitive I think. so don't set two of them at the same time. it bitwise "or"es the current bit-pattern with the one provided as the parameter. cout << '['. } Executing this programs shows something interesting. though. alignment does make a difference. cout. "ios::right". Formatting bits not represented by the mask will remain unchanged.setf(ios::dec.setf(ios::oct. One accepts a set of flags and a mask. return 0. and the curious can get the current field width by calling "os. As with the base for integral types. The result of running the program is shown below: [-55] [ -55] [ -55] [-55] Had you expected this? I didn't. "setf()" is overloaded in two forms. let's play with alignment within a field.width(10).cout.) That was setting the base for integral types. Let's alter the width setting program to show the behaviour. The three of these are mutually exclusive. right? The call to "setf()" for setting the "ios::showbase" flag is different. cout << i << endl. or all three of these flags at the same time. sets the flags sent as parameter.
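The difference between the two forms of "setf()", and the way a field width is used up, can be probed with a small sketch. It assumes a standard-conforming library (standard headers, "std::" names), and the three helper names are hypothetical. Two things are worth knowing when comparing with older <iostream.h> implementations: a standard library falls back to decimal when more than one base flag is set, and it applies the field width to single characters too.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: one-parameter setf. The default dec flag stays set
// next to hex, so the base is ambiguous and decimal is used.
std::string unmasked_hex(int value)
{
  std::ostringstream os;
  os.setf(std::ios::hex);
  os << value;
  return os.str();
}

// Hypothetical helper: two-parameter setf. Every bit in basefield is
// cleared first, then hex is set, so exactly one base flag is active.
std::string masked_hex(int value)
{
  std::ostringstream os;
  os.setf(std::ios::hex, std::ios::basefield);
  os << value;
  return os.str();
}

// Hypothetical helper: the width is consumed by the first insertion that
// follows it; the second number is printed without padding.
std::string width_once()
{
  std::ostringstream os;
  os.width(6);
  os << -55 << '|' << -55;
  return os.str();
}
```

With these helpers, unmasked_hex(19) gives "19", masked_hex(19) gives "13", and width_once() gives "   -55|-55".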

OK, let's play with alignment within a field.

    #include <iostream.h>
    int main()
    {
      cout.setf(ios::right, ios::adjustfield);
      cout << '[' << -55 << ']' << endl;
      cout.setf(ios::left, ios::adjustfield);
      cout << '[' << -55 << ']' << endl;
      cout.setf(ios::internal, ios::adjustfield);
      cout << '[' << -55 << ']' << endl;
      cout.setf(ios::right, ios::adjustfield);
      cout.width(10);
      cout << '[' << -55 << ']' << endl;
      cout.setf(ios::left, ios::adjustfield);
      cout.width(10);
      cout << '[' << -55 << ']' << endl;
      cout.setf(ios::internal, ios::adjustfield);
      cout.width(10);
      cout << '[' << -55 << ']' << endl;
      return 0;
    }

The result of running this is, after the above explanations, not very surprising:

    [-55]
    [-55]
    [-55]
    [       -55]
    [-55       ]
    [-       55]

Well, I found the formatting of "ios::internal" to be a bit odd, but it kind of makes sense.

If the field width is larger than that required for a value, the current alignment defines where in the field the value will be, and where in the field space will be. Space, by the way, is just the default. We can change the "padding character" by calling "os.fill(char)", and get the current value with a call to "os.fill(void)". Let's exercise that one too:

    #include <iostream.h>
    int main()
    {
      cout.fill('.');
      cout.setf(ios::left, ios::adjustfield);
      cout.width(10);
      cout << -5 << endl;
      cout.setf(ios::right, ios::adjustfield);
      cout.width(10);
      cout << -5 << endl;
      return 0;
    }

Running it yields the surprising result

    -5........
    ........-5

Why was this surprising? Earlier we saw that the field width is "forgotten" once used. The pad character, however, remains the same until explicitly changed.

Now that you have the general idea, why not try the other formatting flags there are:
• ios::fixed and ios::scientific control the format of floating point numbers (the mask used is ios::floatfield.)
• ios::showpoint controls whether the decimals should be shown for floating point numbers if they are all zero.
• ios::uppercase controls whether hexadecimal digits should be displayed with upper case letters or lower case letters.
• ios::showpos controls whether a "+" should be prepended to positive numbers or not (just like a "-" is prepended to negative numbers.)

The only thing remaining for formatting is "os.precision", which comes in two flavours. One without parameters which reports the current precision, and one with an int parameter, which sets it. The unpleasant thing about this parameter is that many compilers interpret it differently. Some think the precision is the number of digits after the decimal point, while most think it's the number of digits to display.
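The sticky pad character, the one-shot field width and the two readings of the precision can all be checked with a short sketch. It assumes a standard-conforming library (standard headers and "std::" names instead of <iostream.h>), and the helper names "padded_twice" and "pi_text" are hypothetical. In such a library the precision counts significant digits in the default floating point format, and digits after the decimal point once ios::fixed is set:

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: prints -5 three times. The fill character is set
// once and kept; the width has to be set again before every use.
std::string padded_twice()
{
  std::ostringstream os;
  os.fill('.');                                     // stays until changed
  os.setf(std::ios::left, std::ios::adjustfield);
  os.width(10);                                     // next insertion only
  os << -5 << '\n';
  os.setf(std::ios::right, std::ios::adjustfield);
  os.width(10);
  os << -5 << '\n';
  os << -5;                                         // no width set: no padding
  return os.str();
}

// Hypothetical helper: formats 3.14159 with precision 3, either in the
// default floating point format or in fixed point format.
std::string pi_text(bool fixed_notation)
{
  std::ostringstream os;
  if (fixed_notation)
    os.setf(std::ios::fixed, std::ios::floatfield);
  os.precision(3);
  os << 3.14159;
  return os.str();
}
```

Here padded_twice() yields the three lines "-5........", "........-5" and "-5", while pi_text(false) yields "3.14" and pi_text(true) yields "3.142".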

The November 1997 draft C++ standards document (which, by the way, most probably is the final C++ standards document) says the number of digits after the decimal point is what's controlled, but I'm not sure if that's what the current standards document says. At any rate, inconsistencies aside, this is a mess, isn't it?

An easier way
The authors of the I/O package realized that this is a mess, so they defined something called "manipulators." You've already used one manipulator a lot, "endl." A manipulator may, or may not, print something on the stream, but it will alter the stream in some way. For example "endl" prints a new line character, and flushes the stream buffer. There are two kinds of manipulators, those accepting a parameter, and those that do not. Let's first focus on those that don't. The ones available are: "dec", "hex", "oct", "endl", "ends", and "flush". "flush" flushes the stream buffer (i.e. forces printing right away.) "ends" is rarely used, it's there to print a terminating '\0' (the terminating '\0' of strings is never printed normally.) Their use is simple:

    #include <iostream.h>
    int main(void)
    {
      cout << hex << 127 << " " << oct << 127 << " " << oct << 127 << endl;
      return 0;
    }

The advantage of this is both that the code becomes clearer, and that there's no way you can accidentally set illegal base flag combinations.

How do these manipulators work? There's a rather odd looking operator<< for output streams. It looks like:

    ostream& operator<<(ostream& (*f)(ostream&))
    {
      return f(*this);
    }

Now, what on earth does this mean? It means that if you have a function accepting an ostream& parameter, and returning an ostream&, that function can be "printed," and if you do, the function will be called with the stream as its parameter. Let's exercise this by rolling our own "left" alignment manipulator:

    ostream& left(ostream& os)
    {
      os.setf(ios::left, ios::adjustfield);
      return os;
    }

This function matches the required signature, so if we "print" it with "cout << left", the above mentioned operator<< is called, and it in its turn calls the function for the stream, so "cout << left" actually ends up as "left(cout)". Cool, eh? Roll your own "right" and "internal" manipulators as a simple exercise (they're handy too.)

Then there are some manipulators accepting a parameter. To access them, you need to #include <iomanip.h>. The ones usually accessed from there are "setw" (for setting the field width,) "setprecision", and "setfill". Their use is fairly straightforward and doesn't require any example.

Every compiler I've seen provides its own mechanism for writing such manipulators, so doing it in a portable way is very difficult. Or actually, it isn't if you skip the mechanism offered by your compiler vendor and do the job yourself, because it really is simple. Let's write one that prints a defined number of spaces:

    class spaces
    {
    public:
      spaces(int s) : nr(s) {}
      ostream& printOn(ostream& os) const
      {
        for (int i=0; i < nr; ++i)
          os << ' ';    // print on the stream passed in, not on cout
        return os;
      }
    private:
      int nr;
    };

    ostream& operator<<(ostream& os, const spaces& s)
    {
      return s.printOn(os);
    }

Can you see what happens if we call "cout << spaces(40)"? First the object of class "spaces" is created, with a parameter of 40. That parameter is in the constructor stored in the member variable "nr". Then the global operator<< for an ostream& and a const spaces& is called, and that function in its turn calls the printOn member function for the spaces object, which goes through the loop printing space characters. I think writing manipulators requiring parameters this way is lots easier than trying to understand the non-portable way provided by your compiler vendor.

Now something for you to think about until next month: what about our I/O of our own classes with respect to the formatting state of the stream? How's the "Range" class printed if the field width and alignment is set to something? How should it be printed (hint, you probably want it printed differently from what will be the case if you don't take care of it.)

Recap
This month you've learned a number of things regarding the fundamentals of C++ I/O. For example
• How to make sure your own classes can be written and read.
• Why exceptions are not to be used when input is wrong.
• How to set, clear and recognise the error state of a stream.
• The very messy, and the somewhat less messy way of altering the formatting state of a stream.
• How to write your own stream manipulators.

Exercises
• Find out which formatting parameters "stick" (like the choice of padding character) and which ones are dropped immediately after first use (like the field width.)
• Experiment with the formatting flags on input. Which have effect, and which don't? Of those that do have an effect, do they have the effect you expect?
• Write an input manipulator accepting a character, which when called compares it with a character read from the stream, and sets the ios::failbit status bit if they differ.
• With the above in mind, and remembering that destructors can be put to good work, write a class which will accept an ostream as its constructor parameter, and which on destruction will restore the ostream's formatting state to what it was on construction.

Standards update
• The prefix and postfix functions are history. Instead you create an object of type istream::sentry or ostream::sentry, and check it, like this:

    istream& work(istream& is)
    {
      istream::sentry cerberos(is);
      if (cerberos)
      {
        ...
      }
      return is;
    }

  The destructor of the sentry object does the work corresponding to that of the postfix function.
• "istream" and "ostream" are in fact not classes in the standard, but typedefs for class templates. The class templates are "template <class charT, class traits> class basic_istream" and "template <class charT, class traits> class basic_ostream". "istream" is typedefed as "basic_istream<char, char_traits<char> >", and "ostream" as "basic_ostream<char, char_traits<char> >". There's also the pair "wistream" and "wostream", that are streams of wide characters.
• Any operation that sets an error status bit may throw an exception. Which error status bits cause exceptions to be thrown is controlled with an exception mask (a bit mask.) By default, though, no exceptions are thrown.
• The mechanism for writing manipulators is standardised (and heavily based on templates.) I still think it's easier to write a class the way I showed you.

• Formatting of numeric types (and time) is localised. By default most implementations will probably use the same formatting as they do today, but with the support for "imbuing" streams with other locales (formatting rules.)
• The header name is <iostream> (no .h) and the names are actually std::istream and std::ostream (everything in the C++ standard library is named std::whatever, and every standard header is named without trailing .h)

Coming up
Next month we'll have a look at inheritance. Inheritance is what Object Orientation is all about. Inheritance is a way of expressing commonality. People who enjoy and understand the philosophy of Plato will feel at home.

As always, send me e-mail at once, stating your opinions, desires, questions and (of course) answers to this month's exercises!

Introduction
I admit it, I've had very little inspiration for writing this month. Maybe it's the winter darkness, or the upcoming switch of jobs. However, instead of leaving you in the dark for a month, here's some food for thought, even though it's short and not the planned details on inheritance.

Example: Study this function template:

    template <class IN, class OUT>
    OUT copy(IN begin, IN end, OUT dest)
    {
      while (begin != end)
      {
        *dest = *begin;
        ++begin;
        ++dest;
      }
      return dest;
    }

What does it mean? Let us first have a look at the requirements on the types for IN and OUT.

IN: Must be copy-able (since the parameters are passed by value, not by reference.) Must be comparable with operator !=. Must have operator*, whose return value must be assignable to the result of operator* for OUT. Operator ++ (prefix) must be allowed.

OUT: Must be copy-able, both for the parameter and the return value. Must have an operator*, whose return value must be something which can be assigned the result of operator* for IN. Operator++ (prefix) must be allowed.

So, what does the above do? The name no doubt gives a hint, but let's analyze it. One match for "IN" and "OUT" is obvious: a pointer in an array. Let's say that the types are "T1*" and "T2*", then this is legal for all types "T1" whose values can be assigned (or implicitly converted) to a type "T2". For example:

    int iarr[]={ 12, 23, 34, 45 };
    const size_t isize=sizeof(iarr)/sizeof(iarr[0]);
    double darr[isize];
    copy(iarr,iarr+isize,darr);

At the call-site, the function template is expanded to a template function, as follows:

    double* copy<int*,double*>(int* begin, int* end, double* dest)
    {
      while (begin != end)
      {
        *dest = *begin;
        ++begin;
        ++dest;
      }
      return dest;
    }

What this function does is to assign the value pointed to by "dest" the value of the dereferenced pointer "begin", and then increment "begin" and "dest," as long as "begin" does not equal "end". This puts a requirement on "begin" and "end": by using operator++ (prefix only) on "begin", it must be possible to reach a state where "begin" does not compare unequal to "end." For example, "begin" might be the beginning of an array, and "end" the end of one. Note, however, that when "begin" and "end" are equal, no copying is done, i.e. "copy(a,a,b)" does nothing at all (except return "b"). This means that to copy an entire array, like in the example above, "end" must point one past the last element of the array, like this:

                   +-----+-----+-----+-----+-----+
    primes_lt_10 = |  2  |  3  |  5  |  7  |XXXXX|
                   +-----+-----+-----+-----+-----+
                      ^                       ^
                      |                       |
                    begin                    end (points to non-existing
                                                  element one past the last one.)

Fortunate as it is, this is legal in C and C++. It is illegal to dereference the "one-past-the-end" pointer, but the value is legal, and decrementing it will make it point to the last element (as opposed to making it point n elements past the end, where n>1, which yields "undefined behaviour", i.e. all bets are off.)

Of course, by using the copy function template, we can now copy arrays (or parts of arrays) of any type to arrays (or parts thereof) of any type, if the source type can be implicitly converted to the destination type. This is useful. Very useful, but I can assure you, the fun has only begun. The real joy begins when we realize we can write our own types that behave the same way.

Other uses
Let's assume we want to print the contents of an array. How can we do this? Of course we can, as usual, use a loop over all elements and print them. However, we can use the copy function template to do it. How? Printing an array (actually, the values in an array) means copying the values from the array to the screen, through some conversion (the output formatting.) The problem thus becomes, how do we make a type that does the necessary conversion, prints on the screen, and conforms to the requirements stated for the template parameter "OUT?"

The secret seems to lie in the lines "*dest = *begin" and "++dest". As I see it, there are two alternatives. Either "*dest = *begin" prints the value of "*begin" on the screen, and "++dest" does nothing at all, or "*dest = *begin" makes our variable "dest" remember the value to print and on "++dest" it does the printing. I will do the former. You can do the latter as an exercise.

To show you how this can be done, it is now necessary to see how some more operators can be overloaded, i.e. operator* and operator++. Of these, operator* is very straightforward, while operator++ requires some thought since there are two operator++, one postfix and one prefix (i.e. there's a difference between "dest++" and "++dest.") Assuming a class C, operator ++ is (usually) modeled as follows (and always with these member function signatures:)

    class C
    {
    public:
      ...                  // misc
      C& operator++(void); // prefix, i.e. ++c;
      C operator++(int);   // postfix, i.e. c++;
      ...                  // other misc.
    };

    C& C::operator++(void)
    {
      ...                 // whatever's needed to "increment" it.
      return *this;
    }

    C C::operator++(int)  // throw away int, it's not used, other than to
    {                     // distinguish between pre- and post-fix ++
      C old_val(*this);   // remember the old value
      C::operator++();    // let the prefix version do the job
      return old_val;     // and return the old value
    }

Needless to say, decrementing is analogous.

Let's make our simple "int_writer" class, which writes integers on standard output (i.e. normally the screen.)

    class int_writer
    {
    public:
      // trust the compiler to generate necessary constructors and destructor
      int_writer& operator++();
      int_writer& operator*();
      int_writer& operator=(int i); // does the real writing.
    };

Weird? Well, yes. It's weird, but it makes perfect sense anyway. If you look at a pointer to T, dereferencing it yields a T. In this case, however, I say that dereferencing an int_writer yields an int_writer. What do I want to use the result of operator* for? Only for assigning to, right? And I want the assignment to write something on standard output. If I make operator* return the very object for which operator* was called on, we can use operator= for that class to do the writing. If we made operator* return some other type, we need to create two types, one int_writer, and one class whose only job in life is to be assignable by int, and for which assignment actually means writing. Perhaps the latter is purer, but the former is so much less work. Here's what the implementation looks like:

    int_writer& int_writer::operator++()
    {
      return *this; // do nothing
    }

    int_writer& int_writer::operator*()
    {
      return *this; // do nothing
    }

    int_writer& int_writer::operator=(int i)
    {
      cout << i << endl;
      return *this;
    }

Operator++ we implement to do nothing at all, since it's not used for anything, but operator* and operator= are interesting. This means that if "dest" is of type int_writer, the line "*dest = *begin", can be expanded to:

    dest.operator*().operator=(*begin);

Since the return value of "dest.operator*()" is a reference to "dest" itself, the following "operator=(*begin)" means "dest.operator=(*begin)", and if the type of "*begin" can be implicitly converted to "int", we're writing something. Cool eh? Here's all it takes to write the contents of the prime number array:

    copy(iarr,iarr+isize,int_writer());

Of course, the name "int_writer" is a dead giveaway for a class template, isn't it? Why limit it to integers only?

    template <class T>
    class writer
    {
    public:

      // trust the compiler to generate necessary constructors
      // and destructor
      writer<T>& operator++();
      writer<T>& operator*();
      writer<T>& operator=(const T&); // does the real writing.
    };

    template <class T>
    writer<T>& writer<T>::operator++()
    {
      return *this;
    }

    template <class T>
    writer<T>& writer<T>::operator*()
    {
      return *this;
    }

    template <class T>
    writer<T>& writer<T>::operator=(const T& t)
    {
      cout << t << endl;
      return *this;
    }

I've changed the signature for "operator=" to accept a const reference instead of a value, since T might be a type for which copying is expensive. With this template, yet another requirement surfaced: T must be writable through operator<<. Of course, for writer<T>, that's no surprise, how else would you write it? The prime number copying now becomes:

    copy(iarr,iarr+isize,writer<int>());

As a last example, let's have a look at the source side, the types for "IN". Can we create a type matching the requirements for "IN", such that a copy would read values from standard input (normally the keyboard?) The requirements for "IN" are a little bit more complicated than those for "OUT." It must be not-equal comparable, and operator* must return a value. This requires some thought, especially on the reachability issue: it must be possible to reach one value from another, through operator++, such that operator!= eventually yields false.

To make this example simple, I propose that we can create a "reader<T>" with a number, and the number is the amount of T's to read from standard input. For every operator++, a new T is read, and the number of reads remaining is decremented. It must also be possible to create an "end" reader<T>, and we can use the parameter-less constructor for that. Here's how reader<T> might look:

    template <class T>
    class reader
    {
    public:
      reader(unsigned count=0);
      reader<T>& operator++();
      const T& operator*() const;
      int operator!=(const reader<T>& r) const;
    private:
      unsigned remaining;
      T t;
    };

    template <class T>
    reader<T>::reader(unsigned count)
      : remaining(count) // the number of remaining reads, 0 for
    {                    // the parameter-less constructor.
    }

    template <class T>
    reader<T>& reader<T>::operator++()

    {
      if (remaining > 0)  // read a new value only if
      {                   // there are values to read.
        --remaining;
        cin >> t;
      }
      return *this;
    }

    template <class T>
    const T& reader<T>::operator*() const
    {
      return t; // return the last read value.
    }

    template <class T>
    int reader<T>::operator!=(const reader<T>& r) const
    {
      return r.remaining != 0 || remaining != 0;
    }

The last one's perhaps debatable. I've decided that operator != is really only useful for comparing with the end, i.e. it will only return false if both sides have reached the end (i.e. no remaining reads) state. However, this is mighty neat:

    const unsigned size=5;
    float array[size];

    // read 5 integers from standard input and
    // store in our float array.
    copy(reader<int>(size),reader<int>(),array);

    // print the read values as unsigned long's
    copy(array,array+size,writer<unsigned long>());

This is *VERY* useful. The function template "copy" will be useful for any combination of the above, as long as the types are convertible from *IN to *OUT.

Conclusion
What you've seen here is, most probably, your first ever encounter with generic programming. The template parameters "IN" and "OUT" (from "copy") are called "iterators," or "input iterators" and "output iterators" to be more specific. Anything that behaves like an input iterator can be used as one, and likewise for output iterators. We can write input iterators for data base access, enumerating files in a directory, the series of prime numbers or whatever you want to get values from. We can write output iterators to store values in a data base, to enter values at the end of a linked list, to send audio data to our sound card, and whatever you need.

If you write an algorithm in terms of generic iterators, you can use it with any kind of data source/sink which has iterators that follow your convention. Every algorithm you write can be used with any iterators of the type your algorithm requires. That is a major time/code/debug saver.

To make matters even better, this is all part of the now final draft C++ standard. The draft contains a function template "copy", which behaves identically to what I used in this article, and iterators called "input_iterator" and "output_iterator" which behave very similarly to the "reader" and "writer" class templates. The standard documents 5 iterator categories: input iterator, output iterator, forward iterator (sort of the combination, allows both read and write,) bidirectional iterator (like forward, but allows moving backwards too,) and lastly random access iterators (iterators which can be incremented/decremented by more than one.) Pointers into arrays are typical random access iterators. If you write your iterators to comply with the requirements of one of these categories, your iterators can be used with any of the algorithms that require such iterators.

Short recap of inheritance
Inheritance can be used to make runtime decisions about things we know conceptually, but not in detail. The employee/engineer/manager inheritance tree was an example of that; by knowing about employees in general, we

text. ``Rectangle''. What would you do with a generic shape object? It's better to make it impossible to create one by mistake. Instead I'll attack another often forgotten issue. Please note the obvious that errors that cannot be detected until runtime might go undetected! How to discover errors at design or edit time is not for this article (or even this article series). the better. including engineers. secretaries. As such the ``do nothing at all'' code belongs in ``Circle`` only. it's not quite enough.'' Addressing pure virtuals I won't write a drawing program. lines and so on. This doesn't seem like a good idea because then the programmer implementing the square might forget to implement ``rotate'' without getting compiler errors. so how can we take care of that scenario? It's unnecessary to write code that does nothing. Herein lies the problem. Abstract because you cannot instantiate objects of the class. ``The sooner you catch an error. It won't work. and code its implementation to do nothing. In other words. and any piece of code that can understand the interface can operate on objects implementing the interface (the concrete classes like ``Triangle''. for example marketers. The classic counter example is a vector drawing program. while it is an optimization for circles. A shape can be a square. This makes sense. and make sure to override these in our concrete shape classes. Such a program usually holds a collection of shapes. Pure virtual means that it must be overridden by descendants. The latter is more descriptive. The bad thing with that is that it violates a very simple rule-of-thumb. since that'd make this article way too long. }. • Let's just ignore it. The graphically experienced reader has of course noticed that rotation of a circle can be implemented extremely efficiently by doing nothing at all.'') • We can change the interface of ''Shape`` such that ``rotate'' is not a pure virtual. 
we can create our base class ``Shape'' with virtual member functions ``drawOn(Canvas&)''. Pure virtual (abstract base classes) C++ offers a way of saying ``This member function must be overridden by all descendants. managers. The root of this lies in the illusion that doing nothing at all is the default behaviour. rotated. and the point would be drowned in all other intricacies of graphical programming. project leaders. This in itself is not a problem. the best solution is with the original pure abstract ``Shape'' class. and an empty implementation for ``Circle::rotate.'' There are 5 phases in which an error can be found. ``translate(Coordinate c)''. A deficiency in the model While this is good. and ``Circle''). virtual void rotate(double angle) = 0. is it not? Let's have a look at the alternatives. If you try you'll get compiler errors. only objects of classes inheriting from it. etc. compile. you can draw them on a canvas. The problem in the model lies in the common base. since it's meaningless anyway. salesmen and janitors.'' Saying so also implies that objects of the class itself can never be instantiated.can handle any kind of employee. or some times an interface class. but there's a simple way of moving this particular discovery from runtime to compile time. Here's how a pure abstract base class might be defined: class Shape { public: virtual void draw(Canvas&) = 0. The problem is. translated. you can scale them. you can rotate them and translate them. a collection of grouped images. ``rotate(double degrees)''. a circle. virtual void translate(Coordinate c) = 0. Having one or more pure virtual member functions in a class makes the class an abstract base class. design. It's only the concrete shapes that can be drawn. scaled. If you send something internationally . The ``= 0'' ending of a member function declaration makes it pure virtual. virtual void scale(double) = 0. link and runtime. How do we force the descendants to override them? 
One way (a bad way) is to implement them in the base class in such a way that they bomb with an error message when called. since then our ``Circle'' class will be an abstract class (at least one pure virtual is not ``terminated. a rectangle. shape. how do you do any of these for a shape in general? How is a generic shape drawn or rotated? It's impossible. You know a number of things for shapes in general. Mailing addresses have different formatting depending on sender and receiver country. though. the class defines an interface that descendants must conform to. ``scale(double)'' and so on. edit. addresses. and even kinds of employees we haven't yet thought of. A class which has only pure virtual member functions and no data is often called a pure abstract base class.

As an exercise you can improve this. phone number. Ham Radio call-signs. of course. (i. E-mail. This class. but much stricter than public. however. Make sure ``State'' is only dealt with in address kinds where it makes sense. }. If the parameter for ``print'' is non-zero. This can be achieved through the third protection level. and I will assume that PostalCode and State/Zip in U. We've seen how we can make things generally available by declaring them public. I'll only have one field that's used either as postal code or as state/zip combination. addresses are synonymous (i. but not pure virtual (what would happen if it was?) Unselfish protection All kinds of mailing addresses will share a base. virtual ~Address(). e-mail address and so on. or by hiding them from the general public by making them private.K. As a simplification for this example I'll treat State and Zip in U. The Country-Code as can be seen in the Swedish address example will also be ignored (this too makes for an excellent exercise to include). It is thus looser than private. virtual void print(int international=0) const = 0. fax number. but only the descendants and no one else. We want descendants. the address will be printed in international form. inheriting from ``Address''.e. country name will be added to mailing addresses and international prefixes added to phone numbers). there are totally different types of addresses. Here we want something in between. to access the address fields. Note that the destructor is virtual. depending on country). ``protected. The member function ``acquire'' is used for asking an operator to enter address data. will not implement any of the formatting pure virtuals from ``Address.S. The idea here is that ``type'' can be used to ask an address object what kind of address it add the destination country to the address. that contains the address fields. The member function ``type'' will be defined here. etc. addresses as a unit. 
Name Number Street City {Country} Postal-Code Then. to always return the string ``Mailing address''. while for domestic letters that's not necessary. State Zip {Country-Name} Canada and U.S.S.'' That must be done by the concrete address classes with knowledge about the country's formatting and naming. and this is a problem. Access to the address fields is for the concrete classes only. virtual void acquire(void) = 0. a mailing address. Here are a few (simplified) examples: Sweden Name Street Number {Country-Code}Postal-Code City {Country-Name} USA Name Number Street City. Here's the base class: class Address { public: virtual const char* type() const = 0. the concrete address classes. Here comes the ``MailingAddress'' base class: .'' Protected means that access is limited to the class itself (of course) and all descendants of it. The formatting itself also differs from country to country. The address class hierarchy will be done such that other kinds of addresses like e-mail addresses and phone numbers can be added.e. Addresses. and ways to access them. since all kinds of mailing addresses are mailing addresses. however. even if they're Swedish addresses or U.

  class MailingAddress : public Address
  {
  public:
    virtual ~MailingAddress();
    const char* type() const;
  protected:
    MailingAddress();

    void name(const char*);             // set
    const char* name() const;           // get
    void street(const char*);           // set
    const char* street() const;         // get
    void number(const char*);           // set
    const char* number() const;         // get
    void city(const char*);             // set
    const char* city() const;           // get
    void postalCode(const char*);       // set
    const char* postalCode() const;     // get
    void country(const char*);          // set
    const char* country() const;        // get
  private:
    char* name_data;
    char* street_data;
    char* number_data;
    char* city_data;
    char* postalCode_data;
    char* country_data;
    //
    // declared private to disallow them
    //
    MailingAddress(const MailingAddress&);
    MailingAddress& operator=(const MailingAddress&);
  };

Here the copy constructor and assignment operator are declared private to disallow copying and assignment. This is not because they conceptually don't make sense, but because I'm too lazy to implement them (and yet want protection from stupid mistakes that would come, no doubt, if I left it to the compiler to generate them).

It's the responsibility of this class to manage memory for the data strings; distributing this to the concrete descendants is asking for trouble. As a rule of thumb, protected data is a bad mistake. Having all data private, always managing the resources for the data in a controlled way, and giving controlled access through protected access member functions will drastically cut down your aspirin consumption.

The reason for the constructor to be protected is more or less just aesthetic. No one but descendants can construct objects of this class anyway, since some of the pure virtuals from ``Address'' aren't yet terminated.

Now we get to the concrete address classes:

  class SwedishAddress : public MailingAddress
  {
  public:
    SwedishAddress();
    virtual void print(int international=0) const;
    virtual void acquire(void);
  };

  class USAddress : public MailingAddress
  {
  public:
    USAddress();
    virtual void print(int international=0) const;
    virtual void acquire(void);
  };

As you can see, the definitions of ``USAddress'' and ``SwedishAddress'' are identical. The only difference lies in the implementation of ``print'' and ``acquire''. I've left the destructors to be implemented at the compiler's discretion. Since there's no data to take care of in these classes (it's all in the parent class) we don't need to do anything special here. We know the parent takes care of it. Don't be afraid of copy construction and assignment. They were declared private in ``MailingAddress'', which means the compiler cannot create them for ``USAddress'' and ``SwedishAddress.''

Let's look at the implementation. For the ``Address'' base class only one thing needs implementing and that is the destructor. Since the class holds no data, the destructor will be empty:

  Address::~Address()
  {
  }

A trap many beginners fall into is to think that since the destructor is empty, we can save a little typing by declaring it pure virtual and there won't be a need to implement it. That's wrong, though, since the destructor will be called when a descendant is destroyed. There's no way around that. If you declare it pure virtual and don't implement it, you'll probably get a nasty run-time error when the first concrete descendant is destroyed.

The observant reader might have noticed a nasty pattern of the author's refusal to get to the point with pure virtuals and implementation. Yes, you can declare a member function pure virtual, and yet implement it! Pure virtual does not illegalize implementation. It only means that the pure virtual version will NEVER be called through virtual dispatch (i.e. by just calling the function on an object, a reference or a pointer to an object.) Since it will never, ever, be called through virtual dispatch, it must be implemented by the descendants, hence the rule that you cannot instantiate objects where pure virtuals are not terminated. By termination, by the way, I mean declaring it in a non pure virtual way.

OK, so a pure virtual won't ever be called through virtual dispatch. Then how can one be called? Through explicit qualification. Let's assume, just for the sake of argument, that we through some magic found a way to implement some reasonable generic behaviour of ``acquire'' in ``Address,'' but we want to be certain that descendants do implement it. The only way to call the implementation of ``acquire'' in ``Address'' is to explicitly write ``Address::acquire.'' This is what explicit qualification means. There's no escape for the compiler; writing it like this can only mean one thing, even if ``Address::acquire'' is declared pure virtual.

Now let's look at the middle class, the ``MailingAddress'' base class. I said when explaining the interface for this class, that it is responsible for handling the resources for the member data. Since we don't know the length of the fields, we oughtn't restrict them, but rather dynamically allocate whatever is needed. From this to the constructor:

  MailingAddress::MailingAddress()
    : name_data(0),
      street_data(0),
      number_data(0),
      city_data(0),
      postalCode_data(0),
      country_data(0)
  {
  }

The only thing the constructor does is to make sure all pointers are 0, in order to guarantee destructability.

  MailingAddress::~MailingAddress()
  {
    delete[] name_data;
    delete[] street_data;
    delete[] number_data;
    delete[] city_data;
    delete[] postalCode_data;
    delete[] country_data;
  }

The ``delete[]'' syntax is for deleting arrays as opposed to just ``delete'' which deletes single objects. Note that it's legal to delete the 0 pointer. This is used here. If, for some reason, one of the fields is not set to anything, it will be 0. Deleting the 0 pointer does nothing at all.

The ``type'' and read-access methods are trivial:

  const char* MailingAddress::type(void) const
  {
    return "Mailing address";
  }

  const char* MailingAddress::name(void) const
  {
    return name_data;
  }

  const char* MailingAddress::street(void) const
  {
    return street_data;
  }

  const char* MailingAddress::number(void) const
  {
    return number_data;
  }

  const char* MailingAddress::city(void) const
  {
    return city_data;
  }

  const char* MailingAddress::postalCode(void) const
  {
    return postalCode_data;
  }

  const char* MailingAddress::country(void) const
  {
    return country_data;
  }

The write access methods are a bit trickier, though. First we must check if the source and destination are the same, and do nothing in those situations. This is to achieve robustness. While it may seem like a very stupid thing to do, it's perfectly possible to see something like:

  name(name());

The meaning of this is, of course, ``set the name to what it currently is.'' We must make sure that doing this works (or find a way to illegalize the construct, but I can't think of any way). If the source and destination are different, however, the old destination must be deleted, a new one allocated on heap and the contents copied. Like this:

  void MailingAddress::name(const char* n)
  {
    if (n != name_data)
    {
      delete[] name_data; // OK even if 0
      name_data = new char[strlen(n)+1];
      strcpy(name_data,n);
    }
  }

``strlen'' and ``strcpy'' are the C library functions from <string.h> that calculate the length of, and copy, strings.

This is done so many times over and over, exactly the same way for all kinds of data members, that we'll use a convenience function, ``replace,'' to do the job.

  static void replace(char*& data, const char* n)
  {
    if (data != n)
    {
      delete[] data;
      data = new char[strlen(n)+1];
      strcpy(data,n);
    }
  }

sizeof(buffer)).n).n). cout << "Street: " << flush. . street(buffer). } void MailingAddress::city(const char* n) { ::replace(city_data. } void MailingAddress::street(const char* n) { ::replace(street_data.getline(buffer. All they do is to ask questions with the right terminology and output the fields in the right places: SwedishAddress::SwedishAddress() : MailingAddress() { country("Sweden").getline(buffer. // A mighty long field cout << "Name: " << flush.getline(buffer. Now it's time for the concrete classes.n). if (international) cout << country() << endl. name(buffer). cout << street() << ' ' << number() << endl.sizeof(buffer)). cout << postalCode() << ' ' << city() << endl. } void MailingAddress::postalCode(const char* n) { ::replace(postalCode_data. // what else? } void SwedishAddress::print(int international) const { cout << name() << endl. cin.n). } That was all the ``MailingAddress'' base class does. cin. } void MailingAddress::country(const char* n) { ::replace(country_data. } void MailingAddress::number(const char* n) { ::replace(number_data. } void SwedishAddress::acquire(void) { char buffer[100].sizeof(buffer)). cin. cout << "Number: " << flush.n).Using this convenience function. the write-access member functions will be fairly straight forward: void MailingAddress::name(const char* n) { ::replace(name_data.n).

getline( buffer.getline(buffer. cin.sizeof(buffer)). cout << "City: " << flush. Here's an short and simple example program that (of course) also makes use of the generic programming paradigm introduced last month.sizeof(buffer)). cin.sizeof(buffer)).getline(buffer.S.number(buffer). Address* addrs[size].getline(buffer. cout << "Number: " << flush. Address** first = addrs. } USAddress::USAddress() : MailingAddress() { country("U. name(buffer). sizeof(buffer)).getline(buffer. int main(void) { const unsigned size=10. // what else? } void USAddress::print(int international) const { cout << name() << endl.").sizeof(buffer)). cout << endl << "--------" << endl. city(buffer).sizeof(buffer)). cout << "Street: " << flush. street(buffer). cin. } A toy program Having done all this work with the classes. cin. city(buffer).getline(buffer.sizeof(buffer)). // Seems like a mighty long field cout << "Name: " << flush. cout << "City: " << flush. .A. // needed for VACPP (bug?) Address** last = get_addrs(addrs.addrs+size). cout << "State and ZIP: " << flush. postalCode(buffer). cin. number(buffer).getline(buffer. cout << number() << ' ' << street() << endl. cin. we must of course play a bit with them. } void USAddress::acquire(void) { char buffer[100]. cout << city() << ' ' << postalCode() << endl. if (international) cout << country() << endl. postalCode(buffer). cin. cout << "Postal code: " << flush.

    for_each(first,last,print(1));
    for_each(first,last,deallocate<Address>());
    return 0;
  }

OK, that was mean. Obviously there's a function ``get_addrs'', which reads addresses into a range of iterators (in this case pointers in an array) until the array is full, or it terminates for some other reason. Here's how it may be implemented:

  Address** get_addrs(Address** first,Address** last)
  {
    Address** current = first;
    while (current != last)
    {
      cout << endl << "Kind (U)S, (S)wedish or (N)one " << flush;
      char answer[5]; // Should be enough.
      cin.getline(answer,sizeof(answer));
      if (!cin)
        break;
      switch (answer[0]) {
      case 'U': case 'u':
        *current = new USAddress;
        break;
      case 'S': case 's':
        *current = new SwedishAddress;
        break;
      default:
        return current;
      }
      (**current).acquire();
      ++current;
    }
    return current;
  }

In part 6 I mentioned that virtual dispatch could replace switch statements, and yet here is one. Could this one be replaced with virtual dispatch as well? It would be unfair of me to say ``no'', but it would be equally unfair of me to propose using virtual dispatch here. The reason is that we'd need to work a lot without gaining anything. Why? We obviously cannot do virtual dispatch on the ``Address'' objects we're about to create, since they're not created yet. Instead we'd need a set of address creating objects, which we can access through some subscript or whatever, and call a virtual creation member function for. Doesn't seem to save a lot of work does it? Probably the selection mechanism for which address creating object to call would be a switch statement anyway!

So, that was reading, now for the rest. ``for_each'' does something for every iterator in a range. It could be implemented like this:

  template <class OI,class F>
  void for_each(OI first,OI last, const F& functor)
  {
    while (first != last)
    {
      functor(*first);
      ++first;
    }
  }

In fact, in the (final draft) C++ standard, there is a beast called ``for_each'' and behaving almost like this one (it returns the functor). It's pretty handy. Imagine never having to explicitly loop through a complete collection again.

What is ``print'' then? Print is a ``functor,'' or ``function object'' as they're often called. It's something which behaves like a function, but which might store a state of some kind (in this case whether the country should be added to written addresses or not), and which can be passed around like any object. Defining one is easy, although it looks odd at first.

  class print
  {
  public:
    print(int i);
    void operator()(const Address*) const;
  private:
    int international;
  };

  print::print(int i)
    : international(i)
  {
  }

  void print::operator()(const Address* p) const
  {
    p->print(international);
    cout << endl;
  }

What on earth is ``operator()''? It's the member function that's called if we boldly treat the name of an object just as if it was the name of some function, and simply call it. Like this, where ``anAddress'' stands for any pointer to an ``Address'':

  print pobject(1);   // define print object.
  pobject(anAddress); // pobject.operator()(anAddress)

This is usually called the ``function call'' operator, by the way.

The only remaining thing now is ``deallocate<T>'', but you probably already guessed it looks like this:

  template <class T>
  class deallocate
  {
  public:
    void operator()(T* p) const;
  };

  template <class T>
  void deallocate<T>::operator()(T* p) const
  {
    delete p;
  }

This is well enough for one month, isn't it? You know what? You know by now most of the C++ language, and have some experience with the C++ standard class library. Most of the language issues that remain are more or less obscure and little known. We'll look mostly at library stuff and clever ideas for how to use the language from now on.

Recap

This month, you've learned:

• what pure virtual means, and how you declare pure virtual functions.
• that despite what most C++ programmers believe, pure virtual functions can be implemented.
• that the above means that there's a distinction between terminating a pure virtual, and implementing one.
• why it's a bad idea to make destructors pure virtual.
• a new protection level, ``protected.''
• why protected data is bad, and how you can work around it in a clever way.
• that switch statements cannot always be replaced by virtual dispatch.
• that there is a ``function call'' operator and how to define and use it.

Exercises

• Find out what happens if you declare the ``MailingAddress'' destructor pure virtual, and yet define it.
• Think of two ways to handle the State/Zip problem, and implement both (what are the advantages, disadvantages of the methods?)
• Rewrite ``get_addrs'' to accept templatized iterators instead of pointers.

Here's something for you to think about: destructor, pointer and delete.

Coming up

As I mentioned just a few lines above, most of the C++ language is covered. Quite a bit of the library remains, and lots of useful and cool techniques are waiting to be exploited. Anyway, next month we'll look at file I/O (finally).

/Björn

In parts 5 and 6, the basics of I/O were introduced, with formatted reading and writing from standard input and output. We'll now have a look at I/O for files. In a sense, it's better to stop using the term I/O here, and instead use streams and streaming, since the ideas expressed here and in parts 5 and 6 can be used for other things than I/O, for example in-memory formatting of data (we'll see that at the very end of this article.)

Files

In what way is writing ``Hello world'' on standard output different from writing it to a file? The question is worth some thought, since in many programming languages there is a distinct difference. Is the message different? Is the format (as seen from the program) different? I cannot see any difference in those aspects. The only thing that truly differs is the media where the formatted message ends up. In the former case, it's on your screen, but for file I/O it's in a file somewhere on your hard disk. In other words, there is very little difference, or at least, there's very much in common.

As we've seen so far, commonality is expressed either through inheritance or templates, depending on what's common and what's not. To refresh your memory, templates are used when we want the same kind of behaviour, independent of data. For example a stack of some data type. Inheritance is used when you want similar, but in some important aspects different, behaviour at runtime for the same kind of data. We saw this for the staff hierarchy and mailing addresses in parts 7 and 8. In this case it's inheritance that's the correct solution, since the data will be the same, but where it will end up (and most notably, how it does end up there) differs. (Incidentally, there's a good case for using templates too, regarding the type of characters used. The C++ standard does indeed have templatized streams, just for differing between character types. Few compilers today support this, however. See the ``Standards Update'' towards the end of the article for more information.)

The inheritance tree for stream types looks like this:

  (figure: inheritance tree of the stream classes)

The way to read this is that there's a base class named ``ios'', from which the classes ``istream'' and ``ostream'' inherit. The classes ``ifstream'' and ``ofstream'' in their turn inherit from ``istream'' and ``ostream'' respectively. The ``f'' in the names implies that they're file streams. Then there's the odd ones, ``iostream'', which inherits from both ``istream'' and ``ostream'', and ``fstream'' which inherits from both ``ifstream'' and ``ofstream.'' Inheriting from two bases is called multiple inheritance, and is by many seen as evil. Many programming languages have banned it: Objective-C, Java, Smalltalk to mention a few, while other programming languages, like Eiffel, go to the other extreme and allow you to inherit the same base several times. Personally I think multiple inheritance is very useful if used right, but it can cause severe problems. Here is a situation where it's used in the right way.

Anyway, this means that ``fstream'' is a file stream for both reading and writing, while ``iostream'' is an abstract stream for both reading and writing. More often than you think, you probably don't want to use the ``iostream'' or ``fstream'' classes.

This inheritance, however, means that all the stream insertion and extraction functions (the ``operator>>'' and ``operator<<'') you've written, will work, just as they are, with file streams. Now, wasn't that neat? In other words, the only things you need to learn for file based I/O are the details that are specific to files.

File Streams

The first thing you need to know before you can use file streams is how to create them. The parts of interest look like this:

  class ifstream : public istream
  {
  public:
    ifstream();
    ifstream(const char* name, int mode=ios::in);
    void open(const char* name, int mode=ios::in);
    ...
  };

  class ofstream : public ostream
  {
  public:
    ofstream();
    ofstream(const char* name, int mode=ios::out);
    void open(const char* name, int mode=ios::out);
    ...
  };

  class fstream : public ofstream, public ifstream
  {
  public:
    fstream();
    fstream(const char* name, int mode);
    void open(const char* name, int mode);
    ...
  };

You get access to the classes by #including <fstream.h>. The empty constructors always create a file stream object that is not tied to any file. To tie such an object to a file, a call to ``open'' must be made. ``open'' and the constructors with parameters behave identically. ``name'' is of course the name of the file. Since you normally use either ``ifstream'' or ``ofstream'' and rarely ``fstream'', this is normally the only parameter you need to supply. Sometimes, however, you need to use the ``mode'' parameter. It's a bit field, in which you use bitwise or (``operator|'') for any of the values ``ios::in'', ``ios::out'', ``ios::ate'', ``ios::app'', ``ios::trunc'', and finally ``ios::binary.'' Some implementations also provide ``ios::nocreate'' and ``ios::noreplace,'' but those are extensions. Some implementations do not have ``ios::binary,'' while others call it ``ios::bin.'' These variations of course make it difficult to write portable C++ today. Fortunately, the six ones listed first are required by the standard (although, they belong to class ``ios_base,'' rather than ``ios.'') The meaning of these are:

  ios::in         open for reading.
  ios::out        open for writing.
  ios::ate        open with the get and set pointer at the end (see Seeking
                  for info) of the file.
  ios::app        open for append, that is, any write you make to the file
                  will be appended to the file.
  ios::trunc      scrap all data in the file if it already exists.

ios::nocreate   cause the open to fail if the file doesn't exist.
ios::noreplace  cause the open to fail if the file already exists.
ios::binary     open in binary mode, that is, do not do the brain damaged LF<->CR/LF conversions that OS/2, DOS, CP/M (RIP), Windows, and probably other operating systems, so often insist on.

The reason some implementations do not have ios::binary is that many operating systems do not have this conversion, so there's no need for it. Of course combinations like ``ios::noreplace | ios::nocreate'' don't make sense; the failure is guaranteed.

On many implementations today there's also a third parameter for the constructors and ``open,'' a protection parameter. How this parameter behaves is very operating system dependent.

Now for some simple usage:

    #include <fstream.h>

    int main(int argc, char* argv[])
    {
        if (argc != 2) {
            cout << "Usage: " << argv[0] << " filename" << endl;
            return 1; // error code
        }
        ofstream of(argv[1]); // create the ofstream object
                              // and open the file.
        if (!of) { // something went wrong
            cout << "Error, cannot open " << argv[1] << endl;
            return 2;
        }
        // Now the file stream object is created. Write to it!
        of << "Hello file!" << endl;
        return 0;
    }

As you can see, once the stream object is created, its usage is analogous to that of ``cout'' that you're already familiar with. Of course reading with ``ifstream'' is done the same way, just use the object as you've used ``cin'' earlier.

The file stream classes also have a member function ``close'', that by force closes the file and unties the stream object from it. Few are the situations when you need to call this member function, since the destructors do close the file.

Actually this is all there is that's specific to files.

Binary streaming
So far we've dealt with formatted streaming only, that is, the process of translating raw data into a human readable form, or translating human readable data into the computer's internal representation. Sometimes you want to stream raw data as raw data, for example to save space in a file. If you look at a file produced by, for example, a word processor, it's most likely not in a human readable form.

Note that binary streaming does not necessarily mean using the ``ios::binary'' mode when opening a file (although that is indeed often the case.) They're two different concepts. Binary streaming is what you use your stream for, raw data that is, and opening a file with the ``ios::binary'' mode means turning the brain damaged LF<->CR/LF translation off.

Binary streaming is done through the stream member functions:

    class ostream ...
    {
    public:
        ostream& write(const char* s, streamsize n);
        ostream& put(char c);
        ostream& flush();
        ...
    };

    class istream ...
    {
    public:
        istream& read(char* s, streamsize n);
        int get();
        istream& get(char& c);
        istream& get(char* s, streamsize n, char delim='\n');
        istream& getline(char* s, streamsize n, char delim='\n');
        istream& ignore(streamsize n=1, int delim=EOF);
        ...
    };

The writing interface is extremely simple and straightforward, while the reading interface includes a number of small but important differences. Note that these member functions are implemented in classes ``istream'' and ``ostream,'' so they're not specific to files, although files are where you're most likely to use them. Let's have a look at them, one by one:

    ostream& ostream::write(const char* s, streamsize n);

Write ``n'' characters to the stream, from the array pointed to by ``s.'' ``streamsize'' is a signed integral data type. Despite ``streamsize'' being signed, you're of course not allowed to pass a negative size here (what would that mean?) Exactly the characters found in ``s'' will be written to the stream, no more, no less.

    ostream& ostream::put(char c);

Inserts the character into the stream.

    ostream& ostream::flush();

Force the data in the stream to be written (file streams are usually buffered.)

    istream& istream::read(char* s, streamsize n);

Read ``n'' characters into the array pointed to by ``s.'' Here you better make sure that the array is large enough, or unpleasant things will happen. Note that only the characters read from the stream are inserted into the array. It will not be zero terminated, unless the last character read from the stream indeed is '\0'.

    int istream::get();

Read one character from the stream, and return it. The value is an ``int'' instead of ``char'' since the return value might be ``EOF'' (which is not uniquely representable as a ``char.'')

    istream& istream::get(char& c);

Same as above, but read the character into ``c'' instead. Here a ``char'' is used instead of an ``int,'' since you can check the value directly by calling ``.eof()'' on the reference returned.

    istream& istream::get(char* s, streamsize n, char delim='\n');

This one's similar to ``read'' above, but with the difference that it reads at most ``n-1'' characters and zero terminates the array. It stops if the delimiter character is found. Note that when the delimiter is found, it is not read from the stream.

    istream& istream::getline(char* s, streamsize n, char delim='\n');

The only difference between this one and ``get'' above, is that this one does read the delimiter from the stream. Note, however, that the delimiter is not stored in the array.

    istream& istream::ignore(streamsize n=1, int delim=EOF);

Reads at most ``n'' characters from the stream, but doesn't store them anywhere. If the delimiter character is read, it stops there. Of course, if the delimiter is ``EOF'' (as is the default) it does not read past ``EOF,'' that's physically impossible.
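The difference between ``get'' and ``getline'' is easy to get wrong, so here is a small sketch of it. It is written against the standard headers rather than <fstream.h>, and it uses a string stream in place of a file (my assumption, so that no file is needed); the function names are made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// After get() stops at the delimiter, the delimiter is still in the stream.
int nextAfterGet()
{
    std::istringstream is("abc\ndef");
    char buf[16];
    is.get(buf, sizeof(buf));       // stores "abc", leaves '\n' unread
    return is.get();                // so the next character is '\n'
}

// getline() also stops at the delimiter, but removes it from the stream.
int nextAfterGetline()
{
    std::istringstream is("abc\ndef");
    char buf[16];
    is.getline(buf, sizeof(buf));   // stores "abc", consumes '\n'
    return is.get();                // so the next character is 'd'
}

// write() and read() move exactly the bytes asked for, '\0' included.
std::string writePutRead()
{
    std::stringstream s;
    s.write("raw\0data", 8);        // 8 characters, one of them '\0'
    s.put('!');
    s.flush();
    char buf[9];
    s.read(buf, sizeof(buf));
    assert(s.gcount() == 9);        // gcount() reports what read() delivered
    return std::string(buf, static_cast<std::size_t>(s.gcount()));
}
```

Note that nothing in ``writePutRead'' zero terminates ``buf''; the length has to be carried separately, here through ``gcount''.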

Array on file
An example: Say we want to store an array of integers in a file, and we want to do this in raw binary format. A reasonable way is to first store a size (in elements) followed by the data. Both the size and the data will be in raw format.

    #include <fstream.h>

    void storeArray(ostream& os, const int* p, size_t elems)
    {
        os.write((const char*)&elems, sizeof(elems));
        os.write((const char*)p, elems*sizeof(*p));
    }

The above code does a lot of ugly type casting, but that's normal for binary streaming. What's done here is to use brute force to see the address of ``elems'' as a ``const char*'' (since that's what ``write'' expects) and then say that only the ``sizeof(elems)'' bytes from that pointer are to be read. What this actually does is to write out the raw memory that ``elems'' resides in to the stream. After this, it does the same kind of thing for the array. Note that ``sizeof(*p)'' reports the size of the type that ``p'' points to. I could as well have written ``sizeof(int),'' but that is a dangerous duplication of facts. It's enough that I've said that ``p'' is a pointer to ``int.'' Repeating ``int'' again just means I'll forget to update one of them when I change the type to something else.

Naturally we want to be able to read the array as well. To read such an array into memory requires a little more work:

    #include <fstream.h>

    size_t readArray(istream& is, int*& p)
    {
        size_t elems;
        is.read((char*)&elems, sizeof(elems));
        p = new int[elems];
        is.read((char*)p, elems*sizeof(*p));
        return elems;
    }

It's not particularly hard to follow: first read the number of elements, then allocate an array of that size, and read the data into it.

Seeking
Up until now we have seen streams as what it sounds like, continuous streams of data. Sometimes however, there's a need to move around, both backward and forward. Streams like standard input and standard output are truly continuous streams, within which you cannot move around. Files, in contrast, are true random access data stores.

Random access streams have something called position pointers. They're not to be confused with pointers in the normal C++ sense, but it's something referring to where in the file you currently are. There's the put pointer, which refers to the next position to write data to, if you attempt to write anything, and the get pointer, which refers to the next position to read data from. An ostream of course only has the put pointer, and an istream only the get pointer.

There's a total of 6 new member functions that deal with random access in a stream:

    streampos istream::tellg();
    istream& istream::seekg(streampos);
    istream& istream::seekg(streamoff, ios::seek_dir);
    streampos ostream::tellp();
    ostream& ostream::seekp(streampos);
    ostream& ostream::seekp(streamoff, ios::seek_dir);

``streampos'', which you get from ``tellg'' and ``tellp'', is an absolute position in a stream. You cannot use the values for anything other than ``seekg'' and ``seekp''. You especially cannot examine a value and hope to find something useful there (i.e. you can, but what you find out might hold only for the current release of your specific compiler; other compilers, or other releases of the same compiler, might show different characteristics for ``streampos.'') Well, there are two other things you can do with ``streampos'' values. You can subtract two values and get a ``streamoff'' value, and you can add a ``streamoff'' value to a ``streampos'' value.
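To tie the array example and the position pointers together, here is a sketch under a few assumptions of mine: it uses the standard headers, ``reinterpret_cast'' instead of the C style casts, a ``std::vector'' instead of a raw ``new int[]'', and a string stream instead of a file. The helper ``bytesWritten'' is made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Same layout as storeArray above: element count first, then the raw elements.
void storeArray(std::ostream& os, const std::vector<int>& v)
{
    const std::size_t elems = v.size();
    os.write(reinterpret_cast<const char*>(&elems), sizeof(elems));
    os.write(reinterpret_cast<const char*>(v.data()),
             static_cast<std::streamsize>(elems * sizeof(int)));
}

// Reads the count, sizes the vector, then reads the raw elements into it.
std::vector<int> readArray(std::istream& is)
{
    std::size_t elems = 0;
    is.read(reinterpret_cast<char*>(&elems), sizeof(elems));
    std::vector<int> v(elems);
    is.read(reinterpret_cast<char*>(v.data()),
            static_cast<std::streamsize>(elems * sizeof(int)));
    assert(is && "stream ended before the whole array was read");
    return v;
}

// Subtracting two streampos values gives a streamoff: the bytes written.
std::streamoff bytesWritten(std::ostream& os, const std::vector<int>& v)
{
    const std::streampos before = os.tellp();
    storeArray(os, v);
    return os.tellp() - before;
}
```

A position saved with ``tellg'' before calling ``readArray'' can be handed back to ``seekg'' afterwards, and a second ``readArray'' call then returns the same data again.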

``streamoff,'' by the way, is some signed integral type, probably a ``long.'' By using the value returned from ``tellg'' or ``tellp,'' you have a way of finding your way back, or do relative searches by adding/subtracting ``streamoff'' values.

The ``seekg'' and ``seekp'' methods that accept a ``streamoff'' value and a direction work in a slightly different way. You search your way to a position relative to the beginning of the stream, the end of the stream, or the current position, the selection of which is done through the ``ios::seek_dir'' enum, which has these three values: ``ios::beg'', ``ios::end'' and ``ios::cur.'' To make the next write occur on the very first byte of the stream, call ``os.seekp(0,ios::beg);'' where ``os'' is some random access ``ostream.''

In any reasonable implementation, any of the seek member functions use lazy evaluation. That is, when you call any of the seek member functions, the only thing that happens is that some member variable in the stream object changes value. It's not until you actually read or write that something truly happens on disk (or wherever the stream data resides.)

A stream array, for really huge amounts of data
Suppose we have a need to access enormous amounts of simple data, say 10 million floating point numbers. It's not a very good idea to just allocate that much memory, at least not on my machine with a measly 64Mb RAM. It'll not just make this application crawl, but probably the whole system due to excessive paging. Instead, let's use a file to access the data. This makes for slow access, for sure, but nothing else will suffer.

Here's the idea. The array must be possible to use with any data type, including user defined classes. Its usage must resemble that of real arrays as much as possible, but extra functionality that arrays do not have, such as asking for the number of elements in it, is OK. There must be a type, resembling pointers to arrays, that can be used for traversing it. We do not want the size of the array to be part of its type (if you've programmed in Pascal, you know why.) In addition to what arrays offer, we want some measures of safety from stupid mistakes, such as addressing beyond the range of the array, and also for errors that arrays cannot have (disk full, cannot create file, disk corruption, etc.) We also want to say that an array is just a part of a file and not necessarily an entire file. This would allow the user to create several arrays within the same file. As can be expected, we cannot have the entire array duplicated in memory (then all the benefits will be lost,) instead we will search for the data on file every time it's needed.

To prevent this article from growing way too long, quite a few of the above listed features will be left for next month. The things to cover this month are: An array of built-in fundamental types only, which lacks pointers and is limited to one file per array. We'll also skip error handling for now (you can add it as an exercise, I'll raise some interesting questions along the way,) and add that too next month.

First of all, the array must be a template, so it can be used to store arbitrary types. Since we do not want the size to be part of the type signature, the size is not a template parameter, but a parameter for the constructor. Of course, ``operator[]'' can be overloaded, which is handy for providing a familiar syntax. Here's the outline for the class.

    template <class T>
    class FileArray
    {
    public:
        FileArray(const char* name, size_t elements);
        // Create a new array and set the size.
        FileArray(const char* name);
        // Create an array from an existing file, get the
        // size from the file.

        // use compiler defined destructor.

        T operator[](size_t index) const;
        ??? operator[](size_t index);
        size_t size() const;
    private:
        // don't want these to be used.
        FileArray(const FileArray&);
        FileArray& operator=(const FileArray&);
        ...
    };

However, already here we see a problem. What's the non-const ``operator[]'' to return? To see why this is a problem, ask yourself what you want ``operator[]'' to do.

I want ``operator[]'' to do two things, depending on where it's used, like this:

    FileArray<int> x;
    ...
    x[5] = 4;
    int y = x[3];

When ``operator[]'' is on the left hand side of an assignment, I want to write data to the file, and if it's on the right hand side of an assignment, I want to read data from the file. Ouch.

Warning: I've often seen it suggested that the solution is to have the const version read and return a value, and the non-const version write a value. As slick as it would be, it's wrong and it won't work. The const version is called for const array objects, the non-const version for non-const array objects.

Instead what we have to do is to pull a little trick. The trick is, as so often in computer science, to add another level of indirection. This is done by not taking care of the problem in ``operator[],'' but rather let it return a type which does the job. We create a class template, looking like this:

    template <class T>
    class FileArrayProxy
    {
    public:
        FileArrayProxy<T>& operator=(const T&); // write a value
        operator T() const;                     // read a value

        // compiler generated destructor

        FileArrayProxy<T>&                      // read from p and then write
        operator=(const FileArrayProxy<T>& p);
        FileArrayProxy(const FileArrayProxy<T>&);
    private:
        ... // all other constructors.
        FileArray<T>& array;
        const size_t index;
    };

We have to make sure, of course, that there are member functions in ``FileArray<T>'' that can read and write (and of course, those functions are not the ``operator[],'' since then we'd have an infinite recursion.) All constructors, except for the copy constructor, are made private to prevent users from creating objects of the class whenever they want to. After all, this class is a helper for the array only, and is not intended to ever even be seen. This, however, poses a problem. With the constructors being private, how can ``FileArray<T>::operator[]()'' create and return one?

Enter another C++ feature: friends. Friends are a way of breaking encapsulation. What?!?! Yes, what you read is right. Friends break encapsulation, and (this is the real shock) that's a good thing! Friends break encapsulation in a controlled way. We can, in ``FileArrayProxy<T>'' declare ``FileArray<T>'' to be a friend. This means that ``FileArray<T>'' can access everything in ``FileArrayProxy<T>,'' including things that are declared private. Paradoxically, violating encapsulation with friendship strengthens encapsulation when done right. The only alternative here to using friendship is to make the constructors public, but then anyone can create objects of this class, and that's what we wanted to prevent.

Friends are useful for strong encapsulation, but it's important to use them only in situations where two (or more) classes are so tightly bound to one another that they're meaningless on their own. This is the case with ``FileArrayProxy<T>.'' It's meaningless without ``FileArray<T>,'' thus ``FileArray<T>'' is declared a friend of ``FileArrayProxy<T>.'' The declaration then becomes:

    template <class T>
    class FileArrayProxy
    {
    public:
        FileArrayProxy<T>& operator=(const T&); // write value
        operator T() const;                     // read a value

        // compiler generated destructor

        FileArrayProxy<T>& operator=(const FileArrayProxy<T>& p);

        // compiler generated copy constructor
    private:
        FileArrayProxy(FileArray<T>& fa, size_t n);
        // for use by FileArray<T> only.

        friend class FileArray<T>;

        FileArray<T>& array;
        const size_t index;
    };
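To see for yourself why the idea from the warning above fails, it helps to watch which overload the compiler actually picks. The class ``Probe'' below is a made-up stand-in for the array; all it does is report which ``operator[]'' ran. A minimal sketch:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the array: it only records which overload ran.
class Probe
{
public:
    std::string operator[](int) const { return "const"; }
    std::string operator[](int)       { return "non-const"; }
};

// A pure read, but through a non-const object.
std::string readThroughNonConst()
{
    Probe p;
    std::string s = p[0];     // nothing is written, yet the non-const
    return s;                 // overload is the one that gets called
}

// The same read through a const reference to the same kind of object.
std::string readThroughConst()
{
    Probe p;
    const Probe& c = p;
    return c[0];              // constness of the object selects the overload
}

// Both calls only read; the results differ because the objects differ.
void demo()
{
    assert(readThroughNonConst() == "non-const");
    assert(readThroughConst() == "const");
}
```

The choice depends only on whether the object is const, never on which side of an assignment the expression appears, which is why a helper type has to make the read/write decision.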

The functions for reading and writing are made private members of the array, since they're not for anyone to use. Again, we need to make use of friendship to grant ``FileArrayProxy<T>'' the right to access them.

We can now start implementing the array. Some problems still lie ahead, but I'll mention them as we go.

    // farray.hpp
    #ifndef FARRAY_HPP
    #define FARRAY_HPP

    #include <fstream.h>
    #include <stdlib.h> // size_t

    template <class T> class FileArrayProxy;
    // Forward declaration necessary, since FileArray<T>
    // returns the type.

    template <class T>
    class FileArray
    {
    public:
        FileArray(const char* name, size_t size); // create
        FileArray(const char* name); // use existing array
        T operator[](size_t size) const;
        FileArrayProxy<T> operator[](size_t size);
        size_t size() const;
    private:
        FileArray(const FileArray<T>&); // illegal
        FileArray<T>& operator=(const FileArray<T>&); // illegal

        // for use by FileArrayProxy<T>
        T readElement(size_t index) const;
        void storeElement(size_t index, const T&);

        fstream stream;
        size_t max_size;

        friend class FileArrayProxy<T>;
    };

Let's define them right away:

    template <class T>
    T FileArray<T>::readElement(size_t index) const
    {
        T t;
        stream.seekg(sizeof(max_size)+index*sizeof(T));
        // what if seek fails?
        stream.read((char*)&t, sizeof(t));
        // what if read fails?
        return t;
    }

All of a sudden, we face an unexpected problem. The above code won't compile. The member function is declared ``const'', and as such, all member variables are ``const'', and neither ``seekg'' nor ``read'' are allowed on constant streams.

The problem is one of differing between logical constness and bitwise constness. This member function is logically ``const'', as it does not alter the array in any way. However, it is not bitwise const; the stream member changes. C++ cannot understand logical constness, only bitwise constness. If you have a modern compiler, the solution is very simple; you declare ``stream'' to be ``mutable fstream stream;'' in the class definition. I, however, have a very old compiler, so I have to find a different solution. This solution is, yet again, one of adding another level of indirection. I can have a pointer to an ``fstream.'' When in a ``const'' member function, the pointer is also ``const'', but not what it points to (there's a difference between a constant pointer, and a pointer to a constant.) The only reasonable way to achieve this is to store the stream object on the heap, and in doing this I introduce a possible danger: what if I forget to delete the pointer? Sure, I'll delete it in the destructor, but what if an exception is thrown already in the constructor? Then the destructor will never execute (since no object has been created that must be destroyed.)

Do you remember the ``thing to think of until this month?'' The clues were destructor, pointer and delete. Thought of anything? What about this extremely simple class template?

    template <class T>
    class ptr
    {
    public:
        ptr(T* pt);
        ~ptr();
        T& operator*() const;
    private:
        ptr(const ptr<T>&); // we don't want copying
        ptr<T>& operator=(const ptr<T>&); // nor assignment
        T* p;
    };

    template <class T>
    ptr<T>::ptr(T* pt)
      : p(pt)
    {
    }

    template <class T>
    ptr<T>::~ptr()
    {
        delete p;
    }

    template <class T>
    T& ptr<T>::operator*() const
    {
        return *p;
    }

This is probably the simplest possible of the family known as ``smart pointers.'' I'll probably devote a whole article exclusively to these some time. Whenever an object of this type is destroyed, whatever it points to is deleted. The only thing we have to keep in mind when using it, is to make sure that whatever we feed it is allocated on the heap (and is not an array) so it can be deleted with operator delete.

This solves our problem nicely. When this thing is a constant, the thing pointed to still isn't a constant (look at the return type for ``operator*,'' it's a ``T&,'' not a ``const T&.'') So, instead of using an ``fstream'' member variable called ``stream,'' let's use a ``ptr<fstream>'' member named ``pstream.'' With this change, ``readElement'' must be slightly rewritten:

    template <class T>
    T FileArray<T>::readElement(size_t index) const
    {
        (*pstream).seekg(sizeof(max_size)+index*sizeof(T));
        // what if seek fails?
        T t;
        (*pstream).read((char*)&t, sizeof(t));
        // what if read fails?
        return t;
    }
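For readers whose compiler does support ``mutable'', the simpler route mentioned above could look something like the sketch below. ``CharStore'' is a made-up miniature, not the file array itself, and it reads from a string stream instead of an ``fstream'' (my assumption, to keep the example free of files).

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical miniature of the array. The stream member is `mutable`,
// so a logically const member function may still move its get pointer.
class CharStore
{
public:
    explicit CharStore(const std::string& data)
      : stream(data), length(data.size())
    {
    }

    // Logically const: the stored data is never altered by a lookup.
    char at(std::size_t index) const
    {
        assert(index < length);
        stream.seekg(static_cast<std::streamoff>(index), std::ios::beg);
        return static_cast<char>(stream.get());
    }

private:
    mutable std::istringstream stream;  // bitwise state changes on every seek
    std::size_t length;
};
```

Without ``mutable'' on the stream member, the calls to ``seekg'' and ``get'' inside ``at'' are rejected for exactly the reason the first ``readElement'' was.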

I bet the change wasn't too horrifying.

    template <class T>
    void FileArray<T>::storeElement(size_t index,
                                    const T& elem)
    {
        (*pstream).seekp(sizeof(max_size)+index*sizeof(T),
                         ios::beg);
        // what if seek fails?
        (*pstream).write((const char*)&elem, sizeof(elem));
        // what if write failed?
    }

Now for the constructors:

    template <class T>
    FileArray<T>::FileArray(const char* name, size_t size)
      : pstream(new fstream(name,
                            ios::in|ios::out|ios::binary)),
        max_size(size)
    {
        // what if the file could not be opened?

        // store the size on file.
        (*pstream).write((const char*)&max_size,
                         sizeof(max_size));
        // what if write failed?

        // We want to write a value (any value) at the end
        // to make sure there is enough space on disk.
        T t;
        storeElement(max_size-1,t);
        // What if this fails?
    }

    template <class T>
    FileArray<T>::FileArray(const char* name)
      : pstream(new fstream(name,
                            ios::in|ios::out|ios::binary)),
        max_size(0)
    {
        // get the size from file.
        (*pstream).read((char*)&max_size,
                        sizeof(max_size));
        // what if read fails or max_size == 0?
        // How do we know the file is even an array?
    }

The access members:

    template <class T>
    T FileArray<T>::operator[](size_t size) const
    {
        // what if size >= max_size?
        return readElement(size);
        // What if read failed because of a disk error?
    }

    template <class T>
    FileArrayProxy<T> FileArray<T>::operator[](size_t size)
    {
        // what if size >= max_size?
        return FileArrayProxy<T>(*this, size);
    }

Well, this wasn't too much work, but then, as can be seen by the comments, there's absolutely no error handling here. I've left out the ``size'' member function, since its implementation is trivial.

Next in line is ``FileArrayProxy<T>.''

    template <class T>
    class FileArrayProxy
    {
    public:
        // copy constructor generated by compiler
        operator T() const;
        FileArrayProxy<T>& operator=(const T& t);
        FileArrayProxy<T>& operator=(const FileArrayProxy<T>& p);
        // read from one array and write to the other.
    private:
        FileArrayProxy(FileArray<T>& f, size_t i);

        size_t index;
        FileArray<T>& fa;

        friend class FileArray<T>;
    };

The copy constructor is needed, since the return value must be copied (return from ``FileArray<T>::operator[],'') and it must be public for this to succeed. The one that the compiler generates for us, which just copies all member variables, will do just fine. The compiler doesn't generate a default constructor (one which accepts no parameters,) since we have explicitly defined a constructor.

The assignment operator is necessary, however. Sure, the compiler will try to generate one for us if we don't, but it will fail, since references (``fa'') can't be rebound. Note, however, that if we instead of a reference had used a pointer, it would succeed, but the result would *NOT* be what we want. What it would do is to copy the member variables, but what we want to do is to read data from one array and write it to another.

Now for the implementation:

    template <class T>
    FileArrayProxy<T>::FileArrayProxy(FileArray<T>& f,
                                      size_t i)
      : index(i),
        fa(f)
    {
    }

    template <class T>
    FileArrayProxy<T>::operator T() const
    {
        return fa.readElement(index);
    }

    template <class T>
    FileArrayProxy<T>&
    FileArrayProxy<T>::operator=(const T& t)
    {
        fa.storeElement(index,t);
        return *this;
    }

    template <class T>
    FileArrayProxy<T>&
    FileArrayProxy<T>::operator=(
        const FileArrayProxy<T>& p
    )
    {
        fa.storeElement(index,p);
        return *this;
    }

    #endif // FARRAY_HPP
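Since the pieces are spread over several listings, here is the proxy idea once more in a condensed form that can be compiled on its own. This is only a sketch under my own assumptions: ``IntStore'' and ``IntProxy'' are made-up, non-template stand-ins for ``FileArray<T>'' and ``FileArrayProxy<T>'', the data lives in a string stream instead of a file, and no size is stored in front of the elements.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>

class IntStore;   // hypothetical in-memory stand-in for FileArray<int>

class IntProxy
{
public:
    IntProxy(const IntProxy&) = default;          // needed to return by value
    IntProxy& operator=(int value);               // write
    IntProxy& operator=(const IntProxy& other);   // read from other, then write
    operator int() const;                         // read
private:
    IntProxy(IntStore& s, std::size_t i) : store(s), index(i) {}
    friend class IntStore;                        // only IntStore creates proxies
    IntStore& store;
    std::size_t index;
};

class IntStore
{
public:
    explicit IntStore(std::size_t size) : count(size)
    {
        const int zero = 0;                       // fill the stream with zeros
        for (std::size_t i = 0; i < size; ++i)
            stream.write(reinterpret_cast<const char*>(&zero), sizeof(zero));
    }
    IntProxy operator[](std::size_t i) { return IntProxy(*this, i); }
private:
    friend class IntProxy;                        // only IntProxy reads/stores
    int readElement(std::size_t i)
    {
        assert(i < count);
        int v = 0;
        stream.seekg(static_cast<std::streamoff>(i * sizeof(int)), std::ios::beg);
        stream.read(reinterpret_cast<char*>(&v), sizeof(v));
        return v;
    }
    void storeElement(std::size_t i, int v)
    {
        assert(i < count);
        stream.seekp(static_cast<std::streamoff>(i * sizeof(int)), std::ios::beg);
        stream.write(reinterpret_cast<const char*>(&v), sizeof(v));
    }
    std::stringstream stream;
    std::size_t count;
};

inline IntProxy& IntProxy::operator=(int value)
{
    store.storeElement(index, value);
    return *this;
}

inline IntProxy& IntProxy::operator=(const IntProxy& other)
{
    store.storeElement(index, other);             // other converts through operator int
    return *this;
}

inline IntProxy::operator int() const
{
    return store.readElement(index);
}
```

With this in place, ``arr[2] = 7'' goes through ``operator=(int)'' and ends in ``storeElement'', while ``int x = arr[2]'' goes through ``operator int'' and ends in ``readElement''.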

That was it. Can you see what happens with the proxy? Let's analyze a small code snippet:

    1 FileArray<int> arr("file",10);
    2 arr[2]=0;
    3 int x=arr[2];
    4 arr[0]=arr[2];

On line two, ``arr.operator[](2)'' is called, which creates a ``FileArrayProxy<int>'' from ``arr'' with the index 2. The object, which is a temporary and does not have a name, has as its member ``fa'' a reference to ``arr'', and as its member ``index'' the value 2. On this temporary object, ``operator=(int)'' is executed. This operator in turn calls ``fa.storeElement(index, t),'' where ``index'' is still 2 and the value of ``t'' is 0. Thus, ``arr[2]=0'' ends up as ``arr.storeElement(2,0)''.

On line 3, a similar proxy is created through the call to ``operator[](2)''. This time, however, the ``operator int() const'' is called. This member function in turn calls ``fa.readElement(2)'' and returns its value, thus ``int x=arr[2]'' translates to ``int x=arr.readElement(2).''

On line 4, finally, ``arr[0]=arr[2]'' creates two temporary proxies, one referring to index 0, and one to index 2. The assignment operator is called, which in turn calls ``fa.storeElement(index,p),'' where p is the temporary proxy referring to element 2. Since ``storeElement'' wants an ``int,'' ``p.operator int() const'' is called, which calls ``arr.readElement(2).'' In other words ``arr[0] = arr[2]'' generates the code ``arr.storeElement(0, arr.readElement(2)).''

As you can see, the proxies don't add any new functionality, they're just syntactic sugar, albeit very useful. With them we can treat our file arrays very much like any kind of array. There's one thing we cannot do:

    int* p = &arr[2];
    int& x = arr[3];
    *p=2;
    x=5;

With ordinary arrays, the above would be legal and have well defined semantics, assigning arr[2] the value 2, and arr[3] the value 5. With our file array we cannot do this, but unfortunately the compiler does not prevent it (a decent compiler will warn that we're binding a reference or pointer to a temporary.) We'll mend that hole next month (think about how) and also add iterators, which will allow us to use the file arrays almost exactly like real ones.

In memory data formatting
One often faced problem is that of converting strings representing some data to that data, or vice versa. With the aid of ``istrstream'', ``ostrstream'' and ``strstream'', this is easy. For example, say we have a string containing digits, and want those digits as an integer. The thing to do is to create an ``istrstream'' object from the string. An example will explain:

    const char* s = "23542";
    istrstream is(s);
    int x;
    is >> x;

After executing this snippet, ``x'' will have the value 23542. ``istrstream'' isn't much more exciting than that. ``ostrstream'' on the other hand is more exciting. There are two alternative uses for ``ostrstream.'' One where you have an array you want to store data in, and one where you want the ``ostrstream'' to create it for you, as needed (usually because you have no idea what size the buffer must have.) The former usage is like this:

    char buffer[24];
    ostrstream os(buffer, sizeof(buffer));
    double x=23.34;
    os << "x=" << x << ends;

The variable ``buffer'' will contain the string ``x=23.34'' after this snippet. The stream manipulator ``ends'' zero terminates the buffer. Zero termination is not done by default, since the stream cannot know where to put it, and besides you might not always want it.
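With a current standard library, the same two conversions can be sketched with the string streams from the header <sstream>, which work on ``std::string'' rather than on ``char*'' buffers. The function names ``parseInt'' and ``formatAssignment'' are made up for the illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// String to number, like the istrstream snippet above.
int parseInt(const std::string& digits)
{
    std::istringstream is(digits);
    int x = 0;
    is >> x;
    assert(!is.fail());        // the string must start with digits
    return x;
}

// Number to string, like the fixed buffer ostrstream snippet above.
std::string formatAssignment(double x)
{
    std::ostringstream os;
    os << "x=" << x;           // no ends needed: str() returns a std::string
    return os.str();
}
```

Since the stream owns a ``std::string'', there's neither a buffer size to guess nor a terminating zero to remember.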

The other variant, where you don't know how large a buffer you will need, is generally more useful (I think.)

    ostrstream os;
    double x=23.34, y=34.45;
    os << x << '*' << y << '=' << x*y << ends;
    const char* p = os.str();
    const size_t length=os.pcount();
    // work with p and length.
    os.freeze(0); // release the memory.

I think the example pretty much shows what this kind of usage does. The member function ``str'' returns a pointer to the internal buffer (which is then frozen, that is, the stream guarantees that it will not deallocate the buffer, nor overwrite it. Attempts to alter the stream while frozen will fail.) ``pcount'' returns the number of characters stored in the buffer. Last, ``freeze'' can either freeze the buffer, or ``unfreeze'' it. The latter is done by giving it a parameter with the value 0. I find this interface to be unfortunate. It's so easy to forget to release the buffer (by simply forgetting to call ``os.freeze(0)'') and that leads to a memory leak.

``strstream'' finally, is just like ``fstream'' the combined read/write stream.

The string streams can be found in the header <strstream.h> (or for some compilers <strstrea.h>.)

Standards update
With the C++ standard, a lot of things have changed regarding streams. As I mentioned already last month, the headers are actually <iostream> and <fstream>, and the names std::istream, std::ostream, etc. The streams are templatized too, which both makes life easier and not. The underlying type for std::ostream is:

    std::basic_ostream<class charT,
                       class traits=std::char_traits<charT> >

``charT'' is the basic type for the stream. For ``ostream'' this is ``char'' (ostream is actually a typedef.) There's another typedef, ``std::wostream'', where the underlying type is ``wchar_t'', which on most systems probably will be 16-bit Unicode. The class template ``char_traits'' is a traits class which holds the type used for EOF, the value of EOF, and some other house keeping things.

Why the standard has removed the file stream open modes ios::noreplace and ios::nocreate is beyond me, as they're extremely useful.

Casting is ugly, and it's hard to see in large code blocks. There are four new cast operators that are highly visible. They're (in approximate order of increasing danger,) dynamic_cast<T>, static_cast<T>, const_cast<T> and reinterpret_cast<T>. In the binary streaming seen in this article, reinterpret_cast<T> would be used, as a way of saying, ``Yeah, I know I'm violating type safety, but hey, I know what I'm doing, OK?'' The good thing about it is that it's so visible that anyone doubting it can easily spot the dangerous lines and have a careful look. The syntax is:

    os.write(reinterpret_cast<const char*>(&variable),
             sizeof(variable));

Finally, the generally useful strstreams have been replaced by ``std::istringstream'', ``std::ostringstream'' and ``std::stringstream'' (plus wide variants, std::wistringstream, etc.) defined in the header <sstream>. They do not operate on ``char*'', but on strings (there is a string class, or again, rather a string class template, where the most important template parameter is the underlying character type.) ``std::ostringstream'' does not suffer from the freeze problem that ``ostrstream'' does.

Recap
The news this month were:
• streams dealing with files, or in-memory formatting, are used just the same way as the familiar ``cout'' and ``cin,'' which saves both learning and coding (the already written ``operator<<'' and ``operator>>'' can be used for all kinds of streams already.)
• streams can be used for binary, unformatted I/O too. This normally doesn't make sense for ``cout'' and ``cin'' or in-memory formatting (as the name implies,) but it's often useful when dealing with files.
• It is possible to move around in streams, at least file streams and in-memory formatting streams. It's generally not possible to move around in ``cin'' and ``cout.''
• proxy classes can be used to differentiate read and write operations for ``operator[]'' (the construction can of course be used elsewhere too, but it's most useful in this case.)
• friends break encapsulation in a way that, when done right, strengthens encapsulation.
• there's a difference between logical const and bitwise const, but the C++ compiler doesn't know and always assumes bitwise const.
• truly simple smart pointers can save some memory management house keeping, and also be used as a work around for compilers lacking ``mutable'' (i.e. the way of declaring a variable as non-const for const members, in other words, how to differentiate between logical and bitwise const.)
• streams can be used also for in-memory formatting of data.

Exercises
• Improve the file array such that it accepts a ``stream&'' instead of a file name, and allows for several arrays in the same file.
• Improve the proxy such that ``int& x=arr[2]'' and ``int* p=&arr[1]'' become illegal.
• Add a constructor to the array that accepts only a ``size_t'' describing the size of the array, which creates a temporary file and removes it in its destructor.
• What happens if we instantiate ``FileArray'' with a user defined type? Is it always desirable? If not, what is desirable? If you cannot define what's desirable, how can instantiation with user defined types be banned?
• How can you, using the stream interface, calculate the size of a file?

Coming up
Next month will be devoted to improving the ``FileArray.'' We'll have iterators, allow arbitrary types, add error handling and more. I assume I won't need to tell you that it'll be possible to use the ``FileArray'' just like ordinary arrays with generic programming, i.e. we can have the exact same source code for dealing with both!

[Note: the source code for this month is here. Ed.]

Last month a file based array template for truly huge amounts of data was introduced. While good, it was nowhere near our goals. Error handling was missing completely, making it dangerous to use in real life. There was no way to say how a user defined data type should be represented on disk, yet such types weren't disallowed, which is a dangerous combination. It was also lacking iterators, something that is handy, and is an absolute requirement for generic programming with algorithms that are independent of the source of the data. On top of that, we'd really like the ability to store several different arrays in the same file, and also have an anonymous array which creates a temporary file and removes it when the array is destroyed. All of these will be dealt with this month, yet very little will be new. Instead it's time to make use of all the things learned so far in the course.

The data representation problem
In the file array as implemented last month, data was always stored in a raw binary format, exactly mirroring the bits as they lay in memory. This works fine for integers and such, but can be disastrous in other situations. Imagine a file array of strings (where string is a ``char*''). With the implementation from last month, the pointer value would be stored, not the data pointed to. When reading, a pointer value is read, and when dereferenced, whatever happens to be at the memory location pointed to (if anything) will be used (which is more than likely to result in a rather quick crash.) Anything with pointers is dangerous when stored in a raw binary format, yet we must somehow allow pointers in the array, and preferably so without causing problems for those using the array with built-in arithmetic types. How can this be done?

In part 4, when templates were introduced, a clever little construct called ``traits classes'' was shown. I then gave this rather terse description: ``A traits class is never instantiated, and doesn't contain any data. It just tells things about other classes, that is its sole purpose.'' Doesn't that smell like something we can use here? A traits class that tells how the data types should be represented on disk?

What do we need from such a traits class? Obviously, we need to know how much disk space each element will take, so a ``size'' member will definitely be necessary, otherwise we cannot know how much disk space will be required. We also need to know how to store the data, and how to read it. The easiest way is probably to have member functions ``writeTo'' and ``readFrom'' in the traits class. Thus we can have something looking like this:

    template <class T>
    class FileArrayElementAccess
    {
    public:
        static const size_t size;
        static void writeTo(T value, ostream& os);
        static T readFrom(istream& is);
    };

The array is then rewritten to use this when dealing with the data. The change is extremely minor.
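Before looking at the array itself, here is a sketch of what a user written specialisation of such a traits class might look like. Everything specific in it is an assumption of mine for the sake of illustration: the element type is ``std::string'', every element occupies a fixed 16 bytes padded with '\0', longer strings are silently cut, and ``size'' is given its value inside the class, which requires a compiler that accepts in-class initialisation of static constants.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Primary template: declared only, so unsupported types fail to compile.
template <class T> class FileArrayElementAccess;

// Hypothetical specialisation: each string takes exactly `size` bytes.
template <>
class FileArrayElementAccess<std::string>
{
public:
    static const std::size_t size = 16;

    static void writeTo(std::string value, std::ostream& os)
    {
        char buf[size] = {};                 // zero padded record
        std::size_t n = value.size();
        if (n > size) n = size;              // assumption: cut what doesn't fit
        std::memcpy(buf, value.data(), n);
        os.write(buf, static_cast<std::streamsize>(size));
    }

    static std::string readFrom(std::istream& is)
    {
        char buf[size];
        is.read(buf, static_cast<std::streamsize>(size));
        assert(is && "record shorter than size bytes");
        std::size_t len = 0;                 // stop at the first padding byte
        while (len < size && buf[len] != '\0') ++len;
        return std::string(buf, len);
    }
};
```

The point of the fixed ``size'' is that the array can still compute where element number ``index'' lives, no matter how long the strings originally were; the rewritten ``storeElement'' relies on exactly that.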
``storeElement'' needs to be rewritten as:

    template <class T>
    void FileArray<T>::storeElement(size_t index, const T& element)
    {
      // what if index >= array_size?
      typedef FileArrayElementAccess<T> traits;
      (*pstream).seekp(traits::size*index + sizeof(array_size), ios::beg);
      // what if seek fails?
      traits::writeTo(element, *pstream);
      // what if write failed?
      // what if too much data was written?
    }

The change for ``readElement'' is of course analogous. However, as indicated by the last comment, a new error possibility has shown up. What if the ``writeTo'' and ``readFrom'' members of the traits class are buggy and write or read more data than they're allowed to? Since it's the user of the array who must write the traits class (at least for their own data types) we cannot solve the problem, but we can give the user a chance to discover that something went wrong. Unfortunately for writing, the error is extremely severe; it means that the next entry in the array will have its data destroyed...

In the traits class, by the way, the constant ``size'', used for telling how many bytes in the stream each ``T'' will occupy, poses a problem with most C++ compilers today (modern ones mostly make life much easier.) The problem is that a static variable, and also a static constant, in a class, needs to reside somewhere in memory, and the class declaration is not enough for that. This problem is two-fold. To begin with, where should it be stored? It's very much up to whoever writes the class, but somewhere in the code, there must be something like:

    const size_t FileArrayElementAccess<X>::size = ...;

where ``X'' is the name of the class dealt with by the particular traits specialisation. The second problem is that this is totally unnecessary. What we want is a value that can be used by the compiler at compile time, not a memory location to read a value from. As I mentioned, a modern compiler does make this much easier. In standard C++ it is allowed to write:

    template <>
    class FileArrayElementAccess<X>
    {
    public:
      static const size_t size = ...;
      ...
    };

Note that for some reason that I do not know, this construct is only legal if the member is a static constant of an integral or enumeration type. ``size_t'' is such a type; it's some unsigned integral type, probably ``unsigned int'', but possibly ``unsigned long''. The expression denoted ``...'' must be possible to evaluate at compile time. Unless code is written that explicitly takes the address of ``size'', we need not give the constant any space to reside in. The odd construct ``template <>'' is also new C++ syntax, and means that what follows is a specialisation of a previously declared template.

For old compilers, however, there's a work-around for integral values no larger than the largest ``int'' value. We cheat and use an enum instead of a ``size_t''. This makes the declaration:

    class FileArrayElementAccess<X>
    {
    public:
      enum { size = ... };
      ...
    };

This is a bit ugly, but it is perfectly harmless.

The advantage gained by adding the traits class is flexibility and safety. If someone wants to use a file array for their own class, they're free to do so. However, they must first write a ``FileArrayElementAccess'' specialisation. Failure to do so will result in a compilation error. This early error detection is beneficial. The sloppy solution from last month would not yield any error until run-time, which means a (usually long) debugging session.
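To make the traits idea concrete, here is a minimal sketch of two specialisations. It is not the article's own source: it assumes a standard library with ``std::'' streams, the 32-byte on-disk width for strings is an arbitrary choice made for the illustration, and the ``assert'' calls merely stand in for the still unanswered ``what if read fails?'' questions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Primary template: declared but never defined, so using a type that
// lacks a specialisation is a compile-time error.
template <class T> class FileArrayElementAccess;

// Specialisation for int: the raw bytes can be stored as they are.
template <>
class FileArrayElementAccess<int>
{
public:
  static const std::size_t size = sizeof(int);
  static void writeTo(int value, std::ostream& os)
  {
    os.write(reinterpret_cast<const char*>(&value), size);
  }
  static int readFrom(std::istream& is)
  {
    int value = 0;
    is.read(reinterpret_cast<char*>(&value), size);
    assert(is.good());  // placeholder for real error handling
    return value;
  }
};

// Specialisation for std::string, which holds a pointer internally.
// Assumption for this sketch: strings are padded with zeros or
// truncated to a fixed width of 32 bytes on disk.
template <>
class FileArrayElementAccess<std::string>
{
public:
  static const std::size_t size = 32;
  static void writeTo(const std::string& value, std::ostream& os)
  {
    char buf[size] = {};  // zero filled
    const std::size_t n = value.size() < size ? value.size() : size;
    std::memcpy(buf, value.data(), n);
    os.write(buf, size);  // always exactly `size` bytes
  }
  static std::string readFrom(std::istream& is)
  {
    char buf[size];
    is.read(buf, size);
    assert(is.good());  // placeholder for real error handling
    std::size_t len = 0;
    while (len < size && buf[len] != '\0') ++len;
    return std::string(buf, len);
  }
};
```

Each specialisation writes and reads exactly ``size'' bytes, which is the contract the array relies on when it seeks to an element.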

Several arrays in a file
What is needed in order to host several arrays in the same file? One way or the other, there must be a mechanism for finding out where one array begins and another ends. I think the simplest solution is to let go of the file names, and instead make the constructors accept an ``fstream&''. We can then require that the put and get pointers of the stream must be where the array can begin, and we can in turn promise that the put and get pointers will be positioned at the byte after the array end. Of course, in addition to having a reference to the ``fstream'' in our class, we also need the ``home'' position, to seek relative to, when indexing the array. This is easy for us to write, and it is easy to use as well. For someone requiring only one array in a file, there'll be slightly more code: an ``fstream'' object must be explicitly initialised somewhere, and passed to the constructor of the array, instead of just giving it a name. I think the functionality increase/code expansion exchange is favorable.

In order to improve the likelihood of finding errors, we can waste a few bytes of disk space by writing a well known header and trailer pattern at the beginning and end of the array (before the first element, and after the last one.) If someone wants to allocate an array using an existing file, we can find out if the get pointer is in place for an array start.

The constructor creating a file should, however, first try to read from the file to see if it exists. If it does, the array should be created from the file, just like the constructor accepting only a stream does. If the read fails, however, we can safely assume that the file doesn't exist and that the array should instead be created.

The change in the class definition and constructor implementation is relatively straightforward, if long:

    template <class T>
    class FileArray
    {
    public:
      FileArray(fstream& fs, size_t elements);
        // create a new file.
      FileArray(fstream& fs);
        // use an existing file and get size from there
      ...
    private:
      void initFromFile(const char*);
      fstream& stream;
      size_t array_size; // in elements
      streampos home;
    };

    template <class T>
    FileArray<T>::FileArray(fstream& fs, size_t elements)
      : stream(fs),
        array_size(elements)
    {
      // what if the file could not be opened?

      // first try to read and see if there's a begin
      // pattern. Either there is one, or we should
      // get an eof.
      char pattern[6];
      stream.read(pattern,6);
      if (stream.eof()) {
        stream.clear(); // clear error state
                        // and initialise.
        // begin of array pattern.
        stream.write("ABegin",6);
        // must store size of elements, as last month
        const size_t elem_size
          = FileArrayElementAccess<T>::size;
        stream.write((const char*)&elem_size,
                     sizeof(elem_size));
        // and of course the number of elements
        stream.write((const char*)&array_size,
                     sizeof(array_size));
        // Now that we've written the maintenance
        // stuff, we know what the home position is.
        home = stream.tellp();
        // Then we must go to the end and write
        // the end pattern.
        stream.seekp(home+elem_size*array_size);
        stream.write("AEnd",4);
        // set put and get pointer to past the end pos.
        stream.seekg(stream.tellp());
        return;
      }
      initFromFile(pattern); // shared with other
                             // stream constructor
      if (array_size != elements) {
        // Uh oh. The data read from the stream,
        // and the size given in the constructor
        // mismatches! What now?
        stream.clear(ios::failbit);
      }
      // set put and get pointer to past the end pos.
      stream.seekp(stream.tellg());
    }

    template <class T>
    FileArray<T>::FileArray(fstream& fs)
      : stream(fs)
    {
      // First read the head pattern to see if
      // it's right.
      char pattern[6];
      stream.read(pattern,6);
      initFromFile(pattern); // stupid name for the
                             // member function, right?
      // set put and get pointer to past the end pos.
      stream.seekp(stream.tellg());
    }

    template <class T>
    void FileArray<T>::initFromFile(const char* p)
    {
      // Check if the read pattern is correct
      if (strncmp(p,"ABegin",6)) {
        // What to do? It was all wrong!
        stream.clear(ios::failbit); // for lack of better,
                                    // set the fail flag.
        return;
      }
      // OK, we have a valid array, now let's see if
      // it's of the right kind.
      size_t elem_size;
      stream.read((char*)&elem_size,sizeof(elem_size));
      if (elem_size != FileArrayElementAccess<T>::size) {
        // wrong kind of array, the element sizes
        // mismatch. Again, what to do? Let's set
        // the fail flag for now.
        stream.clear(ios::failbit);
        return;
      }
      // Get the size of the array. Can't do much with
      // the size here, though.
      stream.read((char*)&array_size,sizeof(array_size));
      // Now we're past the header, so we know where the
      // data begins and can set the home position.
      home = stream.tellg();
      stream.seekg(home+elem_size*array_size);
      // Now positioned immediately after the last
      // element.
      char epattern[4];
      stream.read(epattern,4);
      if (strncmp(epattern,"AEnd",4)) {
        // Whoops, corrupt file!
        stream.clear(ios::failbit);
        return;
      }
      // Seems like we have a valid array!
    }

Other than the above, the only change needed for the array is that seeking will be done relative to ``home'' rather than the beginning of the file (plus the size of the header entries.) The new versions of ``storeElement'' and ``readElement'' become:

    template <class T>
    T FileArray<T>::readElement(size_t index) const
    {
      // what if index >= array_size?
      typedef FileArrayElementAccess<T> traits;
      stream.seekg(home+index*traits::size);
      // what if seek fails?
      return traits::readFrom(stream);
      // what if read fails?
      // What if too much data is read?
    }

    template <class T>
    void FileArray<T>::storeElement(size_t index, const T& element)
    {
      // what if index >= array_size?
      typedef FileArrayElementAccess<T> traits;
      stream.seekp(home+traits::size*index);
      // what if seek fails?
      traits::writeTo(element,stream);
      // what if write failed?
      // what if too much data was written?
    }
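The on-disk layout described above (begin pattern, element size, element count, the elements, end pattern) can be tried out without touching the disk. The following is a sketch under stated assumptions, not the article's code: ``ArrayInfo'', ``createArray'' and ``openArray'' are hypothetical helpers invented for the illustration, and a ``std::stringstream'' stands in for the ``fstream''.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iostream>
#include <sstream>
#include <string>

const char ArrayBegin[] = "ABegin";  // 6 significant bytes
const char ArrayEnd[] = "AEnd";      // 4 significant bytes

struct ArrayInfo          // hypothetical helper, not in the article
{
  std::streampos home;    // position of element 0
  std::size_t elem_size;  // bytes per element on disk
  std::size_t count;      // number of elements
};

// Write header, zero-filled element area and trailer at the put pointer.
ArrayInfo createArray(std::iostream& s, std::size_t elem_size,
                      std::size_t count)
{
  s.write(ArrayBegin, 6);
  s.write(reinterpret_cast<const char*>(&elem_size), sizeof elem_size);
  s.write(reinterpret_cast<const char*>(&count), sizeof count);
  ArrayInfo info = { s.tellp(), elem_size, count };
  const std::string zeros(elem_size * count, '\0');
  s.write(zeros.data(), static_cast<std::streamsize>(zeros.size()));
  s.write(ArrayEnd, 4);
  assert(s.good());  // placeholder for real error handling
  return info;
}

// Read one array frame at the get pointer. Returns false if the header
// or trailer is wrong; on success the get pointer is left just past the
// trailer, ready for the next array.
bool openArray(std::iostream& s, ArrayInfo& info)
{
  char pattern[6];
  if (!s.read(pattern, 6) || std::memcmp(pattern, ArrayBegin, 6) != 0)
    return false;
  if (!s.read(reinterpret_cast<char*>(&info.elem_size),
              sizeof info.elem_size))
    return false;
  if (!s.read(reinterpret_cast<char*>(&info.count), sizeof info.count))
    return false;
  info.home = s.tellg();
  s.seekg(info.home
          + static_cast<std::streamoff>(info.elem_size * info.count));
  char trailer[4];
  if (!s.read(trailer, 4)) return false;
  return std::memcmp(trailer, ArrayEnd, 4) == 0;
}
```

Because every array leaves the stream positioned just past its own trailer, a second array can be created or opened immediately after the first one, which is the whole point of the ``home'' position.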

Temporary file array
Making use of a temporary file to store a file array that's not to be persistent between runs of the application isn't that tricky. The implementation so far makes use of a stream and known data about the beginning of the stream, number of elements and size of the elements. This can be used for the temporary file as well. The only thing we need to do is to create the temporary file first, open it with an fstream object, tie the stream reference to that object, and remember to delete the file in the destructor.

What's the best way of creating something and making sure we remember to undo it later? Well, of course, creating a new helper class which creates the file in its constructor and removes it in its destructor. Piece of cake. The only problem is that we shouldn't always create a temporary file, and when we do, we can handle it a bit differently from what we do with a ``global'' file that can be shared. For example, we know that we have exclusive rights to the file, and that it won't be reused, so there's no need for the extra information in the beginning and end.

So, how's a temporary file created? The C++ standard adds nothing of its own for this, and neither does the old de-facto standard. From C we get ``tmpnam'' and ``tmpfile'' in <stdio.h>, and in addition there's ``tempnam'', a commonly supported extension declared in the same header. I have in this implementation chosen to use ``tempnam'' as it's more flexible. ``tempnam'' works like this: it accepts two string parameters named ``dir'' and ``prefix''. It first attempts to generate a name for a temporary file in the directory pointed to by the environment variable ``TMPDIR''. If that fails, it tries the directory indicated by the ``dir'' parameter, unless it's 0, in which case a hard-coded default is attempted. It returns a ``char*'' indicating a name to use. The memory area pointed to is allocated with the C function ``malloc'', and thus must be deallocated with ``free'' and not ``delete[]''.

Over to the implementation details: We add a class called ``temporaryfile'', which does the above mentioned work. We also add a member variable ``pfile'' which is of type ``ptr<temporaryfile>''. Remember the ``ptr'' template from last month? It's a smart pointer that deallocates whatever it points to in its destructor. It's important that the member variable ``pfile'' is listed before the ``stream'' member, since initialisation is done in the order listed, and the ``stream'' member must be initialised from the file object owned by ``pfile''. We also add a constructor with the number of elements as its sole parameter, which makes use of the temporary file.

    class temporaryfile
    {
    public:
      temporaryfile();
      ~temporaryfile();
      iostream& stream();
    private:
      char* name;
      fstream fs;
    };

    temporaryfile::temporaryfile()
      : name(::tempnam(".","array")),
        fs(name, ios::in|ios::out|ios::binary)
    {
      // what if tempnam fails and name is 0
      // what if fs is bad?
    }

    temporaryfile::~temporaryfile()
    {
      fs.close();
      ::remove(name);
      // what if remove fails?
      ::free(name);
    }

In the above code, ``tempnam'', ``remove'' and ``free'' are prefixed with ``::'', to make sure that it's the names in global scope that are meant, just in case someone enhances the class with a few more member functions whose names might clash.

For the sake of syntactical convenience, I have added yet another operator to the ``ptr'' class template:

    template <class T>
    class ptr
    {
    public:
      ptr(T* tp=0) : p(tp) {};
      ~ptr() { delete p; };
      T* operator->(void) const { return p; };
      T& operator*(void) const { return *p; };
    private:
      ptr(const ptr<T>&);
      ptr<T>& operator=(const ptr<T>&);
      T* p;
    };

It's the ``operator->'' that's new, which allows us to write things like ``p->x,'' where p is a ``ptr<X>'', and the type ``X'' contains some member named ``x''. The return type for ``operator->'' must be something that ``operator->'' can be applied to. The explanation sounds recursive, but it makes sense if you look at the above code. ``ptr<X>::operator->()'' returns an ``X*''. ``X*'' is something you can apply the built in ``operator->'' to (which gives you access to the elements.)

    template <class T>
    FileArray<T>::FileArray(size_t elements)
      : pfile(new temporaryfile),
        stream(pfile->stream()),
        array_size(elements),
        home(stream.tellg())
    {
      const size_t elem_size
        = FileArrayElementAccess<T>::size;
      // put a char just after the end to make
      // sure there's enough free disk space.
      stream.seekp(home+array_size*elem_size);
      char c = 0;
      stream.write(&c,1);
      // what to do if write fails?
      // set put and get pointer to past the end pos
      stream.seekg(stream.tellp());
    }

That's it! The rest of the array works exactly as before. No need to rewrite anything else.
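The ``create in the constructor, undo in the destructor'' idea behind ``temporaryfile'' can be sketched on its own. The class below is a hypothetical ``ScopedFile'', not the article's ``temporaryfile'': it assumes the caller supplies the file name (so the question of generating a unique name is left out), and it opens the stream with ``std::ios::trunc'', since a standard ``std::fstream'' opened for both reading and writing otherwise fails when the file does not yet exist.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical RAII helper: the file exists exactly as long as the object.
class ScopedFile
{
public:
  explicit ScopedFile(const std::string& file_name)
    : name(file_name),
      fs(name.c_str(), std::ios::in | std::ios::out
                       | std::ios::binary | std::ios::trunc)
  {
    assert(!name.empty());  // placeholder for real error handling
  }
  ~ScopedFile()
  {
    fs.close();                 // close before removing
    std::remove(name.c_str());  // a failure is ignored in this sketch
  }
  std::iostream& stream() { return fs; }
  bool ok() const { return fs.is_open(); }
private:
  ScopedFile(const ScopedFile&);             // copying is disallowed
  ScopedFile& operator=(const ScopedFile&);
  std::string name;  // listed before fs: members are initialised
  std::fstream fs;   // in declaration order, and fs needs the name
};

// Helper for experiments: true if the named file can be opened for reading.
bool fileExists(const std::string& name)
{
  std::ifstream probe(name.c_str());
  return probe.is_open();
}
```

Note how the order of the data members matters here for the same reason that ``pfile'' must be listed before ``stream'' in the array: the second member is initialised from the first.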

Code reuse
If you're an experienced C programmer, especially experienced with programming embedded systems where memory constraints are tough, and you also have a good memory, you might get a feeling that something's wrong here. What I'm talking about is something I mentioned the first time templates were introduced: ``Templates aren't source code. The source code is generated by the compiler when needed.'' This means that if we in a program use FileArray<int>, FileArray<double>, FileArray<X> and FileArray<Y> (where ``X'' and ``Y'' are some classes,) there will be code for all four types. Now, have a close look at the member functions and see in what way ``FileArray<int>::FileArray(iostream& fs, size_t elements)'' differs from ``FileArray<char>::FileArray(iostream& fs, size_t elements)''. Please do compare them. What did you find? The only difference at all is in the handling of the member ``elem_size'', yet the same code is generated several times with that as the only difference. This is what is often referred to as the template code bloat of C++.

We don't want code bloat. We want fast, tight, and slick applications. Since the only thing that differs is the size of the elements, we can move the rest to something that isn't templatised, and use that common base everywhere. I've already shown how code reuse can be done by creating a separate class and having a member variable of that type. In this article I want to show an alternative way of reusing code, and that is through inheritance. Note very carefully that I did not say public inheritance. Public inheritance models ``is-A'' relationships only. We don't want an ``is-A'' relationship here. All we want is to reuse code to reduce code bloat. This is done through private inheritance. Private inheritance is used far less than it should be. Here's all there is to it: create a class with the desired implementation to reuse and inherit privately from it. Nothing more, nothing less.
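As a small illustration of the technique, detached from files altogether, here is a sketch in which everything that depends only on the element size lives in a non-template base, and a thin template inherits privately from it. The names ``RawArrayBase'' and ``RawArray'' are made up for this example, and it assumes element types that can safely be copied with ``memcpy'' (built-in arithmetic types, for instance).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Non-template base: compiled once, no matter how many element types
// the program uses. Only the element size varies, and it is a parameter.
class RawArrayBase  // hypothetical name
{
protected:
  RawArrayBase(std::size_t elements, std::size_t elem_size)
    : bytes(elements * elem_size), e_size(elem_size), count(elements) {}
  void* slot(std::size_t index)
  {
    assert(index < count);  // placeholder for real error handling
    return &bytes[index * e_size];
  }
  const void* slot(std::size_t index) const
  {
    assert(index < count);
    return &bytes[index * e_size];
  }
  std::size_t size() const { return count; }
private:
  std::vector<unsigned char> bytes;
  std::size_t e_size;
  std::size_t count;
};

// Thin template on top: only the type-dependent parts are generated
// per element type. Private inheritance keeps the base an
// implementation detail.
template <class T>
class RawArray : private RawArrayBase
{
public:
  explicit RawArray(std::size_t elements)
    : RawArrayBase(elements, sizeof(T)) {}
  void store(std::size_t index, const T& value)
  {
    std::memcpy(slot(index), &value, sizeof(T));
  }
  T read(std::size_t index) const
  {
    T value;
    std::memcpy(&value, slot(index), sizeof(T));
    return value;
  }
  std::size_t size() const { return RawArrayBase::size(); }
};
```

Since the inheritance is private, a line such as ``RawArrayBase* p = &someRawArray;'' outside the class does not compile, so users cannot come to depend on the base.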
To a user of your class, it matters not at all if you chose not to reuse code at all, reuse through encapsulation of a member variable, or reuse through private inheritance. It's not possible to refer to the descendant class through a pointer to the private base class; private inheritance is an implementation detail only, and not an interface issue.

To the point. What can, and what cannot, be isolated and put in a private base class? Let's first look at the data. The ``stream'' reference member can definitely be moved to the base, and so can the ``pfile'' member for temporary files. The ``array_size'' member can safely be there too, and also the ``home'' member for marking the beginning of the array on the stream. By doing that alone we have saved just about nothing at all, but if we add as a data member in the base class the size (on disk) of the elements, and initialise that member through the ``FileArrayElementAccess<T>::size'' traits member, all seeking in the file, including the initial seeking when creating the file array, can be moved to the base class. Now a lot has been gained. Left will be very little. Let's look at the new improved implementation, starting with the declaration of the base class:

    class FileArrayBase
    {
    public:
    protected:
      FileArrayBase(iostream& io,
                    size_t elements,
                    size_t elem_size);
      FileArrayBase(iostream& io);
      FileArrayBase(size_t elements, size_t elem_size);
      iostream& seekg(size_t index) const;
      iostream& seekp(size_t index) const;
      size_t size() const; // number of elements
      size_t element_size() const;
    private:
      class temporaryfile
      {
      public:
        temporaryfile();
        ~temporaryfile();
        iostream& stream();
      private:
        char* name;
        fstream fs;
      };
      void initFromFile(const char* p);
      ptr<temporaryfile> pfile;
      iostream& stream;
      size_t array_size;
      size_t e_size;
      streampos home;
    };

The only surprise here should be the nesting of the class ``temporaryfile.'' Yes, it's possible to define a class within a class. Since the ``temporaryfile'' class is defined in the private section of ``FileArrayBase'', it's inaccessible from anywhere other than the ``FileArrayBase'' implementation. It's actually possible to nest classes in class templates as well, but few compilers today support that. When implementing the member functions of the nested class, it looks a bit ugly, since the surrounding scope must be used.

    FileArrayBase::temporaryfile::temporaryfile()
      : name(::tempnam(".","array")),
        fs(name,ios::in|ios::out|ios::binary)
    {
      // what if tempnam fails and name is 0
      // what if fs is bad?
    }

    FileArrayBase::temporaryfile::~temporaryfile()
    {
      fs.close();
      ::remove(name);
      // What if remove fails?
      ::free(name);
    }

    iostream& FileArrayBase::temporaryfile::stream()
    {
      return fs;
    }

The implementation of ``FileArrayBase'' is very similar to the ``FileArray'' earlier. The only difference is that we use a parameter for the element size, instead of the traits class. To make life a little bit easier, I've assumed two arrays of char named ``ArrayBegin'' and ``ArrayEnd'', which hold the patterns to be used for marking the beginning and end of an array on disk.

    FileArrayBase::FileArrayBase(iostream& io,
                                 size_t elements,
                                 size_t elem_size)
      : stream(io),
        array_size(elements),
        e_size(elem_size)
    {
      char pattern[sizeof(ArrayBegin)];
      stream.read(pattern,sizeof(pattern));
      if (stream.eof()) {
        stream.clear(); // clear error state
                        // and initialize.
        // begin of array pattern.
        stream.write(ArrayBegin,sizeof(ArrayBegin));
        // must store size of elements
        stream.write((const char*)&elem_size,
                     sizeof(elem_size));
        // and of course the number of elements
        stream.write((const char*)&array_size,
                     sizeof(array_size));
        // Now that we've written the maintenance
        // stuff, we know what the home position is.
        home = stream.tellp();
        // Then we must go to the end and write
        // the end pattern.
        stream.seekp(home+elem_size*array_size);
        stream.write(ArrayEnd,sizeof(ArrayEnd));
        // set put and get pointer to past the end pos.
        stream.seekg(stream.tellp());
        return;
      }
      initFromFile(pattern); // shared with other
                             // stream constructor
      if (array_size != elements) {
        // Uh oh. The data read from the stream,
        // and the size given in the constructor
        // mismatches! What now?
        stream.clear(ios::failbit);
      }
      if (e_size != elem_size) {
        stream.clear(ios::failbit);
      }
      // set put and get pointer to past the end pos.
      stream.seekp(stream.tellg());
    }

    FileArrayBase::FileArrayBase(iostream& io)
      : stream(io)
    {
      char pattern[sizeof(ArrayBegin)];
      stream.read(pattern,sizeof(pattern));
      initFromFile(pattern);
      // set put and get pointer to past the end pos.
      stream.seekp(stream.tellg());
    }

    FileArrayBase::FileArrayBase(size_t elements,
                                 size_t elem_size)
      : pfile(new temporaryfile),
        stream(pfile->stream()),
        array_size(elements),
        e_size(elem_size),
        home(stream.tellg())
    {
      stream.seekp(home+array_size*e_size);
      char c = 0;
      stream.write(&c,1);
      // set put and get pointer to past the end pos.
      stream.seekg(stream.tellp());
    }

    void FileArrayBase::initFromFile(const char* p)
    {
      // Check if the read pattern is correct
      if (strncmp(p,ArrayBegin,sizeof(ArrayBegin))) {
        // What to do? It was all wrong!
        stream.clear(ios::failbit); // for lack of better,
                                    // set the fail flag.
        return;
      }
      // OK, we have a valid array, now let's see if
      // it's of the right kind.
      stream.read((char*)&e_size,sizeof(e_size));
      // Get the size of the array. Can't do much with
      // the size here, though.
      stream.read((char*)&array_size,sizeof(array_size));
      // Now we're past the header, so we know where the
      // data begins and can set the home position.
      home = stream.tellg();
      stream.seekg(home+e_size*array_size);
      // Now positioned immediately after the last
      // element.
      char epattern[sizeof(ArrayEnd)];
      stream.read(epattern,sizeof(epattern));
      if (strncmp(epattern,ArrayEnd,sizeof(ArrayEnd))) {
        // Whoops, corrupt file!
        stream.clear(ios::failbit);
        return;
      }
      // Seems like we have a valid array!
    }

    iostream& FileArrayBase::seekg(size_t index) const
    {
      // what if index is out of bounds?
      stream.seekg(home+index*e_size);
      // what if seek failed?
      return stream;
    }

    iostream& FileArrayBase::seekp(size_t index) const
    {
      // What if index is out of bounds?
      stream.seekp(home+index*e_size);
      // What if seek failed?
      return stream;
    }

    size_t FileArrayBase::size() const
    {
      return array_size;
    }

    size_t FileArrayBase::element_size() const
    {
      return e_size;
    }

Apart from the tricky questions, it's all pretty straightforward. The really good news, however, is how easy this makes the implementation of the class template ``FileArray''. Now watch this!

    template <class T>
    class FileArray : private FileArrayBase
    {
    public:
      FileArray(iostream& io, size_t elements); // create one.
      FileArray(iostream& io);    // use existing array
      FileArray(size_t elements); // create temporary
      T operator[](size_t index) const;
      FileArrayProxy<T> operator[](size_t index);
      size_t size() const { return FileArrayBase::size(); };
    private:
      FileArray(const FileArray<T>&); // illegal
      FileArray<T>& operator=(const FileArray<T>&); // illegal
      T readElement(size_t index) const;
      void storeElement(size_t index, const T& elem);
      friend class FileArrayProxy<T>;
    };

    template <class T>
    FileArray<T>::FileArray(iostream& io, size_t elements)
      : FileArrayBase(io,
                      elements,
                      FileArrayElementAccess<T>::size)
    {
    }

    template <class T>
    FileArray<T>::FileArray(iostream& io)
      : FileArrayBase(io)
    {
      // what if element_size is wrong?
    }

    template <class T>
    FileArray<T>::FileArray(size_t elements)
      : FileArrayBase(elements,
                      FileArrayElementAccess<T>::size)
    {
    }

    template <class T>
    T FileArray<T>::operator[](size_t index) const
    {
      // what if index>= size()?
      return readElement(index);
    }

    template <class T>
    FileArrayProxy<T> FileArray<T>::operator[](size_t index)
    {
      // what if index>= size()?
      return FileArrayProxy<T>(*this, index);
    }

    template <class T>
    T FileArray<T>::readElement(size_t index) const
    {
      // what if index>= size()?
      iostream& s = seekg(index); // parent seekg
      T t = FileArrayElementAccess<T>::readFrom(s);
      // what if read failed?
      // What if too much data was read?
      return t;
    }

    template <class T>
    void FileArray<T>::storeElement(size_t index,
                                    const T& element)
    {
      // what if index>= size()?
      iostream& s = seekp(index); // parent seekp
      // what if seek fails?
      FileArrayElementAccess<T>::writeTo(element,s);
      // what if write failed?
      // What if too much data was written?
    }

How much easier can it get? This reduces code bloat, and also makes the source code easier to understand, extend and maintain.

What can go wrong?
Already in the very beginning of this article series, part 1, I introduced exceptions, the C++ error handling mechanism. Of course exceptions should be used to handle the error situations that can occur in our array class. When I introduced exceptions, I didn't tell the whole truth about them. There was one thing I didn't tell, because at that time it wouldn't have made much sense. That one thing is that when exceptions are caught, dynamic binding works, or to use wording slightly more English-like, we can create exception class hierarchies with public inheritance, and we can choose what level to catch. Here's a mini example showing the idea:

    class A {};
    class B : public A {};
    class C : public A {};
    class B1 : public B {};

    void f() throw (A); // may throw any of the above

    void x()
    {
      try {
        f();
      }
      catch (B& b) {
        // **1
      }
      catch (C& c) {
        // **2
      }
      catch (A& a) {
        // **3
      }
    }

At ``**1'' above, objects of class ``B'' and class ``B1'' are caught if thrown from ``f''. At ``**2'' objects of class ``C'' (and descendants of C, if any are declared elsewhere) are caught. At ``**3'' all others from the ``A'' hierarchy are caught.

This may seem like a curious detail of purely academic worth, but it's extremely useful. We can use abstraction levels for errors. For example, we can have a root class ``FileArrayException'', from which all other exceptions regarding the file array inherit. We can see that there are clearly two kinds of errors that can occur in the file array: abuse, and environmental issues outside the control of the programmer. By abuse I mean things like indexing outside the valid bounds, and by environmental issues I mean faulty or full disks. (Since there are several programs running, a check if there's enough disk space is still taking a chance. Even if there was enough free space when the check was made, that space may be occupied when the next statement in the program is executed.) A reasonable start for the exception hierarchy then becomes:

    class FileArrayException {};
    class FileArrayLogicError : public FileArrayException {};
    class FileArrayRuntimeError : public FileArrayException {};

Here ``FileArrayLogicError'' is for clear violations of the not too clearly stated preconditions, and ``FileArrayRuntimeError'' is for things that the programmer may not have a chance to do something about. In a perfectly debugged program, the only exceptions ever thrown from file arrays will be of the ``FileArrayRuntimeError'' kind. We can divide those further into:

    class FileArrayCreateError : public FileArrayRuntimeError {};

For whenever the creation of the array fails, regardless of why (it's not very easy to find out if it's a faulty disk or lack of disk space, for example.)

    class FileArrayStreamError : public FileArrayRuntimeError {};

If, after creation, something goes wrong with a stream, for example if seeking or reading/writing fails.

    class FileArrayDataCorruptionError : public FileArrayRuntimeError {};

If an array is created from an old existing file, and we note that the header or trailer doesn't match the expected.

The logic errors can be divided in the same way:

    class FileArrayBoundsError : public FileArrayLogicError {};

Addressing outside the legal bounds.

    class FileArrayElementSizeError : public FileArrayLogicError {};

If the read/write members of the element access traits class are faulty and either write too much (thus overwriting the data for the next element) or read too much (in which case the last few bytes read will be garbage picked from the next element.)

It's of course possible to take this even further. I think this is quite enough, though. Now we have a reasonably fine level of error reporting, yet an application that wishes a coarse level of error handling can choose to catch the higher levels of the hierarchy only.

As an exercise, I invite you to add the throws to the code. Beware, however, that it's not a good idea to add exception specifications to the member functions making use of the T's (since you cannot know which operations on T's may throw, and what they throw.) You can increase the code size and legibility gain from the private inheritance of the implementation in the base by putting quite a lot of the error handling there.

Iterators
An iterator into a file array is something whose behavior is analogous to that of pointers into arrays. We want to be able to create an iterator from the array (in which case the iterator refers to the first element of the array.) We want to access that element by dereferencing the iterator (unary operator *,) and we want iterator arithmetic with integers. An easy way of getting there is to let an iterator contain a pointer to a file array, and an index. Whenever the iterator is dereferenced, we return (*array)[index]. That way we even get error handling for iterator arithmetic that leads us outside the valid range for the array, for free from the array itself. The iterator arithmetic becomes simple too.
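The ``pointer to array plus index'' design can be tried out on an in-memory stand-in. In this sketch, ``CheckedArray'', ``CheckedIterator'' and ``BoundsError'' are hypothetical names chosen for the illustration, and a ``std::vector<int>'' plays the part of the file. The point to observe is that the iterator contains no bounds checking of its own; the check, and the exception, come from the array when the iterator is dereferenced.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct BoundsError {};  // hypothetical exception type for this sketch

// Stand-in for the file array: indexing is bounds checked.
class CheckedArray
{
public:
  explicit CheckedArray(std::size_t n) : data(n) {}
  int& operator[](std::size_t index)
  {
    if (index >= data.size()) throw BoundsError();
    return data[index];
  }
  std::size_t size() const { return data.size(); }
private:
  std::vector<int> data;
};

// The iterator is just a pointer to the array and an index.
class CheckedIterator
{
public:
  explicit CheckedIterator(CheckedArray& a) : array(&a), index(0) {}
  // Moving the iterator is plain arithmetic on the index, never an error.
  CheckedIterator& operator+=(long n) { index += n; return *this; }
  // Dereferencing delegates to the array, which does the bounds check.
  int& operator*() const
  {
    assert(array != 0);
    return (*array)[index];
  }
private:
  CheckedArray* array;
  unsigned long index;
};
```

Stepping an iterator past the end is thus harmless in itself; only an attempt to use the element it then refers to results in a ``BoundsError''.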

• iterator1>=iterator2 returns !(iterator1<iterator2). It's just a lot of code to write. This addition is never an error. all that's needed is to define the operations needed for the iterators. Operator -= is analogous. and the actions we want. As an example. and analogous for operator-. thus reducing the amount to write and also the risk for errors. Likewise for operator>. it's an error and we throw an exception. a rule of thumb when writing a class for which an object ``o'' and some other value ``v'' the operations ``o+=v''. • iterator+=n (where n is of type long int) adds n to the value of the index in the iterator.. quite a lot of code can be reused over and over. template <class T> FileArrayIterator<T>::FileArrayIterator( const FileArray<T>& a ) : array(&a). • iterator1-iterator2 yields a long int which is the difference between the indices of the iterators. • iterator1==iterator2 returns non-zero if the arrays and indices of iterator1 and iterator2 are equal. private: FileArray<T>* array. unless you want to give the class users some rather unhealthy surprises) is to define ``operator+='' as a member of the class.e a • leArrayProxy. Here's my idea: • creation from array yields iterator referring to first element • copy construction and assignment are of course well behaved. If iterator1 and iterator2 refer to different arrays. index(0) { } template <class T> FileArrayIterator<T>::FileArrayIterator( const FileArrayIterator<T>& i ) . however. • iterator1<iterator2 returns true if the iterators refer to the same array and iterator1.index+n:th element of the array. With a little thought. • iterator+n yields a new iterator referring to the iterator. • addition of array and ``long int'' value ``n'' yields iterator referring to n:th element of array. . it's an error and we throw an exception. long n).. template <class T> FileArrayIterator<T> operator+(const FileArrayIterator<T>& i. * iterator[n] returns (*array)[index+n]. If the iterators refer to different arrays. 
FileArrayProxy<T> operator*(). ``o+v'' and ``v+o'' are well defined and behaves like they do for the built in types (which they really ought to. Likewise for operator<=. i. template <class T> FileArrayIterator<T> operator+(long n. FileArrayProxy<T> operator[](long n).index. FileArrayIterator<T>& operator+=(long n). and two versions of operator+ that are implemented with ``operator+=''.since it's just ordinary arithmetics on the index type. it's dereferencing the iterator that's an error if the index is out of range. • moving forwards and backwards with operator++ and operator--. Neither of the above is difficult. and thus a good chance of making errors. I think the above is an exhaustive list. The implementation thus seems easy. unsigned long index. const FileArrayIterator<T>& i). }.index < iterator2. • iterator1!=iterator2 returns !(iterator1==iterator2) • *iterator returns whatever (*array)[index] returns. Here's how it's done in the iterator example: template <class T> class FileArrayIterator { public: FileArrayIterator(FileArray<T>& f).

        : array(i.array), index(i.index)
    {
    }

    template <class T>
    FileArrayIterator<T>& FileArrayIterator<T>::operator+=(long n)
    {
        index+=n;
        return *this;
    }

    template <class T>
    FileArrayIterator<T> operator+(long n, const FileArrayIterator<T>& i)
    {
        FileArrayIterator<T> it(i);
        return it+=n;
    }

    template <class T>
    FileArrayIterator<T> operator+(const FileArrayIterator<T>& i, long n)
    {
        FileArrayIterator<T> it(i);
        return it+=n;
    }

    template <class T>
    FileArrayProxy<T> FileArrayIterator<T>::operator*()
    {
        return (*array)[index];
    }

    template <class T>
    FileArrayProxy<T> FileArrayIterator<T>::operator[](long n)
    {
        return (*array)[index+n];
    }

Surely, the code for the two versions of ``operator+'' must be written, but since its behaviour is defined in terms of ``operator+='' it means that if we have an error, there's only one place to correct it.

There's no need to display all the code here in the article; you can study it in the sources. The above shows how it all works, and as you can see, it's fairly simple.

Recap

This month the news in short was:

• You can increase flexibility for your templates without sacrificing ease of use or safety by using traits classes.
• Enumerations in classes can be used to have class-scope constants of integral type.
• Modern compilers do not need the above hack. Defining a class-scope static constant of an integral type in the class declaration is cleaner and more type safe.
• Standard C++, and even C, does not have any support for the notion of temporary files. Fortunately there are commonly supported extensions to the languages that do.
• Private inheritance can be used for code reuse.
• Private inheritance is very different from public inheritance. Public inheritance models ``is-A'' relationships, while private inheritance models ``is-implemented-in-terms-of'' relationships.
• A user of a class that has privately inherited from something else cannot take advantage of this fact. To a user the private inheritance doesn't make any difference.
• Private inheritance is in real-life used far less than it should be. In many situations where public inheritance is used, private inheritance should've been used.
• Exception catching is polymorphic (i.e. dynamic binding works when catching.)

• The polymorphism of exception catching allows us to create an arbitrarily fine-grained error reporting mechanism while still allowing users who want a coarse error reporting mechanism to use one (they'll just catch classes near the root of the exception class inheritance tree.)
• Always implement binary operator+, operator-, operator* and operator/ as functions outside the classes, and always implement them in terms of the operator+=, operator-=, operator*= and operator/= members of the classes.

Exercises

• Alter the file array such that it's possible to instantiate two (or more) kinds of FileArray<X> in the same program, where the alternatives store the data in different formats. (Hint: the alternatives will all need different traits class specialisations.)
• What's the difference between using private inheritance of a base class, and using a member variable of that same class, for reusing code?
• In which situations is it crucial which alternative you choose?

Coming up

Next month we'll have a look at smart pointers. I'm beginning to dry up on topics now, so please write and give me suggestions for future topics to cover.

[Note: the source code for this month is here. Ed.]

In the past two articles, we've seen how a simple smart pointer, called simply ``ptr<T>'', was used to make memory handling a little bit easier. That is the core purpose of all smart pointers. You can find a few things in common with them all. They're templates, their syntax resembles that of pointers, they aren't pointers, they relieve you of the burden of remembering to deallocate the memory, and they're dangerous if you forget that you're dealing with smart pointers.

While ``ptr<T>'' served its purpose, however, it's a bit too simplistic to be generally useful. For example there was no way to rebind an object to another pointer, or to tell it not to delete the memory (that too can be useful at times, as we will see later in this article.)

This article is devoted to the only smart pointer provided by the standard C++ library, the class template ``auto_ptr<T>''.

The problem to solve

I do not know what the core issues were when the ``auto_ptr<T>'' was designed, but I know what problems the implementation provided does solve.

• exception safety
• safe memory area ownership transfer
• no confusion with normal pointers
• controlled and visible rebinding and release of ownership
• works with dynamic types
• pointer-like syntax for pointer-like behaviour

Let us have a look at each of these in some detail and compare with the previous ``ptr<T>''.

Exception safety

In this respect ``auto_ptr<T>'' and ``ptr<T>'' are equal. They both delete whatever they point to in their destructor. The only thing that ``auto_ptr<T>'' has to offer over ``ptr<T>'', with respect to exception safety, is that we can tell an ``auto_ptr<T>'' object that it no longer owns a memory area. This can be used for holding onto something we want to return in normal cases, but deallocate in exceptional situations. Here is a code fragment showing such a situation:

    void f(); // may throw something

    int* ptr()
    {
        // returns a newly allocated value on success,
        // or the 0 pointer on failure.
        try {
            auto_ptr<int> p(new int(1));

            f();
            return p.release();
        }
        catch (...)
        {
            return 0;
        }
    }

In the code above, an exception thrown from ``f'' results in the destruction of the ``auto_ptr<int>'' object ``p'' before the call of the ``release'' member function, which means that the object pointed to will be deleted. If, however, ``f'' does not throw any exception, ``p.release()'' is called, and the value returned from there is passed to the caller. The member function ``release'' releases ownership of the memory area from the ``auto_ptr<int>'' object. The value returned is the pointer to the memory area. Since the object no longer owns the memory area, it will not be deallocated.

Safe memory area ownership transfer

This safety is achieved by cheating in the ``assignment'' operator and ``copy'' constructor. The reason I've quoted the names assignment and copy is that the cheat is so bad that it's not really an assignment and definitely not a copy. Rather, both are ownership transfer operations. What happens is that they both modify the right hand side, by relieving it of ownership while accepting ownership itself. Below are some examples of this:

    // simple transfer
    auto_ptr<int> p1(new int(1)); // p1 owns the memory area.
    auto_ptr<int> p2;             // p2 doesn't own anything.
    p2 = p1;                      // now p2 owns the memory area, p1 doesn't.
                                  // p1 has become the 0 "pointer"
    auto_ptr<int> p3(p2);         // Now it is p3 that owns the memory area, not
                                  // p1 or p2.

An important issue here for those of you who have used early versions of the ``auto_ptr<T>'' is that older versions did not become 0 ``pointers'' when not owning the memory area, but that is the behaviour set in the final draft of the C++ standard.

Of course, the above program snippet is too simplistic to be useful. The properties of the ``auto_ptr<T>'' are more useful when working with functions, where it works as both documentation and implementation of ownership transfer. What do you think about this?

    auto_ptr<int> creation();
    void termination(auto_ptr<int> rip);

    void f()
    {
        auto_ptr<int> pi=creation();
        // It's now clear that we are responsible for
        // deletion of the memory area allocated. Even
        // if we chose to release ownership from "pi",
        // we must take care of deallocation somehow.

        // use pi for something

        termination(pi);
        // since we're sending it off as an auto_ptr<T>,
        // and since the function "termination" wants an
        // auto_ptr<T> it wants the responsibility,
        // deletion is not our headache anymore.
    }

One of the headaches of using dynamically allocated memory is knowing who is responsible for deallocating the memory at any given moment in a program's lifetime. The ``auto_ptr<T>'' makes that rather easy, as can be seen above. Any function that returns an ``auto_ptr<T>'' leaves it to the caller to take care of the deallocation, and any function that accepts an ``auto_ptr<T>'' requires ownership to work. I think the above example speaks for itself. If something goes wrong between calling ``creation'' and ``termination'', such that ``termination'' ought not be called, we must take care of the

deallocation, but since we have it in an ``auto_ptr<T>'' that is automatically done for us if we return or throw an exception.

In this respect ``auto_ptr<T>'' is better than ``ptr<T>'', since the latter doesn't have any way of transferring ownership.

No confusion with normal pointers

Since the auto pointers have the behaviour outlined above, it is extremely important that they cannot accidentally be confused with normal pointers. This is done by explicitly prohibiting all implicit conversions between pointer types and ``auto_ptr<T>'' types. The erroneous code below shows how:

    auto_ptr<int> creator();
    void termination(auto_ptr<int>);

    void f()
    {
        int* p = creator(); // illegal. An auto_ptr<T> cannot be implicitly
                            // converted to a pointer.
        auto_ptr<int> ap=p; // also illegal. A pointer cannot be implicitly
                            // converted to an auto_ptr<T>. When creating a
                            // new auto_ptr<T> object, use the constructor
                            // syntax auto_ptr<T> ap(p).
        ap=p;               // also illegal. If you want to rebind an
                            // auto_ptr<T> object to point to something else,
                            // use the "reset" member function:
                            // ap.reset(p).
        termination(p);     // illegal. The auto_ptr<T> required by the
                            // termination function cannot be implicitly
                            // created from the pointer.
    }

It is indeed fortunate that the first and last error above are illegal. What would the first mean? Would the implicit conversion from ``auto_ptr<T>'' to a raw pointer transfer ownership or not? All implementations I have seen where such implicit conversions are allowed do not transfer the ownership, which in the situation above means that the memory would be deallocated when the ``auto_ptr<T>'' object returned is destroyed (which it would be immediately after the conversion.) The last is as bad. What about this situation?

    int i;
    termination(&i);

Ouch! The function would attempt to delete the local variable. For both the result will be something that in standardese is called ``undefined behaviour'', but which in normal English best translates to ``a crash now, a crash later, or generally funny behaviour (possibly followed by a crash later.)'' Well, since it is illegal, we do not have to worry about it. This functionality is an advantage that ``auto_ptr<T>'' offers over ``ptr<T>'', since ``ptr<T>'' does allow implicit construction, allowing the last error.

The second error, ``auto_ptr<int> ap=p;'', is perhaps a bit unfortunate since the intended behaviour is clear. That it is illegal comes as a natural consequence of banning the third situation, ``ap=p'', which is not clear. ``ap'' might be declared somewhere far far away, so that in the code near the assignment it is not obvious if it is an ``auto_ptr<T>'' or a normal pointer. Imagine the maintenance headaches you could get otherwise.

Controlled and visible rebinding and release of ownership

If we want to rebind an ``auto_ptr<T>'' object to another memory area, it is important that the memory area currently owned by the object (if any) is deallocated. The member function ``reset'' takes care of that. Calling ``ap.reset(p)'' will deallocate whatever ``ap'' owns (if anything) and make it own whatever ``p'' points to.

If we want a normal pointer from an ``auto_ptr<T>'' object, we can get it in two ways, depending on the desired effect. The member function ``release'' gives us a normal pointer to the memory area owned by the

``auto_ptr<T>'' object, and also gives us the ownership, so that we will be responsible for the deallocation. If we do not want that responsibility, but temporarily need a normal pointer to the memory area, we use the ``get'' member function. Here is an example showing the differences:

    void func(const int*); // may throw

    int* f(void)
    {
        auto_ptr<int> p(new int(1));
        func(p.get());      // call func with a normal pointer
        return p.release(); // return the pointer
    }

Above we see that the function ``func'' requires a normal pointer, but it does not assume ownership, so we use the ``get'' member to temporarily get the pointer and pass it to ``func''. This function ``f'' then returns the raw pointer if ``func'' does its job, but if it fails with an exception, the ``auto_ptr<int>'' object ``p'' will deallocate the memory in its destructor.

Since ``ptr<T>'' was specifically designed to disallow transfer of ownership, this functionality is added-value for ``auto_ptr<T>''.

Works with dynamic types

Just as a normal pointer to a base class can point to an object of a publicly derived class, an ``auto_ptr<T>'' can too. This is not particularly strange:

    class A {};
    class B : public A {};

    auto_ptr<A> pa(new B());
    auto_ptr<B> pb(new B());
    pa=pb;
    auto_ptr<A> pa2(pb);

The reverse is (of course) not allowed. For ``ptr<T>'' this is not a problem, since the functionality is only required if ownership transfer is allowed.

Pointer-like syntax for pointer-like behaviour

For the small subset of a pointer's functionality that is implemented in the ``auto_ptr<T>'' class template, the syntax is exactly the same. We get access to the element pointed to with ``operator*'' and ``operator->''. This is the only functionality of a pointer that is implemented. Here it is a tie between ``auto_ptr<T>'' and ``ptr<T>'', since the functionality is exactly the same and so is the syntax.

Implementation

The definition of ``auto_ptr<T>'' looks as follows:

    template <class T>
    class auto_ptr
    {
    public:
        explicit auto_ptr(T* t = 0) throw ();
        auto_ptr(auto_ptr<T>& t) throw();
        template <class Y>
        auto_ptr(auto_ptr<Y>& t) throw();
        auto_ptr<T>& operator=(auto_ptr<T>& t) throw ();
        template <class Y>
        auto_ptr<T>& operator=(auto_ptr<Y>& t) throw ();
        ~auto_ptr() throw ();
        T& operator*() const throw ();
        T* operator->() const throw ();
        T* release() throw ();
        void reset(T* t = 0) throw ();
        T* get(void) const throw ();
    private:
        T* p;
    };

Three new details can be seen above. The keyword ``explicit'' in front of the constructor, and ``template <class Y>'' inside the class definition. Both of these are relatively recent additions to the C++ language and far from all compilers support them. Third, and most important (please take note of this,) is that the ``copy'' constructor and ``assignment'' operator do take a non-const reference to their right hand side, so that it can be modified.

``explicit'' is what disallows implicit construction of objects, for example in function calls (see the error example above, when attempting to call a function requiring an ``auto_ptr<T>'' parameter with a normal pointer.) This keyword is, strictly speaking, not needed. There is a fake around it, which you will see when we get to the implementation details.

The member templates, as the ``template <class Y>'' used inside the class definition is called, are a way of creating new member functions at need. This is what makes it possible to say ``auto_ptr<A> pa; auto_ptr<B> pb; pa=pb''. With this mini-example, and the code above, we can see that a member function

    auto_ptr<A>& auto_ptr<A>::operator=(auto_ptr<B>&) throw()

will be generated. If class ``B'' is publicly derived from class ``A'', the generated code will compile just fine, otherwise we will get an error message from the compiler. Unfortunately even fewer compilers support this than support the ``explicit'' keyword. This feature can, to the best of my knowledge, not be worked around. It is an essential addition to the C++ language.

The code

Let us do the member functions one by one, beginning with the constructor. The only thing it needs to do is to initialize the ``auto_ptr<T>'' object such that it owns the memory area, and if it points to anything at all, it owns it (by definition.)

    template <class T>
    inline auto_ptr<T>::auto_ptr(T* ptr) throw()
        : p(ptr)
    {
    }

The ``inline'' keyword is new for this course, although it has been part of C++ for a very long time, and it will be used for all member functions of the ``auto_ptr<T>''. Marking a function ``inline'' is a way of hinting to the compiler that you think this function is so simple that it can insert the function code directly where needed, instead of making a function call. This is just a hint; a compiler is free to ignore it, and likewise a good compiler may inline even functions not marked as inline (provided you cannot see any difference in the behaviour of the program.) Few compilers are smart enough to inline automatically, however, so there's a place for the ``inline'' keyword.

This constructor is marked ``explicit'' in the class definition. I mentioned above that the ``explicit'' keyword is not, strictly speaking, necessary. Here is the promised work-around:

    template <class T>
    class explicit
    {
    public:
        explicit(T t) : value(t) {};
        operator T() const { return value; };
    private:
        T value;
    };

    template <class T>
    inline auto_ptr<T>::auto_ptr(explicit<T*> ptr) throw()
        : p(ptr)
    {
    }

The way this works is as follows: By default, implicit conversions are allowed, but only one user defined implicit conversion may take place. It is an error if two or more implicit conversions are required to get the desired effect. Constructing an object is a user defined conversion, and executing a conversion operator is too. Look at this example usage:

    void termination(auto_ptr<int> pi);

    void func()
    {

        auto_ptr<int> pi(new int(1)); //**1 legal
        termination(new int(2));      //**2 error
    }

The code at //**1 is not in error, because we say that we want an object of type ``auto_ptr<int>''. Since we've been so stern about this, we will be obeyed. It may seem like there are two implicit conversions taking place here, one for creating the ``explicit<T>'' object, and one for getting the value out of it, but that is not quite true. Our ``auto_ptr<int>'' accepts as its parameter an ``explicit<int*>'' which is implicitly created from the pointer value. Then, it is a detail of the innards of the ``auto_ptr<T>'' constructor how it is used, in this case to get the value from it.

The code at //**2, however, is in error, because the call to ``termination'' requires two user defined conversions. One from ``int*'' to ``explicit<int*>'', and one from ``explicit<int*>'' to ``auto_ptr<int>''.

Please see the provided source code for how to allow both versions to coexist for different compilers in the same source file.

    template <class T>
    inline auto_ptr<T>::auto_ptr(auto_ptr<T>& t) throw()
        : p(t.release())
    {
    }

There is not much strange going on here, except that the parameter is a non-const reference. As mentioned far above, the member ``release'' relieves the object of ownership and returns the pointer. Thus this constructor makes ``p'' point to what ``t'' did point to, and alters ``t'' so that it becomes the 0 ``pointer''.

    template <class T> template <class Y>
    inline auto_ptr<T>::auto_ptr(auto_ptr<Y>& t) throw()
        : p(t.release())
    {
    }

The code for this constructor is, of course, the same as for the previous one. Note that both are necessary. I made a mistake with the ``auto_ptr<T>'' implementation available in the adapted SGI STL, by the way, in that I thought the latter would imply the former. It doesn't, even though it may seem so.

Note, however, the syntax for a member template, with the two subsequent ``template <...>''. Of course, users of compilers that do not implement member templates will get compilation errors on this member function. Please see the source code for how to work around the compilation error (the work around is simply not to have this member function, which means that the resulting ``auto_ptr'' will be limited in functionality.)

    template <class T>
    inline auto_ptr<T>::~auto_ptr() throw ()
    {
        delete p;
    }

If the object owns anything, it will be deleted by the destructor. If the object does not own anything, ``p'' will be the 0 pointer. Note that deleting the 0 pointer is legal, and does nothing at all.

    template <class T>
    inline auto_ptr<T>& auto_ptr<T>::operator=(auto_ptr<T>& t) throw ()
    {
        reset(t.release());
        return *this;
    }

    template <class T> template <class Y>
    inline auto_ptr<T>& auto_ptr<T>::operator=(auto_ptr<Y>& t) throw ()
    {
        reset(t.release());
        return *this;
    }

This is pretty much the same story as the ``copy'' constructor. Since we are not creating a new object, however, we cannot just assign to ``p'' (it may point to something, in which case it must be deallocated.) The member function ``reset'' does exactly what we want: delete whatever ``p'' points to, and give it a new value.

    template <class T>
    inline T& auto_ptr<T>::operator*() const throw ()
    {
        return *p;
    }

    template <class T>
    inline T* auto_ptr<T>::operator->() const throw ()
    {
        return p;
    }

These are not identical with the version of ``ptr<T>'' from the previous issue of the course. One word on the way, though. ``operator->'' can, of course, only be used if ``T'' is a struct or class type, and not built-in types. On some older compilers, it's even illegal to instantiate ``auto_ptr<T>'' if ``T'' is not a struct or class type. In most cases this is a minor limitation; after all, it is normally structs and classes you handle this way.

    template <class T>
    inline T* auto_ptr<T>::release() throw ()
    {
        T* tp=p;
        p=0;
        return tp;
    }

Not much to say about this one, is there? The object is relieved of ownership by making ``p'' the 0 pointer, and the value previously held by ``p'' is returned, just as mentioned in the introduction of the class.

    template <class T>
    inline void auto_ptr<T>::reset(T* t) throw ()
    {
        if (t != p) {
            delete p;
            p=t;
        }
    }

Deletes what it points to and sets ``p'' to the given value. Nothing strange, except the safety guard against resetting to the value already held. If we didn't have this guard, resetting to the current value would deallocate the memory and keep the ownership of it, for later deletion again! It seems like a better way is to just do nothing if the situation ever arises.

    template <class T>
    inline T* auto_ptr<T>::get(void) const throw ()
    {
        return p;
    }

Not much to say.

Efficiency

The question of efficiency pops up now and then. How much does it cost, performance and memory-wise, to use the ``auto_ptr<T>'' instead of ``ptr<T>'' from last month? If you use ``auto_ptr<T>'' instead of ``ptr<T>'', and use only the functionality that ``ptr<T>'' offers, the price is nothing at all. The constructor, destructor, ``operator*'' and ``operator->'' hold exactly the same code for both templates. You pay for what you use only.

Compared to raw pointers and doing your own deletion? I do not know. It will depend a lot on how clever your compiler is with inlining. Most probably close to none at all. If you have a measurable speed difference in a real-world application, I would say the difference is that with ``auto_ptr<T>'' you do many more deletions (i.e. you have mended memory leaks you were not aware of having.)
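To see the member functions described above working together, here is a minimal sketch. It is not the article's source code: ``mini_auto_ptr'', ``Tracked'' and ``ownership_demo'' are hypothetical names made up for this illustration, the member templates are left out, and the ``throw ()'' exception specifications are dropped on the assumption that you are using a current compiler, where they are no longer accepted.

```cpp
#include <cassert>

// Hypothetical, cut-down stand-in for the article's auto_ptr<T>.
// Assumption: no exception specifications and no member templates.
template <class T>
class mini_auto_ptr
{
public:
    explicit mini_auto_ptr(T* t = 0) : p(t) {}
    mini_auto_ptr(mini_auto_ptr& t) : p(t.release()) {}
    mini_auto_ptr& operator=(mini_auto_ptr& t)
    {
        reset(t.release());
        return *this;
    }
    ~mini_auto_ptr() { delete p; }
    T* get() const { return p; }
    T* release() { T* tp = p; p = 0; return tp; }
    void reset(T* t = 0) { if (t != p) { delete p; p = t; } }
private:
    T* p;
};

// Hypothetical helper type that counts how many objects are alive.
struct Tracked
{
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Walks through transfer, guarded reset, rebinding and release.
// Returns the number of Tracked objects still alive at the end.
int ownership_demo()
{
    mini_auto_ptr<Tracked> p1(new Tracked);
    mini_auto_ptr<Tracked> p2(p1);  // transfer: p1 becomes the 0 "pointer"
    assert(p1.get() == 0 && p2.get() != 0);

    p2.reset(p2.get());             // guarded: same value, nothing happens
    assert(Tracked::alive == 1);

    p2.reset(new Tracked);          // old object deleted, new one owned
    assert(Tracked::alive == 1);

    Tracked* raw = p2.release();    // deletion is now our responsibility
    assert(p2.get() == 0 && Tracked::alive == 1);
    delete raw;

    return Tracked::alive;
}
```

Calling ``ownership_demo()'' from your own ``main'' should return 0 if every object was deleted exactly once. Removing the ``if (t != p)'' guard from ``reset'' makes the second step delete the object while still holding on to it, which is exactly the double deletion the guard is there to prevent.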

Recap

The news this month were:

• The standard class template ``auto_ptr<T>'' handles memory deallocation and ownership transfer.
• Automatic memory deallocation and ownership transfer reduces the risk for memory leaks, especially when exceptions occur.
• Implicit conversions between raw pointers and smart pointers are bad (even if it may seem tempting at first.)
• The ``explicit'' keyword disallows implicit construction of objects.
• The ``explicit'' keyword can be faked.
• Member templates can be used to create member functions at compile time, just like function templates can be used to create functions at compile time.
• ``inline'' hints to a compiler that you think a function is so small that it is better to directly insert the function code where required instead of making a function call.

Exercises

• Why is it a bad idea to have arrays (or other collections) of ``auto_ptr<T>''?
• Can smart pointers be dangerous? When? ``auto_ptr<T>'' too?
• What is a better name for this function template?

    template <class T>
    void funcname(auto_ptr<T>)
    {
    }

• What happens if ~T throws an exception?

Coming up

If I missed something, or you want something clarified further or disagree with me, please drop me a line and I'll address your ideas in future articles.

Next month we'll have a look at a smarter pointer, a reference counted one. I'm beginning to dry up on topics now, so please write and give me suggestions for future topics to cover.

Introduction

Last month's ``auto_ptr<T>'' class template documents and implements ownership transfer of dynamically allocated memory. Often, however, we do not want to be bothered with ownership. We want several places of the code to be able to access the memory, but we also want to be sure that the memory is deallocated when no longer needed. The general solution to this is called automatic garbage collection (something you can read several theses on, and also buy a few commercially available libraries for.) The less general solution is reference counting; no owner, but last one out locks the door.

The idea is that a counter is attached to every object allocated. When allocated it is set to 0. When the first smart pointer attaches to it, the count is incremented to 1. Every smart pointer attaching to the resource increments the reference count, and for every smart pointer detaching from a resource (the smart pointer destroyed, or assigned another value) the resource's counter is decremented. If the counter reaches zero, no one is referring to it anymore, so the resource must be deallocated. The weakness of this compared to automatic garbage collection is that it does not work with circular data structures (the count never goes below 1.)

The problems to solve

Many of the problems with a reference counting pointer are the same as for the auto pointer. The list is actually a bit shorter, however, since there's no need to worry about ownership.

• exception safety
• no confusion with normal pointers
• controlled and visible rebinding and access
• works with polymorphic types
• pointer-like syntax for pointer-like behaviour
• automatic deletion when no longer referring to the object

This might also be the place to mention a problem not to solve, that of how to stop reference counting a resource. Adding this functionality is not difficult, but it quickly leads to user code that is extremely hard to maintain. This is exactly what we want to avoid, so it is better not to have the functionality.

Here is how it is supposed to work when we are done:

    counting_ptr<int> P1(new int(value));

After creating a counting pointer ``P1'', the reference count for the value pointed to is set to one.

    counting_ptr<int> P2(P1);

When a second counting pointer ``P2'' is created from ``P1'', the object pointed to is not duplicated, but the reference count is incremented.

    counting_ptr<int> P3(P2);

When three counting pointers refer to the same object, the value of the counter is three.

    P1.manage(new int(other));

As one of the pointers referring to the first object created is reinitialized to another object, the old reference count is decremented (there are only two references to it now) and the new one is assigned a reference count of 1.

    P2=P1;

When yet one of the pointers moves attention from the old object to the new one, the counter for the old one is yet again decremented, and for the new one it is incremented.

    P3=P2;

Now that the last counting pointer referring to the old object moves its attention away from it, the old object's reference count goes to zero, and the object is deallocated. Now instead the new object has a reference count of 3, since there are three reference counting pointers referring to it.
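The counts in the walkthrough above can be checked on a current compiler. The sketch below does not use the ``counting_ptr<T>'' that this article goes on to build; it assumes C++11 and substitutes the standard library's ``std::shared_ptr'', which follows the same counting scheme, with ``reset'' playing the role of ``manage''. The function name ``count_trace'' and the ``std::weak_ptr'' observer are additions made only so that the old object's count can still be read after ``P1'' has let go of it.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Records the reference counts after each step of the walkthrough.
// Assumption: std::shared_ptr stands in for the article's counting_ptr.
std::vector<long> count_trace()
{
    std::vector<long> trace;

    std::shared_ptr<int> P1(new int(1));
    trace.push_back(P1.use_count());         // one reference

    std::shared_ptr<int> P2(P1);
    trace.push_back(P1.use_count());         // two references

    std::shared_ptr<int> P3(P2);
    trace.push_back(P1.use_count());         // three references

    // Observes the old object without adding to its count.
    std::weak_ptr<int> old_object(P1);

    P1.reset(new int(2));                    // corresponds to P1.manage(...)
    trace.push_back(old_object.use_count()); // old object: two left
    trace.push_back(P1.use_count());         // new object: one

    P2 = P1;
    trace.push_back(old_object.use_count()); // old object: one left
    trace.push_back(P1.use_count());         // new object: two

    P3 = P2;
    trace.push_back(old_object.use_count()); // old object: deallocated
    trace.push_back(P1.use_count());         // new object: three

    return trace;
}
```

The returned sequence is 1, 2, 3, then 2 and 1 after the rebinding, 1 and 2 after ``P2=P1'', and finally 0 and 3 after ``P3=P2'', matching the description above step by step.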

Interface outline

The interface of the reference counting smart pointer will, for obvious reasons, share much with the auto pointer. The differences lie in accessing the raw pointer and giving the pointer a new value. While these aspects could use the same interface as does the auto pointer, giving the reference counting pointer an identical interface, this would be very unfortunate since their semantics differ dramatically. Here is a suggested interface.

    template <class T>
    class counting_ptr
    {
    public:
        explicit counting_ptr(T* t = 0);
        counting_ptr(const counting_ptr<T>& t) throw();
        template <class Y>
        counting_ptr(const counting_ptr<Y>& ty) throw();
        ~counting_ptr() throw ();
        counting_ptr<T>& operator=(const counting_ptr<T>& t) throw ();
        template <class Y>
        counting_ptr<T>& operator=(const counting_ptr<Y>& ty) throw();
        T& operator*(void) const throw ();
        T* operator->(void) const throw ();
        T* peek(void) const throw();
        void manage(T* t);
    };

Compared to the auto pointer, the only differences are that some member functions do not have an empty exception specification, and there is no member function corresponding to ``auto_ptr<T>::release()'' (which stops the managing of the pointer.) The member function ``get'' is here named ``peek''. I think this name better describes what is happening. To me the word ``get'' associates with a transfer, and there is no transfer occurring. All we do is to peek inside and see what the internal raw pointer value is. The member function ``reset'' is here named ``manage'', and the reason is a big difference in semantics.

Where to store the reference count

Before we can dive into implementation, we must figure out where to store the reference count. It is obvious that the counter belongs to the object referred to, so it cannot reside in the smart pointer object. A solution that easily springs to mind is to use a struct with a counter, and the type referred to, like this:

    template <class T>
    class counting_pointer
    {
    public:
    private:
        struct representation
        {
            unsigned counter;
            T value;
        };
        representation* ptr;
    };

With this construct, we have the object and counter together. Unfortunately there are two severe drawbacks. We cannot use it for dynamic binding, since the ``value'' component is indeed a value and dynamic binding only works through pointers and references. It also becomes very difficult to write the constructor and ``manage'' member function.

The work around is simple: use a ``T*'' instead. All we need to work this way is to make sure to allocate this representation struct on heap in the constructor and ``manage'' member functions (and of course to deallocate the struct when we're done with it.) There is a performance disadvantage with this, however. Whenever accessing the object referred to, we must follow two pointers, the pointer to the representation and the pointer to the object from the representation.

The best solution I have seen is to decouple the representation from the object and instead allocate an ``unsigned'' and in every counting pointer object keep both a pointer to the counter and to the object referred to. This gives the following data section of our counting pointer class template:

    template <class T>
    class counting_pointer
    {
    public:

even when member templates are not available. please read Scott Meyer's paper on the topic. if we want the ability to assign a ``counting_ptr<T>'' object from a value of type ``counting_ptr<Y>'' if a ``T*'' can be assigned from a ``Y*''. counting_base(const counting_base& cb) throw(). the function body throwing the exception will not be necessary. Accessibility The solution outlined above is so good it almost works. Second. }. The default constructor allows us to choose whether we want a reference counter or not. so we can implement the counter managing code in a separate class. a rather new addition to the language. The idea here is that the this class handles every aspect of the reference counter. counting_base& operator=(const counting_base& cb) throw (). ~counting_base(void) throw(). unsigned* pcount.private: T* ptr. We just need to tell it how to behave. Note that the copy constructor never involves any allocation or deallocation of dynamic memory. If you have a modern compiler. inline counting_base::counting_base( const counting_base& cb ) throw () : pcount(cb. we must think of something. Base implementation It is fairly easy to implement. For compilers that do not support member templates. The problem is that ``counting_ptr<T>'' and ``counting_ptr<Y>'' are two distinct types. If our reference counting pointer is initialized with the 0 pointer. . One step on the way towards a solution is to see that the management of the counter is independent of the type T. int release() throw(). } Initialize a counter with the value of the parameter. A reference counting class may look like: class counting_base { public: counting_base(unsigned count = 0). there is nothing in the copy constructor that can throw exceptions.pcount) { if (pcount) ++(*pcount). such that the assignment and construction from a counting pointer of another type are impossible. however. there is no need to waste time and memory by allocating a reference counter for it. 
we should delete the object we refer to) or 0 if it was just decremented. To begin with. However. The value of the raw pointer member can be accessed through the public ``peek'' member function. extremely few compilers support template friends. operator new will throw ``bad_alloc'' in out-of-memory conditions. private: unsigned* pcount. In fact. For the curious. but there is a two-fold problem with that. and it reports the results to us. but we need a solution for accessing the counter without making it publicly available. } Copying a reference counting object means adding one to the reference counter (since there is now one more object referring to the counter. and makes life easier later on. member templates open up holes in the type system you can only dream of. int reinit(unsigned count=1). For older compilers. The member functions ``release'' and ``reinit'' return 1 if the old counter is discarded (and hence. there is nothing to update. A count of 0 is represented by a 0 pointer. so both the member variables are private and thus inaccessible. you need to define the ``bad_alloc'' class.) When we have 0 pointers. This kind of problem is exactly what ``friend'' declarations are for. it is all we need. inline counting_base::counting_base(unsigned count) : pcount(count ? new unsigned(count) : 0) { if (count && !pcount) throw bad_alloc(). }.
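The shared-counter idea described above (a counter allocated separately, a 0 pointer standing for a count of 0, and copying that only increments) can be tried out in isolation. The following is a minimal sketch, not the article's ``counting_base'': the class name ``counter_sketch'' and the ``count()'' accessor are hypothetical additions for demonstration, and it assumes a newer compiler than the ones discussed here (``nullptr'', ``noexcept'' and ``= delete'' replace ``0'', ``throw()'' and a private declaration).

```cpp
#include <cassert>

// Hypothetical stand-in for the counter handling described in the text.
// Only construction, copying and destruction are sketched.
class counter_sketch {
public:
    // A count of 0 is represented by a null pointer, so nothing is allocated.
    explicit counter_sketch(unsigned count = 0)
        : pcount(count ? new unsigned(count) : nullptr) {}

    // Copying never allocates; it only adds one to a shared counter.
    counter_sketch(const counter_sketch& other) noexcept
        : pcount(other.pcount)
    {
        if (pcount) ++(*pcount);
    }

    // Assignment is left out of this sketch on purpose.
    counter_sketch& operator=(const counter_sketch&) = delete;

    // The last object referring to the counter frees it.
    ~counter_sketch()
    {
        if (pcount && --(*pcount) == 0) delete pcount;
    }

    // Accessor added for demonstration only; not part of the article's class.
    unsigned count() const noexcept { return pcount ? *pcount : 0; }

private:
    unsigned* pcount;
};
```

Creating one object with a count of 1 and copying it inside a nested scope shows the count go to 2 and back to 1 when the copy is destroyed.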

    inline counting_base::~counting_base() throw()
    {
        release();
    }

Destroying a reference counting object means decrementing the reference counter and deallocating it if the count goes to zero (last one out locks the door.) Since this code is needed in the assignment operator, as well as in the public interface for use by the reference counted smart pointer class template, we implement that work in the ``release'' member function.

    inline counting_base& counting_base::operator=(
        const counting_base& cb
    ) throw()
    {
        if (pcount != cb.pcount) {
            release();
            pcount=cb.pcount;
            if (pcount) ++(*pcount);
        }
        return *this;
    }

Assignment of reference counting objects means decrementing the reference count for the left hand side object, and incrementing it for the right hand side object, since there will be one less object referring to the counter from the left hand side object, and one more referring to the counter from the right hand side object.

    inline int counting_base::release() throw ()
    {
        if (pcount && --(*pcount) == 0) {
            delete pcount;
            pcount = 0;
            return 1;
        }
        pcount=0;
        return 0;
    }

If the reference count goes to zero the counter is deallocated, since it is the last one referring to it. A return value of 1 means deallocation took place (hinting to the user of this class that it should deallocate whatever object it refers to.) In both cases the pointer to the counter is set to 0 as a precaution. It may be that ``release'' is called just prior to destruction, and the destructor calls ``release'' to deallocate memory. If the pointer to the counter is not set to zero it means either referring to just deleted memory, or decrementing the reference count twice.

    inline int counting_base::reinit(unsigned count)
    {
        if (pcount && --(*pcount) == 0) {
            *pcount = count;
            return 1;
        }
        pcount = count ? new unsigned(count) : 0;
        if (count && !pcount) throw bad_alloc();
        return 0;
    }

Strictly speaking, ``reinit'' is not needed. Its purpose is to release from the current counter and initialize a new one with a defined value. It is an optimization of memory handling, in that if this object was the last reference the counter is reinitialized instead of deallocated and then allocated again. If it was not the last object referring to the counter, a new counter must of course be allocated.

As an exercise, prove to yourself that this reference counting base class does not have any memory handling errors (i.e. it always deallocates what it allocates, never deallocates the same area twice, never accesses uninitialized or just deallocated memory, and never dereferences the 0 pointer.)

Accessibility again

As nice and convenient as the above helper class is, it really does not solve the accessibility problem. It does make the implementation a bit easier, though. The problem remains, however; class ``counting_ptr<T>'' and ``counting_ptr<Y>'' are different classes and because of this are not allowed to see each other's private sections. The easy way out is to use public inheritance, and say that every counting pointer is-a counting base. That is a solution that is simple, sweet and dead wrong. There is no is-a relationship here, but an is-implemented-in-terms-of relationship. Such relationships are implemented through private member data or private inheritance. [...] almost. Instead of having the member functions of the ``counting_base'' class public, they can be declared protected. As such they do not become part of the public interface, and the is-a relationship is mostly imaginary. This is bordering on abuse, but it works fine.

Implementation of a reference counting pointer

Finally we can get to work and write the reference counting pointer, and since the helper class does most of the dirty work, the implementation is not too convoluted.

    template <class T>
    inline counting_ptr<T>::counting_ptr(T* t)
      : counting_base(t ? 1 : 0),
        pt(t)
    {
    }

Initialize the counter to 1 if we have a pointer value, and 0 otherwise.

    template <class T>
    inline counting_ptr<T>::counting_ptr(
        const counting_ptr<T>& t
    ) throw()
      : counting_base(t),
        pt(t.pt)
    {
    }

    template <class T> template <class Y>
    inline counting_ptr<T>::counting_ptr(
        const counting_ptr<Y>& ty
    ) throw()
      : counting_base(ty),
        pt(ty.peek())
    {
    }

Note how the latter makes use of the knowledge that a ``counting_ptr<Y>'' is-a ``counting_base''.

    template <class T>
    inline counting_ptr<T>::~counting_ptr() throw()
    {
        if (release()) delete pt;
    }

The destructor is important to understand. If you review the functionality of the ``release'' member function of ``counting_base'', you see that it deallocates the counter if, and only if, the count reaches 0 and then returns 1, otherwise it returns 0. This means that in the destructor we will deallocate whatever ``pt'' points to if, and only if, the reference count for it reaches zero and is deallocated.

    template <class T>
    inline counting_ptr<T>& counting_ptr<T>::operator=(
        const counting_ptr<T>& t
    ) throw()
    {
        if (pt != t.pt) {
            if (release()) delete pt;
            counting_base::operator=(t);
            pt = t.pt;
        }
        return *this;
    }

    template <class T> template <class Y>
    inline counting_ptr<T>& counting_ptr<T>::operator=(
        const counting_ptr<Y>& ty
    ) throw()
    {
        if (pt != ty.peek()) {
            if (release()) delete pt;
            counting_base::operator=(ty);
            pt=ty.peek();
        }
        return *this;
    }

This one is only slightly tricky. What happens is that if the left hand side and right hand side counting pointers already refer to the same object, nothing happens. If they refer to different objects, on the other hand, the object referred to by the left hand side reference counting pointer is deleted if, and only if, ``release'' returns 1, which it does if, and only if, the reference count goes to zero and the counter is deallocated. Since ``release'' also resets the counter pointer to zero, the assignment operator for ``counting_base'' does not decrement the counter again. Instead it is simply assigned. After this the raw pointer value can safely be copied.

The access operators are all trivial and need no further explanation:

    T* operator->() const
    T& operator*() const
    T* peek() const

Now only the ``manage'' member function remains:

    template <class T>
    inline void counting_ptr<T>::manage(T* t)
    {
        if (!t && release() || t && t != pt && reinit())
            delete pt;
        pt = t;
    }

If the parameter is the zero pointer, we want the reference count to also be the zero pointer, and thus ``release'' is called. Otherwise we want the counter value to be 1, so ``reinit'' is called instead. If either of those tells that the old counter is discarded, the object referred to is deallocated.

Please spend some time convincing yourself that the above works.

Efficiency

There is no question about it, there is a cost involved in using reference counting pointers. Every time an object is allocated, a counter is also allocated, and vice-versa for deallocation. Depending on how efficient your compiler's memory manager is for small objects this cost may be negligible, or it may be severe. Every time a counting pointer is assigned, constructed, destroyed and copied, counter manipulation is done, and that costs a few CPU cycles.

Exercises

• What happens if ``~T'' throws an exception?
• What happens if allocation of a new counter fails?
• The ``counting_base'' implementation and use is suboptimal. For example at destruction, the ``release'' member function is called twice, and the first time the counter pointer is set to zero to prevent decrementing the counter twice. Improve the implementation to never (implicitly) set the pointer to zero and yet always be safe.
• Does the public derivation from ``counting_base'' open up any holes in the type safety, despite that all member functions are protected?

Coming up

If I missed something, or you want something clarified further or disagree with me, please drop me a line and I'll address your ideas in future articles. Next month is devoted to Run Time Type Identification (RTTI,) one of the new functionalities in C++.

Introduction

Lucky number thirteen will conclude the long C++ series. Last topic is Run Time Type Identification, or RTTI for short. It's a new addition to C++. I do not have access to any compiler under OS/2 that supports RTTI, so the example programs are written for, and tested with, the egcs compiler (a fast-moving bleeding edge gcc,) under Linux.

Towards the end of the article there is also a discussion about various aspects of efficiency in C++, especially in comparison with C.

RTTI is a way of finding out about the type of objects at run-time. There are two distinct ways this can be done. One is finding a type identification for an object, giving a unique identifier for each unique type. Another is a clever cast, which allows casting of pointers and references only if the type matches.

Type Safe (down)Casting

Many class libraries, most notably GUI libraries like IBM's Open Class Library, suffer from the problem that callback functions will be called with pointers to objects of a base class, but you know the objects really are of some class inheriting from it. For example, consider a button push event, from which you can get a pointer to the button itself. Suppose a hierarchy like this:

    class Event {};
    class Window {};
    class Control : public Window {};
    class Button : public Control {};
    class PushButton : public Button {
    public:
        class PushedEvent : public Event {
        public:
            PushButton* button(void) const;
        };
        // "register" is a reserved word in C++, hence the longer name.
        void registerCallback(void (*pushed)(const PushedEvent&));
    };
    class TextPushButton : public PushButton {
    public:
        void setText(const char* txt);
    };

The idea is that whoever uses a push button can register a function that will be called whenever the pushbutton is pushed. Let us put it to use:

    void pushed(const PushButton::PushedEvent& ev);

    void createButton(const char* txt)
    {
        TextPushButton* pb=new TextPushButton();
        pb->setText(txt);
        pb->registerCallback(pushed);
    }

    void pushed(const PushButton::PushedEvent& ev)
    {
        TextPushButton* pb=(TextPushButton*)(ev.button());
        //                 ^^^^^^^^^^^^^^
        pb->setText("***");
    }

This does not look too dangerous, does it? The callback is registered when the button is created, so we know it is the right kind, and the down cast marked with ``^^^'' is safe. Or is it? Right now it is, but as the program grows, it might not be as easy to see what happens, and somewhere the wrong callback is registered for some button. The result is likely an uncontrolled crash, or just generally weird behaviour.

The solution is to do the cast only if it is legal. Here is a new version of the ``pushed'' function using the RTTI ``dynamic_cast''.

    void pushed(const PushButton::PushedEvent& ev)
    {
        TextPushButton* pb=
            dynamic_cast<TextPushButton*>(ev.button());
        assert(pb);
        pb->setText("***");
    }

``dynamic_cast<T>(p)'' works like this: If ``p'' is-a ``T'' the result is a ``T'' referring to the object, otherwise a zero pointer is returned (or if ``T'' is a reference, the exception ``bad_cast'' is thrown.) That way we can check and take some action if the function is called with a pointer or reference to the wrong type.

There is one catch with ``dynamic_cast'', however. It works only if there is at least one virtual member function in the type casted from, otherwise you get compilation errors. If you think about it, that is not much of a penalty. At least the destructor should be virtual in such hierarchies anyway.
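To see both forms of ``dynamic_cast'' at work, here is a small self-contained sketch. It does not use the GUI hierarchy above; the types ``Button'', ``TextButton'' and ``IconButton'' and the two helper functions are hypothetical, made up for illustration. Note the virtual destructor in the base class, which is what makes the cast compile at all.

```cpp
#include <typeinfo>   // std::bad_cast

// Hypothetical miniature hierarchy. The virtual destructor makes the
// types polymorphic, which dynamic_cast requires.
struct Button {
    virtual ~Button() {}
};

struct TextButton : Button {
    const char* text = "";
    void setText(const char* t) { text = t; }
};

struct IconButton : Button {};

// Pointer form: the result is a null pointer if b is not a TextButton.
bool try_set_text(Button* b, const char* t)
{
    if (TextButton* tb = dynamic_cast<TextButton*>(b)) {
        tb->setText(t);
        return true;
    }
    return false;
}

// Reference form: a mismatch is reported by throwing std::bad_cast.
bool is_text_button(Button& b)
{
    try {
        TextButton& tb = dynamic_cast<TextButton&>(b);
        (void)tb;   // only the success of the cast matters here
        return true;
    } catch (const std::bad_cast&) {
        return false;
    }
}
```

Calling ``try_set_text'' with a pointer to an ``IconButton'' leaves the object untouched and returns false, where the C style cast in the first version of ``pushed'' would have gone ahead and called ``setText'' on the wrong kind of object.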

It must be stressed that this use of ``dynamic_cast'' can always be avoided, by turning the problem ``inside-out'' (I cannot prove it, but I have never seen a counter proof either, and many have tried.) The need arises from a poor design (the solution is to use dynamic binding instead.) Sometimes we have to live with poor designs, however, and then this can make our life a lot easier. It can also be used during a transition between different libraries. Here is an example mirroring a problem from my previous job:

    void print(ostream& os)
    {
        if (ostrstream* pos=dynamic_cast<ostrstream*>(&os))
        {
            *pos << ends;
            cout << pos->str();
            pos->freeze(0);
        }
        else if(ostringstream*pos=
                dynamic_cast<ostringstream*>(&os))
        {
            cout << pos->str();
        }
        else
            throw bad_cast();
    }

This transition is from the ``old'' ``ostrstreams'', storing strings as plain ``char*'', to the standard ``ostringstream'', which is based on standard strings. The difference lies in handling the end. An ``ostrstream'' holds a raw buffer of memory, and its end must be marked by appending a zero termination with the ``ends'' modifier, while standard ``ostringstream'' returns a string object which itself knows how long it is and a zero termination must not be added to it (otherwise an extra zero character will be printed.) With RTTI we can live with both worlds at the same time. In which way is this worse than the previous version that did not have the problem, where everything was always ``ostrstream''? Well, in no way, and we have added an error check that should have been there earlier but was not because the language did not support it (what if the function is called with an ``fstream'' object?)

However, never design new code like this! Use this construct only during a transition phase. Use RTTI ``dynamic_cast'' when being forced to adapt to poorly written libraries, as an error check, and during transitions between libraries. For new designs, use dynamic binding instead.

Identifying types

Much more interesting is using explicit information about the type of an object. This solves a problem that cannot reliably be worked around with clever designs. There is a standard class called ``type_info'', whose purpose is to carry information about types. It is defined in the header named <typeinfo> (note, no ``.h'') and looks as follows:

    class type_info {
    public:
        virtual ~type_info();
        bool operator==(const type_info&) const;
        bool operator!=(const type_info&) const;
        bool before(const type_info&) const;
        const char* name() const;
    private:
        type_info(const type_info&);
        type_info& operator=(const type_info&);
    };

The ``name'' member function gives you access to a printable form of the name of the type identified by the ``type_info'' object. The standard requires very little of the ``name'' member function, however. Most notably it is not standardised what the printable form looks like for any given type, even the built-in ones. In fact, it is not even required that the string is unique for each type. This unfortunately means that you cannot write portable (across compilers) applications that rely on the names in any way.

The ``before'' member function gives you a sort order for types. Do not try to get a meaning from the sort order, there may not be one. It does not need to have any meaning, other than that it makes it possible to keep sorted collections of type information objects.

You get a ``type_info'' object for a value through the built-in operator ``typeid(x)''. Note that it is the runtime type of ``x'' that is accessed, not the static type. This requires a little bit of care, however. Say you have this situation:

    // The virtual destructor makes typeid look at the run-time type.
    class parent { public: virtual ~parent() {} };

    class child : public parent {};

    parent* p=new child();
    cout << typeid(p).name() << endl;
    cout << typeid(*p).name() << endl;

If we have a compiler whose name of a type as given from ``type_info'' is indeed the name of the type as we write it in the program, what do you think the above snippet prints? The answer is ``parent*'' followed by ``child'' (provided ``parent'' has at least one virtual member function, so that ``typeid'' looks at the run-time type.) This came as a surprise to me, but it makes sense. The run-time type of ``p'' is ``parent*''. It is what it points to that may differ, and that is exactly what is mirrored by the output.

Is this useful? Suppose you need to store and retrieve objects from a typeless media, such as a file, or a socket. Storing them is easy, but when reading, you need to know what type of object to create. If you are designing a complete system from scratch, you can add some type identifier to all classes, but this is error prone, since it is easy to forget changing it when creating a new class inheriting from another one, and it does not work at all with existing classes. Chances are you will be extending existing [...], or decide to use third party class libraries. Here is an outline for how to do this (in a non-portable, and slightly restricted way. Generality and portability requires more work.)

First we design a class for I/O objects, let us call it ``persistent''. It may look like this:

    class persistent {
    protected:
        virtual void store(ostream& os) const = 0;
        virtual void retrieve(istream& is) = 0;
        // Lets the store, introduced below, call the two functions above.
        friend class persistent_store;
    };

The intention is that all classes we want to store must inherit from ``persistent'' and implement the ``store'' and ``retrieve'' member functions. This is limiting, but not as limiting as it may seem. We can still make use of third party libraries, we just need to create persistent versions of the classes.

Next we need something that does the work of calling the member functions and creating the objects when read. Let us call this class ``persistent_store''. It may be defined as follows:

    class persistent_store {
    public:
        persistent_store(iostream& stream);
        template <class T>
        void register_class();
        void store_object(persistent*);
        persistent* retrieve_object();
    };

The interface should be reasonably obvious. Only classes registered with the store may be used. Obviously only classes inherited from ``persistent'' may be stored and retrieved. There will, however, be other requirements put on the type ``T'', but what those will be will depend on our implementation. I have chosen the only additional constraints to be that they have a default constructor and may be created on the heap with ``operator new''. The use of template functions that do not have any parameters of the template type is a recent addition to C++ that few compilers support. The syntax is a bit ugly, but it is not too bad.

Here is how to use the persistent store:

    class persistent_X : public X, public persistent
    {
    protected:
        virtual void store(ostream& os) const {
            os << *this;
        };
        virtual void retrieve(istream& is) {
            is >> *this;
        };
    };

    int main(int argc, char* argv[])
    {
        fstream file(argv[1]);
        persistent_store storage(file);
        storage.template register_class<persistent_X>();//*
        persistent* pp=storage.retrieve_object();
        persistent_X* px=dynamic_cast<persistent_X*>(pp);

        storage.store_object(px);
    }

The line marked ``//*'' shows the syntax for calling template member functions without parameters of the template type. Note the ``template'' keyword. It is ugly, but I guess one gets used to it.

Now there are a number of problems that must be solved:

• How to prevent classes not inheriting from ``persistent'' from being registered.
• How to allocate an object of the correct type on heap, given a string representation of its type.
• How to store the type information on the stream such that it can be interpreted unambiguously.

The first is extremely easy to check for. Here is one way of doing it:

    template <class T>
    void register_class()
    {
        persistent* p = (T*)0; // type check.
        ...
    };

Any attempt to call ``register_class'' with a type not publicly inheriting from ``persistent'' will result in a compilation error.

The second problem, deciding the type from a name, is easy to understand conceptually, but requires a bit of trickery to implement. The solution lies in using a map from strings to a creation function. Here the map type from the C++ standard library is used. A map is a data structure acting like an array, but allowing any kind of indexing type. In this case the indexing type will be a string. Every string in the map corresponds to a function accepting no parameters and returning a ``persistent*''. When reading, the creator function is looked up and called. When storing, the string is checked for in the map to make sure no unregistered types are stored. In other words, the implementation of ``register_class'' may look like this:

    template <class T>
    void register_class()
    {
        persistent* p=(T*)0;
        persistent* (*creator_func)(void) = ???;
        creator_map.insert(make_pair(typeid(T).name(),
                                     creator_func));
    };

That is the easy part. The tricky part is to find out what ``???'' is. Perhaps it comes as no surprise that the solution is yet another template, this time a function template.

    template <class T>
    persistent* persistent_object_creator(void)
    {
        return new T;
    }

This function template implicitly carries out the type test for us (since ``T*'' can only be returned as ``persistent*'' if ``T'' publicly inherits from ``persistent''.) So ``???'' can be replaced with ``persistent_object_creator<T>'' and the check line can be removed.

Now for how to store and retrieve the type name. My suggestion is simple; store the length of the name followed by the name itself. That way, when reading, the length can be read, a character buffer allocated for the correct size and exactly that many characters read. Not too bad. Here is how it is all implemented:

    void store_object(persistent* p)
    {
        const char* name=typeid(*p).name();
        maptype::iterator iter=creator_map.find(name);
        if (iter == creator_map.end())
            throw "unregistered type";
        medium << strlen(name) << ' ' << name;
        p->store(medium);
    };

    persistent* retrieve_object(void)
    {
        size_t len;
        medium >> len;
        medium.get(); // read past blank.
        char* name=new char[len+1];
        medium.read(name,len);
        name[len]=0; // zero terminate before the lookup.

        maptype::iterator iter=creator_map.find(name);
        delete[] name; // no longer needed
        if (iter == creator_map.end())
            throw "unregistered type";
        persistent* p=(iter->second)();
        p->retrieve(medium);
        return p;
    };

As can be guessed, the line ``if (iter == creator_map.end())'' checks if the lookup was successful. If true the string was not represented in the map, and thus the type was not registered.

That is a mini persistency library for you. Enjoy.

C++ and Efficiency

I have once in a while heard things like ``C++ programs are slow and large compared to a C program with the same functionality.'' In a way that statement is often true. Does it need to be that way? The answer is no, it does not. As I see it, there are mainly two reasons for C++ bloat. One is a cultural problem, one is technical.

Culture related inefficiency

Let us first look at the cultural problem, because that is the one that is hardest to solve, and contributes most to the bloat. Throughout this course, I have been touting generality and extensibility over and over and over. Generality and extensibility have a price, however. Are you prepared to pay the price? It depends, right? If I can make use of the generality and extensibility at some other time, development-time efficiency has been gained, possibly at the cost of program size and run-time efficiency. After all, not all problems need general and extensible solutions. If you have a lean C design, the same design can be used with C++, but use new and delete instead of malloc/free, use inline functions instead of macros and add constructors to your structs. If you do, you are still better off than in plain C, because of the added type safety, but the program will not be slower and larger.

Another cultural problem leading to inefficient C++ programs is the hunt for efficiency! Yes, it is true. Here are some examples:

Virtual functions

I have heard people say things like ``As a performance optimization, I've removed all virtual functions and replaced the virtual function calls with switch statements and ordinary member function calls.'' That is a major performance killer. If you need the functionality that a virtual function call gives you, there is just no way to do it faster than through a virtual function call. The switch/call construct is guaranteed to require at least as many instructions (probably more) and is an instruction pipeline killer for the CPU, something that the virtual function call interestingly is not.

Inline functions

While well chosen inline functions may speed up your program, ill chosen ones will lead to bloated executables, and are likely to reduce the number of cache hits since active code areas will increase in size. Remember that an inline function is actually not a function, strictly speaking. If inlined, the instructions for the function will be inserted at the place of the call. If the function is large (more than a very few instructions) the size of your program will grow considerably if the function is called often. More interesting is what happens if the compiler for one reason or another fails to inline it. It will then have to make an out-of-line function for it, which will be called like normal functions. There is one difference however (getting into the area of technical problems here.) Where will the out-of-line function reside? With one exception, all compilers I know of will add one copy, as a static function, for every translation unit (read .cpp file) where the function inlining failed. I will return to this one shortly. In a large program, this might mean *many* copies, and since every copy is in its own memory area, you will not get the boost from high performance hits due to good locality either.

Fear of template code bloat

Since it is ``a well known fact'' that templates lead to code bloat, many programmers avoid them. This means rolling their own algorithms and collections instead of using standard ones. Unless, however, the programmer is a top-notch algorithm and data-structure guru, the result is just about guaranteed to be slower, and quite possibly larger, than those in the standard library. Here is one very good argument for why: How many programmers have a working knowledge of the latest and greatest of algorithms and data structures? The contents of the standard library are state of the art. Have a look at the SGI STL, and read the documentation and code comments, and you will notice that many of the data structures and algorithms used were not even known a year ago.
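For readers who have not seen the ``optimization'' mentioned under ``Virtual functions'' above, here is a sketch of the two designs side by side. All names (``Shape'', ``TaggedShape'', ``total_corners'' and so on) are hypothetical and chosen only for illustration. The sketch shows the difference in structure, and makes no measurement of speed.

```cpp
// Variant 1: dynamic binding. The call is dispatched through the
// virtual function mechanism; new classes need no changes elsewhere.
struct Shape {
    virtual ~Shape() {}
    virtual int corners() const = 0;
};
struct Triangle : Shape { int corners() const override { return 3; } };
struct Square   : Shape { int corners() const override { return 4; } };

// Works for any present or future class derived from Shape.
int total_corners(const Shape* const* shapes, int n)
{
    int sum = 0;
    for (int i = 0; i < n; ++i) sum += shapes[i]->corners();
    return sum;
}

// Variant 2: the hand rolled replacement. A type tag, a switch and
// ordinary member function calls. Every new kind of shape means
// editing the enum and every switch that inspects it.
struct TaggedShape {
    enum Kind { triangle, square };
    Kind kind;
    explicit TaggedShape(Kind k) : kind(k) {}
    int triangleCorners() const { return 3; }
    int squareCorners() const { return 4; }
    int corners() const
    {
        switch (kind) {
        case triangle: return triangleCorners();
        case square:   return squareCorners();
        }
        return 0; // not reached for valid tags
    }
};
```

Both variants give the same answers. The second one has to test the tag at run-time before it can make its ``ordinary'' call, which is the extra work the quoted programmer hoped to avoid.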

Technical problems

The largest technical reason for slow C++ programs is compilers with very poor code optimizers. Most C++ compilers use exactly the same optimization techniques as C and Fortran compilers do, despite the fact that the way a C++ program works and accesses memory etc. differs a lot. I know of one C++ compiler that has a code optimizer made especially for the requirements of C++. It has been shown to equal and even beat Fortran for performance, and that is when using good encapsulation and templates. When feeding it a program written like a C program, performance will be poorer, though. If you are curious, look up KAI C++ (Note, I am not associated with them in any way. I have used their compiler, that is all.) Unfortunately it is not available for OS/2, and I doubt it ever will be (bug them about it, and it might happen.)

I mentioned earlier the problem of outlined inline functions. A good compiler will not leave you several copies as static functions, but unfortunately many do.

A worse technical problem is exceptions. Some compiler vendors, and many proponents of exceptions, say that exception handling can be implemented such that there is no performance loss at all, unless an exception is actually thrown. It may seem like it, but it is not true. No matter how good the implementation is, exceptions add a number of program flows that would otherwise not be there, and that makes life harder for the code optimiser.

Recommended reading

I want to finish by recommending some books on C++.

``Effective C++, 2nd ed'', Scott Meyers. Contains 50 tips, with examples and discussions, for how to improve your programs. A must read for any C++ programmer. Meyers' writing style is easy to follow and entertaining too.

``More effective C++'', Scott Meyers. Another 35 tips, this time introducing cleverer optimisation techniques like copy-on-write.

``Advanced C++, Programming Styles and Idioms'', James O. Coplien. Without doubt this book from 1992 is getting old, but it contains many useful techniques. It is a mind opener in many ways.

``Scientific and Engineering C++'', John J. Barton and Lee R. Nackman. Often simply referred to as B&N. This is a modern ``Advanced C++''. What they have done is to break almost every rule of thumb, and as a result they have extremely extensible, slick, type-safe and fast designs. Beware, though, this book is not for beginners.

``The Design and Evolution of C++'', Bjarne Stroustrup. This book answers all the why's. Why does C++ have multiple inheritance? Why does it provide assignment, destruction, copy-constructor and default constructor for you? Why is RTTI there? Why no garbage collection? Why not dynamic types like in Smalltalk? This is a pleasant book to read.

``Ruminations on C++'', Andrew Koenig and Barbara Moo. Koenig is the only author I know who can explain a problem to solve, which one immediately realizes will require a large program, present an easy to understand one page solution just to say it was unnecessarily clumsy and reduce it to half a page without sacrificing extensibility and generality. Entertaining and enlightening.

Also, do follow comp.lang.c++.moderated.
