Published by: John Jairo Cabal on Jan 02, 2011
Copyright: Attribution Non-commercial








  • Int
  • Classes
  • A class
  • Destructor
  • An improved stack
  • Exercise
  • Recap
  • Next
  • References
  • What's a class?
  • The Orthodox Canonical Form
  • Const Correctness
  • Exercises
  • Coming up
  • Why templates
  • Function templates
  • Templates and exceptions
  • Class templates
  • Advanced Templates
  • Introduction
  • Exploring I/O of fundamental types
  • I/O with our own types
  • Formatting
  • An easier way
  • Standards update
  • Other uses
  • Conclusion
  • Short recap of inheritance
  • A deficiency in the model
  • Pure virtual (abstract base classes)
  • Addressing pure virtuals
  • Unselfish protection
  • A toy program
  • Files
  • File Streams
  • Binary streaming
  • Array on file
  • Seeking
  • A stream array, for really huge amounts of data
  • In memory data formatting
  • The data representation problem
  • Several arrays in a file
  • Temporary file array
  • Code reuse
  • What can go wrong?
  • Iterators
  • The problem to solve
  • Implementation
  • Efficiency
  • Type Safe (down)Casting
  • Identifying types
  • C++ and Efficiency
  • Recommended reading



Last month we saw, among other things, how we can give a struct well defined values by using constructors, and how C++ exceptions aid in error handling. This month we'll look at classes and make a more careful study of object lifetime, especially in the light of exceptions. The stack example from last month will be improved a fair bit too.

A class
The class is the C++ construct for encapsulation. Encapsulation means publishing an interface through which you make things happen, and hiding the implementation and data necessary to do the job. A class is used to hide data, and publish operations on the data, at the same time. Let's look at the "Range" example from last month, but this time make it a class. The only operation that we allowed on the range last month was that of construction, and we left the data visible for anyone to use or abuse. What operations do we want to allow for a Range class? I decide that 4 operations are desirable:

  • Construction (same as last month.)
  • find lower bound.
  • find upper bound.
  • ask if a value is within the range.

The second thing to ask when wishing for a function (the first being what it's supposed to do) is in what ways things can go wrong when calling it, and what to do when that happens. For the three queries, I don't see how anything can go wrong, so it's easy. We promise that these functions will not throw C++ exceptions by writing an empty exception specifier. I'll explain this class by simply writing the public interface of it:

    struct BoundsError {};
    class Range
    {
    public:
        Range(int upper_bound = 0, int lower_bound = 0) throw (BoundsError);
        // Precondition: upper_bound >= lower_bound
        // Postconditions:
        //   lower == lower_bound
        //   upper == upper_bound
        int lowerBound() throw ();
        int upperBound() throw ();
        int includes(int aValue) throw ();
    private:
        // implementation details.
    };

This means that a class named "Range" is declared to have a constructor, behaving exactly like the constructor for the "Range" struct from last month, and three member functions (also often called methods,) called "lowerBound", "upperBound" and "includes". The keyword "public," on the fourth line from the top, tells that the constructor and the three member functions are reachable by anyone using instances of the Range class.
The keyword "private" on the 3rd line from the bottom, says that whatever comes after is a secret to anyone but the "Range" class itself. We'll soon see more of that, but first an example (ignoring error handling) of how to use the "Range" class:

    int main(void)
    {
        Range r(5);
        cout << "r is a range from " << r.lowerBound() << " to "
             << r.upperBound() << endl;
        int i;
        for (;;)
        {
            cout << "Enter a value (0 to stop) :";
            cin >> i;
            if (i == 0) break;
            cout << endl << i << " is " << "with"
                 << (r.includes(i) ? "in" : "out") << " the range" << endl;
        }
        return 0;
    }

A test drive might look like this:

    [d:\cppintro\lesson2]rexample.exe
    r is a range from 0 to 5
    Enter a value (0 to stop) :5
    5 is within the range
    Enter a value (0 to stop) :7
    7 is without the range
    Enter a value (0 to stop) :3
    3 is within the range
    Enter a value (0 to stop) :2
    2 is within the range
    Enter a value (0 to stop) :1
    1 is within the range
    Enter a value (0 to stop) :0

Does this seem understandable? The member functions "lowerBound", "upperBound" and "includes" are, and behave just like, functions that in some way are tied to instances of the class Range. You refer to them just like you do member variables in a struct, but since they're functions, you call them (by using what in C++ lingo is named the function call operator, "()".) Now to look at the magic making this happen by filling in the private part, and writing the implementation:

    struct BoundsError {};
    class Range
    {
    public:
        Range(int upper_bound = 0, int lower_bound = 0) throw (BoundsError);
        // Precondition: upper_bound >= lower_bound
        // Postconditions:
        //   lower == lower_bound
        //   upper == upper_bound
        int lowerBound() throw ();
        int upperBound() throw ();
        int includes(int aValue) throw ();
    private:
        int lower;
        int upper;
    };

    Range::Range(int upper_bound, int lower_bound) throw (BoundsError)
        : lower(lower_bound), /***/
          upper(upper_bound)  /***/
    {
        // Preconditions.
        if (upper_bound < lower_bound) throw BoundsError();

        // Postconditions.
        if (lower != lower_bound) throw BoundsError();
        if (upper != upper_bound) throw BoundsError();
    }

    int Range::lowerBound() throw ()
    {
        return lower; /***/
    }

    int Range::upperBound() throw ()
    {
        return upper; /***/
    }

    int Range::includes(int aValue) throw ()
    {
        return aValue >= lower && aValue <= upper; /***/
    }

First, you see that the constructor is identical to that of the struct from last month. This is no coincidence. It does the same thing and constructors are constructors. You also see that "lowerBound", "upperBound" and "includes", look just like normal functions, except for the "Range::" thing. It's the "Range::" that ties the function to the class called Range, just like it is for the constructor.

The lines marked /***/ are a bit special. They make use of the member variables "lower" and "upper." How does this work? To begin with, the member functions are tied to instances of the class. You cannot call any of these member functions without having an instance to call them on, and the member functions use the member variables of that instance. Say for example we use two Range instances, like this:

    Range r1(5,2);
    Range r2(20,10);

Then r1.lowerBound() is 2, r1.upperBound() is 5, r2.lowerBound() is 10 and r2.upperBound() is 20.

So how come the member functions are allowed to use the member data, when it's declared private? Private, in C++, means secret for anyone except whatever belongs to the class itself. In this case, it means it's secret to anyone using the class, but the member functions belong to the class, so they can use it.

So, where is the advantage of doing this, compared to the struct from last month? Hiding data is always a good thing. For example, if we, for whatever reason, find out that it's cleverer to represent ranges as the lower bound, plus the number of valid values between the lower bound and upper bound, we can do this, without anyone knowing or suffering from it.
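The two-instance example and the access rule described above can be tried out with a small sketch in present-day C++. It is not the article's code: the dynamic exception specifications used in the article ("throw (BoundsError)", "throw ()") were removed from the language in C++17, so this sketch assumes "noexcept" in their place, and the "const" qualifiers, the "bool" return type of "includes", and the helper functions "rangeDemo" and "rejectsInvertedBounds" are illustrative choices of mine.

```cpp
#include <cassert>

struct BoundsError {};

class Range
{
public:
    // Same parameter order as in the article: upper bound first.
    Range(int upper_bound = 0, int lower_bound = 0)
        : lower(lower_bound), upper(upper_bound)
    {
        // Precondition: upper_bound >= lower_bound
        if (upper_bound < lower_bound) throw BoundsError();
    }
    int lowerBound() const noexcept { return lower; }
    int upperBound() const noexcept { return upper; }
    bool includes(int aValue) const noexcept
    {
        return aValue >= lower && aValue <= upper;
    }
private:
    int lower;
    int upper;
};

// Hypothetical helper: each call works on the data of the
// instance it is invoked on.
void rangeDemo()
{
    Range r1(5, 2);
    Range r2(20, 10);
    assert(r1.lowerBound() == 2 && r1.upperBound() == 5);
    assert(r2.lowerBound() == 10 && r2.upperBound() == 20);
    assert(r1.includes(3) && !r2.includes(3));
    // r1.lower = 25;  // would not compile: 'lower' is private
}

// Hypothetical helper: returns true if the constructor rejects
// an upper bound that is below the lower bound.
bool rejectsInvertedBounds()
{
    try {
        Range bad(2, 5);
    } catch (const BoundsError&) {
        return true;
    }
    return false;
}
```

Calling "rangeDemo()" exercises both instances. The commented-out assignment is the kind of abuse that was possible with last month's struct and that the compiler now rejects.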
All we do is to change the private section of the class to:

    private:
        int lower;
        int nrOfValues;

And the implementation of the constructor to:

    Range::Range(int upper_bound, int lower_bound) throw (BoundsError)
        : lower(lower_bound), /***/
          nrOfValues(upper_bound-lower_bound) /***/
    ...

And finally the implementations of "upperBound" and "includes" to:

    int Range::upperBound() throw ()
    {
        return lower+nrOfValues;
    }

    int Range::includes(int aValue) throw ()
    {
        return aValue >= lower && aValue <= lower+nrOfValues;

    }

We also have another, and usually more important, benefit: a promise of integrity. Already with the struct, there was a promise that the member variable "upper" would have a value greater than or equal to that of the member variable "lower". How much was that promise worth with the struct? This much:

    Range r(5, 2);
    r.lower = 25; // Oops! Now r.lower > r.upper!!!

Try this with the class. It won't work. The only ones allowed to make changes to the member variables are functions belonging to the class, and those we can control.

Destructor
Just as you can control construction of an object by writing constructors, you can control destruction by writing a destructor. A destructor is executed when an instance of an object dies, either by going out of scope, or when removed from the heap with the delete operator. A destructor has the same name as the class, but prepended with the ~ character, and it never accepts any parameters. We can use this to write a simple trace class, that helps us find out the life time of objects.

    #include <iostream.h>

    class Tracer
    {
    public:
        Tracer(const char* tracestring = "too lazy, eh?");
        ~Tracer(); // destructor
    private:
        const char* string;
    };

    Tracer::Tracer(const char* tracestring)
        : string(tracestring)
    {
        cout << "+ " << string << endl;
    }

    Tracer::~Tracer()
    {
        cout << "- " << string << endl;
    }

What this simple class does is to write its own parameter string, prepended with a "+" character, when constructed, and the same string, prepended by a "-" character, when destroyed. Let's toy with it!

    int main(void)
    {
        Tracer t1("t1");
        Tracer t2("t2");
        Tracer t3;
        for (unsigned u = 0; u < 3; ++u)
        {
            Tracer inLoop("inLoop");
        }
        Tracer* tp = 0;
        {
            Tracer t1("Local t1");
            Tracer* t2 = new Tracer("leaky");
            tp = new Tracer("on heap");
        }
        delete tp;
        return 0;
    }

When run, I get this behaviour (and so should you, unless you have a buggy compiler):

    [d:\cppintro\lesson2]tracer.exe
    + t1
    + t2
    + too lazy, eh?
    + inLoop
    - inLoop
    + inLoop
    - inLoop
    + inLoop
    - inLoop
    + Local t1
    + leaky
    + on heap
    - Local t1
    - on heap
    - too lazy, eh?
    - t2
    - t1

What conclusions can be drawn from this? With one exception, the object on heap, objects are destroyed in the reversed order of creation (have a careful look, it's true, and it's always true.) We also see that the object instantiated with the string "leaky" is never destroyed. What happens with classes containing classes then? Must be tried, right?

    class SuperTracer
    {
    public:
        SuperTracer(const char* tracestring);
        ~SuperTracer();
    private:
        Tracer t;
    };

    SuperTracer::SuperTracer(const char* tracestring)
        : t(tracestring)
    {
        cout << "SuperTracer(" << tracestring << ")" << endl;
    }

    SuperTracer::~SuperTracer()
    {
        cout << "~SuperTracer" << endl;
    }

    int main(void)
    {
        SuperTracer t1("t1");
        SuperTracer t2("t2");
        return 0;
    }

What's your guess?

    [d:\cppintro\lesson2]stracer.exe
    + t1
    SuperTracer(t1)
    + t2
    SuperTracer(t2)
    ~SuperTracer
    - t2
    ~SuperTracer
    - t1

This means that the contained object ("Tracer") within "SuperTracer" is constructed before the "SuperTracer" object itself is. This is perhaps not very surprising, looking at how the constructor is written, with a call to the "Tracer"

class constructor in the initialiser list. Perhaps a bit surprising is the fact that the "SuperTracer" object's destructor is called before that of the contained "Tracer", but there is a good reason for this. Superficially, the reason might appear to be that of symmetry, destruction always in the reversed order of construction, but it's a bit deeper than that. It's not unlikely that the member data is useful in some way to the destructor, and what if the member data is destroyed when the destructor starts running? At best a destructor would then be totally worthless, but more likely, we'd have serious problems properly destroying our no longer needed objects.

So, the curious wonders, what about C++ exceptions? Now here we get into an interesting subject indeed! Let's look at two alternatives, one where the constructor of "SuperTracer" throws, and one where the destructor throws. We'll control this by a second parameter, zero for throwing in the constructor, and non-zero for throwing in the destructor. Here's the new "SuperTracer" along with an interesting "main" function.

    class SuperTracer
    {
    public:
        SuperTracer(int i, const char* tracestring) throw (const char*);
        ~SuperTracer() throw (const char*);
    private:
        Tracer t;
        int destructorThrow;
    };

    SuperTracer::SuperTracer(int i, const char* tracestring) throw (const char*)
        : t(tracestring),
          destructorThrow(i)
    {
        cout << "SuperTracer(" << tracestring << ")" << endl;
        if (!destructorThrow)
            throw (const char*)"SuperTracer::SuperTracer";
    }

    SuperTracer::~SuperTracer() throw (const char*)
    {
        cout << "~SuperTracer" << endl;
        if (destructorThrow)
            throw (const char*)"SuperTracer::~SuperTracer";
    }

    int main(void)
    {
        try {
            SuperTracer t1(0, "throw in constructor");
        }
        catch (const char* p)
        {
            cout << "Caught " << p << endl;
        }
        try {
            SuperTracer t1(1, "throw in destructor");
        }
        catch (const char* p)
        {
            cout << "Caught " << p << endl;
        }
        try {
            cout << "Let the fun begin" << endl;
            SuperTracer t1(1, "throw in destructor");
            SuperTracer t2(0, "throw in constructor");
        }
        catch (const char* p)
        {

            cout << "Caught " << p << endl;
        }
        return 0;
    }

Here we can study different bugs in different compilers. Both GCC and VisualAge C++ have theirs. What bugs does your compiler have? Here's the result when running with GCC. Comments about the bug found are below the result:

    [d:\cppintro\lesson2]s2tracer.exe
    + throw in constructor
    SuperTracer(throw in constructor)
    - throw in constructor
    Caught SuperTracer::SuperTracer
    + throw in destructor
    SuperTracer(throw in destructor)
    ~SuperTracer
    Caught SuperTracer::~SuperTracer
    Let the fun begin
    + throw in destructor
    SuperTracer(throw in destructor)
    + throw in constructor
    SuperTracer(throw in constructor)
    - throw in constructor
    ~SuperTracer
    Abnormal program termination
    core dumped

The first 4 lines tell that when an exception is thrown in a constructor, all so far constructed member variables are destructed, through a call to their destructor, but the destructor for the object itself is never run. Why? Well, how do you destroy something that was never constructed? The next four lines reveal the GCC bug. As can be seen, the exception is thrown in the destructor, however, the member Tracer variable is not destroyed as it should be (VisualAge C++ handles this one correctly.)

Next we see the interesting case. What happens here is that an object is created that throws on destruction, and then an object is created that throws at once. This means that the first object will be destroyed because an exception is in the air, and when destroyed it will throw another one. The correct result can be seen in the execution above. Program execution must stop, at once, and this is done by a call to the function "terminate". The bug in VisualAge C++ is that it destroys the contained Tracer object before calling terminate.

What's the lesson learned from this? To begin with that it's difficult to find a compiler that correctly handles exceptions thrown in destructors. More important, though, if you throw an exception because an exception is in the air, your program will terminate very quickly. If you have a bleeding edge compiler, you can control this by calling the function "uncaught_exception()" (which tells if an exception is in the air,) and from there decide what to do, but think carefully about the consequences. After all, think *very* carefully, before allowing a destructor to throw exceptions.

An improved stack
The stack from last month was in many ways better than a corresponding C implementation, but it was far from adequate. An easy, C-ish way of improving it, is to implement it as an abstract data type, where functions push, pop, and whatever's needed is available to the users. The C++ way is, not surprisingly, to write a stack class.

Before going into that, though, some thinking is needed regarding what the stack should do. Minimum for a stack is functionality to push new elements onto it, and to pop the top element from it. The pop function is a classical headache, because it both changes the state of the stack (removes the top element from it) and returns whatever was the top element. This behaviour is dangerous in terms of errors, because you can easily lose data. What if something fails while removing the top element? Should you return the top element value? If you do, does that indicate that it has been removed? It's better to make two functions of it, one that returns the top element, and one that removes it. The one that removes it either returns or throws an exception (remember, either a function fails, or does what it's supposed to do, there's no middle way. If it fails, it exits through an exception, otherwise it returns.)

OK, so, on the surface, we can see a class that looks something like this:

    class intstack
    {
    public:

        intstack();  // initialise empty stack
        ~intstack(); // free memory by popping all elements
        void push(int aValue);
        void pop();  // remove top element
        int top();   // retrieve value of top element
    private:
        // implementation details
    };

This looks fair. Now let's look at what can go wrong in the different operations.

  • construction. Nothing really.
  • destruction. Tough one. If the stack is in a bad state, it might be indestructible.
  • push. Out of memory.
  • pop. What if the stack is empty? It mustn't be.
  • top. Again, what if the stack is empty?

Since top and pop require that the stack isn't empty, we must allow the user to check if the stack is empty, otherwise we don't leave them a chance, so another function is needed. Thus,

  • isEmpty. I don't see how anything can go wrong in here.

So, with the problems identified, let's think about what to do when they occur.

  • Out of memory on push. Throw exception and leave stack unchanged.
  • top and pop on empty stack. Throw exception, stack remains empty.
  • invalid stack state in destruction? Can we find out if we have them? I don't think we can, without adding significant control data, that probably increases the likelihood of exactly the kind of errors we want to avoid. I *think* the best solution for this problem is to just be careful with the coding, and hope it doesn't happen.

This leaves us with two different errors: Stack underflow (pop or top on empty stack), and out of memory. We also found, rather easily, the preconditions for operations pop and top (!isEmpty().) Now to think of post conditions. What are the post conditions for the different operations?

  • push(anInt): The stack can't be empty after that (post conditions always reflect successful completion, not failure.) Also top() == anInt.
  • pop(): Currently no way to say, but let's change things a bit. Instead of having the method isEmpty() we add the method nrOfElements(), then nrOfElements will be one less after pop.
  • top(): nrOfElements() same after as before.
  • Destruction? Nothing. There's no object left to check the post condition on! We can state a post condition that all memory allocated by the stack object is deallocated, but we can't check it (try to think of a method to do that, and tell me if you find one.)
  • Construction (from nothing): nrOfElements() == 0.

Normally copying and assignment (a = b) would be implemented too, but we'll wait with that until next month, or this article will grow far too long. So, now we can write the public interface of the stack:

    struct bad_alloc {}; // *1*
    struct stack_underflow {};

    class intstack
    {
    public:
        intstack() throw ();
        // Preconditions: -

        ~intstack() throw ();
        // Preconditions: -
        // Postconditions:
        //   The memory allocated by the object is deallocated

        unsigned nrOfElements() throw ();
        // Preconditions: -
        // Postconditions:
        //   nrOfElements() == 0 || top() == old top() // *2*

        void push(int anInt) throw (bad_alloc);
        // Preconditions: -
        // Postconditions:

        //   nrOfElements() > 0
        //   top() == anInt
        // Behaviour on exception:
        //   Stack unchanged.

        void pop(void) throw (stack_underflow);
        // Preconditions:
        //   nrOfElements() > 0
        // Postconditions:
        //   nrOfElements() + 1 == old nrOfElements()
        // Behaviour on exception:
        //   Stack unchanged.

        int top(void) throw (stack_underflow);
        // Preconditions:
        //   nrOfElements() > 0
        // Postconditions:
        //   nrOfElements() == old nrOfElements()
        // Behaviour on exception:
        //   Stack unchanged.
    private:
        intstack& operator=(const intstack& is); // *3*
        intstack(const intstack& is);
        // implementation details
    };

*1*: the structs stack_underflow and bad_alloc are empty. Nothing more is needed, we just throw them, and use the struct itself as the information. For really new compilers, however, the new operator throws a pre-defined class called bad_alloc. If you have such a compiler, remove the declaration of it above.

*2*: This looks odd, perhaps, but what this means is that if there are elements on the stack, the top elements must be the same. Or literally as it says in the code comment, either the stack is empty, or the top elements are equal. You'll get used to this reversed looking logic.

*3*: This is what the assignment operator looks like, and below it, the copy constructor (constructing a new stack by copying the contents of an old one.) I said we wouldn't implement these this month, and ironically that is why they are declared private. The reason is that if you don't declare a copy constructor and assignment operator, the C++ compiler will do it for you, and unfortunately, the compiler generated ones are usually not the ones you'd want. I'll talk more about this next month. By declaring them private, however, copying and assignment is explicitly illegal, so it's not a problem.

So, how do we implement this then? Why not like the one from last month, but with an additional element counter? I think that's a perfectly reasonable approach. The promise to always leave the stack unchanged when exceptions occur means that we must guarantee that whatever internal data structures we're dealing with must always be destructible. This requirement is also implied by our destructor guaranteeing not to throw anything. This is tricky, but it can be done. Here comes the complete class declaration, with the old "stack_element" as a nested struct within the class.

    struct bad_alloc {}; // *1*
    struct stack_underflow {};
    struct pc_error {}; // used for post condition violations.

    class intstack
    {
    public:
        intstack() throw ();
        // Preconditions: -

        ~intstack() throw ();
        // Preconditions: -
        // Postconditions:
        //   The memory allocated by the object is deallocated

        unsigned nrOfElements() throw (pc_error);
        // Preconditions: -

        // Postconditions:
        //   nrOfElements() == 0 || top() == old top()

        void push(int anInt) throw (bad_alloc, pc_error);
        // Preconditions: -
        // Postconditions:
        //   nrOfElements() == 1 + old nrOfElements()
        //   top() == anInt
        // Behaviour on exception:
        //   Stack unchanged.

        void pop(void) throw (stack_underflow, pc_error);
        // Preconditions:
        //   nrOfElements() > 0
        // Postconditions:
        //   nrOfElements() + 1 == old nrOfElements()
        // Behaviour on exception:
        //   Stack unchanged.

        int top(void) throw (stack_underflow, pc_error);
        // Preconditions:
        //   nrOfElements() > 0
        // Postconditions:
        //   nrOfElements() == old nrOfElements()
        // Behaviour on exception:
        //   Stack unchanged.
    private:
        intstack(const intstack& is);            // hidden!!
        intstack& operator=(const intstack& is); // hidden !!
        struct stack_element
        {
            stack_element(int aValue, stack_element* p) throw ()
                : value(aValue), pNext(p) {};
            int value;
            stack_element* pNext;
        };
        stack_element* pTop;
        unsigned elements;
    };

The only peculiarity here is that the constructor for the nested struct "stack_element" is defined in line (i.e. at the point of declaration.) As a rule of thumb, this should be avoided, but it's OK for trivial member functions, like this constructor, which only copies values.

So let's look at the implementation, bit by bit.

    intstack::intstack() throw ()
        : pTop(0),
          elements(0)
    {
        // Preconditions: -
    }

    intstack::~intstack() throw ()
    {
        // Preconditions: -
        // Postconditions:
        //   The memory allocated by the object is deallocated
        while (pTop != 0)
        {
            stack_element* p = pTop->pNext;
            delete pTop; // guaranteed not to throw.
            pTop = p;

        }
    }

These are rather straight forward. The guarantee that "delete pTop" doesn't throw comes from the fact that the destructor for "stack_element" can't throw (which is because we haven't written anything that can throw, and the contents of "stack_element" itself can't throw since it's fundamental data types only.)

    unsigned intstack::nrOfElements() throw (pc_error)
    {
        // Preconditions: -
        return elements;
        // Postconditions:
        //   nrOfElements() == 0 || top() == old top()
        // no need to check anything with this
        // implementation as it's trivially
        // obvious that nothing will change.
    }

Here I admit to being a bit lazy. Strictly speaking, the post condition should be checked, but since all that is done is to return a value, it is obvious that the stack cannot change from this. I leave the post condition, and an explanation for my laziness, as a comment, though, since it's valuable to others reading the sources. It's also valuable if, for some reason, the implementation is changed so that it is not obvious. If that happens, the check should be implemented.

    void intstack::push(int anInt) throw (bad_alloc, pc_error)
    {
        // Preconditions: -
        unsigned old_nrOfElements = nrOfElements();
        stack_element* pOld = pTop;
        stack_element* pTmp = new stack_element(anInt, pTop);
        // the above either throws or succeeds. If it throws,
        // the memory is deallocated and we're leaving the
        // function with the exception (before assigning to pTop,
        // so the stack remains unchanged.)
        if (pTmp == 0)
            throw bad_alloc();
        try {
            pTop = pTmp;
            ++elements;
            // Postconditions:
            //   nrOfElements() == 1 + old nrOfElements()
            //   top() == anInt
            if (nrOfElements() != 1 + old_nrOfElements || top() != anInt)
            {
                throw pc_error();
            }
        }
        catch (...)
        {
            // Behaviour on exception:
            //   Stack unchanged.
            delete pTop; // get rid of the new top element
            pTop = pOld;
            elements = old_nrOfElements;

            throw;
        }
    }

This is not trivial, and also contains some news. Let's start from the beginning. "old_nrOfElements" is used both for the post condition check that the number of elements after the push is increased by one, but also when restoring the stack should an exception be thrown. The call to "nrOfElements" may throw. If it does, the exception passes "push" and to the caller since we're not catching it. This is harmless since we haven't done anything to the stack yet. On the next line we store the top of stack as it was before the push. This is used solely for restoring the stack in the case of exceptions. This assignment cannot throw since "pOld" and "pTop" are fundamental types (pointers).

On the next line a new stack element is created on the heap. Here there are three possibilities. Either the creation succeeds as expected, in which case everything is fine, or we're out of memory (the only possible error cause here since the constructor for "stack_element" cannot throw.) If you have a brand new compiler, operator new itself throws "bad_alloc" when we're out of memory. If so, if we're out of memory here, "bad_alloc" will be thrown and the stack will be unchanged, without having leaked memory. If you have such a compiler, it'll most probably complain about the next two lines, since they're unnecessary in that case. If so, just remove them. For most of you, on the other hand, an out of memory situation will mean that the return value stored in "pTmp" is 0. That case is taken care of on the next two lines.

Next we start doing things that change the stack, and since we promise the stack won't be changed in the case of exceptions, things that do change the stack go into a "try" block. Setting the new stack top and incrementing the element counter is not hard to understand. The post condition check is interesting, though. Here we have three situations in which an exception results. The call to "nrOfElements" could throw "pc_error", the call to "top" may throw, and the post condition check itself might fail, in which case we throw ourselves. All these three situations are handled in the catch block. "catch (...)" will catch anything thrown from within the try block above. What we do when catching something, is to free the just allocated memory (which won't throw for the same reason as for the destructor.) We also restore the old stack top and the element counter. Thus the stack is restored to the state it had before entering "push". OK, either case, what we must do, is to pass the error on to the caller of "push", and that is what the empty "throw;" does. An empty "throw;" means to re throw whatever it was that was caught. A throw like this is only legal within a catch block (use it elsewhere and see your program terminate rather quickly.)

    int intstack::top(void) throw (stack_underflow, pc_error)
    {
        // Preconditions:
        //   nrOfElements() > 0
        if (nrOfElements() == 0 || pTop == 0)
        {
            throw stack_underflow();
        }
        return pTop->value;
        // Postconditions:
        //   nrOfElements() == old nrOfElements()
        // No need to check with this implementation!
    }

This is not so difficult. If we have no elements on the stack, we throw, otherwise return the top value. As with "nrOfElements", I'm lazy with the post condition check, but careful to document the behaviour should the implementation for some reason change into something less obvious.

    void intstack::pop(void) throw (stack_underflow, pc_error)
    {
        // Preconditions:
        //   nrOfElements() > 0
        if (nrOfElements() == 0 || pTop == 0)
        {
            throw stack_underflow();
        }
        unsigned old_elements = nrOfElements();
        stack_element* pOld = pTop;

        try {
            pTop = pTop->pNext;
            --elements;
            // Postconditions:
            //   nrOfElements() + 1 == old nrOfElements()
            if (nrOfElements() + 1 != old_elements)
            {
                throw pc_error();
            }
        }
        catch (...)
        {
            // Behaviour on exception:
            //   Stack unchanged.
            pTop = pOld;
            elements = old_elements;
            throw;
        }
        delete pOld; // guaranteed not to throw.
    }

The exception protection of "pop" works almost exactly the same way as with "push". The thing worth mentioning here, though, is why "delete pOld" is located after the "catch" block and not within the "try" block. Suppose the deletion did, despite its promise, throw something. If it did, it would be caught, and the top of stack would be left to point to something undetermined. As it is now, if it breaks its promise, we too break our promise not to alter the stack when leaving on exception, but we at least make sure the stack is in a usable (and most notably, destructible) state.

After having spent this much time on writing this class, it's time to have a little fun and play with it, don't you think?

    #include <iostream.h>

    int main(void)
    {
        try {
            cout << "Constructing empty stack is1" << endl;
            intstack is1;
            cout << "is1.nrOfElements() = " << is1.nrOfElements() << endl;
            cout << "is1.push(5)" << endl;
            is1.push(5);
            cout << "is1.top() = " << is1.top() << endl;
            cout << "is1.nrOfElements() = " << is1.nrOfElements() << endl;
            cout << "is1.push(32)" << endl;
            is1.push(32);
            cout << "is1.top() = " << is1.top() << endl;
            cout << "is1.nrOfElements() = " << is1.nrOfElements() << endl;
            cout << "is1.pop()" << endl;
            is1.pop();
            cout << "is1.top() = " << is1.top() << endl;
            cout << "is1.nrOfElements() = " << is1.nrOfElements() << endl;
            cout << "is1.pop()" << endl;
            is1.pop();

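        }
        catch (bad_alloc&)
        {
            cout << "Out of memory" << endl;
        }
        catch (stack_underflow&)
        {
            cout << "Stack underflow" << endl;
        }
        catch (pc_error&)
        {
            cout << "Post condition violation" << endl;
        }
        return 0;
    }

I'm staying within the limits of the allowed here, but please make changes to the test program, to break the rules and see what happens, to ensure that the program won't die all of a sudden. It should either work, or say why it fails. Um, yes, by the way, on the catch clauses I don't bind the exception instance caught to any named parameter. The reason is simply that I don't use it. The knowledge that something of that type has been caught is, in this case, enough.

Now I will break a promise from last month. I won't go into more details with references. That'll be dealt with later, because this is where I end this month's lesson.

Exercise
1. Something that is badly missing in the stack implementation above, is a good check for the integrity of the stack object itself. For example, what if we somehow manage to get "elements" to non-zero, while "pTop" is 0? That's a terrible error that must not occur, and if it does, it must not go undetected. What I'd like you to do, is to see what kind of "internal state" tests can be done, and to implement them. Please discuss your ideas with me over e-mail (this month, if I take a long time in responding, please don't feel ignored. I'll be net-less most of August.)
2. It's generally considered to be a bad idea to have public data in a class. Can you think of why? Mail me your reasons. (This is not to say that it should never ever be done, but that a lot of thought is required before doing so.)

Recap
Again a month filled with news.

  • You have seen how classes can be used to encapsulate internals and publish interfaces with the aid of the access specifiers "public:" and "private:"
  • Member functions are always called together with instances (objects) of the class, and thus always have access to the member data of the class.
  • A member function can access private parts of a class.
  • Destruction of objects is done in the reversed order of construction, except when the objects are allocated on the heap, in which case they're destroyed when the delete operator is called for a pointer to them.
  • We have seen that throwing exceptions in destructors can be lethal.
  • You have seen how it is possible to, by carefully crafting your code, make your member functions "exception safe" without being bloated with too many special cases ("catch(...)" and "throw;" help considerably here.)
  • You can now iterate your way to a good design by thinking of
    1. What the function should do.
    2. What can go wrong.
    3. What should happen when something goes wrong.
    4. How can a user of the class prevent things from going wrong.
    When you have satisfactory answers to all four questions for all functionality of your class, you have a safe design.

Next
Next month there will most probably be a break, since I'll be on a well needed vacation. After that, we'll have a look at copy construction and assignment (together with a C++ idiom often referred to as the "orthodox canonical form.") I promise to explain the references in more detail too.

Please, write me! I want opinions and questions. I need your constant feedback. If you think I'm wrong about things, going too fast, too slow, teaching the wrong things, whatever, tell me, ask me.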
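As a companion to the recap, the destruction-order rules from the Tracer experiments can be checked with a short sketch in present-day C++. This is an illustration under my own assumptions, not the article's code: it records events in a vector (the hypothetical "events" log) instead of printing with cout, it throws std::runtime_error rather than a "const char*", it leaves out the throwing destructor because destructors are implicitly "noexcept" since C++11, and the function names "traceNormalLifetime", "traceThrowingConstructor" and "checkLifetimeOrder" are invented for the example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical event log used instead of the article's cout output.
static std::vector<std::string> events;

class Tracer
{
public:
    explicit Tracer(const std::string& name) : name_(name)
    {
        events.push_back("+ " + name_);
    }
    ~Tracer() { events.push_back("- " + name_); }
    Tracer(const Tracer&) = delete;
    Tracer& operator=(const Tracer&) = delete;
private:
    std::string name_;
};

class SuperTracer
{
public:
    SuperTracer(const std::string& name, bool throwInConstructor)
        : t_(name) // the member is constructed before the body runs
    {
        events.push_back("SuperTracer(" + name + ")");
        if (throwInConstructor)
            throw std::runtime_error("SuperTracer::SuperTracer");
    }
    ~SuperTracer() { events.push_back("~SuperTracer"); }
private:
    Tracer t_;
};

// Two objects that live and die normally.
std::vector<std::string> traceNormalLifetime()
{
    events.clear();
    {
        SuperTracer a("a", false);
        SuperTracer b("b", false);
    }
    return events;
}

// One object whose constructor throws after its member was built.
std::vector<std::string> traceThrowingConstructor()
{
    events.clear();
    try {
        SuperTracer c("c", true);
    } catch (const std::runtime_error&) {
        events.push_back("caught");
    }
    return events;
}

// Aborts via assert if the recorded order differs from the rules
// stated in the recap.
void checkLifetimeOrder()
{
    const std::vector<std::string> expectedNormal = {
        "+ a", "SuperTracer(a)", "+ b", "SuperTracer(b)",
        "~SuperTracer", "- b", "~SuperTracer", "- a"};
    assert(traceNormalLifetime() == expectedNormal);

    // The member is destroyed, but ~SuperTracer never runs.
    const std::vector<std::string> expectedThrow = {
        "+ c", "SuperTracer(c)", "- c", "caught"};
    assert(traceThrowingConstructor() == expectedThrow);
}
```

Calling "checkLifetimeOrder()" from a test program confirms both rules at once: objects die in the reverse order of their construction, and a constructor that throws leaves the already-built members destroyed without the object's own destructor being run.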
Part Part Part Part Part Part Part Part Part Part1 Part1 Part1 Part1 1 2 3 4 5 6 7 8 9 0 1 2 3 [NOTE: Here is a link to a zip of the source code for this article. ask me. you have a safe design. too slow. make your member functions "exception safe" without being bloated with too many special cases ( "catch(. After that. yes. we'll have a look at copy construction and assignment (together with a C++ idiom often referred to as the "orthodox canonical form. enough. } catch (stack_underflow&) { cout << "Stack underflow" << endl. in which case they're destroyed when the delete operator is called for a pointer to them. If you think I'm wrong about things. } catch (bad_alloc&) { cout << "Out of memory" << endl. • You have seen how it is possible to. by the way. if I take a long time in responding. and to implement them. on the catch clauses I don't bind the exception instance caught to any named parameter. Ed. Can you think of why? Mail me your reasons. What I'd like you to do. 3. tell me. What the function should do. and thus always have access to the member data of the class. whatever. I need your constant feedback.) Next Next month there will most probably be a break. please don't feel ignored. to break the rules and see what happens. since I'll be on a well needed vacation.) 2. but please make changes to the test program..") I promise to explain the references in more detail too. What can go wrong. 4. however. by carefully crafting your code. How can a user of the class prevent things from going wrong. Please. I'll be net-less most of August. while "pTop" is 0? That's a terrible error that must not occur. and the stack implementation." helps considerably here. • Destruction of objects is done in the reversed order of construction. Um.

So far we've covered error handling through exceptions, and encapsulation with classes. The feedback from part 2 tells me I forgot a rather fundamental thing: what exactly is encapsulation, what should we make classes of, what's the meaning of a class? I will get to this, but first let's finally have a look at the promised references.

References

C++ introduces an indirect kind of type that C does not have, the reference. A reference is in itself not a type, it always is a reference of some type, just like arrays are arrays of something and pointers are pointers to something. A reference is a means of indirection. It's a way of reaching another variable. This may sound a lot like a pointer, but don't confuse the two, they're very different. See a reference more as an alias for another variable. References are denoted with a unary "&", just the same way as pointers are denoted with a unary "*". Let's have a look at an example:

int main(void)
{
  int i = 0;
  int& r = i; // r now refers to i.
  ++r;        // now i == 1, r still refers to i.
  if (&r != &i) throw "Broken compiler";
  return 0;
}

Some details about references:

• References must be initialized to refer to something. There is no such thing as a 0 reference.

  int& x; // error, would be an unbound reference.

• Once bound to a variable there is no way to make the reference refer to something else.
• There is no such thing as "reference arithmetic."
• You cannot get the address of a reference. You can try to, but what you get is the address of the variable referred to.

From this, one may wonder, what on earth are references for? Why would anyone want them? Well, for one, they offer a certain security over pointers; it's so easy to get a pointer referring to something that doesn't exist, or something else than the intended. They're also handy as a short-cut into long nested structs and arrays. Here's an example:

struct A {
  int b[5];
  int q;
};

struct C {
  A* p[10];
  int x;
  char d;
};

C* pc;
// and somewhere else, pc is given a value and is
// here used.
A& ra = *pc->p[2]; // p holds pointers, so dereference to bind the reference
ra.b[3]=5; // pc->p[2]->b[3] = 5;
ra.b[4]=2; // pc->p[2]->b[4] = 2;

The reference in this case just makes life easier.

References are also often used for parameters to functions. In parts 1 and 2, I used references when catching exceptions. If we look at the exceptions, it means that instead of getting a local copy of the thing thrown, we get a reference to the thing thrown, and we can manipulate it if we want to, instead of manipulating our own copy. The same goes for parameters to functions. Passing an object by reference instead of by value, can sometimes be necessary, and sometimes beneficial in terms of performance. The reason for the performance benefit is that when passing a parameter by value, the object is copied; the function uses a local object of its own, that is identical to the one you passed. If copying the object is an expensive operation, which in some cases it is, then passing by reference is cheaper.

However, passing parameters by reference can be dangerous. When you do, the function has

access to the very object you pass, and if the function modifies it, the caller better be prepared for that. A commonly used way around this is to declare the parameter as a "const" reference. This means that you get a reference, as before, but the reference is treated as if it were a constant, and because of this all attempts to change its value will cause a compile time error. Here's an example of passing a parameter by "const" reference. It uses the "intstack" from the previous lesson:

// put the declaration of intstack here.

void workWithStack(const intstack& is)
{
  // work with is
  is.pop(); // Error, attempting to alter
            // const reference.
}

int main(void)
{
  intstack i;
  i.push(5);
  workWithStack(i);
  return 0;
}

Since the "intstack" class does not have a copy constructor (it was declared private, remember?) it is impossible to pass instances of it to functions in other ways than by reference (or by pointer.)

There are situations when a reference is dangerous, though. One such trap, that I think all C++ programmers fall into at least once, is returning a reference to a local variable. Have a look at this:

#include <iostream.h>

int& ir(void) // function returning reference to int
{
  int i = 5;
  return i;
}

int main(void)
{
  cout << ir() << endl;
  return 0;
}

What will this program print? It's hard to tell. It could be anything, that the "main" function prints, if it prints at all. It might just crash. Why? The function "ir" returns a reference to an "int". So far so good, but what does the reference returned refer to? It refers to the local variable "i" in "ir". If you remember the "tracer" examples from the previous lesson, you remember that the variable ceases to exist when exiting the function. In other words, the reference returned refers to a variable that no longer exists! Don't do this! Or rather, do it now, at once, just to have your one time mistake over with :-)

What's a class?

Now for the theoretical biggie. What, exactly, is the meaning of a class? When should you write a class, what should the class allow you to do, and what's a good name for a class? What's the relation between classes and objects?

When you write programs in Object Oriented Programming Languages, be it C++, Objective-C, SmallTalk, Eiffel, Modula-3, Java or whatever, you write classes. A class is, as I mentioned in part 2, a method of encapsulation, but more importantly, a class is a type. When you define a class, you add a new type to the language. C++ comes with a set of built in types like "int", "unsigned" and "double". In the previous lesson, when we wrote the class "intstack", the stack of integers, we introduced a new type to the language, which programs could use.

The member functions of the class, describe the semantics of the type. With the built in integral types, we have operations like adding two instances of the type, yielding a third instance, whose value happens to be the sum of the values of the other two. We can increment the value of instances of the type with operations like ++, and so on. With the "intstack", we had the operations "pop", "push", "top" and "nrOfElements", in addition to well defined construction and destruction of instances.

So, how can you know what classes to make? Classes are, as a rule of thumb, descriptions of ideas. "Bicycle" for example, is a classic example of a class. The idea "Bicycle" that is, not my particular bicycle. My bicycle is a physical entity that is currently getting wet in the rain. The idea of bicycle is a very good candidate for a class. What my bicycle is, is a good candidate for an instance of the class "Bicycle."

So, when thinking of the problem you want to solve, if you can say "The X...", "An X...", or "A kind of X...", you might have a good candidate for a class X. Usually instances of the class have a state (for example, with a stack, the state of a stack is the elements in it, and their order.) Having state means that the same member function can give different results depending on what has been done to the object before calling the member function (again, with a stack, the value returned by "top()" or "nrOfElements()" depends on the history of "push()" and "pop()" calls.) The class has member data to represent state, and the functions that represent the semantics.

There are, however, exceptions to this rule of thumb. For example, is a mathematical function a class that you'd like to have instances of to toy with in your program? According to the rule, it is not, since a mathematical function is stateless. In most situations, the answer would, as expected, be no, but if you design a program for use by electronics engineers when designing their gadgets, you better have amplifiers (multiplication,) adders, subtractors and so on, or they won't use your program.

So, what member functions should a class have? This is even harder to say, because there are so many ways to solve every problem. In general, though, the things that you need to do, when solving your problem, must in one way or the other be expressible through the classes. If I need to ride my bicycle, it can be that the class "Bicycle", should have the member function "beRiddenBy" accepting an instance of class "Human", but it might also be that class "Human" should have the member function "ride" accepting an instance of class "Bicycle" as its parameter. If the starting point or destination are important, they probably should be parameters to the member functions. If the road itself is important, you probably need a class "Road", which you want to pass an instance of to the member function of either "Bicycle::beRiddenBy" or "Human::ride".

Given this, you might start to feel like someone's been fooling you. This Object Oriented Programming thing is a hoax! What it's all about, after all, is class oriented programming. Note that objects don't exist when you write your program. When you write your program, what exists are types, like "Bicycle" or "intstack", descriptions of how instances of types can be used, and descriptions of semantics and state representation. Objects are run-time entities. When your program executes, the identifiers, (like "pc" in the reference example above) are replaced by bit-patterns representing objects. The objects are the instances of types (yes, an instance of type "float" is also an object, they need not be instances of classes.) A class represents the idea. The objects are, on the other hand, just the run time instances of the classes.

The Orthodox Canonical Form

The basic operations you should, in general, be able to do with objects of any class are construction from scratch, construction by copying another instance, assignment and destruction. The "intstack" guaranteed that no matter what happened, an instance was always destructible. The Orthodox Canonical Form poses the additional requirement that an instance must always be copyable. This places a slightly heavier burden on us, compared to the work with the "intstack." Normally this extra burden is light, but there are tricky cases.

Construction from scratch we've seen. Construction by copying is done by the copy constructor. Given a class named C, the copy constructor looks like this:

class C
{
public:
  C(const C& c); // copy constructor
  // other necessary member functions.
};

The job of the copy constructor is to create an object that is identical to another object. It is your job to make sure it does this. This does not mean that every member variable of the newly constructed object must have values identical to the ones in the original. On the contrary, they often differ. What's important, then, is that they're semantically identical (i.e. given the same input to member functions, they give the same response.) The "intstack" for example must make its own copy of the stack representation in the copy constructor. This means that the base pointer will differ, but as far as you can see through the "push", "pop" and "top" member functions, there is no difference between the copy and the original.

Next in line is copy assignment. Again, given a class C, the copy assignment operator looks like this:

class C
{
public:
  C& operator=(const C&);
  // other necessary member functions.
};
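The four canonical operations are easiest to tell apart by watching which syntax triggers which one. The following is only a sketch under stated assumptions: the class "Traced", the global "trace_log" and the function "canonical_demo" are hypothetical names made up for this illustration (they are not part of the article's code), and it is written in current standard C++ rather than the older dialect used in the listings.

```cpp
#include <cassert>
#include <string>

// Hypothetical global used only for this illustration: it records which of
// the four canonical operations ran, in the order they ran.
static std::string trace_log;

// Hypothetical class with all four canonical operations written out.
class Traced
{
public:
  Traced()               { trace_log += "D"; }  // construction from scratch
  Traced(const Traced&)  { trace_log += "C"; }  // construction by copying
  Traced& operator=(const Traced&)              // copy assignment
  {
    trace_log += "A";
    return *this;                               // reference to the object assigned to
  }
  ~Traced()              { trace_log += "~"; }  // destruction
};

// Runs a small scenario and returns the recorded sequence of operations.
std::string canonical_demo()
{
  trace_log.clear();
  {
    Traced a;      // default constructor
    Traced b(a);   // copy constructor
    Traced c = a;  // also the copy constructor, despite the "="
    c = b;         // copy assignment, since c already exists
  }                // c, b and a are destroyed here, in reverse order
  return trace_log;
}
```

Calling "canonical_demo()" yields the string "DCCA~~~": one construction from scratch, two copy constructions, one assignment and three destructions. Note that "Traced c = a;" is a construction, not an assignment, because "c" does not exist before that statement.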

Writing the copy assignment operator is more difficult than writing the copy constructor. Not only does the copy assignment operator need to make the object equal to its parameter, it also needs to cleanly get rid of whatever resources it might have held when being called (The copy constructor does not have this problem since it creates a new object that cannot have held any data since before.)

The return value of an assignment operator is (by tradition, not by necessity) a reference to the object just assigned to. When inside a member function (the assignment operator as defined above is a member function) the object can be reached through a pointer named "this," which is a pointer to the class type. For the class C, above, the type of "this" is "C* const". This means that it's a pointer to type C, and the pointer itself is a constant (i.e. you cannot make "this" point to anything else than the current instance.) The reference to the object is obtained by dereferencing the "this" pointer, so the last statement of an assignment operator is almost always "return *this;"

The difficulty of writing a good copy constructor and copy assignment operator is best shown through a classical error:

class bad
{
public:
  bad(void);                  // default constructor
  bad(const bad&);            // copy constructor
  ~bad(void);                 // destructor
  bad& operator=(const bad&); // copy assignment
private:
  int* pi;
};

bad::bad(void)
  : pi(new int(5)) // allocate new int on heap and
                   // give it the value 5.
{
}

bad::bad(const bad& b)
  : pi(b.pi) // initialize pi with the value of b's pi
{            // This is very bad, as you will soon see
}

bad::~bad(void)
{
  delete pi;
}

bad& bad::operator=(const bad& b)
{
  pi = b.pi; // This seemingly logical and simple
             // assignment is also disastrous.
  return *this;
}

int main(void)
{
  bad b1;
  {
    bad b2(b1); // b2.pi is now the same as b1.pi.
  } // Here b2 is destroyed, and b2's destructor is
    // called. This means that the memory area
    // pointed to by b2.pi (and hence also b1.pi) is
    // no longer valid
  bad b3(b1); // Make b3.pi point to the same invalid
              // memory area!
  bad b4;
  bad b5;
  b5 = b4; // The memory allocated by b5 was never
           // deallocated. We have a memory leak!

  return 0;
} // The destructors of b1 and b3 attempt to deallocate
  // the memory already deallocated by the destructor
  // of b2. The destructors of b4 and b5 both attempt
  // to deallocate the same memory (b5 first, and then b4,
  // which deallocates already deallocated memory.)

OK, so from the example it is pretty clear that it's more work than this. The copy constructor should allocate its own memory, and initialise that memory with the same value as that pointed to by the original. This goes for the copy assignment operator too, but it also needs to discard the pointer it already had. By doing so, we guarantee that the pointers owned by the objects are truly theirs, and their destructor can safely deallocate them. A version of the program fixing the above issues can show you what is meant by that:

// class declaration, default constructor and
// destructor are identical and because of that not
// shown here.

bad::bad(const bad& b)
  : pi(new int(*b.pi)) // initialize pi as a new int
                       // with the value of b's pi
{ // This guarantees that both the new object and the
} // original are destructible.

bad& bad::operator=(const bad& b)
{
  delete pi;           // No more memory leak
  pi = new int(*b.pi); // Get a new pointer and
                       // initialise just as in
                       // the copy constructor.
  return *this;
} // Can you spot the problem with this one?

int main(void)
{
  bad b1;
  {
    bad b2(b1); // b2.pi now points to its own area.
  } // Here b2 is destroyed, and b2's destructor is
    // called. This means that the memory area
    // pointed to by b2.pi is no longer valid,
    // correctly so. b1.pi is still valid, however.
  bad b3;  // Allocate new memory
  b3=b1;   // Deallocate, and allocate new again
  b3=b3;   // Whoa!! b3.pi is first deallocated, then
           // b3.pi is allocated again and initialised
           // with the value no longer available!!!
  return 0;
} // all OK, all objects refer to their own
  // memory, so deallocation is not a
  // problem.

We do, though, have yet a problem to deal with, that of self assignment. OK, so assigning an object to itself is perhaps not the most frequently done operation in a program, but that doesn't mean it's allowed to crash, right? So, how can we make the copy assignment operator safe from self assignment? Here are two alternatives:

bad& bad::operator=(const bad& b)
{
  if (pi != b.pi) {
    delete pi;
    pi = new int(*b.pi);

  }
  return *this;
}

bad& bad::operator=(const bad& b)
{
  if (this != &b) {
    delete pi;
    pi = new int(*b.pi);
  }
  return *this;
}

Common to both is that they check if the right hand side (parameter b) is the same object. If it is, the assignment is simply not done. The first alternative does this by comparing the "pi" pointer. The second by comparing the pointer to the objects themselves. The latter perhaps feels a bit harder to understand, but it's actually the one most frequently seen, because normally classes have more than one member variable to check for. With these changes done, the class deserves a name change. It is no longer bad.

One last thing before wrapping up. In the previous lesson, the copy constructor and copy assignment operator were declared private, to prevent copying and assignment. The reason is that a C++ compiler automatically generates a copy constructor and copy assignment operator for you if you don't declare them. The auto-generated copy constructor and assignment operator, will just copy/assign the member variables, one by one. In some cases this is perfectly OK. The "Range" class from the previous lesson, for example, does fine with this auto-generated copy constructor and copy assignment operator. The "intstack" on the other hand does not, since then both the original and the copy would share the same representation (and have exactly the same problem as described in the above "bad" example!) Note that if your class only has member variables of types for which copying the values does not lead to problems, the tests above are not necessary. If you decide that for your class, the auto generated copy constructor and/or copy assignment operator is OK, leave a comment in the class declaration saying so, so that readers of the source code know what you're thinking. Otherwise they might easily think you've simply forgotten to write them.

Const Correctness

When talking about passing parameters to functions by reference, I mentioned the const reference as a way to ensure that the parameter won't get modified, since the const reference treats whatever it refers to as a constant and thus won't allow you to do things that would modify it. The question is, how does the compiler know if something you do to an object will modify it? Does "pop" modify the "intstack?" Yes, as a matter of fact, it does. It removes the top element. Does "top" modify the stack? No. So, it should be allowed to call "top" for a constant stack, right? The problem is that the compiler doesn't know which member functions modify the objects, and which don't (and assumes they do, just to be on the safe side) unless you tell it differently. Since, by default, a member function is assumed to alter the object, you are, by default, not allowed to do anything at all to a constant object. This is of course hard. Fortunately we can tell the compiler differently. We can change "top" to be declared as follows:

class intstack
{
public:
  // misc member functions
  int top(void) const throw(stack_underflow,pc_error);
  // misc other member functions
};

It's the word "const" after the parameter list that tells the compiler that this member function will not modify the object and can safely be called for constant objects. However, now when we know about references, we can do even better by writing two member functions "top", one "const" and one not, with the non-const version returning a non-const reference to the element instead. This has a tremendous advantage: For constant stack objects, we can get the value of the top element; for non-constant stack objects, we can alter the value of the top element by writing like this:

intstack is;
is.push(5);   // top is now 5.
is.top() = 3; // change value of top element!

There is no magic involved in this. Just as I mentioned in part one, functions can be overloaded if their parameter list differs. Member functions can be overloaded on "constness." The "const" member function is called for constant objects (or, const references or pointers, since they both treat the object referred to as if it was a constant.) The non-

const member function is called for non-constant objects. Note that it is only member functions you can do this "const" overloading on. You cannot declare non-member functions "const." Our overloaded "top" member functions can be declared like this:

class intstack
{
public:
  // misc member functions
  int top(void) const throw(stack_underflow,pc_error);
  int& top(void) throw (stack_underflow,pc_error);
  // misc other member functions
};

This is getting too much without concrete examples. Here's a version of "intstack" with copy constructor, copy assignment operator, const version of "top" and "nrOfElements", and a non-const version of "top" (just as above.) Only the new and changed functions are included here. You'll find a zip file with the complete sources at the top.

class intstack
{
public:
  // the previous member functions

  intstack(const intstack& is) throw (bad_alloc);
  // Preconditions:

  intstack& operator=(const intstack& is) throw (bad_alloc);
  // Preconditions:

  unsigned nrOfElements() const throw (pc_error);
  // Preconditions:
  // Postconditions:
  //   nrOfElements() == 0 || top() == old top()

  int top(void) const throw(stack_underflow, pc_error);
  // Preconditions:
  //   nrOfElements() > 0
  // Postconditions:
  //   nrOfElements() == old nrOfElements()
  // Behaviour on exception:
  //   Stack unchanged.

  int& top(void) throw (stack_underflow, pc_error);
  // Preconditions:
  //   nrOfElements() > 0
  // Postconditions:
  //   nrOfElements() == old nrOfElements()
  // Behaviour on exception:
  //   Stack unchanged.

private:
  // helper functions for copy constructor, copy
  // assignment and destructor.
  stack_element* copy(void) const throw (bad_alloc);
  void destroyAll(void) throw();
};

Since copying elements of a stack is the same when doing copy assignment and copy construction, I have a private helper function that does the job. The same goes for deallocation of the stack. It is needed both in copy assignment and destructor. This is not necessary by any means, but it means I won't have identical code in two places, and that is usually desirable. After all, if ever you need to change the code, you can bet you'll forget to update one of them otherwise, and you have a subtle bug that may be hard to find. With only one place to update, that mistake is hard to make.

Since these helper functions "copy" and "destroyAll" are purely intended as an aid when implementing copy assignment, copy constructor and destructor, they're declared private. Just as a private member variable can

only be accessed from the member functions of a class, member functions declared private can only be accessed from member functions of the same class, and not by anyone else. They have nothing what so ever to do with how the stack works, just how it's implemented.

Here comes the new implementation of "nrOfElements." Can you see what's different from the previous lesson?

unsigned intstack::nrOfElements() const throw (pc_error)
{
  // Preconditions:
  return elements;
  // Postconditions:
  //   nrOfElements() == 0 || top() == old top()
  // no need to check anything with this
  // implementation as it's trivially
  // obvious that nothing will change.
}

There isn't anything at all that differs from the previous version of "nrOfElements", other than that it's declared to be "const." Had we, in this member function (or any other member function declared as "const") attempted to modify any member variable, the compiler would give an error, saying that we're attempting to break our promise not to modify the object. "const" methods are thus good also as a way of preventing you from making mistakes, in addition to making those member functions callable for constant objects (or constant references or pointers.) Whenever you have a member function that does not modify any member variable, declare it "const" so that errors can be caught by the compiler. It saves you debug time. Note that declaring a member function "const" does not mean it's only for constant objects, it just means that it's callable on constant objects too.

Next in turn is "top", or rather the two versions of "top":

int intstack::top(void) const throw (stack_underflow, pc_error)
{
  // Preconditions:
  //   nrOfElements() > 0
  if (nrOfElements() == 0 || pTop == 0)
  {
    throw stack_underflow();
  }
  return pTop->value;
  // Postconditions:
  //   nrOfElements() == old nrOfElements()
  // No need to check with this implementation!
}

int& intstack::top(void) throw (stack_underflow, pc_error)
{
  // Preconditions:
  //   nrOfElements() > 0
  if (nrOfElements() == 0 || pTop == 0)
  {
    throw stack_underflow();
  }
  return pTop->value;
  // Postconditions:
  //   nrOfElements() == old nrOfElements()
  // No need to check with this implementation!
}

As can be seen, not much differs between the two variants of "top." The implementation is in fact identical for both, but the first one returns a value and is declared const, the other one is not declared const and returns a reference. So why do we have two identical implementations here, when I earlier mentioned that this is always undesirable? The reason is simply that although the implementation is identical, neither can be expressed in terms of the other. The non-const version cannot be implemented with the aid of the const version, since we'd then return a reference to a local value. This is always bad, does not have the desired effect, and is quite likely to cause unpredictable run-time behaviour. The "const" version could be implemented in terms of the non-const version, if it wasn't for the fact that

it is not declared "const." The implementation of a const member function is not allowed to alter the object, and is, as a consequence of this, not allowed to call non-const member functions for the same object.

Remember that a reference really isn't an object on its own? You cannot distinguish it in any way from the object it refers to. In this case it means that what's returned from the non-const version of "top" is the top element itself, not a local copy of it. Since it is the element itself, it can be modified, in addition to getting a value from it. Note that there is a danger in this too: What about this example?

intstack is;
is.push(45);
int& i=is.top(); // i now refers to the top element
i=32;            // modify top element.
int val=is.top(); // val is 32.
is.pop();        // what does i refer to now?
i=45;            // what happens here?

The answer to the last two questions is that "i" refers to a variable that no longer exists and that when assigning to it, anything can happen. If you're lucky, your program will crash right away; if you're out of luck, it'll start behaving randomly erratically!

Now for the copy constructor. With the help of the "copy" member function, it's really simple!

intstack::intstack(const intstack& i) throw (bad_alloc)
  : pTop(i.copy()),
    elements(i.elements)
{
  // Preconditions:
  // Postconditions:
}

The "pTop" member of the instance being created is initialized with the value from "i.copy()". The "copy" helper function, creates a new copy of i's representation ("pTop" and whatever it points to) on the heap and returns the pointer to its base. If "i" is an empty stack, "copy" returns 0. If we run out of memory when "copy" is working, whatever was allocated will be deallocated, and "bad_alloc" thrown. Since we're not catching "bad_alloc", it'll flow off to the caller if thrown. Also, since "bad_alloc" is not caught in the function, it means that "bad_alloc" will be thrown before "pTop" is initialized, and thus the new object will never be constructed.

The copy assignment operator is a little bit trickier, but not that bad.

intstack& intstack::operator=(const intstack& i) throw (bad_alloc)
{
  if (this != &i)
  {
    stack_element* pTmp = i.copy(); // can throw bad_alloc
    destroyAll(); // guaranteed not to throw!
    pTop=pTmp;
    elements=i.elements;
  }
  return *this;
}

Seemingly simple, and yet both efficient and exception safe. The difficulty lies in being careful with the order in which to do things. Here a temporary pointer "pTmp" is first set to refer to the copy of "i's" representation. If copying is successful, we can safely destroy whatever we have and then change the "pTop" member variable. Since we've promised that "destroyAll" won't throw anything (a promise we could make, since we've promised that our destructors won't throw) the rest is guaranteed to work.

Again, first getting the copy is essential. This is very important from an exception handling point of view. Suppose we first destroyed the contents and then tried to get a copy, but the copying threw "bad_alloc." The exception would flow out of the function as intended, but our own "pTop" would point to something illegal, and thus our promise to always stay destructible, and copyable whenever resources allow, would be broken. Instead, if the copying fails, and "bad_alloc" is thrown, the member variables are not altered, and the object remains unchanged (whenever possible, try to leave objects in an unaltered state in the presence of exceptions, and always leave them destructible and copyable.)

In this case, the self assignment guard ("if (this != &i)") is not necessary, since we first get a local copy of the object assigning from, and after that destroy our own representation. It's a pure performance boost by making sure we do nothing at all instead of duplicating the representation, just to destroy the original.

With the aid of the "destroyAll" helper function, the destructor becomes trivial:

intstack::~intstack(void)
{
  destroyAll();
}

So, how is this magic "destroyAll" helper function implemented? It's actually identical with the old version of the destructor.

void intstack::destroyAll(void) throw ()
{
  while (pTop != 0)
  {
    stack_element* p = pTop->pNext;
    delete pTop; // guaranteed not to throw.
    pTop = p;
  }
}

Now the only thing yet untold is how the helper function "copy" is implemented. It's the by far trickiest function of them all.

intstack::stack_element* intstack::copy(void) const throw (bad_alloc)
{
  stack_element* pFirst = 0; // used in catch block.
  try {
    stack_element* p = pTop;
    if (p != 0)
    {
      // take care of first element here.
      pFirst = new stack_element(p->value,0);
      // Cannot throw anything except bad_alloc
      if (pFirst == 0) //**1
        throw bad_alloc();

      // Here we take care of the remaining elements.
      stack_element* pPrevious = 0;
      pPrevious = pFirst;
      while ((p = p->pNext) != 0) //**2
      {
        pPrevious->pNext = new stack_element(p->value,0);
        // cannot throw except bad_alloc
        pPrevious = pPrevious->pNext;
        if (pPrevious == 0) //**1
          throw bad_alloc();
      }
    }
    return pFirst;
  }
  catch (...) // If anything went wrong, deallocate all
  {           // and rethrow!
    while (pFirst != 0)
    {
      stack_element* pTmp = pFirst->pNext;
      delete pFirst; // guaranteed not to throw.
      pFirst = pTmp;
    }
    throw;
  }
}
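To experiment with this technique without the rest of "intstack", the two helpers can be reduced to free functions. The following is only a sketch under stated assumptions: "node", "copy_list", "destroy_list" and "copy_is_independent" are hypothetical names standing in for "stack_element", "copy" and "destroyAll", the loop is written as a "for" rather than the "while" used above, and current standard C++ ("noexcept", no exception specifications) is used instead of the dialect of the listings.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for intstack::stack_element.
struct node
{
  int   value;
  node* pNext;
  node(int v, node* n) : value(v), pNext(n) {}
};

// Deallocates a whole chain of nodes; never throws.
void destroy_list(node* p) noexcept
{
  while (p != 0)
  {
    node* pNext = p->pNext;
    delete p;
    p = pNext;
  }
}

// Returns an independent copy of the chain starting at pSrc (0 if it is empty).
// If an allocation fails, the partial copy is released and the exception
// continues to the caller, so nothing is leaked.
node* copy_list(const node* pSrc)
{
  node* pFirst = 0; // defined outside "try" so the handler can reach it
  try
  {
    node* pLast = 0;
    for (const node* p = pSrc; p != 0; p = p->pNext)
    {
      node* pNew = new node(p->value, 0); // pNext is 0: last element so far
      if (pLast == 0)
        pFirst = pNew;        // first element of the copy
      else
        pLast->pNext = pNew;  // append to the copy
      pLast = pNew;
    }
    return pFirst;
  }
  catch (...)
  {
    destroy_list(pFirst); // deallocate whatever was copied so far
    throw;                // and rethrow
  }
}

// Small check: copy a chain, modify the copy, and verify the original is intact.
bool copy_is_independent()
{
  node* original  = new node(1, new node(2, new node(3, 0)));
  node* duplicate = copy_list(original);
  duplicate->value = 99; // must not affect the original
  bool ok = original->value == 1
         && duplicate != original
         && duplicate->pNext->value == 2
         && duplicate->pNext->pNext->value == 3
         && duplicate->pNext->pNext->pNext == 0;
  destroy_list(duplicate);
  destroy_list(original);
  return ok;
}
```

The structure mirrors the member function: the pointer to the first copied element lives outside the "try" block so that the handler can find the memory to release, and every new element starts out with a 0 "pNext" until a successor has been successfully allocated.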

To begin with, the return type is "intstack::stack_element*". The type "stack_element" is only known within "intstack," so whenever used outside of "intstack" it must be explicitly stated that it is the "stack_element" type that is defined in "intstack." As long as we're "in the header" of a member function, nested types must be explicitly stated. Well within the function, it is no longer needed, since it is then known what class the type belongs to.

The local variable "pFirst", used to point to the first element of the copy, is defined outside of the "try" block, so it can be used inside the "catch" block. If we didn't leave this for the "catch" block, there would be no way it could find the memory to deallocate. The whole copying is in a "try" block, so we can deallocate things if something goes wrong.

If "pTop" is non-zero, the whole structure that "pTop" refers to is copied. At the places where a "stack_element" is allocated, it is important that the "pNext" member variable is given the value 0, since it is always put at the end of the stack. If it was not set to 0, it would not be possible to know that it was the last element, and our program would behave erratically. It's not until we have successfully created another element to append to the stack, that the "pNext" member variable is given a value other than 0.

There are two details worth mentioning here.
• The "if" statements marked //**1 are only needed for older compilers. New compilers automatically throw "bad_alloc" when they're out of memory. Old compilers, however, return 0.
• The "while" statement marked //**2 might look odd. What happens is that the variable "p" is given the value of "p->pNext", and that value is compared against zero. Remember that assignment is an expression, and that expression can be used, for example, for comparisons. The assignment "p=p->pNext" must be in parentheses for this to work. The precedence rules are such that assignment has lower precedence than comparison, so if we left out the parentheses, the effect would be to assign "p" the value of "p->pNext" compared to 0, which would not be what we intended.

Now, it's up to you to toy with the "intstack". Whenever you have a need for a stack of integers, here you have one.

Exercises
• When is guarding against self assignment necessary? When is it desirable?
• How can you disallow assignment for instances of a class?
• The non-const version of "top" returns a reference to data internal to the class. Mail me your reasons for why this can be a bad idea (it can, and usually even is!) Can it be bad in this case?
• When can returning references be dangerous? When is it not?
• Mail me an exhaustive list of reasons when assignment or construction can be allowed to fail under the Orthodox Canonical Form.
• When is it OK to use the auto-generated copy constructor and copy assignment operator?

Recap
This month, yet more news has been introduced to you.
• You have seen how C++ references work.
• You have learned about "const", and how it works for objects.
• You have seen how you can make member functions callable for "const" objects by declaring them as "const", and seen that member functions declared "const" are callable for non-const objects as well.
• You have found out how you can overload member functions on "constness" to get different behaviour for const objects and non-const objects.
• You have learned about the "Orthodox Canonical Form", which always gives you construction from nothing, construction by copying, assignment and destruction.
• You have learned that your objects should always be in a destructible and copyable state, no matter what happens.
• You have seen how you can implement common behaviour in private member functions. These member functions are then only callable from within member functions of that class.

Coming up
Next month I hope to introduce you to components of the C++ standard library. Most compilers available today do not have this library, but fortunately it is available and downloadable for free from a number of sources. Knowing this library, and how to use it, will be very beneficial for you, as coming C++ programmers, partly because it is standard, and partly because it's remarkably powerful.

As usual, you are the ones who can make this course the ultimate C++ course for you. Send me e-mail at once, stating your opinions, desires, questions and (of course) answers to this month's exercises!

Chapter 4

[NOTE: Here is a link to a zip of the source code for this article. Ed.]

So I've done it again, promised something I cannot keep. I wanted to introduce you to the C++ standard library, but there are problems with the available implementations and Watcom compilers, and I don't own a Watcom compiler to work around it with. Sigh. Today, we'll look at another very important aspect of C++, that of templates. Templates are the foundation of the standard C++ library too, so getting to know them before hand might not be so bad anyway.

Why templates
The last two articles made some effort in perfecting a stack of integers. This month, I want a stack of doubles. What do I do? Rewrite it all and call it doublestack? It's one alternative. Then I want a stack of char* (another rewrite) and a stack of bicycles (yet a rewrite, and a bizarre view). And then, of course, we end up with 4 versions of stack, all with identical code, just one data member in the internal struct with different type for all of them. Yuck. There's of course the C kind of solution, always make it a stack of void*, and cast to whatever you want (and just hope you won't cast to the wrong type).

No, the latter alternative isn't an alternative in my book. Type safety is essential for writing solid programs (Smalltalk programmers disagree). The former alternative isn't an alternative either. Just think of this little nightmare, more or less identical pieces of code. When you find a bug (when, not if), you have to correct it in as many places.

OK, so I guess I've explained what templates are for, at least. They're the solution to the above problem. A template is a way of loosening your dependency on a type, without sacrificing type safety (this will get clear later). But how?

Function templates
In the first C++ article, I wrote a set of overloaded functions called "print", which printed parameters of different types. The code in each of those functions was exactly the same (exactly the kind of redundancy that's always to be avoided). This is an ideal place for a template. Here's what a template function for printing something can look like:

    template <class T>
    void print(const T& t)
    {
      cout << "t=" << t << endl;
    }

Weird? OK, time for some demystifying. The keyword "template" says we're dealing with a template. When declaring/defining a template, there's always a template parameter list, enclosed in a '<', '>' pair. The template parameter for this template is "class T". This means that the template deals with a type, some type, called T. The name "T" is of course arbitrarily chosen. It could be any name. Despite the keyword "class", T does not have to be a class, it can be any of the built in types, structs, enumerations, and so on (if you have a modern compiler, it will accept the keyword "typename" instead of "class", although "class" will still work). After this comes the function definition, where T is used just as if it was a legal type.

The code for the template, however, is not a function. It's a function template, something which the compiler uses to create functions. This is very much like a cookie cutter. Once you have a cookie cutter, you can make cookies with the shape of the cutter. More or less any kind of cookie can be made with that cutter. For writing a template function, that's really all there is. Here are some examples using it:

    int main(void)
    {
      print(5);        // print<int>
      print(3.141592); // print<double>
      print("cool");   // print<const char*>
      print(2);        // print<int> again.
      return 0;
    }

When the compiler reads the function template, it does pretty much nothing at all, other than to remember that there's a function template with one template parameter and the name "print". When the compiler reaches "main", and sees the call to "print(5)", it looks for a function "print" taking an "int" as a parameter. There is none, so it expands the template function "print" with the type "int", and actually makes a new function. Note that this is done by the compiler at compile time. The same happens for the other types. "print(2)" uses the same function as "print(5)" does, rather than creating yet another copy of it. The compiler always first checks if there is a function available, and if there isn't, it tries to create it by expanding the function template. This order of things is necessary to avoid unnecessary duplication. Let's compile and run:

    [c:\desktop]icc /Q temp_test.cpp
    [c:\desktop]temp_test.exe
    t=5
    t=3.14159
    t=cool
    t=2
    [c:\desktop]

Please note the different meanings of the terms "function template", and "template function". The "function template" is what you write. It's a template from which the compiler generates functions. The compiler generated functions are the template functions. The "function template" is the cookie cutter, while the "template function" is the cookie. One example of a template function above, is "print<int>()" (i.e. the "int" version of print).

Although it does not seem like it, type safety is by no means compromised here. It's just seen in a somewhat different way. For the function template "print", there's only one requirement on the type T, it must be possible to print it with the "<<" operator to "cout". If the type cannot be printed, a compilation error occurs. To test it, here's what GCC says when trying to print the "intstack" from last month:

    [c:\desktop]gcc temp_test2.cpp -fhandle-exceptions -lstdcpp
    temp_test2.cpp: In function `void print(const class intstack &)':
    temp_test2.cpp:285: no match for `operator <<(class ostream, class intstack)'

Here the compiler generated a new function, called a template function, where every occurrence of "T" (only one, in the function parameter list) is replaced with "intstack". After having generated the function, it compiled it, and noticed the error. GCC delivered a compilation error, since the type "intstack" cannot be printed with "<<" on "cout" (the error message says "ostream", which is correct. We'll deal with that later this fall/early winter, in one or a few articles on C++ I/O). Note that a function is not generated from a function template until a call is seen (the compiler cannot know what types to generate the function for before that).

Templates and exceptions
As you may have noticed, I didn't write an exception specifier for the "print" function template. This was not a mistake, nor was it sloppiness. The drawback with templates is that they make writing exception specifiers a bit difficult. The problem is that I cannot know what kind of type T will be, and I cannot know if operator<< on that type can throw an exception or not. I could try to make the promise that the function "print" does not throw exceptions, by adding the exception specifier "throw ()", but that would not be wise. What if it does? If so, and the function template had an empty exception specifier list, "unexpected" would be called, and the program terminate. Not nice. Note that not writing an exception specifier means that any exception may be thrown. I wish there was a way to say "The exceptions that might be thrown are the ones from operator<< on T" but there is no way to say that, other than as a comment. This problem is something I strongly dislike about C++, but this is how it works, and there's not much to do about it.

Class templates
Just as you can write functions that are independent of type (and yet type safe!) we can write classes that are independent of type. In a sense, class templates exist as builtins in C and C++. You have arrays and pointers (and references) that all act on a type, some type. The type they act on does not change their behaviour, they're still arrays, pointers and references, but of different types. Let's explore writing a simple class template, by improving the old "Range" class from lesson 2. In case you don't remember, the original "Range" looks like this:

    struct BoundsError {};
    class Range
    {
    public:
      Range(int upper_bound = 0, int lower_bound = 0)
        throw (BoundsError);
      // Precondition: upper_bound >= lower_bound
      // Postconditions:
      //   lower == lower_bound
      //   upper == upper_bound

      int lowerBound() throw ();
      int upperBound() throw ();
      int includes(int aValue) throw ();
    private:
      int lower;
      int upper;
    };

This class is a range of int. There's no reason, however, why it shouldn't be a range of any type. Writing a class template is in many ways similar to writing a function template:

    struct BoundsError {};
    template <class T>
    class Range
    {
    public:
      Range(const T& upper_bound = 0, const T& lower_bound = 0);
      // Precondition: upper_bound >= lower_bound
      // Postconditions:
      //   lower == lower_bound
      //   upper == upper_bound
      // Throws:
      //   Bounds error on precondition violation
      //   Whatever T's copy constructor throws.
      //   Whatever operator < on T throws.
      const T& lowerBound() throw ();
      const T& upperBound() throw ();
      int includes(const T& aValue);
      // Throws: Whatever operator>= and operator <= on T
      // throws.
    private:
      T lower;
      T upper;
    };

As can be seen, on line 3, after "template <class T>", T is used just as if it was a type existing in the language. I've changed the constructor so that it accepts the parameters as const reference instead of by value. The reason is performance if T is a large type (if passed by value, the parameters must be copied and the copying may be an expensive operation). I've also removed the exception specifier, and instead used a comment, since after all, there is no way to know if T throws anything. "lowerBound", "upperBound" and "includes" use const T& instead of value, for the same reason as the constructor does. "lowerBound" and "upperBound" can safely have empty exception specifiers, since those member functions do not do anything with the T's. They just return a reference to one of them. "includes" on the other hand, does need the unfortunate comment.

Time for the implementation, which will include some news:

    template <class T>
    Range<T>::Range(const T& upper_bound, const T& lower_bound)
      : lower(lower_bound), // copy constructor
        upper(upper_bound)  // copy constructor
    {
      if (upper < lower) throw BoundsError();
    }

    template <class T>
    const T& Range<T>::lowerBound() throw ()
    {
      return lower;
    }

    template <class T>
    const T& Range<T>::upperBound() throw ()
    {
      return upper;
    }

    template <class T>
    int Range<T>::includes(const T& aValue)
    {
      return aValue >= lower && aValue <= upper;
    }

The syntax for member functions is very much the same as that for function templates. The only difference is that we must refer to the class (of course), and we must specify that it's the template version of the class, by adding "<T>" after the class name. We must also precede every member function with "template <class T>". The reason is that we're not dealing with a complete class, but with a class template, and we must be explicit about that. The compiler will also treat every member function just as any template function, i.e. the code will not be expanded until it is called from somewhere. There isn't much more to say about this. Let's have a look at how it's used:

    #include <iostream.h>
    int main(void)
    {
      Range<int> ri(100,10);
      Range<double> rd(3.141592,-3.141592);
      if (ri.includes(55))
      {
        cout << "[" << ri.lowerBound() << ", "
             << ri.upperBound() << "] includes 55" << endl;
      }
      if (!rd.includes(62))
      {
        cout << "[" << rd.lowerBound() << ", "
             << rd.upperBound() << "] does not include 62" << endl;
      }
      return 0;
    }

Take a careful look at the syntax here. To use a class template, you must explicitly state what type it is for. There is unfortunately no way to say "Range(5,10)" and have the compiler automatically understand that you mean "Range<int>(5,10)". As with function templates, a class template is expanded when it's referred to, so when the compiler first sees "Range<int>", it creates the class, by expanding whatever is needed. The above code calls all members of "Range", but had "includes" not been called, it would not have been expanded. One unfortunate side of this is that "includes" could actually contain errors, and this would be unnoticed by the compiler, until "includes" was called.

Advanced Templates
Now that the basics are covered, we should have a look at some power usage (this section is abstract, so it may require a number of readings). One unusually clever place for templates, is as something called "traits classes". A traits class is never instantiated, and doesn't contain any data. It just tells things about other classes, that is its sole purpose. The name "traits class" is odd. Originally they were called "baggage classes", since they're useless on their own, and belong to something else, but for some reason some people didn't like the name, so it was changed.

My intention is to write a traits class, which tells the name of the type it is specialized for (explanation of that comes below), a function template, which is used to create ranges, without needing to specify the type, and finally, to write a special template print, which prints ranges looking like the constructor call for the range. When done, I will be able to write:

    print(make_range(10,4));

and when executed, see:

    Range<int>(10,4)

Magic? No, just templates! Here we go...

The traits class needed here, is one that tells the name of a type. The class template just looks like this:

    template <class T>
    class type_name
    {
    public:
      static const char* as_string();
    };

This is the way traits classes usually look. No data, and only static member functions. That is, it holds no data, and the member function is declared as "static". A member function declared static, is different from normal member functions, in that it does not belong to an instance, but belongs to the class itself. Here's an example:

    class A
    {
    private:
      int data;
    public:
      void f(void);
      static void g(void);
      static void h(void);
    };

    void A::f(void)
    {
      cout << data << endl;
    }

    void A::g(void)
    {
      cout << data << endl; // error! Cannot access data.
    }

    void A::h(void)
    {
      cout << "A::h" << endl;
    }

    int main(void)
    {
      A a;
      a.f();  // prints something.
      a.h();  // prints "A::h"
      A::h(); // also prints "A::h"
      A::f(); // Error, f is bound to an object, and must be
              // called on an object.
      return 0;
    }

"A::g()" is in error, because it's declared static, and thus not bound to any object, and as such cannot access any member data, since member data belongs to objects. Calling "A::f()" is an error, since it is not static. This means it belongs to an object, and must be called on an object (through the "." operator). The calls "a.h()" and "A::h()" are synonymous. Since "h" is not tied to an object, it can be called through the class scope operator "A::", which means it's the "h" belonging to the class named "A".

Now back to traits classes. A traits class, is a simple class template. The class template is the general way of doing things, but if you want the class to take some special care for a certain type, you can do what's called a specialization. The whole idea for traits classes is one of "specialization". A member function specialization is usually not declared, just defined, like this:

    const char* type_name<char>::as_string()
    {
      return "char";
    }

Doesn't seem too tricky, now does it? There actually is no catch in this. This is how traits classes usually look. They have a template interface, the class, which declares a number of static member functions. Normally, the template member functions are not defined, instead specializations are. Those member functions are intended to tell something about some other class, nothing else. Their purpose is only to tell something about other classes.

The syntax has changed, so compilers very much up to date with the standardization require you to write like this:

    template <>
    const char* type_name<char>::as_string()
    {
      return "char";
    }

A minor, understandable difference, if you have a top modern compiler. The "template <>" part clarifies that it's a template we're dealing with, but the template parameter list is empty, since we're specializing for known types.

Now, we can use the "type_name" traits class for "char" as follows:

    cout << type_name<char>::as_string() << endl;

You can of course make any specializations you like. Please add all the fundamental types. If we try for a type we haven't specialized, such as "double", we'll get an error when compiling.

Now over to the print template. It's supposed to accept an instance of a "Range", and print it, just as the constructor call for the "Range" was done. Piece of cake:

    template <class T>
    void print(const Range<T>& r)
    {
      cout << "Range<" << type_name<T>::as_string() << ">("
           << r.upperBound() << ", " << r.lowerBound() << ")" << endl;
    }

Here we see two new things. The parameter for the function template need not be the template parameter itself. It needs to be something that makes use of the template parameter, though (for all except the absolutely newest compilers, all template parameters must be used in the parameter list for the function). The other new thing is how elegantly the "type_name" traits class blends with the function template. We can now write:

    print(Range<int>(10,5));

And it will work (if we specialize "type_name<int>::as_string()", that is). Note also that this means we cannot print ranges of types for which the "type_name" traits class is not specialized. If we try, you'll get a compilation error.

Now we're almost there. Now for the last detail, the function template that creates "Range" instances, which with the above seems fairly simple.

    template <class T>
    Range<T> make_range(const T& t1, const T& t2)
    {
      return Range<T>(t1,t2);
    }

The function template is by the compiler translated to a template function, using the types of the parameters. When the type is known, it will know what kind of "Range" to create and return. Of course, if the types differ in a call, the compiler will give you an error message. We can now write:

    print(make_range(10,5));

just as we planned to. Neat, eh?

For being such an incredibly simple construct, the traits classes are unbelievably useful. If you want to learn more about traits classes, have a look at Nathan Myers' traits article from the June '95 issue of C++ Report.
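For readers who want to try the three pieces of this section together, here is one possible way to assemble them with a current compiler. It is a sketch, not the article's own listing: the accessors are declared "const" so they can be called through a const reference, the old exception specifiers are dropped, and "describe" and "check_describe" are hypothetical helpers that build the text as a string instead of writing to "cout".

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct BoundsError {};

template <class T>
class Range {
public:
    Range(const T& upper_bound, const T& lower_bound)
        : lower(lower_bound), upper(upper_bound) {
        if (upper < lower) throw BoundsError();
    }
    const T& lowerBound() const { return lower; }
    const T& upperBound() const { return upper; }
private:
    T lower;
    T upper;
};

// Traits class: declared for every T, defined only through specializations.
template <class T>
struct type_name {
    static const char* as_string();
};
template <> inline const char* type_name<int>::as_string() { return "int"; }
template <> inline const char* type_name<double>::as_string() { return "double"; }

// Deduces T from the arguments, so the caller never spells out the type.
template <class T>
Range<T> make_range(const T& t1, const T& t2) {
    return Range<T>(t1, t2);
}

// Hypothetical helper: same text as the article's "print", returned as a string.
template <class T>
std::string describe(const Range<T>& r) {
    std::ostringstream os;
    os << "Range<" << type_name<T>::as_string() << ">("
       << r.upperBound() << ", " << r.lowerBound() << ")";
    return os.str();
}

// Small self-check of the deduction and the traits lookup.
void check_describe() {
    assert(describe(make_range(10, 4)) == "Range<int>(10, 4)");
}
```

Calling "describe" for a range of a type without a "type_name" specialization fails at link time here, because the general "as_string" is declared but never defined.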

Exercises
• Biggie: Rewrite last month's "intstack" as a class template, "stack<T>". What happens if the copy constructor, operator== or destructor of T throws exceptions?
• When can you, and when can you not use exception specifiers for templates?
• What are the requirements on the type parameter of the templatized "Range"? Can you use a range of "intstack"?
• What are the requirements on the type parameter of the templatized "stack<T>"?

Recap
Quite a lot of news this month. You've learned:
• how to write type independent functions with templates, without sacrificing type safety.
• how the compiler generates the template functions from your function template.
• about template classes, which can contain data of a type not known at the time of writing.
• that templates restrict the usefulness of exception specifiers.
• how to specialize class templates for known types.
• how to write and use traits classes.

Coming up
If the standard library's up and running on Watcom, that'll be next month's topic, otherwise it'll be C++ I/O streams. As always, send me e-mail at once, stating your opinions, desires, questions and (of course) answers to this month's exercises!

Introduction
We've seen how the fundamental types of C++ can be written to the screen with "cout << value" and read from standard input with "cin >> variable". This month, you will learn how you can do the same for your own classes and structs. It's surprisingly easy to do.

Exploring I/O of fundamental types
Formatted I/O, is not part of the language proper in C++ (or in C for that matter.) It's handled by an I/O library, that's implemented in the language. (If you're familiar with Pascal, try to implement something like Write and WriteLn in Pascal. You can't, the language doesn't allow it, that's why it's built into the language itself.)

We've seen a number of times how we can print something with "cout << value". How can this be expressed in the language? To begin with, the syntax is legal only because you can overload operators in C++. You've already seen that with operator=. Let's see what actually happens when we use operator=.

    class X
    {
    public:
      ...
      X& operator=(int i);
      ...
    };

    X x;
    ...

    x=5; //**

At the last line of the example, what actually happens is that operator= is called for the object named "x". Another way of expressing this is:

    x.operator=(5);

In fact, this syntax is legal, and it generates identical code. This is important, because this is how the compiler will treat the more human-readable form "x=5". As we can see then, an operator overloaded in a class, is just like any other member function, it's just called in a peculiar form.

Let's go back to printing again. "cout" is an object of some class, which has operator<<(T) overloaded, where T is any of the fundamental types of C++. The relevant section of the class definition looks as follows:

    class ostream
    {
      ...
    public:
      ...
      ostream& operator<<(char);
      ostream& operator<<(signed char);
      ostream& operator<<(unsigned char);
      ostream& operator<<(short);
      ostream& operator<<(unsigned short);
      ostream& operator<<(int);
      ostream& operator<<(unsigned int);
      ostream& operator<<(long);
      ostream& operator<<(unsigned long);
      ostream& operator<<(float);
      ostream& operator<<(double);
      ostream& operator<<(long double);
      ostream& operator<<(const char*);
      ...
    };

The value returned by each of these is the stream object itself (i.e. if you call "operator<<(char)" on "cout", the return value will be a reference to "cout" itself.) With the above in mind, we can see that writing

    int i;
    double d;
    ...
    cout << i << d;

is synonymous with

    int i;
    double d;
    ...
    (cout.operator<<(i)).operator<<(d);

The only difference for reading is that the class is called "istream" instead, and that the operator used is operator>>().

I/O with our own types
The most important thing to recognise is that our own types (classes and structs) always consist of fundamental types. The C++ I/O library only supports I/O of the fundamental types, so if our own data types consisted of something completely different, I/O would be very difficult indeed.

So, how do we make sure we can do I/O on ranges and stacks (from the earlier lessons?) What about extending our own class with the members operator<< and operator>>? This would, sort of, work, but the syntax would change. As I wrote above, "a << b" is identical with "a.operator<<(b)", and if we add operators << and >> to our class, we'll require our object on the left hand side, and the stream to print on/read from, on the right hand side, and that's not what we want. Another possible way of doing this is to edit the ostream and istream class to contain operator<< and operator>> for our own classes. Does that seem like a good idea to you? It doesn't to me.

The solution does yet again lie in operator overloading, but this time in a somewhat different way. We just saw how we can overload an operator for a class, such that the operator becomes a member function for that class (only, in its use, the syntax differs.)
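The claim that the operator form and the explicit member call are the same thing can be checked directly. The sketch below assumes today's standard library rather than the older "iostream.h" described here: in the standard "std::ostream" the inserters for "int" and "double" are member functions, while those for "char" and "const char*" are non-member functions, so only the numeric ones are called in member form. The function names and "check_equivalence" are made up for the illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct X {
    int value = 0;
    X& operator=(int i) { value = i; return *this; }
};

// Writes i and d with the usual operator syntax.
std::string with_operator_syntax(int i, double d) {
    std::ostringstream os;
    os << i << d;
    return os.str();
}

// Writes i and d by calling the member functions explicitly.
std::string with_member_call_syntax(int i, double d) {
    std::ostringstream os;
    (os.operator<<(i)).operator<<(d);
    return os.str();
}

// Small self-check: both spellings produce the same characters.
void check_equivalence() {
    X x;
    x.operator=(5);  // same call the compiler makes for "x = 5"
    assert(x.value == 5);
    assert(with_operator_syntax(5, 2.5) == with_member_call_syntax(5, 2.5));
}
```

Each call returns a reference to the stream, which is what allows the second call to be chained onto the result of the first.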

It's also possible to overload operators, such that the operator becomes a function, provided that at least one of the parameters to the operator is not a built-in type. Most operators can be defined like a nonmember function. Such is the case for our new friends operator<< and operator>>.

Let's revisit our old friend, the class "Range." This is the definition of "Range", for those who do not have old issues handy (I've added "const" on the member functions. See part 3 for details if you've forgotten):

    struct BoundsError {};
    class Range
    {
    public:
      Range(int upper_bound = 0, int lower_bound = 0)
        throw (BoundsError);
      // Precondition: upper_bound >= lower_bound
      // Postconditions:
      //   lower == lower_bound
      //   upper == upper_bound
      int lowerBound() const throw ();
      int upperBound() const throw ();
      int includes(int aValue) const throw ();
    private:
      int lower;
      int upper;
    };

How should this thing be printed and read? Here's a wishlist.
1. The syntax and semantics for printing must be the same as for the fundamental types of C++.
2. Full commit or roll back, that is, either we print all there is to be printed, or we print nothing at all.
3. The print must be in a form distinguishable from, say, two integers separated by a comma.
4. Full type safety
5. Encapsulation not violated.
6. We want printing and reading synchronized (i.e. if we read something, then print a range, we want the reading to complete before printing, and if we print something, then read something, we want it all printed before reading again.) Since both reading and writing is normally buffered, it is not at all obvious that this will occur. Normally, the C++ I/O library handles just exactly this for you, now that you know what it's for.
7. No unnecessary computations.

All of these are possible, but #2, #6 and #7 are usually skipped. I'll skip #2 for now. We'll reduce it a little bit, to be more realistic, later.

What's the appearance we want of a range when printed, and what format should we accept when reading? A golden rule in I/O (and not just in C++) is to be very strict in your output, but very liberal in what you accept as input. The format I chose is "[upper,lower]", no spaces anywhere. On input however, white space is allowed before the first bracket, and between any of the tokens (the tokens here are '[', number, ',' and ']').

OK, so now we have a pretty good picture on what to do, now, how? Overloading operator<< as a global function. The signature becomes:

    ostream& operator<<(ostream&, const Range&);

This declares a function, which has the syntax of a left shift operator, and accepts two parameters. If we have code like:

    Range r;
    ...
    cout << r;

The compiler will treat it as

    operator<<(cout, r);

This even works for more complex expressions, like:

    Range r;
    int i;
    int j;
    ...
    cout << r << i;
    cout << i << r << j;

Which the compiler interprets as:

    operator<<(cout,r).operator<<(i);
    operator<<(cout.operator<<(i),r).operator<<(j);

Study these examples carefully, to make sure you understand what's going on.

Now, after these examples, it's fairly easy to get down to work with implementing the operator<< function, given the facts known this far (more is needed, as you will see further down.)

    ostream& operator<<(ostream& os, const Range& r)
    {
      os << '[' << r.upperBound() << ',' << r.lowerBound() << ']';
      return os;
    }

Here "r" is passed as const reference, since the function does not alter "r" in any way (and promises it won't.) The stream, "os", is passed by non-const reference. This is essential. Printing does alter a stream. It will not be the same after printing as it was before printing. It is not possible to pass it by value, since passing by value means copying, and copying a stream doesn't make much sense (think about it.) Inside the function, we're printing known types, char and int, so the operator<< provided by the I/O class library suits just fine.

How well does this suit the 7 points above? The syntax is correct, and the semantics are too. The format is distinct enough, we have type safety and encapsulation is not violated. We do not have full commit or rollback, but I mentioned already in the beginning that we'll skip that for now. However, we do make some unnecessary computations if the stream is bad in one way or the other. Say, for example, we have a detached process. Detached processes do not have standard output and standard input (unless redirected) and as such printing will always fail. Why then, even try? We also do not synchronize our output with input.

This is as far as most books on C++ cover when it comes to printing your own types. I don't know why it's just about always skipped, since it isn't more difficult than this to avoid unnecessary computations and synchronize input with output. The check and synchronization is simple to make, but oddly enough not mentioned in most C++ books. I dare you to find this in a C++ book (I know of one book.)

    ostream& operator<<(ostream& os, const Range& r)
    {
      if (!os.opfx())
        return os;
      os << '[' << r.upperBound() << ',' << r.lowerBound() << ']';
      os.osfx();
      return os;
    }

The "prefix" ("opfx" means "output prefix") function checks for a valid stream, and also synchronizes output with input. The "suffix" ("osfx" means "output suffix") signals end of output, so that synchronized input streams can begin accepting input again.

That was printing, how about reading? The signature and general appearance of the function is pretty much clear from the above discussions. Let's make a try:

    istream& operator>>(istream& is, Range& r)
    {
      if (!is.ipfx())
        return is;
      char c;
      is >> c;
      if (c != '[')
        // signal error somehow and roll back stream.
      int upper;
      is >> upper;
      is >> c;
      if (c != ',')
        // signal error somehow and roll back stream.
      int lower;
      is >> lower;
      is >> c;
      if (c != ']')
        // signal error and roll back stream.

    r=Range(upper,lower);
    is.isfx(); // ERROR! Does not exist!
    return is;
  }

Hmm, so reading wasn't as easy. There are three issues above that need to be resolved: how to signal error, how to roll back the stream, and how to deal with the suffix function.

Let's begin from the easy end, the suffix function. The solution is that there isn't one, since the guess "is.isfx()" was wrong. The problem is fixed by removing the faulty line (don't you just love bugs that you fix solely by removing code!)

Rolling back the stream is interesting indeed, since it's very difficult to do. In fact, it's almost impossible, so we needn't even try. We can put back a character. One character, that is all that is guaranteed to work. So, our only chance is if the first character read is not right. Putting back a character is done with "istream::putback(char)". It's also absolutely necessary that the character put back is the same as the last one read, otherwise the behaviour is undefined (which literally means all bets are off, in theory the program may do *anything*, but in practice it means you cannot know if it just backs a position, or actually changes the character.)

The obvious solution to signalling an error, to throw an exception, is wrong. The reason is conceptual. Use exceptions to signal exceptional situations, and other means to handle the expected. Remember you're dealing with input generated by human beings here. The wrong input *is* expected, and thus not exceptional. Sure you can demand that the users of your program enter the exact right data in the exact correct format every time, but you won't be very popular among them, and soon will have none. In other words, erroneous user input is expected, and thus not to be handled with exceptions.

How then? A stream object has an error state consisting of three orthogonal failure flags, "bad", "eof" and "fail". "bad" is something we hope to never see, since it means the stream is really out of touch with reality and we cannot trust anything from it (I've only seen this one once, and it was due to a bug in a library!) I guess we can expect "bad" if reading from a file, and hit a bad sector. "eof" is used to signal end of file, a not too unusual situation (as a matter of fact, a situation most programs rely on,) but if it occurs in the middle of reading something, it's usually a failure. "fail" is the one we're interested in here, it's used to signal that we received something that was not what we expected, but the stream itself is OK.

OK, what we should do if we read something unexpected, is to set the stream state to "fail." This is done with the odd named member function "clear(int)". "clear" sets the status bits of the stream to the pattern of the integer parameter (which defaults to 0, so if nothing is passed, the name makes sense.) The bits we can set are "ios::badbit", "ios::failbit" and "ios::eofbit". We can get the current status bits by calling "is.rdstate()", and usually we want to do that when setting or resetting a status bit, since we want to affect only that bit, and leave the other bits as they were before the call. The status bits can also be checked with the calls "is.bad()", "is.fail()" and "is.eof()" (which return 0 if the bit they represent is not set, and non-zero otherwise.) A fourth call "is.good()" returns non-zero if no error state bits are set, and 0 otherwise.

Now with the above in mind, let's make another try:

  istream& operator>>(istream& is, Range& r)
  {
    if (!is.ipfx())
      return is;
    char c;
    is >> c;
    if (c != '[')
    {
      is.putback(c);
      is.clear(ios::failbit|is.rdstate());
      return is;
    }
    int upper;
    is >> upper >> c;
    if (c != ',')
    {
      is.clear(ios::failbit|is.rdstate());
      return is;
    }
    int lower;
    is >> lower >> c;
    if (c != ']')
    {
      is.clear(ios::failbit|is.rdstate());
      return is;

    }
    if (is.good())
    {
      if (upper >= lower)
        r=Range(upper,lower);
      else
        is.clear(ios::failbit|is.rdstate());
    }
    return is;
  }

This actually solves the problem as far as is possible, and in fact better than what can be found in most books on the subject. The call to "is.ipfx()" not only synchronizes the input stream with output streams, but also checks for error conditions and reads past leading white space. If the first character read is not a '[', we put the character back and set the fail bit (the order is important, "putback" is not guaranteed to work if the stream is in error,) and return. After this we read the upper limit of the range, and the separator. If the separator is not ",", we set the stream state to failed and return. Then read the lower limit and the terminator. If the terminator is not ']', mark the stream as failed, and return. Note that operator>> for built in types skips leading whitespace, so we needn't work on that at all. If reading of either upper limit or lower limit failed, the stream is set to fail state, and other reads will not do anything at all (not even alter the stream error state,) thus the check near the end for "is.good()" is enough to know if all parts were read as we expected. If they were, all we need to do is to check that the upper limit indeed is at or above the lower limit (precondition for the range) and if so set "r" (since we haven't declared an assignment operator, the compiler did it for us, so the call is valid,) otherwise set the fail error state.

How well do we match the 7 item wish list? You check and judge. I think we're doing fine.

Formatting

There are a number of ways in which the output format of the fundamental types of C++ can be altered, and a few ways in which the requirements on the input format can be altered. All of these, and yet some, are controlled with a few formatting flags, and a little data. For integral types, for example, the base can be set (decimal, octal, hexadecimal), a field width can be set, and alignment within that field. For floating point types the format can be fixed point or scientific. All flags are set or cleared with the member functions "os.setf()" and "os.unsetf()". I think they're difficult to use, but fortunately there are easier ways of achieving the same effect, and we'll visit those later.

The base for integral output is altered with a call to "os.setf(v, ios::basefield)", where v is one of "ios::hex", "ios::dec" or "ios::oct". As a small example, consider:

  #include <iostream.h>

  int main(void)
  {
    int i=19;
    cout << i << endl;
    cout.setf(ios::hex, ios::basefield);
    cout << i << endl;
    cout.setf(ios::oct, ios::basefield);
    cout << i << endl;
    cout.setf(ios::dec, ios::basefield);
    cout << i << endl;
    return 0;
  }

The result of running this program is:

  19
  13
  23
  19

The base is converted as expected, but there is no way to see what base it is. This can be improved with the formatting flag ios::showbase, so let's set that one too.

  #include <iostream.h>

  int main(void)
  {
    int i=19;
    cout.setf(ios::showbase);
    cout << i << endl;

it bitwise "or"es the current bit-pattern with the one provided as the parameter. cout. Let's alter the width setting program to show the behaviour. and leaves the others unchanged (in other words.cout.setf()". though. return 0.width(void).setf(ios::hex.width(10). is potentially dangerous (what if "ios::oct" was already set? Then you'd end up with both being set. or the field width set is smaller than that necessary to represent the value to be printed. "ios::right". cout. The three of these are mutually exclusive. for sure. or "ios::internal". now for something that's common to all types.setf(ios::hex)". the width set does not affect the printing separate characters.) That was setting the base for integral types. let's try it out: #include <iostream. cout. cout << i << endl.) Now you begin to see why this is messy. The second form. As with the base for integral types. cout << '[' << -55 << ']' << endl. and the width is reset after printing the first thing that uses it. ios::basefield). ios::basefield). #include <iostream. The field width is set with "os. the other one a full set of flags only. sets the flags sent as parameter. } Executing this programs shows something interesting.h> int main() { cout << '[' << -55 << ']' << endl. right? The call to "setf()" for setting the "ios::showbase" flag is different. If the field width is not set. let's play with alignment within a field. or all three of these flags at the same time. the three alignment forms are mutually exclusive.width(int)". then "ios::oct" and "ios::dec" will be cleared. field width and alignment.h> int main() . cout << -55 << ']' << endl. and the version with the mask clears the bits represented by the mask.setf(ios::oct. Now.) The second parameter "ios::basefield" guarantees that if you set "ios::hex". cout << '['.setf(ios::dec. "setf()" is overloaded in two forms. it's not a very good idea (yields undefined behaviour. the one accepting only one parameter. 
Formatting bits not represented by the mask will remain unchanged. Alignment is set with the two parameter version of "os. If the masked version is called. where the first parameter is one os "ios::left". and the curious can get the current field width by calling "os. cout << i << endl. and the second parameter is "ios::adjustfield". ios::basefield). } The output of this program is 19 0x13 023 19 That's more like it. cout << i << endl. All the formatting flags of the iostreams are represented as bits in an integer. The result of running the program is shown below: [-55] [ -55] [ -55] [-55] Had you expected this? I didn't. cout. the only formatting flags of the stream that will be affected are "ios::hex" or "ios::dec" or "ios::oct". alignment does make a difference. but if there's extra room." Simple enough. alignment doesn't matter. so a call to "os. except those explicitly set by the first parameters. and the mask is "ios::basefield". so don't set two of them at the same time. This is not very intuitive I think. cout << '[' << -55 << ']' << endl. While it's possible to set two.width(10). One accepts a set of flags and a mask. return 0.

  {
    cout.setf(ios::right, ios::adjustfield);
    cout << '[' << -55 << ']' << endl;
    cout.setf(ios::left, ios::adjustfield);
    cout << '[' << -55 << ']' << endl;
    cout.setf(ios::internal, ios::adjustfield);
    cout << '[' << -55 << ']' << endl;
    cout.width(10);
    cout.setf(ios::right, ios::adjustfield);
    cout << '[' << -55 << ']' << endl;
    cout.width(10);
    cout.setf(ios::left, ios::adjustfield);
    cout << '[' << -55 << ']' << endl;
    cout.width(10);
    cout.setf(ios::internal, ios::adjustfield);
    cout << '[' << -55 << ']' << endl;
    return 0;
  }

The result of running this is, after the above explanations, not very surprising:

  [-55]
  [-55]
  [-55]
  [       -55]
  [-55       ]
  [-       55]

Well, I found the formatting of "ios::internal" to be a bit odd, but it kind of makes sense. If the field width is larger than that required for a value, the current alignment defines where in the field the value will be, and where in the field space will be. Space, by the way, is just the default. We can change the "padding character", by calling "os.fill(char)", and get the current value with a call to "os.fill(void)". Let's exercise that one too:

  #include <iostream.h>

  int main()
  {
    cout.fill('.');
    cout.width(10);
    cout << -5 << endl;
    cout.width(10);
    cout << -5 << endl;
    return 0;
  }

Running it yields the surprising result

  ........-5
  ........-5

Why was this surprising? Earlier we saw that the field width is "forgotten" once used. The pad character, however, remains the same until explicitly changed.

Now that you have the general idea, why not try the other formatting flags there are:

• ios::fixed and ios::scientific control the format of floating point numbers (the mask used is ios::floatfield.)
• ios::showpos controls whether a "+" should be prepended to positive numbers or not (just like a "-" is prepended to negative numbers.)
• ios::uppercase controls whether hexadecimal digits should be displayed with upper case letters or lower case letters.
• ios::showpoint controls whether the decimals should be shown for floating point numbers if they are all zero.

The only thing remaining for formatting is "os.precision", which comes in two flavours. One without parameters which reports the current precision, and one with an int parameter. The unpleasant thing about this parameter, is that many compilers interpret it differently. Some think the precision is the number of digits after the decimal point, while most think it's the number of digits to display. The November 1997 draft C++ standards document (which, by the way, most probably is the final C++ standards document,) says the number of digits after the decimal point is what's controlled, but I'm not sure if that's what the current standards document says. At any rate, inconsistencies aside, this is a mess, isn't it?

An easier way

The authors of the I/O package realized that this is a mess, so they defined something called "manipulators." You've already used one manipulator a lot, "endl." A manipulator may, or may not, print something on the stream, but it will alter the stream in some way. For example "endl" prints a new line character, and flushes the stream buffer. There are two kinds of manipulators, those accepting a parameter, and those that do not. Let's first focus on those that don't. The ones available are: "dec", "hex", "oct", "endl", "ends", and "flush". Their use is simple:

  #include <iostream.h>

  int main(void)
  {
    cout << hex << 127 << " " << oct << 127 << " " << oct << 127 << endl;
    return 0;
  }

The advantage of this is both that the code becomes clearer, and that there's no way you can accidentally set illegal base flag combinations. "ends" is rarely used, it's there to print a terminating '\0' (the terminating '\0' of strings is never printed normally.) "flush" flushes the stream buffer (i.e. forces printing right away.)

How do these manipulators work? There's a rather odd looking operator<< for output streams. It looks like:

  ostream& operator<<(ostream& (*f)(ostream&))
  {
    return f(*this);
  }

Now, what on earth does this mean? It means that if you have a function accepting an ostream& parameter, and returning an ostream&, that function can be "printed," and if you do, the function will be called with the stream as its parameter. Let's exercise this by rolling our own "left" alignment manipulator:

  ostream& left(ostream& os)
  {
    os.setf(ios::left, ios::adjustfield);
    return os;
  }

This function matches the required signature, so if we "print" it with "cout << left", the above mentioned operator<< is called, and it in its turn calls the function for the stream, so "cout << left" actually ends up as "left(cout)". Cool, eh? Roll your own "right" and "internal" manipulators as a simple exercise (they're handy too.)

Then there are some manipulators accepting a parameter. To access them, you need to #include <iomanip.h>. The ones usually accessed from there are "setw" (for setting the field width,) "setprecision", and "setfill". Their use is fairly straightforward and doesn't require any example.

Every compiler I've seen provides its own mechanism for writing such manipulators, so doing it in a portable way is very difficult. Or actually, it isn't if you skip the mechanism offered by your compiler vendor and do the job yourself, because it really is simple. Let's write one that prints a defined number of spaces:

  class spaces
  {
  public:
    spaces(int s) : nr(s) {}
    ostream& printOn(ostream& os) const
    {
      for (int i=0; i < nr; ++i)
        os << ' ';
      return os;
    }
  private:
    int nr;
  };
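To tie the manipulator mechanism above to today's headers, here's a minimal sketch in standard C++ (<iostream>, <iomanip> and the std:: prefix.) The names "align_left" and "padded" are made up for this illustration, and the output goes to a string stream only so that the result is easy to inspect. Standard C++ already ships a "left" manipulator of its own, which is why the hand-rolled one gets another name here.

```cpp
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical name: a hand-rolled manipulator. Any function that takes
// and returns std::ostream& can be "printed" with operator<<.
std::ostream& align_left(std::ostream& os)
{
    os.setf(std::ios::left, std::ios::adjustfield);
    return os;
}

// Hypothetical helper: formats a value into a 6 wide, '.' padded field,
// left aligned on request, and returns the text.
std::string padded(int value, bool left)
{
    std::ostringstream os;
    if (left)
        os << align_left;                 // ends up as align_left(os)
    os << std::setfill('.') << std::setw(6) << value;
    return os.str();
}
```

With these definitions "padded(42, false)" gives "....42" and "padded(42, true)" gives "42....", since right alignment is the default and the fill character sticks until changed.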

  ostream& operator<<(ostream& os, const spaces& s)
  {
    return s.printOn(os);
  }

Can you see what happens if we call "cout << spaces(40)"? First the object of class "spaces" is created, with a parameter of 40. That parameter is in the constructor stored in the member variable "nr". Then the global operator<< for an ostream& and a const spaces& is called, and that function in its turn calls the printOn member function for the spaces object, which goes through the loop printing space characters. I think writing manipulators requiring parameters this way is lots easier than trying to understand the non-portable way provided by your compiler vendor.

Recap

This month you've learned a number of things regarding the fundamentals of C++ I/O. For example

• How to set, clear and recognise the error state of a stream.
• Why exceptions are not to be used when input is wrong.
• How to make sure your own classes can be written and read.
• The very messy, and the somewhat less messy way of altering the formatting state of a stream.
• How to write your own stream manipulators.

Exercises

• Find out which formatting parameters "stick" (like the choice of padding character) and which ones are dropped immediately after first use (like the field width.)
• Experiment with the formatting flags on input, which have effect, and which don't? Of those that do have an effect, do they have the effect you expect?
• Write an input manipulator accepting a character, which when called compares it with a character read from the stream, and sets the ios::fail status bit if they differ.
• Now something for you to think about until next month, what about our I/O of our own classes with respect to the formatting state of the stream? How's the "Range" class printed if the field width and alignment is set to something? How should it be printed (hint, you probably want it printed differently from what will be the case if you don't take care of it.) With the above in mind, and remembering that destructors can be put to good work, write a class which will accept an ostream as its constructor parameter, and which on destruction will restore the ostream's formatting state to what it was on construction.

Standards update

• The prefix and postfix functions are history. Instead you create an object of type istream::sentry or ostream::sentry, and check it, like this:

  istream& work(istream& is)
  {
    istream::sentry cerberos(is);
    if (cerberos)
    {
      ...
    }
    return is;
  }

• The destructor of the sentry object does the work corresponding to that of the postfix function.
• "istream" and "ostream" are in fact not classes in the standard, but typedef's for class templates. The class templates are template <class charT, class traits> class basic_istream<charT, traits>, and template <class charT, class traits> class basic_ostream<charT, traits>. "istream" is typedefed as "basic_istream<char, char_traits<char> >", and "ostream" as "basic_ostream<char, char_traits<char> >". There's also the pair "wistream" and "wostream", that are streams of wide characters.
• Any operation that sets an error status bit may throw an exception. Which error status bits cause exceptions to be thrown is controlled with an exception mask (a bit mask.) By default, though, no exceptions are thrown.
• The mechanism for writing manipulators is standardised (and heavily based on templates.) I still think it's easier to write a class the way I showed you.
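To make the sentry idea concrete, here's a minimal sketch of "Range" I/O written against the standard headers. The "Range" below is only a stand-in with the interface assumed from this series (it uses assert where the original throws BoundsError.) The input operator is deliberately simpler than the version developed earlier: it doesn't put back the first character, and it uses "setstate", which has the same effect as "clear(failbit | rdstate())".

```cpp
#include <cassert>
#include <iostream>
#include <sstream>

// Stand-in for the article's Range class (interface assumed).
class Range
{
public:
    Range(int upper_bound = 0, int lower_bound = 0)
        : upper(upper_bound), lower(lower_bound)
    {
        assert(upper >= lower);   // precondition of the range
    }
    int lowerBound() const { return lower; }
    int upperBound() const { return upper; }
private:
    int upper;
    int lower;
};

std::ostream& operator<<(std::ostream& os, const Range& r)
{
    std::ostream::sentry guard(os);   // takes the place of opfx()/osfx()
    if (guard)
        os << '[' << r.upperBound() << ',' << r.lowerBound() << ']';
    return os;
}

std::istream& operator>>(std::istream& is, Range& r)
{
    char open = 0, sep = 0, close = 0;
    int upper = 0, lower = 0;
    // Each formatted extraction creates its own sentry. Once one of them
    // fails, the remaining ones do nothing.
    if (is >> open >> upper >> sep >> lower >> close)
    {
        if (open == '[' && sep == ',' && close == ']' && upper >= lower)
            r = Range(upper, lower);
        else
            is.setstate(std::ios::failbit);   // unexpected input, stream itself is OK
    }
    return is;
}
```

Reading "[7,2]" from a string stream sets the range and leaves the stream good, while "[1,5]" or "(7,2)" leaves the range untouched and the stream in the fail state.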

• Formatting of numeric types (and time) is localised. By default most implementations will probably use the same formatting as they do today, but with the support for "imbuing" streams with other locales (formatting rules.)
• The header name is <iostream> (no .h) and the names are actually std::istream and std::ostream (everything in the C++ standard library is named std::whatever, and every standard header is named without trailing .h)

Coming up

Next month we'll have a look at inheritance. Inheritance is what Object Orientation is all about. Inheritance is a way of expressing commonality. People who enjoy and understand the philosophy of Platon will feel at home.

As always, send me e-mail at once, stating your opinions, desires, questions and (of course) answers to this month's exercises!

Introduction

I admit it, I've had very little inspiration for writing this month. Maybe it's the winter darkness, or the to be switch of jobs. However, instead of leaving you in the dark for a month, here's some food for thought, even though it's short and not the planned details on inheritance.

Example: Study this function template:

  template <class IN, class OUT>
  OUT copy(IN begin, IN end, OUT dest)
  {
    while (begin != end)
    {
      *dest = *begin;
      ++begin;
      ++dest;
    }
    return dest;
  }

What does it mean? Let us first have a look at the requirements on the types for IN and OUT.

IN: Must be copy-able (since the parameters are passed by value, not by reference.) Must be comparable with operator !=. Operator ++ (prefix) must be allowed. Must have an operator*, whose return value must be something which can be assigned to the result of operator* for OUT.

OUT: Must be copy-able, both for the parameter and the return value. Operator++ (prefix) must be allowed. Must have operator*, whose return value must be assignable.

So, what does the above do? The name no doubt gives a hint, but let's analyze it. One match for "IN" and "OUT" is obvious: a pointer in an array. Let's say that the types are "T1*" and "T2*", then this is legal for all types "T1" that can be assigned to (or implicitly converted to) a type "T2". For example:

  int iarr[]={ 12,23,34,45};
  const size_t isize=sizeof(iarr)/sizeof(iarr[0]);
  double darr[isize];
  copy(iarr,iarr+isize,darr);

At the call-site, the function template is expanded to a template function, as follows:

  double* copy<int*,double*>(int* begin, int* end, double* dest)
  {
    while (begin != end)
    {
      *dest = *begin;
      ++begin;
      ++dest;
    }
    return dest;
  }

What this function does is to assign the value pointed to by "dest" the value of the dereferenced pointer "begin", and then increment "begin" and "dest," as long as "begin" does not equal "end". This puts a requirement on "begin" and "end": by using operator++ (prefix only) on "begin", it must be possible to reach a state where "begin" does not compare unequal to "end." For example, "begin" might be the beginning of an array, and "end" the end of one, like this:

                 +-----+-----+-----+-----+-----+
  primes_lt_10 = |  2  |  3  |  5  |  7  |XXXXX|
                 +-----+-----+-----+-----+-----+
                    ^                       ^
                    |                       |
                  begin                    end (points to non-existing element
                                                one past the last one.)

Note, however, that when "begin" and "end" are equal, no copying is done. That is, "copy(a,a,b)" does nothing at all (except return "b"). This means that to copy an entire array, like in the example above, "end" must point one past the last element of the array. Fortunate as it is, this is legal in C and C++. It is illegal to dereference the "one-past-the-end" pointer, but the value is legal, and decrementing it will make it point to the last element (as opposed to making it point n elements past the end, where n>1, which yields "undefined behaviour", i.e. all bets are off.)

This is useful. If the source type can be implicitly converted to the destination type, we can now copy arrays (or parts of arrays) of any type to arrays (or parts there of) of any type, by using the copy function template. Very useful, but I can assure you, the fun has only begun. The real joy begins when we realize we can write our own types that behave the same way.

Other uses

Let's assume we want to print the contents of an array. How can we do this? Of course we can, as usual, use a loop over all elements and print them. However, we can use the copy function template to do it. How? Printing an array (actually, the values in an array) means copying the values from the array to the screen, through some conversion (the output formatting.) The problem thus becomes, how do we make a type that does the necessary conversion, prints on the screen, and conforms to the requirements stated for the template parameter "OUT?"

The secret seems to lie in the lines "*dest = *begin" and "++dest". As I see it, there are two alternatives. Either "*dest = *begin" prints the value of "*begin" on the screen, and "++dest" does nothing at all, or "*dest = *begin" makes our variable "dest" remember the value to print and on "++dest" it does the printing. I will do the former. You can do the latter as an exercise.

To show you how this can be done, it is now necessary to see how some more operators can be overloaded, i.e. operator* and operator++. Of these, operator* is very straight-forward, while operator++ requires some thought since there are two operator++, one postfix and one prefix (i.e. there's a difference between "dest++" and "++dest.") Assuming a class C, operator ++ is (usually) modeled as follows (and always with these member function signatures:)

  class C
  {
  public:
    ... // misc
    C& operator++(void); // prefix, i.e. ++c;
    C operator++(int);   // postfix, i.e. c++;
    ... // other misc.
  };
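As a quick check of the "copy" template and the one-past-the-end convention described above, here's a small sketch in standard C++. The template is repeated under the made up name "copy_values" so that it can't be confused with the standard library's own "copy", and "prime_count" is likewise just a name picked for this illustration.

```cpp
#include <cstddef>

// The article's copy template under a hypothetical name.
template <class In, class Out>
Out copy_values(In begin, In end, Out dest)
{
    while (begin != end)
    {
        *dest = *begin;   // assign through the destination
        ++begin;
        ++dest;
    }
    return dest;          // one past the last element written
}

const int primes_lt_10[] = { 2, 3, 5, 7 };
const std::size_t prime_count =
    sizeof(primes_lt_10) / sizeof(primes_lt_10[0]);
```

Calling "copy_values(primes_lt_10, primes_lt_10 + prime_count, out)" with "out" being an array of four doubles fills all four elements and returns "out + 4", while "copy_values(primes_lt_10, primes_lt_10, out)" copies nothing and returns "out" unchanged.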

  C& C::operator++(void)
  {
    ... // whatever's needed to "increment" it.
    return *this;
  }

  C C::operator++(int) // throw away int, it's not used, other than to
  {                    // distinguish between pre- and post-fix ++
    C old_val(*this);  // remember the old value
    operator++();      // let the prefix version do the job
    return old_val;    // and return the old value
  }

Needless to say, decrementing is analogous.

Let's make our simple "int_writer" class, which writes integers on standard output (i.e. normally the screen.)

  class int_writer
  {
  public:
    // trust the compiler to generate necessary constructors and destructor
    int_writer& operator++();
    int_writer& operator*();
    int_writer& operator=(int i); // does the real writing.
  };

Operator++ we implement to do nothing at all, since it's not used for anything, but operator* and operator= are interesting. What do I want to use the result of operator* for? Only for assigning to, and I want the assignment to write something on standard output, right? If I make operator* return the very object for which operator* was called on, we can use operator= for that class to do the writing. Weird? Well, yes. If you look at a pointer to T, dereferencing it yields a T. In this case, however, I say that dereferencing an int_writer yields an int_writer. It's weird, but it makes perfect sense anyway. If we made operator* return some other type, we need to create two types, one int_writer, and one class whose only job in life is to be assignable by int, and which very assignment actually means writing. Perhaps the latter is purer, but the former is so much less work. Here's what the implementation looks like:

  int_writer& int_writer::operator++()
  {
    return *this; // do nothing
  }

  int_writer& int_writer::operator*()
  {
    return *this; // do nothing
  }

  int_writer& int_writer::operator=(int i)
  {
    cout << i << endl;
    return *this;
  }

This means that if "dest" is of type int_writer, the line "*dest = *begin", can be expanded to:

  dest.operator*().operator=(*begin);

Since the return value of "dest.operator*()" is a reference to "dest" itself, the following "operator=(*begin)" means "dest.operator=(*begin)", and if the type of "*begin" can be implicitly converted to "int", we're writing something. Cool eh? Here's all it takes to write the contents of the prime number array:

  copy(iarr,iarr+isize,int_writer());

Of course, the name "int_writer" is a dead giveaway for a class template, isn't it? Why limit it to integers only?

  template <class T>
  class writer
  {
  public:

    // trust the compiler to generate necessary constructors
    // and destructor
    writer<T>& operator++();
    writer<T>& operator*();
    writer<T>& operator=(const T&); // does the real writing.
  };

  template <class T>
  writer<T>& writer<T>::operator++()
  {
    return *this;
  }

  template <class T>
  writer<T>& writer<T>::operator*()
  {
    return *this;
  }

  template <class T>
  writer<T>& writer<T>::operator=(const T& t)
  {
    cout << t << endl;
    return *this;
  }

I've changed the signature for "operator=" to accept a const reference instead of a value, since T might be a type for which copying is expensive. With this template, yet another requirement surfaced: for writer<T>, T must be writable through operator<<. Of course, that's no surprise, how else would you write it? The prime number copying now becomes:

  copy(iarr,iarr+isize,writer<int>());

As a last example, let's have a look at the source side, the types for "IN". Can we create a type matching the requirements for "IN", such that a copy would read values from standard input (normally the keyboard?) The requirements for "IN" are a little bit more complicated than those for "OUT." It must be not-equal comparable, and operator* must return a value. This requires some thought, especially on the reachability issue: through operator++, it must be possible to reach the "end" value from the "begin" value, so that operator!= eventually yields false and the copying stops.

To make this example simple, I propose that we can create a "reader<T>" with a number, and the number is the amount of T's to read from standard input. For every operator++, a new T is read, and the number of reads remaining is decremented. It must also be possible to create an "end" reader<T>, and we can use the parameter-less constructor for that. Here's how reader<T> might look like:

  template <class T>
  class reader
  {
  public:
    reader(unsigned count=0);
    reader<T>& operator++();
    const T& operator*() const;
    int operator!=(const reader<T>& r) const;
  private:
    unsigned remaining;
    T t;
  };

  template <class T>
  reader<T>::reader(unsigned count)
    : remaining(count) // the number of remaining reads, 0 for
  {                    // the parameter-less constructor.
  }

  template <class T>
  reader<T>& reader<T>::operator++()
  {
    if (remaining > 0)
    {
      cin >> t;    // read a new value only if
      --remaining; // there are values to read.
    }
    return *this;
  }

  template <class T>
  const T& reader<T>::operator*() const
  {
    return t; // return the last read value.
  }

  template <class T>
  int reader<T>::operator!=(const reader<T>& r) const
  {
    return r.remaining != 0 || remaining != 0;
  }

The last one's perhaps debatable. I've decided that operator != is really only useful for comparing with the end, i.e. it will only return false if both sides have reached the end (i.e. no remaining reads) state. However, this is mighty neat:

  const unsigned size=5;
  float array[size];

  // read 5 integers from standard input and
  // store in our float array.
  copy(reader<int>(size),reader<int>(),array);

  // print the read values as unsigned long's
  copy(array,array+size,writer<unsigned long>());

The function template "copy" will be useful for any combination of the above, as long as the types are convertible from *IN to *OUT.

Conclusion

What you've seen here is, most probably, your first ever encounter with generic programming. The template parameters "IN" and "OUT" (from "copy") are called "iterators," or "input iterators" and "output iterators" to be more specific. Anything that behaves like an input iterator, can be used as one, and likewise for output iterators. We can write input iterators for data base access, enumerating files in a directory, the series of prime numbers or whatever you want to get values from. We can write output iterators to store values in a data base, to send audio data to our sound card, to enter values at the end of a linked list, and whatever you need. If you write an algorithm in terms of generic iterators, you can use it with any kind of data source/sink which has iterators that follow your convention. This is *VERY* useful.

To make matters even better, this is all part of the now final draft C++ standard. The draft contains a function template "copy", which behaves identically to what I used in this article, and iterators called "istream_iterator" and "ostream_iterator" which behave very similarly to the "reader" and "writer" class templates. The standard documents 5 iterator categories: input iterator, output iterator, forward iterator (sort of the combination, allows both read and write,) bidirectional iterator (like forward, but allows moving backwards too,) and lastly random access iterators (iterators which can be incremented/decremented by more than one.) Pointers in arrays are typical random access iterators. If you write your iterators to comply with the requirements of one of these categories, your iterators can be used with any of the algorithms that require such iterators. Every algorithm you write can be used with any iterators of the type your algorithm requires. That is a major time/code/debug saver.

Short recap of inheritance

Inheritance can be used to make runtime decisions about things we know conceptually, but not in detail. The employee/engineer/manager inheritance tree was an example of that; by knowing about employees in general, we

can handle any kind of employee, including engineers, managers, secretaries, project leaders, and even kinds of employees we haven't yet thought of, for example marketers, salesmen and janitors.

A deficiency in the model

While this is good, it's not quite enough. The problem in the model lies in the common base. The classic counter example is a vector drawing program. Such a program usually holds a collection of shapes. A shape can be a square, a rectangle, a circle, a collection of grouped images, text, lines and so on. You know a number of things for shapes in general: you can draw them on a canvas, you can scale them, you can rotate them and translate them. The problem is, how do you do any of these for a shape in general? How is a generic shape drawn or rotated? It's impossible. It's only the concrete shapes that can be drawn, rotated, translated, scaled, etc. This in itself is not a problem; we can create our base class ``Shape'' with virtual member functions ``drawOn(Canvas&)'', ``rotate(double degrees)'', ``translate(Coordinate c)'', ``scale(double)'' and so on, and make sure to override these in our concrete shape classes. Herein lies the problem. How do we force the descendants to override them? One way (a bad way) is to implement them in the base class in such a way that they bomb with an error message when called. The bad thing with that is that it violates a very simple rule-of-thumb: ``The sooner you catch an error, the better.'' There are 5 phases in which an error can be found: design, edit, compile, link and runtime. Please note the obvious that errors that cannot be detected until runtime might go undetected! How to discover errors at design or edit time is not for this article (or even this article series), but there's a simple way of moving this particular discovery from runtime to compile time.

Pure virtual (abstract base classes)

C++ offers a way of saying ``This member function must be overridden by all descendants.'' Saying so also implies that objects of the class itself can never be instantiated, only objects of classes inheriting from it. This makes sense. What would you do with a generic shape object? It's better to make it impossible to create one by mistake, since it's meaningless anyway.

Here's how a pure abstract base class might be defined:

    class Shape
    {
    public:
      virtual void draw(Canvas&) = 0;
      virtual void rotate(double angle) = 0;
      virtual void translate(Coordinate c) = 0;
      virtual void scale(double) = 0;
    };

The ``= 0'' ending of a member function declaration makes it pure virtual. Pure virtual means that it must be overridden by descendants. Having one or more pure virtual member functions in a class makes the class an abstract base class. Abstract because you cannot instantiate objects of the class. If you try you'll get compiler errors. A class which has only pure virtual member functions and no data is often called a pure abstract base class, or some times an interface class. The latter is more descriptive; the class defines an interface that descendants must conform to, and any piece of code that can understand the interface can operate on objects implementing the interface (the concrete classes like ``Triangle'', ``Rectangle'', and ``Circle'').

The graphically experienced reader has of course noticed that rotation of a circle can be implemented extremely efficiently by doing nothing at all, so how can we take care of that scenario? It's unnecessary to write code that does nothing, is it not? Let's have a look at the alternatives.

• Let's just ignore it. It won't work, since then our ``Circle'' class will be an abstract class (at least one pure virtual is not ``terminated.'')
• We can change the interface of ``Shape'' such that ``rotate'' is not a pure virtual, and code its implementation to do nothing. This doesn't seem like a good idea because then the programmer implementing the square might forget to implement ``rotate'' without getting compiler errors.

The root of this lies in the illusion that doing nothing at all is the default behaviour, while it is an optimization for circles. As such the ``do nothing at all'' code belongs in ``Circle'' only. In other words, the best solution is with the original pure abstract ``Shape'' class, and an empty implementation for ``Circle::rotate.''

Addressing pure virtuals

I won't write a drawing program, since that'd make this article way too long, and the point would be drowned in all other intricacies of graphical programming. Instead I'll attack another often forgotten issue: addresses. Mailing addresses have different formatting depending on sender and receiver country. If you send something internationally

addresses are synonymous (i. addresses as a unit. Here comes the ``MailingAddress'' base class: . As a simplification for this example I'll treat State and Zip in U. The Country-Code as can be seen in the Swedish address example will also be ignored (this too makes for an excellent exercise to include). E-mail. but only the descendants and no one else. This class. will not implement any of the formatting pure virtuals from ``Address.S. The formatting itself also differs from country to country. virtual void print(int international=0) const = 0. The member function ``acquire'' is used for asking an operator to enter address data.e. The address class hierarchy will be done such that other kinds of addresses like e-mail addresses and phone numbers can be added. since all kinds of mailing addresses are mailing addresses. virtual void acquire(void) = 0. while for domestic letters that's not necessary. Ham Radio call-signs. State Zip {Country-Name} Canada and U. }. but much stricter than public. depending on country). Note that the destructor is virtual. phone number. (i. virtual ~Address().S. inheriting from ``Address''. Name Number Street City {Country} Postal-Code Then. and this is a problem. Make sure ``State'' is only dealt with in address kinds where it makes sense. to access the address fields. The idea here is that ``type'' can be used to ask an address object what kind of address it is. however. Access to the address fields is for the concrete classes only. We've seen how we can make things generally available by declaring them public. Here are a few (simplified) examples: Sweden Name Street Number {Country-Code}Postal-Code City {Country-Name} USA Name Number Street City. to always return the string ``Mailing address''. Here's the base class: class Address { public: virtual const char* type() const = 0. etc.'' That must be done by the concrete address classes with knowledge about the country's formatting and naming. 
the concrete address classes.'' Protected means that access is limited to the class itself (of course) and all descendants of it.you add the destination country to the address. Here we want something in between. It is thus looser than private. ``protected.e. This can be achieved through the third protection level. fax number. or by hiding them from the general public by making them private. country name will be added to mailing addresses and international prefixes added to phone numbers). Addresses. even if they're Swedish addresses or U. e-mail address and so on. As an exercise you can improve this. of course. If the parameter for ``print'' is non-zero. I'll only have one field that's used either as postal code or as state/zip combination. and I will assume that PostalCode and State/Zip in U. there are totally different types of addresses. We want descendants. however.S. a mailing address. but not pure virtual (what would happen if it was?) Unselfish protection All kinds of mailing addresses will share a base. the address will be printed in international form. that contains the address fields.K. The member function ``type'' will be defined here. and ways to access them.

    class MailingAddress : public Address
    {
    public:
      virtual ~MailingAddress();
      const char* type() const;
    protected:
      MailingAddress();
      void name(const char*);         // set
      const char* name() const;       // get
      void street(const char*);       // set
      const char* street() const;     // get
      void number(const char*);       // set
      const char* number() const;     // get
      void city(const char*);         // set
      const char* city() const;       // get
      void postalCode(const char*);   // set
      const char* postalCode() const; // get
      void country(const char*);      // set
      const char* country() const;    // get
    private:
      char* name_data;
      char* street_data;
      char* number_data;
      char* city_data;
      char* postalCode_data;
      char* country_data;
      //
      // declared private to disallow them
      //
      MailingAddress(const MailingAddress&);
      MailingAddress& operator=(const MailingAddress&);
    };

Here the copy constructor and assignment operator are declared private to disallow copying and assignment. This is not because they conceptually don't make sense, but because I'm too lazy to implement them (and yet want protection from stupid mistakes that would come, no doubt, if I left it to the compiler to generate them).

It's the responsibility of this class to manage memory for the data strings; distributing this to the concrete descendants is asking for trouble. As a rule of thumb, protected data is a bad mistake. Having all data private, always managing the resources for the data in a controlled way, and giving controlled access through protected access member functions will drastically cut down your aspirin consumption.

The reason for the constructor to be protected is more or less just aesthetic. No one but descendants can construct objects of this class anyway, since some of the pure virtuals from ``Address'' aren't yet terminated.

Now we get to the concrete address classes:

    class SwedishAddress : public MailingAddress
    {
    public:
      SwedishAddress();
      virtual void acquire(void);
      virtual void print(int international=0) const;
    };

    class USAddress : public MailingAddress
    {
    public:

      USAddress();
      virtual void acquire(void);
      virtual void print(int international=0) const;
    };

As you can see, the definitions of ``USAddress'' and ``SwedishAddress'' are identical. The only difference lies in the implementation of ``print'', and ``acquire''. I've left the destructors to be implemented at the compiler's discretion. Since there's no data to take care of in these classes (it's all in the parent class) we don't need to do anything special here. We know the parent takes care of it. Don't be afraid of copy construction and assignment. They were declared private in ``MailingAddress'', which means the compiler cannot create them for ``USAddress'' and ``SwedishAddress.''

Let's look at the implementation. For the ``Address'' base class only one thing needs implementing and that is the destructor. Since the class holds no data, the destructor will be empty:

    Address::~Address()
    {
    }

A trap many beginners fall into is to think that since the destructor is empty, we can save a little typing by declaring it pure virtual and there won't be a need to implement it. That's wrong, though, since the destructor will be called when a descendant is destroyed. There's no way around that. If you declare it pure virtual and don't implement it, you'll probably get a nasty run-time error when the first concrete descendant is destroyed.

The observant reader might have noticed a nasty pattern of the author's refusal to get to the point with pure virtuals and implementation. Yes, you can declare a member function pure virtual, and yet implement it! Pure virtual does not illegalize implementation. It only means that the pure virtual version will NEVER be called through virtual dispatch (i.e. by just calling the function on an object, a reference or a pointer to an object.) Since it will never, ever, be called through virtual dispatch, it must be implemented by the descendants, hence the rule that you cannot instantiate objects where pure virtuals are not terminated. By termination, by the way, I mean declaring it in a non pure virtual way.

OK, so a pure virtual won't ever be called through virtual dispatch. Then how can one be called? Through explicit qualification. Let's assume, just for the sake of argument, that we through some magic found a way to implement some reasonable generic behaviour of ``acquire'' in ``Address,'' but we want to be certain that descendants do implement it. The only way to call the implementation of ``acquire'' in ``Address'' is to explicitly write ``Address::acquire.'' This is what explicit qualification means; writing it like this can only mean one thing, even if ``Address::acquire'' is declared pure virtual. There's no escape for the compiler.

Now let's look at the middle class, the ``MailingAddress'' base class. I said when explaining the interface for this class, that it is responsible for handling the resources for the member data. Since we don't know the length of the fields, we oughtn't restrict them, but rather dynamically allocate whatever is needed. From this to the constructor:

    MailingAddress::MailingAddress()
    : name_data(0),
      street_data(0),
      number_data(0),
      city_data(0),
      postalCode_data(0),
      country_data(0)
    {
    }

The only thing the constructor does is to make sure all pointers are 0, in order to guarantee destructability.

    MailingAddress::~MailingAddress()
    {
      delete[] name_data;
      delete[] street_data;
      delete[] number_data;
      delete[] city_data;
      delete[] postalCode_data;
      delete[] country_data;
    }

Note that it's legal to delete the 0 pointer. This is used here. If, for some reason, one of the fields is not set to anything, it will be 0. Deleting the 0 pointer does nothing at all. The ``delete[]'' syntax is for deleting arrays as opposed to just ``delete'' which deletes single objects.

The ``type'' and read-access methods are trivial:

    const char* MailingAddress::type(void) const

    {
      return "Mailing address";
    }

    const char* MailingAddress::name(void) const
    {
      return name_data;
    }

    const char* MailingAddress::street(void) const
    {
      return street_data;
    }

    const char* MailingAddress::number(void) const
    {
      return number_data;
    }

    const char* MailingAddress::city(void) const
    {
      return city_data;
    }

    const char* MailingAddress::postalCode(void) const
    {
      return postalCode_data;
    }

    const char* MailingAddress::country(void) const
    {
      return country_data;
    }

The write access methods are a bit trickier, though. First we must check if the source and destination are the same, and do nothing in those situations. This is to achieve robustness. While it may seem like a very stupid thing to do, it's perfectly possible to see something like:

    name(name());

The meaning of this is, of course, ``set the name to what it currently is.'' We must make sure that doing this works (or find a way to illegalize the construct, but I can't think of any way). If the source and destination are different, however, the old destination must be deleted, a new one allocated on heap and the contents copied. Like this:

    void MailingAddress::name(const char* n)
    {
      if (n != name_data) {
        delete[] name_data; // OK even if 0
        name_data = new char[strlen(n)+1];
        strcpy(name_data,n);
      }
    }

This is done so many times over and over, exactly the same way for all kinds of data members, that we'll use a convenience function, ``replace,'' to do the job. ``strlen'' and ``strcpy'' are the C library functions from <string.h> that calculate the length of, and copy, strings.

    static void replace(char*& data, const char* n)
    {
      if (data != n) {
        delete[] data;
        data = new char[strlen(n)+1];
        strcpy(data,n);
      }
    }
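The self-assignment check described above can be tried out in isolation. The following is only a sketch under stated assumptions: it uses the modern header names, ``replace_demo'' is a hypothetical helper that is not part of the article, and the helper allocates the new buffer before deleting the old one, which is a variation on the article's version rather than a copy of it.

```cpp
#include <cstring>

// Variation on the article's helper: give `data` its own heap copy of `n`.
// Assumption: allocating before deleting is a choice made for this sketch;
// it leaves `data` untouched if `new` throws.
static void replace(char*& data, const char* n)
{
    if (data != n) {                       // same pointer: nothing to do
        char* fresh = new char[std::strlen(n) + 1];
        std::strcpy(fresh, n);
        delete[] data;                     // deleting a 0 pointer is fine
        data = fresh;
    }
}

// Hypothetical demonstration; returns true if every step behaved as described.
bool replace_demo()
{
    char* name = 0;                        // like a freshly constructed field
    replace(name, "Bjorn");
    bool ok = std::strcmp(name, "Bjorn") == 0;

    const char* before = name;
    replace(name, name);                   // the name(name()) situation
    ok = ok && name == before && std::strcmp(name, "Bjorn") == 0;

    replace(name, "Ann");                  // a genuinely new value
    ok = ok && std::strcmp(name, "Ann") == 0;

    delete[] name;
    return ok;
}
```

Without the pointer comparison, the ``name(name())'' case would delete the buffer and then try to copy from it.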

Using this convenience function, the write-access member functions will be fairly straight forward:

    void MailingAddress::name(const char* n)
    {
      ::replace(name_data,n);
    }

    void MailingAddress::street(const char* n)
    {
      ::replace(street_data,n);
    }

    void MailingAddress::number(const char* n)
    {
      ::replace(number_data,n);
    }

    void MailingAddress::city(const char* n)
    {
      ::replace(city_data,n);
    }

    void MailingAddress::postalCode(const char* n)
    {
      ::replace(postalCode_data,n);
    }

    void MailingAddress::country(const char* n)
    {
      ::replace(country_data,n);
    }

That was all the ``MailingAddress'' base class does. Now it's time for the concrete classes. All they do is to ask questions with the right terminology and output the fields in the right places:

    SwedishAddress::SwedishAddress()
    : MailingAddress()
    {
      country("Sweden"); // what else?
    }

    void SwedishAddress::print(int international) const
    {
      cout << name() << endl;
      cout << street() << ' ' << number() << endl;
      cout << postalCode() << ' ' << city() << endl;
      if (international) cout << country() << endl;
    }

    void SwedishAddress::acquire(void)
    {
      char buffer[100]; // A mighty long field
      cout << "Name: " << flush;
      cin.getline(buffer,sizeof(buffer));
      name(buffer);
      cout << "Street: " << flush;
      cin.getline(buffer,sizeof(buffer));
      street(buffer);
      cout << "Number: " << flush;
      cin.getline(buffer,sizeof(buffer));
      number(buffer);
      cout << "Postal code: " << flush;
      cin.getline(buffer,sizeof(buffer));
      postalCode(buffer);
      cout << "City: " << flush;
      cin.getline(buffer,sizeof(buffer));
      city(buffer);
    }

    USAddress::USAddress()
    : MailingAddress()
    {
      country("U.S.A."); // what else?
    }

    void USAddress::print(int international) const
    {
      cout << name() << endl;
      cout << number() << ' ' << street() << endl;
      cout << city() << ' ' << postalCode() << endl;
      if (international) cout << country() << endl;
    }

    void USAddress::acquire(void)
    {
      char buffer[100]; // Seems like a mighty long field
      cout << "Name: " << flush;
      cin.getline(buffer,sizeof(buffer));
      name(buffer);
      cout << "Number: " << flush;
      cin.getline(buffer,sizeof(buffer));
      number(buffer);
      cout << "Street: " << flush;
      cin.getline(buffer,sizeof(buffer));
      street(buffer);
      cout << "City: " << flush;
      cin.getline(buffer,sizeof(buffer));
      city(buffer);
      cout << "State and ZIP: " << flush;
      cin.getline(buffer,sizeof(buffer));
      postalCode(buffer);
    }

A toy program

Having done all this work with the classes, we must of course play a bit with them. Here's a short and simple example program that (of course) also makes use of the generic programming paradigm introduced last month.

    int main(void)
    {
      const unsigned size=10;
      Address* addrs[size];
      Address** first = addrs; // needed for VACPP (bug?)
      Address** last = get_addrs(addrs,addrs+size);
      cout << endl << "--------" << endl;

      for_each(first,last,print(1));
      for_each(first,last,deallocate<Address>());
      return 0;
    }

OK, that was mean. Obviously there's a function ``get_addrs'', which reads addresses into a range of iterators (in this case pointers in an array) until the array is full, or it terminates for some other reason. Here's how it may be implemented:

    Address** get_addrs(Address** first,Address** last)
    {
      Address** current = first;
      while (current != last)
      {
        cout << endl << "Kind (U)S, (S)wedish or (N)one " << flush;
        char answer[5]; // Should be enough.
        cin.getline(answer,sizeof(answer));
        if (!cin) break;
        switch (answer[0]) {
        case 'U': case 'u':
          *current = new USAddress;
          break;
        case 'S': case 's':
          *current = new SwedishAddress;
          break;
        default:
          return current;
        }
        (**current).acquire();
        ++current;
      }
      return current;
    }

In part 6 I mentioned that virtual dispatch could replace switch statements, and yet here is one. Could this one be replaced with virtual dispatch as well? It would be unfair of me to say ``no'', but it would be equally unfair of me to propose using virtual dispatch here. The reason is that we'd need to work a lot without gaining anything. Why? We obviously cannot do virtual dispatch on the ``Address'' objects we're about to create, since they're not created yet. Instead we'd need a set of address creating objects, which we can access through some subscript or whatever, and call a virtual creation member function for. Doesn't seem to save a lot of work, does it? Probably the selection mechanism for which address creating object to call would be a switch statement anyway!

So, that was reading, now for the rest. ``for_each'' does something for every iterator in a range. It could be implemented like this:

    template <class OI,class F>
    void for_each(OI first,OI last, const F& functor)
    {
      while (first != last)
      {
        functor(*first);
        ++first;
      }
    }

In fact, in the (final draft) C++ standard, there is a beast called ``for_each'' and behaving almost like this one (it returns the functor). It's pretty handy. Imagine never again having to explicitly loop through a complete collection.

What is ``print'' then? Print is a ``functor,'' or ``function object'' as they're often called. It's something which behaves like a function, but which might store a state of some kind (in this case whether the country should be added to written addresses or not), and which can be passed around like any object. Defining one is easy, although it looks odd at first.

    class print
    {
    public:
      print(int i);
      void operator()(const Address*) const;
    private:
      int international;
    };

    print::print(int i)
    : international(i)
    {
    }

    void print::operator()(const Address* p) const
    {
      p->print(international);
      cout << endl;
    }

What on earth is ``operator()''? It's the member function that's called if we boldly treat the name of an object just as if it was the name of some function, and simply call it. Like this:

    print pobject(1); // define print object.
    pobject(p);       // same as pobject.operator()(p), where p is a const Address*

This is usually called the ``function call'' operator, by the way.

The only remaining thing now is ``deallocate<T>'', but you probably already guessed it looks like this:

    template <class T>
    class deallocate
    {
    public:
      void operator()(T* p) const;
    };

    template <class T>
    void deallocate<T>::operator()(T* p) const
    {
      delete p;
    }

This is well enough for one month, isn't it?

Recap

This month, you've learned:

• what pure virtual means, and how you declare pure virtual functions.
• that despite what most C++ programmers believe, pure virtual functions can be implemented.
• that the above means that there's a distinction between terminating a pure virtual, and implementing one.
• why it's a bad idea to make destructors pure virtual.
• a new protection level, ``protected.''
• why protected data is bad, and how you can work around it in a clever way.
• that switch statements cannot always be replaced by virtual dispatch.
• that there is a ``function call'' operator and how to define and use it.

You know what? You know by now most of the C++ language, and have some experience with the C++ standard class library. Most of the language issues that remain are more or less obscure and little known. We'll look mostly at library stuff and clever ideas for how to use the language from now on.

Exercises

• Find out what happens if you declare the ``MailingAddress'' destructor pure virtual, and yet define it.
• Think of two ways to handle the State/Zip problem, and implement both (what are the advantages, disadvantages of the methods?)
• Rewrite ``get_addrs'' to accept templatized iterators instead of pointers.
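As a compact illustration of the function call operator and of a functor carrying state, here is a sketch that uses the standard library's ``std::for_each'' rather than the hand-written version in the article. The names ``count_above'' and ``count_values_above'' are hypothetical and exist only for this example; the point is that the functor returned by ``std::for_each'' still holds the state it accumulated.

```cpp
#include <algorithm>

// Hypothetical functor: remembers a limit and counts the values above it.
class count_above
{
public:
    explicit count_above(int limit) : limit_(limit), hits_(0) { }

    // The function call operator: lets an object be used like a function.
    void operator()(int value)
    {
        if (value > limit_) ++hits_;
    }

    int hits() const { return hits_; }

private:
    int limit_;  // state set at construction, like `international` in print
    int hits_;   // state built up while the functor is being called
};

// Hypothetical helper: count the values in [first, last) greater than limit.
int count_values_above(const int* first, const int* last, int limit)
{
    // std::for_each hands the functor back, so the accumulated count survives.
    return std::for_each(first, last, count_above(limit)).hits();
}
```

A plain function could not easily do this, since it has nowhere to keep the limit and the running count between calls except in global variables.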

Coming up

As I mentioned just a few lines above, most of the C++ language is covered. Quite a bit of the library remains, and lots of useful and cool techniques are waiting to be exploited. Anyway, next month we'll look at file I/O (finally). Here's something for you to think about: destructor, pointer and delete.

/Björn

In parts 5 and 6, the basics of I/O were introduced, with formatted reading and writing from standard input and output. We'll now have a look at I/O for files. In a sense, it's better to stop using the term I/O here, and instead use streams and streaming, since the ideas expressed here and in parts 5 and 6 can be used for other things than I/O, for example in-memory formatting of data (we'll see that at the very end of this article.)

Files

In what way is writing ``Hello world'' on standard output different from writing it to a file? The question is worth some thought, since in many programming languages there is a distinct difference. Is the message different? Is the format (as seen from the program) different? I cannot see any difference in those aspects. The only thing that truly differs is the media where the formatted message ends up. In the former case, it's on your screen, but for file I/O it's in a file somewhere on your hard disk. In other words, there is very little difference, or at least, there's very much in common.

As we've seen so far, commonality is expressed either through inheritance or templates, depending on what's common and what's not. To refresh your memory, templates are used when we want the same kind of behaviour, independent of data. For example a stack of some data type. Inheritance is used when you want similar, but in some important aspects different, behaviour at runtime for the same kind of data. We saw this for the staff hierarchy and mailing addresses in parts 7 and 8. In this case it's inheritance that's the correct solution, since the data will be the same, but where it will end up (and most notably, how it does end up there) differs. (Incidentally, there's a good case for using templates too, regarding the type of characters used. The C++ standard does indeed have templatized streams, just for differing between character types. Few compilers today support this, however. See the ``Standards Update'' towards the end of the article for more information.)

The inheritance tree for stream types looks like this:

The way to read this is that there's a base class named ``ios'', from which the classes ``istream'' and ``ostream'' inherit. The classes ``ifstream'' and ``ofstream'' in their turn inherit from ``istream'' and ``ostream'' respectively. The ``f'' in the names implies that they're file streams. Then there's the odd ones, ``iostream'', which inherits from both ``istream'' and ``ostream'', and ``fstream'' which inherits from both ``ifstream'' and ``ofstream.'' Inheriting from two bases is called multiple inheritance, and is by many seen as evil. Many programming languages have banned it: Objective-C, Java, Smalltalk to mention a few, while other programming languages, like Eiffel, go to the other extreme and allow you to inherit the same base several times. Personally I think multiple inheritance is very useful if used right, but it can cause severe problems. Here is a situation where it's used in the right way. Anyway, this means

that ``fstream'' is a file stream for both reading and writing, while ``iostream'' is an abstract stream for both reading and writing. More often than you think, you probably don't want to use the ``iostream'' or ``fstream'' classes.

This inheritance, however, means that all the stream insertion and extraction functions (the ``operator>>'' and ``operator<<'') you've written, will work just as they do with file streams. Now, wasn't that neat? In other words, the only things you need to learn for file based I/O are the details that are specific to files.

File Streams

The first thing you need to know before you can use file streams is how to create them. The parts of interest look like this:

    class ifstream : public istream
    {
      ifstream();
      ifstream(const char* name, int mode=ios::in);
      void open(const char* name, int mode=ios::in);
      ...
    };

    class ofstream : public ostream
    {
      ofstream();
      ofstream(const char* name, int mode=ios::out);
      void open(const char* name, int mode=ios::out);
      ...
    };

    class fstream : public ofstream, public ifstream
    {
      fstream();
      fstream(const char* name, int mode);
      void open(const char* name, int mode);
      ...
    };

You get access to the classes by #including <fstream.h>. The empty constructors always create a file stream object that is not tied to any file. To tie such an object to a file, a call to ``open'' must be made. ``open'' and the constructors with parameters behave identically. ``name'' is of course the name of the file. Since you normally use either ``ifstream'' or ``ofstream'' and rarely ``fstream'', this is normally the only parameter you need to supply. Sometimes, however, you need to use the ``mode'' parameter. It's a bit field, in which you use bitwise or (``operator|'') for any of the values ``ios::in'', ``ios::out'', ``ios::ate'', ``ios::app'', ``ios::trunc'', and finally ``ios::binary.'' Some implementations also provide ``ios::nocreate'' and ``ios::noreplace,'' but those are extensions. Some implementations do not have ``ios::binary,'' while others call it ``ios::bin.'' These variations of course make it difficult to write portable C++ today. Fortunately, the six ones listed first are required by the standard (although, they belong to class ``ios_base,'' rather than ``ios.'') The meaning of these are:

    ios::in        open for reading
    ios::out       open for writing
    ios::ate       open with the get and set pointer at the end (see Seeking for info) of the file.
    ios::app       open for append, that is, any write you make to the file will be appended to the file.
    ios::trunc     scrap all data in the file if it already exists.

    ios::binary    open in binary mode, that is, do not do the brain damaged LF<->CR/LF conversions that OS/2, DOS, CP/M (RIP), Windows, and probably other operating systems, so often insist on. The reason some implementations do not have ios::binary is that many operating systems do not have this conversion, so there's no need for it.
    ios::nocreate  cause the open to fail if the file doesn't exist.
    ios::noreplace cause the open to fail if the file already exists.

Of course combinations like ``ios::noreplace | ios::nocreate'' don't make sense -- the failure is guaranteed.

On many implementations today there's also a third parameter for the constructors and ``open,'' a protection parameter. How this parameter behaves is very operating system dependent.

Now for some simple usage:

    #include <fstream.h>

    int main(int argc, char* argv[])
    {
      if (argc != 2) {
        cout << "Usage: " << argv[0] << " filename" << endl;
        return 1; // error code
      }
      ofstream of(argv[1]); // create the ofstream object
                            // and open the file.
      if (!of) { // something went wrong
        cout << "Error, cannot open " << argv[1] << endl;
        return 2;
      }
      // Now the file stream object is created. Write to it!
      of << "Hello file!" << endl;
      return 0;
    }

As you can see, once the stream object is created, its usage is analogous to that of ``cout'' that you're already familiar with. Of course reading with ``ifstream'' is done the same way, just use the object as you've used ``cin'' earlier.

The file stream classes also have a member function ``close'', that by force closes the file and unties the stream object from it. Few are the situations when you need to call this member function, since the destructors do close the file.

Actually this is all there is that's specific to files.

Binary streaming

So far we've dealt with formatted streaming only, that is, the process of translating raw data into a human readable form, or translating human readable data into the computer's internal representation. Some times you want to stream raw data as raw data, for example to save space in a file. If you look at a file produced by, for example a word processor, it's most likely not in a human readable form. Note that binary streaming does not necessarily mean using the ``ios::binary'' mode when opening a file (although, that is indeed often the case.) They're two different concepts. Binary streaming is what you use your stream for, raw data that is, and opening a file with the ``ios::binary'' mode means turning the brain damaged LF<->CR/LF translation off.

Binary streaming is done through the stream member functions:

    class ostream ...
    {
    public:
      ostream& write(const char* s, streamsize n);
      ostream& put(char c);
      ostream& flush();

  };

  class istream ...
  {
  public:
    istream& read(char* s, streamsize n);
    int get();
    istream& get(char& c);
    istream& get(char* s, streamsize n, char delim='\n');
    istream& getline(char* s, streamsize n, char delim='\n');
    istream& ignore(streamsize n=1, int delim=EOF);
    ...
  };

The writing interface is extremely simple and straight forward, while the reading interface includes a number of small but important differences. Note that these member functions are implemented in classes ``istream'' and ``ostream,'' so they're not specific to files, although files are where you're most likely to use them. Let's have a look at them, one by one:

ostream& ostream::write(const char* s, streamsize n);
Write ``n'' characters to the stream, from the array pointed to by ``s.'' ``streamsize'' is a signed integral data type. Despite ``streamsize'' being signed, you're of course not allowed to pass a negative size here (what would that mean?) Exactly the characters found in ``s'' will be written to the stream, no more, no less.

ostream& ostream::put(char c);
Inserts the character into the stream.

ostream& ostream::flush();
Force the data in the stream to be written (file streams are usually buffered.)

istream& istream::read(char* s, streamsize n);
Read ``n'' characters into the array pointed to by ``s.'' Here you better make sure that the array is large enough, or unpleasant things will happen. Note that only the characters read from the stream are inserted into the array. It will not be zero terminated, unless the last character read from the stream indeed is '\0'.

int istream::get();
Read one character from the stream, and return it. The value is an ``int'' instead of ``char'' since the return value might be ``EOF'' (which is not uniquely representable as a ``char.'')

istream& istream::get(char& c);
Same as above, but read the character into ``c'' instead. Here a ``char'' is used instead of an ``int,'' since you can check the value directly by calling ``.eof()'' on the reference returned.

istream& istream::get(char* s, streamsize n, char delim='\n');
This one's similar to ``read'' above, but with the difference that it reads at most ``n'' characters (in standard implementations at most n-1, since this one does zero terminate the array.) It stops if the delimiter character is found. Note that when the delimiter is found, it is not read from the stream.

istream& istream::getline(char* s, streamsize n, char delim='\n');
The only difference between this one and ``get'' above, is that this one does read the delimiter from the stream. Note, however, that the delimiter is not stored in the array.

istream& istream::ignore(streamsize n=1, int delim=EOF);
Reads at most ``n'' characters from the stream, but doesn't store them anywhere. If the delimiter character is read, it stops there. Of course, if the delimiter is ``EOF'' (as is the default) it does not read past ``EOF,'' that's physically impossible.
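As a quick illustration of the member functions above, here's a minimal sketch. It is written for a standard C++ compiler (headers without ``.h'', names in ``std::'') and uses an in-memory ``std::stringstream'' instead of a file, only to keep it self-contained. The function name ``unformattedDemo'' and the test data are made up for the example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: writes raw characters with write()/put(), then
// reads them back with read(), get(), ignore() and getline().
std::string unformattedDemo()
{
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);

    ss.write("abc", 3);           // exactly 3 characters, no terminator added
    ss.put('\n');                 // a single character
    ss.write("skip;keep\n", 10);  // 10 characters, including the newline
    ss.flush();

    char buf[8];
    ss.read(buf, 3);              // buf holds 'a','b','c', NOT zero terminated
    std::string result(buf, 3);

    int c = ss.get();             // an int, so EOF can be told apart
    assert(c == '\n');

    ss.ignore(100, ';');          // discards "skip;" including the delimiter
    ss.getline(buf, sizeof(buf)); // reads "keep", consumes '\n', zero terminates
    result += '|';
    result += buf;
    return result;
}
```

Note how ``read'' needs the length passed along when building the string, while ``getline'' leaves a zero terminated array behind.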

Array on file
An example: Say we want to store an array of integers in a file, and we want to do this in raw binary format. A reasonable way is to first store a size (in elements) followed by the data. Both the size and the data will be in raw format.

  #include <fstream.h>

  void storeArray(ostream& os, const int* p, size_t elems)
  {
    os.write((const char*)&elems, sizeof(elems));
    os.write((const char*)p, elems*sizeof(*p));
  }

The above code does a lot of ugly type casting, but that's normal for binary streaming. What's done here is to use brute force to see the address of ``elems'' as a ``const char*'' (since that's what ``write'' expects) and then say that only the ``sizeof(elems)'' bytes from that pointer are to be read. What this actually does is to write out the raw memory that ``elems'' resides in to the stream. After this, it does the same kind of thing for the array. Note that ``sizeof(*p)'' reports the size of the type that ``p'' points to. I could as well have written ``sizeof(int),'' but that is a dangerous duplication of facts. It's enough that I've said that ``p'' is a pointer to ``int.'' Repeating ``int'' again just means I'll forget to update one of them when I change the type to something else.

Naturally we want to be able to read the array as well. To read such an array into memory requires a little more work:

  #include <fstream.h>

  size_t readArray(istream& is, int*& p)
  {
    size_t elems;
    is.read((char*)&elems, sizeof(elems));
    p = new int[elems];
    is.read((char*)p, elems*sizeof(*p));
    return elems;
  }

It's not particularly hard to follow, first read the number of elements, then allocate an array of that size, and read the data into it.

Seeking
Up until now we have seen streams as, what it sounds like, continuous streams of data. Sometimes however, there's a need to move around, both backward and forward. Streams like standard input and standard output are truly continuous streams, within which you cannot move around. Files, in contrast, are true random access data stores.

Random access streams have something called position pointers. They're not to be confused with pointers in the normal C++ sense, but it's something referring to where in the file you currently are. There's the put pointer, which refers to the next position to write data to, if you attempt to write anything, and the get pointer, which refers to the next position to read data from. An ostream of course only has the put pointer, and an istream only the get pointer. There's a total of 6 new member functions that deal with random access in a stream:

  streampos istream::tellg();
  istream& istream::seekg(streampos);
  istream& istream::seekg(streamoff, ios::seek_dir);
  streampos ostream::tellp();
  ostream& ostream::seekp(streampos);
  ostream& ostream::seekp(streamoff, ios::seek_dir);

``streampos'', which you get from ``tellg'' and ``tellp'' is an absolute position in a stream. You cannot use the values for anything other than ``seekg'' and ``seekp''. You especially cannot examine a value and hope to find something useful there (i.e. you can, but what you find out might hold only for the current release of your specific compiler, other compilers, or other releases of the same compiler, might show different characteristics for ``streampos.'') Well, there are two other things you can do with ``streampos'' values. You can subtract two values, and get a ``streamoff''

value, and you can add a ``streamoff'' value to a ``streampos'' value. ``streamoff,'' by the way, is some signed integral type, probably a ``long.'' By using the value returned from ``tellg'' or ``tellp,'' you have a way of finding your way back, or do relative searches by adding/subtracting ``streamoff'' values.

The ``seekg'' and ``seekp'' methods accept a ``streamoff'' value and a direction, and work in a slightly different way. You search your way to a position relative to the beginning of the stream, the end of the stream, or the current position, the selection of which, is done through the ``ios::seek_dir'' enum, which has these three values ``ios::beg'', ``ios::end'' and ``ios::cur.'' To make the next write occur on the very first byte of the stream, call ``os.seekp(0,ios::beg),'' where ``os'' is some random access ``ostream.''

In any reasonable implementation, any of the seek member functions use lazy evaluation. That is, when you call any of the seek member functions, the only thing that happens is that some member variable in the stream object changes value. It's not until you actually read or write, something truly happens on disk (or wherever the stream data resides.)

A stream array, for really huge amounts of data
Suppose we have a need to access enormous amounts of simple data, say 10 million floating point numbers. It's not a very good idea to just allocate that much memory, at least not on my machine with a measly 64Mb RAM. It'll not just make this application crawl, but probably the whole system due to excessive paging. Instead, let's use a file to access the data. This makes for slow access, for sure, but nothing else will suffer.

Here's the idea.
• The array must be possible to use with any data type, including user defined classes.
• Its usage must resemble that of real arrays as much as possible, but extra functionality that arrays do not have, such as asking for the number of elements in it, is OK.
• There must be a type, resembling pointers to arrays, that can be used for traversing it.
• We do not want the size of the array to be part of its type (if you've programmed in Pascal, you know why.)
• In addition to what real arrays give us, we want some measures of safety from stupid mistakes, such as addressing beyond the range of the array, and also for errors that arrays cannot have (disk full, cannot create file, disk corruption, etc.)
• We also want to say that an array is just a part of a file and not necessarily an entire file. This would allow the user to create several arrays within the same file.

First of all, the array must be a template, so it can be used to store arbitrary types. Since we do not want the size to be part of the type signature, the size is not a template parameter, but a parameter for the constructor. As can be expected, ``operator[]'' can be overloaded, which is handy for providing a familiar syntax. Of course, we cannot have the entire array duplicated in memory (then all the benefits will be lost,) instead we will search for the data on file every time it's needed.

To prevent this article from growing way too long, quite a few of the above listed features will be left for next month. The things to cover this month are: An array of built-in fundamental types only, which lacks pointers and is limited to one file per array. We'll also skip error handling for now (you can add it as an exercise,) and add that too next month. I'll raise some interesting questions along the way.

Here's the outline for the class.

  template <class T>
  class FileArray
  {
  public:
    FileArray(const char* name, size_t elements);
    // Create a new array and set the size.
    FileArray(const char* name);
    // Create an array from an existing file, get the
    // size from the file.

    // use compiler defined destructor.
    T operator[](size_t index) const;
    ??? operator[](size_t index);
    size_t size() const;
  private:
    // don't want these to be used.
    FileArray(const FileArray&);
    FileArray& operator=(const FileArray&);
  };

However, already here we see a problem. What's the non-const ``operator[]'' to return? To see why this is a problem, ask

yourself what you want ``operator[]'' to do. I want ``operator[]'' to do two things, depending on where it's used. When ``operator[]'' is on the left hand side of an assignment, I want to write data to the file, and if it's on the right hand side of an assignment, I want to read data from the file, like this:

  FileArray<int> x;
  ...
  x[5] = 4;
  int y = x[3];

Ouch.

Warning: I've often seen it suggested that the solution is to have the const version read and return a value, and the non-const version write a value. As slick as it would be, it's wrong and it won't work. The const version is called for const array objects, the non-const version for non-const array objects.

Instead what we have to do is to pull a little trick. The trick is, as so often in computer science, to add another level of indirection. This is done by not taking care of the problem in ``operator[],'' but rather let it return a type, which does the job. We create a class template, looking like this:

  template <class T>
  class FileArrayProxy
  {
  public:
    FileArrayProxy<T>& operator=(const T&); // write a value
    operator T() const; // read a value
    // compiler generated destructor
    FileArrayProxy<T>& // read from p and then write
      operator=(const FileArrayProxy<T>& p);
    FileArrayProxy(const FileArrayProxy<T>&);
  private:
    ... all other constructors.
    FileArray<T>& array;
    const size_t index;
  };

We have to make sure, of course, that there are member functions in ``FileArray<T>'' that can read and write (and of course, those functions are not the ``operator[],'' since then we'd have an infinite recursion.)

All constructors, except for the copy constructors, are made private to prevent users from creating objects of the class whenever they want to. After all, this class is a helper for the array only, and is not intended to ever even be seen. This, however, poses a problem, with the constructors being private, how can ``FileArray<T>::operator[]()'' create and return one?

Enter another C++ feature: friends. Friends are a way of breaking encapsulation. What?!?! Yes, what you read is right. Friends break encapsulation, and (this is the real shock) that's a good thing! Friends break encapsulation in a controlled way. We can, in ``FileArrayProxy<T>'' declare ``FileArray<T>'' to be a friend. This means that ``FileArray<T>'' can access everything in ``FileArrayProxy<T>,'' including things that are declared private. Paradoxically, violating encapsulation with friendship strengthens encapsulation when done right. The only alternative here to using friendship, is to make the constructors public, but then anyone can create objects of this class, and that's what we wanted to prevent.

Friends are useful for strong encapsulation, but it's important to use it only in situations where two (or more classes) are so tightly bound to one another that they're meaningless on their own. This is the case with ``FileArrayProxy<T>.'' It's meaningless without ``FileArray<T>,'' thus ``FileArray<T>'' is declared a friend of ``FileArrayProxy<T>.'' The declaration then becomes:

  template <class T>
  class FileArrayProxy
  {
  public:
    FileArrayProxy& operator=(const T&); // write value
    operator T() const; // read a value
    // compiler generated destructor
    FileArrayProxy<T>& operator=(const FileArrayProxy<T>& p);

    // compiler generated copy constructor
  private:
    FileArrayProxy(FileArray<T>& fa, size_t n);
    // for use by FileArray<T> only.
    FileArray<T>& array;
    const size_t index;
    friend class FileArray<T>;
  };

We can now start implementing the array. Some problems still lie ahead, but I'll mention them as we go.

  // farray.hpp
  #ifndef FARRAY_HPP
  #define FARRAY_HPP

  #include <fstream.h>
  #include <stdlib.h> // size_t

  template <class T> class FileArrayProxy;
  // Forward declaration necessary, since FileArray<T>
  // returns the type.

  template <class T>
  class FileArray
  {
  public:
    FileArray(const char* name, size_t size); // create
    FileArray(const char* name); // use existing array
    T operator[](size_t size) const;
    FileArrayProxy<T> operator[](size_t size);
    size_t size() const;
  private:
    FileArray(const FileArray<T>&); // illegal
    FileArray<T>& operator=(const FileArray<T>&);
    // for use by FileArrayProxy<T>
    T readElement(size_t index) const;
    void storeElement(size_t index, const T&);
    fstream stream;
    size_t max_size;
    friend class FileArrayProxy<T>;
  };

The functions for reading and writing are made private members of the array, since they're not for anyone to use. Again, we need to make use of friendship to grant ``FileArrayProxy<T>'' the right to access them. Let's define them right away

  template <class T>
  T FileArray<T>::readElement(size_t index) const
  {
    T t;
    stream.seekg(sizeof(max_size)+index*sizeof(T));
    // what if seek fails?
    stream.read((char*)&t, sizeof(t));
    // what if read fails?
    return t;
  }

All of a sudden, we face an unexpected problem. The above code won't compile. The member function is declared ``const'', and as such, all member variables are ``const'', and neither ``seekg'' nor ``read'' are allowed on constant

streams. The problem is one of differing between logical constness and bitwise constness. This member function is logically ``const'', as it does not alter the array in any way. However, it is not bitwise const, the stream member changes. C++ cannot understand logical constness, only bitwise constness. If you have a modern compiler, the solution is very simple, you declare ``stream'' to be ``mutable fstream stream;'' in the class definition. I, however, have a very old compiler, so I have to find a different solution. This solution is, yet again, one of adding another level of indirection. I can have a pointer to an ``fstream.'' When in a ``const'' member function, the pointer is also ``const'', but not what it points to (there's a difference between a constant pointer, and a pointer to a constant.) The only reasonable way to achieve this is to store the stream object on the heap, and in doing this I introduce a possible danger, what if I forget to delete the pointer? Sure, I'll delete it in the destructor, but what if an exception is thrown already in the constructor, then the destructor will never execute (since no object has been created that must be destroyed.) Do you remember the ``thing to think of until this month?'' The clues were, destructor, pointer and delete. Thought of anything? What about this extremely simple class template?

  template <class T>
  class ptr
  {
  public:
    ptr(T* pt);
    ~ptr();
    T& operator*() const;
  private:
    ptr(const ptr<T>&); // we don't want copying
    ptr<T>& operator=(const ptr<T>&); // nor assignment
    T* p;
  };

  template <class T>
  ptr<T>::ptr(T* pt) : p(pt)
  {
  }

  template <class T>
  ptr<T>::~ptr()
  {
    delete p;
  }

  template <class T>
  T& ptr<T>::operator*() const
  {
    return *p;
  }

This is probably the simplest possible of the family known as ``smart pointers.'' I'll probably devote a whole article exclusively for these some time. Whenever an object of this type is destroyed, whatever it points to is deleted. The only thing we have to keep in mind when using it, is to make sure that whatever we feed it is allocated on heap (and is not an array) so it can be deleted with operator delete.

This solves our problem nicely. When this thing is a constant, the thing pointed to still isn't a constant (look at the return type for ``operator*,'' it's a ``T&,'' not a ``const T&.'') So, instead of using an ``fstream'' member variable called ``stream,'' let's use a ``ptr<fstream>'' member named ``pstream.'' With this change, ``readElement'' must be slightly rewritten:

  template <class T>
  T FileArray<T>::readElement(size_t index) const
  {
    (*pstream).seekg(sizeof(max_size)+index*sizeof(T));
    // what if seek fails?
    T t;

    (*pstream).read((char*)&t, sizeof(t));
    // what if read fails?
    return t;
  }

I bet the change wasn't too horrifying.

  template <class T>
  void FileArray<T>::storeElement(size_t index, const T& elem)
  {
    (*pstream).seekp(sizeof(max_size)+index*sizeof(T), ios::beg);
    // what if seek fails?
    (*pstream).write((const char*)&elem, sizeof(elem));
    // what if write failed?
  }

Now for the constructors:

  template <class T>
  FileArray<T>::FileArray(const char* name, size_t size)
    : pstream(new fstream(name, ios::in|ios::out|ios::binary)),
      max_size(size)
  {
    // what if the file could not be opened?

    // store the size on file.
    (*pstream).write((const char*)&max_size, sizeof(max_size));
    // what if write failed?

    // We want to write a value (any value) at the end
    // to make sure there is enough space on disk.
    T t;
    storeElement(max_size-1,t);
    // What if this fails?
  }

  template <class T>
  FileArray<T>::FileArray(const char* name)
    : pstream(new fstream(name, ios::in|ios::out|ios::binary)),
      max_size(0)
  {
    // get the size from file.
    (*pstream).read((char*)&max_size, sizeof(max_size));
    // what if read fails or max_size == 0?
    // How do we know the file is even an array?
    // What if read failed because of a disk error?
  }

The access members:

  template <class T>
  T FileArray<T>::operator[](size_t size) const
  {
    // what if size >= max_size?
    return readElement(size);
  }

  template <class T>

  FileArrayProxy<T> FileArray<T>::operator[](size_t size)
  {
    // what if size >= max_size?
    return FileArrayProxy<T>(*this, size);
  }

Well, this wasn't too much work, but then, as can be seen by the comments, there's absolutely no error handling here. I've left out the ``size'' member function, since its implementation is trivial.

Next in line is ``FileArrayProxy<T>.''

  template <class T>
  class FileArrayProxy
  {
  public:
    // copy constructor generated by compiler
    operator T() const;
    FileArrayProxy<T>& operator=(const T& t);
    FileArrayProxy<T>& operator=(const FileArrayProxy<T>& p);
    // read from one array and write to the other.
  private:
    FileArrayProxy(FileArray<T>& f, size_t i);
    size_t index;
    FileArray<T>& fa;
    friend class FileArray<T>;
  };

The copy constructor is needed, since the return value must be copied (return from ``FileArray<T>::operator[],'') and it must be public for this to succeed. The one that the compiler generates for us, which just copies all member variables, will do just fine. The compiler doesn't generate a default constructor (one which accepts no parameters,) since we have explicitly defined a constructor.

The assignment operator is necessary. Sure, the compiler will try to generate one for us if we don't, but it will fail, since references (``fa'') can't be rebound. Note, however, that if we instead of a reference had used a pointer, it would succeed, but the result would *NOT* be what we want. What it would do is to copy the member variables, but what we want to do is to read data from one array and write it to another.

Now for the implementation:

  template <class T>
  FileArrayProxy<T>::FileArrayProxy(FileArray<T>& f, size_t i)
    : index(i), fa(f)
  {
  }

  template <class T>
  FileArrayProxy<T>::operator T() const
  {
    return fa.readElement(index);
  }

  template <class T>
  FileArrayProxy<T>& FileArrayProxy<T>::operator=(const T& t)
  {
    fa.storeElement(index,t);
    return *this;
  }

  template <class T>
  FileArrayProxy<T>&
  FileArrayProxy<T>::operator=(const FileArrayProxy<T>& p)

  {
    fa.storeElement(index,p);
    return *this;
  }

  #endif // FARRAY_HPP

That was it. Can you see what happens with the proxy? Let's analyze a small code snippet:

  1 FileArray<int> arr("file",10);
  2 arr[2]=0;
  3 int x=arr[2];
  4 arr[0]=arr[2];

On line two, ``arr.operator[](2)'' is called, which creates a ``FileArrayProxy<int>'' from ``arr'' with the index 2. The object, which is a temporary and does not have a name, has as its member ``fa'' a reference to ``arr'', and as its member ``index'' the value 2. On this temporary object, ``operator=(int)'' is executed. This operator in turn calls ``fa.storeElement(index, t),'' where ``index'' is still 2 and the value of ``t'' is 0. Thus, ``arr[2]=0'' ends up as ``arr.storeElement(2,0)''.

On line 3, a similar proxy is created through the call to ``operator[](2)''. This time, however, the ``operator int() const'' is called. This member function in turn calls ``fa.readElement(2)'' and returns its value, thus ``int x=arr[2]'' translates to ``int x=arr.readElement(2).''

On line 4, finally, ``arr[0]=arr[2]'' creates two temporary proxies, one referring to index 0, and one to index 2. The assignment operator is called, which in turn calls ``fa.storeElement(index,p)'', where p is the temporary proxy referring to element 2. Since ``storeElement'' wants an ``int,'' ``p.operator int() const'' is called, which calls ``arr.readElement(2).'' In other words ``arr[0] = arr[2]'' generates the code ``arr.storeElement(0, arr.readElement(2)).''

As you can see, the proxies don't add any new functionality, they're just syntactic sugar, albeit very useful. With them we can treat our file arrays very much like any kind of array. There's one thing we cannot do:

  int* p = &arr[2];
  int& x = arr[3];
  *p=2;
  x=5;

With ordinary arrays, the above would be legal and have well defined semantics, assigning arr[2] the value 2, and arr[3] the value 5. With our file array we cannot do this, but unfortunately the compiler does not prevent it (a decent compiler will warn that we're binding a constant or pointer to a temporary.) We'll mend that hole next month (think about how) and also add iterators, which will allow us to use the file arrays almost exactly like real ones.

In memory data formatting
One often faced problem is that of converting strings representing some data to that data, or vice versa. With the aid of ``istrstream'', ``ostrstream'' and ``strstream'', this is easy. For example, say we have a string containing digits, and want those digits as an integer, the thing to do is to create an ``istrstream'' object from the string. An example will explain:

  const char* s = "23542";
  istrstream is(s);
  int x;
  is >> x;

After executing this snippet, ``x'' will have the value 23542. ``istrstream'' isn't much more exciting than that. ``ostrstream'' on the other hand is more exciting. There are two alternative uses for ``ostrstream.'' One where you have an array you want to store data in, and one where you want the ``ostrstream'' to create it for you, as needed (usually because you have no idea what size the buffer must have.) The former usage is like this:

  char buffer[24];
  ostrstream os(buffer, sizeof(buffer));
  double x=23.34;
  os << "x=" << x << ends;

The variable ``buffer'' will contain the string ``x=23.34'' after this snippet. The stream manipulator ``ends'' zero terminates the buffer. Zero termination is not done by default, since the stream cannot know where to put it, and besides you might not always want it.

The other variant, where you don't know how large a buffer you will need, is generally more useful (I think.)

  ostrstream os;

  double x=23.34, y=34.45;
  os << x << '*' << y << '=' << x*y << ends;
  const char* p = os.str();
  const size_t length=os.pcount();
  // work with p and length.
  os.freeze(0); // release the memory.

I think the example pretty much shows what this kind of usage does. The member function ``str'' returns a pointer to the internal buffer (which is then frozen, that is, the stream guarantees that it will not deallocate the buffer, nor overwrite it. Attempts to alter the stream while frozen, will fail.) ``pcount'' returns the number of characters stored in the buffer. Last ``freeze'' can either freeze the buffer, or ``unfreeze'' it. The latter is done by giving it a parameter with the value 0. I find this interface to be unfortunate. It's so easy to forget to release the buffer (by simply forgetting to call ``os.freeze(0)'') and that leads to a memory leak.

``strstream'' finally, is just like ``fstream'' the combined read/write stream.

The string streams can be found in the header <strstream.h> (or for some compilers <strstrea.h>.)

Standards update
With the C++ standard, a lot of things have changed regarding streams. As I mentioned already last month, the headers are actually <iostream> and <fstream>, and the names std::istream, std::ostream, etc. The streams are templatized too, which both makes life easier and not. The underlying type for std::ostream is:

  std::basic_ostream<class charT, class traits=std::char_traits<charT> >

``charT'' is the basic type for the stream. For ``ostream'' this is ``char'' (ostream is actually a typedef.) There's another typedef, ``std::wostream'', where the underlying type is ``wchar_t'', which on most systems probably will be 16-bit Unicode. The class template ``char_traits'' is a traits class which holds the type used for EOF, the value of EOF, and some other house keeping things.

Why the standard has removed the file stream open modes ios::noreplace and ios::nocreate is beyond me, as they're extremely useful.

Casting is ugly, and it's hard to see in large code blocks. There are four new cast operators, that are highly visible. They're (in approximate order of increasing danger,) dynamic_cast<T>, static_cast<T>, const_cast<T> and reinterpret_cast<T>. In the binary streaming seen in this article, reinterpret_cast<T> would be used, as a way of saying, ``Yeah, I know I'm violating type safety, but hey, I know what I'm doing, OK?'' The good thing about it is that it's so visible that anyone doubting it can easily spot the dangerous lines and have a careful look. The syntax is:

  os.write(reinterpret_cast<const char*>(&variable), sizeof(variable));

Finally, the generally useful strstreams have been replaced by ``std::istringstream'', ``std::ostringstream'' and ``std::stringstream'' (plus wide variants, std::wistringstream, etc.) defined in the header <sstream>. They do not operate on ``char*'', but on strings (there is a string class, or again, rather a string class template, where the most important template parameter is the underlying character.) ``std::ostringstream'' does not suffer from the freeze problem that ``ostrstream'' does.

Recap
The news this month were:
• streams dealing with files, or in-memory formatting, are used just the same way as the familiar ``cout'' and ``cin,'' which saves both learning and coding (the already written ``operator<<'' and ``operator>>'' can be used for all kinds of streams already.)
• streams can be used for binary, unformatted I/O too. This normally doesn't make sense for ``cout'' and ``cin'' or in-memory formatting (as the name implies,) but it's often useful when dealing with files.
• It is possible to move around in streams, at least file streams and in-memory formatting streams. It's generally not possible to move around in ``cin'' and ``cout.''
• proxy classes can be used to differentiate read and write operations for ``operator[]'' (the construction can of course be used elsewhere too, but it's most useful in this case.)
• friends break encapsulation in a way that, when done right, strengthens encapsulation.
• there's a difference between logical const and bitwise const, but the C++ compiler doesn't know and always assumes bitwise const.
• truly simple smart pointers can save some memory management house keeping, and also be used as a work around for compilers lacking ``mutable'' (i.e. the way of declaring a variable as non-const for const members, in other words, how to differentiate between logical and bitwise const.)
• streams can be used also for in-memory formatting of data.
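For those with a standard library at hand, here's a minimal sketch of how the two ``strstream'' snippets from the in-memory formatting section might look with the string streams from <sstream>. The function names ``parseInt'' and ``formatX'' are made up for the example; only the stream classes themselves come from the standard.

```cpp
#include <sstream>
#include <string>

// Parse an integer from text with std::istringstream.
int parseInt(const std::string& text)
{
    std::istringstream is(text);
    int x = 0;
    is >> x;   // x is 0 if the text does not start with a number
    return x;
}

// Format "x=<value>" without choosing a buffer size up front.
std::string formatX(double x)
{
    std::ostringstream os;
    os << "x=" << x;   // no ends manipulator needed
    return os.str();   // returns a std::string copy, nothing to freeze or release
}
```

Since ``str'' hands back a ``std::string'' by value, there's no buffer ownership to keep track of, which is exactly the freeze problem going away.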

Exercises
• Improve the file array such that it accepts a ``stream&'' instead of a file name, and allows for several arrays in the same file.
• Improve the proxy such that ``int& x=arr[2]'' and ``int* p=&arr[1]'' become illegal.
• Add a constructor to the array that accepts only a ``size_t'' describing the size of the array, which creates a temporary file and removes it in its destructor.
• What happens if we instantiate ``FileArray'' with a user defined type? Is it always desirable? If not, what is desirable? If you cannot define what's desirable, how can instantiation with user defined types be banned?
• How can you, using the stream interface, calculate the size of a file?

Coming up
Next month will be devoted to improving the ``FileArray.'' We'll have iterators, allow arbitrary types, add error handling and more. I assume I won't need to tell you that it'll be possible to use the ``FileArray'' just as ordinary arrays with generic programming, i.e. we can have the exact same source code for dealing with both!

[Note: the source code for this month is here. Ed.]

Last month a file based array template for truly huge amounts of data was introduced. While good, it was nowhere near our goals. Error handling was missing completely, making it dangerous to use in real life. There was no way to say how a user defined data type should be represented on disk, yet they weren't disallowed, which is a dangerous combination. It was also lacking iterators, something that is handy, and is an absolute requirement for generic programming with algorithms that are independent of the source of the data. On top of that, we'd really like the ability to store several different arrays in the same file, and also have an anonymous array which creates a temporary file and removes it when the array is destroyed. All of these will be dealt with this month, yet very little will be new. Instead it's time to make use of all the things learned so far in the course.

The data representation problem
In the file array as implemented last month, data was always stored in a raw binary format, exactly mirroring the bits as they lay in memory. This works fine for integers and such, but can be disastrous in other situations. Imagine a file array of strings (where string is a ``char*''). With the implementation from last month, the pointer value would be stored, not the data pointed to. When reading, a pointer value is read, and when dereferenced, whatever happens to be at the memory location pointed to (if anything) will be used (which is more than likely to result in a rather quick crash.) Anything with pointers is dangerous when stored in a raw binary format, yet we must somehow allow pointers in the array, and preferably so without causing problems for those using the array with built-in arithmetic types. How can this be done?

In part 4, when templates were introduced, a clever little construct called ``traits classes'' was shown. I then gave this rather terse description: ``A traits class is never instantiated, and doesn't contain any data. It just tells things about other classes, that is its sole purpose.'' Doesn't that smell like something we can use here? A traits class that tells how the data types should be represented on disk?

What do we need from such a traits class? Obviously, we need to know how much disk space each element will take, so a ``size'' member will definitely be necessary, otherwise we cannot know how much disk space will be required. We also need to know how to store the data, and how to read it. The easiest way is probably to have member functions ``writeTo'' and ``readFrom'' in the traits class. Thus we can have something looking like this:

  template <class T>
  class FileArrayElementAccess
  {
  public:
    static const size_t size;
    static void writeTo(T value, ostream& os);
    static T readFrom(istream& is);
  };

The array is then rewritten to use this when dealing with the data. The change is extremely minor.
``storeElement'' needs to be rewritten as: template <class T> void FileArray<T>::storeElement(size_t index, const T& element)

{ // what if index >= array_size? typedef FileArrayElementAccess<T> traits; (*pstream).seekp(traits::size*index +sizeof(array_size), ios::beg); // what if seek fails? traits::writeTo(element,*pstream); // what if write failed? // what if too much data was written? } The change for ``readElement'' is of course analogous. However, as indicated by the last comment, a new error possibility has shown up. What if the ``writeTo'' and ``readFrom'' members of the traits class are buggy and write or read more data to disk than they're allowed to? Since it's the user of the array that must write the traits class (at least for their own data types) we cannot solve the problem, but we can give the user a chance to discover that something went wrong. Unfortunately for writing, the error is extremely severe; it means that the next entry in the array will have its data destroyed... In the traits class, by the way, the constant ``size'', used for telling how many bytes in the stream each ``T'' will occupy, poses a problem with most C++ compilers today (modern ones mostly makes life so much easier.) The problem is that a static variable, and also a static constant, in a class, needs to reside somewhere in memory, and the class declaration is not enough for that. This problem is two-fold. To begin with, where should it be stored? It's very much up to whoever writes the class, but somewhere in the code, there must be something like: const size_t ArrayFileElementAccess<X>::size = ...; where ``X'' is the name of the class dealt with by the particular traits specialisation. The second problem is that this is totally unnecessary. What we want is a value that can be used by the compiler at compile time, not a memory location to read a value from. As I mentioned, a modern compiler does make this much easier. In standard C++ it is allowed to write: template<> class ArrayFileElementAccess<X> { public: const size_t size = ...; ... 
}; Note that for some reason that I do not know, this construct is only legal if the type is a constant of an integral or enumeration type. ``size_t'' is such a type, it's some unsigned integral type, probably ``unsigned int'', but possibly ``unsigned long''. The expression denoted ``...'' must be possible to evaluate at compile time. Unless code is written that explicitly takes the address of ``size'', we need not give the constant any space to reside in. The odd construct ``template <>'' is also new C++ syntax, and means that what follows is a specialisation of a previously declared template. For old compilers, however, there's a work-around for integral values, no larger than the largest ``int'' value. We cheat and use an enum instead of a ``size_t''. This makes the declaration: class ArrayFileElementAccess<X> { public: enum { size= ... }; ... }; This is a bit ugly, but it is perfectly harmless. The advantage gained by adding the traits class is flexibility and safety. If someone wants to use a file array for their own class, they're free to do so. However, they must first write a ``FileArrayElementAccess'' specialisation. Failure to do so will result in a compilation error. This early error detection is beneficial. The sloppy solution from last month would not yield any error until run-time, which means a (usually long) debugging session.
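To make the idea concrete, here is a minimal sketch of two specialisations, one for ``int'' and one for ``std::string''. The fixed 32-byte slot with zero padding and silent truncation is an assumption chosen for illustration, not something the article prescribes, and ``readBack'' is a hypothetical demo helper that uses an in-memory stream instead of a file.

```cpp
#include <cstddef>
#include <cstring>
#include <ios>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Primary template, as in the article: only specialisations are usable.
template <class T>
class FileArrayElementAccess;

// Built-in arithmetic type: the raw bytes are a fine representation.
template <>
class FileArrayElementAccess<int> {
public:
    static const std::size_t size = sizeof(int);
    static void writeTo(int value, std::ostream& os) {
        os.write(reinterpret_cast<const char*>(&value), sizeof value);
    }
    static int readFrom(std::istream& is) {
        int value = 0;
        is.read(reinterpret_cast<char*>(&value), sizeof value);
        return value;
    }
};

// Assumed policy for strings: a fixed 32-byte slot, zero padded,
// longer strings silently truncated. The characters are stored,
// never a pointer to them.
template <>
class FileArrayElementAccess<std::string> {
public:
    enum { size = 32 };  // the enum work-around for older compilers
    static void writeTo(const std::string& value, std::ostream& os) {
        const std::size_t width = size;
        char slot[size] = {};  // zero filled
        std::memcpy(slot, value.data(),
                    value.size() < width ? value.size() : width);
        os.write(slot, width);
    }
    static std::string readFrom(std::istream& is) {
        const std::size_t width = size;
        char slot[size] = {};
        is.read(slot, width);
        std::size_t len = 0;
        while (len < width && slot[len] != '\0') ++len;  // stop at padding
        return std::string(slot, len);
    }
};

// Hypothetical demo: store three strings, then seek straight to one slot.
inline std::string readBack(std::size_t index) {
    typedef FileArrayElementAccess<std::string> traits;
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    traits::writeTo("alpha", ss);
    traits::writeTo("beta", ss);
    traits::writeTo("gamma", ss);
    ss.seekg(static_cast<std::streamoff>(index * traits::size), std::ios::beg);
    return traits::readFrom(ss);
}
```

Because every element occupies exactly ``size'' bytes, the position of element n is a plain multiplication, which is what the array relies on when seeking.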

Several arrays in a file
What is needed in order to host several arrays in the same file? One way or the other, there must be a mechanism for finding out where one array begins and another ends. I think the simplest solution is to let go of the file names, and instead make the constructors accept an ``fstream&''. We can then require that the put and get pointer of the stream must be where the array can begin, and we can in turn promise that the put and get pointer will be positioned at the byte after the array end. Of course, in addition to having a reference to the ``fstream'' in our class, we also need the ``home'' position, to seek relative to, when indexing the array. This becomes easy to write for us, and it becomes easy to use as well. For someone requiring only one array in a file, there'll be slightly more code: an ``fstream'' object must be explicitly initialised somewhere, and passed to the constructor of the array, instead of just giving it a name. I think the functionality increase/code expansion exchange is favorable.

In order to improve the likelihood of finding errors, we can waste a few bytes of disk space by writing a well known header and trailer pattern at the beginning and end of the array (before the first element, and after the last one.) If someone wants to allocate an array using an existing file, we can find out if the get pointer is in place for an array start. The constructor creating a file should, however, first try to read from the file to see if it exists. If it does, it should be created from the file, just like the constructor accepting a stream only does. If the read fails, however, we can safely assume that the file doesn't exist and should instead be created.

The change in the class definition, and constructor implementation is relatively straight forward, if long:

template <class T>
class FileArray
{
public:
  FileArray(fstream& fs, size_t elements); // create a new file.
  FileArray(fstream& fs); // use an existing file and get size from there
  ...
private:
  void initFromFile(const char*);
  fstream& stream;
  size_t array_size; // in elements
  streampos home;
};

template <class T>
FileArray<T>::FileArray(fstream& fs, size_t elements)
  : stream(fs), array_size(elements)
{
  // what if the file could not be opened?
  // first try to read and see if there's a begin
  // pattern. Either there is one, or we should
  // get an eof.
  char pattern[6];
  stream.read(pattern,6);
  if (stream.eof()) {
    stream.clear(); // clear error state
                    // and initialise.
    // begin of array pattern.
    stream.write("ABegin",6);
    // must store size of elements, as last month
    const size_t elem_size = FileArrayElementAccess<T>::size;
    stream.write((const char*)&elem_size, sizeof(elem_size));
    // and of course the number of elements
    stream.write((const char*)&array_size, sizeof(array_size));
    // Now that we've written the maintenance
    // stuff, we know what the home position is.
    home = stream.tellp();
    // Then we must go to the end and write
    // the end pattern.
    stream.seekp(home+elem_size*array_size);
    stream.write("AEnd",4);
    // set put and get pointer to past the end pos.
    stream.seekg(stream.tellp());
    return;
  }
  initFromFile(pattern); // shared with other
                         // stream constructor
  if (array_size != elements) {
    // Uh oh. The data read from the stream,
    // and the size given in the constructor
    // mismatches! What now?
    stream.clear(ios::failbit);
  }
  // set put and get pointer to past the end pos.
  stream.seekp(stream.tellg());
}

template <class T>
FileArray<T>::FileArray(fstream& fs)
  : stream(fs)
{
  // First read the head pattern to see if
  // it's right.
  char pattern[6];
  stream.read(pattern,6);
  initFromFile(pattern);
  // set put and get pointer to past the end pos.
  stream.seekp(stream.tellg());
}

template <class T>
void FileArray<T>::initFromFile(const char* p)
{
  // Check if the read pattern is correct
  if (strncmp(p,"ABegin",6)) {
    // What to do? It was all wrong!
    stream.clear(ios::failbit); // for lack of better,
                                // set the fail flag.
    return;
  }
  // OK, we have a valid array, now let's see if
  // it's of the right kind.
  size_t elem_size;
  stream.read((char*)&elem_size,sizeof(elem_size));
  if (elem_size != FileArrayElementAccess<T>::size) {
    // wrong kind of array, the element sizes
    // mismatch. Again, what to do? Let's set
    // the fail flag for now.
    stream.clear(ios::failbit); // stupid name for the
                                // member function, right?
    return;
  }
  // Get the size of the array. Can't do much with
  // the size here, though.
  stream.read((char*)&array_size,sizeof(array_size));
  // Now we're past the header, so we know where the
  // data begins and can set the home position.
  home = stream.tellg();
  stream.seekg(home+elem_size*array_size);
  // Now positioned immediately after the last
  // element.
  char epattern[4];
  stream.read(epattern,4);
  if (strncmp(epattern,"AEnd",4)) {
    // Whoops, corrupt file!
    stream.clear(ios::failbit);
    return;
  }
  // Seems like we have a valid array!
}

Other than the above, the only change needed for the array is that seeking will be done relative to ``home'' rather than the beginning of the file (plus the size of the header entries.) The new versions of ``storeElement'' and ``readElement'' become:

template <class T>
T FileArray<T>::readElement(size_t index) const
{
  // what if index >= max_elements?
  typedef FileArrayElementAccess<T> traits;
  stream.seekg(home+index*traits::size);
  // what if seek fails?
  return traits::readFrom(stream);
  // what if read fails?
  // What if too much data is read?
}

template <class T>
void FileArray<T>::storeElement(size_t index, const T& element)
{
  // what if index >= array_size?
  typedef FileArrayElementAccess<T> traits;
  stream.seekp(home+traits::size*index);
  // what if seek fails?
  traits::writeTo(element,stream);
  // what if write failed?
  // what if too much data was written?
}
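The header and trailer layout described above can be tried out in isolation. The following is a sketch under simplifying assumptions: it handles only ``int'' elements, uses a ``std::stringstream'' as a stand-in for the file, and the names ``writeFramed'', ``readFramed'' and ``twoArraysDemo'' are hypothetical helpers rather than part of the ``FileArray'' class.

```cpp
#include <cstddef>
#include <cstring>
#include <ios>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

namespace sketch {

// Appends one framed array at the current put position:
// "ABegin", element size, element count, the data, "AEnd".
inline void writeFramed(std::ostream& os, const std::vector<int>& data) {
    const std::size_t elem_size = sizeof(int);
    const std::size_t count = data.size();
    os.write("ABegin", 6);
    os.write(reinterpret_cast<const char*>(&elem_size), sizeof elem_size);
    os.write(reinterpret_cast<const char*>(&count), sizeof count);
    if (count != 0)
        os.write(reinterpret_cast<const char*>(&data[0]), count * elem_size);
    os.write("AEnd", 4);
}

// Reads one framed array from the current get position. On success the
// get pointer is left on the byte after the trailer, ready for the next
// array in the same stream.
inline bool readFramed(std::istream& is, std::vector<int>& out) {
    char head[6];
    if (!is.read(head, 6) || std::memcmp(head, "ABegin", 6) != 0) return false;
    std::size_t elem_size = 0, count = 0;
    is.read(reinterpret_cast<char*>(&elem_size), sizeof elem_size);
    is.read(reinterpret_cast<char*>(&count), sizeof count);
    if (!is || elem_size != sizeof(int)) return false;
    const std::streampos home = is.tellg();  // first element lives here
    // Verify the trailer before trusting the element count.
    is.seekg(home + static_cast<std::streamoff>(count * elem_size));
    char tail[4];
    if (!is.read(tail, 4) || std::memcmp(tail, "AEnd", 4) != 0) return false;
    const std::streampos past_end = is.tellg();
    out.resize(count);
    is.seekg(home);
    if (count != 0)
        is.read(reinterpret_cast<char*>(&out[0]), count * elem_size);
    is.seekg(past_end);
    return !is.fail();
}

// Two arrays stored back to back and read again in order.
inline bool twoArraysDemo() {
    std::stringstream file(std::ios::in | std::ios::out | std::ios::binary);
    std::vector<int> a, b;
    a.push_back(1); a.push_back(2); a.push_back(3);
    b.push_back(40); b.push_back(50);
    writeFramed(file, a);
    writeFramed(file, b);
    std::vector<int> ra, rb;
    return readFramed(file, ra) && readFramed(file, rb) && ra == a && rb == b;
}

}  // namespace sketch
```

The promise that the stream is left just past the trailer is what lets the second call pick up the second array without any bookkeeping by the caller.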

Temporary file array
Making use of a temporary file to store a file array that's not to be persistent between runs of the application isn't that tricky. The implementation so far makes use of a stream and known data about the beginning of the stream, number of elements and size of the elements. This can be used for the temporary file as well. The only thing we need to do is to create the temporary file first, open it with an fstream object, tie the stream reference to that object, and remember to delete the file in the destructor.

What's the best way of creating something and making sure we remember to undo it later? Well, of course, creating a new helper class which creates the file in its constructor and removes it in its destructor. Piece of cake. The only problem is that we shouldn't always create a temporary file, and when we do, we can handle it a bit differently from what we do with a ``global'' file that can be shared. For example, we know that we have exclusive rights to the file, and that it won't be reused, so there's no need for the extra information in the beginning and end.

So, how's a temporary file created? The C++ standard offers nothing of its own, and neither does the old de-facto standard. What is available comes from C's <stdio.h>: ``tmpnam'', which is part of standard C, and ``tempnam'', which is a commonly supported extension (POSIX specifies it.) I have in this implementation chosen to use ``tempnam'' as it's more flexible. ``tempnam'' works like this: it accepts two string parameters named ``dir'' and ``prefix''. It first attempts to generate a name for a file in the directory pointed to by the environment variable ``TMPDIR''. If that fails, it tries the directory indicated by the ``dir'' parameter, unless it's 0, in which case a hard-coded default is attempted. It returns a ``char*'' indicating a name to use. The memory area pointed to is allocated with the C function ``malloc'', and thus must be deallocated with ``free'' and not delete[].

Over to the implementation details: We add a class called temporaryfile, which does the above mentioned work. We also add a member variable ``pfile'' which is of type ``ptr<temporaryfile>''. Remember the ``ptr'' template from last month? It's a smart pointer that deallocates whatever it points to in its destructor. It's important that the member variable ``pfile'' is listed before the ``stream'' member, since initialisation is done in the order listed, and the ``stream'' member must be initialised from the file object owned by ``pfile''. We also add a constructor with the number of elements as its sole parameter, which makes use of the temporary file.

class temporaryfile
{
public:
  temporaryfile();
  ~temporaryfile();
  iostream& stream();
private:
  char* name;
  fstream fs;
};

temporaryfile::temporaryfile()
  : name(::tempnam(".","array")),
    fs(name, ios::in|ios::out|ios::binary)
{
  // what if tempnam fails and name is 0
  // what if fs is bad?
}

temporaryfile::~temporaryfile()
{
  fs.close();
  ::remove(name); // what if remove fails?
  ::free(name);
}

In the above code, ``tempnam'', ``remove'' and ``free'' are prefixed with ``::``, to make sure that it's the names in global scope that are meant, just in case someone enhances the class with a few more member functions whose name might clash.

For the sake of syntactical convenience, I have added yet another operator to the ``ptr'' class template:

template <class T>
class ptr
{
public:
  ptr(T* tp=0) : p(tp) {}
  ~ptr() { delete p; }
  T* operator->(void) const { return p; }
  T& operator*(void) const { return *p; }
private:
  ptr(const ptr<T>&);
  ptr<T>& operator=(const ptr<T>&);
  T* p;
};

It's the ``operator->'' that's new, which allows us to write things like ``p->x,'' where p is a ``ptr<X>'', and the type ``X'' contains some member named ``x''. The return type for ``operator->'' must be something that ``operator->'' can be applied to. The explanation sounds recursive, but it makes sense if you look at the above code. ``ptr<X>::operator->()'' returns an ``X*''. ``X*'' is something you can apply the built in ``operator->'' to (which gives you access to the elements.)

template <class T>
FileArray<T>::FileArray(size_t elements)
  : pfile(new temporaryfile),
    stream(pfile->stream()),
    array_size(elements),
    home(stream.tellg())
{
  const size_t elem_size = FileArrayElementAccess<T>::size;
  // put a char just after the end to make
  // sure there's enough free disk space.
  stream.seekp(home+array_size*elem_size);
  char c = 0;
  stream.write(&c,1);
  // what to do if write fails?
  // set put and get pointer to past the end pos
  stream.seekg(stream.tellp());
}

That's it! The rest of the array works exactly as before. No need to rewrite anything else.
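The create-in-constructor, remove-in-destructor pattern can be tried on its own. This is a sketch, not the article's ``temporaryfile'': the class ``ScopedFile'' and the helpers ``fileExists'' and ``scopedFileDemo'' are hypothetical, and the caller supplies the file name so that the example does not depend on ``tempnam''. It also adds ``ios::trunc'' to the open mode, on the assumption of a standard library where opening a non-existent file for both reading and writing otherwise fails.

```cpp
#include <cstdio>
#include <fstream>
#include <istream>
#include <string>

// Hypothetical helper: owns a scratch file for exactly its own lifetime.
class ScopedFile {
public:
    explicit ScopedFile(const std::string& path)
        : name(path),
          fs(name.c_str(), std::ios::in | std::ios::out |
                           std::ios::binary | std::ios::trunc) {}
    ~ScopedFile() {
        fs.close();
        std::remove(name.c_str());  // failure is ignored in this sketch
    }
    std::iostream& stream() { return fs; }
    bool ok() const { return fs.is_open(); }
private:
    ScopedFile(const ScopedFile&);             // not copyable
    ScopedFile& operator=(const ScopedFile&);
    std::string name;  // declared before fs: members are initialised
    std::fstream fs;   // in declaration order, and fs needs the name
};

// True if the file can be opened for reading.
inline bool fileExists(const std::string& path) {
    std::ifstream probe(path.c_str());
    return probe.is_open();
}

// The file exists inside the scope and is gone once the scope is left.
inline bool scopedFileDemo(const std::string& path) {
    bool existed_inside = false;
    {
        ScopedFile tmp(path);
        tmp.stream().write("x", 1);
        tmp.stream().flush();
        existed_inside = tmp.ok() && fileExists(path);
    }
    return existed_inside && !fileExists(path);
}
```

The member order matters here for the same reason that ``pfile'' must precede ``stream'' in the array: whatever is initialised from another member must be declared after it.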

Code reuse
If you're an experienced C programmer, especially experienced with programming embedded systems where memory constraints are tough, and you also have a good memory, you might get a feeling that something's wrong here. What I'm talking about is something I mentioned the first time templates were introduced: ``Templates aren't source code. The source code is generated by the compiler when needed.'' This means that if we in a program use FileArray<int>, FileArray<double>, FileArray<X> and FileArray<Y> (where ``X'' and ``Y'' are some classes,) there will be code for all four types. Now, have a close look at the member functions and see in what way ``FileArray<int>::FileArray(iostream& fs, size_t elements)'' differs from ``FileArray<char>::FileArray(iostream& fs, size_t elements)''. Please do compare them. What did you find? The only difference at all is in the handling of the member ``elem_size'', yet the same code is generated several times with that as the only difference. This is what is often referred to as the template code bloat of C++. We don't want code bloat. We want fast, tight, and slick applications. Since the only thing that differs is the size of the elements, we can move the rest to something that isn't templatised, and use that common base everywhere. I've already shown how code reuse can be done by creating a separate class and having a member variable of that type. In this article I want to show an alternative way of reusing code, and that is through inheritance. Note very carefully that I did not say public inheritance. Public inheritance models ``is-A'' relationships only. We don't want an ``is-A'' relationship here. All we want is to reuse code to reduce code bloat. This is done through private inheritance. Private inheritance is used far less than it should be. Here's all there is to it. Create a class with the desired implementation to reuse and inherit privately from it. Nothing more, nothing less.
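Stripped of all file handling, the recipe can be sketched like this. ``RawSlots'', ``Slots'' and ``slotsDemo'' are hypothetical names, the storage is an in-memory byte buffer rather than a stream, and the sketch assumes element types that can be copied byte by byte and indices that are within bounds.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Non-template base: all byte shuffling lives here and is compiled once.
class RawSlots {
protected:
    RawSlots(std::size_t count, std::size_t elem_size)
        : bytes(count * elem_size), e_size(elem_size), n(count) {}
    void store(std::size_t index, const void* src) {
        std::memcpy(&bytes[index * e_size], src, e_size);
    }
    void load(std::size_t index, void* dst) const {
        std::memcpy(dst, &bytes[index * e_size], e_size);
    }
    std::size_t size() const { return n; }
private:
    std::vector<char> bytes;
    std::size_t e_size;
    std::size_t n;
};

// Thin template layer: only these few type-dependent lines are
// generated again for each T.
template <class T>
class Slots : private RawSlots {
public:
    explicit Slots(std::size_t count) : RawSlots(count, sizeof(T)) {}
    void set(std::size_t index, const T& value) { store(index, &value); }
    T get(std::size_t index) const {
        T value;
        load(index, &value);
        return value;
    }
    using RawSlots::size;  // re-export one base member as public
};

// Two instantiations sharing the single copy of the base code.
inline double slotsDemo() {
    Slots<int> ints(3);
    Slots<double> reals(2);
    ints.set(2, 7);
    reals.set(1, 2.5);
    return ints.get(2) + reals.get(1);
}
```

Because the inheritance is private, code outside ``Slots'' cannot convert a ``Slots<int>*'' to a ``RawSlots*''; the base stays an implementation detail.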
To a user of your class, it matters not at all if you chose not to reuse code at all, reuse through encapsulation of a member variable, or reuse through private inheritance. It's not possible to refer to the descendant class through a pointer to the private base class; private inheritance is an implementation detail only, and not an interface issue.

To the point. What can, and what can not be isolated and put in a private base class? Let's first look at the data. The ``stream'' reference member can definitely be moved to the base, and so can the ``pfile'' member for temporary files. The ``array_size'' member can safely be there too and also the ``home'' member for marking the beginning of the array on the stream. By doing that alone we have saved just about nothing at all, but if we add as a data member in the base class the size (on disk) for the elements, and we can initialise that member through the ``FileArrayElementAccess::size'' traits member, all seeking in the file, including the initial seeking when creating the file array, can be moved to the base class. Now a lot has been gained. Left will be very little. Let's look at the new improved implementation. Now for the declaration of the base class:

class FileArrayBase
{
public:
protected:
  FileArrayBase(iostream& io, size_t elements, size_t elem_size);
  FileArrayBase(iostream& io);
  FileArrayBase(size_t elements, size_t elem_size);
  iostream& seekp(size_t index) const;
  iostream& seekg(size_t index) const;
  size_t size() const; // number of elements
  size_t element_size() const;
private:
  class temporaryfile
  {
  public:
    temporaryfile();
    ~temporaryfile();
    iostream& stream();
  private:
    char* name;
    fstream fs;
  };
  void initFromFile(const char* p);
  ptr<temporaryfile> pfile;
  iostream& stream;
  size_t array_size;
  size_t e_size;
  streampos home;
};

The only surprise here should be the nesting of the class ``temporaryfile.'' Yes, it's possible to define a class within a class. Since the ``temporaryfile'' class is defined in the private section of ``FileArrayBase'', it's inaccessible from anywhere other than the ``FileArrayBase'' implementation. It's actually possible to nest classes in class templates as well, but few compilers today support that. When implementing the member functions of the nested class, it looks a bit ugly, since the surrounding scope must be used.

FileArrayBase::temporaryfile::temporaryfile()
  : name(::tempnam(".","array")),
    fs(name,ios::in|ios::out|ios::binary)
{
  // what if tempnam fails and name is 0
  // what if fs is bad?
}

FileArrayBase::temporaryfile::~temporaryfile()
{
  fs.close();
  ::remove(name); // What if remove fails?
  ::free(name);
}

iostream& FileArrayBase::temporaryfile::stream()
{
  return fs;
}

The implementation of ``FileArrayBase'' is very similar to the ``FileArray'' earlier. The only difference is that we use a parameter for the element size, instead of the traits class.

FileArrayBase::FileArrayBase(iostream& io, size_t elements, size_t elem_size)
  : stream(io), array_size(elements), e_size(elem_size)
{
  char pattern[sizeof(ArrayBegin)];
  stream.read(pattern,sizeof(pattern));
  if (stream.eof()) {
    stream.clear(); // clear error state
                    // and initialize.
    // begin of array pattern.
    stream.write(ArrayBegin,sizeof(ArrayBegin));
    // must store size of elements
    stream.write((const char*)&elem_size, sizeof(elem_size));
    // and of course the number of elements
    stream.write((const char*)&array_size, sizeof(array_size));
    // Now that we've written the maintenance
    // stuff, we know what the home position is.
    home = stream.tellp();
    // Then we must go to the end and write
    // the end pattern.
    stream.seekp(home+elem_size*array_size);
    stream.write(ArrayEnd,sizeof(ArrayEnd));
    // set put and get pointer to past the end pos.
    stream.seekg(stream.tellp());
    return;
  }
  initFromFile(pattern); // shared with other
                         // stream constructor
  if (array_size != elements) {
    // Uh oh. The data read from the stream,
    // and the size given in the constructor
    // mismatches! What now?
    stream.clear(ios::failbit);
  }
  if (e_size != elem_size) {
    stream.clear(ios::failbit);
  }
  // set put and get pointer to past the end pos.
  stream.seekp(stream.tellg());
}

To make life a little bit easier, I've assumed two arrays of char named ``ArrayBegin'' and ``ArrayEnd'', which hold the patterns to be used for marking the beginning and end of an array on disk.

FileArrayBase::FileArrayBase(iostream& io)
  : stream(io)
{
  char pattern[sizeof(ArrayBegin)];
  stream.read(pattern,sizeof(pattern));
  initFromFile(pattern);
  // set put and get pointer to past the end pos.
  stream.seekp(stream.tellg());
}

FileArrayBase::FileArrayBase(size_t elements, size_t elem_size)
  : pfile(new temporaryfile),
    stream(pfile->stream()),
    array_size(elements),
    e_size(elem_size),
    home(stream.tellg())
{
  stream.seekp(home+array_size*e_size);
  char c = 0;
  stream.write(&c,1);
  // set put and get pointer to past the end pos.
  stream.seekg(stream.tellp());
}

void FileArrayBase::initFromFile(const char* p)
{
  // Check if the read pattern is correct
  if (strncmp(p,ArrayBegin,sizeof(ArrayBegin))) {
    // What to do? It was all wrong!
    stream.clear(ios::failbit); // for lack of better,
                                // set the fail flag.
    return;
  }
  // OK, we have a valid array, now let's see if
  // it's of the right kind.
  stream.read((char*)&e_size,sizeof(e_size));
  // Get the size of the array. Can't do much with
  // the size here, though.
  stream.read((char*)&array_size,sizeof(array_size));
  // Now we're past the header, so we know where the
  // data begins and can set the home position.
  home = stream.tellg();
  stream.seekg(home+e_size*array_size);
  // Now positioned immediately after the last
  // element.
  char epattern[sizeof(ArrayEnd)];
  stream.read(epattern,sizeof(epattern));
  if (strncmp(epattern,ArrayEnd,sizeof(ArrayEnd))) {
    // Whoops, corrupt file!
    stream.clear(ios::failbit);
    return;
  }
  // Seems like we have a valid array!
}

iostream& FileArrayBase::seekg(size_t index) const
{
  // what if index is out of bounds?
  stream.seekg(home+index*e_size);
  // what if seek failed?
  return stream;
}

iostream& FileArrayBase::seekp(size_t index) const
{
  // What if index is out of bounds?
  stream.seekp(home+index*e_size);
  // What if seek failed?
  return stream;
}

size_t FileArrayBase::size() const
{
  return array_size;
}

size_t FileArrayBase::element_size() const
{
  return e_size;
}

Apart from the tricky questions, it's all pretty straight forward. The really good news, however, is how easy this makes the implementation of the class template ``FileArray''.

template <class T>
class FileArray : private FileArrayBase
{
public:
  FileArray(iostream& io, size_t size); // create one.
  FileArray(iostream& io); // use existing array
  FileArray(size_t elements); // create temporary
  T operator[](size_t index) const;
  FileArrayProxy<T> operator[](size_t index);
  size_t size() { return FileArrayBase::size(); }
private:
  FileArray(const FileArray<T>&); // illegal
  FileArray<T>& operator=(const FileArray<T>&); // illegal
  T readElement(size_t index) const;
  void storeElement(size_t index, const T& elem);
  friend class FileArrayProxy<T>;
};

Now watch this!

template <class T>
FileArray<T>::FileArray(iostream& io, size_t elements)
  : FileArrayBase(io, elements, FileArrayElementAccess<T>::size)
{
}

template <class T>
FileArray<T>::FileArray(iostream& io)
  : FileArrayBase(io)
{
  // what if element_size is wrong?
}

template <class T>
FileArray<T>::FileArray(size_t elements)
  : FileArrayBase(elements, FileArrayElementAccess<T>::size)
{
}

template <class T>
T FileArray<T>::operator[](size_t index) const
{
  // what if index>= size()?
  return readElement(index);
}

template <class T>
FileArrayProxy<T> FileArray<T>::operator[](size_t index)
{
  // what if index>= size()?
  return FileArrayProxy<T>(*this, index);
}

template <class T>
void FileArray<T>::storeElement(size_t index, const T& element)
{
  // what if index>= size()?
  iostream& s = seekp(index); // parent seekp
  // what if seek fails?
  FileArrayElementAccess<T>::writeTo(element,s);
  // what if write failed?
  // What if too much data was written?
}

template <class T>
T FileArray<T>::readElement(size_t index) const
{
  // what if index>= size()?
  iostream& s = seekg(index); // parent seekg
  // what if seek fails?
  T t = FileArrayElementAccess<T>::readFrom(s);
  // what if read failed?
  // What if too much data was read?
  return t;
}

How much easier can it get? This reduces code bloat, and also makes the source code easier to understand, extend and maintain.

What can go wrong?
Already in the very beginning of this article series, part 1, I introduced exceptions, the C++ error handling mechanism. Of course exceptions should be used to handle the error situations that can occur in our array class. When I introduced exceptions, I didn't tell the whole truth about them. There was one thing I didn't tell, because at that time it wouldn't have made much sense. That one thing is that when exceptions are caught, dynamic binding works, or to use wording slightly more English-like, we can create exception class hierarchies with public inheritance, and we can choose what level to catch. Here's a mini example showing the idea:

class A {};
class B : public A {};
class C : public A {};
class B1 : public B {};

void f() throw (A); // may throw any of the above

void x()
{
  try {
    f();
  }
  catch (B& b) {
    // **1
  }
  catch (C& c) {
    // **2
  }
  catch (A& a) {
    // **3
  }
}

At ``**1'' above, objects of class ``B'' and class ``B1'' are caught if thrown from ``f''. In ``**2'' objects of class ``C'' (and descendants of C, if any are declared elsewhere) are caught. At ``**3'' all others from the ``A'' hierarchy are caught.

This may seem like a curious detail of purely academic worth, but it's extremely useful. We can use abstraction levels for errors. For example, we can have a root class ``FileArrayException'', from which all other exceptions regarding the file array inherit. We can see that there are clearly two kinds of errors that can occur in the file array: abuse, and environmental issues outside the control of the programmer. By abuse I mean things like indexing outside the valid bounds, and by environmental issues I mean faulty or full disks. (Since there are several programs running, a check if there's enough disk space is still taking a chance. Even if there was enough free space when the check was made, that space may be occupied when the next statement in the program is executed.) A reasonable start for the exception hierarchy then becomes:

class FileArrayException {};
class FileArrayLogicError : public FileArrayException {};
class FileArrayRuntimeError : public FileArrayException {};

Here ``FileArrayLogicError'' is for clear violations of the not too clearly stated preconditions, and ``FileArrayRuntimeError'' for things that the programmer may not have a chance to do something about. In a perfectly debugged program, the only exceptions ever thrown from file arrays will be of the ``FileArrayRuntimeError'' kind. We can divide these further into:

class FileArrayCreateError : public FileArrayRuntimeError {};
For whenever the creation of the array fails, regardless of why (it's not very easy to find out if it's a faulty disk or lack of disk space.)

class FileArrayStreamError : public FileArrayRuntimeError {};
If after creation, something goes wrong with a stream, for example if seeking or reading/writing fails.

class FileArrayDataCorruptionError : public FileArrayRuntimeError {};
If an array is created from an old existing file, and we note that the header or trailer doesn't match the expected.

class FileArrayBoundsError : public FileArrayLogicError {};
Addressing outside the legal bounds.

class FileArrayElementSizeError : public FileArrayLogicError {};
If the read/write members of the element access traits class are faulty and either write too much (thus overwriting the data for the next element) or read too much (in which case the last few bytes read will be garbage picked from the next element.)

It's of course possible to take this even further. I think this is quite enough, though. Now we have a reasonably fine level of error reporting, yet an application that wishes a coarse level of error handling can choose to catch the higher levels of the hierarchy only.

As an exercise, I invite you to add the throws to the code. Beware, however, it's not a good idea to add exception specifications to the member functions making use of the T's (since you cannot know which operations on T's may throw, and what they do throw.) You can increase the code size and legibility gain from the private inheritance of the implementation in the base by putting quite a lot of the error handling there.

Iterators
An iterator into a file array is something whose behavior is analogous to that of pointers into arrays. We want to be able to create an iterator from the array (in which case the iterator refers to the first element of the array.) We want to access that element by dereferencing the iterator (unary operator *,) and we want iterator arithmetic with integers. An easy way of getting there is to let an iterator contain a pointer to a file array, and an index. Whenever the iterator is dereferenced, we return (*array)[index]. That way we even get error handling for iterator arithmetic that leads us outside the valid range for the array, for free from the array itself. The iterator arithmetic becomes simple too, since it's just ordinary arithmetic on the index type.

• iterator+n yields a new iterator referring to the iterator. • iterator1>=iterator2 returns !(iterator1<iterator2). As an example. private: FileArray<T>* array. and thus a good chance of making errors. it's an error and we throw an exception. }.index. Likewise for operator<=. const FileArrayIterator<T>& i). • iterator1==iterator2 returns non-zero if the arrays and indices of iterator1 and iterator2 are equal.e a • leArrayProxy. If the iterators refer to different arrays. and the actions we want. If iterator1 and iterator2 refer to different arrays. • iterator1<iterator2 returns true if the iterators refer to the same array and iterator1. • addition of array and ``long int'' value ``n'' yields iterator referring to n:th element of array. Here's my idea: • creation from array yields iterator referring to first element • copy construction and assignment are of course well behaved. FileArrayIterator<T>& operator+=(long n). Likewise for operator>. and analogous for operator-. The implementation thus seems easy. all that's needed is to define the operations needed for the iterators. unless you want to give the class users some rather unhealthy surprises) is to define ``operator+='' as a member of the class. template <class T> FileArrayIterator<T>::FileArrayIterator( const FileArray<T>& a ) : array(&a). template <class T> FileArrayIterator<T> operator+(const FileArrayIterator<T>& i.. • iterator+=n (where n is of type long int) adds n to the value of the index in the iterator. * iterator[n] returns (*array)[index+n].index < iterator2. template <class T> FileArrayIterator<T> operator+(long n. long n). unsigned long index. and two versions of operator+ that are implemented with ``operator+=''. • iterator1!=iterator2 returns !(iterator1==iterator2) • *iterator returns whatever (*array)[index] returns. • iterator1-iterator2 yields a long int which is the difference between the indices of the iterators. Neither of the above is difficult.. . 
``o+v'' and ``v+o'' are well defined and behaves like they do for the built in types (which they really ought to. Operator -= is analogous. Here's how it's done in the iterator example: template <class T> class FileArrayIterator { public: FileArrayIterator(FileArray<T>& f). a rule of thumb when writing a class for which an object ``o'' and some other value ``v'' the operations ``o+=v''. it's dereferencing the iterator that's an error if the index is out of range. it's an error and we throw an exception. FileArrayProxy<T> operator*(). index(0) { } template <class T> FileArrayIterator<T>::FileArrayIterator( const FileArrayIterator<T>& i ) . i. I think the above is an exhaustive list. It's just a lot of code to write. thus reducing the amount to write and also the risk for errors. • moving forwards and backwards with operator++ and operator--. This addition is never an error. however. With a little thought.index+n:th element of the array.since it's just ordinary arithmetics on the index type. FileArrayProxy<T> operator[](long n). quite a lot of code can be reused over and over.

  template <class T>
  FileArrayIterator<T>& FileArrayIterator<T>::operator+=(long n)
  {
    index+=n;
    return *this;
  }

  template <class T>
  FileArrayIterator<T> operator+(const FileArrayIterator<T>& i, long n)
  {
    FileArrayIterator<T> it(i);
    return it+=n;
  }

  template <class T>
  FileArrayIterator<T> operator+(long n, const FileArrayIterator<T>& i)
  {
    FileArrayIterator<T> it(i);
    return it+=n;
  }

  template <class T>
  FileArrayProxy<T> FileArrayIterator<T>::operator*()
  {
    return (*array)[index];
  }

  template <class T>
  FileArrayProxy<T> FileArrayIterator<T>::operator[](long n)
  {
    return (*array)[index+n];
  }

Surely, the code for the two versions of ``operator+'' must be written, but since its behaviour is defined in terms of ``operator+='' it means that if we have an error, there's only one place to correct it. There's no need to display all the code here in the article, you can study it in the sources. The above shows how it all works, though, and as you can see, it's fairly simple.

Recap
This month the news in short was:
• You can increase flexibility for your templates without sacrificing ease of use or safety by using traits classes.
• Enumerations in classes can be used to have class-scope constants of integral type.
• Modern compilers do not need the above hack. Defining a class-scope static constant of an integral type in the class declaration is cleaner and more type safe.
• Standard C++, and even C, do not have any support for the notion of temporary files. Fortunately there are commonly supported extensions to the languages that do.
• Private inheritance can be used for code reuse.
• Private inheritance is very different from public inheritance. Public inheritance models ``is-A'' relationships, while private inheritance models ``is-implemented-in-terms-of'' relationships.
• A user of a class that has privately inherited from something else cannot take advantage of this fact. To a user the private inheritance doesn't make any difference.
• Private inheritance is in real-life used far less than it should be. In many situations where public inheritance is used, private inheritance should've been used.
• Exception catching is polymorphic (i.e. dynamic binding works when catching.)

• The polymorphism of exception catching allows us to create an arbitrarily fine-grained error reporting mechanism while still allowing users who want a coarse error reporting mechanism to use one (they'll just catch classes near the root of the exception class inheritance tree.)
• Always implement binary operator+, operator-, operator* and operator/ as functions outside the classes, and always implement them in terms of the operator+=, operator-=, operator*= and operator/= members of the classes.

Exercises
• Alter the file array such that it's possible to instantiate two (or more) kinds of FileArray<X> in the same program, where the alternatives store the data in different formats. (hint, the alternatives will all need different traits class specialisations.)
• What's the difference between using private inheritance of a base class, and using a member variable of that same class, for reusing code? In which situations is it crucial which alternative you choose?

Coming up
Next month we'll have a look at smart pointers. I'm beginning to dry up on topics now, however, so please write and give me suggestions for future topics to cover.

[Note: the source code for this month is here. Ed.]

In the past two articles, we've seen how a simple smart pointer, called simply ``ptr<T>'', was used to make memory handling a little bit easier. That is the core purpose of all smart pointers. You can find a few things in common with them all. They're templates, they relieve you of the burden of remembering to deallocate the memory, their syntax resembles that of pointers, they aren't pointers, and they're dangerous if you forget that you're dealing with smart pointers.

While ``ptr<T>'' served its purpose, it's a bit too simplistic to be generally useful. For example there was no way to rebind an object to another pointer, or to tell it not to delete the memory (that too can be useful at times, as we will see later in this article.)

This article is devoted to the only smart pointer provided by the standard C++ library, the class template ``auto_ptr<T>''.

The problem to solve
I do not know what the core issues were when the ``auto_ptr<T>'' was designed, but I know what problems the implementation provided does solve.
• exception safety
• safe memory area ownership transfer
• no confusion with normal pointers
• controlled and visible rebinding and release of ownership
• works with dynamic types
• pointer-like syntax for pointer-like behaviour

Let us have a look at each of these in some detail and compare with the previous ``ptr<T>''.

Exception safety
In this respect ``auto_ptr<T>'' and ``ptr<T>'' are equal. They both delete whatever they point to in their destructor. The only thing that ``auto_ptr<T>'' has to offer over ``ptr<T>'', with respect to exception safety, is that we can tell an ``auto_ptr<T>'' object that it no longer owns a memory area. This can be used for holding onto something we want to return in normal cases, but deallocate in exceptional situations. Here is a code fragment showing such a situation:

  void f(); // may throw something

  int* ptr()
  {
    // returns a newly allocated value on success,
    // or the 0 pointer on failure.
    try {
      auto_ptr<int> p(new int(1));
      f();
      return p.release();
    }
    catch (...) {
      return 0;
    }
  }

In the code above, an exception thrown from ``f'' results in the destruction of the ``auto_ptr<int>'' object ``p'' before the call of the ``release'' member function, which means that the object pointed to will be deleted. If, however, ``f'' does not throw any exception, ``p.release()'' is called, and the value returned from there is passed to the caller. The member function ``release'' releases ownership of the memory area from the ``auto_ptr<int>'' object. The value returned is the pointer to the memory area. Since the object no longer owns the memory area, it will not be deallocated. Of course, the above program snippet is too simplistic to be useful.

Safe memory area ownership transfer
This safety is achieved by cheating in the ``assignment'' operator and ``copy'' constructor. The reason I've quoted the names assignment and copy, is that the cheat is so bad that it's not really an assignment and definitely not a copy. Rather, both are ownership transfer operations. What happens is that they both modify the right hand side, by relieving it of ownership while accepting ownership itself. Below are some examples of this:

  // simple transfer
  auto_ptr<int> p1(new int(1));
  auto_ptr<int> p2;
  // p1 owns the memory area, p2 doesn't own anything.
  p2 = p1;
  // now p2 owns the memory area, p1 doesn't.
  // p1 has become the 0 "pointer"
  auto_ptr<int> p3(p2);
  // Now it is p3 that owns the memory area, not
  // p1 or p2.

An important issue here for those of you who have used early versions of the ``auto_ptr<T>'' is that older versions did not become 0 ``pointers'' when not owning the memory area, but that is the behaviour set in the final draft of the C++ standard.

One of the headaches of using dynamically allocated memory is knowing who is responsible for deallocating the memory at any given moment in a program's lifetime. The ``auto_ptr<T>'' makes that rather easy, as can be seen above. The properties of the ``auto_ptr<T>'' are more useful when working with functions, where it works as both documentation and implementation of ownership transfer. What do you think about this?

  auto_ptr<int> creation();
  void termination(auto_ptr<int> rip);

  void f()
  {
    auto_ptr<int> pi=creation();
    // It's now clear that we are responsible for
    // deletion of the memory area allocated. Even
    // if we chose to release ownership from "pi",
    // we must take care of deallocation somehow.

    // use pi for something

    termination(pi);
    // since we're sending it off as an auto_ptr<T>,
    // and since the function "termination" wants
    // an auto_ptr<T> it wants the responsibility,
    // deletion is not our headache anymore.
  }

I think the above example speaks for itself. Any function that returns an ``auto_ptr<T>'' leaves it to the caller to take care of the deallocation, and any function that accepts an ``auto_ptr<T>'' requires ownership to work. If something goes wrong between calling ``creation'' and ``termination'', such that ``termination'' ought not be called, we must take care of the deallocation, but since we have it in an ``auto_ptr<T>'' that is automatically done for us if we return or throw an exception. This functionality is an advantage that ``auto_ptr<T>'' offers over ``ptr<T>'', since the latter doesn't have any way of transferring ownership.
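To see the ``cheating'' copy and assignment at work without depending on a particular library version (the standard ``auto_ptr<T>'' was later deprecated in C++11 and removed in C++17), here is a minimal sketch of the same transfer semantics. The class name ``transfer_ptr'' and the function ``transfer_demo'' are invented for illustration; this is an assumption-laden miniature, not the standard library's implementation.

```cpp
#include <cassert>

// Hypothetical miniature of the ownership transfer described above.
template <class T>
class transfer_ptr {
public:
    explicit transfer_ptr(T* t = 0) : p(t) {}

    // "Copy": takes the pointer away from the right hand side.
    transfer_ptr(transfer_ptr& other) : p(other.release()) {}

    // "Assignment": frees what we own, then takes over from the right hand side.
    transfer_ptr& operator=(transfer_ptr& other) {
        if (this != &other) {
            delete p;
            p = other.release();
        }
        return *this;
    }

    ~transfer_ptr() { delete p; }

    T* get() const { return p; }
    T* release() { T* tmp = p; p = 0; return tmp; }

private:
    T* p;
};

// Replays the "simple transfer" example and checks who owns what.
inline void transfer_demo() {
    transfer_ptr<int> p1(new int(1));
    transfer_ptr<int> p2;
    int* raw = p1.get();

    p2 = p1;                                  // p2 owns, p1 is the 0 "pointer"
    assert(p1.get() == 0 && p2.get() == raw);

    transfer_ptr<int> p3(p2);                 // p3 owns, p1 and p2 do not
    assert(p2.get() == 0 && p3.get() == raw);
}
```

Note that both operations take a non-const reference, which is what allows them to modify the right hand side.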

No confusion with normal pointers
Since the auto pointers have the behaviour outlined above, it is extremely important that they cannot accidentally be confused with normal pointers. This is done by explicitly prohibiting all implicit conversions between pointer types and ``auto_ptr<T>'' types. The erroneous code below shows how:

  auto_ptr<int> creator();
  void termination(auto_ptr<int>);

  void f()
  {
    int* p = creator();
    // illegal. An auto_ptr<T> cannot be implicitly
    // converted to a pointer.

    auto_ptr<int> ap=p;
    // illegal. A pointer cannot be implicitly
    // converted to an auto_ptr<T>. When creating a
    // new auto_ptr<T> object, use the constructor
    // syntax auto_ptr<T> ap(p).

    ap=p;
    // also illegal. If you want to rebind an
    // auto_ptr<T> object to point to something else,
    // use the "reset" member function, ap.reset(p).

    termination(p);
    // also illegal. The auto_ptr<T> required by the
    // termination function cannot be implicitly
    // created from the pointer.
  }

It is indeed fortunate that the first and last error above are illegal. What would the first mean? Would the implicit conversion from ``auto_ptr<T>'' to a raw pointer transfer ownership or not? All implementations I have seen where such implicit conversions are allowed do not transfer the ownership, which in the situation above means that the memory would be deallocated when the ``auto_ptr<T>'' object returned is destroyed (which it would be immediately after the conversion.) The last is as bad. What about this situation?

  int i;
  termination(&i);

Ouch! The function would attempt to delete the local variable. For both the result will be something that in standardese is called ``undefined behaviour'', but which in normal English best translates to ``a crash now, a crash later, or generally funny behaviour (possibly followed by a crash later.)'' Well, since it is illegal, we do not have to worry about it.

The second error ``auto_ptr<int> ap=p;'' is perhaps a bit unfortunate since the intended behaviour is clear. That it is illegal comes as a natural consequence of banning the third situation ``ap=p'' which is not clear. ``ap'' might be declared somewhere far far away, so that in the code near the assignment it is not obvious if it is an ``auto_ptr<T>'' or a normal pointer. Imagine the maintenance headaches you could get otherwise. In this respect ``auto_ptr<T>'' is better than ``ptr<T>'', since ``ptr<T>'' does allow implicit construction, allowing the last error.

Controlled and visible rebinding and release of ownership
If we want to rebind an ``auto_ptr<T>'' object to another memory area, it is important that the memory area currently owned by the object (if any) is deallocated. The member function ``reset'' takes care of that. Calling ``ap.reset(p)'' will deallocate whatever ``ap'' owns (if anything) and make it own whatever ``p'' points to.

If we want a normal pointer from an ``auto_ptr<T>'' object, we can get it in two ways, depending on the desired effect. The member function ``release'' gives us a normal pointer to the memory area owned by the ``auto_ptr<T>'' object, and also gives us the ownership, so that we will be responsible for the deallocation.

If we do not want that responsibility, but temporarily need a normal pointer to the memory area, we use the ``get'' member function. Here is an example showing the differences:

  void func(const int*); // may throw

  int* f(void)
  {
    auto_ptr<int> p(new int(1));
    func(p.get());      // call func with a normal pointer
    return p.release(); // return the pointer
  }

Above we see that the function ``func'' requires a normal pointer, but it does not assume ownership, so we use the ``get'' member to temporarily get the pointer and pass it to ``func''. This function ``f'' then returns the raw pointer if ``func'' does its job, but if it fails with an exception, the ``auto_ptr<int>'' object ``p'' will deallocate the memory in its destructor. Since ``ptr<T>'' was specifically designed to disallow transfer of ownership, this functionality is added-value for ``auto_ptr<T>''.

Works with dynamic types
Just as a normal pointer to a base class can point to an object of a publicly derived class, an ``auto_ptr<T>'' can too. This is not particularly strange:

  class A {};
  class B : public A {};

  auto_ptr<A> pa(new B());
  auto_ptr<B> pb(new B());
  pa=pb;
  auto_ptr<A> pa2(pb);

The reverse is (of course) not allowed. For ``ptr<T>'' this is not a problem, since the functionality is only required if ownership transfer is allowed.

Pointer-like syntax for pointer-like behaviour
For the small subset of a pointer's functionality that is implemented in the ``auto_ptr<T>'' class template, the syntax is exactly the same. We get access to the element pointed to with ``operator*'' and ``operator->''. This is the only functionality of a pointer that is implemented. Here it is a tie between ``auto_ptr<T>'' and ``ptr<T>'', since the functionality is exactly the same and so is the syntax.

Implementation
The definition of ``auto_ptr<T>'' looks as follows:

  template <class T>
  class auto_ptr
  {
  public:
    explicit auto_ptr(T* t = 0) throw ();
    auto_ptr(auto_ptr<T>& t) throw();
    template <class Y>
      auto_ptr(auto_ptr<Y>& t) throw();
    auto_ptr<T>& operator=(auto_ptr<T>& t) throw ();
    template <class Y>
      auto_ptr<T>& operator=(auto_ptr<Y>& t) throw ();
    ~auto_ptr() throw ();
    T& operator*() const throw ();
    T* operator->() const throw ();
    T* get(void) const throw ();
    T* release() throw ();
    void reset(T* t = 0) throw ();
  private:
    T* p;
  };

Three new details can be seen above. The keyword ``explicit'' in front of the constructor, and ``template <class Y>'' inside the class definition. Both of these are relatively recent additions to the C++ language and far from all compilers support them. Third, and most important (please take note of this,) is that the ``copy'' constructor and ``assignment'' operator do take a non-const reference to their right hand side, so that it can be modified.

``explicit'' is what disallows implicit construction of objects, for example in function calls (see the error example above.) This keyword is, strictly speaking, not needed. There is a fake around it, however, which you will see when we get to the implementation details.

The member templates, as the ``template <class Y>'' used inside the class definition is called, is a way of creating new member functions at need. This is what makes it possible to say ``auto_ptr<A> pa; auto_ptr<B> pb; pa=pb''. With this mini-example, and the code above, we can see that a member function

  auto_ptr<A>& auto_ptr<A>::operator=(auto_ptr<B>&) throw()

will be generated. If class ``B'' is publicly derived from class ``A'', the generated code will compile just fine, otherwise we will get an error message from the compiler. It is an essential addition to the C++ language. Unfortunately even fewer compilers support this than support the ``explicit'' keyword. This feature can, however, not be worked around, to the best of my knowledge.

The code
Let us do the member functions one by one, beginning with the constructor.

  template <class T>
  inline auto_ptr<T>::auto_ptr(T* ptr) throw()
    : p(ptr)
  {
  }

The ``inline'' keyword is new for this course, although it has been part of C++ for a very long time, and it will be used for all member functions of the ``auto_ptr<T>''. Marking a function ``inline'' is a way of hinting to the compiler that you think this function is so simple that it can insert the function code directly where needed, instead of making a function call. This is just a hint, a compiler is free to ignore it, and likewise a good compiler may inline even functions not marked as inline (provided you cannot see any difference in the behaviour of the program.) Few compilers are smart enough to inline automatically, so there's a place for the ``inline'' keyword.

This constructor is marked ``explicit'' in the class definition. The only thing it needs to do is to initialize the ``auto_ptr<T>'' object such that it owns the memory area, and if it points to anything at all, it owns it (by definition.)

I mentioned above that the ``explicit'' keyword is not, strictly speaking, necessary. Here is the promised work-around:

  template <class T>
  class explicit
  {
  public:
    explicit(T t) : value(t) {};
    operator T() const { return value; };
  private:
    T value;
  };

  template <class T>
  inline auto_ptr<T>::auto_ptr(explicit<T*> ptr) throw()
    : p(ptr)
  {
  }

The way this works is as follows: By default, implicit conversions are allowed, but only one user defined implicit conversion may take place. Constructing an object is a user defined conversion, and executing a conversion operator is too. It is an error if two or more implicit conversions are required to get the desired effect, when attempting to call a function requiring an ``auto_ptr<T>'' parameter with a normal pointer. Look at this example usage:

  void termination(auto_ptr<int> pi);

  void func()
  {
    auto_ptr<int> pi(new int(1)); //**1 legal
    termination(new int(2));      //**2 error
  }

The code at //**1 is not in error, because we say that we want an object of type ``auto_ptr<int>''. Since we've been so stern about this, we will be obeyed. Our ``auto_ptr<int>'' accepts as its parameter an ``explicit<int*>'' which is implicitly created from the pointer value. It may seem like there are two implicit conversions taking place here, one for creating the ``explicit<T>'' object, and one for getting the value out of it, but that is not quite true, even though it may seem so. Then, it is a detail of the innards of the ``auto_ptr<T>'' constructor how it is used, in this case to get the value from it. The code at //**2 however, is in error, because the call to ``termination'' requires two user defined conversions. One from ``int*'' to ``explicit<int*>'', and one from ``explicit<int*>'' to ``auto_ptr<int>''. Please see the provided source code for how to allow both versions to coexist for different compilers in the same source file.

  template <class T>
  inline auto_ptr<T>::~auto_ptr() throw ()
  {
    delete p;
  }

If the object owns anything, it will be deleted by the destructor. If the object does not own anything, ``p'' will be the 0 pointer. Note that deleting the 0 pointer is legal, and does nothing at all.

  template <class T>
  inline auto_ptr<T>::auto_ptr(auto_ptr<T>& t) throw()
    : p(t.release())
  {
  }

There is not much strange going on here, except that the parameter is a non-const reference. As mentioned far above, the member ``release'' relieves the object of ownership and returns the pointer. Thus this constructor makes ``p'' point to what ``t'' did point to, and alters ``t'' so that it becomes the 0 ``pointer''.

  template <class T> template <class Y>
  inline auto_ptr<T>::auto_ptr(auto_ptr<Y>& t) throw()
    : p(t.release())
  {
  }

The code for this constructor is, of course, the same as for the previous one. Note, by the way, the syntax for a member template, with the two subsequent ``template <...>''. Of course, users of compilers that do not implement member templates will get compilation errors on this member function. Please see the source code for how to work around the compilation error (the work around is simply not to have this member function, which means that the resulting ``auto_ptr'' will be limited in functionality.) Note that both are necessary. I made a mistake with the ``auto_ptr<T>'' implementation available in the adapted SGI STL, in that I thought the latter would imply the former. It doesn't.

  template <class T>
  inline auto_ptr<T>& auto_ptr<T>::operator=(auto_ptr<T>& t) throw ()
  {
    reset(t.release());
    return *this;
  }

  template <class T> template <class Y>
  inline auto_ptr<T>& auto_ptr<T>::operator=(auto_ptr<Y>& t) throw ()
  {
    reset(t.release());
    return *this;
  }

This is pretty much the same story as the ``copy'' constructor. Since we are not creating a new object, however, we cannot just assign to ``p'' (it may point to something, in which case it must be deallocated.) The member function ``reset'' does exactly what we want, delete whatever ``p'' points to, and give it a new value.

  template <class T>
  inline T* auto_ptr<T>::release() throw ()
  {
    T* tp=p;
    p=0;
    return tp;
  }

Nothing strange, is there? The object is relieved of ownership by making ``p'' the 0 pointer, and the value previously held by ``p'' is returned.

  template <class T>
  inline void auto_ptr<T>::reset(T* t) throw ()
  {
    if (t != p) {
      delete p;
      p=t;
    }
  }

Deletes what it points to and sets ``p'' to the given value, just as mentioned in the introduction of the class, except the safety guard against resetting to the value already held. If we didn't have this guard, resetting to the current value would deallocate the memory and keep the ownership of it, for later deletion again! It seems like a better way is to just do nothing if the situation ever arises.

  template <class T>
  inline T* auto_ptr<T>::get(void) const throw ()
  {
    return p;
  }

Not much to say about this one.

  template <class T>
  inline T& auto_ptr<T>::operator*() const throw ()
  {
    return *p;
  }

  template <class T>
  inline T* auto_ptr<T>::operator->() const throw ()
  {
    return p;
  }

Not much to say. These are not identical with the version of ``ptr<T>'' from the previous issue of the course. One word on the way, though. ``operator->'' can, of course, only be used if ``T'' is a struct or class type, and not built-in types. On some older compilers, it's even illegal to instantiate ``auto_ptr<T>'' if ``T'' is not a struct or class type. In most cases this is a minor limitation, however; it is normally structs and classes you handle this way, after all.

Efficiency
The question of efficiency pops up now and then. How much does it cost, performance and memory-wise, to use the ``auto_ptr<T>'' instead of ``ptr<T>'' from last month? If you use ``auto_ptr<T>'' instead of ``ptr<T>'', and use only the functionality that ``ptr<T>'' offers, the price is nothing at all. The constructor, destructor, ``operator*'' and ``operator->'' hold exactly the same code for both templates. You pay for what you use only.

Compared to raw pointers and doing your own deletion? I do not know. It will depend a lot on how clever your compiler is with inlining. Most probably close to none at all. If you have a measurable speed difference in a real-world application, I would say the difference is that with ``auto_ptr<T>'' you do many more deletions (i.e. you have mended memory leaks you were not aware of having.)
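The effect of the guard in ``reset'' is easy to observe if deletions are counted. The following sketch is a hypothetical cut-down owner holding only the members discussed above; the names ``Tracked'', ``owner_sketch'' and ``reset_demo'' are invented for illustration, and copying is deliberately left out.

```cpp
#include <cassert>

// Counts live objects so that deletions become observable.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Hypothetical cut-down owner with only get, release and reset.
template <class T>
class owner_sketch {
public:
    explicit owner_sketch(T* t = 0) : p(t) {}
    ~owner_sketch() { delete p; }

    T* get() const { return p; }
    T* release() { T* tp = p; p = 0; return tp; }

    void reset(T* t = 0) {
        if (t != p) {   // guard: resetting to the held value does nothing
            delete p;
            p = t;
        }
    }

private:
    owner_sketch(const owner_sketch&);            // copying is not part of this sketch
    owner_sketch& operator=(const owner_sketch&);
    T* p;
};

inline void reset_demo() {
    owner_sketch<Tracked> op(new Tracked);
    assert(Tracked::live == 1);

    op.reset(op.get());          // guarded: nothing is deleted
    assert(Tracked::live == 1);

    op.reset(new Tracked);       // old object deleted, new one owned
    assert(Tracked::live == 1);

    Tracked* raw = op.release(); // ownership handed back to the caller
    assert(op.get() == 0 && Tracked::live == 1);
    delete raw;
    assert(Tracked::live == 0);
}
```

Without the ``if (t != p)'' test, the call ``op.reset(op.get())'' would delete the object and then keep a pointer to it, so the destructor would delete it a second time.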

Recap
The news this month were:
• The standard class template ``auto_ptr<T>'' handles memory deallocation and ownership transfer.
• Automatic memory deallocation and ownership transfer reduces the risk for memory leaks, especially when exceptions occur.
• Implicit conversions between raw pointers and smart pointers are bad (even if they may seem tempting at first.)
• The ``explicit'' keyword disallows implicit construction of objects.
• The ``explicit'' keyword can be faked.
• Member templates can be used to create member functions at compile time, just like function templates can be used to create functions at compile time.
• ``inline'' hints to a compiler that you think a function is so small that it is better to directly insert the function code where required instead of making a function call.

Exercises
• Why is it a bad idea to have arrays (or other collections) of ``auto_ptr<T>''?
• Can smart pointers be dangerous? When? ``auto_ptr<T>'' too?
• What is a better name for this function template?
  template <class T> void funcname(auto_ptr<T>) { }
• What happens if ~T throws an exception?

Coming up
Next month we'll have a look at a smarter pointer, a reference counted one. I'm beginning to dry up on topics now, however, so please write and give me suggestions for future topics to cover. If I missed something, or you want something clarified further or disagree with me, please drop me a line and I'll address your ideas in future articles.

Introduction
Last month's ``auto_ptr<T>'' class template documents and implements ownership transfer of dynamically allocated memory. Often, however, we do not want to be bothered with ownership. We want several places of the code to be able to access the memory, but we also want to be sure that the memory is deallocated when no longer needed. The general solution to this is called automatic garbage collection (something you can read several theses on, and also buy a few commercially available libraries for.) The less general solution is reference counting; no owner, but last one out locks the door.

The idea is that a counter is attached to every object allocated. When allocated it is set to 0. When the first smart pointer attaches to it, the count is incremented to 1. Every smart pointer attaching to the resource increments the reference count, and for every smart pointer detaching from a resource (the smart pointer destroyed, or assigned another value) the resource's counter is decremented. If the counter reaches zero, no one is referring to it anymore, so the resource must be deallocated. The weakness of this compared to automatic garbage collection is that it does not work with circular data structures (the count never goes below 1.)

The problems to solve
Many of the problems with a reference counting pointer are the same as for the auto pointer. The list is actually a bit shorter, since there's no need to worry about ownership.
• exception safety
• no confusion with normal pointers
• controlled and visible rebinding and access
• works with polymorphic types
• pointer-like syntax for pointer-like behaviour
• automatic deletion when no longer referring to the object

This might also be the place to mention a problem not to solve, that of how to stop reference counting a resource. Adding this functionality is not difficult, but it quickly leads to user code that is extremely hard to maintain. This is exactly what we want to avoid, so it is better not to have the functionality.

Here is how it is supposed to work when we are done:

  counting_ptr<int> P1(new int(value));

After creating a counting pointer ``P1'', the reference count for the value pointed to is set to one.

  counting_ptr<int> P2(P1);

When a second counting pointer ``P2'' is created from ``P1'', the object pointed to is not duplicated, but the reference count is incremented.

  counting_ptr<int> P3(P2);

When three counting pointers refer to the same object, the value of the counter is three.

  P1.manage(new int(other));

As one of the pointers referring to the first object created is reinitialized to another object, the old reference count is decremented (there are only two references to it now) and the new one is assigned a reference count of 1.

  P2=P1;

When yet another of the pointers moves attention from the old object to the new one, the counter for the old one is yet again decremented, and for the new one it is incremented.

  P3=P2;

Now that the last counting pointer referring to the old object moves its attention away from it, the old object's reference count goes to zero, and the object is deallocated. Now instead the new object has a reference count of 3, since there are three reference counting pointers referring to it.
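The same walk-through can be replayed with ``std::shared_ptr'', the reference counted pointer that later entered the standard library. This is a swapped-in illustration and not the ``counting_ptr<T>'' developed in this article: ``reset'' plays the role of ``manage'', ``use_count'' exposes the counter, and the function name ``counting_demo'' is invented here.

```cpp
#include <cassert>
#include <memory>

// Replays the P1/P2/P3 scenario and checks the counter after every step.
inline void counting_demo() {
    std::shared_ptr<int> P1(new int(1));
    assert(P1.use_count() == 1);

    std::shared_ptr<int> P2(P1);
    assert(P1.use_count() == 2);

    std::shared_ptr<int> P3(P2);
    assert(P1.use_count() == 3);

    // A weak_ptr observes the first object without adding to its count.
    std::weak_ptr<int> old_object(P1);

    P1.reset(new int(2));   // corresponds to P1.manage(new int(other))
    assert(P1.use_count() == 1 && P2.use_count() == 2);

    P2 = P1;
    assert(P1.use_count() == 2 && P3.use_count() == 1);

    P3 = P2;
    assert(P1.use_count() == 3);
    assert(old_object.expired());   // the first object has been deallocated
}
```

The ``weak_ptr'' is only there to make the deallocation of the first object observable; it is not part of the scheme described above.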

Interface outline
The interface of the reference counting smart pointer will, for obvious reasons, share much with the auto pointer. The differences lie in accessing the raw pointer and giving the pointer a new value. While these aspects could use the same interface as does the auto pointer, giving the reference counting pointer an identical interface, this would be very unfortunate since their semantics differ dramatically. Here is a suggested interface.

  template <class T>
  class counting_ptr
  {
  public:
    explicit counting_ptr(T* t = 0);
    counting_ptr(const counting_ptr<T>& t) throw();
    template <class Y>
      counting_ptr(const counting_ptr<Y>& ty) throw();
    counting_ptr<T>& operator=(const counting_ptr<T>& t) throw ();
    template <class Y>
      counting_ptr<T>& operator=(const counting_ptr<Y>& ty) throw();
    ~counting_ptr() throw ();
    T& operator*(void) const throw ();
    T* operator->(void) const throw ();
    T* peek(void) const throw();
    void manage(T* t);
  };

Compared to the auto pointer, the only differences are that some member functions do not have an empty exception specification, and there is no member function corresponding to ``auto_ptr<T>::release()'' (which stops the managing of the pointer.) The member function ``reset'' is here named ``manage'', and the reason is a big difference in semantics. The member function ``get'' is here named ``peek''. To me the word ``get'' associates with a transfer, and there is no transfer occurring. All we do is to peek inside and see what the internal raw pointer value is. I think this name better describes what is happening.

Where to store the reference count
Before we can dive into implementation, we must figure out where to store the reference count. It is obvious that the counter belongs to the object referred to, so it cannot reside in the smart pointer object. A solution that easily springs to mind is to use a struct with a counter, and the type referred to, like this:

  template <class T>
  class counting_pointer
  {
  public:
  private:
    struct representation
    {
      unsigned counter;
      T value;
    };
    representation* ptr;
  };

With this construct, we have the object and counter together. All we need to work this way is to make sure to allocate this representation struct on heap in the constructor and ``manage'' member functions (and of course to deallocate the struct when we're done with it.) Unfortunately there are two severe drawbacks. We cannot use it for dynamic binding, since the ``value'' component is indeed a value and dynamic binding only works through pointers and references. It also becomes very difficult to write the constructor and ``manage'' member function.

The work around is simple, use a ``T*'' instead. There is a performance disadvantage with this, however. Whenever accessing the object referred to, we must follow two pointers, the pointer to the representation and the pointer to the object from the representation.

The best solution I have seen is to decouple the representation from the object and instead allocate an ``unsigned'' and in every counting pointer object keep both a pointer to the counter and to the object referred to. This gives the following data section of our counting pointer class template:

  template <class T>
  class counting_pointer
  {
  public:
    ...
  private:
    T* ptr;
    unsigned* pcount;
  };
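To make the chosen data layout concrete, here is a small sketch with one pointer to the object and one to a separately allocated counter. It is an illustration under simplifying assumptions, not the implementation this article goes on to develop: the names ``counted_sketch'', ``count'', ``Shape'', ``Square'' and ``layout_demo'' are invented, assignment is left out, and the constructor ignores the case where allocating the counter fails.

```cpp
#include <cassert>

// Hypothetical sketch of the "T* plus unsigned*" layout.
template <class T>
class counted_sketch {
public:
    explicit counted_sketch(T* t = 0)
        : ptr(t), pcount(t ? new unsigned(1) : 0) {}

    // Copying shares both the object and the counter.
    counted_sketch(const counted_sketch& other)
        : ptr(other.ptr), pcount(other.pcount) {
        if (pcount) ++*pcount;
    }

    // Last one out deletes the object and the counter.
    ~counted_sketch() {
        if (pcount && --*pcount == 0) {
            delete ptr;
            delete pcount;
        }
    }

    T* peek() const { return ptr; }
    unsigned count() const { return pcount ? *pcount : 0; }  // for the demo only

private:
    counted_sketch& operator=(const counted_sketch&);  // omitted in this sketch
    T* ptr;
    unsigned* pcount;
};

struct Shape {
    virtual ~Shape() {}
    virtual int corners() const { return 0; }
};
struct Square : Shape {
    int corners() const { return 4; }
};

inline void layout_demo() {
    counted_sketch<Shape> a(new Square);   // a T* can refer to a derived object
    assert(a.peek()->corners() == 4);      // dynamic binding works
    assert(a.count() == 1);
    {
        counted_sketch<Shape> b(a);
        assert(a.count() == 2 && b.peek() == a.peek());
    }
    assert(a.count() == 1);                // b detached when it was destroyed
}
```

Because the object is reached directly through ``ptr'', access costs one pointer dereference, and because ``ptr'' is a ``T*'' it can refer to an object of a publicly derived class.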

Accessibility
The solution outlined above is so good it almost works. In fact, for compilers that do not support member templates, it is all we need. However, if we want the ability to assign a ``counting_ptr<T>'' object from a value of type ``counting_ptr<Y>'' if a ``T*'' can be assigned from a ``Y*'', we must think of something. The problem is that ``counting_ptr<T>'' and ``counting_ptr<Y>'' are two distinct types, so both the member variables are private and thus inaccessible, such that the assignment and construction from a counting pointer of another type are impossible. The value of the raw pointer member can be accessed through the public ``peek'' member function, but we need a solution for accessing the counter without making it publicly available.

This kind of problem is exactly what ``friend'' declarations are for, but there is a two-fold problem with that. To begin with, extremely few compilers support template friends, a rather new addition to the language. Second, member templates open up holes in the type system you can only dream of. For the curious, please read Scott Meyers' paper on the topic.

One step on the way towards a solution is to see that the management of the counter is independent of the type T, so we can implement the counter managing code in a separate class.

Base implementation
It is fairly easy to implement, and makes life easier later on, even when member templates are not available. A reference counting class may look like:

  class counting_base
  {
  public:
    counting_base(unsigned count = 0);
    counting_base(const counting_base& cb) throw();
    ~counting_base(void) throw();
    counting_base& operator=(const counting_base& cb) throw ();
    int release() throw();
    int reinit(unsigned count=1);
  private:
    unsigned* pcount;
  };

The idea here is that this class handles every aspect of the reference counter. We just need to tell it how to behave, and it reports the results to us. The default constructor allows us to choose whether we want a reference counter or not. If our reference counting pointer is initialized with the 0 pointer, there is no need to waste time and memory by allocating a reference counter for it. The member functions ``release'' and ``reinit'' return 1 if the old counter is discarded (and hence, we should delete the object we refer to) or 0 if it was just decremented.

  inline counting_base::counting_base(unsigned count)
    : pcount(count ? new unsigned(count) : 0)
  {
    if (count && !pcount) throw bad_alloc();
  }

Initialize a counter with the value of the parameter. A count of 0 is represented by a 0 pointer. If you have a modern compiler, operator new will throw ``bad_alloc'' in out-of-memory conditions, and the function body throwing the exception will not be necessary. For older compilers, however, you need to define the ``bad_alloc'' class.

  inline counting_base::counting_base(const counting_base& cb) throw ()
    : pcount(cb.pcount)
  {
    if (pcount) ++(*pcount);
  }

Copying a reference counting object means adding one to the reference counter (since there is now one more object referring to the counter.) When we have 0 pointers, however, there is nothing to update. Note that the copy constructor never involves any allocation or deallocation of dynamic memory. There is nothing in the copy constructor that can throw exceptions.

    inline counting_base::~counting_base() throw()
    {
      release();
    }

Destroying a reference counting object means decrementing the reference counter and deallocating it if the count goes to zero (last one out locks the door.) Since this code is needed in the assignment operator, as well as in the public interface for use by the reference counted smart pointer class template, we implement that work in the ``release'' member function.

    inline counting_base& counting_base::operator=(
      const counting_base& cb
    ) throw()
    {
      if (pcount != cb.pcount) {
        release();
        pcount=cb.pcount;
        if (pcount) ++(*pcount);
      }
      return *this;
    }

Assignment of reference counting objects means decrementing the reference count for the left hand side object, and incrementing it for the right hand side object, since there will be one less object referring to the counter from the left hand side object, and one more referring to the counter from the right hand side object.

    inline int counting_base::release() throw ()
    {
      if (pcount && --(*pcount) == 0) {
        delete pcount;
        pcount = 0;
        return 1;
      }
      pcount=0;
      return 0;
    }

If the reference count goes to zero the counter is deallocated, since it is the last one referring to it. A return value of 1 means deallocation took place (hinting to the user of this class that it should deallocate whatever object it refers to.) In both cases the pointer to the counter is set to 0 as a precaution. It may be that ``release'' is called just prior to destruction, and the destructor calls ``release'' to deallocate memory. If the pointer to the counter is not set to zero it means either referring to just deleted memory, or decrementing the reference count twice.

    inline int counting_base::reinit(unsigned count)
    {
      if (pcount && --(*pcount) == 0) {
        *pcount = count;
        return 1;
      }
      pcount = count ? new unsigned(count) : 0;
      if (count && !pcount) throw bad_alloc();
      return 0;
    }

Strictly speaking, ``reinit'' is not needed. Its purpose is to release from the current counter and initialize a new one with a defined value. It is an optimization of memory handling, in that if this object was the last reference the counter is reinitialized instead of deallocated and then allocated again. (If it was not the last object referring to the counter, a new counter must of course be allocated.)

As an exercise, prove to yourself that this reference counting base class does not have any memory handling errors (i.e. it always deallocates what it allocates, never deallocates the same area twice, never accesses uninitialized or just deallocated memory, and never dereferences the 0 pointer.)

Accessibility again

As nice and convenient the above helper class is, it really does not solve the accessibility problem. It does make the implementation a bit easier, though. The problem remains, however: class ``counting_ptr<T>'' and ``counting_ptr<Y>'' are different classes and because of this are not allowed to see each other's private sections. The easy way out is to use public inheritance.
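As a cross-check on the helper class described above, here is a minimal sketch of it for a current compiler. It is a rewrite under stated assumptions, not the article's code: ``noexcept'' stands in for the ``throw()'' specifications, ``nullptr'' for the 0 pointer, ``reinit'' is left out for brevity, and ``use_count'' and ``counting_base_demo'' are hypothetical additions that exist only so the behaviour can be observed.

```cpp
#include <cassert>

// Sketch of the counter helper (assumption: C++11 or later).
class counting_base {
public:
    // A count of 0 means "no counter at all", represented by a null pointer.
    explicit counting_base(unsigned count = 0)
        : pcount(count ? new unsigned(count) : nullptr) {}

    // Copying shares the counter and adds one reference to it.
    counting_base(const counting_base& cb) noexcept : pcount(cb.pcount) {
        if (pcount) ++*pcount;
    }

    // Assignment leaves the old counter and joins the new one.
    counting_base& operator=(const counting_base& cb) noexcept {
        if (pcount != cb.pcount) {
            release();
            pcount = cb.pcount;
            if (pcount) ++*pcount;
        }
        return *this;
    }

    ~counting_base() noexcept { release(); }

    // Returns 1 if this was the last reference (the counter is freed),
    // 0 otherwise. The pointer is reset in both cases.
    int release() noexcept {
        int last = 0;
        if (pcount && --*pcount == 0) {
            delete pcount;
            last = 1;
        }
        pcount = nullptr;
        return last;
    }

    // Hypothetical accessor, not part of the article's interface.
    unsigned use_count() const noexcept { return pcount ? *pcount : 0; }

private:
    unsigned* pcount;
};

// Hypothetical self-check walking through copy, assignment and release.
inline bool counting_base_demo() {
    counting_base a(1);
    assert(a.use_count() == 1);
    {
        counting_base b(a);          // second reference
        assert(a.use_count() == 2);
        counting_base c;             // no counter yet
        c = b;                       // third reference
        assert(a.use_count() == 3);
        assert(b.release() == 0);    // b leaves, two references remain
        assert(a.use_count() == 2);
    }                                // c is destroyed, b is already detached
    assert(a.use_count() == 1);
    return a.release() == 1 && a.use_count() == 0;
}
```

Calling ``counting_base_demo()'' exercises each of the memory handling properties listed in the exercise above once; it is a spot check, not a proof.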

Public inheritance would say that every counting pointer is-a counting base. That is a solution that is simple, sweet and dead wrong. Almost. There is no is-a relationship here, but an is-implemented-in-terms-of relationship. Such relationships are implemented through private member data or private inheritance. Instead of having the member functions of the ``counting_base'' class public, however, they can be declared protected. As such they do not become part of the public interface, and the is-a relationship is mostly imaginary. This is bordering on abuse, but it works fine.

Implementation of a reference counting pointer

Finally we can get to work and write the reference counting pointer, and since the helper class does most of the dirty work, the implementation is not too convoluted.

    template <class T>
    inline counting_ptr<T>::counting_ptr(T* t)
      : counting_base(t ? 1 : 0),
        pt(t)
    {
    }

Initialize the counter to 1 if we have a pointer value, and 0 otherwise.

    template <class T>
    inline counting_ptr<T>::counting_ptr(
      const counting_ptr<T>& t
    ) throw()
      : counting_base(t),
        pt(t.pt)
    {
    }

    template <class T> template <class Y>
    inline counting_ptr<T>::counting_ptr(
      const counting_ptr<Y>& ty
    ) throw()
      : counting_base(ty),
        pt(ty.peek())
    {
    }

Note how the latter makes use of the knowledge that a ``counting_ptr<Y>'' is-a ``counting_base''.

    template <class T>
    inline counting_ptr<T>::~counting_ptr() throw()
    {
      if (release()) delete pt;
    }

The destructor is important to understand. If you review the functionality of the ``release'' member function of ``counting_base'', you see that it deallocates the counter if, and only if, the count reaches 0 and then returns 1, otherwise it returns 0. This means that in the destructor we will deallocate whatever ``pt'' points to if, and only if the reference count for it reaches zero and is deallocated.

    template <class T>
    inline counting_ptr<T>& counting_ptr<T>::operator=(
      const counting_ptr<T>& t
    ) throw()
    {
      if (pt != t.pt) {
        if (release()) delete pt;
        pt=t.pt;
        counting_base::operator=(t);
      }
      return *this;
    }

    template <class T> template <class Y>
    inline counting_ptr<T>& counting_ptr<T>::operator=(
      const counting_ptr<Y>& ty
    ) throw()
    {

      if (pt != ty.peek()) {
        if (release()) delete pt;
        pt=ty.peek();
        counting_base::operator=(ty);
      }
      return *this;
    }

Please spend some time convincing yourself that the above works. What happens is that if the left hand side and right hand side counting pointers already refer to the same object, nothing happens. If they refer to different objects, on the other hand, the object referred to by the left hand side reference counting pointer is deleted if, and only if, ``release'' returns 1, which it does if the reference count goes to zero and the counter is deallocated. After this the raw pointer value can safely be copied. Since ``release'' also resets the counter pointer to zero, the assignment operator for ``counting_base'' does not decrement the counter again. Instead it is simply assigned.

The access operators are all trivial and need no further explanation:

    T* operator->() const
    T& operator*() const
    T* peek() const

Now only the ``manage'' member function remains:

    template <class T>
    inline void counting_ptr<T>::manage(T* t)
    {
      if (!t && release() || t && t != pt && reinit())
        delete pt;
      pt = t;
    }

This one is only slightly tricky. If the parameter is the zero pointer, we want the reference count to also be the zero pointer, and thus ``release'' is called. Otherwise we want the counter value to be 1, so ``reinit'' is called instead. If either of those tells that the old counter is discarded, the object referred to is deallocated.

Efficiency

There is no question about it, there is a cost involved in using reference counting pointers. Every time an object is allocated, a counter is also allocated, and vice-versa for deallocation. Depending on how efficient your compiler's memory manager is for small objects this cost may be negligible, or it may be severe. Every time a counting pointer is assigned, constructed, destroyed and copied, counter manipulation is done, and that costs a few CPU cycles.

Exercises

• What happens if ``~T'' throws an exception?
• What happens if allocation of a new counter fails?
• The ``counting_base'' implementation and use is suboptimal. For example at destruction, the ``release'' member function is called twice, and the first time the counter pointer is set to zero to prevent decrementing the counter twice. Improve the implementation to never (implicitly) set the pointer to zero and yet always be safe.
• Does the public derivation from ``counting_base'' open up any holes in the type safety, despite that all member functions are protected?

Coming up

If I missed something, or you want something clarified further or disagree with me, please drop me a line and I'll address your ideas in future articles. Next month is devoted to Run Time Type Identification (RTTI,) one of the new functionalities in C++.

Introduction

Lucky number thirteen will conclude the long C++ series. Last topic is Run Time Type Identification, or RTTI for short. It's a new addition to C++. I do not have access to any compiler under OS/2 that supports RTTI, so the example programs are written for, and tested with, the egcs compiler (a fast-moving bleeding edge gcc,) under Linux.

RTTI is a way of finding out about the type of objects at run-time. There are two distinct ways this can be done. One is finding a type identification for an object, giving a unique identifier for each unique type. Another is a clever cast, which allows casting of pointers and references only if the type matches. Towards the end of the article there is also a discussion about various aspects of efficiency in C++, especially in comparison with C.

Type Safe (down)Casting

Many class libraries, most notably GUI libraries like IBM's Open Class Library, suffer from the problem that callback functions will be called with pointers to objects of a base class, but you know the objects really are of some class inheriting from it. For example, consider a button push event, from which you can get a pointer to the button itself. Suppose a hierarchy like this:

    class Event {};
    class Window {
    public:
      virtual ~Window() {} // a virtual member, needed for dynamic_cast below
    };
    class Control : public Window {};
    class Button : public Control {};
    class PushButton : public Button
    {
    public:
      class PushedEvent : public Event
      {
      public:
        PushButton* button(void) const;
      };
      // ``register'' is a reserved word in C++, hence the longer name.
      void registerCallback(void (*pushed)(const PushedEvent&));
    };
    class TextPushButton : public PushButton
    {
    public:
      void setText(const char* txt);
    };

The idea is that whoever uses a push button can register a function that will be called whenever a pushbutton is pushed. Let us put it to use:

    void pushed(const PushButton::PushedEvent& ev);

    void createButton(const char* txt)
    {
      TextPushButton* pb=new TextPushButton();
      pb->setText(txt);
      pb->registerCallback(pushed);
    }

    void pushed(const PushButton::PushedEvent& ev)
    {
      TextPushButton* pb=(TextPushButton*)(ev.button());
      //                 ^^^^^^^^^^^^^^
      pb->setText("***");
    }

This does not look too dangerous, does it? The callback is registered when the button is created, so we know it is the right kind, and the down cast marked with ``^^^'' is safe. Or is it? Right now it is, but as the program grows, and somewhere the wrong callback is registered for some button, it might not be as easy to see what happens. The result is likely an uncontrolled crash, or just generally weird behaviour.

The solution is to do the cast only if it is legal. Here is a new version of the ``pushed'' function using the RTTI ``dynamic_cast''.

    void pushed(const PushButton::PushedEvent& ev)
    {
      TextPushButton* pb=
        dynamic_cast<TextPushButton*>(ev.button());
      assert(pb);
      pb->setText("***");
    }

``dynamic_cast<T>(p)'' works like this: If ``p'' is-a ``T'' the result is a ``T'' referring to the object, otherwise a zero pointer is returned (or if ``T'' is a reference, the exception ``bad_cast'' is thrown.) That way we can check and take some action if the function is called with a pointer or reference to the wrong type.

There is one catch with ``dynamic_cast'', however. It works only if there is at least one virtual member function in the type casted from, otherwise you get compilation errors. If you think about it, that is not much of a penalty. At least the destructor should be virtual in such hierarchies anyway.
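The two failure modes described above, a zero pointer for the pointer form and ``bad_cast'' for the reference form, can be seen side by side in a small sketch. The class names ``Button'', ``TextButton'' and ``IconButton'' and the functions ``relabel'', ``relabel_ref'' and ``cast_demo'' are hypothetical stand-ins, not the hierarchy from the article, and the code assumes a C++11 compiler.

```cpp
#include <cassert>
#include <typeinfo>   // std::bad_cast

// Hypothetical hierarchy; the virtual destructor makes it polymorphic,
// which dynamic_cast requires of the type being cast from.
struct Button {
    virtual ~Button() {}
};
struct TextButton : Button {
    const char* text = "";
    void setText(const char* t) { text = t; }
};
struct IconButton : Button {};

// Pointer form: a mismatch yields a null pointer, which we test for.
inline bool relabel(Button* b) {
    if (TextButton* tb = dynamic_cast<TextButton*>(b)) {
        tb->setText("***");
        return true;
    }
    return false;
}

// Reference form: there is no null reference, so a mismatch throws.
inline bool relabel_ref(Button& b) {
    try {
        dynamic_cast<TextButton&>(b).setText("***");
        return true;
    } catch (const std::bad_cast&) {
        return false;
    }
}

// Hypothetical self-check covering the matching and mismatching case.
inline bool cast_demo() {
    TextButton text;
    IconButton icon;
    assert(relabel(&text));
    assert(!relabel(&icon));
    return relabel_ref(text) && !relabel_ref(icon);
}
```

Testing the pointer inside the ``if'' keeps the converted pointer in scope only where it is known to be valid, which is a common way to avoid using it by accident after a failed cast.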

It must be stressed that this use of ``dynamic_cast'' can always be avoided (I cannot prove it, but I have never seen a counter proof either, and many have tried.) The need arises from a poor design (the solution is to use dynamic binding instead, by turning the problem ``inside-out''.) Sometimes we have to live with poor designs, however, and then this can make our life a lot easier. It can also be used during a transition between different libraries. Here is an example mirroring a problem from my previous job:

    void print(ostream& os)
    {
      if (ostrstream* pos=dynamic_cast<ostrstream*>(&os)) {
        *pos << ends;
        cout << pos->str();
        pos->freeze(0);
      } else if(ostringstream*pos=
                dynamic_cast<ostringstream*>(&os)) {
        cout << pos->str();
      } else
        throw bad_cast();
    }

This transition is from the ``old'' ``ostrstreams'', storing strings as plain ``char*'', to the standard ``ostringstream'', which is based on standard strings. The difference lies in handling the end. This requires a little bit of care. An ``ostrstream'' holds a raw buffer of memory, and its end must be marked by appending a zero termination with the ``ends'' modifier, while standard ``ostringstream'' returns a string object which itself knows how long it is and a zero termination must not be added to it (otherwise an extra zero character will be printed.) With RTTI we can live with both worlds at the same time. In which way is this worse from the previous version that did not have the problem, where everything was always ``ostrstream''? Well, in no way, and we have added an error check that should have been there earlier but was not because the language did not support it (what if the function is called with an ``fstream'' object?)

However, never design new code like this! Use this construct only during a transition phase. For new designs, use dynamic binding instead. Use RTTI ``dynamic_cast'' when being forced to adapt to poorly written libraries, and during transitions between libraries, as an error check.

Identifying types

Much more interesting is using explicit information about the type of an object. This solves a problem that cannot reliably be worked around with clever designs. There is a standard class called ``type_info'', which purpose is to carry information about types. It is defined in the header named <typeinfo> (note, no ``.h'') and looks as follows:

    class type_info
    {
    public:
      virtual ~type_info();
      bool operator==(const type_info&) const;
      bool operator!=(const type_info&) const;
      bool before(const type_info&) const;
      const char* name() const;
    private:
      type_info(const type_info&);
      type_info& operator=(const type_info&);
    };

You get a ``type_info'' object for a value of any type, even the built-in ones, through the built-in operator ``typeid(x)''. Note that it is the runtime type of ``x'' that is accessed, not the static type.

The ``name'' member function gives you access to a printable form of the name of the type identified by the ``type_info'' object. However, the standard requires very little of the ``name'' member function. Most notably it is not standardised what the printable form looks like for any given type. In fact, it is not even required that the string is unique for each type. This unfortunately means that you cannot write portable (across compilers) applications that rely on the names in any way.

The ``before'' member function gives you a sort order for types. Do not try to get a meaning from the sort order, there may not be one. It does not need to have any meaning, other than that it makes it possible to keep sorted collections of type information objects.

Say you have this situation:

    class parent {
    public:
      virtual ~parent() {} // makes the class polymorphic
    };

    class child : public parent {};

    parent* p=new child();
    cout << typeid(p).name() << endl;
    cout << typeid(*p).name() << endl;

If we have a compiler whose name of a type as given from ``type_info'' is indeed the name of the type as we write it in the program, what do you think the above snippet prints? The answer is ``parent*'' followed by ``child''. This came as a surprise to me, but it makes sense. The run-time type of ``p'' is ``parent*''. It is what it points to that may differ, and that is exactly what is mirrored by the output. (For ``typeid(*p)'' to report ``child'', ``parent'' must have at least one virtual member function; without one, the static type ``parent'' is reported.)

Is this useful? Suppose you need to store and retrieve objects from a typeless media, such as a file, or a socket. Storing them is easy, but when reading, you need to know what type of object to create. If you are designing a complete system from scratch, you can add some type identifier to all classes, but this is error prone, since it is easy to forget changing it when creating a new class inheriting from another one, and it does not work at all with existing classes. Chances are you will be extending existing code, or decide to use third party class libraries.

Here is an outline for how to do this (in a non-portable, and slightly restricted way. Generality and portability requires more work.) First we design a class for I/O objects, let us call it ``persistent''. It may look like this:

    class persistent
    {
      friend class persistent_store; // the store calls store/retrieve
    protected:
      virtual void store(ostream& os) const = 0;
      virtual void retrieve(istream& is) = 0;
    };

The intention is that all classes we want to store must inherit from ``persistent'' and implement the ``store'' and ``retrieve'' member functions. This is limiting, but not as limiting as it may seem. We can still make use of third party libraries, we just need to create persistent versions of the classes.

Next we need something that does the work of calling the member functions and creating the objects when read. Let us call this class ``persistent_store''. It may be defined as follows:

    class persistent_store
    {
    public:
      persistent_store(iostream& stream);
      template <class T> void register_class();
      void store_object(persistent*);
      persistent* retrieve_object();
    };

The interface should be reasonably obvious. Only classes registered with the store may be used. Obviously only classes inherited from ``persistent'' may be stored and retrieved. There will, however, be other requirements put on the type ``T'', but what those will be will depend on our implementation. I have chosen the only additional constraints to be that they have a default constructor and may be created on the heap with ``operator new''.

The use of template functions that do not have any parameters of the template type is a recent addition to C++ that few compilers support. The syntax is a bit ugly, but it is not too bad. Here is how to use the persistent store:

    class persistent_X : public X, public persistent
    {
    protected:
      virtual void store(ostream& os) const {
        os << *this;
      };
      virtual void retrieve(istream& is) {
        is >> *this;
      };
    };

    int main(int argc, char* argv[])
    {
      fstream file(argv[1]);
      persistent_store storage(file);
      storage.template register_class<persistent_X>(); //*
      persistent* pp=storage.retrieve_object();
      persistent_X* px=dynamic_cast<persistent_X*>(pp);

      storage.store_object(px);
    }

The line marked ``//*'' shows the syntax for calling template member functions without parameters of the template type. Note the ``template'' keyword. It is ugly, but I guess one gets used to it.

Now there are a number of problems that must be solved:

• How to prevent classes not inheriting from ``persistent'' from being registered.
• How to allocate an object of the correct type on heap, given a string representation of its type.
• How to store the type information on the stream such that it can be interpreted unambiguously.

The first is extremely easy to check for. Here is one way of doing it:

    template <class T>
    void register_class()
    {
      persistent* p = (T*)0; // type check.
      ...
    };

Any attempt to call ``register_class'' with a type not publicly inheriting from ``persistent'' will result in a compilation error. That is the easy part.

The second problem, deciding the type from a name is easy to understand conceptually, but requires a bit of trickery to implement. The solution lies in using a map from strings to a creation function. A map is a data structure acting like an array, but allowing any kind of indexing type. In this case the indexing type will be a string. Here the map type from the C++ standard library is used. Every string in the map corresponds to a function accepting no parameters and returning a ``persistent*''. In other words, the implementation of ``register_class'' may look like this:

    template <class T>
    void register_class()
    {
      persistent* p=(T*)0; // type check.
      persistent* (*creator_func)(void) = ???;
      creator_map.insert(make_pair(typeid(T).name(),
                                   creator_func));
    };

The tricky part is to find out what ``???'' is. Perhaps it comes as no surprise that the solution is yet another template, this time a function template.

    template <class T>
    persistent* persistent_object_creator(void)
    {
      return new T;
    }

This function template implicitly carries out the type test for us (since ``T*'' can only be returned as ``persistent*'' if ``T'' publicly inherits from ``persistent''.) So ``???'' can be replaced with ``persistent_object_creator<T>'' and the check line can be removed. Not too bad.

Now for how to store and retrieve the type name. My suggestion is simple: store the length of the name followed by the name itself. That way, when reading, the length can be read, a character buffer allocated for the correct size and exactly that many characters read. When storing the string is checked for in the map to make sure no unregistered types are stored. When reading, the creator function is looked up and called. Here is how it is all implemented:

    void store_object(persistent* p)
    {
      const char* name=typeid(*p).name();
      maptype::iterator iter=creator_map.find(name);
      if (iter == creator_map.end())
        throw "unregistered type";
      medium << strlen(name) << ' ' << name;
      p->store(medium);
    };

    persistent* retrieve_object(void)
    {
      size_t len;
      medium >> len;
      medium.get(); // read past blank.
      char* name=new char[len+1];
      medium.read(name,len);
      name[len]=0; // terminate the string before looking it up.

      maptype::iterator iter=creator_map.find(name);
      delete[] name; // no longer needed
      if (iter == creator_map.end())
        throw "unregistered type";
      persistent* p=(iter->second)();
      p->retrieve(medium);
      return p;
    };

As can be guessed, the line ``if (iter == creator_map.end())'' checks if the lookup was successful. If true the string was not represented in the map, and thus the type was not registered.

That is a mini persistency library for you. Enjoy.

C++ and Efficiency

I have once in a while heard things like ``C++ programs are slow and large compared to a C program with the same functionality.'' In a way that statement is often true. Does it need to be that way? The answer is no, strictly speaking, it does not. If you have a lean C design, the same design can be used with C++, but use new and delete instead of malloc/free, use inline functions instead of macros and add constructors to your structs. If you do, you are still better off than in plain C, because of the added type safety, but the program will not be slower and larger.

As I see it, there are mainly two reasons for C++ bloat. One is a cultural problem, one is technical.

Culture related inefficiency

Let us first look at the cultural problem, because that is the one that is hardest to solve, and contributes most to the bloat. Throughout this course, however, I have been touting generality and extensibility over and over and over. Generality and extensibility have a price. Are you prepared to pay the price? It depends, right? If I can make use of the generality and extensibility at some other time, development-time efficiency has been gained, possibly at the cost of program size and run-time efficiency. However, not all problems need general and extensible solutions.

Another cultural problem leading to inefficient C++ programs is the hunt for efficiency! Yes, it is true. Here are some examples:

Virtual functions

I have heard people say things like ``As a performance optimization, I've removed all virtual functions and replaced the virtual function calls with switch statements and ordinary member function calls.'' That is a major performance killer. If you need the functionality that a virtual function call gives you, there is just no way to do it faster than through a virtual function call. The switch/call construct is guaranteed to require at least as many instructions (probably more) and is an instruction pipeline killer for the CPU, something that the virtual function call interestingly is not.

Inline functions

While well chosen inline functions may speed up your program, ill chosen ones will lead to bloated executables, and are likely to reduce the number of cache hits since active code areas will increase in size. Remember that an inline function is actually not a function. If inlined, the instructions for the function will be inserted at the place of the call. If the function is large (more than a very few instructions) the size of your program will grow considerably if the function is called often.

More interesting is what happens if the compiler for one reason or another fails to inline it. It will then have to make an out-of-line function for it, which will be called like normal functions. There is one difference however (getting into the area of technical problems here.) Where will the out-of-line function reside? With one exception (I will return to this one shortly,) all compilers I know of will add one copy, as a static function, for every translation unit (read .cpp file) where the function inlining failed. In a large program, this might mean *many* copies, and since every copy is in its own memory area, you will not get the performance boost of good locality either.

Fear of template code bloat

Since it is ``a well known fact'' that templates lead to code bloat, many programmers avoid them. This means rolling their own algorithms and collections instead of using standard ones. Unless the programmer is a top-notch algorithm and data-structure guru, the result is just about guaranteed to be slower, and quite possibly larger, than those in the standard library. After all, the contents of the standard library are state of the art. Here is one very good argument for why. Have a look at the SGI STL, and read the documentation and code comments, and you will notice that many of the data structures and algorithms used were not even known a year ago. How many programmers have a working knowledge of the latest and greatest of algorithms and data structures?
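To make the point about standard collections and algorithms concrete, here is a small sketch that counts words with ``std::map'' and orders the result with ``std::sort'', instead of a hand-rolled table and sort routine. The names ``word_count'', ``more_frequent'', ``word_frequencies'' and ``word_frequencies_demo'' are hypothetical, and the example makes no claim about measured performance; it only shows how little code the standard route takes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, int> word_count;

// Orders by descending count; ties are broken alphabetically.
inline bool more_frequent(const word_count& a, const word_count& b) {
    if (a.second != b.second) return a.second > b.second;
    return a.first < b.first;
}

// Counts each distinct word, then returns the counts, most frequent first.
inline std::vector<word_count>
word_frequencies(const std::vector<std::string>& words) {
    std::map<std::string, int> counts;   // balanced tree, ordered by key
    for (std::size_t i = 0; i < words.size(); ++i)
        ++counts[words[i]];              // operator[] starts new keys at 0
    std::vector<word_count> result(counts.begin(), counts.end());
    std::sort(result.begin(), result.end(), more_frequent);
    return result;
}

// Hypothetical self-check on a tiny input.
inline bool word_frequencies_demo() {
    const char* raw[] = {"b", "a", "b", "c", "a", "b"};
    std::vector<std::string> words(raw, raw + 6);
    std::vector<word_count> freq = word_frequencies(words);
    assert(freq.size() == 3);
    return freq[0].first == "b" && freq[0].second == 3
        && freq[2].first == "c" && freq[2].second == 1;
}
```

Both the container and the sort are templates, which is exactly the kind of code the fear described above would have a programmer rewrite by hand.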

Technical problems

The largest technical reason for slow C++ programs is compilers with very poor code optimizers. Most C++ compilers use exactly the same optimization techniques as C and Fortran compilers do, despite the fact that the way a C++ program works and accesses memory etc. differs a lot.

I know of one C++ compiler that has a code optimizer made especially for the requirements of C++. When feeding it a program written like a C program, performance will be poorer, however. It comes into its own in one situation, and that is when using good encapsulation and templates. It has been shown to equal and even beat Fortran for performance. If you are curious, look up KAI C++ (Note, I am not associated with them in any way. I have used their compiler, that is all.) Unfortunately it is not available for OS/2, and I doubt it ever will be (bug them about it, and it might happen.)

I mentioned earlier the problem of outlined inline functions. A good compiler will not leave you several copies as static functions, but unfortunately many do.

A worse technical problem is exceptions. Some compiler vendors, and many proponents of exceptions, say that exception handling can be implemented such that there is no performance loss at all, unless an exception is actually thrown. It may seem like it, but it is not true. No matter how good the implementation is, exceptions add a number of program flows that would otherwise not be there, and that makes life harder for the code optimiser.

Recommended reading

I want to finish by recommending some books on C++.

``Effective C++, 2nd ed'', Scott Meyers. Contains 50 tips, with examples and discussions, for how to improve your programs. A must read for any C++ programmer. Meyers' writing style is easy to follow and entertaining too.

``More effective C++'', Scott Meyers. Another 35 tips, this time introducing cleverer optimisation techniques like copy-on-write.

``The Design and Evolution of C++'', Bjarne Stroustrup. This book answers all the why's. Why does C++ have multiple inheritance? Why does it provide assignment, destruction, copy-constructor and default constructor for you? Why is RTTI there? Why no garbage collection? Why not dynamic types like in Smalltalk?

``Ruminations on C++'', Andrew Koenig and Barbara Moo. Koenig is the only author I know who can explain a problem to solve, which one immediately realizes will require a large program, present an easy to understand one page solution just to say it was unnecessarily clumsy and reduce it to half a page without sacrificing extensibility and generality. Now, this is a pleasant book to read. Entertaining and enlightening.

``Advanced C++, Programming Styles and Idioms'', James O. Coplien. Without doubt this book from 1992 is getting old, but it contains many useful techniques. It is a mind opener in many ways. Beware, though, this book is not for beginners.

``Scientific and Engineering C++'', John J. Barton and Lee R. Nackman. Often simply referred to as B&N. This is a modern ``Advanced C++''. What they have done is to break almost every rule of thumb, and as a result they have extremely extensible, slick, type-safe and fast designs.

Also, do follow comp.lang.c++.moderated.
