Published by: John Jairo Cabal on Jan 02, 2011
Copyright: Attribution Non-commercial








  • Int
  • Classes
  • A class
  • Destructor
  • An improved stack
  • Exercise
  • Recap
  • Next
  • References
  • What's a class?
  • The Orthodox Canonical Form
  • Const Correctness
  • Exercises
  • Coming up
  • Why templates
  • Function templates
  • Templates and exceptions
  • Class templates
  • Advanced Templates
  • Introduction
  • Exploring I/O of fundamental types
  • I/O with our own types
  • Formatting
  • An easier way
  • Standards update
  • Other uses
  • Conclusion
  • Short recap of inheritance
  • A deficiency in the model
  • Pure virtual (abstract base classes)
  • Addressing pure virtuals
  • Unselfish protection
  • A toy program
  • Files
  • File Streams
  • Binary streaming
  • Array on file
  • Seeking
  • A stream array, for really huge amounts of data
  • In memory data formatting
  • The data representation problem
  • Several arrays in a file
  • Temporary file array
  • Code reuse
  • What can go wrong?
  • Iterators
  • The problem to solve
  • Implementation
  • Efficiency
  • Type Safe (down)Casting
  • Identifying types
  • C++ and Efficiency
  • Recommended reading



Last month we saw, among other things, how we can give a struct well defined values by using constructors, and how C++ exceptions aid in error handling. This month we'll look at classes and make a more careful study of object lifetime, especially in the light of exceptions. The stack example from last month will be improved a fair bit too.

A class
The class is the C++ construct for encapsulation. Encapsulation means publishing an interface through which you make things happen, and hiding the implementation and data necessary to do the job. A class is used to hide data, and publish operations on the data, at the same time. Let's look at the "Range" example from last month, but this time make it a class. The only operation that we allowed on the range last month was that of construction, and we left the data visible for anyone to use or abuse. What operations do we want to allow for a Range class? I decide that 4 operations are desirable:

  • Construction (same as last month.)
  • find lower bound.
  • find upper bound.
  • ask if a value is within the range.

The second thing to ask when wishing for a function (the first being what it's supposed to do) is in what ways things can go wrong when calling it, and what to do when that happens. For the three query functions, I don't see how anything can go wrong, so it's easy. We promise that they will not throw C++ exceptions by writing an empty exception specifier. I'll explain this class by simply writing the public interface of it:

  struct BoundsError {};
  class Range
  {
  public:
    Range(int upper_bound = 0, int lower_bound = 0)
      throw (BoundsError);
    // Precondition: upper_bound >= lower_bound
    // Postconditions:
    //   lower == lower_bound
    //   upper == upper_bound

    int lowerBound() throw ();
    int upperBound() throw ();
    int includes(int aValue) throw ();
  private:
    // implementation details.
  };

This means that a class named "Range" is declared to have a constructor, behaving exactly like the constructor for the "Range" struct from last month, and three member functions (also often called methods,) called "lowerBound", "upperBound" and "includes". The keyword "public," on the fourth line from the top, tells that the constructor and the three member functions are reachable by anyone using instances of the Range class.
The keyword "private" on the 3rd line from the bottom, says that whatever comes after is a secret to anyone but the "Range" class itself. We'll soon see more of that, but first an example (ignoring error handling) of how to use the "Range" class:

  int main(void)
  {
    Range r(5);
    cout << "r is a range from " << r.lowerBound()
         << " to " << r.upperBound() << endl;
    int i;
    for (;;)
    {

      cout << "Enter a value (0 to stop) :";
      cin >> i;
      if (i == 0)
        break;
      cout << endl << i << " is " << "with"
           << (r.includes(i) ? "in" : "out")
           << " the range" << endl;
    }
    return 0;
  }

A test drive might look like this:

  [d:\cppintro\lesson2]rexample.exe
  r is a range from 0 to 5
  Enter a value (0 to stop) :5
  5 is within the range
  Enter a value (0 to stop) :7
  7 is without the range
  Enter a value (0 to stop) :3
  3 is within the range
  Enter a value (0 to stop) :2
  2 is within the range
  Enter a value (0 to stop) :1
  1 is within the range
  Enter a value (0 to stop) :0

Does this seem understandable? The member functions "lowerBound", "upperBound" and "includes" are, and behave just like, functions that in some way are tied to instances of the class Range. You refer to them just like you do member variables in a struct, but since they're functions, you call them (by using what C++ lingo names the function call operator, "()".) Now to look at the magic making this happen by filling in the private part, and writing the implementation:

  struct BoundsError {};
  class Range
  {
  public:
    Range(int upper_bound = 0, int lower_bound = 0)
      throw (BoundsError);
    // Precondition: upper_bound >= lower_bound
    // Postconditions:
    //   lower == lower_bound
    //   upper == upper_bound

    int lowerBound() throw ();
    int upperBound() throw ();
    int includes(int aValue) throw ();
  private:
    int lower;
    int upper;
  };

  Range::Range(int upper_bound, int lower_bound)
    throw (BoundsError)
    : lower(lower_bound), /***/
      upper(upper_bound)  /***/
  {
    // Preconditions.
    if (upper_bound < lower_bound) throw BoundsError();

    // Postconditions.
    if (lower != lower_bound) throw BoundsError();
    if (upper != upper_bound) throw BoundsError();
  }

  int Range::lowerBound() throw ()
  {
    return lower; /***/
  }

  int Range::upperBound() throw ()
  {
    return upper; /***/
  }

  int Range::includes(int aValue) throw ()
  {
    return aValue >= lower && aValue <= upper; /***/
  }

First, you see that the constructor is identical to that of the struct from last month. This is no coincidence. It does the same thing and constructors are constructors. You also see that "lowerBound", "upperBound" and "includes", look just like normal functions, except for the "Range::" thing. It's the "Range::" that ties the function to the class called Range, just like it is for the constructor.

The lines marked /***/ are a bit special. They make use of the member variables "lower" and "upper." How does this work? To begin with, the member functions are tied to instances of the class. You cannot call any of these member functions without having an instance to call them on, and the member functions use the member variables of that instance. Say for example we use two Range instances, like this:

  Range r1(5,2);
  Range r2(20,10);

Then r1.lowerBound() is 2, r1.upperBound() is 5, r2.lowerBound() is 10 and r2.upperBound() is 20.

So how come the member functions are allowed to use the member data, when it's declared private? Private, in C++, means secret for anyone except whatever belongs to the class itself. In this case, it means it's secret to anyone using the class, but the member functions belong to the class, so they can use it.

So, where is the advantage of doing this, compared to the struct from last month? Hiding data is always a good thing. For example, if we, for whatever reason, find out that it's cleverer to represent ranges as the lower bound, plus the number of valid values between the lower bound and upper bound, we can do this, without anyone knowing or suffering from it.
All we do is to change the private section of the class to:

  private:
    int lower;
    int nrOfValues;

And the implementation of the constructor to:

  Range::Range(int upper_bound, int lower_bound)
    throw (BoundsError)
    : lower(lower_bound),                  /***/
      nrOfValues(upper_bound-lower_bound)  /***/
  ...

And finally the implementations of "upperBound" and "includes" to:

  int Range::upperBound() throw ()
  {
    return lower+nrOfValues;
  }

  int Range::includes(int aValue) throw ()
  {
    return aValue >= lower && aValue <= lower+nrOfValues;

  }

We also have another, and usually more important, benefit; a promise of integrity. Already with the struct, there was a promise that the member variable "upper" would have a value greater than or equal to that of the member variable "lower". How much was that promise worth with the struct? This much:

  Range r(5, 2);
  r.lower = 25; // Oops! Now r.lower > r.upper!!!

Try this with the class. It won't work. The only ones allowed to make changes to the member variables are functions belonging to the class, and those we can control.

Destructor
Just as you can control construction of an object by writing constructors, you can control destruction by writing a destructor. A destructor is executed when an instance of an object dies, either by going out of scope, or when removed from the heap with the delete operator. A destructor has the same name as the class, but prepended with the ~ character, and it never accepts any parameters. We can use this to write a simple trace class, that helps us find out the life time of objects.

  #include <iostream.h>

  class Tracer
  {
  public:
    Tracer(const char* tracestring = "too lazy, eh?");
    ~Tracer(); // destructor
  private:
    const char* string;
  };

  Tracer::Tracer(const char* tracestring)
    : string(tracestring)
  {
    cout << "+ " << string << endl;
  }

  Tracer::~Tracer()
  {
    cout << "- " << string << endl;
  }

What this simple class does is to write its own parameter string, prepended with a "+" character, when constructed, and the same string, prepended by a "-" character, when destroyed. Let's toy with it!

  int main(void)
  {
    Tracer t1("t1");
    Tracer t2("t2");
    Tracer t3;
    for (unsigned u = 0; u < 3; ++u)
    {
      Tracer inLoop("inLoop");
    }
    Tracer* tp = 0;
    {
      Tracer t1("Local t1");
      Tracer* t2 = new Tracer("leaky");
      tp = new Tracer("on heap");
    }
    delete tp;
    return 0;
  }

When run, I get this behaviour (and so should you, unless you have a buggy compiler):

  [d:\cppintro\lesson2]tracer.exe
  + t1
  + t2
  + too lazy, eh?
  + inLoop
  - inLoop
  + inLoop
  - inLoop
  + inLoop
  - inLoop
  + Local t1
  + leaky
  + on heap
  - Local t1
  - on heap
  - too lazy, eh?
  - t2
  - t1

What conclusions can be drawn from this? With one exception, the object on heap, objects are destroyed in the reversed order of creation (have a careful look, it's true, and it's always true.) We also see that the object instantiated with the string "leaky" is never destroyed.

What happens with classes containing classes then? Must be tried, right?

  class SuperTracer
  {
  public:
    SuperTracer(const char* tracestring);
    ~SuperTracer();
  private:
    Tracer t;
  };

  SuperTracer::SuperTracer(const char* tracestring)
    : t(tracestring)
  {
    cout << "SuperTracer(" << tracestring << ")" << endl;
  }

  SuperTracer::~SuperTracer()
  {
    cout << "~SuperTracer" << endl;
  }

  int main(void)
  {
    SuperTracer t1("t1");
    SuperTracer t2("t2");
    return 0;
  }

What's your guess?

  [d:\cppintro\lesson2]stracer.exe
  + t1
  SuperTracer(t1)
  + t2
  SuperTracer(t2)
  ~SuperTracer
  - t2
  ~SuperTracer
  - t1

This means that the contained object ("Tracer") within "SuperTracer" is constructed before the "SuperTracer" object itself is. This is perhaps not very surprising, looking at how the constructor is written, with a call to the "Tracer" class constructor in the initialiser list. Perhaps a bit surprising is the fact that the "SuperTracer" object's destructor is called before that of the contained "Tracer", but there is a good reason for this. Superficially, the reason might appear to be that of symmetry, destruction always in the reversed order of construction, but it's a bit deeper than that. It's not unlikely that the member data is useful in some way to the destructor, and what if the member data is destroyed when the destructor starts running? At best a destructor would then be totally worthless, but more likely, we'd have serious problems properly destroying our no longer needed objects.

So, the curious wonders, what about C++ exceptions? Now here we get into an interesting subject indeed! Let's look at two alternatives, one where the constructor of "SuperTracer" throws, and one where the destructor throws. We'll control this by an extra parameter, zero for throwing in the constructor, and non-zero for throwing in the destructor. Here's the new "SuperTracer" along with an interesting "main" function.

  class SuperTracer
  {
  public:
    SuperTracer(int i, const char* tracestring)
      throw (const char*);
    ~SuperTracer() throw (const char*);
  private:
    Tracer t;
    int destructorThrow;
  };

  SuperTracer::SuperTracer(int i, const char* tracestring)
    throw (const char*)
    : t(tracestring),
      destructorThrow(i)
  {
    cout << "SuperTracer(" << tracestring << ")" << endl;
    if (!destructorThrow)
      throw (const char*)"SuperTracer::SuperTracer";
  }

  SuperTracer::~SuperTracer() throw (const char*)
  {
    cout << "~SuperTracer" << endl;
    if (destructorThrow)
      throw (const char*)"SuperTracer::~SuperTracer";
  }

  int main(void)
  {
    try {
      SuperTracer t1(0, "throw in constructor");
    } catch (const char* p) {
      cout << "Caught " << p << endl;
    }
    try {
      SuperTracer t1(1, "throw in destructor");
    } catch (const char* p) {
      cout << "Caught " << p << endl;
    }
    try {
      cout << "Let the fun begin" << endl;
      SuperTracer t1(1, "throw in destructor");
      SuperTracer t2(0, "throw in constructor");
    } catch (const char* p) {

      cout << "Caught " << p << endl;
    }
    return 0;
  }

Here we can study different bugs in different compilers. Both GCC and VisualAge C++ have theirs. What bugs does your compiler have? Here's the result when running with GCC. Comments about the bug found are below the result:

  [d:\cppintro\lesson2]s2tracer.exe
  + throw in constructor
  SuperTracer(throw in constructor)
  - throw in constructor
  Caught SuperTracer::SuperTracer
  + throw in destructor
  SuperTracer(throw in destructor)
  ~SuperTracer
  Caught SuperTracer::~SuperTracer
  Let the fun begin
  + throw in destructor
  SuperTracer(throw in destructor)
  + throw in constructor
  SuperTracer(throw in constructor)
  - throw in constructor
  ~SuperTracer
  Abnormal program termination
  core dumped

The first 4 lines tell that when an exception is thrown in a constructor, all member variables constructed so far are destroyed, through a call to their destructor, but the destructor for the object itself is never run. Why? Well, how do you destroy something that was never constructed?

The next four lines reveal the GCC bug. As can be seen, the exception is thrown in the destructor. The member Tracer variable, however, is not destroyed as it should be (VisualAge C++ handles this one correctly.)

Next we see the interesting case. What happens here is that an object is created that throws on destruction, and then an object is created that throws at once. This means that the first object will be destroyed because an exception is in the air, and when destroyed it will throw another one. The correct result can be seen in the execution above. Program execution must stop, at once, and this is done by a call to the function "terminate". The bug in VisualAge C++ is that it destroys the contained Tracer object before calling terminate.

What's the lesson learned from this? To begin with that it's difficult to find a compiler that correctly handles exceptions thrown in destructors. More important, though, before allowing a destructor to throw exceptions, think *very* carefully. After all, if you throw an exception because an exception is in the air, your program will terminate very quickly. If you have a bleeding edge compiler, you can control this by calling the function "uncaught_exception()" (which tells if an exception is in the air,) and from there decide what to do, but think carefully about the consequences.

An improved stack
The stack from last month was in many ways better than a corresponding C implementation, but it was far from adequate. An easy, C-ish way of improving it, is to implement it as an abstract data type, where functions push, pop, and whatever's needed is available to the users. The C++ way is, not surprisingly, to write a stack class. Before going into that, though, some thinking is needed regarding what the stack should do.

Minimum for a stack is functionality to push new elements onto it, and to pop the top element from it. The pop function is a classical headache, because it both changes the state of the stack (removes the top element from it) and returns whatever was the top element. This behaviour is dangerous in terms of errors, because you can easily lose data. What if something fails while removing the top element? Should you return the top element value? If you do, does that indicate that it has been removed? It's better to make two functions of it, one that returns the top element, and one that removes it. The one that removes it either returns or throws an exception (remember, either a function fails, or does what it's supposed to do, there's no middle way.) If it fails, it exits through an exception, otherwise it returns.

OK, so, on the surface, we can see a class that looks something like this:

  class intstack
  {
  public:

    intstack(); // initialise empty stack
    ~intstack(); // free memory by popping all elements
    void push(int aValue);
    void pop(); // remove top element
    int top(); // retrieve value of top element
  private:
    // implementation details
  };

Normally copying and assignment (a = b) would be implemented too, but we'll wait with that until next month, or this article will grow far too long.

Now let's look at what can go wrong in the different operations.

  • push. Out of memory.
  • pop. What if the stack is empty? It mustn't be.
  • top. Again, what if the stack is empty?
  • construction. Nothing really.
  • destruction. Tough one. If the stack is in a bad state, it might be indestructible, but we can't check it (try to think of a method to do that, without adding significant control data, that probably increases the likelihood of exactly the kind of errors we want to avoid, and tell me if you find one.)

Since top and pop require that the stack isn't empty, we must allow the user to check if the stack is empty, otherwise we don't leave them a chance, so another function is needed.

  • isEmpty. I don't see how anything can go wrong in here.

So, with the problems identified, let's think about what to do when they occur.

  • Out of memory on push. Throw exception and leave stack unchanged.
  • top and pop on empty stack. Throw exception, stack remains empty.
  • invalid stack state in destruction? Can we find out if we have them? I don't think we can. I *think* the best solution for this problem is to just be careful with the coding, and hope it doesn't happen.

This leaves us with two different errors: Stack underflow (pop or top on empty stack), and out of memory. We also found, rather easily, the preconditions for operations pop and top (!isEmpty().)

Now to think of post conditions. What's the post conditions for the different operations?

  push(anInt): The stack can't be empty after that (post conditions always reflect successful completion, not failure.) Also top() == anInt.
  pop(): Currently no way to say, but let's change things a bit. Instead of having the method isEmpty() we add the method nrOfElements(), then nrOfElements will be one less after pop.
  top(): nrOfElements() same after as before.
  Construction (from nothing): nrOfElements() == 0.
  Destruction? Nothing. There's no object left to check the post condition on! We can state a post condition that all memory allocated by the stack object is deallocated.

This looks fair. So, now we can write the public interface of the stack:

  struct bad_alloc {}; // *1*
  struct stack_underflow {};

  class intstack
  {
  public:
    intstack() throw ();
    // Preconditions: -

    ~intstack() throw ();
    // Preconditions: -
    // Postconditions:
    //   The memory allocated by the object is deallocated

    unsigned nrOfElements() throw ();
    // Preconditions: -
    // Postconditions:
    //   nrOfElements() == 0 || top() == old top() // *2*

    void push(int anInt) throw (bad_alloc);
    // Preconditions: -
    // Postconditions:

    //   nrOfElements() > 0
    //   top() == anInt
    // Behaviour on exception:
    //   Stack unchanged.

    int top(void) throw (stack_underflow);
    // Preconditions:
    //   nrOfElements() > 0
    // Postconditions:
    //   nrOfElements() == old nrOfElements()
    // Behaviour on exception:
    //   Stack unchanged.

    void pop(void) throw (stack_underflow);
    // Preconditions:
    //   nrOfElements() > 0
    // Postconditions:
    //   nrOfElements() + 1 == old nrOfElements()
    // Behaviour on exception:
    //   Stack unchanged.
  private:
    intstack& operator=(const intstack& is); // *3*
    intstack(const intstack& is);
    // implementation details
  };

*1*: the structs stack_underflow and bad_alloc are empty. Nothing more is needed, we just throw them, and use the struct itself as the information. For really new compilers, the new operator throws a pre-defined class called bad_alloc. If you have such a compiler, remove the declaration of it above.

*2*: This looks odd, perhaps, but what this means is that if there are elements on the stack, the top elements must be the same. Or literally as it says in the code comment, either the stack is empty, or the top elements are equal. You'll get used to this reversed looking logic.

*3*: This is what the assignment operator looks like, and below it, the copy constructor (constructing a new stack by copying the contents of an old one.) I said we wouldn't implement these this month, and ironically that is why they are declared private. The reason is that if you don't declare a copy constructor and assignment operator, the C++ compiler will do it for you, and unfortunately, the compiler generated ones are usually not the ones you'd want. I'll talk more about this next month. By declaring them private, however, copying and assignment is explicitly illegal, so it's not a problem.

The promise to always leave the stack unchanged when exceptions occur means that we must guarantee that whatever internal data structures we're dealing with must always be destructible. This requirement is also implied by our destructor guaranteeing not to throw anything. This is tricky, but it can be done.

So, how do we implement this then? Why not like the one from last month, with the old "stack_element" as a nested struct within the class, but with an additional element counter? I think that's a perfectly reasonable approach. Here comes the complete class declaration.

  struct bad_alloc {}; // *1*
  struct stack_underflow {};
  struct pc_error {}; // used for post condition violations.

  class intstack
  {
  public:
    intstack() throw ();
    // Preconditions: -

    ~intstack() throw ();
    // Preconditions: -
    // Postconditions:
    //   The memory allocated by the object is deallocated

    unsigned nrOfElements() throw (pc_error);
    // Preconditions: -

    // Postconditions:
    //   nrOfElements() == 0 || top() == old top()

    void push(int anInt) throw (bad_alloc, pc_error);
    // Preconditions: -
    // Postconditions:
    //   nrOfElements() == 1 + old nrOfElements()
    //   top() == anInt
    // Behaviour on exception:
    //   Stack unchanged.

    int top(void) throw (stack_underflow, pc_error);
    // Preconditions:
    //   nrOfElements() > 0
    // Postconditions:
    //   nrOfElements() == old nrOfElements()
    // Behaviour on exception:
    //   Stack unchanged.

    void pop(void) throw (stack_underflow, pc_error);
    // Preconditions:
    //   nrOfElements() > 0
    // Postconditions:
    //   nrOfElements() + 1 == old nrOfElements()
    // Behaviour on exception:
    //   Stack unchanged.
  private:
    intstack(const intstack& is); // hidden!!
    intstack& operator=(const intstack& is); // hidden !!

    struct stack_element
    {
      stack_element(int aValue, stack_element* p) throw ()
        : value(aValue), pNext(p) {};
      int value;
      stack_element* pNext;
    };
    stack_element* pTop;
    unsigned elements;
  };

The only peculiarity here is that the constructor for the nested struct "stack_element" is defined in line (i.e. at the point of declaration.) As a rule of thumb, this should be avoided, but it's OK for trivial member functions, like this constructor, which only copies values.

So let's look at the implementation, bit by bit.

  intstack::intstack() throw ()
    : pTop(0),
      elements(0)
  {
    // Preconditions: -
  }

  intstack::~intstack() throw ()
  {
    // Preconditions: -
    // Postconditions:
    //   The memory allocated by the object is deallocated
    while (pTop != 0)
    {
      stack_element* p = pTop->pNext;
      delete pTop; // guaranteed not to throw.
      pTop = p;

    }
  }

These are rather straight forward. The guarantee that "delete pTop" doesn't throw comes from the fact that the destructor for "stack_element" can't throw (which is because we haven't written anything that can throw, and the contents of "stack_element" itself can't throw since it's fundamental data types only.)

  unsigned intstack::nrOfElements() throw (pc_error)
  {
    // Preconditions: -
    return elements;
    // Postconditions:
    //   nrOfElements() == 0 || top() == old top()
    // no need to check anything with this
    // implementation as it's trivially
    // obvious that nothing will change.
  }

Here I admit to being a bit lazy. Strictly speaking, the post condition should be checked, but since all that is done is to return a value, it is obvious that the stack cannot change from this. I leave the post condition, and an explanation for my laziness, as a comment, though, since it's valuable to others reading the sources. It's also valuable if, for some reason, the implementation is changed so that it is not obvious. If that happens, the check should be implemented.

  void intstack::push(int anInt) throw (bad_alloc, pc_error)
  {
    // Preconditions: -
    unsigned old_nrOfElements = nrOfElements();
    stack_element* pOld = pTop;
    stack_element* pTmp = new stack_element(anInt, pTop);
    // the above either throws or succeeds. If it
    // throws, the memory is deallocated and we're
    // leaving the function with the exception (before
    // assigning to pTop, so the stack remains unchanged.)
    if (pTmp == 0)
      throw bad_alloc();
    try {
      pTop = pTmp;
      ++elements;
      // Postconditions:
      //   nrOfElements() == 1 + old nrOfElements()
      //   top() == anInt
      if (nrOfElements() != 1 + old_nrOfElements
          || top() != anInt)
      {
        throw pc_error();
      }
    } catch (...) {
      // Behaviour on exception:
      //   Stack unchanged.
      delete pTop; // get rid of the new top element
      pTop = pOld;
      elements = old_nrOfElements;
      throw;
    }
  }

This is not trivial, and also contains some news. Let's start from the beginning. "old_nrOfElements" is used both for the post condition check that the number of elements after the push is increased by one, and when restoring the stack should an exception be thrown. The call to "nrOfElements" could throw "pc_error". If it does, the exception passes "push" and to the caller since we're not catching it. This is harmless since we haven't done anything to the stack yet.

On the next line we store the top of stack as it was before the push. This is used solely for restoring the stack in the case of exceptions. This assignment cannot throw since "pOld" and "pTop" are fundamental types (pointers).

On the next line a new stack element is created on the heap. Here there are three possibilities. Either the creation succeeds as expected, in which case everything is fine, or we're out of memory (the only possible error cause here since the constructor for "stack_element" cannot throw.) If you have a brand new compiler, operator new itself throws "bad_alloc" when we're out of memory. If you have such a compiler, it'll most probably complain about the next two lines. If so, just remove them, since they're unnecessary in that case. For most of you, though, an out of memory situation will mean that the return value stored in "pTmp" is 0. That case is taken care of on the next two lines. OK, in either case, if we're out of memory here, "bad_alloc" will be thrown and the stack will be unchanged.

Next we start doing things that change the stack, and since we promise the stack won't be changed in the case of exceptions, things that do change the stack go into a "try" block. Setting the new stack top and incrementing the element counter is not hard to understand. The post condition check, on the other hand, is interesting. Here we have three situations in which an exception results. The call to "nrOfElements" may throw, the call to "top" may throw, and the post condition check itself might fail, in which case we throw ourselves. All these three situations are handled in the catch block.

"catch (...)" will catch anything thrown from within the try block above. What we do when catching something, is to free the just allocated memory (which won't throw for the same reason as for the destructor.) We also restore the old stack top and the element counter. Thus the stack is restored to the state it had before entering "push", without having leaked memory. Then, what we must do, is to pass the error on to the caller of "push", and that is what the empty "throw;" does. An empty "throw;" means to re throw whatever it was that was caught. A throw like this is only legal within a catch block (use it elsewhere and see your program terminate rather quickly.)

  int intstack::top(void) throw (stack_underflow, pc_error)
  {
    // Preconditions:
    //   nrOfElements() > 0
    if (nrOfElements() == 0 || pTop == 0)
    {
      throw stack_underflow();
    }
    return pTop->value;
    // Postconditions:
    //   nrOfElements() == old nrOfElements()
    // No need to check with this implementation!
  }

This is not so difficult. If we have no elements on the stack, we throw, otherwise return the top value. As with "nrOfElements", I'm lazy with the post condition check, but careful to document the behaviour should the implementation for some reason change into something less obvious.

  void intstack::pop(void) throw (stack_underflow, pc_error)
  {
    // Preconditions:
    //   nrOfElements() > 0
    if (nrOfElements() == 0 || pTop == 0)
    {
      throw stack_underflow();
    }
    unsigned old_elements = nrOfElements();
    stack_element* pOld = pTop;

    try {
      pTop = pTop->pNext;
      --elements;
      // Postconditions:
      //   nrOfElements() + 1 == old nrOfElements()
      if (nrOfElements() + 1 != old_elements)
      {
        throw pc_error();
      }
    } catch (...) {
      // Behaviour on exception:
      //   Stack unchanged.
      elements = old_elements;
      pTop = pOld;
      throw;
    }
    delete pOld; // guaranteed not to throw.
  }

The exception protection of "pop" works almost exactly the same way as with "push". The thing worth mentioning here, though, is why "delete pOld" is located after the "catch" block and not within the "try" block. Suppose the deletion did, despite its promise, throw something. If it did, it would be caught, and the top of stack would be left to point to something undetermined. As it is now, if it breaks its promise, we too break our promise not to alter the stack when leaving on exception, but we at least make sure the stack is in a usable (and most notably, destructible) state.

After having spent this much time on writing this class, it's time to have a little fun and play with it, don't you think?

  #include <iostream.h>

  int main(void)
  {
    try {
      cout << "Constructing empty stack is1" << endl;
      intstack is1;
      cout << "is1.nrOfElements() = " << is1.nrOfElements() << endl;
      cout << "is1.push(5)" << endl;
      is1.push(5);
      cout << "is1.top() = " << is1.top() << endl;
      cout << "is1.nrOfElements() = " << is1.nrOfElements() << endl;
      cout << "is1.push(32)" << endl;
      is1.push(32);
      cout << "is1.top() = " << is1.top() << endl;
      cout << "is1.nrOfElements() = " << is1.nrOfElements() << endl;
      cout << "is1.pop()" << endl;
      is1.pop();
      cout << "is1.top() = " << is1.top() << endl;
      cout << "is1.nrOfElements() = " << is1.nrOfElements() << endl;
      cout << "is1.pop()" << endl;
      is1.pop();

    }
    catch (bad_alloc&) {
        cout << "Out of memory" << endl;
    }
    catch (stack_underflow&) {
        cout << "Stack underflow" << endl;
    }
    catch (pc_error&) {
        cout << "Post condition violation" << endl;
    }
    return 0;
}

I'm staying within the limits of the allowed here, to ensure that the program won't die all of a sudden, but please make changes to the test program, and the stack implementation, to break the rules and see what happens. It should either work, or say why it fails.

Um, yes, by the way, on the catch clauses I don't bind the exception instance caught to any named parameter. The reason is simply that I don't use it. The knowledge that something of that type has been caught is, in this case, enough.

Now I will break a promise from last month, because this is where I end this month's lesson. I won't go into more details with references. That'll be dealt with later.

Exercise
1. Something that is badly missing in the stack implementation above, is a good check for the integrity of the stack object itself. For example, what if we somehow manage to get "elements" to non-zero, while "pTop" is 0? That's a terrible error that must not occur, and if it does, it must not go undetected. What I'd like you to do, however, is to see what kind of "internal state" tests can be done, and to implement them. Please discuss your ideas with me over e-mail (this month, if I take a long time in responding, please don't feel ignored. I'll be net-less most of August.)
2. It's generally considered to be a bad idea to have public data in a class. Can you think of why? Mail me your reasons. (This is not to say that it should never ever be done, but that a lot of thought is required before doing so.)

Recap
Again a month filled with news.
• You have seen how classes can be used to encapsulate internals and publish interfaces with the aid of the access specifiers "public:" and "private:"
• Member functions are always called together with instances (objects) of the class, and thus always have access to the member data of the class.
• A member function can access private parts of a class.
• Destruction of objects is done in the reversed order of construction, except when the objects are allocated on the heap, in which case they're destroyed when the delete operator is called for a pointer to them.
• We have seen that throwing exceptions in destructors can be lethal.
• You have seen how it is possible to, by carefully crafting your code, make your member functions "exception safe" without being bloated with too many special cases ("catch(...)" and "throw;" help considerably here.)
• You can now iterate your way to a good design by thinking of
  1. What the function should do.
  2. What can go wrong.
  3. What should happen when something goes wrong.
  4. How can a user of the class prevent things from going wrong.
  When you have satisfactory answers to all four questions for all functionality of your class, you have a safe design.

Next
Next month there will most probably be a break, since I'll be on a well needed vacation. After that, we'll have a look at copy construction and assignment (together with a C++ idiom often referred to as the "orthodox canonical form.") I promise to explain the references in more detail too.

Please, Write me! I want opinions and questions. If you think I'm wrong about things, going too fast, too slow, teaching the wrong things, whatever, tell me, ask me. I need your constant feedback.

[NOTE: Here is a link to a zip of the source code for this article. Ed.]
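The recap's point about "catch(...)" and "throw;" can be illustrated in miniature. What follows is only a sketch in present-day C++, not the article's "intstack": the "Counter" type, its "add" member and the "failPostcondition" flag are hypothetical names made up for the illustration, and the failing postcondition is simulated rather than real.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical miniature of the roll-back pattern used by "push" and "pop".
struct Counter {
    int value = 0;

    // Adds delta. If the (simulated) postcondition check fails, the old
    // value is restored and the exception is passed on to the caller.
    void add(int delta, bool failPostcondition) {
        const int old_value = value;
        try {
            value += delta;
            if (failPostcondition) {
                throw std::runtime_error("postcondition violated");
            }
        } catch (...) {
            value = old_value; // roll back: object unchanged on exception
            throw;             // rethrow whatever was caught
        }
    }
};

// Returns true if a failed call leaves the object exactly as it was.
bool unchangedAfterFailure() {
    Counter c;
    c.add(5, false);
    try {
        c.add(7, true);
    } catch (const std::runtime_error&) {
        return c.value == 5;
    }
    return false; // the exception should have reached us
}
```

Calling "unchangedAfterFailure()" exercises both paths: one successful call, and one that fails and must leave the value where it was.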

So far we've covered error handling through exceptions, and encapsulation with classes. The feedback from part 2 tells me I forgot a rather fundamental thing: what exactly is encapsulation, and what should we make classes of? What's the meaning of a class? I will get to this, but first let's finally have a look at the promised references.

References
C++ introduces an indirect kind of type that C does not have, the reference. A reference is in itself not a type, it always is a reference of some type, just like arrays are arrays of something and pointers are pointers to something. A reference is a means of indirection. It's a way of reaching another variable. This may sound a lot like a pointer, but don't confuse the two, they're very different. See a reference more as an alias for another variable. References are denoted with a unary "&", just the same way as pointers are denoted with a unary "*". Let's have a look at an example:

    int main(void)
    {
        int i = 0;
        int& r = i; // r now refers to i.
        ++r;        // now i == 1, r still refers to i.
        if (&r != &i) throw "Broken compiler";
        return 0;
    }

Some details about references:
• References must be initialized to refer to something. There is no such thing as a 0 reference.

    int& x; // error, would be an unbound reference.

• Once bound to a variable there is no way to make the reference refer to something else.
• There is no such thing as "reference arithmetic."
• You cannot get the address of a reference. You can try to, but what you get is the address of the variable referred to.

From this, one may wonder, what on earth are references for? Why would anyone want them? Well, for one, they offer a certain security over pointers; it's so easy to get a pointer referring to something that doesn't exist, or something else than the intended. They're also handy as a short-cut into long nested structs and arrays. Here's an example:

    struct A {
        int b[5];
        int x;
    };

    struct C {
        A* p[10];
        int q;
        char d;
    };

    C* pc;
    // and somewhere else, pc is given a value and is
    // here used.
    A& ra = *pc->p[2]; // p[2] is a pointer, so it is dereferenced
    ra.b[3]=5; // pc->p[2]->b[3] = 5;
    ra.b[4]=2; // pc->p[2]->b[4] = 2;

The reference in this case just makes life easier.

References are also often used for parameters to functions. In parts 1 and 2, I used references when catching exceptions. If we look at the exceptions, it means that instead of getting a local copy of the thing thrown, we get a reference to the thing thrown, and we can manipulate it if we want to. The same goes for parameters to functions. Passing an object by reference instead of by value can sometimes be necessary, and sometimes beneficial in terms of performance. The reason for the performance benefit is that when passing a parameter by value, the object is copied; the function uses a local object of its own, that is identical to the one you passed. If copying the object is an expensive operation, which in some cases it is, then passing by reference is cheaper. However, passing parameters by reference can be dangerous. When you do, instead of manipulating our own copy, the function has

access to the very object you pass, and if the function modifies it, the caller better be prepared for that. A commonly used way around this is to declare the parameter as a "const" reference. This means that you get a reference, as before, but the reference is treated as if it were a constant, and because of this all attempts to change its value will cause a compile time error. Here's an example of passing a parameter by "const" reference. It uses the "intstack" from the previous lesson:

    // put the declaration of intstack here.

    void workWithStack(const intstack& is)
    {
        // work with is
        is.pop(); // Error, attempting to alter
                  // const reference.
    }

    int main(void)
    {
        intstack i;
        i.push(5);
        workWithStack(i);
        return 0;
    }

Since the "intstack" class does not have an accessible copy constructor (it was declared private, remember?) it is impossible to pass instances of it to functions in other ways than by reference (or by pointer.)

There are situations when a reference is dangerous, though. One such trap, that I think all C++ programmers fall into at least once, is returning a reference to a local variable. Have a look at this:

    #include <iostream.h>

    int& ir(void) // function returning reference to int
    {
        int i = 5;
        return i;
    }

    int main(void)
    {
        cout << ir() << endl;
        return 0;
    }

What will this program print? It's hard to tell, if it prints at all. It might just crash, at once. It could be anything that the "main" function prints. Why? The function "ir" returns a reference to an "int". So far so good, but what does the reference returned refer to? It refers to the local variable "i" in "ir". If you remember the "tracer" examples from the previous lesson, you remember that the variable ceases to exist when exiting the function. In other words, the reference returned refers to a variable that no longer exists! Don't do this! Or rather, do it now, just to have your one time mistake over with :-)

What's a class?
Now for the theoretical biggie. What, exactly, is the meaning of a class? When should you write a class, what should the class allow you to do, and what's a good name for a class? What's the relation between classes and objects? When you write programs in Object Oriented Programming Languages, be it C++, Java, SmallTalk, Objective-C, Modula-3, Eiffel or whatever, you write classes. So, how can you know what classes to make?

A class is, as I mentioned in part 2, a method of encapsulation, but more importantly, a class is a type. When you define a class, you add a new type to the language. C++ comes with a set of built in types like "int", "unsigned" and "double". In the previous lesson, when we wrote the class "intstack", we introduced a new type to the language, the stack of integers, which programs could use. The member functions of the class, in addition to well defined construction and destruction of instances, describe the semantics of the type. With the built in integral types, we have operations like adding two instances of the type, yielding a third instance, which value happens to be the sum of the values of the other two. We can increment the value of instances of the type with operations like ++, and so on. With the "intstack", we had the operations "pop", "top", "push" and "nrOfElements".

Classes are, as a rule of thumb, descriptions of ideas. "Bicycle" for example, is a classic example of a class. The idea "Bicycle" that is, not my particular bicycle. My bicycle is a

physical entity that is currently getting wet in the rain. What my bicycle is, on the other hand, is a good candidate for an instance of the class "Bicycle." The idea of bicycle is a very good candidate for a class. So, if you can say "The X...", "An X...", or "A kind of X...", when thinking of the problem you want to solve, you might have a good candidate for a class X.

Usually instances of the class have a state (for example, the state of a stack is the elements in it, and their order.) Having state means that the same member function can give different results depending on what has been done to the object before calling the member function (again, with a stack, the value returned by "top()" or "nrOfElements()" depends on the history of "push()" and "pop()" calls.) The class has member data to represent state, and the functions that represent the semantics.

There are, however, exceptions to this rule of thumb. For example, is a mathematical function a class that you'd like to have instances of to toy with in your program? According to the rule, it is not, since a mathematical function is state less. In most situations, the answer would, as expected, be no, but if you design a program for use by electronics engineers when designing their gadgets, you better have amplifiers (multiplication,) adders, subtractors and so on, or they won't use your program.

A class represents the idea, like "Bicycle" or "intstack". The objects are the instances of types (yes, an instance of type "float" is also an object; they need not be instances of classes.) Objects are run-time entities. When your program executes, the identifiers (like "pc" in the reference example above) are replaced by bit-patterns representing objects. Note that objects don't exist when you write your program. When you write your program, what exists are types, descriptions of how instances of types can be used, and descriptions of semantics and state representation. Given this, you might start to feel like someone's been fooling you. This Object Oriented Programming thing is a hoax! What it's all about, is class oriented programming. However, the things that you need to do, when solving your problem, must in one way or the other be expressible through the classes. The objects are, after all, just the run time instances of the classes.

So, what member functions should a class have? This is even harder to say, because there are so many ways to solve every problem. If I need to ride my bicycle, it can be that the class "Bicycle" should have the member function "beRiddenBy" accepting an instance of class "Human", but it might also be that class "Human" should have the member function "ride" accepting an instance of class "Bicycle" as its parameter. If the starting point or destination are important, they probably should be parameters to the member functions. If the road itself is important, you probably need a class "Road", which you want to pass an instance of to the member function of either "Bicycle::beRiddenBy" or "Human::ride".

The Orthodox Canonical Form
The basic operations you should, in general, be able to do with objects of any class are construction from scratch, construction by copying another instance, assignment and destruction. Construction from scratch we've seen. The Orthodox Canonical Form poses the additional requirement that an instance must always be copyable. This places a slightly heavier burden on us, compared to the work with the "intstack." The "intstack" guaranteed that no matter what happened, an instance was always destructible. Normally this extra burden is light, but there are tricky cases.

Construction by copying is done by the copy constructor. Given a class named C, the copy constructor looks like this:

    class C
    {
    public:
        C(const C& c); // copy constructor
        // other necessary member functions.
    };

The job of the copy constructor is to create an object that is identical to another object. It is your job to make sure it does this. This does not mean that every member variable of the newly constructed object must have values identical to the ones in the original. On the contrary, they often differ. What's important, though, is that they're semantically identical (i.e. given the same input to member functions, they give the same response.) The "intstack" for example must make its own copy of the stack representation in the copy constructor. This means that the base pointer will differ, but as far as you can see through the "push", "pop" and "top" member functions, there is no difference between the copy and the original.

Next in line is copy assignment. Again, given a class C, the copy assignment operator looks like this:

    class C
    {
    public:
        C& operator=(const C&);
        // other necessary member functions.
    };
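The idea of a copy being "semantically identical" while its representation differs can be tried out with a type that already follows these rules. The sketch below is an illustration only, not part of the article's code: it uses "std::vector<int>" from the standard library as a stand-in for a stack, and "copyIsSemanticallyIdentical" is a hypothetical helper name.

```cpp
#include <cassert>
#include <vector>

// Returns true if a copy answers like the original, owns its own
// storage, and can be changed without affecting the original.
bool copyIsSemanticallyIdentical() {
    std::vector<int> original;
    original.push_back(5);
    original.push_back(32);

    std::vector<int> copy(original); // construction by copying

    // Same answers through the public interface...
    const bool sameBehaviour =
        copy.size() == original.size() && copy.back() == original.back();
    // ...although the "base pointer" of the representation differs.
    const bool ownStorage = copy.data() != original.data();

    copy.back() = 7; // changing the copy must leave the original alone
    return sameBehaviour && ownStorage && original.back() == 32;
}
```

The same three checks are a reasonable starting point when testing a copy constructor you have written yourself.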

Writing the copy assignment operator is more difficult than writing the copy constructor. Not only does the copy assignment operator need to make the object equal to its parameter, it also needs to cleanly get rid of whatever resources it might have held when being called (The copy constructor does not have this problem since it creates a new object that cannot have held any data since before.) The return value of an assignment operator is (by tradition, not by necessity) a reference to the object just assigned to. When inside a member function (the assignment operator as defined above is a member function) the object can be reached through a pointer named "this," which is a pointer to the class type. For the class C, above, the type of "this" is "C* const". This means that it's a pointer to type C, and the pointer itself is a constant (i.e. you cannot make "this" point to anything else than the current instance.) The reference to the object is obtained by dereferencing the "this" pointer, so the last statement of an assignment operator is almost always "return *this;"

The difficulty of writing a good copy constructor and copy assignment operator is best shown through a classical error:

    class bad
    {
    public:
        bad(void);                  // default constructor
        bad(const bad&);            // copy constructor
        ~bad(void);                 // destructor
        bad& operator=(const bad&); // copy assignment
    private:
        int* pi;
    };

    bad::bad(void)
      : pi(new int(5)) // allocate new int on heap and
                       // give it the value 5.
    {
    }

    bad::bad(const bad& b)
      : pi(b.pi) // initialize pi with the value of b's pi
    {
        // This is very bad, as you will soon see
    }

    bad::~bad(void)
    {
        delete pi;
    }

    bad& bad::operator=(const bad& b)
    {
        pi = b.pi; // This seemingly logical and simple
                   // assignment is also disastrous.
        return *this;
    }

    int main(void)
    {
        bad b1;
        {
            bad b2(b1); // b2.pi is now the same as b1.pi.
        } // Here b2 is destroyed, and b2's destructor is
          // called. This means that the memory area
          // pointed to by b2.pi (and hence also b1.pi) is
          // no longer valid
        bad b3(b1); // Make b3.pi point to the same invalid
                    // memory area!
        bad b4;
        bad b5;
        b5 = b4; // The memory allocated by b5 was never
                 // deallocated. We have a memory leak!

        return 0;
    } // The destructors of b4 and b5 both attempt
      // to deallocate the same memory (b5 first,
      // and then b4, which deallocates
      // already deallocated memory.)
      // The destructors of b1 and b3 attempt to deallocate
      // the memory already deallocated by the destructor
      // of b2.

OK, so from the example it is pretty clear that it's more work than this. The copy constructor should allocate its own memory, and initialise that memory with the same value as that pointed to by the original. This goes for the copy assignment operator too, but it also needs to discard the pointer it already had. By doing so, we guarantee that the pointers owned by the objects are truly theirs, and their destructor can safely deallocate them. A version of the program fixing the above issues can show you what is meant by that:

    // class declaration, default constructor and
    // destructor are identical and because of that not
    // shown here.

    bad::bad(const bad& b)
      : pi(new int(*b.pi)) // initialize pi as a new int
                           // with the value of b's pi
    {
        // This guarantees that both the new object and the
        // original are destructible.
    }

    bad& bad::operator=(const bad& b)
    {
        delete pi;           // No more memory leak
        pi = new int(*b.pi); // Get a new pointer and
                             // initialise just as in
                             // the copy constructor.
        return *this;
    } // Can you spot the problem with this one?

    int main(void)
    {
        bad b1;
        {
            bad b2(b1); // b2.pi now points to its own area.
        } // Here b2 is destroyed, and b2's destructor is
          // called. This means that the memory area
          // pointed to by b2.pi is no longer valid,
          // correctly so.
          // b1.pi is still valid, though.
        bad b3;  // Allocate new memory
        b3=b1;   // Deallocate, and allocate new again
        b3=b3;   // Whoa!! b3.pi is first deallocated,
                 // then b3.pi is allocated again and
                 // initialised with the value no longer
                 // available!!!
        return 0;
    } // all OK, all objects refer to their own
      // memory, so deallocation is not a
      // problem.

We do, however, have yet a problem to deal with, that of self assignment. OK, so assigning an object to itself is perhaps not the most frequently done operation in a program, but that doesn't mean it's allowed to crash, right? So, how can we make the copy assignment operator safe from self assignment? Here are two alternatives:

    bad& bad::operator=(const bad& b)
    {
        if (pi != b.pi) {
            delete pi;
            pi = new int(*b.pi);

        }
        return *this;
    }

    bad& bad::operator=(const bad& b)
    {
        if (this != &b) {
            delete pi;
            pi = new int(*b.pi);
        }
        return *this;
    }

Common to both is that they check if the right hand side (parameter b) is the same object. If it is, the assignment is simply not done. The first alternative does this by comparing the "pi" pointer. This is of course hard, because normally classes have more than one member variable to check for. The second by comparing the pointer to the objects themselves. The latter perhaps feels a bit harder to understand, but it's actually the one most frequently seen. With these changes done, the class deserves a name change. It is no longer bad.

One last thing before wrapping up. Note that if your class only has member variables of types for which copying the values does not lead to problems, the tests above are not necessary. As a matter of fact, you need not write the two functions at all. The reason is that a C++ compiler automatically generates a copy constructor and copy assignment operator for you if you don't declare them. The auto-generated copy constructor and assignment operator will, by default, just copy/assign the member variables, one by one. In some cases this is perfectly OK. The "Range" class from the previous lesson, for example, does fine with this auto-generated copy constructor and copy assignment operator. The "intstack" on the other hand does not, since then both the original and the copy would share the same representation (and have exactly the same problem as described in the above "bad" example!) In the previous lesson, the copy constructor and copy assignment operator were declared private, to prevent copying and assignment. If you decide that for your class, the auto generated copy constructor and/or copy assignment operator is OK, leave a comment in the class declaration saying so, so that readers of the source code know what you're thinking. Otherwise they might easily think you've simply forgotten to write them.

Const Correctness
When talking about passing parameters to functions by reference, I mentioned the const reference as a way to ensure that the parameter won't get modified, since the const reference treats whatever it refers to as a constant and thus won't allow you to do things that would modify it. The question is, how does the compiler know if something you do to an object will modify it? Does "pop" modify the "intstack?" Yes, it does. It removes the top element. Does "top" modify the stack? No, so it should be allowed to call "top" for a constant stack, right? The problem is that the compiler doesn't know which member functions modify the objects, and which don't (and assumes they do, just to be on the safe side) unless you tell it differently. Since, by default, a member function is assumed to alter the object, you are, however, not allowed to do anything at all to a constant object. Fortunately we can tell the compiler differently. We can change "top" to be declared as follows:

    class intstack
    {
    public:
        // misc member functions
        int top(void) const throw(stack_underflow,pc_error);
        // misc other member functions
    };

It's the word "const" after the parameter list that tells the compiler that this member function will not modify the object and can safely be called for constant objects. Now when we know about references, we can do even better by writing two member functions "top", one "const" and one not, with the non-const version returning a non-const reference to the element instead. This has a tremendous advantage: For constant stack objects, we can get the value of the top element; for non-constant stack objects, we can alter the value of the top element by writing like this:

    intstack is;
    is.push(5);   // top is now 5.
    is.top() = 3; // change value of top element!

There is no magic involved in this. Just as I mentioned in part one, functions can be overloaded if their parameter list differs. Member functions can be overloaded on "constness." The "const" member function is called for constant objects (or, const references or pointers, since they both treat the object referred to as if it was a constant.) The non-

const member function is called for non-constant objects, and that is usually desirable. Note that it is only member functions you can do this "const" overloading on. You cannot declare non-member functions "const." Our overloaded "top" member functions can be declared like this:

    class intstack
    {
    public:
        // misc member functions
        int top(void) const throw(stack_underflow,pc_error);
        int& top(void) throw (stack_underflow, pc_error);
        // misc other member functions
    };

This is getting too much without concrete examples. Here's a version of "intstack" with copy constructor, copy assignment operator, const version of "top" and "nrOfElements", and a non-const version of "top" (just as above.) Only the new and changed functions are included here. You'll find a zip file with the complete sources at the top.

    class intstack
    {
    public:
        // the previous member functions

        intstack(const intstack& is) throw (bad_alloc);
        // Preconditions:

        intstack& operator=(const intstack& is) throw (bad_alloc);
        // Preconditions:

        unsigned nrOfElements() const throw (pc_error);
        // Preconditions:
        // Postconditions:
        //   nrOfElements() == 0 || top() == old top()

        int top(void) const throw(stack_underflow, pc_error);
        // Preconditions:
        //   nrOfElements() > 0
        // Postconditions:
        //   nrOfElements() == old nrOfElements()
        // Behaviour on exception:
        //   Stack unchanged.

        int& top(void) throw (stack_underflow,pc_error);
        // Preconditions:
        //   nrOfElements() > 0
        // Postconditions:
        //   nrOfElements() == old nrOfElements()
        // Behaviour on exception:
        //   Stack unchanged.

    private:
        // helper functions for copy constructor, copy
        // assignment and destructor.
        stack_element* copy(void) const throw (bad_alloc);
        void destroyAll(void) throw();
    };

Since copying elements of a stack is the same when doing copy assignment and copy construction, I have a private helper function that does the job. The same goes for deallocation of the stack. It is needed both in copy assignment and destructor. This is not necessary by any means, but it means I won't have identical code in two places. After all, if ever you need to change the code, you can bet you'll forget to update one of them otherwise, and you have a subtle bug that may be hard to find. With only one place to update, that mistake is hard to make. Since these helper functions "copy" and "destroyAll" are purely intended as an aid when implementing copy assignment, copy constructor and destructor, they're declared private. Just as a private member variable can

only be accessed from the member functions of a class, member functions declared private can only be accessed from member functions of the same class, and not by anyone else. They have nothing what so ever to do with how the stack works, just how it's implemented.

Here comes the new implementation of "nrOfElements." Can you see what's different from the previous lesson?

    unsigned intstack::nrOfElements() const throw (pc_error)
    {
        // Preconditions:
        return elements;
        // Postconditions:
        //   nrOfElements() == 0 || top() == old top()
        // no need to check anything with this
        // implementation as it's trivially
        // obvious that nothing will change.
    }

There isn't anything at all that differs from the previous version of "nrOfElements", other than that it's declared to be "const." Had we, in this member function (or any other member function declared as "const") attempted to modify any member variable, the compiler would give an error, saying that we're attempting to break our promise not to modify the object. "const" methods are thus good also as a way of preventing you from making mistakes, in addition to making those member functions callable for constant objects (or constant references or pointers.) Whenever you have a member function that does not modify any member variable, declare it "const" so that errors can be caught by the compiler. It saves you debug time. Note that declaring a member function "const" does not mean it's only for constant objects, it just means that it's callable on constant objects too.

Next in turn is "top", or rather the two versions of "top":

    int intstack::top(void) const throw (stack_underflow, pc_error)
    {
        // Preconditions:
        //   nrOfElements() > 0
        if (nrOfElements() == 0 || pTop == 0) {
            throw stack_underflow();
        }
        return pTop->value;
        // Postconditions:
        //   nrOfElements() == old nrOfElements()
        // No need to check with this implementation!
    }

    int& intstack::top(void) throw (stack_underflow, pc_error)
    {
        // Preconditions:
        //   nrOfElements() > 0
        if (nrOfElements() == 0 || pTop == 0) {
            throw stack_underflow();
        }
        return pTop->value;
        // Postconditions:
        //   nrOfElements() == old nrOfElements()
        // No need to check with this implementation!
    }

As can be seen, not much differs between the two variants of "top." The implementation is in fact identical for both, but the first one returns a value and is declared const, the other one is not declared const and returns a reference. So why do we have two identical implementations here, when I earlier mentioned that this is always undesirable? The reason is simply that although the implementation is identical, neither can be expressed in terms of the other. The non-const version cannot be implemented with the aid of the const version, since we'd then return a reference to a local value. This is always bad, does not have the desired effect, and is quite likely to cause unpredictable run-time behaviour. The "const" version could be implemented in terms of the non-const version, if it wasn't for the fact that

it is not declared "const." The implementation of a const member function is not allowed to alter the object, and is, as a consequence of this, not allowed to call non-const member functions for the same object.

Remember that a reference really isn't an object on its own? You cannot distinguish it in any way from the object it refers to. In this case it means that what's returned from the non-const version of "top" is the top element itself, not a local copy of it. Since it is the element itself, it can be modified. Note that there is a danger in this too: What about this example?

    intstack is;
    is.push(45);
    int& i=is.top();  // i now refers to the top element
    i=32;             // modify top element.
    int val=is.top(); // val is 32.
    is.pop();         // what does i refer to now?
    i=45;             // what happens here?

The answer to the last two questions is that "i" refers to a variable that no longer exists and that when assigning to it, or getting a value from it, anything can happen. If you're lucky, your program will crash right away; if you're out of luck, it'll start behaving randomly erratically!

Now for the copy constructor. With the help of the "copy" member function, it's really simple!

    intstack::intstack(const intstack& i) throw (bad_alloc)
      : pTop(i.copy()),
        elements(i.elements)
    {
        // Preconditions:
        // Postconditions:
    }

The "pTop" member of the instance being created is initialized with the value from "i.copy()". The "copy" helper function creates a new copy of i's representation ("pTop" and whatever it points to) on the heap and returns the pointer to its base. If "i" is an empty stack, "copy" returns 0. If we run out of memory when "copy" is working, whatever was allocated will be deallocated, and "bad_alloc" thrown. In this case, it means that "bad_alloc" will be thrown before "pTop" is initialized, and since "bad_alloc" is not caught in the function, it flows out of the function as intended, and thus the new object will never be constructed.

The copy assignment operator is a little bit trickier, but not that bad.

    intstack& intstack::operator=(const intstack& i) throw (bad_alloc)
    {
        if (this != &i) {
            stack_element* pTmp = i.copy(); // can throw bad_alloc
            destroyAll();                   // guaranteed not to throw!
            pTop=pTmp;
            elements=i.elements;
        }
        return *this;
    }

Seemingly simple, and yet both efficient and exception safe. The difficulty lies in being careful with the order in which to do things. Here a temporary pointer "pTmp" is first set to refer to the copy of "i's" representation. Since we're not catching "bad_alloc", it'll flow off to the caller if thrown. If the copying fails, the member variables are not altered, and the object remains unchanged (whenever possible, try to leave objects in an unaltered state in the presence of exceptions, and always leave them destructible and copyable.) If copying is successful, we can safely destroy whatever we have and then change the "pTop" member variable. Since we've promised that "destroyAll" won't throw anything (a promise we could make, since we've promised that our destructors won't throw) the rest is guaranteed to work.

Again, this is very important from an exception handling point of view. Suppose we first destroyed the contents and then tried to get a copy, but the copying threw "bad_alloc." The contents would be gone, but our own "pTop" would point to something illegal, and thus our promise to always stay destructible, and copyable whenever resources allow, would be broken. Instead, first getting the copy is essential, and after that destroy our own representation. Also, since we first get a local copy of the object assigning from, the self assignment guard ("if (this != &i)") is not necessary. It's a pure performance boost by making sure we do nothing at all instead of duplicating the representation, just to destroy the original.

With the aid of the "destroyAll" helper function, the destructor becomes trivial:

    intstack::~intstack(void)
    {
        destroyAll();
    }

So, how is this magic "destroyAll" helper function implemented? It's actually identical with the old version of the destructor.

    void intstack::destroyAll(void) throw ()
    {
        while (pTop != 0) {
            stack_element* p = pTop->pNext;
            delete pTop; // guaranteed not to throw.
            pTop = p;
        }
    }

Now the only thing yet untold is how the helper function "copy" is implemented. It's the by far trickiest function of them all.

    intstack::stack_element* intstack::copy(void) const throw (bad_alloc)
    {
        stack_element* pFirst = 0; // used in catch block.
        try {
            stack_element* p = pTop;
            stack_element* pPrevious = 0;
            if (p != 0) {
                // take care of first element here.
                pFirst = new stack_element(p->value,0);
                // Cannot throw anything except bad_alloc
                if (pFirst == 0) //**1
                    throw bad_alloc();
                pPrevious = pFirst;

                // Here we take care of the remaining elements.
                while ((p = p->pNext) != 0) //**2
                {
                    pPrevious->pNext = new stack_element(p->value,0);
                    // cannot throw except bad_alloc
                    pPrevious = pPrevious->pNext;
                    if (pPrevious == 0) //**1
                        throw bad_alloc();
                }
            }
            return pFirst;
        }
        catch (...) // If anything went wrong, deallocate all
        {           // and rethrow!
            while (pFirst != 0) {
                stack_element* pTmp = pFirst->pNext;
                delete pFirst; // guaranteed not to throw.
                pFirst = pTmp;
            }
            throw;
        }
    }
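For readers with a current compiler, the pieces above can be put together roughly as follows. This is a condensed sketch under stated assumptions, not the author's code: it is written in present-day C++ (so the "throw (...)" exception specifications are left out, "nullptr" replaces 0, and "new" is assumed to throw "std::bad_alloc" on failure), the class is renamed "IntStack" and its element type "Node" to mark it as hypothetical, and the precondition and postcondition checks of the original are omitted for brevity.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical modern-C++ condensation of the article's intstack.
class IntStack {
public:
    IntStack() : pTop(nullptr), elements(0) {}
    IntStack(const IntStack& other)
        : pTop(other.copy()), elements(other.elements) {}
    IntStack& operator=(const IntStack& other) {
        if (this != &other) {
            Node* pTmp = other.copy(); // may throw; *this untouched so far
            destroyAll();              // does not throw
            pTop = pTmp;
            elements = other.elements;
        }
        return *this;
    }
    ~IntStack() { destroyAll(); }

    void push(int v) { pTop = new Node{v, pTop}; ++elements; }
    void pop() {                       // precondition: stack not empty
        Node* pOld = pTop;
        pTop = pTop->pNext;
        --elements;
        delete pOld;
    }
    int top() const { return pTop->value; } // precondition: not empty
    int& top() { return pTop->value; }      // precondition: not empty
    std::size_t nrOfElements() const { return elements; }

private:
    struct Node { int value; Node* pNext; };

    // Duplicates the whole chain; on failure frees what was built so far.
    Node* copy() const {
        Node* pFirst = nullptr;
        try {
            Node* pPrevious = nullptr;
            for (const Node* p = pTop; p != nullptr; p = p->pNext) {
                Node* pNew = new Node{p->value, nullptr};
                if (pPrevious == nullptr) pFirst = pNew;
                else pPrevious->pNext = pNew;
                pPrevious = pNew;
            }
        } catch (...) {
            while (pFirst != nullptr) {
                Node* pNext = pFirst->pNext;
                delete pFirst;
                pFirst = pNext;
            }
            throw;
        }
        return pFirst;
    }
    void destroyAll() noexcept {
        while (pTop != nullptr) {
            Node* p = pTop->pNext;
            delete pTop;
            pTop = p;
        }
    }

    Node* pTop;
    std::size_t elements;
};
```

A quick way to try it is to push a couple of values, copy the stack, change the copy through the non-const "top()", and check that the original still reports its old top value.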

To begin with, the return type is "intstack::stack_element*". The type "stack_element" is only known within "intstack," so whenever used outside of "intstack" it must be explicitly stated that it is the "stack_element" type that is defined in "intstack." As long as we're "in the header" of a member function, nested types must be explicitly stated. Well within the function, however, it is no longer needed, since it is then known what class the type belongs to.

The whole copying is in a "try" block, so we can deallocate things if something goes wrong. The local variable "pFirst", used to point to the first element of the copy, is defined outside of the "try" block, so it can be used inside the "catch" block. If we didn't leave this for the "catch" block, there would be no way it could find the memory to deallocate.

If "pTop" is non-zero, the whole structure that "pTop" refers to is copied. At the places where a "stack_element" is allocated, it is important that the "pNext" member variable is given the value 0, since it is always put at the end of the stack. It's not until we have successfully created another element to append to the stack, that the "pNext" member variable is given a value other than 0. If it was not set to 0, it would not be possible to know that it was the last element, and our program would behave erratically.

There are two details worth mentioning here.
• The "if" statements marked //**1 are only needed for older compilers. New compilers automatically throw "bad_alloc" when they're out of memory. Old compilers, however, return 0.
• The "while" statement marked //**2 might look odd. What happens is that the variable "p" is given the value of "p->pNext", and that value is compared against zero. Remember that assignment is an expression, and that expression can be used, for example, for comparisons. The assignment "p=p->pNext" must be in a parenthesis for this to work. The precedence rules are such that assignment has lower precedence than comparison, so if we left out the parenthesis, the effect would be to assign "p" the value of "p->pNext" compared to 0, which would not be what we intended.

Now, it's up to you to toy with the "intstack". Whenever you have a need for a stack of integers, here you have one.

Exercises
• When is guarding against self assignment necessary? When is it desirable?
• How can you disallow assignment for instances of a class?
• The non-const version of "top" returns a reference to data internal to the class. Mail me your reasons for why this can be a bad idea (it can, and usually even is!) Can it be bad in this case?
• When can returning references be dangerous? When is it not?
• Mail me an exhaustive list of reasons when assignment or construction can be allowed to fail under the Orthodox Canonical Form.
• When is it OK to use the auto-generated copy constructor and copy assignment operator?

Recap
This month, yet more news has been introduced to you.
• You have seen how C++ references work.
• You have learned about "const", and how it works for objects.
• You have seen how you can make member functions callable for "const" objects by declaring them as "const", and seen that member functions declared "const" are callable for non-const objects as well.
• You have found out how you can overload member functions on "constness" to get different behaviour for const objects and non-const objects.
• You have learned about the "Orthodox Canonical Form", which always gives you construction from nothing, construction by copying, assignment and destruction.
• You have learned that your objects should always be in a destructible and copyable state, no matter what happens.
• You have seen how you can implement common behaviour in private member functions. These member functions are then only callable from within member functions of that class.

Coming up
Next month I hope to introduce you to components of the C++ standard library. Most compilers available today do not have this library, but fortunately it is available and downloadable for free from a number of sources. Knowing this library, and how to use it, will be very beneficial for you, partly because it is standard, and partly because it's remarkably powerful.

As usual, you are the ones who can make this course the ultimate C++ course for you, as coming C++ programmers. Send me e-mail at once, stating your opinions, desires, questions and (of course) answers to this month's exercises!

Chapter 4

[NOTE: Here is a link to a zip of the source code for this article. Ed.]

So I've done it again, promised something I cannot keep. I wanted to introduce you to the C++ standard library, but there are problems with the available implementations and Watcom compilers, and I don't own a Watcom compiler to work around it with. Sigh. This month, however, we'll look at another very important aspect of C++, that of templates. Templates are the foundation of the standard C++ library too, so getting to know them beforehand might not be so bad anyway.

Why templates
The last two articles made some effort in perfecting a stack of integers. Today, I want a stack of doubles. What do I do? Rewrite it all and call it doublestack? It's one alternative. Then I want a stack of char* (another rewrite) and a stack of bicycles (yet a rewrite, and a bizarre view). And then, of course, we end up with 4 versions of stack, all with identical code, just one data member in the internal struct with different type for all of them. Yuck. There's of course the C kind of solution, always make it a stack of void*, and cast to whatever you want (and just hope you won't cast to the wrong type).

No, the latter alternative isn't an alternative in my book. Type safety is essential for writing solid programs (Smalltalk programmers disagree). The former alternative isn't an alternative either. Just think of this little nightmare. 4, more or less identical pieces of code. When you find a bug (when, not if), you have to correct it in as many places.

A template is a way of loosening your dependency on a type, without sacrificing type safety (this will become clear later). OK, so I guess I've explained what templates are for. They're the solution to the above problem. But how?

Function templates
In the first C++ article, I wrote a set of overloaded functions called "print", which printed parameters of different types. The code in each of those functions was exactly the same (exactly the kind of redundancy that's always to be avoided). This is an ideal place for a template. Here's what a template function for printing something can look like:

template <class T>
void print(const T& t)
{
  cout << "t=" << t << endl;
}

Weird? OK, time for some demystifying. The keyword "template" says we're dealing with a template. When declaring/defining a template, there's always a template parameter list, enclosed in a '<', '>' pair. The template parameter for this template is "class T". This means that the template deals with a type, some type, called T. Despite the keyword "class", T does not have to be a class, it can be any of the built in types, enumerations, structs, and so on (if you have a modern compiler, it will accept the keyword "typename" instead of "class", although "class" will still work). The name "T" is of course arbitrarily chosen. It could be any name. After this comes the function definition, where T is used just as if it was a legal type. For writing a template function, that's really all there is.

The code for the template, however, is not a function. It's a function template, something which the compiler uses to create functions. This is very much like a cookie cutter. Once you have a cookie cutter, you can make cookies with the shape of the cutter. More or less any kind of cookie can be made with that cutter. Here are some examples using it:

int main(void)
{
  print(5);        // print<int>
  print(3.141592); // print<double>
  print("cool");   // print<const char*>
  print(2);        // print<int> again.
  return 0;
}

When the compiler reads the function template, it does pretty much nothing at all, other than to remember that there's a function template with one template parameter and the name "print". When the compiler reaches "main", and sees the call to "print(5)", it looks for a function "print" taking an "int" as a parameter. There is none, so it expands the template function "print" with the type "int", and actually makes a new function. Note that this is done by the compiler at compile time. The same happens for the other types. "print(2)" uses the same function as "print(5)" does, rather than creating yet another copy of it. The compiler always first checks if there is a function available, and if there isn't, it tries to create it by expanding the function template. This order of things is necessary to avoid unnecessary duplication. Let's compile and run:

[c:\desktop]icc /Q temp_test.cpp
[c:\desktop]temp_test.exe
t=5
t=3.14159
t=cool
t=2
[c:\desktop]

Please note the different meanings of the terms "function template", and "template function". The "function template" is what you write. It's a template from which the compiler generates functions. The compiler generated functions are the template functions. The "function template" is the cookie cutter, while the "template function" is the cookie. One example of a template function above, is "print<int>()" (i.e. the "int" version of print).

Although it does not seem like it, type safety is by no means compromised here. It's just seen in a somewhat different way. For the function template "print", there's only one requirement on the type T, it must be possible to print it with the "<<" operator to "cout". If the type cannot be printed, a compilation error occurs. To test it, here's what GCC says when trying to print the "intstack" from last month:

[c:\desktop]gcc temp_test2.cpp -fhandle-exceptions -lstdcpp
temp_test2.cpp: In function `void print(const class intstack &)':
temp_test2.cpp:285: no match for `operator <<(class ostream, class intstack)'

GCC delivered a compilation error, since the type "intstack" cannot be printed with "<<" on "cout" (the error message says "ostream", which is correct. We'll deal with that later this fall/early winter, in one or a few articles on C++ I/O). Here the compiler generated a new function, called a template function, where every occurrence of "T" (only one, in the function parameter list) is replaced with "intstack". After having generated the function, it compiled it, and noticed the error. Note that a function is not generated from a function template until a call is seen (the compiler cannot know what types to generate the function for before that).

Templates and exceptions
As you may have noticed, I didn't write an exception specifier for the "print" function template. This was not a mistake, nor was it sloppiness. The drawback with templates is that they make writing exception specifiers a bit difficult. The problem is that I cannot know what kind of type T will be, and I cannot know if operator<< on that type can throw an exception or not. I could try to make the promise that the function "print" does not throw exceptions, by adding the exception specifier "throw ()", but that would not be wise. What if it does? If so, and the function template had an empty exception specifier list, "unexpected" would be called, and the program terminate. Not nice. Note that not writing an exception specifier means that any exception may be thrown. I wish there was a way to say "The exceptions that might be thrown are the ones from operator<< on T" but there is no way to say that, other than as a comment. This problem is something I strongly dislike about C++, but this is how it works, and there's not much to do about it.

Class templates
Just as you can write functions that are independent of type (and yet type safe!) we can write classes that are independent of type. In a sense, class templates exist as builtins in C and C++. You have arrays and pointers (and references) that all act on a type, some type. The type they act on does not change their behaviour, they're still arrays, pointers and references, but of different types. Let's explore writing a simple class template, by improving the old "Range" class from lesson 2. In case you don't remember, the original "Range" looks like this:

struct BoundsError {};
class Range
{
public:
  Range(int upper_bound = 0, int lower_bound = 0) throw (BoundsError);
  // Precondition: upper_bound >= lower_bound
  // Postconditions:
  // lower == lower_bound
  // upper == upper_bound
  int lowerBound() throw ();
  int upperBound() throw ();
  int includes(int aValue) throw ();
private:
  int lower;
  int upper;
};

This class is a range of int. There's no reason, however, why it shouldn't be a range of any type. Writing a class template is in many ways similar to writing a function template:

struct BoundsError {};
template <class T>
class Range
{
public:
  Range(const T& upper_bound = 0, const T& lower_bound = 0);
  // Precondition: upper_bound >= lower_bound
  // Postconditions:
  // lower == lower_bound
  // upper == upper_bound
  // Throws:
  // Bounds error on precondition violation
  // Whatever T's copy constructor throws.
  // Whatever operator < on T throws.
  const T& lowerBound() throw ();
  const T& upperBound() throw ();
  int includes(const T& aValue);
  // Throws: Whatever operator>= and operator <= on T
  // throws.
private:
  T lower;
  T upper;
};

As can be seen, on line 3, after "template <class T>", T is used just as if it was a type existing in the language.

I've changed the constructor so that it accepts the parameters as const reference instead of by value. The reason is performance if T is a large type (if passed by value, the parameters must be copied and the copying may be an expensive operation). I've also removed the exception specifier, and instead used a comment, since after all, there is no way to know if T throws anything. "lowerBound", "upperBound" and "includes" uses const T& instead of value, for the same reason as the constructor does. "lowerBound" and "upperBound" can safely have empty exception specifiers, since those member functions do not do anything with the T's. They just return a reference to one of them. "includes" on the other hand, does need the unfortunate comment.

Time for the implementation, which will include some news:

template <class T>
Range<T>::Range(const T& upper_bound, const T& lower_bound)
  : lower(lower_bound), // copy constructor
    upper(upper_bound)  // copy constructor
{
  if (upper < lower) throw BoundsError();
}

template <class T>
const T& Range<T>::lowerBound() throw ()
{
  return lower;
}

template <class T>
const T& Range<T>::upperBound() throw ()
{
  return upper;
}

template <class T>
int Range<T>::includes(const T& aValue)
{
  return aValue >= lower && aValue <= upper;
}

Take a careful look at the syntax here. The syntax for member functions is very much the same as that for function templates. The only difference is that we must refer to the class (of course), and we must specify that it's the template version of the class, by adding "<T>" after the class name. We must also precede every member function with "template <class T>". The reason is that we're not dealing with a complete class, but with a class template, and we must be explicit about that.

To use a class template, you must explicitly state what type it is for. There is unfortunately no way to say "Range(5,10)" and have the compiler automatically understand that you mean "Range<int>(5,10)". Let's have a look at how it's used:

#include <iostream.h>
int main(void)
{
  Range<int> ri(100,10);
  Range<double> rd(3.141592,-3.141592);
  if (ri.includes(55))
  {
    cout << "[" << ri.lowerBound() << ", " << ri.upperBound() << "] includes 55" << endl;
  }
  if (!rd.includes(62))
  {
    cout << "[" << rd.lowerBound() << ", " << rd.upperBound() << "] does not include 62" << endl;
  }
  return 0;
}

As with function templates, a class template is expanded when it's referred to, so when the compiler first sees "Range<int>", it creates the class, by expanding whatever is needed. The compiler will also treat every member function just as any template function, i.e. the code will not be expanded until it is called from somewhere. The above code calls all members of "Range", but had "includes" not been called, it would not have been expanded. One unfortunate side of this is that "includes" could actually contain errors, and this would be unnoticed by the compiler, until "includes" was called. There isn't much more to say about this.

Advanced Templates
Now that the basics are covered, we should have a look at some power usage (this section is abstract, so it may require a number of readings). One unusually clever place for templates, is as something called "traits classes". A traits class is never instantiated, and doesn't contain any data. It just tells things about other classes, that is its sole purpose. The name "traits class" is odd. Originally they were called "baggage classes", since they're useless on their own, and belong to something else, but for some reason some people didn't like the name, so it was changed.

My intention is to write a traits class, which tells the name of the type it is specialized for (explanation of that comes below), and to write a special template print, which prints ranges looking like the constructor call for the range, and finally, a function template, which is used to create ranges, without needing to specify the type. When done, I will be able to write:

print(make_range(10,4));

and when executed, see:

Range<int>(10,4)

Magic? No, just templates! Here we go...

A traits class, is a simple class template. The traits class needed here, is one that tells the name of a type. The class template just looks like this:

template <class T>
class type_name
{
public:
  static const char* as_string();
};

That is, it holds no data, and the member function is declared as "static". This is the way traits classes usually look. No data, and only static member functions. A member function declared static, is different from normal member functions, in that it does not belong to an instance, but belongs to the class itself. Here's an example:

class A
{
private:
  int data;
public:
  void f(void);
  static void g(void);
  static void h(void);
};

void A::f(void)
{
  cout << data << endl; // prints something.
}

void A::g(void)
{
  cout << data << endl; // error! Cannot access data.
}

void A::h(void)
{
  cout << "A::h" << endl;
}

int main(void)
{
  A a;
  a.f();
  a.h();  // prints "A::h"
  A::h(); // also prints "A::h"
  A::f(); // Error, f is bound to an object, and must be
          // called on an object.
  return 0;
}

"A::g()" is in error, because it's declared static, and thus not bound to any object, and as such cannot access any member data, since member data belongs to objects. Calling "A::f()" is an error, since it is not static. This means it belongs to an object, and must be called on an object (through the "." operator). The calls "a.h()" and "A::h()" are synonymous. Since "h" is not tied to an object, it can be called through the class scope operator "A::", which means it's the "h" belonging to the class named "A".

Now back to traits classes. The whole idea for traits classes is one of "specialization". The class template is the general way of doing things, but if you want the class to take some special care for a certain type, you can do what's called a specialization. A member function specialization is usually not declared, just defined, like this:

const char* type_name<char>::as_string()
{
  return "char";
}

Now we can use the "type_name" traits class for "char" as follows:

cout << type_name<char>::as_string() << endl;

If we try for a type we haven't specialized, such as "double", we'll get an error when compiling. You can of course make any specializations you like. Please add all the fundamental types.

Of course, if you have a top modern compiler, the syntax has changed, so compilers very much up to date with the standardization requires you to write like this:

template <>
const char* type_name<char>::as_string()
{
  return "char";
}

A minor, but in a sense understandable difference. The "template <>" part clarifies that it's a template we're dealing with, but the template parameter list is empty, since we're specializing for known types.

This is how traits classes usually look. They have a template interface, the class, which declares a number of static member functions. Those member functions are intended to tell something about some other class. Normally, the template member functions are not defined, instead specializations are.

Now over to the print template. It's supposed to accept an instance of a "Range", and print it, just as the constructor call for the "Range" was done. Piece of cake:

template <class T>
void print(const Range<T>& r)
{
  cout << "Range<" << type_name<T>::as_string() << ">(" << r.upperBound() << ", " << r.lowerBound() << ")" << endl;
}

Here we see two new things. The parameter for the function template need not be the template parameter itself. It needs to be something that makes use of the template parameter, though (for all except the absolutely newest compilers, all template parameters must be used in the parameter list for the function). The other new thing is how elegantly the "type_name" traits class blends with the function template. Note also that this means we cannot print ranges of types for which the "type_name" traits class is not specialized. If we try, you'll get a compilation error. We can now write:

print(Range<int>(10,5));

And it will work (if we specialize "type_name<int>::as_string()", that is).

Now we're almost there. Now for the last detail, the function template that creates "Range" instances, which with the above seems fairly simple.

template <class T>
Range<T> make_range(const T& t1, const T& t2)
{
  return Range<T>(t1,t2);
}

Doesn't seem too tricky, now does it? There actually is no catch in this. The function template is by the compiler translated to a template function, using the types of the parameters. When the type is known, it will know what kind of "Range" to create and return. If the types differ in a call, the compiler will give you an error message. We can now write:

print(make_range(10,5));

just as we planned to. Neat, eh?

For being such an incredibly simple construct, the traits classes are unbelievably useful. Their purpose is only to tell something about other classes, nothing else. If you want to learn more about traits classes, have a look at Nathan Myers' traits article from the June '95 issue of C++ Report.
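The pieces of this section can be put together into one file for a current compiler. This is a sketch under a few assumptions that are not in the article: the standard "<iostream>" header replaces "<iostream.h>", "print" takes the stream as a hypothetical extra first parameter (so the result can be inspected), the accessors are declared "const" so they can be called through the const reference in "print", the old exception specifiers are dropped, and "inline" is added to the specializations so the code could live in a header.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

struct BoundsError {};

// Cut-down version of the Range class template from the text.
template <class T>
class Range {
public:
    Range(const T& upper_bound, const T& lower_bound)
        : lower(lower_bound), upper(upper_bound) {
        if (upper < lower) throw BoundsError();
    }
    const T& lowerBound() const { return lower; }
    const T& upperBound() const { return upper; }
private:
    T lower;
    T upper;
};

// Traits class: no data, one static member function that is declared
// here but only defined through specializations.
template <class T>
class type_name {
public:
    static const char* as_string();
};

template <> inline const char* type_name<int>::as_string() { return "int"; }
template <> inline const char* type_name<double>::as_string() { return "double"; }

// Prints a range looking like the constructor call that created it.
// The std::ostream& parameter is an addition for this sketch.
template <class T>
void print(std::ostream& os, const Range<T>& r) {
    os << "Range<" << type_name<T>::as_string() << ">("
       << r.upperBound() << ", " << r.lowerBound() << ")\n";
}

// T is deduced from the arguments, so the caller never spells it out.
template <class T>
Range<T> make_range(const T& t1, const T& t2) {
    return Range<T>(t1, t2);
}
```

With this in place, "print(std::cout, make_range(10, 4));" writes "Range<int>(10, 4)". Calling "print" for a range of a type without a "type_name" specialization fails when building the program, which is the behaviour described above.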

Exercises
• Biggie: Rewrite last months "intstack" as a class template, "stack<T>", which can contain data of a type not known at the time of writing. What happens if the copy constructor, operator== or destructor of T throws exceptions?
• When can you, and when can you not use exception specifiers for templates?
• What are the requirements on the type parameter of the templatized "Range"? Can you use a range of "intstack"?
• What are the requirements on the type parameter of the templatized "stack<T>"?

Recap
Quite a lot of news this month. You've learned:
• how to write type independent functions with templates, without sacrificing type safety.
• how the compiler generates the template functions from your function template.
• about template classes.
• that templates restricts the usefulness of exception specifiers.
• how to specialize class templates for known types.
• how to write and use traits classes.

Coming up
If the standard library's up and running on Watcom, that'll be next month's topic, otherwise it'll be C++ I/O streams. As always, send me e-mail at once, stating your opinions, desires, questions and (of course) answers to this month's exercises!

Introduction
We've seen how the fundamental types of C++ can be written to the screen with "cout << value" and read from standard input with "cin >> variable". This month, you will learn how you can do the same for your own classes and structs. It's surprisingly easy to do.

Exploring I/O of fundamental types
Formatted I/O, is not part of the language proper in C++ (or in C for that matter.) It's handled by an I/O library, that's implemented in the language. (If you're familiar with Pascal, try to implement something like Write and WriteLn in Pascal. You can't, the language doesn't allow it, that's why it's built into the language itself.)

We've seen a number of times how we can print something with "cout << value". How can this be expressed in the language? To begin with, the syntax is legal only because you can overload operators in C++. You've already seen that with operator=. Let's see what actually happens when we use operator=.

class X
{
public:
  X& operator=(int i);
  ...
};

X x;

x=5; //**

At the last line of the example, what actually happens is that operator= is called for the object named "x". Another way of expressing this is:

x.operator=(5);

In fact, this syntax is legal, and it generates identical code. This is important, because this is how the compiler will treat the more human-readable form "x=5". As we can see then, an operator overloaded in a class, is just like any other member function, it's just called in a peculiar form.

Let's go back to printing again. "cout" is an object of some class, which has operator<<(T) overloaded, where T is any of the fundamental types of C++. The relevant section of the class definition looks as follows:

class ostream
{
  ...
public:
  ...
  ostream& operator<<(char);
  ostream& operator<<(signed char);
  ostream& operator<<(unsigned char);
  ostream& operator<<(short);
  ostream& operator<<(unsigned short);
  ostream& operator<<(int);
  ostream& operator<<(unsigned int);
  ostream& operator<<(long);
  ostream& operator<<(unsigned long);
  ostream& operator<<(float);
  ostream& operator<<(double);
  ostream& operator<<(long double);
  ostream& operator<<(const char*);
  ...
};

The only difference for reading is that the class is called "istream" instead, and that the operator used is operator>>(). The value returned by each of these is the stream object itself (i.e. if you call "operator<<(char)" on "cout", the return value will be a reference to "cout" itself.) With the above in mind, we can see that writing

int i;
double d;
cout << i << d;

is synonymous with

int i;
double d;
(cout.operator<<(i)).operator<<(d);

I/O with our own types
The most important thing to recognise is that our own types (classes and structs) always consists of fundamental types. The C++ I/O library only supports I/O of the fundamental types, so if our own data types consisted of something completely different, I/O would be very difficult indeed.

So, how do we make sure we can do I/O on ranges and stacks (from the earlier lessons?) What about extending our own class with the members operator<< and operator>>? This would, sort of, work, but the syntax would change, and that's not what we want. As I wrote above, "a << b" is identical with "a.operator<<(b)", and if we add operators << and >> to our class, we'll require our object on the left hand side, and the stream to print on/read from, on the right hand side. Another possible way of doing this is to edit the ostream and istream class to contain operator<< and operator>> for our own classes. Does that seem like a good idea to you? It doesn't to me.

The solution does yet again lie in operator overloading, but this time in a somewhat different way. We just saw how we can overload an operator for a class, such that the operator becomes a member function for that class (only, in its

use, the syntax differs.) It's also possible to overload operators, such that the operator becomes a function, provided that at least one of the parameters to the operator is not a built-in type. Most operators that can be defined like a nonmember function, accept two parameters. Such is the case for our new friends operator<< and operator>>.

Let's revisit our old friend, the class "Range." This is the definition of "Range", for those who do not have old issues handy (I've added "const" on the member functions, now that you know what it's for. See part 3 for details if you've forgotten):

struct BoundsError {};
class Range
{
public:
  Range(int upper_bound = 0, int lower_bound = 0) throw (BoundsError);
  // Precondition: upper_bound >= lower_bound
  // Postconditions:
  // lower == lower_bound
  // upper == upper_bound
  int lowerBound() const throw ();
  int upperBound() const throw ();
  int includes(int aValue) const throw ();
private:
  int lower;
  int upper;
};

How should this thing be printed and read? Here's a wishlist. We'll reduce it a little bit, to be more realistic later.
1. The syntax and semantics for printing must be the same as for the fundamental types of C++.
2. Full commit or roll back, that is, either we print all there is to be printed, or we print nothing at all.
3. The print must be in a form distinguishable from, say, two integers separated by a comma.
4. Full type safety
5. Encapsulation not violated.
6. No unnecessary computations.
7. We want printing and reading synchronized (i.e. if we read something, then print a range, then reads something, we want the reading to complete before printing, and we want it all printed before reading again.) Since both reading and writing is normally buffered, it is not at all obvious that this will occur.

All of these are possible, but #2, #6 and #7 are usually skipped. I'll skip #2 for now.

What's the appearance we want of a range when printed, and what format should we accept when reading? A golden rule in I/O (and not just in C++) is to be very strict in your output, but very liberal in what you accept as input. The format I chose is "[upper,lower]", that is, no spaces anywhere. On input however, white space is allowed before the first bracket, and between any of the tokens (the tokens here are '[', number, ',' and ']'). Normally, the C++ I/O library handles just exactly this for you.

OK, so now we have a pretty good picture on what to do. So, now, how? Overloading operator<< as a global function. The signature becomes:

ostream& operator<<(ostream&, const Range&);

This declares a function, which has the syntax of a left shift operator. If we have code like:

Range r;
int i;
cout << r << i;

The compiler will treat it as

operator<<(cout, r).operator<<(i);

This even works for more complex expressions, like:

Range r;
int i;
int j;
cout << i << r << j;

Which the compiler interprets as:

operator<<(cout.operator<<(i),r).operator<<(j);

Study these examples carefully, to make sure you understand what's going on.

Now, given the facts known this far (more is needed, as you will see further down, after these examples,) it's fairly easy to get down to work with implementing the operator<< function.

ostream& operator<<(ostream& os, const Range& r)
{
  os << '[' << r.upperBound() << ',' << r.lowerBound() << ']';
  return os;
}

Here "r" is passed as const reference, since the function does not alter "r" in any way (and promises it won't.) The stream, "os", however, is passed by non-const reference. Printing does alter a stream. It will not be the same after printing as it was before printing. It is not possible to pass it by value, since passing by value means copying, and copying a stream doesn't make much sense (think about it.) Inside the function, we're printing known types, char and int, so the operator<< provided by the I/O class library suits just fine.

How well does this suit the 7 points above? The syntax is correct, and the semantics are too. The format is distinct enough, we have type safety and encapsulation is not violated. We do not have full commit or rollback, but I mentioned already in the beginning that we'll skip that for now. However, we do make some unnecessary computations if the stream is bad in one way or the other. Say, for example, we have a detached process. Detached processes do not have standard output and standard input (unless redirected) and as such printing will always fail. Why then, even try? We also do not synchronize our output with input.

This is as far as most books on C++ cover when it comes to printing your own types. I don't know why it's just about always skipped, since it isn't more difficult than this to avoid unnecessary computations and synchronize input with output. The check and synchronization is simple to make.

ostream& operator<<(ostream& os, const Range& r)
{
  if (!os.opfx())
    return os;
  os << '[' << r.upperBound() << ',' << r.lowerBound() << ']';
  os.osfx();
  return os;
}

The "prefix" ("opfx" means "output prefix") function checks for a valid stream, and also synchronizes output with input. The "suffix" ("osfx" means "output suffix") signals end of output, so that synchronized input streams can begin accepting input again. This is essential, but oddly enough not mentioned in most C++ books. I dare you to find this in a C++ book (I know of one book.)

That was printing. Now, how about reading? The signature and general appearance of the function is pretty much clear from the above discussions. Let's make a try:

istream& operator>>(istream& is, Range& r)
{
  if (!is.ipfx())
    return is;
  char c;
  is >> c;
  if (c != '[')
    // signal error somehow and roll back stream.
  int upper;
  is >> upper;
  is >> c;
  if (c != ',')
    // signal error somehow and roll back stream.
  int lower;
  is >> lower;
  is >> c;
  if (c != ']')
    // signal error and roll back stream.

    r=Range(upper,lower);
    is.isfx(); // ERROR! Does not exist!
    return is;
  }

Hmm. OK, so reading wasn't as easy. There are three issues above that need to be resolved: how to signal error, how to roll back the stream, and how to deal with the suffix function.

Let's begin from the easy end, the suffix function. The solution is that there isn't one, since the guess "is.isfx()" was wrong. The problem is fixed by removing the faulty line (don't you just love bugs that you fix solely by removing code!)

Rolling back the stream is interesting indeed, since it's very difficult to do. No, in fact, it's almost impossible. We can put back a character. One character, that is all that is guaranteed to work. Putting back a character is done with "istream::putback(char)". It's also absolutely necessary that the character put back is the same as the last one read, otherwise the behaviour is undefined (which literally means all bets are off, the program may do *anything*, but in practice it means you cannot know if it just backs a position, or actually changes the character.) So, our only chance is if the first character read is not right; beyond that, we needn't even try.

The obvious solution to signalling an error, to throw an exception, is wrong. The reason is conceptual. Use exceptions to signal exceptional situations, and other means to handle the expected. Remember you're dealing with input generated by human beings here. Sure you can, in theory, demand that the users of your program enter the exact right data in the exact correct format every time, but you won't be very popular among them, and soon will have none. The wrong input *is* expected. In other words, erroneous user input is expected, and thus not exceptional, and thus not to be handled with exceptions.

How then? A stream object has an error state consisting of three orthogonal failure flags, "bad", "eof" and "fail". "bad" is something we hope to never see, since it means the stream is really out of touch with reality and we cannot trust anything from it (I've only seen this one once, and it was due to a bug in a library!) I guess we can expect "bad" if reading from a file, and hit a bad sector. "eof" is used to signal end of file, a not too unusual situation (as a matter of fact, a situation most programs rely on,) but if it occurs in the middle of reading something, it's usually a failure. "fail" is the one we're interested in here, it's used to signal that we received something that was not what we expected, but the stream itself is OK.

So, what we should do if we read something unexpected, is to set the stream state to "fail." This is done with the odd named member function "clear(int)". "clear" sets the status bits of the stream to the pattern of the integer parameter (which defaults to 0, so if nothing is passed, the name makes sense.) The bits we can set are "ios::badbit", "ios::failbit" and "ios::eofbit". We can get the current status bits by calling "is.rdstate()", and usually we want to do that when setting or resetting a status bit, since we want to affect only that bit, and leave the other bits as they were before the call. The status bits can also be checked with the calls "is.bad()", "is.fail()" and "is.eof()" (which return 0 if the bit they represent is not set, and non-zero otherwise.) A fourth call "is.good()" returns non-zero if no error state bits are set, and 0 otherwise.

Now with the above in mind, let's make another try:

  istream& operator>>(istream& is, Range& r)
  {
    if (!is.ipfx())
      return is;
    char c;
    is >> c;
    if (c != '[') {
      is.putback(c);
      is.clear(ios::failbit|is.rdstate());
      return is;
    }
    int upper;
    is >> upper >> c;
    if (c != ',') {
      is.clear(ios::failbit|is.rdstate());
      return is;
    }
    int lower;
    is >> lower >> c;
    if (c != ']') {
      is.clear(ios::failbit|is.rdstate());
      return is;

    }
    if (is.good()) {
      if (upper >= lower)
        r=Range(upper,lower);
      else
        is.clear(ios::failbit|is.rdstate());
    }
    return is;
  }

This actually solves the problem as far as is possible, and in fact better than what can be found in most books on the subject. The call to "is.ipfx()" not only synchronizes the input stream with output streams, but also checks for error conditions and reads past leading white space. If the first character read is not a '[', we put the character back and set the fail bit (the order is important, "putback" is not guaranteed to work if the stream is in error,) and return. After this we read the upper limit of the range, and the separator. If the separator is not ",", we set the stream state to failed and return. Then read the lower limit and the terminator. If the terminator is not ']', mark the stream as failed, and return. Note that operator>> for built in types skips leading whitespace, so we needn't work on that at all.

If reading of either upper limit or lower limit failed, the stream is set to fail state, and other reads will not do anything at all (not even alter the stream error state,) thus the check near the end for "is.good()" is enough to know if all parts were read as we expected. If they were, all we need to do is to check that the upper limit indeed is at or above the lower limit (precondition for the range) and if so set "r" (since we haven't declared an assignment operator, the compiler did it for us, so the call is valid,) otherwise set the fail error state.

How well do we match the 7 item wish list? You check and judge. I think we're doing fine.

Formatting

There are a number of ways in which the output format of the fundamental types of C++ can be altered, and a few ways in which the requirements on the input format can be altered. For example, a field width can be set, and alignment within that field. For integral types, the base can be set (decimal, octal, hexadecimal). For floating point types the format can be fixed point or scientific, and yet some. All of these, and a little data, are controlled with a few formatting flags. All flags are set or cleared with the member functions "os.setf()" and "os.unsetf()". I think they're difficult to use, but fortunately there are easier ways of achieving the same effect, and we'll visit those later.

The base for integral output is altered with a call to "os.setf(v, ios::basefield)", where v is one of "ios::hex", "ios::dec" or "ios::oct". As a small example, consider:

  #include <iostream.h>

  int main(void)
  {
    int i=19;
    cout << i << endl;
    cout.setf(ios::hex, ios::basefield);
    cout << i << endl;
    cout.setf(ios::oct, ios::basefield);
    cout << i << endl;
    cout.setf(ios::dec, ios::basefield);
    cout << i << endl;
    return 0;
  }

The result of running this program is:

  19
  13
  23
  19

The base is converted as expected, but there is no way to see what base it is. This can be improved with the formatting flag ios::showbase, so let's set that one too.

  #include <iostream.h>

  int main(void)
  {
    int i=19;
    cout.setf(ios::showbase);
    cout << i << endl;

    cout.setf(ios::hex, ios::basefield);
    cout << i << endl;
    cout.setf(ios::oct, ios::basefield);
    cout << i << endl;
    cout.setf(ios::dec, ios::basefield);
    cout << i << endl;
    return 0;
  }

The output of this program is

  19
  0x13
  023
  19

That's more like it. The call to "setf()" for setting the "ios::showbase" flag is different, though. "setf()" is overloaded in two forms. One accepts a set of flags and a mask, the other one a full set of flags only. All the formatting flags of the iostreams are represented as bits in an integer, and the version with the mask clears the bits represented by the mask, except those explicitly set by the first parameter. If the masked version is called, and the mask is "ios::basefield", the only formatting flags of the stream that will be affected are "ios::hex" or "ios::dec" or "ios::oct". Formatting bits not represented by the mask will remain unchanged. The second form, the one accepting only one parameter, sets the flags sent as parameter, and leaves the others unchanged (in other words, it bitwise "or"es the current bit-pattern with the one provided as the parameter,) so a call to "os.setf(ios::hex)" is potentially dangerous (what if "ios::oct" was already set? Then you'd end up with both being set.) The second parameter "ios::basefield" guarantees that if you set "ios::hex", then "ios::oct" and "ios::dec" will be cleared. The three of these are mutually exclusive, so don't set two of them at the same time. While it's possible to set two, or all three of these flags at the same time, it's not a very good idea (yields undefined behaviour.) Now you begin to see why this is messy, right?

That was setting the base for integral types, now for something that's common to all types, field width and alignment. The field width is set with "os.width(int)", and the curious can get the current field width by calling "os.width(void)". Let's try it out:

  #include <iostream.h>

  int main()
  {
    cout << '[' << -55 << ']' << endl;
    cout.width(10);
    cout << '[' << -55 << ']' << endl;
    cout << '[';
    cout.width(10);
    cout << -55 << ']' << endl;
    cout << '[' << -55 << ']' << endl;
    return 0;
  }

Executing this program shows something interesting. The result of running the program is shown below:

  [-55]
  [       -55]
  [       -55]
  [-55]

Had you expected this? I didn't. The width set does not affect the printing of separate characters, for sure, and the width is reset after printing the first thing that uses it. This is not very intuitive I think.

Now, let's play with alignment within a field. Alignment is set with the two parameter version of "os.setf()", where the first parameter is one of "ios::left", "ios::right", or "ios::internal", and the second parameter is "ios::adjustfield". As with the base for integral types, the three alignment forms are mutually exclusive. Simple enough. If the field width is not set, or the field width set is smaller than that necessary to represent the value to be printed, alignment doesn't matter, but if there's extra room, alignment does make a difference. Let's alter the width setting program to show the behaviour.

  #include <iostream.h>

  int main()

  {
    cout.setf(ios::right, ios::adjustfield);
    cout << '[' << -55 << ']' << endl;
    cout.setf(ios::left, ios::adjustfield);
    cout << '[' << -55 << ']' << endl;
    cout.setf(ios::internal, ios::adjustfield);
    cout << '[' << -55 << ']' << endl;
    cout.setf(ios::right, ios::adjustfield);
    cout.width(10);
    cout << '[' << -55 << ']' << endl;
    cout.setf(ios::left, ios::adjustfield);
    cout.width(10);
    cout << '[' << -55 << ']' << endl;
    cout.setf(ios::internal, ios::adjustfield);
    cout.width(10);
    cout << '[' << -55 << ']' << endl;
    return 0;
  }

The result of running this is, after the above explanations, not very surprising:

  [-55]
  [-55]
  [-55]
  [       -55]
  [-55       ]
  [-       55]

Well, I found the formatting of "ios::internal" to be a bit odd, but it kind of makes sense.

If the field width is larger than that required for a value, the current alignment defines where in the field the value will be, and where in the field space will be. Space, however, is just the default. We can change the "padding character" by calling "os.fill(char)", and get the current value with a call to "os.fill(void)". OK, let's exercise that one too:

  #include <iostream.h>

  int main()
  {
    cout.width(10);
    cout.fill('.');
    cout << -5 << endl;
    cout.width(10);
    cout << -5 << endl;
    return 0;
  }

Running it yields the surprising result

  ........-5
  ........-5

Why was this surprising? Earlier we saw that the field width is "forgotten" once used. The pad character, however, remains the same until explicitly changed.

Now that you have the general idea, why not try the other formatting flags there are:

• ios::fixed and ios::scientific control the format of floating point numbers (the mask used is ios::floatfield.)
• ios::showpos controls whether a "+" should be prepended to positive numbers or not (just like a "-" is prepended to negative numbers.)
• ios::showpoint controls whether the decimals should be shown for floating point numbers if they are all zero.
• ios::uppercase controls whether hexadecimal digits should be displayed with upper case letters or lower case letters.

The only thing remaining for formatting is "os.precision", which comes in two flavours. One without parameters which reports the current precision, and one with an int parameter. The unpleasant thing about this parameter is that many compilers interpret it differently. Some think the precision is the number of digits after the decimal point, while most think it's the number of digits to display. The November 1997 draft C++ standards document (which, by

the way, most probably is the final C++ standards document,) says the number of digits after the decimal point is what's controlled, but I'm not sure if that's what the current standards document says. At any rate, inconsistencies aside, this is a mess, isn't it?

An easier way

The authors of the I/O package realized that this is a mess, so they defined something called "manipulators." You've already used one manipulator a lot, "endl." A manipulator may, or may not, print something on the stream, but it will alter the stream in some way. For example "endl" prints a new line character, and flushes the stream buffer.

There are two kinds of manipulators, those accepting a parameter, and those that do not. Let's first focus on those that don't, just like "endl." The ones available are: "dec", "hex", "oct", "endl", "ends", and "flush". Their use is simple:

  #include <iostream.h>

  int main(void)
  {
    cout << hex << 127 << " " << oct << 127 << " " << oct << 127 << endl;
    return 0;
  }

The advantage of this is both that the code becomes clearer, and that there's no way you can accidentally set illegal base flag combinations. "ends" is rarely used, it's there to print a terminating '\0' (the terminating '\0' of strings is never printed normally.) "flush" flushes the stream buffer (i.e. forces printing right away.)

How do these manipulators work? There's a rather odd looking operator<< for output streams. It looks like:

  ostream& operator<<(ostream& (*f)(ostream&))
  {
    return f(*this);
  }

Now, what on earth does this mean? It means that if you have a function accepting an ostream& parameter, and returning an ostream&, that function can be "printed," and if you do, the function will be called with the stream as its parameter. Let's exercise this by rolling our own "left" alignment manipulator:

  ostream& left(ostream& os)
  {
    os.setf(ios::left, ios::adjustfield);
    return os;
  }

This function matches the required signature, so if we "print" it with "cout << left", the above mentioned operator<< is called, and it in its turn calls the function for the stream, so "cout << left" actually ends up as "left(cout)". Cool, eh? Roll your own "right" and "internal" manipulators as a simple exercise (they're handy too.)

Then there are some manipulators accepting a parameter. To access them, you need to #include <iomanip.h>. The ones usually accessed from there are "setw" (for setting the field width,) "setprecision", and "setfill". Their use is fairly straightforward and doesn't require any example.

Every compiler I've seen provides its own mechanism for writing such manipulators, so doing it in a portable way is very difficult. Or actually, it isn't if you skip the mechanism offered by your compiler vendor and do the job yourself, because it really is simple. Let's write one that prints a defined number of spaces:

  class spaces
  {
  public:
    spaces(int s) : nr(s) {}
    ostream& printOn(ostream& os) const
    {
      for (int i=0; i < nr; ++i)
        os << ' ';
      return os;
    }
  private:
    int nr;
  };
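For readers on a standard-conforming compiler, the function-pointer mechanism behind parameterless manipulators can be sketched as below. This is only an illustration under a few assumptions: it uses the standard header <iostream> and std:: names rather than <iostream.h>, and the names "align_left" and "show_left_aligned" are made up for this sketch (the name "left" is avoided because the standard library already provides std::left).

```cpp
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical manipulator: any function taking and returning std::ostream&
// can be "printed", because the stream has an operator<< that accepts such a
// function pointer and calls it with the stream as its argument.
std::ostream& align_left(std::ostream& os)
{
  os.setf(std::ios::left, std::ios::adjustfield);
  return os;
}

// Hypothetical helper: formats a value left aligned in a field that is
// 10 characters wide, surrounded by brackets.
std::string show_left_aligned(int value)
{
  std::ostringstream os;
  // The field width applies to the next formatted item only, so std::setw
  // is placed directly before the value rather than before the '['.
  os << '[' << align_left << std::setw(10) << value << ']';
  return os.str();
}
```

The same "align_left" can of course be used directly on std::cout, as in "std::cout << align_left".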

  ostream& operator<<(ostream& os, const spaces& s)
  {
    return s.printOn(os);
  }

Can you see what happens if we call "cout << spaces(40)"? First the object of class "spaces" is created, with a parameter of 40. That parameter is in the constructor stored in the member variable "nr". Then the global operator<< for an ostream& and a const spaces& is called, and that function in its turn calls the printOn member function for the spaces object, which goes through the loop printing space characters.

I think writing manipulators requiring parameters this way is lots easier than trying to understand the non-portable way provided by your compiler vendor.

Now something for you to think about until next month, though: what about our I/O of our own classes with respect to the formatting state of the stream? How's the "Range" class printed if the field width and alignment is set to something? How should it be printed (hint, you probably want it printed differently from what will be the case if you don't take care of it.)

Recap

This month you've learned a number of things regarding the fundamentals of C++ I/O. For example

• How to make sure your own classes can be written and read.
• How to set, clear and recognise the error state of a stream.
• Why exceptions are not to be used when input is wrong.
• The very messy, and the somewhat less messy way of altering the formatting state of a stream.
• How to write your own stream manipulators.

Exercises

• Find out which formatting parameters "stick" (like the choice of padding character) and which ones are dropped immediately after first use (like the field width.)
• Experiment with the formatting flags on input, which have effect, and which don't? Of those that do have an effect, do they have the effect you expect?
• Write an input manipulator accepting a character, which when called compares it with a character read from the stream, and sets the ios::fail status bit if they differ.
• With the above in mind, and remembering that destructors can be put to good work, write a class which will accept an ostream as its constructor parameter, and which on destruction will restore the ostream's formatting state to what it was on construction.

Standards update

• The prefix and postfix functions are history. Instead you create an object of type istream::sentry or ostream::sentry, and check it, like this:

    istream& work(istream& is)
    {
      istream::sentry cerberos(is);
      if (cerberos) {
        ...
      }
      return is;
    }

  The destructor of the sentry object does the work corresponding to that of the postfix function.
• Any operation that sets an error status bit may throw an exception. Which error status bits cause exceptions to be thrown is controlled with an exception mask (a bit mask.) By default, no exceptions are thrown.
• "istream" and "ostream" are in fact not classes in the standard, but typedef's for class templates. The class templates are

    template <class charT, class traits> class basic_istream<charT, traits>;

  and

    template <class charT, class traits> class basic_ostream<charT, traits>;

  "istream" is typedefed as "basic_istream<char, char_traits<char> >", and "ostream" as "basic_ostream<char, char_traits<char> >". There's also the pair "wistream" and "wostream", that are streams of wide characters.
• The mechanism for writing manipulators is standardised (and heavily based on templates.) I still think it's easier to write a class the way I showed you.

• Formatting of numeric types (and time) is localised. By default most implementations will probably use the same formatting as they do today, but with the support for "imbuing" streams with other locales (formatting rules.)
• The header name is <iostream> (no .h) and the names actually std::istream and std::ostream (everything in the C++ standard library is named std::whatever, and every standard header is named without trailing .h)

Coming up

Next month we'll have a look at inheritance. Inheritance is what Object Orientation is all about. Inheritance is a way of expressing commonality. People who enjoy and understand the philosophy of Plato will feel at home.

As always, send me e-mail at once, stating your opinions, desires, questions and (of course) answers to this month's exercises!

Introduction

I admit it. I've had very little inspiration for writing this month. Maybe it's the winter darkness, or the to be switch of jobs. However, instead of leaving you in the dark for a month, here's some food for thought, even though it's short and not the planned details on inheritance.

Example: Study this function template:

  template <class IN, class OUT>
  OUT copy(IN begin, IN end, OUT dest)
  {
    while (begin != end) {
      *dest = *begin;
      ++begin;
      ++dest;
    }
    return dest;
  }

What does it mean? Let us first have a look at the requirements on the types for IN and OUT.

IN: Must be copy-able (since the parameters are passed by value, not by reference.) Must be comparable with operator !=. Operator ++ (prefix) must be allowed. Must have an operator*, whose return value must be something which can be assigned to the result of operator* for OUT.

OUT: Must be copy-able, both for the parameter and the return value. Operator++ (prefix) must be allowed. Must have operator*, whose return value must be assignable.

So, what does the above do? The name no doubt gives a hint, but let's analyze it. One match for "IN" and "OUT" is obvious: a pointer in an array. Let's say that the types are "T1*" and "T2*", then this is legal for all types "T1" that can be assigned to (or implicitly converted to) a type "T2". For example:

  int iarr[]={ 12,23,34,45};
  const size_t isize=sizeof(iarr)/sizeof(iarr[0]);
  double darr[isize];
  copy(iarr,iarr+isize,darr);

At the call-site, the function template is expanded to a template function, as follows:

  double* copy<int*,double*>(int* begin, int* end, double* dest)
  {
    while (begin != end) {
      *dest = *begin;
      ++begin;
      ++dest;
    }
    return dest;
  }

What this function does is to assign the value pointed to by "dest" the value of the dereferenced pointer "begin", and then increment "begin" and "dest," as long as "begin" does not equal "end". This puts a requirement on "begin" and "end": by using operator++ (prefix only) on "begin", it must be possible to reach a state where "begin" does not compare unequal to "end." For example, "begin" might be the beginning of an array, and "end" the end of one, like this:

                  +-----+-----+-----+-----+----+
  primes_lt_10 =  |  2  |  3  |  5  |  7  |XXXX|
                  +-----+-----+-----+-----+----+
                     ^                       ^
                     |                       |
                   begin                    end (points to non-existing element one past the last one.)

This means that to copy an entire array, "end" must point one past the last element of the array, like in the example above. Fortunate as it is, this is legal in C and C++. It's illegal to dereference the "one-past-the-end" pointer, but the value is legal, and decrementing it will make it point to the last element (as opposed to making it point n, where n>1, elements past the end, which yields "undefined behaviour", i.e. all bets are off.) Note, however, that when "begin" and "end" are equal, no copying is done, i.e. "copy(a,a,b)" does nothing at all (except return "b").

This is useful. Very useful, but I can assure you, the fun has only begun. Of course, by using the copy function template, we can now copy arrays (or parts of arrays) of any type to arrays (or parts thereof) of any type, if the source type can be implicitly converted to the destination type. The real joy begins when we realize we can write our own types that behave the same way.

Other uses

Let's assume we want to print the contents of an array. How can we do this? Of course we can, as usual, use a loop over all elements and print them, but we can use the copy function template to do it. How? Printing an array (actually, the values in an array) means copying the values from the array to the screen, through some conversion (the output formatting.) The problem thus becomes, how do we make a type that does the necessary conversion, prints on the screen, and conforms to the requirements stated for the template parameter "OUT?"

The secret seems to lie in the lines "*dest = *begin" and "++dest". As I see it, there are two alternatives. Either "*dest = *begin" prints the value of "*begin" on the screen, and "++dest" does nothing at all, or "*dest = *begin" makes our variable "dest" remember the value to print and on "++dest" it does the printing. Of these, I will do the former. You can do the latter as an exercise.

To show you how this can be done, it is now necessary to see how some more operators can be overloaded, operator* and operator++. operator* is very straightforward, while operator++ requires some thought since there are two operator++, one postfix and one prefix (i.e. there's a difference between "dest++" and "++dest.") Assuming a class C, operator ++ is (usually) modeled as follows (and always with these member function signatures:)

  class C
  {
  public:
    ...                   // misc
    C& operator++(void);  // prefix, i.e. ++c;
    C operator++(int);    // postfix, i.e. c++
    ...                   // other misc.
  };

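To make the difference between the two forms concrete, here is a small self-contained sketch. The class "Counter" and its member "get" are hypothetical names invented for this illustration; only the two operator++ signatures are taken from the text above.

```cpp
// Hypothetical example class, not part of the original article.
class Counter
{
public:
  explicit Counter(int start = 0) : value(start) {}

  Counter& operator++()     // prefix, i.e. ++c
  {
    ++value;
    return *this;           // yields the incremented object itself
  }

  Counter operator++(int)   // postfix, i.e. c++ (the int is unused)
  {
    Counter old_val(*this); // remember the old value
    ++*this;                // let the prefix version do the job
    return old_val;         // yields a copy holding the old value
  }

  int get() const { return value; }

private:
  int value;
};
```

With a Counter starting at 5, "++c" yields an object holding 6, while a following "c++" also yields 6 but leaves "c" holding 7.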
  C& C::operator++(void)
  {
    ...                  // whatever's needed to "increment" it.
    return *this;
  }

  C C::operator++(int)   // throw away int, it's not used, other than to
  {                      // distinguish between pre- and post-fix ++
    C old_val(*this);    // remember the old value
    operator++();        // let the prefix version do the job
    return old_val;      // and return the old value
  }

Needless to say, decrementing is analogous.

Let's make our simple "int_writer" class, which writes integers on standard output (i.e. normally the screen.)

  class int_writer
  {
  public:
    // trust the compiler to generate necessary constructors and destructor
    int_writer& operator++();
    int_writer& operator*();
    int_writer& operator=(int i); // does the real writing.
  };

Operator++ we implement to do nothing at all, since it's not used for anything, but operator* and operator= are interesting. If you look at a pointer to T, dereferencing it yields a T. In this case, however, I say that dereferencing an int_writer yields an int_writer. Weird? Well, yes. It's weird, but it makes perfect sense anyway. What do I want to use the result of operator* for? Only for assigning to, and I want the assignment to write something on standard output, right? If I make operator* return the very object for which operator* was called on, we can use operator= for that class to do the writing. If we made operator* return some other type, we need to create two types, one int_writer, and one class whose only job in life is to be assignable by int, and which very assignment actually means writing. Perhaps the latter is purer, but the former is so much less work. Here's what the implementation looks like:

  int_writer& int_writer::operator++()
  {
    return *this; // do nothing
  }

  int_writer& int_writer::operator*()
  {
    return *this; // do nothing
  }

  int_writer& int_writer::operator=(int i)
  {
    cout << i << endl;
    return *this;
  }

This means that if "dest" is of type int_writer, the line "*dest = *begin" can be expanded to:

  dest.operator*().operator=(*begin);

Since the return value of "dest.operator*()" is a reference to "dest" itself, the following "operator=(*begin)" means "dest.operator=(*begin)", and if the type of "*begin" can be implicitly converted to "int", we're writing something. Cool eh? Here's all it takes to write the contents of the prime number array:

  copy(iarr,iarr+isize,int_writer());

Of course, the name "int_writer" is a dead giveaway for a class template, isn't it? Why limit it to integers only?

  template <class T>
  class writer
  {
  public:

    // trust the compiler to generate necessary constructors
    // and destructor
    writer<T>& operator++();
    writer<T>& operator*();
    writer<T>& operator=(const T&); // does the real writing.
  };

  template <class T>
  writer<T>& writer<T>::operator++()
  {
    return *this;
  }

  template <class T>
  writer<T>& writer<T>::operator*()
  {
    return *this;
  }

  template <class T>
  writer<T>& writer<T>::operator=(const T& t)
  {
    cout << t << endl;
    return *this;
  }

I've changed the signature for "operator=" to accept a const reference instead of a value, since T might be a type for which copying is expensive. With this template, yet another requirement surfaced: T must be writable through operator<<. Of course, for writer<T>, that's no surprise, how else would you write it? The prime number copying now becomes:

  copy(iarr,iarr+isize,writer<int>());

As a last example, let's have a look at the source side, the types for "IN". Can we create a type matching the requirements for "IN", such that a copy would read values from standard input (normally the keyboard?) The requirements for "IN" are a little bit more complicated than those for "OUT." It must be not-equal comparable, and operator* must return a value. It must also be possible to reach one value from another, through operator++, such that operator!= yields true until the end is reached. This requires some thought, especially on the reachability issue. To make this example simple, I propose that we can create a "reader<T>" with a number, and the number is the amount of T's to read from standard input. For every operator++, a new T is read, and the number of reads remaining is decremented. It must also be possible to create an "end" reader<T>, and we can use the parameter-less constructor for that. Here's how reader<T> might look like:

  template <class T>
  class reader
  {
  public:
    reader(unsigned count=0);
    reader<T>& operator++();
    const T& operator*() const;
    int operator!=(const reader<T>& r) const;
  private:
    unsigned remaining;
    T t;
  };

  template <class T>
  reader<T>::reader(unsigned count)
    : remaining(count) // the number of remaining reads, 0 for
  {                    // the parameter-less constructor.
  }

  template <class T>
  reader<T>& reader<T>::operator++()

  {
    if (remaining > 0) { // read a new value only if
      cin >> t;          // there are values to read.
      --remaining;       // one read less remaining.
    }
    return *this;
  }

  template <class T>
  const T& reader<T>::operator*() const
  {
    return t; // return the last read value.
  }

  template <class T>
  int reader<T>::operator!=(const reader<T>& r) const
  {
    return r.remaining != 0 || remaining != 0;
  }

The last one's perhaps debatable. I've decided that operator != is really only useful for comparing with the end, i.e. it will only return false if both sides have reached the end (i.e. no remaining reads) state. However, this is mighty neat:

  const unsigned size=5;
  float array[size];

  // read 5 integers from standard input and
  // store in our float array.
  copy(reader<int>(size),reader<int>(),array);

  // print the read values as unsigned long's
  copy(array,array+size,writer<unsigned long>());

The function template "copy" will be useful for any combination of the above, as long as the types are convertible from *IN to *OUT.

Conclusion

What you've seen here is, most probably your first ever encounter with, generic programming. This is *VERY* useful. If you write an algorithm in terms of generic iterators, you can use it with any kind of data source/sink which has iterators that follow your convention. The template parameters "IN" and "OUT" (from "copy") are called "iterators," or "input iterators" and "output iterators" to be more specific. Anything that behaves like an input iterator can be used as one, and likewise for output iterators. We can write input iterators for data base access, enumerating files in a directory, the series of prime numbers or whatever you want to get values from. We can write output iterators to store values in a data base, to send audio data to our sound card, to enter values at the end of a linked list, and whatever you need.

To make matters even better, this is all part of the now final draft C++ standard. The draft contains a function template "copy", which behaves identically to what I used in this article, and iterators called "input_iterator" and "output_iterator" which behave very similarly to the "reader" and "writer" class templates. The standard documents 5 iterator categories: input iterator, output iterator, forward iterator (sort of the combination, allows both read and write,) bidirectional iterator (like forward, but allows moving backwards too,) and lastly random access iterators (iterators which can be incremented/decremented by more than one.) Pointers in arrays are typical random access iterators. If you write your iterators to comply with the requirements of one of these categories, your iterators can be used with any of the algorithms that require such iterators. Every algorithm you write can be used with any iterators of the type your algorithm requires. That is a major time/code/debug saver.

Short recap of inheritance

Inheritance can be used to make runtime decisions about things we know conceptually, but not in detail. The employee/engineer/manager inheritance tree was an example of that; by knowing about employees in general, we

The classic counter example is a vector drawing program. scaled. How do we force the descendants to override them? One way (a bad way) is to implement them in the base class in such a way that they bomb with an error message when called. though.'' There are 5 phases in which an error can be found. You know a number of things for shapes in general. As such the ``do nothing at all'' code belongs in ``Circle`` only. you can rotate them and translate them. Herein lies the problem. Pure virtual means that it must be overridden by descendants.can handle any kind of employee. A class which has only pure virtual member functions and no data is often called a pure abstract base class. managers. The problem is. A shape can be a square. including engineers. This makes sense. for example marketers. you can draw them on a canvas. Such a program usually holds a collection of shapes. a rectangle. It's only the concrete shapes that can be drawn. but there's a simple way of moving this particular discovery from runtime to compile time.'') • We can change the interface of ''Shape`` such that ``rotate'' is not a pure virtual. This in itself is not a problem. link and runtime. lines and so on. design. text. Pure virtual (abstract base classes) C++ offers a way of saying ``This member function must be overridden by all descendants. virtual void rotate(double angle) = 0. etc. In other words. or some times an interface class. and an empty implementation for ``Circle::rotate. a collection of grouped images. edit.'' Addressing pure virtuals I won't write a drawing program. The root of this lies in the illusion that doing nothing at all is the default behaviour. Mailing addresses have different formatting depending on sender and receiver country. shape. Here's how a pure abstract base class might be defined: class Shape { public: virtual void draw(Canvas&) = 0. project leaders. What would you do with a generic shape object? It's better to make it impossible to create one by mistake. 
virtual void translate(Coordinate c) = 0. The bad thing with that is that it violates a very simple rule-of-thumb. Instead I'll attack another often forgotten issue. virtual void scale(double) = 0. and code its implementation to do nothing. and ``Circle''). only objects of classes inheriting from it. how do you do any of these for a shape in general? How is a generic shape drawn or rotated? It's impossible. the best solution is with the original pure abstract ``Shape'' class. so how can we take care of that scenario? It's unnecessary to write code that does nothing. If you send something internationally . Abstract because you cannot instantiate objects of the class. while it is an optimization for circles. compile. Having one or more pure virtual member functions in a class makes the class an abstract base class. ``The sooner you catch an error. salesmen and janitors. rotated. since then our ``Circle'' class will be an abstract class (at least one pure virtual is not ``terminated. the class defines an interface that descendants must conform to. The graphically experienced reader has of course noticed that rotation of a circle can be implemented extremely efficiently by doing nothing at all. }. The latter is more descriptive. The problem in the model lies in the common base. addresses. Please note the obvious that errors that cannot be detected until runtime might go undetected! How to discover errors at design or edit time is not for this article (or even this article series). we can create our base class ``Shape'' with virtual member functions ``drawOn(Canvas&)''. • Let's just ignore it. since that'd make this article way too long. is it not? Let's have a look at the alternatives. ``Rectangle''. and any piece of code that can understand the interface can operate on objects implementing the interface (the concrete classes like ``Triangle''. a circle. secretaries. ``rotate(double degrees)''. 
This doesn't seem like a good idea because then the programmer implementing the square might forget to implement ``rotate'' without getting compiler errors. the better. and the point would be drowned in all other intricacies of graphical programming. The ``= 0'' ending of a member function declaration makes it pure virtual. It won't work. ``scale(double)'' and so on. and even kinds of employees we haven't yet thought of. it's not quite enough. ``translate(Coordinate c)''. you can scale them. A deficiency in the model While this is good. If you try you'll get compiler errors. and make sure to override these in our concrete shape classes. translated.'' Saying so also implies that objects of the class itself can never be instantiated. since it's meaningless anyway.

to access the address fields. Name Number Street City {Country} Postal-Code Then. virtual void acquire(void) = 0. however. As a simplification for this example I'll treat State and Zip in U.you add the destination country to the address. to always return the string ``Mailing address''. and this is a problem. but not pure virtual (what would happen if it was?) Unselfish protection All kinds of mailing addresses will share a base.S. that contains the address fields. of course. If the parameter for ``print'' is non-zero. This can be achieved through the third protection level.K. and ways to access them. the address will be printed in international form. Make sure ``State'' is only dealt with in address kinds where it makes sense. As an exercise you can improve this. there are totally different types of addresses.e. }.S. however. ``protected. virtual ~Address(). fax number. a mailing address. Ham Radio call-signs. even if they're Swedish addresses or U. The Country-Code as can be seen in the Swedish address example will also be ignored (this too makes for an excellent exercise to include). (i. Here we want something in between. e-mail address and so on. addresses are synonymous (i. The member function ``type'' will be defined here. The member function ``acquire'' is used for asking an operator to enter address data. This class. It is thus looser than private. since all kinds of mailing addresses are mailing addresses. and I will assume that PostalCode and State/Zip in U. etc. country name will be added to mailing addresses and international prefixes added to phone numbers). Note that the destructor is virtual. phone number. Here are a few (simplified) examples: Sweden Name Street Number {Country-Code}Postal-Code City {Country-Name} USA Name Number Street City. I'll only have one field that's used either as postal code or as state/zip combination. the concrete address classes. State Zip {Country-Name} Canada and U. E-mail. 
will not implement any of the formatting pure virtuals from ``Address. The address class hierarchy will be done such that other kinds of addresses like e-mail addresses and phone numbers can be added. but much stricter than public. while for domestic letters that's not necessary. virtual void print(int international=0) const = 0. but only the descendants and no one else. Here comes the ``MailingAddress'' base class: . We want descendants.e.'' Protected means that access is limited to the class itself (of course) and all descendants of it.S. Access to the address fields is for the concrete classes only. inheriting from ``Address''. Addresses.'' That must be done by the concrete address classes with knowledge about the country's formatting and naming. We've seen how we can make things generally available by declaring them public. Here's the base class: class Address { public: virtual const char* type() const = 0. addresses as a unit. depending on country). or by hiding them from the general public by making them private. The formatting itself also differs from country to country. The idea here is that ``type'' can be used to ask an address object what kind of address it is.
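To make the two ideas above concrete before the real classes, here is a minimal sketch of an abstract base with pure virtuals, a middle class that offers protected accessors to private data, and one concrete descendant. All names (``Contact'', ``PostalContact'', ``SwedishContact'', ``describe'') are hypothetical and not part of the article's code, and std::string is assumed instead of hand-managed char arrays to keep the sketch short.

```cpp
#include <string>

// Hypothetical names throughout; only the shape mirrors the article's hierarchy.
class Contact {                          // abstract: it has pure virtuals
public:
    virtual ~Contact() {}                // virtual, but not pure
    virtual std::string kind() const = 0;
    virtual std::string formatted() const = 0;
};

class PostalContact : public Contact {   // still abstract: formatted() is not terminated
public:
    std::string kind() const { return "Mailing address"; }
protected:                               // reachable by descendants only
    PostalContact() {}
    void city(const std::string& c) { city_data = c; }      // set
    const std::string& city() const { return city_data; }   // get
private:                                 // not even descendants touch this directly
    std::string city_data;
};

class SwedishContact : public PostalContact {
public:
    explicit SwedishContact(const std::string& c) { city(c); }
    std::string formatted() const { return "SE " + city(); }
};

// Works on the interface only; it never needs to know the concrete class.
std::string describe(const Contact& c)
{
    return c.kind() + ": " + c.formatted();
}
```

With these definitions, declaring a ``Contact'' or ``PostalContact'' object, or calling ``city()'' from code outside the hierarchy, should be rejected by the compiler, which is exactly the compile-time checking the text argues for.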

    class MailingAddress : public Address
    {
    public:
        virtual ~MailingAddress();
        const char* type() const;
    protected:
        MailingAddress();
        void name(const char*);           // set
        const char* name() const;         // get
        void street(const char*);         // set
        const char* street() const;       // get
        void number(const char*);         // set
        const char* number() const;       // get
        void city(const char*);           // set
        const char* city() const;         // get
        void postalCode(const char*);     // set
        const char* postalCode() const;   // get
        void country(const char*);        // set
        const char* country() const;      // get
    private:
        char* name_data;
        char* street_data;
        char* number_data;
        char* city_data;
        char* postalCode_data;
        char* country_data;
        //
        // declared private to disallow them
        //
        MailingAddress(const MailingAddress&);
        MailingAddress& operator=(const MailingAddress&);
    };

Here the copy constructor and assignment operator are declared private to disallow copying and assignment. This is not because they conceptually don't make sense, but because I'm too lazy to implement them (and yet want protection from stupid mistakes that would come, no doubt, if I left it to the compiler to generate them). It's the responsibility of this class to manage memory for the data strings; distributing this to the concrete descendants is asking for trouble. As a rule of thumb, protected data is a bad mistake. Having all data private, always managing the resources for the data in a controlled way, and giving controlled access through protected access member functions will drastically cut down your aspirin consumption. The reason for the constructor to be protected is more or less just aesthetic. No one but descendants can construct objects of this class anyway, since some of the pure virtuals from ``Address'' aren't yet terminated.

Now we get to the concrete address classes:

    class SwedishAddress : public MailingAddress
    {
    public:
        SwedishAddress();
        virtual void acquire(void);
        virtual void print(int international=0) const;
    };

    class USAddress : public MailingAddress
    {
    public:
        USAddress();
        virtual void acquire(void);
        virtual void print(int international=0) const;
    };

As you can see, the definitions of ``USAddress'' and ``SwedishAddress'' are identical. The only difference lies in the implementation of ``print'', and ``acquire''. I've left the destructors to be implemented at the compiler's discretion. Since there's no data to take care of in these classes (it's all in the parent class) we don't need to do anything special here. We know the parent takes care of it. Don't be afraid of copy construction and assignment. They were declared private in ``MailingAddress'', which means the compiler cannot create them for ``USAddress'' and ``SwedishAddress.''

Let's look at the implementation. For the ``Address'' base class only one thing needs implementing and that is the destructor. Since the class holds no data, the destructor will be empty:

    Address::~Address()
    {
    }

A trap many beginners fall into is to think that since the destructor is empty, we can save a little typing by declaring it pure virtual and there won't be a need to implement it. That's wrong, though, since the destructor will be called when a descendant is destroyed. There's no way around that. If you declare it pure virtual and don't implement it, you'll probably get a link error or, at worst, a nasty run-time error when the first concrete descendant is destroyed.

The observant reader might have noticed a nasty pattern of the author's refusal to get to the point with pure virtuals and implementation. Yes, you can declare a member function pure virtual, and yet implement it! Pure virtual does not illegalize implementation. It only means that the pure virtual version will NEVER be called through virtual dispatch (i.e. by just calling the function on an object, a reference or a pointer to an object.) Since it will never, ever, be called through virtual dispatch, it must be implemented by the descendants, hence the rule that you cannot instantiate objects where pure virtuals are not terminated. By termination, by the way, I mean declaring it in a non pure virtual way. OK, so a pure virtual won't ever be called through virtual dispatch. Then how can one be called? Through explicit qualification. Let's assume, just for the sake of argument, that we through some magic found a way to implement some reasonable generic behaviour of ``acquire'' in ``Address,'' but we want to be certain that descendants do implement it. The only way to call the implementation of ``acquire'' in ``Address'' is to explicitly write ``Address::acquire.'' This is what explicit qualification means. There's no escape for the compiler; writing it like this can only mean one thing, even if ``Address::acquire'' is declared pure virtual.

Now let's look at the middle class, the ``MailingAddress'' base class.

    MailingAddress::~MailingAddress()
    {
        delete[] name_data;
        delete[] street_data;
        delete[] number_data;
        delete[] city_data;
        delete[] postalCode_data;
        delete[] country_data;
    }

I said when explaining the interface for this class, that it is responsible for handling the resources for the member data. Since we don't know the length of the fields, we oughtn't restrict them, but rather dynamically allocate whatever is needed. The ``delete[]'' syntax is for deleting arrays as opposed to just ``delete'' which deletes single objects. Note that it's legal to delete the 0 pointer. This is used here. If, for some reason, one of the fields is not set to anything, it will be 0. Deleting the 0 pointer does nothing at all. From this to the constructor:

    MailingAddress::MailingAddress()
        : name_data(0),
          street_data(0),
          number_data(0),
          city_data(0),
          postalCode_data(0),
          country_data(0)
    {
    }

The only thing the constructor does is to make sure all pointers are 0, in order to guarantee destructability.

The ``type'' and read-access methods are trivial:

    const char* MailingAddress::type(void) const
    {
        return "Mailing address";
    }

    const char* MailingAddress::name(void) const
    {
        return name_data;
    }

    const char* MailingAddress::street(void) const
    {
        return street_data;
    }

    const char* MailingAddress::number(void) const
    {
        return number_data;
    }

    const char* MailingAddress::city(void) const
    {
        return city_data;
    }

    const char* MailingAddress::postalCode(void) const
    {
        return postalCode_data;
    }

    const char* MailingAddress::country(void) const
    {
        return country_data;
    }

The write access methods are a bit trickier, though. First we must check if the source and destination are the same, and do nothing in those situations. This is to achieve robustness. While it may seem like a very stupid thing to do, it's perfectly possible to see something like:

    name(name());

The meaning of this is, of course, ``set the name to what it currently is.'' We must make sure that doing this works (or find a way to illegalize the construct, but I can't think of any way). If the source and destination are different, however, the old destination must be deleted, a new one allocated on heap and the contents copied. Like this:

    void MailingAddress::name(const char* n)
    {
        if (n != name_data)
        {
            delete[] name_data; // OK even if 0
            name_data = new char[strlen(n)+1];
            strcpy(name_data,n);
        }
    }

``strlen'' and ``strcpy'' are the C library functions from <string.h> that calculate the length of, and copy, strings. This is done so many times over and over, exactly the same way for all kinds of data members, that we'll use a convenience function, ``replace,'' to do the job.

    static void replace(char*& data, const char* n)
    {
        if (data != n)
        {
            delete[] data;
            data = new char[strlen(n)+1];
            strcpy(data,n);
        }
    }
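As a small, self-contained check of the convenience function described above, the following sketch wraps the same logic in a hypothetical one-field class (``Label'' is an invented name, not part of the article) so that the ``set it to what it currently is'' case can be exercised. It assumes a compiler with the standard <cstring> header.

```cpp
#include <cstring>

// Same logic as the article's helper: do nothing when source and
// destination are the same pointer, otherwise free, allocate and copy.
static void replace(char*& data, const char* n)
{
    if (data != n) {
        delete[] data;                        // fine even if data is 0
        data = new char[std::strlen(n) + 1];  // +1 for the terminating '\0'
        std::strcpy(data, n);
    }
}

// Hypothetical one-field holder, only here to exercise replace().
class Label {
public:
    Label() : text_data(0) {}
    ~Label() { delete[] text_data; }
    void text(const char* t) { replace(text_data, t); }   // set
    const char* text() const { return text_data; }        // get
private:
    char* text_data;
    Label(const Label&);              // declared private to disallow them
    Label& operator=(const Label&);
};
```

Without the pointer comparison, ``label.text(label.text())'' would delete the buffer and then try to measure and copy from the memory it had just released.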

Using this convenience function, the write-access member functions will be fairly straight forward:

    void MailingAddress::name(const char* n)
    {
        ::replace(name_data,n);
    }

    void MailingAddress::street(const char* n)
    {
        ::replace(street_data,n);
    }

    void MailingAddress::number(const char* n)
    {
        ::replace(number_data,n);
    }

    void MailingAddress::city(const char* n)
    {
        ::replace(city_data,n);
    }

    void MailingAddress::postalCode(const char* n)
    {
        ::replace(postalCode_data,n);
    }

    void MailingAddress::country(const char* n)
    {
        ::replace(country_data,n);
    }

That was all the ``MailingAddress'' base class does. Now it's time for the concrete classes. All they do is to ask questions with the right terminology and output the fields in the right places:

    SwedishAddress::SwedishAddress()
        : MailingAddress()
    {
        country("Sweden"); // what else?
    }

    void SwedishAddress::print(int international) const
    {
        cout << name() << endl;
        cout << street() << ' ' << number() << endl;
        cout << postalCode() << ' ' << city() << endl;
        if (international) cout << country() << endl;
    }

    void SwedishAddress::acquire(void)
    {
        char buffer[100]; // A mighty long field
        cout << "Name: " << flush;
        cin.getline(buffer,sizeof(buffer));
        name(buffer);
        cout << "Street: " << flush;
        cin.getline(buffer,sizeof(buffer));
        street(buffer);
        cout << "Number: " << flush;
        cin.getline(buffer,sizeof(buffer));
        number(buffer);
        cout << "Postal code: " << flush;
        cin.getline(buffer,sizeof(buffer));
        postalCode(buffer);
        cout << "City: " << flush;
        cin.getline(buffer,sizeof(buffer));
        city(buffer);
    }

    USAddress::USAddress()
        : MailingAddress()
    {
        country("U.S.A."); // what else?
    }

    void USAddress::print(int international) const
    {
        cout << name() << endl;
        cout << number() << ' ' << street() << endl;
        cout << city() << ' ' << postalCode() << endl;
        if (international) cout << country() << endl;
    }

    void USAddress::acquire(void)
    {
        char buffer[100]; // Seems like a mighty long field
        cout << "Name: " << flush;
        cin.getline(buffer,sizeof(buffer));
        name(buffer);
        cout << "Number: " << flush;
        cin.getline(buffer,sizeof(buffer));
        number(buffer);
        cout << "Street: " << flush;
        cin.getline(buffer,sizeof(buffer));
        street(buffer);
        cout << "City: " << flush;
        cin.getline(buffer,sizeof(buffer));
        city(buffer);
        cout << "State and ZIP: " << flush;
        cin.getline(buffer,sizeof(buffer));
        postalCode(buffer);
    }

A toy program
Having done all this work with the classes, we must of course play a bit with them. Here's a short and simple example program that (of course) also makes use of the generic programming paradigm introduced last month.

    int main(void)
    {
        const unsigned size=10;
        Address* addrs[size];
        Address** first = addrs; // needed for VACPP (bug?)
        Address** last = get_addrs(addrs,addrs+size);
        cout << endl << "--------" << endl;
        for_each(first,last,print(1));
        for_each(first,last,deallocate<Address>());
        return 0;
    }

OK, that was mean. Obviously there's a function ``get_addrs'', which reads addresses into a range of iterators (in this case pointers in an array) until the array is full, or it terminates for some other reason. Here's how it may be implemented:

    Address** get_addrs(Address** first,Address** last)
    {
        Address** current = first;
        while (current != last)
        {
            cout << endl << "Kind (U)S, (S)wedish or (N)one " << flush;
            char answer[5]; // Should be enough.
            cin.getline(answer,sizeof(answer));
            if (!cin) break;
            switch (answer[0]) {
            case 'U': case 'u':
                *current = new USAddress;
                break;
            case 'S': case 's':
                *current = new SwedishAddress;
                break;
            default:
                return current;
            }
            (**current).acquire();
            ++current;
        }
        return current;
    }

In part 6 I mentioned that virtual dispatch could replace switch statements, and yet here is one. Could this one be replaced with virtual dispatch as well? It would be unfair of me to say ``no'', but it would be equally unfair of me to propose using virtual dispatch here. The reason is that we'd need to work a lot without gaining anything. Why? We obviously cannot do virtual dispatch on the ``Address'' objects we're about to create, since they're not created yet. Instead we'd need a set of address creating objects, which we can access through some subscript or whatever, and call a virtual creation member function for. Doesn't seem to save a lot of work, does it? Probably the selection mechanism for which address creating object to call would be a switch statement anyway!

So, that was reading, now for the rest. ``for_each'' does something for every iterator in a range. It could be implemented like this:

    template <class OI,class F>
    void for_each(OI first,OI last, const F& functor)
    {
        while (first != last)
        {
            functor(*first);
            ++first;
        }
    }

In fact, in the (final draft) C++ standard, there is a beast called ``for_each'' and behaving almost like this one (it returns the functor). It's pretty handy. Imagine never again having to explicitly loop through a complete collection.

What is ``print'' then? Print is a ``functor,'' or ``function object'' as they're often called. It's something which behaves like a function, but which might store a state of some kind (in this case whether the country should be added to written addresses or not), and which can be passed around like any object. Defining one is easy, although it looks odd at first.

    class print
    {
    public:
        print(int i);
        void operator()(const Address*) const;
    private:
        int international;
    };

    print::print(int i)
        : international(i)
    {
    }

    void print::operator()(const Address* p) const
    {
        p->print(international);
        cout << endl;
    }

What on earth is ``operator()''? It's the member function that's called if we boldly treat the name of an object just as if it was the name of some function, and simply call it. Like this:

    print pobject(1);   // define print object.
    pobject(address);   // pobject.operator()(address), where address is an Address*

This is usually called the ``function call'' operator, by the way. The only remaining thing now is ``deallocate<T>'', but you probably already guessed it looks like this:

    template <class T>
    class deallocate
    {
    public:
        void operator()(T* p) const;
    };

    template <class T>
    void deallocate<T>::operator()(T* p) const
    {
        delete p;
    }

This is well enough for one month, isn't it? You know what? You know by now most of the C++ language, and have some experience with the C++ standard class library. Most of the language issues that remain are more or less obscure and little known. We'll look mostly at library stuff and clever ideas for how to use the language from now on.

Recap
This month, you've learned:
• what pure virtual means, and how you declare pure virtual functions.
• that despite what most C++ programmers believe, pure virtual functions can be implemented.
• that the above means that there's a distinction between terminating a pure virtual, and implementing one.
• why it's a bad idea to make destructors pure virtual.
• a new protection level, ``protected.''
• why protected data is bad, and how you can work around it in a clever way.
• that switch statements cannot always be replaced by virtual dispatch.
• that there is a ``function call'' operator and how to define and use it.

Exercises
• Find out what happens if you declare the ``MailingAddress'' destructor pure virtual, and yet define it.
• Think of two ways to handle the State/Zip problem, and implement both (what are the advantages, disadvantages of the methods?)
• Rewrite ``get_addrs'' to accept templatized iterators instead of pointers.
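The function call operator and the remark that the standard ``for_each'' returns the functor can be tried out in isolation. The sketch below assumes a standard library that provides std::for_each and std::vector; the class ``scaled_sum'' and the function ``sum_scaled'' are invented for illustration and do not come from the article.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical functor: it is called like a function, but it carries
// state (a scale factor and a running total) between the calls.
class scaled_sum {
public:
    explicit scaled_sum(int factor) : factor_(factor), total_(0) {}
    void operator()(int value) { total_ += factor_ * value; }
    int total() const { return total_; }
private:
    int factor_;
    int total_;
};

int sum_scaled(const std::vector<int>& values, int factor)
{
    // std::for_each hands back the functor after visiting every element,
    // so the accumulated state can be read from the returned copy.
    scaled_sum result =
        std::for_each(values.begin(), values.end(), scaled_sum(factor));
    return result.total();
}
```

The state lives in the object rather than in a global variable, which is what lets one algorithm call be configured differently from the next.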

and lots of useful and cool techniques are waiting to be exploited. In other words. from which the classes ``istream'' and ``ostream'' inherit. like Eiffel. We saw this for the staff hierarchy and mailing addresses in parts 7 and 8. The only thing that truly differs is the media where the formatted message ends up. Quite a bit of the library remains. independent of data. and ``fstream'' which inherits from both ``ifstream'' and ``ofstream. next month we'll look at file I/O (finally). but it can cause severe problems. there is very little difference. just for differing between character types. it's better to stop using the term I/O here. In the former case. how it does end up there) differs. since the data will be the same. and is by many seen as evil. since the ideas expressed here and in parts 5 and 6 can be used for other things than I/O. depending on what's common and what's not.'' Inheriting from two bases is called multiple inheritance. templates are used when we want the same kind of behaviour. with formatted reading and writing from standard input and output. Few compilers today support this. The C++ standard does indeed have templatized streams. regarding the type of characters used. (Incidentally. For example a stack of some data type. which inherits from both ``istream'' and ``ostream''. Smalltalk to mention a few. it's on your screen. Here is a situation where it's used in the right way. there's a good case for using templates too. behaviour at runtime for the same kind of data. /Björn Part Part Part Part Part Part Part Part Part Part1 Part1 Part1 Part1 1 2 3 4 5 6 7 8 9 0 1 2 3 In parts 5 and 6. Anyway. the basics of I/O were introduced. there's very much in common. Then there's the odd ones.Coming up As I mentioned just a few lines above. for example in-memory formatting of data (we'll see that at the very end of this article. commonality is expressed either through inheritance or templates. 
The classes ``ifstream'' and ``ofstream'' in their turn inherit from ``istream'' and ``ostream'' respectively.) Files In what way is writing ``Hello world'' on standard output different from writing it to a file? The question is worth some thought. however.) The inheritance tree for stream types look like this: The way to read this is that there's a base class named ``ios''. Anyway. but where it will end up (and most notably. most of the C++ language is covered. To refresh your memory. since in many programming languages there is a distinct difference. See the ``Standards Update'' towards the end of the article for more information. Is the message different? Is the format (as seen from the program) different? I cannot see any difference in those aspects. or at least. pointer and delete. but in some important aspects different. As we've seen so far. and instead use streams and streaming. Here's something for you to think about destructor. In this case it's inheritance that's the correct solution. Inheritance is used when you want similar. ``iostream''. this means . We'll now have a look at I/O for files. but for file I/O it's in a file somewhere on your hard disk. go to the other extreme and allow you to inherit the same base several times Personally I think multiple inheritance is very useful if used right. while other programming languages. Java. The ``f'' in the names imply that they're file streams. Many programming languages have banned it: Objective-C. In a sense.

The parts of interest look like this: class ifstream : public istream { ifstream(). while ``iostream'' is an abstract stream for both reading and writing. int mode=ios::in). Some implementations do not have ``ios::binary. This inheritance. void open(const char* name.h>. }. int mode=ios::in). and finally ``ios::binary.. a call to ``open'' must be made. you probably don't want to use the ``iostream'' or ``fstream'' classes. int mode). You get access to the classes by #including <fstream.. File Streams The first thing you need to know before you can use file streams is how to create them. . wasn't that neat? In other words. will work just as they do with file streams. int mode=ios::out). Since you normally use either ``ifstream'' or ``ofstream'' and rarely ``fstream''. however. To tie such an object to a file.. class fstream : public ofstream. the only things you need to learn for file based I/O are the details that are specific to files. int mode=ios::out). More often than you think. ofstream(const char* name. int mode). }.. scrap all data in the file if it already exists. }. class ofstream : public ostream { ofstream(). void open(const char* name.'' while others call it ``ios::bin. they belong to class ``ios_base.. public ifstream { fstream(). Fortunately. .'' rather than ``ios. this is normally the only parameter you need to supply. . It's a bit field. ifstream(const char* name.. however. that is. ``ios::out''. ``open'' and the constructors with parameters behaves identically. you need to use the ``mode'' parameter. Now. ``name'' is of course the name of the file.'' but those are extensions. . the six ones listed first are required by the standard (although.that ``fstream'' is a file stream for both reading and writing. fstream(const char* name. in which you use bitwise or (``operator|'') for any of the values ``ios::in''. Sometimes. ``ios::trunc''. ``ios::ate''. 
any write you make to the file will be appended to the file.'' These variations of course makes it difficult to write portable C++ today.'' Some implementations also provide ``ios::nocreate'' and ``ios::noreplace. open for append. The empty constructors always create a file stream object that is not tied to any file.'') The meaning of these are: ios::in ios::out ios::ate ios::app ios::trunc open for reading open for writing open with the get and set pointer at the end (see Seeking for info) of the file. means that all the stream insertion and extraction functions (the ``operator>>'' and ``operator<<'') you've written. ``ios::app''. void open(const char* name.

If you look at a file produced by. ostream& put(char c). so there's no need for it. for example to save space in a file.) They're two different concepts..h> int main(int argc.ios::binary open in binary mode. . and opening a file with the ``ios::binary'' mode. cannot open `` << argv[1] << endl. if (!of) { // something went wrong cout << ``Error. } // Now the file stream object is created. How this parameter behaves is very operating system dependent. do not do the brain damaged LF<->CR/LF conversions that OS/2. } As you can see. Binary streaming So far we've dealt with formatted streaming only. Binary streaming is what you use your stream for.. { public: ostream& write(const char* s. Actually this is all there is that's specific to files. ostream& flush(). Windows. just use the object as you've used ``cin'' earlier. Binary streaming is done through the stream member functions : class ostream . that is. return 1. the process of translating raw data into a human readable form. and probably other operating systems. Now for some simple usage: #include <fstream.the failure is guaranteed. The file stream classes also have a member function ``close''. Of course combinations like ``ios::noreplace | ios::nocreate'' doesn't make sense -. // create the ofstream object // and open the file. or translating human readable data into the computer's internal representation. Write to it! of << ``Hello file!'' << endl. since the destructors do close the file. raw data that is. its usage is analogous to that of ``cout'' that you're already familiar with. for example a word processor. that by force closes the file and unties the stream object from it. Some times you want to stream raw data as raw data. CP/M (RIP). Of course reading with ``ifstream'' is done the same way. char* argv[]) { if (argc != 2) { cout << ``Usage: `` << argv[0] << ``filename'' << endl. once the stream object is created. return 0. ios::nocreate cause the open to fail if the file doesn't exist. 
ios::noreplace cause the open to fail if the file already exists. Few are the situations when you need to call this member function. that is. // error code } ofstream of(argv[1]).'' a protection parameter. return 2. means turning the brain damaged LF<->CR/LF translation off. that is indeed often the case. so often insist on. DOS. The reason some implementations do not have ios::binary is that many operating systems do not have this conversion. Note that binary streaming does not necessarily mean using the ``ios::binary'' mode when opening a file (although. streamsize n). it's most likely not in a human readable form. On many implementations today there's also a third parameter for the constructors and ``open.

  class istream ...
  {
  public:
    istream& read(char* s, streamsize n);
    int get();
    istream& get(char& c);
    istream& get(char* s, streamsize n, char delim='\n');
    istream& getline(char* s, streamsize n, char delim='\n');
    istream& ignore(streamsize n=1, int delim=EOF);
    ...
  };

Note that these member functions are implemented in classes ``istream'' and ``ostream,'' so they're not specific to files, although files are where you're most likely to use them. The writing interface is extremely simple and straight forward, while the reading interface includes a number of small but important differences. Let's have a look at them, one by one:

  ostream& ostream::write(const char* s, streamsize n);

Write ``n'' characters to the stream, from the array pointed to by ``s.'' ``streamsize'' is a signed integral data type. Despite ``streamsize'' being signed, you're of course not allowed to pass a negative size here (what would that mean?) Exactly the characters found in ``s'' will be written to the stream, no more, no less.

  ostream& ostream::put(char c);

Inserts the character into the stream.

  ostream& ostream::flush();

Force the data in the stream to be written (file streams are usually buffered.)

  istream& istream::read(char* s, streamsize n);

Read ``n'' characters into the array pointed to by ``s.'' Here you better make sure that the array is large enough, or unpleasant things will happen. Note that only the characters read from the stream are inserted into the array. It will not be zero terminated, unless the last character read from the stream indeed is '\0'.

  int istream::get();

Read one character from the stream, and return it. The value is an ``int'' instead of ``char'' since the return value might be ``EOF'' (which is not uniquely representable as a ``char.'')

  istream& istream::get(char& c);

Same as above, but read the character into ``c'' instead. Here a ``char'' is used instead of an ``int,'' since you can check the value directly by calling ``.eof()'' on the reference returned.

  istream& istream::get(char* s, streamsize n, char delim='\n');

This one's similar to ``read'' above, but with the difference that it reads at most ``n'' characters. It stops if the delimiter character is found. Note that when the delimiter is found, it is not read from the stream.

  istream& istream::getline(char* s, streamsize n, char delim='\n');

The only difference between this one and ``get'' above, is that this one does read the delimiter from the stream. Note, however, that the delimiter is not stored in the array.

  istream& istream::ignore(streamsize n=1, int delim=EOF);

Reads at most ``n'' characters from the stream, but doesn't store them anywhere. If the delimiter character is read, it stops there. Of course, if the delimiter is ``EOF'' (as is the default) it does not read past ``EOF,'' that's physically impossible.
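To see a few of these member functions working together, here is a small sketch. It assumes a standard library with the modern headers (no ``.h''), and it uses a ``std::stringstream'' instead of a file so that it is self-contained. The helper names ``writeRecord'' and ``readRecord'', and the record layout they use, are made up for this illustration.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes a 4 character tag, a ':' and a line of text
// using only write(), put() and flush().
void writeRecord(std::ostream& os, const char* tag, const std::string& text)
{
    assert(tag != nullptr);
    os.write(tag, 4);                       // exactly 4 characters, no terminator
    os.put(':');
    os.write(text.data(), static_cast<std::streamsize>(text.size()));
    os.put('\n');
    os.flush();
}

// Hypothetical helper: reads back what writeRecord() wrote.
// Returns false if the stream does not hold a complete record.
bool readRecord(std::istream& is, std::string& tag, std::string& text)
{
    char rawTag[4];
    if (!is.read(rawTag, sizeof rawTag))    // read() stores no terminator
        return false;
    tag.assign(rawTag, sizeof rawTag);

    if (is.get() != ':')                    // int get(): one character, or EOF
        return false;

    char line[64];
    if (!is.getline(line, sizeof line))     // extracts '\n' but does not store it
        return false;
    text = line;
    return true;
}
```

Writing a record to a ``std::stringstream'' and reading it back from the same object gives back the tag and the text. Note that the fixed 64 character buffer is a limitation of the sketch; a longer line makes ``getline'' fail.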

Array on file
An example: Say we want to store an array of integers in a file, and we want to do this in raw binary format. A reasonable way is to first store a size (in elements) followed by the data. Both the size and the data will be in raw format.

  #include <fstream.h>

  void storeArray(ostream& os, const int* p, size_t elems)
  {
    os.write((const char*)&elems, sizeof(elems));
    os.write((const char*)p, elems*sizeof(*p));
  }

The above code does a lot of ugly type casting, but that's normal for binary streaming. What's done here is to use brute force to see the address of ``elems'' as a ``const char*'' (since that's what ``write'' expects) and then say that only the ``sizeof(elems)'' bytes from that pointer are to be read. What this actually does is to write out the raw memory that ``elems'' resides in to the stream. After this, it does the same kind of thing for the array. Note that ``sizeof(*p)'' reports the size of the type that ``p'' points to. I could as well have written ``sizeof(int),'' but that is a dangerous duplication of facts. It's enough that I've said that ``p'' is a pointer to ``int.'' Repeating ``int'' again just means I'll forget to update one of them when I change the type to something else.

Naturally we want to be able to read the array as well. To read such an array into memory requires a little more work:

  #include <fstream.h>

  size_t readArray(istream& is, int*& p)
  {
    size_t elems;
    is.read((char*)&elems, sizeof(elems));
    p = new int[elems];
    is.read((char*)p, elems*sizeof(*p));
    return elems;
  }

It's not particularly hard to follow: first read the number of elements, then allocate an array of that size, and read the data into it.

Seeking
Up until now we have seen streams as, well, continuous streams of data, within which you cannot move around. Sometimes however, there's a need to move around, both backward and forward. Streams like standard input and standard output are truly continuous streams. Files, in contrast, are true random access data stores.

Random access streams have something called position pointers. They're not to be confused with pointers in the normal C++ sense, but it's something referring to where in the file you currently are. There's the put pointer, which refers to the next position to write data to, if you attempt to write anything, and the get pointer, which refers to the next position to read data from. An ostream of course only has the put pointer, and an istream only the get pointer.

There's a total of 6 new member functions that deal with random access in a stream:

  streampos istream::tellg();
  istream& istream::seekg(streampos);
  istream& istream::seekg(streamoff, ios::seek_dir);
  streampos ostream::tellp();
  ostream& ostream::seekp(streampos);
  ostream& ostream::seekp(streamoff, ios::seek_dir);

``streampos'', which you get from ``tellg'' and ``tellp'', is an absolute position in a stream. You cannot use the values for anything other than ``seekg'' and ``seekp''. You especially cannot examine a value and hope to find something useful there (i.e. you can, but what you find out might hold only for the current release of your specific compiler; other compilers, or other releases of the same compiler, might show different characteristics for ``streampos.'') Well, there are two other things you can do with ``streampos'' values. You can subtract two values, and get a ``streamoff'' value, and you can add a ``streamoff'' value to a ``streampos'' value.

``streamoff,'' by the way, is some signed integral type, probably a ``long.'' By using the value returned from ``tellg'' or ``tellp,'' you have a way of finding your way back, or do relative searches by adding/subtracting ``streamoff'' values.

The ``seekg'' and ``seekp'' methods that accept a ``streamoff'' value and a direction work in a slightly different way. You search your way to a position relative to the beginning of the stream, the end of the stream, or the current position, the selection of which is done through the ``ios::seek_dir'' enum, which has these three values: ``ios::beg'', ``ios::end'' and ``ios::cur.'' To make the next write occur on the very first byte of the stream, call ``os.seekp(0,ios::beg),'' where ``os'' is some random access ``ostream.''

In any reasonable implementation, any of the seek member functions use lazy evaluation. That is, when you call any of the seek member functions, the only thing that happens is that some member variable in the stream object changes value. It's not until you actually read or write that something truly happens on disk (or wherever the stream data resides.)

A stream array, for really huge amounts of data
Suppose we have a need to access enormous amounts of simple data, say 10 million floating point numbers. It's not a very good idea to just allocate that much memory, at least not on my machine with a measly 64Mb RAM. It'll not just make this application crawl, but probably the whole system due to excessive paging. Instead, let's use a file to access the data. This makes for slow access, for sure, but nothing else will suffer.

Here's the idea.
• The array must be possible to use with any data type, including user defined classes.
• Its usage must resemble that of real arrays as much as possible, but extra functionality that arrays do not have, such as asking for the number of elements in it, is OK.
• There must be a type, resembling pointers to arrays, that can be used for traversing it.
• We do not want the size of the array to be part of its type (if you've programmed in Pascal, you know why.)
• In addition to arrays, we want some measures of safety from stupid mistakes, such as addressing beyond the range of the array, and also for errors that arrays cannot have (disk full, cannot create file, disk corruption, etc.)
• We also want to say that an array is just a part of a file and not necessarily an entire file. This would allow the user to create several arrays within the same file.

First of all, the array must be a template, so it can be used to store arbitrary types. Since we do not want the size to be part of the type signature, the size is not a template parameter, but a parameter for the constructor. Of course, we cannot have the entire array duplicated in memory (then all the benefits will be lost,) instead we will search for the data on file every time it's needed.

To prevent this article from growing way too long, quite a few of the above listed features will be left for next month. The things to cover this month are: An array of built-in fundamental types only, which lacks pointers and is limited to one file per array. We'll also skip error handling for now (you can add it as an exercise, I'll raise some interesting questions along the way,) and add that too next month.

Here's the outline for the class.

  template <class T>
  class FileArray
  {
  public:
    FileArray(const char* name, size_t elements);
    // Create a new array and set the size.
    FileArray(const char* name);
    // Create an array from an existing file, get the
    // size from the file.
    // use compiler defined destructor.
    T operator[](size_t index) const;
    ??? operator[](size_t index);
    size_t size() const;
  private:
    // don't want these to be used.
    FileArray(const FileArray&);
    FileArray& operator=(const FileArray&);
  };

As can be expected, ``operator[]'' can be overloaded, which is handy for providing a familiar syntax. However, already here we see a problem. What's the non-const ``operator[]'' to return? To see why this is a problem, ask yourself what you want ``operator[]'' to do.

I want ``operator[]'' to do two things, depending on where it's used, like this:

  FileArray<int> x;
  ...
  x[5] = 4;
  int y = x[3];

When ``operator[]'' is on the left hand side of an assignment, I want to write data to the file, and if it's on the right hand side of an assignment, I want to read data from the file. Ouch.

Warning: I've often seen it suggested that the solution is to have the const version read and return a value, and the non-const version write a value. As slick as it would be, it's wrong and it won't work. The const version is called for const array objects, the non-const version for non-const array objects.

Instead what we have to do is to pull a little trick. The trick is, as so often in computer science, to add another level of indirection. This is done by not taking care of the problem in ``operator[],'' but rather let it return a type, which does the job. We create a class template, looking like this:

  template <class T>
  class FileArrayProxy
  {
  public:
    FileArrayProxy<T>& operator=(const T&); // write a value
    operator T() const; // read a value
    // compiler generated destructor
    FileArrayProxy<T>& // read from p and then write
    operator=(const FileArrayProxy<T>& p);
    FileArrayProxy(const FileArrayProxy<T>&);
  private:
    ... // all other constructors.
    FileArray<T>& array;
    const size_t index;
  };

We have to make sure, of course, that there are member functions in ``FileArray<T>'' that can read and write (and of course, those functions are not the ``operator[],'' since then we'd have an infinite recursion.) All constructors, except for the copy constructor, are made private to prevent users from creating objects of the class whenever they want to. After all, this class is a helper for the array only, and is not intended to ever even be seen.

This, however, poses a problem. With the constructors being private, how can ``FileArray<T>::operator[]()'' create and return one? Enter another C++ feature: friends. Friends are a way of breaking encapsulation. What?!?! Yes, what you read is right. Friends break encapsulation, and (this is the real shock) that's a good thing! Friends break encapsulation in a controlled way. We can, in ``FileArrayProxy<T>'' declare ``FileArray<T>'' to be a friend. This means that ``FileArray<T>'' can access everything in ``FileArrayProxy<T>,'' including things that are declared private. Paradoxically, violating encapsulation with friendship strengthens encapsulation when done right. The only alternative here to using friendship, is to make the constructors public, but then anyone can create objects of this class, and that's what we wanted to prevent.

Friends are useful for strong encapsulation, but it's important to use it only in situations where two (or more) classes are so tightly bound to one another that they're meaningless on their own. This is the case with ``FileArrayProxy<T>.'' It's meaningless without ``FileArray<T>,'' thus ``FileArray<T>'' is declared a friend of ``FileArrayProxy<T>.'' The declaration then becomes:

  template <class T>
  class FileArrayProxy
  {
  public:
    FileArrayProxy& operator=(const T&); // write value
    operator T() const; // read a value
    // compiler generated destructor
    FileArrayProxy<T>& operator=(const FileArrayProxy<T>& p);
    // compiler generated copy constructor
  private:
    FileArrayProxy(FileArray<T>& fa, size_t n);
    // for use by FileArray<T> only.
    friend class FileArray<T>;
    FileArray<T>& array;
    const size_t index;
  };
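The friendship arrangement on its own, stripped of everything that has to do with files, can be sketched like this. The classes ``Owner'' and ``Handle'' are hypothetical stand-ins made up for the illustration, and the sketch assumes a compiler that supports standard C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class Owner;                       // forward declaration, Handle refers to it

// Hypothetical helper class: its only constructor is private,
// so nobody but its friend Owner can create one from scratch.
class Handle
{
public:
    int value() const;             // defined below, once Owner is complete
private:
    Handle(const Owner& o, std::size_t i) : owner(o), index(i) {}
    friend class Owner;            // Owner may call the private constructor
    const Owner& owner;
    const std::size_t index;
};

class Owner
{
public:
    explicit Owner(std::vector<int> v) : data(std::move(v)) {}
    Handle at(std::size_t i) const
    {
        assert(i < data.size());
        return Handle(*this, i);   // allowed only because of the friendship
    }
private:
    friend class Handle;           // Handle may read the private data
    std::vector<int> data;
};

inline int Handle::value() const { return owner.data[index]; }
```

User code can receive a ``Handle'' from ``Owner::at'' and copy it, but writing ``Handle h(someOwner, 0);'' outside of ``Owner'' does not compile. The two classes are meaningless without each other, which is the situation where friendship is appropriate.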

We can now start implementing the array. Some problems still lie ahead, but I'll mention them as we go.

  // farray.hpp
  #ifndef FARRAY_HPP
  #define FARRAY_HPP

  #include <fstream.h>
  #include <stdlib.h> // size_t

  template <class T> class FileArrayProxy;
  // Forward declaration necessary, since FileArray<T>
  // returns the type.

  template <class T>
  class FileArray
  {
  public:
    FileArray(const char* name, size_t size); // create
    FileArray(const char* name); // use existing array
    T operator[](size_t size) const;
    FileArrayProxy<T> operator[](size_t size);
    size_t size() const;
  private:
    FileArray(const FileArray<T>&); // illegal
    FileArray<T>& operator=(const FileArray<T>&);

    // for use by FileArrayProxy<T>
    T readElement(size_t index) const;
    void storeElement(size_t index, const T&);

    fstream stream;
    size_t max_size;
    friend class FileArrayProxy<T>;
  };

The functions for reading and writing are made private members of the array, since they're not for anyone to use. Again, we need to make use of friendship to grant ``FileArrayProxy<T>'' the right to access them. Let's define them right away.

  template <class T>
  T FileArray<T>::readElement(size_t index) const
  {
    T t;
    stream.seekg(sizeof(max_size)+index*sizeof(T));
    // what if seek fails?
    stream.read((char*)&t, sizeof(t));
    // what if read fails?
    return t;
  }

All of a sudden, we face an unexpected problem. The above code won't compile. The member function is declared ``const'', and as such, all member variables are ``const'', and neither ``seekg'' nor ``read'' are allowed on constant streams.

The problem is one of differing between logical constness and bitwise constness. This member function is logically ``const'', as it does not alter the array in any way. However, it is not bitwise const; the stream member changes. C++ cannot understand logical constness, only bitwise constness. If you have a modern compiler, the solution is very simple; you declare ``stream'' to be ``mutable fstream stream;'' in the class definition. I, however, have a very old compiler, so I have to find a different solution. This solution is, yet again, one of adding another level of indirection.

I can have a pointer to an ``fstream.'' When in a ``const'' member function, the pointer is also ``const'', but not what it points to (there's a difference between a constant pointer, and a pointer to a constant.) The only reasonable way to achieve this is to store the stream object on the heap, and in doing this I introduce a possible danger; what if I forget to delete the pointer? Sure, I'll delete it in the destructor, but what if an exception is thrown already in the constructor? Then the destructor will never execute (since no object has been created that must be destroyed.) Do you remember the ``thing to think of until this month?'' The clues were destructor, pointer and delete. Thought of anything? What about this extremely simple class template?

  template <class T>
  class ptr
  {
  public:
    ptr(T* pt);
    ~ptr();
    T& operator*() const;
  private:
    ptr(const ptr<T>&); // we don't want copying
    ptr<T>& operator=(const ptr<T>&); // nor assignment
    T* p;
  };

  template <class T>
  ptr<T>::ptr(T* pt)
    : p(pt)
  {
  }

  template <class T>
  ptr<T>::~ptr()
  {
    delete p;
  }

  template <class T>
  T& ptr<T>::operator*() const
  {
    return *p;
  }

This is probably the simplest possible of the family known as ``smart pointers.'' I'll probably devote a whole article exclusively for these some time. Whenever an object of this type is destroyed, whatever it points to is deleted. The only thing we have to keep in mind when using it, is to make sure that whatever we feed it is allocated on heap (and is not an array) so it can be deleted with operator delete.

This solves our problem nicely. When this thing is a constant, the thing pointed to still isn't a constant (look at the return type for ``operator*,'' it's a ``T&,'' not a ``const T&.'') So, instead of using an ``fstream'' member variable called ``stream,'' let's use a ``ptr<fstream>'' member named ``pstream.'' With this change, ``readElement'' must be slightly rewritten:

  template <class T>
  T FileArray<T>::readElement(size_t index) const
  {
    (*pstream).seekg(sizeof(max_size)+index*sizeof(T));
    // what if seek fails?
    T t;
    (*pstream).read((char*)&t, sizeof(t));
    // what if read fails?
    return t;
  }

I bet the change wasn't too horrifying.

  template <class T>
  void FileArray<T>::storeElement(size_t index, const T& elem)
  {
    (*pstream).seekp(sizeof(max_size)+index*sizeof(T), ios::beg);
    // what if seek fails?
    (*pstream).write((char*)&elem, sizeof(elem));
    // what if write failed?
  }

Now for the constructors:

  template <class T>
  FileArray<T>::FileArray(const char* name, size_t size)
    : pstream(new fstream(name, ios::in|ios::out|ios::binary)),
      max_size(size)
  {
    // what if the file could not be opened?

    // store the size on file.
    (*pstream).write((const char*)&max_size, sizeof(max_size));
    // what if write failed?

    // We want to write a value (any value) at the end
    // to make sure there is enough space on disk.
    T t;
    storeElement(max_size-1,t);
    // What if this fails?
  }

  template <class T>
  FileArray<T>::FileArray(const char* name)
    : pstream(new fstream(name, ios::in|ios::out|ios::binary)),
      max_size(0)
  {
    // get the size from file.
    (*pstream).read((char*)&max_size, sizeof(max_size));
    // what if read fails or max_size == 0?
    // How do we know the file is even an array?
  }

The access members:

  template <class T>
  T FileArray<T>::operator[](size_t size) const
  {
    // what if size >= max_size?
    return readElement(size);
    // What if read failed because of a disk error?
  }

  template <class T>
  FileArrayProxy<T> FileArray<T>::operator[](size_t size)
  {
    // what if size >= max_size?
    return FileArrayProxy<T>(*this, size);
  }

Well, this wasn't too much work, but then, as can be seen by the comments, there's absolutely no error handling here. I've left out the ``size'' member function, since its implementation is trivial.

Next in line is ``FileArrayProxy<T>.''

  template <class T>
  class FileArrayProxy
  {
  public:
    // copy constructor generated by compiler
    operator T() const;
    FileArrayProxy<T>& operator=(const T& t);
    FileArrayProxy<T>& operator=(const FileArrayProxy<T>& p);
  private:
    FileArrayProxy(FileArray<T>& f, size_t i);
    size_t index;
    FileArray<T>& fa;
    friend class FileArray<T>;
  };

The copy constructor is needed, since the return value must be copied (return from ``FileArray<T>::operator[],'') and it must be public for this to succeed. The one that the compiler generates for us, which just copies all member variables, will do just fine. The compiler doesn't generate a default constructor (one which accepts no parameters,) since we have explicitly defined a constructor.

The assignment operator is necessary, however. Sure, the compiler will try to generate one for us if we don't, but it will fail, since references (``fa'') can't be rebound. Note, that if we instead of a reference had used a pointer, it would succeed, but the result would *NOT* be what we want. What it would do is to copy the member variables, but what we want to do is to read data from one array and write it to another.

Now for the implementation:

  template <class T>
  FileArrayProxy<T>::FileArrayProxy(FileArray<T>& f, size_t i)
    : index(i),
      fa(f)
  {
  }

  template <class T>
  FileArrayProxy<T>::operator T() const
  {
    return fa.readElement(index);
  }

  template <class T>
  FileArrayProxy<T>& FileArrayProxy<T>::operator=(const T& t)
  {
    fa.storeElement(index,t);
    return *this;
  }

  template <class T>
  FileArrayProxy<T>& FileArrayProxy<T>::operator=(
    const FileArrayProxy<T>& p
  )
  // read from one array and write to the other.
  {
    fa.storeElement(index,p);
    return *this;
  }

  #endif // FARRAY_HPP
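For readers with a modern compiler, the whole construction can be condensed into the sketch below. It is not the header above, but a variation under a few assumptions of mine: standard headers and the ``std'' namespace, ``mutable'' instead of the ``ptr'' helper, ``ios::trunc'' so that the file is created if it doesn't exist, and a ``static_assert'' that rejects types for which raw binary storage makes no sense. The private helper ``offsetOf'' is made up for the sketch. Just like the original, it does no error handling.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <type_traits>

template <class T> class FileArrayProxy;   // returned by FileArray<T>::operator[]

template <class T>
class FileArray
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw binary storage needs a trivially copyable type");
public:
    // Creates (or truncates) the file and reserves room for `size` elements.
    FileArray(const char* name, std::size_t size)
        : stream(name, std::ios::in | std::ios::out |
                       std::ios::binary | std::ios::trunc),
          max_size(size)
    {
        assert(size > 0);
        stream.write(reinterpret_cast<const char*>(&max_size), sizeof max_size);
        storeElement(max_size - 1, T());   // touch the last slot to claim the space
    }

    std::size_t size() const { return max_size; }
    T operator[](std::size_t index) const { return readElement(index); }
    FileArrayProxy<T> operator[](std::size_t index)
    {
        return FileArrayProxy<T>(*this, index);
    }

private:
    friend class FileArrayProxy<T>;

    static std::streamoff offsetOf(std::size_t index)
    {
        return static_cast<std::streamoff>(sizeof(std::size_t) + index * sizeof(T));
    }
    T readElement(std::size_t index) const
    {
        T t;
        stream.seekg(offsetOf(index), std::ios::beg);
        stream.read(reinterpret_cast<char*>(&t), sizeof t);
        return t;
    }
    void storeElement(std::size_t index, const T& elem)
    {
        stream.seekp(offsetOf(index), std::ios::beg);
        stream.write(reinterpret_cast<const char*>(&elem), sizeof elem);
    }

    mutable std::fstream stream;   // reading is logically const, not bitwise const
    std::size_t max_size;
};

template <class T>
class FileArrayProxy
{
public:
    operator T() const { return fa.readElement(index); }
    FileArrayProxy& operator=(const T& t) { fa.storeElement(index, t); return *this; }
    // Read from one array element and write to the other.
    FileArrayProxy& operator=(const FileArrayProxy& p) { return *this = static_cast<T>(p); }
private:
    friend class FileArray<T>;
    FileArrayProxy(FileArray<T>& f, std::size_t i) : fa(f), index(i) {}
    FileArray<T>& fa;
    std::size_t index;
};
```

With this in place, ``FileArray<int> arr("file", 10);'' followed by ``arr[2] = 7;'' and ``int x = arr[2];'' works as one would expect from an ordinary array, with every access going to the file.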

That was it. Can you see what happens with the proxy? Let's analyze a small code snippet:

  1 FileArray<int> arr("file",10);
  2 arr[2]=0;
  3 int x=arr[2];
  4 arr[0]=arr[2];

On line two, ``arr.operator[](2)'' is called, which creates a ``FileArrayProxy<int>'' from ``arr'' with the index 2. The object, which is a temporary and does not have a name, has as its member ``fa'' a reference to ``arr'', and as its member ``index'' the value 2. On this temporary object, ``operator=(int)'' is executed. This operator in turn calls ``fa.storeElement(index, t),'' where ``index'' is still 2 and the value of ``t'' is 0. Thus, ``arr[2]=0'' ends up as ``arr.storeElement(2,0)''.

On line 3, a similar proxy is created through the call to ``operator[](2)''. This time, however, the ``operator int() const'' is called. This member function in turn calls ``fa.readElement(2)'' and returns its value, thus ``int x=arr[2]'' translates to ``int x=arr.readElement(2).''

On line 4, finally, ``arr[0]=arr[2]'' creates two temporary proxies, one referring to index 0, and one to index 2. The assignment operator is called, which in turn calls ``fa.storeElement(index,p)'', where p is the temporary proxy referring to element 2. Since ``storeElement'' wants an ``int,'' ``p.operator int() const'' is called, which calls ``arr.readElement(2).'' In other words ``arr[0] = arr[2]'' generates the code ``arr.storeElement(0, arr.readElement(2)).''

As you can see, the proxies don't add any new functionality, they're just syntactic sugar, albeit very useful. With them we can treat our file arrays very much like any kind of array. There's one thing we cannot do:

  int* p = &arr[2];
  int& x = arr[3];
  *p=2;
  x=5;

With ordinary arrays, the above would be legal and have well defined semantics, assigning arr[2] the value 2, and arr[3] the value 5. With our file array we cannot do this, but unfortunately the compiler does not prevent it (a decent compiler will warn that we're binding a constant or pointer to a temporary.) We'll mend that hole next month (think about how) and also add iterators, which will allow us to use the file arrays almost exactly like real ones.

In memory data formatting
One often faced problem is that of converting strings representing some data to that data, or vice versa. With the aid of ``istrstream'', ``ostrstream'' and ``strstream'', this is easy. For example, say we have a string containing digits, and want those digits as an integer. The thing to do is to create an ``istrstream'' object from the string. An example will explain:

  char* s = "23542";
  istrstream is(s);
  int x;
  is >> x;

After executing this snippet, ``x'' will have the value 23542. ``istrstream'' isn't much more exciting than that. ``ostrstream'' on the other hand is more exciting. There are two alternative uses for ``ostrstream.'' One where you have an array you want to store data in, and one where you want the ``ostrstream'' to create it for you, as needed (usually because you have no idea what size the buffer must have.) The former usage is like this:

  char buffer[24];
  ostrstream os(buffer, sizeof(buffer));
  double x=23.34;
  os << "x=" << x << ends;

The variable ``buffer'' will contain the string ``x=23.34'' after this snippet. The stream manipulator ``ends'' zero terminates the buffer. Zero termination is not done by default, since the stream cannot know where to put it, and besides you might not always want it.

The other variant, where you don't know how large a buffer you will need, is generally more useful (I think.)

  ostrstream os;
  double x=23.34, y=34.45;
  os << x << '*' << y << '=' << x*y << ends;
  const char* p = os.str();
  const size_t length=os.pcount();
  // work with p and length.
  os.freeze(0); // release the memory.

I think the example pretty much shows what this kind of usage does. The member function ``str'' returns a pointer to the internal buffer (which is then frozen, that is, the stream guarantees that it will not deallocate the buffer, nor overwrite it. Attempts to alter the stream while frozen will fail.) ``pcount'' returns the number of characters stored in the buffer. Last, ``freeze'' can either freeze the buffer, or ``unfreeze'' it. The latter is done by giving it a parameter with the value 0. I find this interface to be unfortunate. It's so easy to forget to release the buffer (by simply forgetting to call ``os.freeze(0)'') and that leads to a memory leak.

``strstream'' finally, is just like ``fstream'' the combined read/write stream.

The string streams can be found in the header <strstream.h> (or for some compilers <strstrea.h>.)

Standards update
With the C++ standard, a lot of things have changed regarding streams. As I mentioned already last month, the headers are actually <iostream> and <fstream>, and the names are std::istream, std::ostream, etc.

The streams are templatized too, which both makes life easier and not. The underlying type for std::ostream is:

  std::basic_ostream<class charT, class traits=std::char_traits<charT> >

``charT'' is the basic type for the stream. For ``ostream'' this is ``char'' (ostream is actually a typedef.) There's another typedef, ``std::wostream'', where the underlying type is ``wchar_t'', which on most systems probably will be 16-bit Unicode. The class template ``char_traits'' is a traits class which holds the type used for EOF, the value of EOF, and some other house keeping things.

Why the standard has removed the file stream open modes ios::noreplace and ios::nocreate is beyond me, as they're extremely useful.

Casting is ugly, and it's hard to see in large code blocks. There are four new cast operators, that are highly visible. They're (in approximate order of increasing danger,) dynamic_cast<T>, static_cast<T>, const_cast<T> and reinterpret_cast<T>. In the binary streaming seen in this article, reinterpret_cast<T> would be used, as a way of saying, ``Yeah, I know I'm violating type safety, but hey, I know what I'm doing, OK?'' The good thing about it is that it's so visible that anyone doubting it can easily spot the dangerous lines and have a careful look. The syntax is:

  os.write(reinterpret_cast<const char*>(&variable), sizeof(variable));

Finally, the generally useful strstreams have been replaced by ``std::istringstream'', ``std::ostringstream'' and ``std::stringstream'' (plus wide variants, std::wistringstream, etc.) defined in the header <sstream>. They do not operate on ``char*'', but on strings (there is a string class, or again, rather a string class template, where the most important template parameter is the underlying character.) ``std::ostringstream'' does not suffer from the freeze problem that ``ostrstream'' does.

Recap
The news this month were:
• streams dealing with files, or in-memory formatting, are used just the same way as the familiar ``cout'' and ``cin,'' which saves both learning and coding (the already written ``operator<<'' and ``operator>>'' can be used for all kinds of streams already.)
• streams can be used for binary, unformatted I/O too. This normally doesn't make sense for ``cout'' and ``cin'' or in-memory formatting (as the name implies,) but it's often useful when dealing with files.
• It is possible to move around in streams, at least file streams and in-memory formatting streams. It's generally not possible to move around in ``cin'' and ``cout.''
• proxy classes can be used to differentiate read and write operations for ``operator[]'' (the construction can of course be used elsewhere too, but it's most useful in this case.)
• friends break encapsulation in a way that, when done right, strengthens encapsulation.
• there's a difference between logical const and bitwise const, but the C++ compiler doesn't know and always assumes bitwise const.
• truly simple smart pointers can save some memory management house keeping, and also be used as a work around for compilers lacking ``mutable'' (i.e. the way of declaring a variable as non-const for const members, in other words, how to differentiate between logical and bitwise const.)
• streams can be used also for in-memory formatting of data.
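As a small supplement to the standards update, this is roughly how the in-memory formatting examples look with the standard string streams. It is only a sketch; the function names ``parseInt'' and ``formatProduct'' are made up for the illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: extracts an integer from the beginning of `text`.
// Returns false if no number could be read.
bool parseInt(const std::string& text, int& out)
{
    assert(!text.empty());
    std::istringstream is(text);
    return static_cast<bool>(is >> out);
}

// Hypothetical helper: formats "x*y=product" into a string.
// str() returns a copy of the contents, so there is no buffer to
// unfreeze and no ``ends'' is needed.
std::string formatProduct(double x, double y)
{
    std::ostringstream os;
    os << x << '*' << y << '=' << x * y;
    return os.str();
}
```

Since ``std::ostringstream'' manages its own memory and hands out a ``std::string'', the leak that was so easy to cause by forgetting ``freeze(0)'' cannot happen here.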

Exercises
• Improve the file array such that it accepts a ``stream&'' instead of a file name, and allows for several arrays in the same file.
• Improve the proxy such that ``int& x=arr[2]'' and ``int* p=&arr[1]'' become illegal.
• Add a constructor to the array that accepts only a ``size_t'' describing the size of the array, which creates a temporary file and removes it in its destructor.
• What happens if we instantiate ``FileArray'' with a user defined type? Is it always desirable? If not, what is desirable? If you cannot define what's desirable, how can instantiation with user defined types be banned?
• How can you, using the stream interface, calculate the size of a file?

Coming up
Next month will be devoted to improving the ``FileArray.'' We'll have iterators, allow arbitrary types, add error handling and more. I assume I won't need to tell you that it'll be possible to use the ``FileArray,'' just as ordinary arrays with generic programming, i.e. we can have the exact same source code for dealing with both!

[Note: the source code for this month is here. Ed.]

Last month a file based array template for truly huge amounts of data was introduced. While good, it was nowhere near our goals. Error handling was missing completely, making it dangerous to use in real life. There was no way to say how a user defined data type should be represented on disk, yet they weren't disallowed, which is a dangerous combination. It was also lacking iterators, something that is handy, and is an absolute requirement for generic programming with algorithms that are independent of the source of the data. On top of that, we'd really like the ability to store several different arrays in the same file, and also have an anonymous array which creates a temporary file and removes it when the array is destroyed. All of these will be dealt with this month, yet very little will be new. Instead it's time to make use of all the things learned so far in the course.

The data representation problem
In the file array as implemented last month, data was always stored in a raw binary format, exactly mirroring the bits as they lay in memory. This works fine for integers and such, but can be disastrous in other situations. Imagine a file array of strings (where string is a ``char*''). With the implementation from last month, the pointer value would be stored, not the data pointed to. When reading, a pointer value is read, and when dereferenced, whatever happens to be at the memory location pointed to (if anything) will be used (which is more than likely to result in a rather quick crash.) Anything with pointers is dangerous when stored in a raw binary format, yet we must somehow allow pointers in the array, and preferably so without causing problems for those using the array with built-in arithmetic types. How can this be done?

In part 4, when templates were introduced, a clever little construct called ``traits classes'' was shown. I then gave this rather terse description: ``A traits class is never instantiated, and doesn't contain any data. It just tells things about other classes, that is its sole purpose.'' Doesn't that smell like something we can use here? A traits class that tells how the data types should be represented on disk?

What do we need from such a traits class? Obviously, we need to know how much disk space each element will take, so a ``size'' member will definitely be necessary, otherwise we cannot know how much disk space will be required. We also need to know how to store the data, and how to read it. The easiest way is probably to have member functions ``writeTo'' and ``readFrom'' in the traits class. Thus we can have something looking like this:

  template <class T>
  class FileArrayElementAccess
  {
  public:
    static const size_t size;
    static void writeTo(T value, ostream& os);
    static T readFrom(istream& is);
  };

The array is then rewritten to use this when dealing with the data. The change is extremely minor.
``storeElement'' needs to be rewritten as:

template <class T>
void FileArray<T>::storeElement(size_t index, const T& element)
{
  // what if index >= array_size?
  typedef FileArrayElementAccess<T> traits;
  (*pstream).seekp(traits::size*index
                   +sizeof(array_size), ios::beg);
  // what if seek fails?
  traits::writeTo(element,*pstream);
  // what if write failed?
  // what if too much data was written?
}

The change for ``readElement'' is of course analogous. However, as indicated by the last comment, a new error possibility has shown up. What if the ``writeTo'' and ``readFrom'' members of the traits class are buggy and write or read more data than they're allowed to? Since it's the user of the array that must write the traits class (at least for their own data types) we cannot solve the problem, but we can give the user a chance to discover that something went wrong. Unfortunately for writing, the error is extremely severe; it means that the next entry in the array will have its data destroyed...

In the traits class, by the way, the constant ``size'', used for telling how many bytes in the stream each ``T'' will occupy, poses a problem with most C++ compilers today (modern ones mostly make life so much easier.) The problem is that a static variable, and also a static constant, in a class, needs to reside somewhere in memory, and the class declaration is not enough for that. This problem is two-fold. To begin with, where should it be stored? It's very much up to whoever writes the class, but somewhere in the code, there must be something like:

const size_t FileArrayElementAccess<X>::size = ...;

where ``X'' is the name of the class dealt with by the particular traits specialisation. The second problem is that this is totally unnecessary. What we want is a value that can be used by the compiler at compile time, not a memory location to read a value from. As I mentioned, a modern compiler does make this much easier. In standard C++ it is allowed to write:

template<>
class FileArrayElementAccess<X>
{
public:
  static const size_t size = ...;
  ...
};

Note that for some reason that I do not know, this construct is only legal if the constant is static and of an integral or enumeration type. ``size_t'' is such a type, it's some unsigned integral type, probably ``unsigned int'', but possibly ``unsigned long''. The expression denoted ``...'' must be possible to evaluate at compile time. Unless code is written that explicitly takes the address of ``size'', we need not give the constant any space to reside in. The odd construct ``template <>'' is also new C++ syntax, and means that what follows is a specialisation of a previously declared template.

For old compilers, however, there's a work-around for integral values no larger than the largest ``int'' value. We cheat and use an enum instead of a ``size_t''. This makes the declaration:

class FileArrayElementAccess<X>
{
public:
  enum { size= ... };
  ...
};

This is a bit ugly, but it is perfectly harmless.

The advantage gained by adding the traits class is flexibility and safety. If someone wants to use a file array for their own class, they're free to do so. However, they must first write a ``FileArrayElementAccess'' specialisation. Failure to do so will result in a compilation error. This early error detection is beneficial. The sloppy solution from last month would not yield any error until run-time, which means a (usually long) debugging session.
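To make the idea concrete, here is a minimal sketch of what a user-written specialisation might look like. Everything named ``Point'' and ``pointRoundTrips'' is hypothetical and not part of the article's sources; the sketch also assumes a standard-conforming compiler (``std::'' names, ``template <>'' and in-class initialisation of the constant), passes the value by const reference rather than by value, and uses an in-memory stream in place of a file.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>

// Primary template: declared but never defined, so using the array with
// a type that lacks a specialisation fails at compile time.
template <class T>
class FileArrayElementAccess;

// Hypothetical user type (an assumption for this sketch).
struct Point {
    int x;
    int y;
};

template <>
class FileArrayElementAccess<Point> {
public:
    // Each element occupies exactly two ints in the stream.
    static const std::size_t size = 2 * sizeof(int);

    static void writeTo(const Point& value, std::ostream& os) {
        os.write(reinterpret_cast<const char*>(&value.x), sizeof(int));
        os.write(reinterpret_cast<const char*>(&value.y), sizeof(int));
    }

    static Point readFrom(std::istream& is) {
        Point p = {0, 0};
        is.read(reinterpret_cast<char*>(&p.x), sizeof(int));
        is.read(reinterpret_cast<char*>(&p.y), sizeof(int));
        return p;
    }
};

// Writes one Point to an in-memory stream, checks that exactly
// traits::size bytes were produced, and reads the value back.
bool pointRoundTrips() {
    typedef FileArrayElementAccess<Point> traits;
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    const Point original = {3, -7};
    traits::writeTo(original, ss);
    const bool sizeOk = static_cast<std::size_t>(ss.tellp()) == traits::size;
    const Point copy = traits::readFrom(ss);
    return sizeOk && copy.x == 3 && copy.y == -7;
}
```

Comparing the put position against ``size'' after a write is one cheap way to notice the ``too much data was written'' problem mentioned above, although by then the damage is already done.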

Several arrays in a file
What is needed in order to host several arrays in the same file? One way or the other, there must be a mechanism for finding out where one array begins and another ends. I think the simplest solution is to let go of the file names, and instead make the constructors accept an ``fstream&''. We can then require that the put and get pointer of the stream must be where the array can begin, and we can in turn promise that the put and get pointer will be positioned at the byte after the array end. Of course, in addition to having a reference to the ``fstream'' in our class, we also need the ``home'' position, to seek relative to, when indexing the array. This becomes easy for us to write, and easy to use as well. For someone requiring only one array in a file, there'll be slightly more code; an ``fstream'' object must be explicitly initialised somewhere, and passed to the constructor of the array, instead of just giving it a name. I think the functionality increase/code expansion exchange is favorable.

In order to improve the likelihood of finding errors, we can waste a few bytes of disk space by writing a well known header and trailer pattern at the beginning and end of the array (before the first element, and after the last one.) If someone wants to allocate an array using an existing file, we can find out if the get pointer is in place for an array start.

The constructor creating a file should, however, first try to read from the file to see if it exists. If it does, the array should be initialised from the file, just like the constructor accepting a stream only does. If the read fails, however, we can safely assume that the file doesn't exist and that the array should instead be created.

The change in the class definition, and constructor implementation is relatively straight forward, if long:

template <class T>
class FileArray
{
public:
  FileArray(fstream& fs, size_t elements);
  // create a new file.
  FileArray(fstream& fs);
  // use an existing file and get size from there
  ...
private:
  void initFromFile(const char*);
  fstream& stream;
  size_t array_size; // in elements
  streampos home;
};

template <class T>
FileArray<T>::FileArray(fstream& fs, size_t elements)
  : stream(fs),
    array_size(elements)
{
  // what if the file could not be opened?

  // first try to read and see if there's a begin
  // pattern. Either there is one, or we should
  // get an eof.
  char pattern[6];
  stream.read(pattern,6);
  if (stream.eof()) {
    stream.clear(); // clear error state
                    // and initialise.

    // begin of array pattern.
    stream.write("ABegin",6);

    // must store size of elements, as last month
    const size_t elem_size
      =FileArrayElementAccess<T>::size;
    stream.write((const char*)&elem_size,
                 sizeof(elem_size));

    // and of course the number of elements
    stream.write((const char*)&array_size,
                 sizeof(array_size));

    // Now that we've written the maintenance
    // stuff, we know what the home position is.
    home = stream.tellp();

    // Then we must go to the end and write
    // the end pattern.
    stream.seekp(home+elem_size*array_size);
    stream.write("AEnd",4);

    // set put and get pointer to past the end pos.
    stream.seekg(stream.tellp());
    return;
  }

  initFromFile(pattern); // shared with other
                         // stream constructor
  if (array_size != elements) {
    // Uh oh. The data read from the stream,
    // and the size given in the constructor
    // mismatches! What now?
    stream.clear(ios::failbit);
  }

  // set put and get pointer to past the end pos.
  stream.seekp(stream.tellg());
}

template <class T>
FileArray<T>::FileArray(fstream& fs)
  : stream(fs)
{
  // First read the head pattern to see if
  // it's right.
  char pattern[6];
  stream.read(pattern,6);
  initFromFile(pattern); // stupid name for the
                         // member function.

  // set put and get pointer to past the end pos.
  stream.seekp(stream.tellg());
}

template <class T>
void FileArray<T>::initFromFile(const char* p)
{
  // Check if the read pattern is correct
  if (strncmp(p,"ABegin",6)) {
    // What to do? It was all wrong!
    stream.clear(ios::failbit); // for lack of better,
                                // set the fail flag.
    return;
  }

  // OK, we have a valid array, now let's see if
  // it's of the right kind.
  size_t elem_size;
  stream.read((char*)&elem_size,sizeof(elem_size));
  if (elem_size != FileArrayElementAccess<T>::size)
  {
    // wrong kind of array, the element sizes
    // mismatch. Again, what to do? Let's set
    // the fail flag for now.
    stream.clear(ios::failbit);
    return;
  }

  // Get the size of the array. Can't do much with
  // the size here, though.
  stream.read((char*)&array_size,sizeof(array_size));

  // Now we're past the header, so we know where the
  // data begins and can set the home position.
  home = stream.tellg();
  stream.seekg(home+elem_size*array_size);

  // Now positioned immediately after the last
  // element.
  char epattern[4];
  stream.read(epattern,4);
  if (strncmp(epattern,"AEnd",4)) {
    // Whoops, corrupt file!
    stream.clear(ios::failbit);
    return;
  }
  // Seems like we have a valid array!
}

Other than the above, the only change needed for the array is that seeking will be done relative to ``home'' rather than the beginning of the file (plus the size of the header entries.) The new versions of ``storeElement'' and ``readElement'' become:

template <class T>
T FileArray<T>::readElement(size_t index) const
{
  // what if index >= max_elements?
  typedef FileArrayElementAccess<T> traits;
  stream.seekg(home+index*traits::size);
  // what if seek fails?
  return traits::readFrom(stream);
  // what if read fails?
  // What if too much data is read?
}


template <class T>
void FileArray<T>::storeElement(size_t index, const T& element)
{
  // what if index >= array_size?
  typedef FileArrayElementAccess<T> traits;
  stream.seekp(home+traits::size*index);
  // what if seek fails?
  traits::writeTo(element,stream);
  // what if write failed?
  // what if too much data was written?
}
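The on-disk layout described in this section (begin pattern, element size, element count, the elements themselves, end pattern) can be tried out in isolation. The following is only a sketch under a few assumptions: the function names are hypothetical, an in-memory ``std::stringstream'' stands in for the file, and the element area is filled with zero bytes instead of being skipped with a seek, since seeking past the end of a string stream is not reliable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Marker patterns as used in the article; only the first 6 and 4 bytes
// are written, without the terminating zero.
static const char kBegin[] = "ABegin";
static const char kEnd[] = "AEnd";

// Writes header, zero-filled element area and trailer at the current put
// position. The put pointer ends up just past the trailer.
void writeEmptyArray(std::ostream& os, std::size_t elements,
                     std::size_t elemSize) {
    os.write(kBegin, 6);
    os.write(reinterpret_cast<const char*>(&elemSize), sizeof(elemSize));
    os.write(reinterpret_cast<const char*>(&elements), sizeof(elements));
    const std::string zeros(elements * elemSize, '\0');
    os.write(zeros.data(), static_cast<std::streamsize>(zeros.size()));
    os.write(kEnd, 4);
}

// Reads one array frame at the current get position. Returns true, with
// the get pointer just past the trailer, if both patterns match.
bool skipArray(std::istream& is, std::size_t& elements,
               std::size_t& elemSize) {
    char pattern[6];
    if (!is.read(pattern, 6) || std::memcmp(pattern, kBegin, 6) != 0)
        return false;
    if (!is.read(reinterpret_cast<char*>(&elemSize), sizeof(elemSize)))
        return false;
    if (!is.read(reinterpret_cast<char*>(&elements), sizeof(elements)))
        return false;
    const std::istream::pos_type home = is.tellg(); // start of the data
    is.seekg(home + static_cast<std::streamoff>(elements * elemSize));
    char trailer[4];
    if (!is.read(trailer, 4))
        return false;
    return std::memcmp(trailer, kEnd, 4) == 0;
}

// Two arrays written back to back can be located again one after the
// other, because each call leaves the pointer where the next one begins.
bool twoArraysShareOneStream() {
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    writeEmptyArray(ss, 10, sizeof(int));
    writeEmptyArray(ss, 3, sizeof(double));
    std::size_t n = 0;
    std::size_t sz = 0;
    const bool first = skipArray(ss, n, sz) && n == 10 && sz == sizeof(int);
    const bool second = skipArray(ss, n, sz) && n == 3 && sz == sizeof(double);
    return first && second;
}
```

The point of the sketch is the contract on the stream position: because every array leaves the pointers at the byte after its trailer, the next array can start right there.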

Temporary file array
Making use of a temporary file to store a file array that's not to be persistent between runs of the application isn't that tricky. The implementation so far makes use of a stream and known data about the beginning of the stream, number of elements and size of the elements. This can be used for the temporary file as well. The only thing we need to do is to create the temporary file first, open it with an fstream object, tie the stream reference to that object, and remember to delete the file in the destructor.

What's the best way of creating something and making sure we remember to undo it later? Well, of course, creating a new helper class which creates the file in its constructor and removes it in its destructor. Piece of cake. The only problem is that we shouldn't always create a temporary file, and when we do, we can handle it a bit differently from what we do with a ``global'' file that can be shared. For example, we know that we have exclusive rights to the file, and that it won't be reused, so there's no need for the extra information in the beginning and end.

So, how's a temporary file created? The C++ standard library adds nothing of its own here, and neither is there any support for it in the old de-facto standard. What we have comes from C, in <stdio.h>: the standard function ``tmpnam'', and ``tempnam'', which is not part of standard C but is a commonly supported extension (it is specified by POSIX.) I have in this implementation chosen to use ``tempnam'' as it's more flexible. ``tempnam'' works like this: it accepts two string parameters named ``dir'' and ``prefix''. It first attempts to generate a file name in the directory pointed to by the environment variable ``TMPDIR''. If that fails, it tries the directory indicated by the ``dir'' parameter, unless it's 0, in which case a hard-coded default is attempted. It returns a ``char*'' indicating a name to use. The memory area pointed to is allocated with the C function ``malloc'', and thus must be deallocated with ``free'' and not delete[].

Over to the implementation details: We add a class called ``temporaryfile'', which does the above mentioned work. We also add a member variable ``pfile'' which is of type ``ptr<temporaryfile>''. Remember the ``ptr'' template from last month? It's a smart pointer that deallocates whatever it points to in its destructor. It's important that the member variable ``pfile'' is listed before the ``stream'' member, since initialisation is done in the order listed, and the ``stream'' member must be initialised from the file object owned by ``pfile''. We also add a constructor with the number of elements as its sole parameter, which makes use of the temporary file.

class temporaryfile
{
public:
  temporaryfile();
  ~temporaryfile();
  iostream& stream();
private:
  char* name;
  fstream fs;
};

temporaryfile::temporaryfile()
  : name(::tempnam(".","array")),
    fs(name, ios::in|ios::out|ios::binary)
{
  // what if tempnam fails and name is 0
  // what if fs is bad?
}

temporaryfile::~temporaryfile()
{
  fs.close();
  ::remove(name);
  // what if remove fails?
  ::free(name);
}

In the above code, ``tempnam'', ``remove'' and ``free'' are prefixed with ``::'', to make sure that it's the names in global scope that are meant, just in case someone enhances the class with a few more member functions whose names might clash.

For the sake of syntactical convenience, I have added yet another operator to the ``ptr'' class template:

template <class T>
class ptr
{
public:
  ptr(T* tp=0) : p(tp) {};
  ~ptr() { delete p; };
  T* operator->(void) const { return p; };
  T& operator*(void) const { return *p; };
private:
  ptr(const ptr<T>&);
  ptr<T>& operator=(const ptr<T>&);
  T* p;
};

It's the ``operator->'' that's new, which allows us to write things like ``p->x,'' where p is a ``ptr<X>'', and the type ``X'' contains some member named ``x''. The return type for ``operator->'' must be something that ``operator->'' can be applied to. The explanation sounds recursive, but it makes sense if you look at the above code.

``ptr<X>::operator->()'' returns an ``X*''. ``X*'' is something you can apply the built in ``operator->'' to (which gives you access to the elements.)

template <class T>
FileArray<T>::FileArray(size_t elements)
  : pfile(new temporaryfile),
    stream(pfile->stream()),
    array_size(elements),
    home(stream.tellg())
{
  const size_t elem_size=
    FileArrayElementAccess<T>::size;

  // put a char just after the end to make
  // sure there's enough free disk space.
  stream.seekp(home+array_size*elem_size);
  char c;
  stream.write(&c,1);
  // what to do if write fails?

  // set put and get pointer to past the end pos
  stream.seekg(stream.tellp());
}

That's it! The rest of the array works exactly as before. No need to rewrite anything else.
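The ``create in the constructor, remove in the destructor'' idea behind the temporary file helper can be tried out on its own. The sketch below is not the article's ``temporaryfile'' class: the name ``ScopedFile'' and the helper functions are hypothetical, the caller supplies the file name (which sidesteps the question of how a unique name is obtained), and ``ios::trunc'' is added on the assumption of a standard-conforming fstream, which otherwise refuses to open a non-existent file for both reading and writing.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: owns a file for exactly as long as the object lives.
class ScopedFile {
public:
    explicit ScopedFile(const std::string& path)
        : name(path),
          fs(path.c_str(), std::ios::in | std::ios::out |
                           std::ios::binary | std::ios::trunc) {}

    ~ScopedFile() {
        fs.close();
        std::remove(name.c_str()); // a failure is ignored in this sketch
    }

    std::iostream& stream() { return fs; }
    bool ok() const { return fs.is_open(); }

private:
    ScopedFile(const ScopedFile&);            // not copyable
    ScopedFile& operator=(const ScopedFile&); // not assignable

    std::string name;
    std::fstream fs;
};

// True if a file with the given name can be opened for reading.
bool fileExists(const std::string& path) {
    std::ifstream probe(path.c_str());
    return probe.is_open();
}

// The file exists while the ScopedFile object is alive and is gone
// once the object has been destroyed.
bool scopedFileIsRemoved() {
    const std::string path = "scoped_file_demo.tmp"; // assumed writable
    bool existedInside = false;
    {
        ScopedFile f(path);
        f.stream() << "x";
        f.stream().flush();
        existedInside = f.ok() && fileExists(path);
    }
    return existedInside && !fileExists(path);
}
```

Because the removal lives in a destructor, it also happens when the enclosing scope is left through an exception, which is exactly why a helper class is preferable to a hand-written clean-up call.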

Code reuse
If you're an experienced C programmer, especially experienced with programming embedded systems where memory constraints are tough, and you also have a good memory, you might get a feeling that something's wrong here. What I'm talking about is something I mentioned the first time templates were introduced: ``Templates aren't source code. The source code is generated by the compiler when needed.'' This means that if we in a program use FileArray<int>, FileArray<double>, FileArray<X> and FileArray<Y> (where ``X'' and ``Y'' are some classes,) there will be code for all four types. Now, have a close look at the member functions and see in what way ``FileArray<int>::FileArray(iostream& fs, size_t elements)'' differs from ``FileArray<char>::FileArray(iostream& fs, size_t elements)''. Please do compare them. What did you find? The only difference at all is in the handling of the member ``elem_size'', yet the same code is generated several times with that as the only difference. This is what is often referred to as the template code bloat of C++. We don't want code bloat. We want fast, tight, and slick applications.

Since the only thing that differs is the size of the elements, we can move the rest to something that isn't templatised, and use that common base everywhere. I've already shown how code reuse can be done by creating a separate class and having a member variable of that type. In this article I want to show an alternative way of reusing code, and that is through inheritance. Note very carefully that I did not say public inheritance. Public inheritance models ``is-A'' relationships only. We don't want an ``is-A'' relationship here. All we want is to reuse code to reduce code bloat. This is done through private inheritance. Private inheritance is used far less than it should be. Here's all there is to it: create a class with the desired implementation to reuse and inherit privately from it. Nothing more, nothing less.
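Before turning to the file array itself, the technique can be shown in miniature. The sketch below is an illustration only, with hypothetical names (``RawSlots'', ``Slots'') and an in-memory byte buffer instead of a file: all work that depends only on the element size sits in a non-template base, and the class template on top of it merely forwards. It is only valid for types that can be copied byte by byte, and it does no bounds checking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Non-template base: compiled once, no matter how many element types
// are used. It knows sizes, not types.
class RawSlots {
protected:
    RawSlots(std::size_t count, std::size_t elemSize)
        : bytes(count * elemSize), n(count), esize(elemSize) {}

    void put(std::size_t index, const void* src) {
        std::memcpy(&bytes[index * esize], src, esize);
    }
    void get(std::size_t index, void* dst) const {
        std::memcpy(dst, &bytes[index * esize], esize);
    }
    std::size_t size() const { return n; }

private:
    std::vector<unsigned char> bytes;
    std::size_t n;
    std::size_t esize;
};

// Thin template layer: only these few forwarding functions are generated
// per element type. Private inheritance keeps RawSlots out of the interface.
template <class T>
class Slots : private RawSlots {
public:
    explicit Slots(std::size_t count) : RawSlots(count, sizeof(T)) {}

    void store(std::size_t index, const T& value) { put(index, &value); }

    T read(std::size_t index) const {
        T value;
        get(index, &value);
        return value;
    }

    std::size_t size() const { return RawSlots::size(); }
};

// Exercises two instantiations that share the same base implementation.
bool slotsWork() {
    Slots<int> a(4);
    Slots<double> b(2);
    a.store(3, 42);
    b.store(0, 1.5);
    return a.read(3) == 42 && b.read(0) == 1.5 &&
           a.size() == 4 && b.size() == 2;
}
```

A statement such as ``RawSlots* p = &a;'' outside the class would not compile, which is the whole point: users of ``Slots<int>'' cannot see, or depend on, how it is implemented.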
To a user of your class, it matters not at all if you chose not to reuse code at all, reuse through encapsulation of a member variable, or reuse through private inheritance. It's not possible to refer to the descendant class through a pointer to the private base class; private inheritance is an implementation detail only, and not an interface issue.

To the point. What can, and what can not be isolated and put in a private base class? Let's first look at the data. The ``stream'' reference member can definitely be moved to the base, and so can the ``pfile'' member for temporary files. The ``array_size'' member can safely be there too and also the ``home'' member for marking the beginning of the array on the stream. By doing that alone we have saved just about nothing at all, but if we add as a data member in the base class the size (on disk) for the elements, and we initialise that member through the ``FileArrayElementAccess<T>::size'' traits member, all seeking in the file, including the initial seeking when creating the file array, can be moved to the base class. Now a lot has been gained. Left will be very little. Let's look at the new improved implementation, beginning with the declaration of the base class.

class FileArrayBase
{
public:
protected:
  FileArrayBase(iostream& io,
                size_t elements,
                size_t elem_size);
  FileArrayBase(iostream& io);
  FileArrayBase(size_t elements, size_t elem_size);
  iostream& seekp(size_t index) const;
  iostream& seekg(size_t index) const;
  size_t size() const; // number of elements
  size_t element_size() const;
private:
  class temporaryfile
  {
  public:
    temporaryfile();
    ~temporaryfile();
    iostream& stream();
  private:
    char* name;
    fstream fs;
  };
  void initFromFile(const char* p);
  ptr<temporaryfile> pfile;
  iostream& stream;
  size_t array_size;
  size_t e_size;
  streampos home;
};

The only surprise here should be the nesting of the class ``temporaryfile.'' Yes, it's possible to define a class within a class. Since the ``temporaryfile'' class is defined in the private section of ``FileArrayBase'', it's inaccessible from anywhere other than the ``FileArrayBase'' implementation. It's actually possible to nest classes in class templates as well, but few compilers today support that. When implementing the member functions of the nested class, it looks a bit ugly, since the surrounding scope must be used.

FileArrayBase::temporaryfile::temporaryfile()
  : name(::tempnam(".","array")),
    fs(name,ios::in|ios::out|ios::binary)
{
  // what if tempnam fails and name is 0
  // what if fs is bad?
}

FileArrayBase::temporaryfile::~temporaryfile()
{
  fs.close();
  ::remove(name); // What if remove fails?
  ::free(name);
}

iostream& FileArrayBase::temporaryfile::stream()
{
  return fs;
}

The implementation of ``FileArrayBase'' is very similar to the ``FileArray'' earlier. The only difference is that we use a parameter for the element size, instead of the traits class. To make life a little bit easier, I've assumed two arrays of char named ``ArrayBegin'' and ``ArrayEnd'', which hold the patterns to be used for marking the beginning and end of an array on disk.

FileArrayBase::FileArrayBase(iostream& io,
                             size_t elements,
                             size_t elem_size)
  : stream(io),
    array_size(elements),
    e_size(elem_size)
{
  char pattern[sizeof(ArrayBegin)];
  stream.read(pattern,sizeof(pattern));
  if (stream.eof()) {
    stream.clear(); // clear error state
                    // and initialize.

    // begin of array pattern.
    stream.write(ArrayBegin,sizeof(ArrayBegin));

    // must store size of elements
    stream.write((const char*)&elem_size,
                 sizeof(elem_size));

    // and of course the number of elements
    stream.write((const char*)&array_size,
                 sizeof(array_size));

    // Now that we've written the maintenance
    // stuff, we know what the home position is.
    home = stream.tellp();

    // Then we must go to the end and write
    // the end pattern.
    stream.seekp(home+elem_size*array_size);
    stream.write(ArrayEnd,sizeof(ArrayEnd));

    // set put and get pointer to past the end pos.
    stream.seekg(stream.tellp());
    return;
  }

  initFromFile(pattern); // shared with other
                         // stream constructor
  if (array_size != elements) {
    // Uh oh. The data read from the stream,
    // and the size given in the constructor
    // mismatches! What now?
    stream.clear(ios::failbit);
  }
  if (e_size != elem_size) {
    stream.clear(ios::failbit);
  }

  // set put and get pointer to past the end pos.
  stream.seekp(stream.tellg());
}

FileArrayBase::FileArrayBase(iostream& io)
  : stream(io)
{
  char pattern[sizeof(ArrayBegin)];
  stream.read(pattern,sizeof(pattern));
  initFromFile(pattern);

  // set put and get pointer to past the end pos.
  stream.seekp(stream.tellg());
}

FileArrayBase::FileArrayBase(size_t elements,
                             size_t elem_size)
  : pfile(new temporaryfile),
    stream(pfile->stream()),
    array_size(elements),
    e_size(elem_size),
    home(stream.tellg())
{
  stream.seekp(home+array_size*e_size);
  char c;
  stream.write(&c,1);

  // set put and get pointer to past the end pos.
  stream.seekg(stream.tellp());
}

void FileArrayBase::initFromFile(const char* p)
{
  // Check if the read pattern is correct
  if (strncmp(p,ArrayBegin,sizeof(ArrayBegin))) {
    // What to do? It was all wrong!
    stream.clear(ios::failbit); // for lack of better,
                                // set the fail flag.
    return;
  }

  // OK, we have a valid array, now let's see if
  // it's of the right kind.
  stream.read((char*)&e_size,sizeof(e_size));

  // Get the size of the array. Can't do much with
  // the size here, though.
  stream.read((char*)&array_size,sizeof(array_size));

  // Now we're past the header, so we know where the
  // data begins and can set the home position.
  home = stream.tellg();
  stream.seekg(home+e_size*array_size);

  // Now positioned immediately after the last
  // element.
  char epattern[sizeof(ArrayEnd)];
  stream.read(epattern,sizeof(epattern));
  if (strncmp(epattern,ArrayEnd,sizeof(ArrayEnd))) {
    // Whoops, corrupt file!
    stream.clear(ios::failbit);
    return;
  }
  // Seems like we have a valid array!
}

iostream& FileArrayBase::seekg(size_t index) const
{
  // what if index is out of bounds?
  stream.seekg(home+index*e_size);
  // what if seek failed?
  return stream;
}

iostream& FileArrayBase::seekp(size_t index) const
{
  // What if index is out of bounds?
  stream.seekp(home+index*e_size);
  // What if seek failed?
  return stream;
}

size_t FileArrayBase::size() const
{
  return array_size;
}

size_t FileArrayBase::element_size() const
{
  return e_size;
}

Apart from the tricky questions, it's all pretty straight forward. The really good news, however, is how easy this makes the implementation of the class template ``FileArray''. Now watch this!

template <class T>
class FileArray : private FileArrayBase
{
public:
  FileArray(iostream& io, size_t size); // create one.
  FileArray(iostream& io); // use existing array
  FileArray(size_t elements); // create temporary
  T operator[](size_t index) const;
  FileArrayProxy<T> operator[](size_t index);
  size_t size() { return FileArrayBase::size(); };
private:
  FileArray(const FileArray<T>&); // illegal
  FileArray<T>& operator=(const FileArray<T>&); // illegal
  T readElement(size_t index) const;
  void storeElement(size_t index, const T& elem);
  friend class FileArrayProxy<T>;
};

template <class T>
FileArray<T>::FileArray(iostream& io, size_t size)
  : FileArrayBase(io,
                  size,
                  FileArrayElementAccess<T>::size)
{
}

template <class T>
FileArray<T>::FileArray(iostream& io)
  : FileArrayBase(io)
{
  // what if element_size is wrong?
}

template <class T>
FileArray<T>::FileArray(size_t elements)
  : FileArrayBase(elements,
                  FileArrayElementAccess<T>::size)
{
}

template <class T>
T FileArray<T>::operator[](size_t index) const
{
  // what if index>= size()?
  return readElement(index);
}

template <class T>
FileArrayProxy<T> FileArray<T>::operator[](size_t index)
{
  // what if index>= size()?
  return FileArrayProxy<T>(*this, index);
}

template <class T>
T FileArray<T>::readElement(size_t index) const
{
  // what if index>= size()?
  iostream& s = seekg(index); // parent seekg
  // what if seek fails?
  T t = FileArrayElementAccess<T>::readFrom(s);
  // what if read failed?
  // What if too much data was read?
  return t;
}

template <class T>
void FileArray<T>::storeElement(size_t index, const T& element)
{
  // what if index>= size()?
  iostream& s = seekp(index); // parent seekp
  // what if seek fails?
  FileArrayElementAccess<T>::writeTo(element,s);
  // what if write failed?
  // What if too much data was written?
}

How much easier can it get? This reduces code bloat, and also makes the source code easier to understand, extend and maintain.

What can go wrong?
Already in the very beginning of this article series, part 1, I introduced exceptions, the C++ error handling mechanism. Of course exceptions should be used to handle the error situations that can occur in our array class. When I introduced exceptions, I didn't tell the whole truth about them. There was one thing I didn't tell, because at that time it wouldn't have made much sense. That one thing is that when exceptions are caught, dynamic binding works, or to use wording slightly more English-like, we can create exception class hierarchies with public inheritance, and we can choose what level to catch. Here's a mini example showing the idea:

class A {};
class B : public A {};
class C : public A {};
class B1 : public B {};

void f() throw (A); // may throw any of the above

void x()
{
  try {
    f();
  }
  catch (B& b) {
    // **1
  }
  catch (C& c) {
    // **2
  }
  catch (A& a) {
    // **3
  }
}

At ``**1'' above, objects of class ``B'' and class ``B1'' are caught if thrown from ``f''. At ``**2'' objects of class ``C'' (and descendants of C, if any are declared elsewhere) are caught. At ``**3'' all others from the ``A'' hierarchy are caught.

This may seem like a curious detail of purely academic worth, but it's extremely useful. We can use abstraction levels for errors. For example, we can have a root class ``FileArrayException'', from which all other exceptions regarding the file array inherit. We can see that there are clearly two kinds of errors that can occur in the file array: abuse, and environmental issues outside the control of the programmer. By abuse I mean things like indexing outside the valid bounds, and by environmental issues I mean faulty or full disks (Since there are several programs running, a check if there's enough disk space is still taking a chance. Even if there was enough free space when the check was made, that space may be occupied when the next statement in the program is executed.) A reasonable start for the exception hierarchy then becomes:

class FileArrayException {};
class FileArrayLogicError : public FileArrayException {};
class FileArrayRuntimeError : public FileArrayException {};

Here ``FileArrayLogicError'' is for clear violations of the not too clearly stated preconditions, and ``FileArrayRuntimeError'' for things that the programmer may not have a chance to do something about. In a perfectly debugged program, the only exceptions ever thrown from file arrays will be of the ``FileArrayRuntimeError'' kind. We can divide those further into:

class FileArrayCreateError : public FileArrayRuntimeError {};

For whenever the creation of the array fails, regardless of why (it's not very easy to find out if it's a faulty disk or lack of disk space.)

class FileArrayStreamError : public FileArrayRuntimeError {};

If after creation, something goes wrong with a stream, for example if seeking or reading/writing fails.

class FileArrayDataCorruptionError : public FileArrayRuntimeError {};

If an array is created from an old existing file, and we note that the header or trailer doesn't match the expected.

class FileArrayBoundsError : public FileArrayLogicError {};

Addressing outside the legal bounds.

class FileArrayElementSizeError : public FileArrayLogicError {};

If the read/write members of the element access traits class are faulty and either write too much (thus overwriting the data for the next element) or read too much (in which case the last few bytes read will be garbage picked from the next element.)

It's of course possible to take this even further. I think this is quite enough, though. Now we have a reasonably fine level of error reporting, yet an application that wishes a coarse level of error handling can choose to catch the higher levels of the hierarchy only.

As an exercise, I invite you to add the throws to the code. Beware, however, it's not a good idea to add exception specifications to the member functions making use of the T's (since you cannot know which operations on T's may throw, and what they do throw.) You can increase the code size and legibility gain from the private inheritance of the implementation in the base by putting quite a lot of the error handling there.

Iterators
An iterator into a file array is something whose behavior is analogous to that of pointers into arrays. We want to be able to create an iterator from the array (in which case the iterator refers to the first element of the array.) We want to access that element by dereferencing the iterator (unary operator *,) and we want iterator arithmetic with integers. An easy way of getting there is to let an iterator contain a pointer to a file array, and an index. Whenever the iterator is dereferenced, we return (*array)[index]. That way we even get error handling, for iterator arithmetic that leads us outside the valid range of the array, for free from the array itself. The iterator arithmetics becomes simple too,
since it's just ordinary arithmetics on the index type. The implementation thus seems easy; all that's needed is to define the operations needed for the iterators, and the actions we want. Here's my idea:
• creation from array yields iterator referring to first element.
• copy construction and assignment are of course well behaved.
• moving forwards and backwards with operator++ and operator--.
• addition of array and ``long int'' value ``n'' yields iterator referring to n:th element of array.
• iterator+=n (where n is of type long int) adds n to the value of the index in the iterator. This addition is never an error; it's dereferencing the iterator that's an error if the index is out of range. Operator -= is analogous.
• iterator+n yields a new iterator referring to the iterator.index+n:th element of the array, and analogous for operator-.
• iterator1-iterator2 yields a long int which is the difference between the indices of the iterators. If the iterators refer to different arrays, it's an error and we throw an exception.
• iterator1==iterator2 returns non-zero if the arrays and indices of iterator1 and iterator2 are equal.
• iterator1!=iterator2 returns !(iterator1==iterator2)
• *iterator returns whatever (*array)[index] returns, i.e. a FileArrayProxy.
• iterator[n] returns (*array)[index+n].
• iterator1<iterator2 returns true if the iterators refer to the same array and iterator1.index < iterator2.index. If iterator1 and iterator2 refer to different arrays, it's an error and we throw an exception. Likewise for operator>.
• iterator1>=iterator2 returns !(iterator1<iterator2). Likewise for operator<=.

I think the above is an exhaustive list. None of the above is difficult. It's just a lot of code to write, and thus a good chance of making errors. With a little thought, however, quite a lot of code can be reused over and over, thus reducing the amount to write and also the risk for errors. As an example, a rule of thumb when writing a class for which, given an object ``o'' and some other value ``v'', the operations ``o+=v'', ``o+v'' and ``v+o'' are well defined and behave like they do for the built in types (which they really ought to, unless you want to give the class users some rather unhealthy surprises) is to define ``operator+='' as a member of the class, and two versions of operator+ that are implemented with ``operator+=''. Here's how it's done in the iterator example:

template <class T>
class FileArrayIterator
{
public:
  FileArrayIterator(FileArray<T>& f);
  FileArrayIterator(const FileArrayIterator<T>& i);
  FileArrayIterator<T>& operator+=(long n);
  FileArrayProxy<T> operator*();
  FileArrayProxy<T> operator[](long n);
  ...
private:
  FileArray<T>* array;
  unsigned long index;
};

template <class T>
FileArrayIterator<T> operator+(const FileArrayIterator<T>& i,
                               long n);

template <class T>
FileArrayIterator<T> operator+(long n,
                               const FileArrayIterator<T>& i);

template <class T>
FileArrayIterator<T>::FileArrayIterator(
  FileArray<T>& a
)
  : array(&a),
    index(0)
{
}

template <class T>
FileArrayIterator<T>::FileArrayIterator(
  const FileArrayIterator<T>& i
)
  : array(i.array),
    index(i.index)
{
}

template <class T>
FileArrayIterator<T>&
FileArrayIterator<T>::operator+=(long n)
{
  index+=n;
  return *this;
}

template <class T>
FileArrayIterator<T> operator+(const FileArrayIterator<T>& i,
                               long n)
{
  FileArrayIterator<T> it(i);
  return it+=n;
}

template <class T>
FileArrayIterator<T> operator+(long n,
                               const FileArrayIterator<T>& i)
{
  FileArrayIterator<T> it(i);
  return it+=n;
}

template <class T>
FileArrayProxy<T> FileArrayIterator<T>::operator*()
{
  return (*array)[index];
}

template <class T>
FileArrayProxy<T> FileArrayIterator<T>::operator[](long n)
{
  return (*array)[index+n];
}

Surely, the code for the two versions of ``operator+'' must be written, but since its behaviour is defined in terms of ``operator+='' it means that if we have an error, there's only one place to correct it. There's no need to display all the code here in the article, you can study it in the sources. The above shows how it all works, and as you can see, it's fairly simple.

Recap
This month the news in short was:
• You can increase flexibility for your templates without sacrificing ease of use or safety by using traits classes.
• Enumerations in classes can be used to have class-scope constants of integral type.
• Modern compilers do not need the above hack. Defining a class-scope static constant of an integral type in the class declaration is cleaner and more type safe.
• Standard C++ adds no support of its own for temporary files, and standard C offers little beyond ``tmpnam''. Fortunately there are commonly supported extensions, such as ``tempnam'', that are more flexible.
• Private inheritance can be used for code reuse.
• Private inheritance is very different from public inheritance. Public inheritance models ``is-A'' relationships, while private inheritance models ``is-implemented-in-terms-of'' relationships.
• A user of a class that has privately inherited from something else cannot take advantage of this fact. To a user the private inheritance doesn't make any difference.
• Private inheritance is in real-life used far less than it should be. In many situations where public inheritance is used, private inheritance should've been used.
• Exception catching is polymorphic (i.e. dynamic binding works when catching.)

• The polymorphism of exception catching allows us to create an arbitrarily fine-grained error reporting mechanism while still allowing users who want a coarse error reporting mechanism to use one (they'll just catch classes near the root of the exception class inheritance tree).
• Always implement binary operator+, operator-, operator* and operator/ as functions outside the classes, and always implement them in terms of the operator+=, operator-=, operator*= and operator/= members of the classes.

Exercises
• Alter the file array such that it's possible to instantiate two (or more) kinds of FileArray<X> in the same program, where the alternatives store the data in different formats. (Hint: the alternatives will all need different traits class specialisations.)
• What's the difference between using private inheritance of a base class, and using a member variable of that same class, for reusing code? In which situations is it crucial which alternative you choose?

Coming up
Next month we'll have a look at smart pointers. I'm beginning to dry up on topics now, so please write and give me suggestions for future topics to cover.

[Note: the source code for this month is here. Ed.]

In the past two articles, we've seen how a simple smart pointer, called simply ``ptr<T>'', was used to make memory handling a little bit easier. That is the core purpose of all smart pointers. You can find a few things in common with them all. They're templates, their syntax resembles that of pointers, they aren't pointers, they relieve you of the burden of remembering to deallocate the memory, and they're dangerous if you forget that you're dealing with smart pointers. While ``ptr<T>'' served its purpose, however, it's a bit too simplistic to be generally useful. For example there was no way to rebind an object to another pointer, or to tell it not to delete the memory (that too can be useful at times, as we will see later in this article). This article is devoted to the only smart pointer provided by the standard C++ library, the class template ``auto_ptr<T>''.

The problem to solve
I do not know what the core issues were when the ``auto_ptr<T>'' was designed, but I know what problems the implementation provided does solve.

• exception safety
• safe memory area ownership transfer
• no confusion with normal pointers
• controlled and visible rebinding and release of ownership
• works with dynamic types
• pointer-like syntax for pointer-like behaviour

Let us have a look at each of these in some detail and compare with the previous ``ptr<T>''.

Exception safety
In this respect ``auto_ptr<T>'' and ``ptr<T>'' are equal. They both delete whatever they point to in their destructor. The only thing that ``auto_ptr<T>'' has to offer over ``ptr<T>'', with respect to exception safety, is that we can tell an ``auto_ptr<T>'' object that it no longer owns a memory area. This can be used for holding onto something we want to return in normal cases, but deallocate in exceptional situations. Here is a code fragment showing such a situation:

  void f(); // may throw something

  int* ptr()
  {
    // returns a newly allocated value on success,
    // or the 0 pointer on failure.
    try {
      auto_ptr<int> p(new int(1));
      f();
      return p.release();
    }
    catch (...) {
      return 0;
    }
  }

In the code above, an exception thrown from ``f'' results in the destruction of the ``auto_ptr<int>'' object ``p'' before the call of the ``release'' member function, which means that the object pointed to will be deleted. If, however, ``f'' does not throw any exception, ``p.release()'' is called, and the value returned from there is passed to the caller. The member function ``release'' releases ownership of the memory area from the ``auto_ptr<int>'' object. The value returned is the pointer to the memory area. Since the object no longer owns the memory area, it will not be deallocated.

Safe memory area ownership transfer
One of the headaches of using dynamically allocated memory is knowing who is responsible for deallocating the memory at any given moment in a program's lifetime. The ``auto_ptr<T>'' makes that rather easy, as can be seen above. This safety is achieved by cheating in the ``assignment'' operator and ``copy'' constructor. The reason I've quoted the names assignment and copy, is that the cheat is so bad that it's not really an assignment and definitely not a copy. Rather, both are ownership transfer operations. What happens is that they both modify the right hand side, by relieving it of ownership while accepting ownership itself. Below are some examples of this:

  // simple transfer
  auto_ptr<int> p1(new int(1)); // p1 owns the memory area.
  auto_ptr<int> p2;             // p2 doesn't own anything.
  p2 = p1;                      // now p2 owns the memory area, p1 doesn't.
                                // p1 has become the 0 "pointer"
  auto_ptr<int> p3(p2);         // Now it is p3 that owns the memory area, not
                                // p1 or p2.

An important issue here for those of you who have used early versions of the ``auto_ptr<T>'' is that older versions did not become 0 ``pointers'' when not owning the memory area, but that is the behaviour set in the final draft of the C++ standard.

Of course, the above program snippet is too simplistic to be useful. The properties of the ``auto_ptr<T>'' are more useful when working with functions, where it works as both documentation and implementation of ownership transfer. Any function that returns an ``auto_ptr<T>'' leaves it to the caller to take care of the deallocation, and any function that accepts an ``auto_ptr<T>'' requires ownership to work. What do you think about this?

  auto_ptr<int> creation();
  void termination(auto_ptr<int> rip);

  void f()
  {
    auto_ptr<int> pi=creation();
    // It's now clear that we are responsible for
    // deletion of the memory area allocated. Even
    // if we chose to release ownership from "pi",
    // we must take care of deallocation somehow.
    ...
    // use pi for something
    ...
    termination(pi);
    // since we're sending it off as an auto_ptr<T>,
    // and since the function "termination" wants an
    // auto_ptr<T> it wants the responsibility,
    // deletion is not our headache anymore.
  }

I think the above example speaks for itself. If something goes wrong between calling ``creation'' and ``termination'', such that ``termination'' ought not be called, we must take care of the deallocation, but since we have it in an ``auto_ptr<T>'' that is automatically done for us if we return or throw an exception. This functionality is an advantage that ``auto_ptr<T>'' offers over ``ptr<T>'', since the latter doesn't have any way of transferring ownership.

No confusion with normal pointers
Since the auto pointers have the behaviour outlined above, it is extremely important that they cannot accidentally be confused with normal pointers. This is done by explicitly prohibiting all implicit conversions between pointer types and ``auto_ptr<T>'' types. The erroneous code below shows how:

  auto_ptr<int> creator();
  void termination(auto_ptr<int>);

  void f()
  {
    int* p = creator();  // illegal. An auto_ptr<T> cannot be implicitly
                         // converted to a pointer.
    auto_ptr<int> ap=p;  // also illegal. A pointer cannot be implicitly
                         // converted to an auto_ptr<T>. When creating a
                         // new auto_ptr<T> object, use the constructor
                         // syntax auto_ptr<T> ap(p).
    ap=p;                // also illegal. If you want to rebind an
                         // auto_ptr<T> object to point to something else,
                         // use the "reset" member function,
                         // ap.reset(p).
    termination(p);      // illegal. The auto_ptr<T> required by the
                         // termination function cannot be implicitly
                         // created from the pointer.
  }

It is indeed fortunate that the first and last error above are illegal. What would the first mean? Would the implicit conversion from ``auto_ptr<T>'' to a raw pointer transfer ownership or not? All implementations I have seen where such implicit conversions are allowed do not transfer the ownership, which in the situation above means that the memory would be deallocated when the ``auto_ptr<T>'' object returned is destroyed (which it would be immediately after the conversion). The last is as bad. What about this situation?

  int i;
  termination(&i);

Ouch! The function would attempt to delete the local variable. For both the result will be something that in standardese is called ``undefined behaviour'', but which in normal English best translates to ``a crash now, a crash later, or generally funny behaviour (possibly followed by a crash later).'' Well, since it is illegal, we do not have to worry about it.

The second error ``auto_ptr<int> ap=p;'' is perhaps a bit unfortunate since the intended behaviour is clear. That it is illegal comes as a natural consequence of banning the third situation ``ap=p'' which is not clear. ``ap'' might be declared somewhere far far away, so that in the code near the assignment it is not obvious if it is an ``auto_ptr<T>'' or a normal pointer. Imagine the maintenance headaches you could get otherwise. In this respect ``auto_ptr<T>'' is better than ``ptr<T>'', since ``ptr<T>'' does allow implicit construction, allowing the last error.

Controlled and visible rebinding and release of ownership
If we want to rebind an ``auto_ptr<T>'' object to another memory area, it is important that the memory area currently owned by the object (if any) is deallocated. The member function ``reset'' takes care of that. Calling ``ap.reset(p)'' will deallocate whatever ``ap'' owns (if anything) and make it own whatever ``p'' points to.

If we want a normal pointer from an ``auto_ptr<T>'' object, we can get it in two ways, depending on the desired effect. The member function ``release'' gives us a normal pointer to the memory area owned by the ``auto_ptr<T>'' object, and also gives us the ownership, so that we will be responsible for the deallocation. If we do not want that responsibility, but temporarily need a normal pointer to the memory area, we use the ``get'' member function. Here is an example showing the differences:

  void func(const int*); // may throw

  int* f(void)
  {
    auto_ptr<int> p(new int(1));
    func(p.get());      // call func with a normal pointer
    return p.release(); // return the pointer
  }

Above we see that the function ``func'' requires a normal pointer, but it does not assume ownership, so we use the ``get'' member to temporarily get the pointer and pass it to ``func''. This function ``f'' then returns the raw pointer if ``func'' does its job, but if it fails with an exception, the ``auto_ptr<int>'' object ``p'' will deallocate the memory in its destructor. For ``ptr<T>'' this is not a problem, since the functionality is only required if ownership transfer is allowed. Since ``ptr<T>'' was specifically designed to disallow transfer of ownership, this functionality is added-value for ``auto_ptr<T>''.

Works with dynamic types
Just as a normal pointer to a base class can point to an object of a publicly derived class, an ``auto_ptr<T>'' can too. This is not particularly strange:

  // The virtual destructor makes it well defined to delete
  // a B object through a pointer to A.
  class A { public: virtual ~A() {} };
  class B : public A {};

  auto_ptr<A> pa(new B());
  auto_ptr<B> pb(new B());
  pa=pb;
  auto_ptr<A> pa2(pb);

The reverse is (of course) not allowed.

Pointer-like syntax for pointer-like behaviour
For the small subset of a pointer's functionality that is implemented in the ``auto_ptr<T>'' class template, the syntax is exactly the same. We get access to the element pointed to with ``operator*'' and ``operator->''. This is the only functionality of a pointer that is implemented. Here it is a tie between ``auto_ptr<T>'' and ``ptr<T>'', since the functionality is exactly the same and so is the syntax.

Implementation
The definition of ``auto_ptr<T>'' looks as follows:

  template <class T>
  class auto_ptr
  {
  public:
    explicit auto_ptr(T* t = 0) throw ();
    auto_ptr(auto_ptr<T>& t) throw ();
    template <class Y> auto_ptr(auto_ptr<Y>& t) throw ();
    ~auto_ptr() throw ();
    auto_ptr<T>& operator=(auto_ptr<T>& t) throw ();
    template <class Y> auto_ptr<T>& operator=(auto_ptr<Y>& t) throw ();
    T& operator*() const throw ();
    T* operator->() const throw ();
    T* get(void) const throw ();
    T* release() throw ();
    void reset(T* t = 0) throw ();
  private:
    T* p;
  };

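The ``release'', ``get'' and ``reset'' behaviour described above can be tried out in a few lines. Since ``auto_ptr<T>'' was deprecated in C++11 and removed in C++17, the sketch below swaps in ``std::unique_ptr<T>'', which offers member functions with the same names but requires ownership transfer to be spelled out with ``std::move''. The type ``Tracked'' and all function names are assumptions made up for this illustration; they are not part of the article's source code.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical helper: counts live objects so that deallocation is observable.
struct Tracked {
    static int live;
    int value;
    explicit Tracked(int v) : value(v) { ++live; }
    ~Tracked() { --live; }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
};
int Tracked::live = 0;

// Accepts ownership: the object dies when the parameter is destroyed.
inline int termination(std::unique_ptr<Tracked> rip) { return rip ? rip->value : -1; }

// Only borrows a normal pointer; does not assume ownership.
inline int func(const Tracked* t) { return t->value; }

// Transfer: afterwards the source is the 0 "pointer" and the target owns.
inline bool transfer_leaves_source_null() {
    std::unique_ptr<Tracked> p1(new Tracked(1));
    std::unique_ptr<Tracked> p2;
    p2 = std::move(p1);
    return p1.get() == nullptr && p2->value == 1 && Tracked::live == 1;
}

// reset: deallocates what was owned and takes over the new object.
inline bool reset_deletes_old_object() {
    std::unique_ptr<Tracked> ap(new Tracked(1));
    ap.reset(new Tracked(2));
    return ap->value == 2 && Tracked::live == 1;
}

// get lends the pointer; release hands the responsibility to the caller.
inline bool release_hands_over_responsibility() {
    std::unique_ptr<Tracked> p(new Tracked(3));
    int seen = func(p.get());
    Tracked* raw = p.release();
    bool ok = seen == 3 && p.get() == nullptr && Tracked::live == 1;
    delete raw; // deletion is now our headache
    return ok && Tracked::live == 0;
}

// Passing to a function that accepts ownership ends our responsibility.
inline bool sink_takes_ownership() {
    std::unique_ptr<Tracked> pi(new Tracked(4));
    int v = termination(std::move(pi));
    return v == 4 && Tracked::live == 0;
}
```

The main design difference from ``auto_ptr<T>'' is that a plain ``p2 = p1'' does not compile here; the transfer has to be requested visibly, which removes the surprise of a ``copy'' that modifies its right hand side.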
Three new details can be seen above. The keyword ``explicit'' in front of the constructor, and ``template <class Y>'' inside the class definition. Both of these are relatively recent additions to the C++ language and far from all compilers support them. Third, and most important (please take note of this), is that the ``copy'' constructor and ``assignment'' operator do take a non-const reference to their right hand side, so that it can be modified.

``explicit'' is what disallows implicit construction of objects, for example in function calls (see the error example above), when attempting to call a function requiring an ``auto_ptr<T>'' parameter with a normal pointer. This keyword is, strictly speaking, not needed. There is a fake around it, which you will see when we get to the implementation details.

The member templates, as the ``template <class Y>'' used inside the class definition is called, are a way of creating new member functions at need. This is what makes it possible to say ``auto_ptr<A> pa; auto_ptr<B> pb; pa=pb''. With this mini-example, and the code above, we can see that a member function

  auto_ptr<A>& auto_ptr<A>::operator=(auto_ptr<B>&) throw()

will be generated. If class ``B'' is publicly derived from class ``A'', the generated code will compile just fine, otherwise we will get an error message from the compiler. It is an essential addition to the C++ language. Unfortunately even fewer compilers support this than support the ``explicit'' keyword. This feature can, to the best of my knowledge, not be worked around.

The code
Let us do the member functions one by one, beginning with the constructor.

  template <class T>
  inline auto_ptr<T>::auto_ptr(T* ptr) throw()
    : p(ptr)
  {
  }

The ``inline'' keyword is new for this course, although it has been part of C++ for a very long time, and it will be used for all member functions of the ``auto_ptr<T>''. Marking a function ``inline'' is a way of hinting to the compiler that you think this function is so simple that it can insert the function code directly where needed, instead of making a function call. This is just a hint, however; a compiler is free to ignore it, and likewise a good compiler may inline even functions not marked as inline (provided you cannot see any difference in the behaviour of the program). Few compilers are smart enough to inline automatically, so there's a place for the ``inline'' keyword.

The only thing the constructor needs to do is to initialize the ``auto_ptr<T>'' object such that it owns the memory area, and if it points to anything at all, it owns it (by definition).

This constructor is marked ``explicit'' in the class definition. I mentioned above that the ``explicit'' keyword is not, strictly speaking, necessary. Here is the promised work-around, for compilers where ``explicit'' is not yet a keyword:

  template <class T>
  class explicit
  {
  public:
    explicit(T t) : value(t) {}
    operator T() const { return value; }
  private:
    T value;
  };

  template <class T>
  inline auto_ptr<T>::auto_ptr(explicit<T*> ptr) throw()
    : p(ptr)
  {
  }

The way this works is as follows: By default, implicit conversions are allowed, but only one user defined implicit conversion may take place. It is an error if two or more user defined implicit conversions are required to get the desired effect. Constructing an object is a user defined conversion, and executing a conversion operator is too. Look at this example usage:

  void termination(auto_ptr<int> pi);

  void func()
  {
    auto_ptr<int> pi(new int(1)); //**1 legal
    termination(new int(2));      //**2 error
  }

The code at //**1 is not in error, even though it may seem so. It may seem like there are two implicit conversions taking place here, one for creating the ``explicit<T>'' object, and one for getting the value out of it, but that is not quite true, because we say that we want an object of type ``auto_ptr<int>''. Since we've been so stern about this, we will be obeyed. Our ``auto_ptr<int>'' accepts as its parameter an ``explicit<int*>'' which is implicitly created from the pointer value. Then it is a detail of the innards of the ``auto_ptr<T>'' constructor how it is used, in this case to get the value from it. The code at //**2, however, is in error, because the call to ``termination'' requires two user defined conversions. One from ``int*'' to ``explicit<int*>'', and one from ``explicit<int*>'' to ``auto_ptr<int>''. Please see the provided source code for how to allow both versions to coexist for different compilers in the same source file.

  template <class T>
  inline auto_ptr<T>::auto_ptr(auto_ptr<T>& t) throw()
    : p(t.release())
  {
  }

There is not much strange going on here, except that the parameter is a non-const reference. As mentioned far above, the member ``release'' relieves the object of ownership and returns the pointer. Thus this constructor makes ``p'' point to what ``t'' did point to, and alters ``t'' so that it becomes the 0 ``pointer''.

  template <class T> template <class Y>
  inline auto_ptr<T>::auto_ptr(auto_ptr<Y>& t) throw()
    : p(t.release())
  {
  }

The code for this constructor is, of course, the same as for the previous one. Note, by the way, the syntax for a member template, with the two subsequent ``template <...>''. Note that both are necessary, however. I made a mistake with the ``auto_ptr<T>'' implementation available in the adapted SGI STL, in that I thought the latter would imply the former. It doesn't. Of course, users of compilers that do not implement member templates will get compilation errors on this member function. Please see the source code for how to work around the compilation error (the work around is simply not to have this member function, which means that the resulting ``auto_ptr'' will be limited in functionality).

  template <class T>
  inline auto_ptr<T>::~auto_ptr() throw ()
  {
    delete p;
  }

If the object owns anything, it will be deleted by the destructor. If the object does not own anything, ``p'' will be the 0 pointer. Note that deleting the 0 pointer is legal, and does nothing at all.

  template <class T>
  inline auto_ptr<T>& auto_ptr<T>::operator=(auto_ptr<T>& t) throw ()
  {
    reset(t.release());
    return *this;
  }

  template <class T> template <class Y>
  inline auto_ptr<T>& auto_ptr<T>::operator=(auto_ptr<Y>& t) throw ()
  {
    reset(t.release());
    return *this;
  }

This is pretty much the same story as the ``copy'' constructor. Since we are not creating a new object, however, we cannot just assign to ``p'' (it may point to something, in which case it must be deallocated). The member function ``reset'' does exactly what we want: delete whatever ``p'' points to, and give it a new value.

  template <class T>
  inline T* auto_ptr<T>::release() throw ()
  {
    T* tp=p;
    p=0;
    return tp;
  }

Nothing strange, is there? The object is relieved of ownership by making ``p'' the 0 pointer, and the value previously held by ``p'' is returned.

  template <class T>
  inline void auto_ptr<T>::reset(T* t) throw ()
  {
    if (t != p) {
      delete p;
      p=t;
    }
  }

Deletes what it points to and sets ``p'' to the given value, just as mentioned in the introduction of the class, except the safety guard against resetting to the value already held. If we didn't have this guard, resetting to the current value would deallocate the memory and keep the ownership of it, for later deletion again! It seems like a better way is to just do nothing if the situation ever arises.

  template <class T>
  inline T* auto_ptr<T>::get(void) const throw ()
  {
    return p;
  }

Not much to say about this one.

  template <class T>
  inline T& auto_ptr<T>::operator*() const throw ()
  {
    return *p;
  }

  template <class T>
  inline T* auto_ptr<T>::operator->() const throw ()
  {
    return p;
  }

Not much to say. These are not identical with the version of ``ptr<T>'' from the previous issue of the course. One word on the way, though; ``operator->'' can, of course, only be used if ``T'' is a struct or class type. On some older compilers, it's even illegal to instantiate ``auto_ptr<T>'' if ``T'' is not a struct or class type. In most cases this is a minor limitation; after all, it is normally structs and classes you handle this way, and not built-in types.

Efficiency
The question of efficiency pops up now and then. How much does it cost, performance and memory-wise, to use the ``auto_ptr<T>'' instead of ``ptr<T>'' from last month? If you use ``auto_ptr<T>'' instead of ``ptr<T>'', and use only the functionality that ``ptr<T>'' offers, the price is nothing at all. You pay for what you use only. The constructor, destructor, ``operator*'' and ``operator->'' hold exactly the same code for both templates. Compared to raw pointers and doing your own deletion? I do not know. Most probably close to none at all. It will depend a lot on how clever your compiler is with inlining. If you have a measurable speed difference in a real-world application, I would say the difference is that with ``auto_ptr<T>'' you do many more deletions (i.e. you have mended memory leaks you were not aware of having).

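For readers who want to experiment, the member functions walked through above can be assembled into one unit that compiles with a current compiler. This is only a sketch under a few assumptions: the class is renamed ``owned_ptr'' (a made-up name) so that it cannot clash with the standard library, ``throw()'' is replaced by ``noexcept'' since dynamic exception specifications have since been removed from the language, and the ``Base'', ``Derived'' and ``demo_...'' names exist purely for the illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the article's auto_ptr<T>.
template <class T>
class owned_ptr {
public:
    explicit owned_ptr(T* t = 0) noexcept : p(t) {}
    // "Copy" and "assignment" take a non-const reference: they transfer ownership.
    owned_ptr(owned_ptr& t) noexcept : p(t.release()) {}
    template <class Y>
    owned_ptr(owned_ptr<Y>& t) noexcept : p(t.release()) {}
    ~owned_ptr() { delete p; }
    owned_ptr& operator=(owned_ptr& t) noexcept { reset(t.release()); return *this; }
    template <class Y>
    owned_ptr& operator=(owned_ptr<Y>& t) noexcept { reset(t.release()); return *this; }
    T& operator*() const noexcept { return *p; }
    T* operator->() const noexcept { return p; }
    T* get() const noexcept { return p; }
    T* release() noexcept { T* tp = p; p = 0; return tp; }
    void reset(T* t = 0) noexcept {
        if (t != p) { delete p; p = t; } // guard against resetting to the value held
    }
private:
    T* p;
};

// The explicit constructor forbids implicit construction from a raw pointer,
// while direct construction is still allowed.
static_assert(!std::is_convertible<int*, owned_ptr<int> >::value,
              "int* must not convert implicitly");
static_assert(std::is_constructible<owned_ptr<int>, int*>::value,
              "direct construction from int* is allowed");

struct Base { virtual ~Base() {} virtual int id() const { return 1; } };
struct Derived : Base { int id() const override { return 2; } };

// Ownership moves p1 -> p2 -> p3; the sources become 0 "pointers".
inline bool demo_transfer() {
    owned_ptr<int> p1(new int(1));
    owned_ptr<int> p2;
    p2 = p1;
    owned_ptr<int> p3(p2);
    return p1.get() == 0 && p2.get() == 0 && *p3 == 1;
}

// Resetting to the pointer already held must not delete it.
inline bool demo_reset_guard() {
    owned_ptr<int> ap(new int(5));
    ap.reset(ap.get());
    return *ap == 5;
}

// The member template lets a pointer to Derived be assigned to a pointer to Base.
inline bool demo_derived() {
    owned_ptr<Derived> pb(new Derived);
    owned_ptr<Base> pa;
    pa = pb;
    return pb.get() == 0 && pa->id() == 2;
}
```

The two ``static_assert'' lines check at compile time the same property that the ``explicit'' discussion above is about: construction with the constructor syntax works, implicit conversion does not.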
Recap
The news this month was:

• The standard class template ``auto_ptr<T>'' handles memory deallocation and ownership transfer.
• Automatic memory deallocation and ownership transfer reduces the risk for memory leaks, especially when exceptions occur.
• Implicit conversions between raw pointers and smart pointers are bad (even if they may seem tempting at first).
• The ``explicit'' keyword disallows implicit construction of objects.
• The ``explicit'' keyword can be faked.
• Member templates can be used to create member functions at compile time, just like function templates can be used to create functions at compile time.
• ``inline'' hints to a compiler that you think a function is so small that it is better to directly insert the function code where required instead of making a function call.

Exercises
• Why is it a bad idea to have arrays (or other collections) of ``auto_ptr<T>''?
• Can smart pointers be dangerous? When? ``auto_ptr<T>'' too?
• What is a better name for this function template?

  template <class T> void funcname(auto_ptr<T>) { }

• What happens if ~T throws an exception?

Coming up
If I missed something, or you want something clarified further or disagree with me, please drop me a line and I'll address your ideas in future articles. Next month we'll have a look at a smarter pointer, a reference counted one. I'm beginning to dry up on topics now, so please write and give me suggestions for future topics to cover.

Introduction
Last month's ``auto_ptr<T>'' class template documents and implements ownership transfer of dynamically allocated memory. Often, however, we do not want to be bothered with ownership. We want several places of the code to be able to access the memory, but we also want to be sure that the memory is deallocated when no longer needed. The general solution to this is called automatic garbage collection (something you can read several theses on, and also buy a few commercially available libraries for). The less general solution is reference counting: no owner, but last one out locks the door.

The idea is that a counter is attached to every object allocated. When allocated it is set to 0. When the first smart pointer attaches to it, the count is incremented to 1. Every smart pointer attaching to the resource increments the reference count, and for every smart pointer detaching from a resource (the smart pointer destroyed, or assigned another value) the resource's counter is decremented. If the counter reaches zero, no one is referring to it anymore, so the resource must be deallocated. The weakness of this compared to automatic garbage collection is that it does not work with circular data structures (the count never goes below 1).

The problems to solve
Many of the problems with a reference counting pointer are the same as for the auto pointer. The list is actually a bit shorter, however, since there's no need to worry about ownership.

• exception safety
• no confusion with normal pointers
• controlled and visible rebinding and access
• works with polymorphic types
• pointer-like syntax for pointer-like behaviour
• automatic deletion when no longer referring to the object

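The weakness with circular data structures mentioned in the introduction is easy to reproduce. The sketch below uses ``std::shared_ptr<T>'', the reference counted pointer of today's standard library, as a stand-in for the pointer this article is about to build; ``Node'', its ``live'' counter and the function name are assumptions invented for the illustration, and ``std::weak_ptr<T>'' is used only as an observer that does not take part in the counting.

```cpp
#include <cassert>
#include <memory>

// Hypothetical list node with instrumentation to count live objects.
struct Node {
    static int live;
    std::shared_ptr<Node> next;
    Node() { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Builds two nodes, lets all named pointers go out of scope, and reports
// whether the first node is still alive afterwards.
inline bool outlives_its_handles(bool make_cycle) {
    std::weak_ptr<Node> observer; // watches without incrementing the count
    {
        std::shared_ptr<Node> a(new Node);
        std::shared_ptr<Node> b(new Node);
        a->next = b;
        if (make_cycle) b->next = a; // a and b now keep each other alive
        observer = a;
    }
    bool still_alive = !observer.expired();
    // Break the cycle by hand so that the sketch itself does not leak.
    if (std::shared_ptr<Node> a = observer.lock())
        a->next.reset();
    return still_alive;
}
```

Without the cycle both nodes are deallocated as soon as the last pointer goes away; with it, each count stays at 1 and nothing is deallocated until one of the links is cut explicitly.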
This might also be the place to mention a problem not to solve, that of how to stop reference counting a resource. Adding this functionality is not difficult, but it quickly leads to user code that is extremely hard to maintain. This is exactly what we want to avoid, so it is better not to have the functionality.

Here is how it is supposed to work when we are done:

  counting_ptr<int> P1(new int(value));

After creating a counting pointer ``P1'', the reference count for the value pointed to is set to one.

  counting_ptr<int> P2(P1);

When a second counting pointer ``P2'' is created from ``P1'', the object pointed to is not duplicated, but the reference count is incremented.

  counting_ptr<int> P3(P2);

When three counting pointers refer to the same object, the value of the counter is three.

  P1.manage(new int(other));

As one of the pointers referring to the first object created is reinitialized to another object, the old reference count is decremented (there are only two references to it now) and the new one is assigned a reference count of 1.

  P2=P1;

When yet one of the pointers moves attention from the old object to the new one, the counter for the old one is yet again decremented, and for the new one it is incremented.

  P3=P2;

Now that the last counting pointer referring to the old object moves its attention away from it, the old object's reference count goes to zero, and the object is deallocated. Now instead the new object has a reference count of 3, since there are three reference counting pointers referring to it.

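The six steps above can be checked mechanically. Since ``counting_ptr<T>'' is not written yet at this point, the sketch below substitutes ``std::shared_ptr<T>'' from the standard library, whose ``use_count'' member reports the reference count and whose ``reset'' plays the role of the article's ``manage''. The function name and the deleter that records the deallocation are assumptions added for the illustration.

```cpp
#include <cassert>
#include <memory>

// Replays the P1/P2/P3 steps and checks the counts claimed in the text.
inline bool counting_walkthrough() {
    bool old_deleted = false;
    // The custom deleter only records when the first object is deallocated.
    std::shared_ptr<int> P1(new int(1),
                            [&old_deleted](int* p) { old_deleted = true; delete p; });
    bool ok = P1.use_count() == 1;                    // one reference

    std::shared_ptr<int> P2(P1);                      // not duplicated, count incremented
    ok = ok && P1.use_count() == 2 && P2.get() == P1.get();

    std::shared_ptr<int> P3(P2);                      // three pointers, count is three
    ok = ok && P1.use_count() == 3;

    P1.reset(new int(2));                             // the article's P1.manage(...)
    ok = ok && P2.use_count() == 2 && P1.use_count() == 1;

    P2 = P1;                                          // old count 1, new count 2
    ok = ok && P3.use_count() == 1 && P1.use_count() == 2 && !old_deleted;

    P3 = P2;                                          // old object deallocated, new count 3
    ok = ok && old_deleted && P1.use_count() == 3 && *P3 == 2;
    return ok;
}
```

Note that the old object is deallocated exactly at the last assignment, not earlier and not at the end of the function, which is the ``last one out locks the door'' behaviour the walk-through describes.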
Interface outline
The interface of the reference counting smart pointer will, for obvious reasons, share much with the auto pointer. The differences lie in accessing the raw pointer and giving the pointer a new value. While these aspects could use the same interface as does the auto pointer, giving the reference counting pointer an identical interface, this would be very unfortunate since their semantics differ dramatically. Here is a suggested interface.

  template <class T>
  class counting_ptr
  {
  public:
    explicit counting_ptr(T* t = 0);
    counting_ptr(const counting_ptr<T>& t) throw();
    template <class Y> counting_ptr(const counting_ptr<Y>& ty) throw();
    ~counting_ptr() throw ();
    counting_ptr<T>& operator=(const counting_ptr<T>& t) throw ();
    template <class Y> counting_ptr<T>& operator=(const counting_ptr<Y>& ty) throw();
    T& operator*(void) const throw ();
    T* operator->(void) const throw ();
    T* peek(void) const throw();
    void manage(T* t);
  };

Compared to the auto pointer, the only differences are that some member functions do not have an empty exception specification, and there is no member function corresponding to ``auto_ptr<T>::release()'' (which stops the managing of the pointer). The member function ``get'' is here named ``peek''. I think this name better describes what is happening. To me the word ``get'' associates with a transfer, and there is no transfer occurring. All we do is to peek inside and see what the internal raw pointer value is. The member function ``reset'' is here named ``manage'', and the reason is a big difference in semantics.

Where to store the reference count
Before we can dive into implementation, we must figure out where to store the reference count. It is obvious that the counter belongs to the object referred to, so it cannot reside in the smart pointer object. A solution that easily springs to mind is to use a struct with a counter, and the type referred to, like this:

  template <class T>
  class counting_pointer
  {
  public:
  private:
    struct representation
    {
      unsigned counter;
      T value;
    };
    representation* ptr;
  };

With this construct, we have the object and counter together. All we need to work this way is to make sure to allocate this representation struct on heap in the constructor and ``manage'' member functions (and of course to deallocate the struct when we're done with it). Unfortunately there are two severe drawbacks. We cannot use it for dynamic binding, since the ``value'' component is indeed a value and dynamic binding only works through pointers and references. It also becomes very difficult to write the constructor and ``manage'' member function. The work around is simple, however; use a ``T*'' instead. There is a performance disadvantage with this. Whenever accessing the object referred to, we must follow two pointers, the pointer to the representation and the pointer to the object from the representation.

The best solution I have seen is to decouple the representation from the object and instead allocate an ``unsigned'' and in every counting pointer object keep both a pointer to the counter and to the object referred to. This gives the following data section of our counting pointer class template:

  template <class T>
  class counting_pointer
  {
  public:
    ...
  private:
    T* ptr;
    unsigned* pcount;
  };

Accessibility
The solution outlined above is so good it almost works. For compilers that do not support member templates, such that the assignment and construction from a counting pointer of another type are impossible, it is all we need. However, if we want the ability to assign a ``counting_ptr<T>'' object from a value of type ``counting_ptr<Y>'' if a ``T*'' can be assigned from a ``Y*'', we must think of something. The problem is that ``counting_ptr<T>'' and ``counting_ptr<Y>'' are two distinct types, so both the member variables are private and thus inaccessible. The value of the raw pointer member can be accessed through the public ``peek'' member function, but we need a solution for accessing the counter without making it publicly available.

This kind of problem is exactly what ``friend'' declarations are for, but there is a two-fold problem with that. To begin with, extremely few compilers support template friends, a rather new addition to the language. Second, member templates open up holes in the type system you can only dream of. For the curious, please read Scott Meyers' paper on the topic.

One step on the way towards a solution is to see that the management of the counter is independent of the type T, so we can implement the counter managing code in a separate class. This is useful even when member templates are not available, and makes life easier later on. A reference counting class may look like:

  class counting_base
  {
  public:
    counting_base(unsigned count = 0);
    counting_base(const counting_base& cb) throw();
    ~counting_base(void) throw();
    counting_base& operator=(const counting_base& cb) throw ();
    int release() throw();
    int reinit(unsigned count=1);
  private:
    unsigned* pcount;
  };

The idea here is that this class handles every aspect of the reference counter. We just need to tell it how to behave, and it reports the results to us. The member functions ``release'' and ``reinit'' return 1 if the old counter is discarded (and hence, we should delete the object we refer to) or 0 if it was just decremented. The default constructor allows us to choose whether we want a reference counter or not. If our reference counting pointer is initialized with the 0 pointer, there is no need to waste time and memory by allocating a reference counter for it.

Base implementation
It is fairly easy to implement.

  inline counting_base::counting_base(unsigned count)
    : pcount(count ? new unsigned(count) : 0)
  {
    if (count && !pcount) throw bad_alloc();
  }

Initialize a counter with the value of the parameter. A count of 0 is represented by a 0 pointer. If you have a modern compiler, operator new will throw ``bad_alloc'' in out-of-memory conditions, so the function body throwing the exception will not be necessary. For older compilers, however, you need to define the ``bad_alloc'' class.

  inline counting_base::counting_base(const counting_base& cb) throw ()
    : pcount(cb.pcount)
  {
    if (pcount) ++(*pcount);
  }

Copying a reference counting object means adding one to the reference counter (since there is now one more object referring to the counter). When we have 0 pointers, however, there is nothing to update. Note that the copy constructor never involves any allocation or deallocation of dynamic memory. In fact, there is nothing in the copy constructor that can throw exceptions.

    inline counting_base::~counting_base() throw()
    {
      release();
    }

Destroying a reference counting object means decrementing the reference counter and deallocating it if the count goes to zero (last one out locks the door.) Since this code is needed in the assignment operator, as well as in the public interface for use by the reference counted smart pointer class template, we implement that work in the ``release'' member function.

    inline counting_base& counting_base::operator=(const counting_base& cb) throw()
    {
      if (pcount != cb.pcount) {
        release();
        pcount=cb.pcount;
        if (pcount) ++(*pcount);
      }
      return *this;
    }

Assignment of reference counting objects means decrementing the reference count for the left hand side object, and incrementing it for the right hand side object, since there will be one less object referring to the counter from the left hand side object, and one more referring to the counter from the right hand side object.

    inline int counting_base::release() throw ()
    {
      if (pcount && --(*pcount) == 0) {
        delete pcount;
        pcount = 0;
        return 1;
      }
      pcount=0;
      return 0;
    }

If the reference count goes to zero the counter is deallocated. A return value of 1 means deallocation took place (hinting to the user of this class that it should deallocate whatever object it refers to, since it is the last one referring to it.) In both cases the pointer to the counter is set to 0 as a precaution. It may be that ``release'' is called just prior to destruction, and the destructor calls ``release'' to deallocate memory. If the pointer to the counter is not set to zero it means either referring to just deleted memory, or decrementing the reference count twice.

    inline int counting_base::reinit(unsigned count)
    {
      if (pcount && --(*pcount) == 0) {
        *pcount = count;
        return 1;
      }
      pcount = count ? new unsigned(count) : 0;
      if (count && !pcount) throw bad_alloc();
      return 0;
    }

Strictly speaking, ``reinit'' is not needed. It does make the implementation a bit easier, however. Its purpose is to release from the current counter and initialize a new one with a defined value. It is an optimization of memory handling, in that if this object was the last reference the counter is reinitialized instead of deallocated and then allocated again. If it was not the last object referring to the counter, a new counter must of course be allocated.

As an exercise, prove to yourself that this reference counting base class does not have any memory handling errors (i.e. it always deallocates what it allocates, never deallocates the same area twice, never accesses uninitialized or just deallocated memory, and never dereferences the 0 pointer.)

Accessibility again
As nice and convenient as the above helper class is, it really does not solve the accessibility problem. The problem remains, though. The classes ``counting_ptr<T>'' and ``counting_ptr<Y>'' are different classes and because of this are not allowed to see each other's private sections. The easy way out is to use public inheritance, and say that

every counting pointer is-a counting base. That is a solution that is simple, sweet and dead wrong. Almost. There is no is-a relationship here, but an is-implemented-in-terms-of relationship. Such relationships are implemented through private member data or private inheritance. Instead of having the member functions of the ``counting_base'' class public, they can be declared protected. As such they do not become part of the public interface, and the is-a relationship is mostly imaginary. This is bordering on abuse, but it works fine.

Implementation of a reference counting pointer
Finally we can get to work and write the reference counting pointer, and since the helper class does most of the dirty work, the implementation is not too convoluted.

    template <class T>
    inline counting_ptr<T>::counting_ptr(T* t)
      : counting_base(t ? 1 : 0), pt(t)
    {
    }

Initialize the counter to 1 if we have a pointer value, and 0 otherwise.

    template <class T>
    inline counting_ptr<T>::counting_ptr(const counting_ptr<T>& t) throw()
      : counting_base(t), pt(t.pt)
    {
    }

    template <class T> template <class Y>
    inline counting_ptr<T>::counting_ptr(const counting_ptr<Y>& ty) throw()
      : counting_base(ty), pt(ty.peek())
    {
    }

Note how the latter makes use of the knowledge that a ``counting_ptr<Y>'' is-a ``counting_base''.

    template <class T>
    inline counting_ptr<T>::~counting_ptr() throw()
    {
      if (release()) delete pt;
    }

The destructor is important to understand. If you review the functionality of the ``release'' member function of ``counting_base'', you see that it deallocates the counter if, and only if, the count reaches 0 and then returns 1, otherwise it returns 0. This means that in the destructor we will deallocate whatever ``pt'' points to if, and only if the reference count for it reaches zero and is deallocated.

    template <class T>
    inline counting_ptr<T>& counting_ptr<T>::operator=(const counting_ptr<T>& t) throw()
    {
      if (pt != t.pt) {
        if (release()) delete pt;
        pt=t.pt;
        counting_base::operator=(t);
      }
      return *this;
    }

    template <class T> template <class Y>
    inline counting_ptr<T>& counting_ptr<T>::operator=(const counting_ptr<Y>& ty) throw()
    {

      if (pt != ty.peek()) {
        if (release()) delete pt;
        pt=ty.peek();
        counting_base::operator=(ty);
      }
      return *this;
    }

Please spend some time convincing yourself that the above works. What happens is that if the left hand side and right hand side counting pointers already refer to the same object, nothing happens. If they refer to different objects, on the other hand, the object referred to by the left hand side reference counting pointer is deleted if ``release'' returns 1, which it does if, and only if, the reference count goes to zero and the counter is deallocated. After this the raw pointer value can safely be copied. Since ``release'' also resets the counter pointer to zero, the assignment operator for ``counting_base'' does not decrement the counter again. Instead it is simply assigned.

The access operators are all trivial and need no further explanation:

    T* operator->() const
    T& operator*() const
    T* peek() const

Now only the ``manage'' member function remains:

    template <class T>
    inline void counting_ptr<T>::manage(T* t)
    {
      if ((!t && release()) || (t && t != pt && reinit())) delete pt;
      pt = t;
    }

This one is only slightly tricky. If the parameter is the zero pointer, we want the reference count to also be the zero pointer, and thus ``release'' is called. Otherwise we want the counter value to be 1, so ``reinit'' is called instead. If either of those tells that the old counter is discarded, the object referred to is deallocated.

Efficiency
There is no question about it, there is a cost involved in using reference counting pointers. Every time an object is allocated, a counter is also allocated, and vice-versa for deallocation. Depending on how efficient your compiler's memory manager is for small objects this cost may be negligible, or it may be severe. Every time a counting pointer is assigned, constructed, destroyed and copied, counter manipulation is done, and that costs a few CPU cycles.

Exercises
• What happens if ``~T'' throws an exception?
• What happens if allocation of a new counter fails?
• The ``counting_base'' implementation and use is suboptimal. For example at destruction, the ``release'' member function is called twice, and the first time the counter pointer is set to zero to prevent decrementing the counter twice. Improve the implementation to never (implicitly) set the pointer to zero and yet always be safe.
• Does the public derivation from ``counting_base'' open up any holes in the type safety, despite that all member functions are protected?

Coming up
If I missed something, or you want something clarified further or disagree with me, please drop me a line and I'll address your ideas in future articles. Next month is devoted to Run Time Type Identification (RTTI), one of the new functionalities in C++.

Introduction
Lucky number thirteen will conclude the long C++ series. Last topic is Run Time Type Identification, or RTTI for short. It's a new addition to C++. I do not have access to any compiler under OS/2 that supports RTTI, so the example programs are written for, and tested with, the egcs compiler (a fast-moving bleeding edge gcc) under Linux.

Towards the end of the article there is also a discussion about various aspects of efficiency in C++, especially in comparison with C.

RTTI is a way of finding out about the type of objects at run-time. There are two distinct ways this can be done. One is finding a type identification for an object, giving a unique identifier for each unique type. Another is a clever cast, which allows casting of pointers and references only if the type matches.

Type Safe (down)Casting
Many class libraries, most notably GUI libraries like IBM's Open Class Library, suffer from the problem that callback functions will be called with pointers to objects of a base class, but you know the objects really are of some class inheriting from it. For example, consider a button push event, from which you can get a pointer to the button itself. The idea is that whoever uses a push button can register a function that will be called whenever the push button is pushed. Suppose a hierarchy like this:

    class Event {};
    class Window {};
    class Control : public Window {};
    class Button : public Control {};
    class PushButton : public Button
    {
    public:
      class PushedEvent : public Event
      {
      public:
        PushButton* button(void) const;
      };
      // named registerCallback since ``register'' is a reserved word
      void registerCallback(void (*pushed)(const PushedEvent&));
    };
    class TextPushButton : public PushButton
    {
    public:
      void setText(const char* txt);
    };

Let us put it to use:

    void pushed(const PushButton::PushedEvent& ev);

    void createButton(const char* txt)
    {
      TextPushButton* pb=new TextPushButton();
      pb->setText(txt);
      pb->registerCallback(pushed);
    }

    void pushed(const PushButton::PushedEvent& ev)
    {
      TextPushButton* pb=(TextPushButton*)(ev.button());
      //                 ^^^^^^^^^^^^^^^^^
      pb->setText("***");
    }

This does not look too dangerous, does it? The callback is registered when the button is created, so we know it is the right kind, and the down cast marked with ``^^^'' is safe. Or is it? Right now it is, but as the program grows, it might not be as easy to see what happens, and somewhere the wrong callback is registered for some button. The result is likely an uncontrolled crash, or just generally weird behaviour.

The solution is to do the cast only if it is legal. Here is a new version of the ``pushed'' function using the RTTI ``dynamic_cast''.

    void pushed(const PushButton::PushedEvent& ev)
    {
      TextPushButton* pb= dynamic_cast<TextPushButton*>(ev.button());
      assert(pb);
      pb->setText("***");
    }

``dynamic_cast<T>(p)'' works like this: If ``p'' is-a ``T'' the result is a ``T'' referring to the object, otherwise a zero pointer is returned (or if ``T'' is a reference, the exception ``bad_cast'' is thrown.) That way we can check and take some action if the function is called with a pointer or reference to the wrong type.

There is one catch with ``dynamic_cast''. It works only if there is at least one virtual member function in the type casted from, otherwise you get compilation errors. If you think about it, however, that is not much of a penalty. At least the destructor should be virtual in such hierarchies anyway.
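The pointer and reference behaviour of ``dynamic_cast'' described above can be tried out in isolation. What follows is only a sketch: the classes ``Shape'', ``Circle'' and ``Square'' and the two helper functions are invented for this illustration and are not part of any library mentioned in the article.

```cpp
#include <cassert>
#include <typeinfo>   // std::bad_cast

// Hypothetical hierarchy, invented for this illustration only.
struct Shape  { virtual ~Shape() {} };   // one virtual member makes the type polymorphic
struct Circle : Shape { int radius() const { return 7; } };
struct Square : Shape {};

// Pointer form: a failed cast yields a zero pointer.
bool is_circle(Shape* s)
{
  return dynamic_cast<Circle*>(s) != 0;
}

// Reference form: a failed cast throws std::bad_cast.
int radius_of(Shape& s)
{
  try {
    return dynamic_cast<Circle&>(s).radius();
  } catch (const std::bad_cast&) {
    return -1;                            // the object was not a Circle
  }
}
```

Called with a ``Circle'', ``is_circle'' reports true and ``radius_of'' gives the radius; called with a ``Square'', the pointer form quietly yields zero while the reference form goes through the ``bad_cast'' handler. Which form to prefer depends on whether a wrong type is an expected case or an error.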

It must be stressed that this use of ``dynamic_cast'' can always be avoided (I cannot prove it, but I have never seen a counter proof either, and many have tried.) The need arises from a poor design (the solution is to use dynamic binding instead, by turning the problem ``inside-out''.) Sometimes we have to live with poor designs, however, and then this can make our life a lot easier. It can also be used during a transition between different libraries, as an error check. Here is an example mirroring a problem from my previous job:

    void print(ostream& os)
    {
      if (ostrstream* pos=dynamic_cast<ostrstream*>(&os)) {
        *pos << ends;
        cout << pos->str();
        pos->freeze(0);
      } else if (ostringstream* pos=dynamic_cast<ostringstream*>(&os)) {
        cout << pos->str();
      } else throw bad_cast();
    }

This transition is from the ``old'' ``ostrstreams'', storing strings as plain ``char*'', to the standard ``ostringstream'', which is based on standard strings. The difference lies in handling the end. An ``ostrstream'' holds a raw buffer of memory, and its end must be marked by appending a zero termination with the ``ends'' modifier, while standard ``ostringstream'' returns a string object which itself knows how long it is and a zero termination must not be added to it (otherwise an extra zero character will be printed.) With RTTI we can live with both worlds at the same time. In which way is this worse than the previous version that did not have the problem, where everything was always ``ostrstream''? Well, in no way, and we have added an error check that should have been there earlier but was not because the language did not support it (what if the function is called with an ``fstream'' object?)

Use RTTI ``dynamic_cast'' when being forced to adapt to poorly written libraries, and during transitions between libraries. However, never design new code like this! Use this construct only during a transition phase. For new designs, use dynamic binding instead.

Identifying types
Much more interesting is using explicit information about the type of an object. This solves a problem that cannot reliably be worked around with clever designs. There is a standard class called ``type_info'', whose purpose is to carry information about types. It is defined in the header named <typeinfo> (note, no ``.h'') and looks as follows:

    class type_info
    {
    public:
      virtual ~type_info();
      bool operator==(const type_info&) const;
      bool operator!=(const type_info&) const;
      bool before(const type_info&) const;
      const char* name() const;
    private:
      type_info(const type_info&);
      type_info& operator=(const type_info&);
    };

The ``before'' member function gives you a sort order for types. Do not try to get a meaning from the sort order, there may not be one. It does not need to have any meaning, other than that it makes it possible to keep sorted collections of type information objects.

The ``name'' member function gives you access to a printable form of the name of the type identified by the ``type_info'' object. However, the standard requires very little of the ``name'' member function. Most notably it is not standardised what the printable form looks like for any given type, even the built-in ones. In fact, it is not even required that the string is unique for each type. This unfortunately means that you cannot write portable (across compilers) applications that rely on the names in any way.

You get a ``type_info'' object for a value through the built-in operator ``typeid(x)''. Note that it is the runtime type of ``x'' that is accessed, not the static type. This requires a little bit of care. Say you have this situation:

    // a virtual member is needed for typeid to report the run-time type
    class parent { public: virtual ~parent() {} };

    class child : public parent {};

    parent* p=new child();
    cout << typeid(p).name() << endl;
    cout << typeid(*p).name() << endl;

If we have a compiler whose name of a type as given from ``type_info'' is indeed the name of the type as we write it in the program, what do you think the above snippet prints? The answer is ``parent*'' followed by ``child''. This came as a surprise to me, but it makes sense. The run-time type of ``p'' is ``parent*''. It is what it points to that may differ, and that is exactly what is mirrored by the output.

Is this useful? Suppose you need to store and retrieve objects from a typeless medium, such as a file, or a socket. Storing them is easy, but when reading, you need to know what type of object to create. If you are designing a complete system from scratch, you can add some type identifier to all classes, but this is error prone, since it is easy to forget changing it when creating a new class inheriting from another one, and it does not work at all with existing classes. Chances are you will be extending existing code, or decide to use third party class libraries.

Here is an outline for how to do this (in a non-portable, and slightly restricted way. Generality and portability requires more work.) First we design a class for I/O objects, let us call it ``persistent''. It may look like this:

    class persistent
    {
    protected:
      virtual void store(ostream& os) const = 0;
      virtual void retrieve(istream& is) = 0;
    };

The intention is that all classes we want to store must inherit from ``persistent'' and implement the ``store'' and ``retrieve'' member functions. This is limiting, but not as limiting as it may seem. We can still make use of third party libraries, we just need to create persistent versions of the classes.

Next we need something that does the work of calling the member functions and creating the objects when read. Let us call this class ``persistent_store''. It may be defined as follows:

    class persistent_store
    {
    public:
      persistent_store(iostream& stream);
      template <class T> void register_class();
      void store_object(persistent*);
      persistent* retrieve_object();
    };

The interface should be reasonably obvious. Only classes registered with the store may be used. Obviously only classes inherited from ``persistent'' may be stored and retrieved. There will, however, be other requirements put on the type ``T'', but what those will be will depend on our implementation. I have chosen the only additional constraints to be that they have a default constructor and may be created on the heap with ``operator new''. The use of template functions that do not have any parameters of the template type is a recent addition to C++ that few compilers support. The syntax is a bit ugly, but it is not too bad.

Here is how to use the persistent store:

    class persistent_X : public X, public persistent
    {
    protected:
      virtual void store(ostream& os) const { os << *this; };
      virtual void retrieve(istream& is) { is >> *this; };
    };

    int main(int argc, char* argv[])
    {
      fstream file(argv[1]);
      persistent_store storage(file);
      storage.template register_class<persistent_X>(); //*
      persistent* pp=storage.retrieve_object();
      persistent_X* px=dynamic_cast<persistent_X*>(pp);

      storage.store_object(px);
      ...
    }

The line marked ``//*'' shows the syntax for calling template member functions without parameters of the template type. Note the ``template'' keyword. It is ugly, but I guess one gets used to it.

Now there are a number of problems that must be solved:
• How to prevent classes not inheriting from ``persistent'' from being registered.
• How to allocate an object of the correct type on heap, given a string representation of its type.
• How to store the type information on the stream such that it can be interpreted unambiguously.

The first is extremely easy to check for. Here is one way of doing it:

    template <class T>
    void register_class()
    {
      persistent* p = (T*)0; // type check.
      ...
    };

Any attempt to call ``register_class'' with a type not publicly inheriting from ``persistent'' will result in a compilation error. Not too bad.

The second problem, deciding the type from a name, is easy to understand conceptually, but requires a bit of trickery to implement. The solution lies in using a map from strings to a creation function. Here the map type from the C++ standard library is used. A map is a data structure acting like an array, but allowing any kind of indexing type. In this case the indexing type will be a string. Every string in the map corresponds to a function accepting no parameters and returning a ``persistent*''. That way, when reading, the creator function is looked up and called. When storing, the string is checked for in the map to make sure no unregistered types are stored. So, the implementation of ``register_class'' may look like this:

    template <class T>
    void register_class()
    {
      persistent* p=(T*)0; // type check.
      persistent* (*creator_func)(void) = ???;
      creator_map.insert(make_pair(typeid(T).name(), creator_func));
    };

That is the easy part. The tricky part is to find out what ``???'' is. Perhaps it comes as no surprise that the solution is yet another template, this time a function template.

    template <class T>
    persistent* persistent_object_creator(void)
    {
      return new T;
    }

This function template implicitly carries out the type test for us (since ``T*'' can only be returned as ``persistent*'' if ``T'' publicly inherits from ``persistent''.) In other words, ``???'' can be replaced with ``persistent_object_creator<T>'' and the check line can be removed.

Now for how to store and retrieve the type name. My suggestion is simple, store the length of the name followed by the name itself. When reading, the length can be read, a character buffer allocated for the correct size and exactly that many characters read. Here is how it is all implemented:

    void store_object(persistent* p)
    {
      const char* name=typeid(*p).name();
      maptype::iterator iter=creator_map.find(name);
      if (iter == creator_map.end()) throw "unregistered type";
      medium << strlen(name) << ' ' << name;
      p->store(medium);
    };

    persistent* retrieve_object(void)
    {
      size_t len;
      medium >> len;
      medium.get(); // read past blank.
      char* name=new char[len+1];
      medium.read(name,len);
      name[len]=0; // terminate the string before the lookup

      maptype::iterator iter=creator_map.find(name);
      delete[] name; // no longer needed
      if (iter == creator_map.end()) throw "unregistered type";
      persistent* p=(iter->second)();
      p->retrieve(medium);
      return p;
    };

As can be guessed, the line ``if (iter == creator_map.end())'' checks if the lookup was successful. If true the string was not represented in the map, and thus the type was not registered.

That is a mini persistency library for you. Enjoy.

C++ and Efficiency
I have once in a while heard things like ``C++ programs are slow and large compared to a C program with the same functionality.'' In a way that statement is often true. Does it need to be that way? The answer is no, strictly speaking, it does not. If you have a lean C design, the same design can be used with C++, but use new and delete instead of malloc/free, use inline functions instead of macros and add constructors to your structs. If you do, you are still better off than in plain C, because of the added type safety, but the program will not be slower and larger.

Throughout this course, I have been touting generality and extensibility over and over and over. Generality and extensibility have a price. Are you prepared to pay the price? It depends. After all, not all problems need general and extensible solutions, right? If I can make use of the generality and extensibility at some other time, development time efficiency has been gained, possibly at the cost of program size and run-time efficiency.

As I see it, there are mainly two reasons for C++ bloat. One is a cultural problem, one is technical.

Culture related inefficiency
Let us first look at the cultural problem, because that is the one that is hardest to solve, and contributes most to the bloat.

Fear of template code bloat
Since it is ``a well known fact'' that templates lead to code bloat, many programmers avoid them. This means rolling their own algorithms and collections instead of using standard ones. However, the contents of the standard library are state of the art. Have a look at the SGI STL, and read the documentation and code comments, and you will notice that many of the data structures and algorithms used were not even known a year ago. How many programmers have a working knowledge of the latest and greatest of algorithms and data structures? Unless the programmer is a top-notch algorithm and data-structure guru, the result is just about guaranteed to be slower, and quite possibly larger, than those in the standard library.

Another cultural problem leading to inefficient C++ programs is the hunt for efficiency! Yes, it is true. Here are some examples:

Virtual functions
I have heard people say things like ``As a performance optimization, I've removed all virtual functions and replaced the virtual function calls with switch statements and ordinary member function calls.'' That is a major performance killer. Here is one very good argument for why. If you need the functionality that a virtual function call gives you, there is just no way to do it faster than through a virtual function call. The switch/call construct is guaranteed to require at least as many instructions (probably more) and is an instruction pipeline killer for the CPU, something that the virtual function call interestingly is not.

Inline functions
While well chosen inline functions may speed up your program, ill chosen ones will lead to bloated executables, and are likely to reduce the number of cache hits since active code areas will increase in size. Remember that an inline function is actually not a function. If inlined, the instructions for the function will be inserted at the place of the call. If the function is large (more than a very few instructions) the size of your program will grow considerably if the function is called often. More interesting is what happens if the compiler for one reason or another fails to inline it. It will then have to make an out-of-line function for it, which will be called like normal functions. There is one difference however (getting into the area of technical problems here. I will return to this one shortly.) Where will the out-of-line function reside? With one exception, all compilers I know of will add one copy, as a static function, for every translation unit (read .cpp file) where the function inlining failed. In a large program, this might mean *many* copies, and since every copy is in its own memory area, you will not get the boost from cache hits due to good locality either.
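To make the ``Virtual functions'' argument above concrete, here is a small sketch of the two constructs side by side. All names (``Animal'', ``Bird'', ``Dog'', ``TaggedAnimal'', ``legs_virtual'', ``legs_switch'') are made up for the illustration, and nothing is measured here. The sketch only shows what the hand-written replacement has to contain: a type tag stored in every object, a switch at every call site, and then the ordinary call.

```cpp
#include <cassert>

// Variant 1: dynamic binding. All names are hypothetical.
struct Animal
{
  virtual ~Animal() {}
  virtual int legs() const = 0;
};
struct Bird : Animal { int legs() const { return 2; } };
struct Dog  : Animal { int legs() const { return 4; } };

int legs_virtual(const Animal& a)
{
  return a.legs();               // one call through the virtual table
}

// Variant 2: the "optimization": a type tag, a switch, then a call.
enum Kind { BIRD, DOG };
struct TaggedAnimal { Kind kind; };

int bird_legs() { return 2; }
int dog_legs()  { return 4; }

int legs_switch(const TaggedAnimal& a)
{
  switch (a.kind) {              // branch on the tag first
    case BIRD: return bird_legs();
    case DOG:  return dog_legs();
  }
  return 0;                      // unknown tag; every new kind means editing this switch
}
```

Both variants give the same answers. The difference is that the second one must be kept in step by hand whenever a new kind is added, which is exactly the generality the virtual call provides for free.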

Technical problems
The largest technical reason for slow C++ programs is compilers with very poor code optimizers. Most C++ compilers use exactly the same optimization techniques as C and Fortran compilers do, despite the fact that the way a C++ program works and accesses memory etc. differs a lot. I know of one C++ compiler that has a code optimizer made especially for the requirements of C++. It has been shown to equal and even beat Fortran for performance, and that is when using good encapsulation and templates. When feeding it a program written like a C program, however, performance will be poorer. If you are curious, look up KAI C++ (Note, I am not associated with them in any way, I have used their compiler, that is all.) Unfortunately it is not available for OS/2, and I doubt it ever will be (bug them about it, and it might happen.)

I mentioned earlier the problem of outlined inline functions. A good compiler will not leave you several copies as static functions, but unfortunately many do.

A worse technical problem is exceptions. Some compiler vendors, and many proponents of exceptions, say that exception handling can be implemented such that there is no performance loss at all, unless an exception is actually thrown. It may seem like it, but it is not true. No matter how good the implementation is, exceptions add a number of program flows that would otherwise not be there, and that makes life harder for the code optimiser.

Recommended reading
I want to finish by recommending some books on C++. Also, do follow comp.lang.c++.moderated.

``The Design and Evolution of C++'', Bjarne Stroustrup. This book answers all the why's. Why does C++ have multiple inheritance? Why does it provide assignment, destruction, copy-constructor and default constructor for you? Why is RTTI there? Why no garbage collection? Why not dynamic types like in Smalltalk?

``Ruminations on C++'', Andrew Koenig and Barbara Moo. Entertaining and enlightening. Koenig is the only author I know who can explain a problem to solve, which one immediately realizes will require a large program, present an easy to understand one page solution just to say it was unnecessarily clumsy and reduce it to half a page without sacrificing extensibility and generality. This is a pleasant book to read.

``Effective C++, 2nd ed'', Scott Meyers. Contains 50 tips, with examples and discussions, for how to improve your programs. A must read for any C++ programmer. Meyers' writing style is easy to follow and entertaining too.

``More effective C++'', Scott Meyers. Another 35 tips, this time introducing cleverer optimisation techniques like copy-on-write.

``Advanced C++, Programming Styles and Idioms'', James O. Coplien. Without doubt this book from 1992 is getting old, but it contains many useful techniques. It is a mind opener in many ways. Beware, though, this book is not for beginners.

``Scientific and Engineering C++'', John J. Barton and Lee R. Nackman. Often simply referred to as B&N. This is a modern ``Advanced C++''. What they have done is to break almost every rule of thumb, and as a result they have extremely extensible, slick, type-safe and fast designs.
