question did not themselves develop. This, itself, constitutes a new fundamental
technical skill that is one of the six that you will now find in this book.
And So, And Without Further Ado: Here Are The Six
As I said in the Introduction, the following is a list of what I consider to be six
fundamental technical skills that every computer programmer needs to know.
Technical means that these are things which you need to know and which you will apply
when crafting (and troubleshooting) the computer software that you write, and/or that you
(and your team) maintain, for your client or employer. These skills are not particular to
any single size, type, or brand of computer hardware, and for the most part also are not
limited to any computer programming language or tool. These also are not social or
organizational skills. (That would be a separate list, entirely.)
The List
1. Internal memory management and data structures.
2. Objects.
3. SQL Database Queries and Concepts.
4. Precise specification, strategy, and implementation.
5. Front-End, Back-End. User-interfaces and frameworks.
6. Pragmatic Debugging skills.
This list is not in any particular order, although I will choose to address it in the
sequence given. My treatment of each topic will also not be extremely detailed. Please
understand that I am seeking to provide you with a 30,000-foot view, and to point you in
specific directions from which you can pursue additional research on your own.
This list is also not a primer, and I emphatically do not, by using this phrase, intend
any negative slight to you. The topics that I will present here might well, section-by-section,
require re-reading. (And they might require clarification. Since this is an e-book,
we can do that.)
David Intersimone, the original director of development at the (now long-defunct)
Borland International, referred to this experience as "a sip from the fire-hose." I must
acknowledge that your initial experience with the forthcoming material might well be the
same. However, as your Gentle Author, I hope that you will not in fact expect anything
less from the text that you are about to consume.
And so, with all that now said: Let us begin.
programs must, for their own purposes (whatever those are), construct a
foundation which is sufficient for whatever it is that they are supposed to do.
This means, essentially, two things:
1. Any incoming (or computed) data used by the program must be stored in such
a way that the program, during its execution, can (of course) obtain it
again.
2. The program must be capable of handling a variable and unpredictable
quantity of data. There must be no pre-conceived limits as to just how
many copies of data might be stored (or storable).
Ordinarily, these chores are delegated to pre-existing storage strategies which are an
intrinsic part of whatever language system is being used. There are usually two types:
those which store a value under a particular key (requiring that exact key to be provided
in order to retrieve it again), and those which store an arbitrarily-sized list of (zero or
more) values. These two are frequently used in combination, to allow zero or more
values to be referenced by any unique key: each element of the keyed data-store (such as
a hash or tree), refers to a separate list.
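The combination just described can be sketched in Python (the names here are illustrative, not from any particular system): a keyed store, in which each key refers to its own arbitrarily-sized list of values.

```python
# A keyed store (a dict) in which each key refers to a separate,
# arbitrarily-sized list of values.
orders_by_customer = {}

def add_order(customer_id, order):
    # Create the per-key list on first use, then append to it.
    orders_by_customer.setdefault(customer_id, []).append(order)

add_order("C-1", "widget")
add_order("C-1", "gadget")
add_order("C-2", "sprocket")

print(orders_by_customer["C-1"])          # ['widget', 'gadget']
print(orders_by_customer.get("C-3", []))  # [] -- nothing stored under that key
```

Providing the exact key retrieves the list stored under it; a key that was never used simply yields nothing, and there is no pre-conceived limit on how many values any one list may hold.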
Let us now consider what sort of things can go wrong with these arrangements.
What sort of things can cause a program to misbehave, or fail? Here are the most common culprits:
1. Stack Overrun, caused by endless recursion: As I said earlier, the
stack is the portion of memory that's used to manage subroutine calls. When
a subroutine is called, information is stored in the stack to facilitate returning
from the subroutine, and the subroutine instance's local variables are also
stored there. The stack is of a limited size. Therefore, it is possible to overrun
the boundaries of the stack if too many nested subroutine calls are made.
Pragmatically, this means that subroutines are calling themselves, so-called
"recursively," without ever returning from any of those calls. The effect of
this sort of bug on a program is instantly fatal, but relatively easy to debug.
2. Heap Corruption (the infamous "Double Free"): This always-fatal
problem is caused by corruption of the internal data structures which manage
the allocation and return of memory in the Heap. There are two routines: a
malloc() routine, which requests a block of memory of a specified size
(returning an address), and a free() routine, which releases a block of memory
at a specified address (which must have previously been obtained from
malloc() ). Programs are required to free() only addresses that they obtained
from malloc(), and to free() any particular address only once. They are also
required to constrain their memory-modifications only to the range of
addresses given, never modifying any adjacent bytes. Most modern
programming languages protect from these types of problems by managing
the low-level malloc() and free() calls themselves.
3. Heap Exhaustion ("Memory Leaks"): Programs are required to release, in a
timely fashion, any storage that they are no longer using. Most modern programming
languages take care of this chore through some mechanism which detects
automatically when a particular storage block is no longer being referenced,
but so-called "leaks" can occur when, for instance, a series of storage-blocks
all contain references to one another, but there remain no other references
elsewhere to any of those blocks. (Since all of the blocks are still
referenced, they never get released.) Heap exhaustion can also be caused
by inefficient program design.
4. Failure to detect when a storage-allocation request could not be satisfied:
When a malloc() request cannot, for whatever reason, obtain the amount of
storage requested, it will typically return zero, a special value also known
as NULL. Programs should detect if this occurs, and respond accordingly,
but they rarely do.
5. Exhaustion of fixed-size storage arrays: Some early programming
languages do not allow storage to be dynamically allocated from the heap.
Instead, the programmer must specify a fixed size for the structure. Programs
are supposed to determine if the space within these fixed structures has been
exhausted, but they rarely do. Programming languages also usually do not
detect that a reference has been attempted which lies outside of the prescribed
boundaries of the structure. The usual consequence is stack or heap
corruption.
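Culprit #1 is easy to demonstrate. Here is a minimal sketch in Python, whose runtime guards its own stack with a recursion limit and raises RecursionError rather than letting the stack overrun silently (the function name is, of course, illustrative):

```python
import sys

# A subroutine that calls itself without ever returning: each call
# consumes stack space until the runtime's guard limit is reached.
def runaway(n):
    return runaway(n + 1)   # no base case: this never returns

try:
    runaway(0)
except RecursionError:
    print("stack limit reached; the configured limit was", sys.getrecursionlimit())
```

In a language without such a guard (C, for instance), the same bug simply smashes the stack, which is why the effect is instantly fatal.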
Memory issues are a common source of problems in software that is in the process of
being developed, but they are much less common in programs that are in production.
Two: Objects
Many of the influential books in computer science literature share a common
characteristic: they are small. Certainly one of the most important of these was entitled
Algorithms + Data Structures = Programs, by Dr. Niklaus Wirth.8
Truly, the title of this book says it all. Any computer program consists of
algorithms (the step-by-step execution of instructions that is called-for), applied to data
structures such as the ones alluded to in the preceding section of this book.
In early programming languages, these two concerns (algorithm and data) were
addressed separately, in different parts of the program. (The COBOL language was a
particular example, defining all (fixed) data structures in the so-called DATA
DIVISION, and all algorithms in the PROCEDURE DIVISION.) This is not so much
of a problem with regard to the local variables that might be associated with a particular
procedure or function, but it is a very vexing concern with regard to the global storage
that is used by multiple procedures and functions throughout a program.
In a word, the problem is that the two things are separated: the data structures which
are manipulated by algorithms, are separate from the algorithms which manipulate the
data structures. If decisions need to be made as to which algorithm should be applied to
which data, these decisions wind up being redundantly scattered throughout the entire
program.
To address this concern, the notion of objects was invented.
An object, for our purposes, is a self-describing piece of storage, allocated from
the heap. It contains, not only space for the individual values (properties) which might
need to be stored there, but also additional descriptive data (metadata) which serves to
directly associate the object with the procedural code (methods) that are designed to
operate in conjunction with it.
Significantly, given a particular object and a request to apply a particular function
against it (a so-called method call), the computer is able to determine which function is
the correct one to call based only on the metadata contained within the object itself.
The exact mechanisms by which this determination is made are concealed from the
programmer, but they are very efficient.
The paradigm that is usually quoted is: "Hey, you! Do this!" Whereas, in a
conventional programming language, a specified subroutine would be called and a
reference to the data would be supplied to it as a parameter, in an object-oriented
programming language the primary reference is to the object ("Hey, you!"), which is then
instructed to call one of its methods ("Do this!"). The actual sequence of events that
subsequently takes place may vary from object to object, and from one method-call to the
next, because the decision is made literally on-the-fly.9
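The "Hey, you! Do this!" paradigm can be sketched in Python (the classes here are illustrative). Each object's own type carries the metadata that selects the correct method:

```python
class Circle:
    def __init__(self, radius):
        self.radius = radius
    def area(self):                 # the method associated with Circle objects
        return 3.14159 * self.radius ** 2

class Square:
    def __init__(self, side):
        self.side = side
    def area(self):                 # same request, entirely different code
        return self.side ** 2

# The correct area() is chosen at run time, on-the-fly, per object:
for shape in (Circle(1.0), Square(2.0)):
    print(shape.area())             # 3.14159, then 4.0
```

The loop never says which area() to call; each object, told "Do this!", answers with its own method.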
Since late-binding is the fundamental characteristic of any object-oriented
programming system, there are many approaches that existing languages use to obtain it.
Some languages are designed strictly from the ground up to use an object-oriented
runtime. But programs are subject to certain design difficulties once they have been
in service for a number of years, mostly due to the inheritance schemes aforementioned.
As long as the business requirements do not change in any way that is not perfectly
reflected in the object inheritance stratagem originally devised for the program, such
programs can have a very long service-life, indeed. However, if requirements do change
fundamentally, inheritance can become an intractable form of coupling between the
various subclasses which are derived from a common ancestor.
There are three types of joins that can be used: inner joins, which return only rows
which have identical values in both tables, and left or right outer joins, which
always return all of the rows from one of the tables or the other. (An inner join against
Orders and Customers would return Orders that are associated with Customers, while an
outer join might return [left outer-join] all Orders, with or without Customers, or
[right] all Customers, with or without Orders.)
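The Orders/Customers example can be sketched using Python's built-in sqlite3 module (the table contents are illustrative: Ada has an order, Bob has none):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE Customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO Customers VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO Orders VALUES (10, 1, 'widget');
""")

# Inner join: only rows with matching values in both tables.
inner = db.execute("""
    SELECT c.name, o.item FROM Customers c
    JOIN Orders o ON o.customer_id = c.id ORDER BY c.id
""").fetchall()
print(inner)   # [('Ada', 'widget')] -- Bob is absent: he has no orders

# Left outer join: always all Customers, with or without Orders.
outer = db.execute("""
    SELECT c.name, o.item FROM Customers c
    LEFT JOIN Orders o ON o.customer_id = c.id ORDER BY c.id
""").fetchall()
print(outer)   # [('Ada', 'widget'), ('Bob', None)]
```

Note how the outer join fills in NULL (Python's None) for the missing Order, rather than dropping the row.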
Since SQL queries do not specify how the database engine is to obtain the specified
results, it is very important to understand how your queries will be interpreted. It is
certainly possible to write two different queries that will produce the same results, but that
will do so in dramatically more- or less-efficient ways. Most database systems provide an
EXPLAIN command which will tell you (in rather arcane, system-specific terms) exactly
how the database engine would go about carrying out a particular query.
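SQLite's variant of this facility, for example, is EXPLAIN QUERY PLAN, reachable from Python's sqlite3 module (the table and index here are illustrative, and the exact wording of the report varies by version, as befits something arcane and system-specific):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
db.execute("CREATE INDEX idx_cust ON Orders (customer_id)")

# Ask the engine how it *would* carry out the query, without running it.
plan = db.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Orders WHERE customer_id = ?", (42,)
).fetchall()
for row in plan:
    print(row)   # a line describing a SEARCH using the index idx_cust
```

Dropping the index and re-running the same command would show the plan degrade to a full-table SCAN, which is exactly the sort of difference the text is warning about.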
A very significant problem with SQL queries, in too-typical deployed applications, is
that the web server (or whoever is issuing a particular query) can do anything and
everything. Every SQL server has some kind of permissions-system which specifies
exactly what any user is and is not permitted to do. If some web-site hacker is, by
whatever means, able to persuade your web-server to issue the DROP TABLE (or even
the DROP DATABASE(!!)) command, and your web-server is authorized to issue such a
command, then (at least a very significant part of) your database just disappeared.
'Nuff said.
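Restricting what the web-server's account is permitted to do is one defense; another, on the query side, is to bind user input through a parameter placeholder so that hostile text is treated as plain data rather than as SQL. A sketch, again with sqlite3 (table and input are illustrative):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Users (name TEXT)")
db.execute("INSERT INTO Users VALUES ('alice')")

hostile = "alice'; DROP TABLE Users; --"

# The '?' placeholder binds the whole string as a value, not as SQL text,
# so the embedded DROP TABLE never reaches the engine as a command.
rows = db.execute("SELECT name FROM Users WHERE name = ?", (hostile,)).fetchall()
print(rows)                                                 # [] -- no such user
print(db.execute("SELECT COUNT(*) FROM Users").fetchone())  # (1,) -- table intact
```

Had that hostile string been pasted directly into the query text, the result could have been very different.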
When you deal with SQL databases, you must also deal with the issue of
concurrency. On a typical database server, hundreds of queries might be executing at the
same time, and these queries may or may not be specifically concerned with what the
other queries are doing. (For instance, if a user is merely browsing a product catalog,
it's essentially a certainty that the catalog's contents won't be changing at the time.
Accounting data, however, is a different matter.) SQL database systems have a specific
strategy for dealing with this issue: transactions.10
A transaction is defined as a single "unit of work": it could be a set of
modifications, deletions, and/or updates, or it could simply be a set of queries, which is
considered to be atomic. That is to say, a single, indivisible group. In the case of
modifications, either the entire set of modifications happens, or none of them do. In
any case, a transaction has some specified degree of isolation from every other transaction
that is occurring at the same time. For example, an accounting report might need to secure
a "snapshot" view of a very busy database as it existed at a particular instant in time.
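Atomicity can be sketched with sqlite3 (the table and amounts are illustrative): either both halves of a transfer happen, or neither does.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Accounts (name TEXT PRIMARY KEY, balance INTEGER)")
db.execute("INSERT INTO Accounts VALUES ('checking', 100), ('savings', 0)")
db.commit()

def fail_midway():
    raise RuntimeError("simulated failure between debit and credit")

try:
    with db:  # begins a transaction; commits on success, rolls back on error
        db.execute("UPDATE Accounts SET balance = balance - 70 "
                   "WHERE name = 'checking'")
        fail_midway()  # something goes wrong before the matching credit...
        db.execute("UPDATE Accounts SET balance = balance + 70 "
                   "WHERE name = 'savings'")
except RuntimeError:
    pass

# The lone debit was rolled back: the unit of work proved indivisible.
print(db.execute("SELECT balance FROM Accounts WHERE name = 'checking'")
        .fetchone())   # (100,)
```

Without the transaction, the failure would have left 70 units debited from checking and never credited to savings, which is precisely the half-done state that atomicity forbids.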
the not-so-visible supporting software infrastructure which must be built under the surface.
It is also very easy to fall in love with what you've done, only to discover that a
different approach or presentation might work better. Once again, these discoveries are
often made at inopportune moments, and require sometimes deep-seated and far-reaching
changes to the system, which quickly de-stabilize it.
Debugging, unabashedly, is detective work. But these two techniques that I have
now discussed will greatly improve the effectiveness of this process. By making the
program suspicious of its own behavior, you improve the odds that defects will be
discovered and corrected early. By building a chronology of what happened recently,
you make it easier to discover the internal-state of the system which enabled the defective
behavior to occur.
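Both techniques can be sketched together in Python (the function and event names are illustrative): assertions make the program suspicious of its own behavior, while a small ring buffer keeps a chronology of what happened recently.

```python
import logging
from collections import deque

logging.basicConfig(level=logging.INFO)

recent = deque(maxlen=100)   # the last 100 events; the oldest fall away first

def record(event):
    recent.append(event)     # the chronology of what happened recently
    logging.info(event)

def withdraw(balance, amount):
    record(f"withdraw {amount} requested; balance is {balance}")
    # The program is suspicious of its own behavior:
    assert amount >= 0, "negative withdrawal amount"
    assert amount <= balance, "overdraft attempted"
    new_balance = balance - amount
    record(f"withdraw ok; new balance is {new_balance}")
    return new_balance

print(withdraw(100, 30))   # 70, with two events now in the chronology
```

When an assertion fires, the buffer already holds the trail of recent events that led up to the defective internal state.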
All production systems should also be accompanied by a comprehensive test suite,
the purpose of which is to exercise and re-exercise components of the system at various
levels. The test-suite is run and re-run constantly, and both successful and unsuccessful
outcomes are logged. If a source-code change is introduced (or is about to be introduced)
which causes a test-case to fail, well: forewarned is forearmed.
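A minimal test suite can be sketched with Python's built-in unittest module (the function under test is illustrative):

```python
import unittest

def add_tax(price, rate=0.07):
    return round(price * (1 + rate), 2)

class TestAddTax(unittest.TestCase):
    def test_basic(self):
        self.assertEqual(add_tax(100.0), 107.0)
    def test_zero(self):
        self.assertEqual(add_tax(0.0), 0.0)

# Run the suite and report both successful and unsuccessful outcomes.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAddTax)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("ran", result.testsRun, "tests;", len(result.failures), "failures")
```

In practice such suites are wired into the build, run and re-run on every change, so that a newly-failing case is noticed before the change reaches production.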
In Closing:
Well then, there you have it. My six. And, along the rambling way, my pragmatic
recommendations. Most if not all of the topics that I have quickly described in this little
book will call for further exploration on your part, and I hope that I have succeeded in
setting the stage of understanding for you to do so.
Computer programming has changed enormously over the past sixty years and
counting, but in many ways it has changed not-at-all. We're still writing instructions for
electronic machines to (unthinkingly) carry out, and the process requires a lot of human
thought. Capturing the big picture, in spite of the myriad details, is something that can
easily become lost in the shuffle. I hope that these words have helped you in some small
way, and welcome your comments, reviews and feedback.
Vince North
1 The observation, attributed to Dr. Gordon Moore, that semiconductors would double in speed and density every two
years.
2 A term most-likely originally coined by Charlotte Brontë in her book, Jane Eyre: "Gentle Reader, may you never feel
what I then felt! May your eyes never shed such stormy, scalding, heart-wrung tears as poured from mine." Oh yes,
Ms. Brontë had quite the way with words.
3 Wikipedia defines "craft" as: "a pastime or a profession that requires particular skills and knowledge of skilled work."
5 All right, all right. I have no pragmatic choice, at this point, but to impose an important technical term: process,
where previously I said, "executing program." On almost any computer today, you can run more than one copy of the
same program at the same time, just as easily as you can run different programs. Operating systems routinely call
each distinct instance of "<any program>, running here" a process.
6 The operating system is the foundational layer of software which governs the operation of the entire computer
system. Unix, Linux, OS/X, Z-OS, etc. are all examples of this. These create and manage the operating
environment under which all processes ultimately operate, and define and implement the entire world that is available to
them.
7 A thread is an independent thread of execution running within the auspices of a single process. For our purposes
8 ISBN: 978-0-13-022418-7.
10 Not every SQL database system supports transactions. Most do, but some have strings attached. For instance, the
ever-popular MySQL database only supports transactions if the InnoDB storage engine is used.
11 North, Vincent P., Managing the Mechanism: Why Software Projects Aren't Like Any Other Project You've Ever Tried