DAT2343

Basic Character Encoding


Including ASCII

Alan T. Pinck / Algonquin College; 2003

Historical Character Usage


Early general purpose computers (dominated by IBM)
supported limited usage of non-numeric characters:
Identification/headings on printed reports
Program source code text
Outside of the general computer area, some character
encoding was used for text transmission (telegrams);
initially Morse code, but this was replaced with fixed
length pattern codes when automated equipment began to
be used.

Historical Requirements
Text was not intended for the general public.
Alphabetic characters were only in upper case.
Relatively few special characters (periods,
parentheses, dollar signs, arithmetic operators)
were supplied.
10 digits + 26 letters + (fewer than) 20 special
character symbols: fewer than 56 code patterns
were required.

6-Bit Codes
A 6-bit encoding system permits 64 symbols to be
encoded. This was enough provided only upper case
alphabetic symbols and relatively few special
symbols were required.
IBM and Western Union (the telegraph company)
stayed with 6-bit encoding systems after most of the rest
of the computer and data transmission companies
moved to a system which permitted both upper and
lower alphabetics, and more special symbols.
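
The arithmetic behind the 6-bit limit can be checked with a small sketch. The symbol counts come from the slides (20 is the stated upper bound on special symbols); the function and variable names are illustrative assumptions, not part of the lecture.

```python
def patterns(bits: int) -> int:
    """Number of distinct codes available with the given number of bits."""
    return 2 ** bits

# Counts from the slides; SPECIALS is an upper bound ("less than 20").
DIGITS, LETTERS, SPECIALS = 10, 26, 20

upper_only = DIGITS + LETTERS + SPECIALS       # upper case letters only
with_lower = DIGITS + 2 * LETTERS + SPECIALS   # upper and lower case letters

print(patterns(6), upper_only, upper_only <= patterns(6))  # 64 56 True
print(patterns(6), with_lower, with_lower <= patterns(6))  # 64 82 False
print(patterns(7), with_lower, with_lower <= patterns(7))  # 128 82 True
```

Adding lower case letters pushes the requirement past 64 patterns, which is why a seventh bit was needed.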

Formation of ASCII
General user demand for more character symbols
(including lower case alphabetics).
IBM did not believe that the market demand was
sufficient to move from a 6-bit code.
No other single company controlled a large enough
market share to be able to create a viable system on its
own.
A group of computer, peripheral, and data transmission
companies joined to establish a standard.

ASCII Basics
American Standard Code for Information Interchange
7-bit code provided unique codes for up to 128
different characters
Some terminal equipment: when idle, the power was
off (which would look like 0000000); other terminal
equipment: when idle, the power was on (which would
look like 1111111). Therefore neither the 0000000 nor
the 1111111 pattern was assigned a printable character;
they serve as null patterns (NUL, 00h, and DEL, 7Fh).

Extended ASCII
Byte = Collection of bits used to encode a character
ASCII is almost always implemented using an 8-bit
byte (character).
Only the 7-bit patterns were standardized under
ASCII.
Standard 8-bit ASCII codes start with a zero-valued
bit (followed by the 7-bit ASCII code).
Extended ASCII codes start with a one-valued bit;
these codes are not standard and vary in meaning
among different manufacturers and equipment.
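
As a minimal sketch of the rule above, the leading bit of an 8-bit byte can be tested with a mask. The function name `is_standard_ascii` is a hypothetical helper chosen for illustration.

```python
def is_standard_ascii(byte: int) -> bool:
    """True when the leading (most significant) bit of an 8-bit byte is 0."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value that fits in one 8-bit byte")
    # 0x80 is 10000000 in binary: it isolates the leading bit.
    return (byte & 0x80) == 0

print(format(0x41, "08b"), is_standard_ascii(0x41))  # 01000001 True
print(format(0xE9, "08b"), is_standard_ascii(0xE9))  # 11101001 False
```

What a byte such as E9h displays as depends on the manufacturer's extended set, so the sketch only reports that it lies outside standard ASCII.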

Major ASCII Coding Patterns


First 32 patterns (when written in hexadecimal, any
patterns starting with 0 or 1): control codes; the most
common of these are 0Ah (Line Feed) and 0Dh
(Carriage Return)
20(hex): blank; remainder of codes starting with 2(hex)
are special characters.
30(hex): 0; 31(hex): 1; etc.
41(hex): A; 42(hex): B; etc.
61(hex): a; 62(hex): b; etc.
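
The patterns above can be sketched as a small classifier for 7-bit codes. The function name and category labels are illustrative assumptions. Treating 7Fh (DEL) as a control code and every other unlisted code as a special character goes slightly beyond what the slide states.

```python
def classify(code: int) -> str:
    """Name the broad ASCII group that a 7-bit code belongs to."""
    if not 0 <= code <= 0x7F:
        raise ValueError("not a 7-bit ASCII code")
    if code <= 0x1F:              # hex patterns starting with 0 or 1
        return "control"
    if code == 0x20:
        return "blank"
    if 0x30 <= code <= 0x39:      # 30h..39h are the digits 0..9
        return "digit"
    if 0x41 <= code <= 0x5A:      # 41h..5Ah are A..Z
        return "upper-case letter"
    if 0x61 <= code <= 0x7A:      # 61h..7Ah are a..z
        return "lower-case letter"
    if code == 0x7F:              # DEL: assumption beyond the slide
        return "control"
    return "special character"

for code in (0x0A, 0x0D, 0x20, 0x24, 0x33, 0x41, 0x61):
    print(f"{code:02X}h: {classify(code)}")
```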

Sample ASCII Decoding - 1


Suppose we have the bit stream:
010101000110100001100101001000000011
Our first task would normally be to rewrite this as a series of
pairs of hexadecimal digits:
01010100 01101000 01100101 00100000 0011
  5 4      6 8      6 5      2 0     3 ...
(in actual practice it would be more common for the bit
stream to be presented already in pairs of hexadecimal
digits)
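
The regrouping step can be sketched in a few lines. The function name `bits_to_hex_pairs` is a hypothetical helper. Returning leftover bits separately is a design choice made here so that an incomplete final byte is not silently padded.

```python
def bits_to_hex_pairs(bits: str):
    """Split a bit string into 8-bit groups and write each as two hex digits.

    Returns (pairs, leftover), where leftover holds any trailing bits
    that do not fill a complete byte.
    """
    whole = len(bits) - len(bits) % 8
    pairs = [format(int(bits[i:i + 8], 2), "02X") for i in range(0, whole, 8)]
    return pairs, bits[whole:]

stream = "010101000110100001100101001000000011"
pairs, leftover = bits_to_hex_pairs(stream)
print(pairs)     # ['54', '68', '65', '20']
print(leftover)  # 0011
```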

Sample ASCII Decoding - 2


Write down the alphabet and beside each letter write
its ASCII code:
A : 41h (for lower case, add 20h)
B : 42h
C : 43h
...
I : 49h
J : 4Ah
K : 4Bh
...
Z : 5Ah
Remember: digits are 3?h
blank is 20h
LF is 0Ah
CR is 0Dh
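
The same table can also be generated rather than written by hand. This is a minimal sketch that relies on Python's built-in `ord` and `chr`; the name `letter_table` is an illustrative assumption.

```python
import string

def letter_table() -> dict:
    """Map each upper case letter to its ASCII code."""
    return {letter: ord(letter) for letter in string.ascii_uppercase}

table = letter_table()
for letter, code in table.items():
    lower = code + 0x20  # lower case: add 20h
    print(f"{letter} : {code:02X}h    {chr(lower)} : {lower:02X}h")

print(f"blank : {ord(' '):02X}h")   # 20h
print(f"digit 3 : {ord('3'):02X}h") # 33h
```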

Sample ASCII Decoding - 3


Given the ASCII hexadecimal pattern (as an
example):
54 68 65 20 33 0A 0D 47 6F 61 74 73
Matching these codes to the table we created, we
should have no trouble converting this into the
text:
The 3
Goats
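
The hand decoding above can be checked with Python's standard `bytes.fromhex` and `bytes.decode`. Using `repr` makes the two control codes visible as escape sequences instead of moving the cursor.

```python
pattern = "54 68 65 20 33 0A 0D 47 6F 61 74 73"

# fromhex ignores the spaces between the pairs of hexadecimal digits.
text = bytes.fromhex(pattern).decode("ascii")

print(repr(text))  # 'The 3\n\rGoats'
```

Here `\n` is the line feed (0Ah) and `\r` is the carriage return (0Dh), in the order they appear in the pattern.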

Note on End-Of-Line Codes


Different operating systems use different
standards for indicating an end of line.
Microsoft (DOS/Windows) uses a two-character sequence:
0Dh 0Ah (carriage return and line feed)
Unix uses only 0Ah (the line feed)
Classic Macintosh (Mac OS 9 and earlier) uses only 0Dh (the carriage return)
This can cause some problems when moving
text files from one system to another.
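
One common way to sidestep these problems is to convert every end-of-line convention to a single one when a file arrives. The following is a minimal sketch under that assumption; `normalize_eol` is a hypothetical helper, not a standard library function.

```python
def normalize_eol(data: bytes, eol: bytes = b"\n") -> bytes:
    """Rewrite CR LF, lone CR, and lone LF line ends as `eol`."""
    # Collapse the two-character sequence first so it is not
    # counted as two separate line ends.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", eol)

mixed = b"one\r\ntwo\rthree\nfour"
print(normalize_eol(mixed))             # b'one\ntwo\nthree\nfour'
print(normalize_eol(mixed, b"\r\n"))    # b'one\r\ntwo\r\nthree\r\nfour'
```

The sketch works on bytes rather than text so that no other character is reinterpreted along the way.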

End of Lecture
