Computer-related numeric systems and their written representation
In my last post I talked a little about the way computers see their pre-defined commands, in the form of machine code instructions that are grouped into applications by a programmer.
All computers represent machine code instructions, and any related data, as sets of binary digits (1s and 0s, on or off). This binary representation is composed of bits (a single 1 or 0), which in turn are grouped into larger sets such as bytes (8 bits, e.g. 10101010).
Returning to a single bit: as a binary digit it represents one of two states. A bit can be understood as a value of either 1 or 0, on or off, yes or no, true or false, or as something encoded by a switch or toggle of some kind.
While a single bit, on its own, is able to represent only two values, a string of bits may be used to represent larger values. For example, a string of three bits can represent up to eight distinct values.
A byte, or eight bits, is the smallest elementary grouping a computer can use; even if the value it stores is less than the maximum it can hold (decimal 255), the full byte of storage is always used.
As the number of bits in a string increases, the number of possible combinations of 0s and 1s grows exponentially: a single bit allows only two values, two bits combined give four, and so on. The number of possible combinations doubles with each binary digit added.
Adding up all the place values in a full byte gives a range from 0 to 255, that is, 256 possible combinations including the value 0. The short sketch below prints this doubling bit by bit.
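As a rough sketch of this doubling, a few lines of C++ can print how many distinct values each bit count up to a full byte can hold:

#include <iostream>

int main()
{
    // Each extra bit doubles the number of distinct values:
    // 1 bit -> 2, 2 bits -> 4, ... 8 bits (one byte) -> 256 values (0 to 255).
    unsigned long combinations = 1;
    for (int bits = 1; bits <= 8; ++bits)
    {
        combinations *= 2;   // 2 raised to the power 'bits'
        std::cout << bits << " bit(s): " << combinations
                  << " values (0 to " << combinations - 1 << ")\n";
    }
    return 0;
}

The last line of output, 8 bit(s): 256 values (0 to 255), is exactly the byte range discussed above.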
Groupings of a specific number of bits are used to represent different things and have specific names.
A byte is a bit string containing the number of bits needed to represent a character. On most modern computers this is an eight-bit string. Because the definition of a byte is tied to the number of bits making up a character, some older computers used a different bit length for their byte.
It is important to understand that the hardware within a computer can only use the binary numeric system for storing machine code instructions, memory addresses and numeric values. This is a consequence of the hardware itself, which can electronically store only two states (on or off); each memory location is in effect a physical micro-switch that exists in only these two possible conditions.
Beyond the on/off (1 or 0) states of the binary system, however, programmers have made use of, or devised, the octal, hexadecimal and decimal numeric systems to make the best use of computer memory for data and program storage and execution.
I don’t want to go into the full mathematical details of these numeric representation systems in this post, as it would be too much here. I will, however, show below the way computer programmers divide storage for the purpose of processing both data and machine code instructions.
Octal
The octal numeral system, or oct for short, is the base-8 number system, and uses the digits 0 to 7. Octal numerals can be made from binary numerals by grouping consecutive binary digits into groups of three (starting from the right). For example, the binary representation for decimal 74 is 1001010, which can be grouped into (00)1 001 010 – so the octal representation is 112.
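To check that grouping in code, here is a short C++ sketch (the nine-bit width is chosen only so the three-bit groups line up):

#include <bitset>
#include <iostream>

int main()
{
    unsigned int value = 74;                                  // decimal 74

    std::cout << "binary: " << std::bitset<9>(value) << '\n'; // 001001010
    std::cout << "octal : " << std::oct << value << '\n';     // 112

    // Reading the binary output three bits at a time from the right
    // gives 001 001 010, i.e. the octal digits 1 1 2.
    return 0;
}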
Why use octal in computer programming?
Octal became widely used in computing when systems such as the PDP-8, ICL 1900 and IBM mainframes employed 12-bit, 24-bit or 36-bit words. Octal was an ideal abbreviation of binary for these machines because their word size is divisible by three (each octal digit represents three binary digits). So four, eight or twelve digits could concisely display an entire machine word. It also cut costs by allowing Nixie tubes, seven-segment displays, and calculators to be used for the operator consoles, where binary displays were too complex to use, decimal displays needed complex hardware to convert radices, and hexadecimal displays needed to display more numerals.
All modern computing platforms, however, use 16-, 32-, or 64-bit words, further divided into eight-bit bytes. On such systems three octal digits per byte would be required, with the most significant octal digit representing two binary digits (plus one bit of the next significant byte, if any). Octal representation of a 16-bit word requires 6 digits, but the most significant octal digit represents (quite inelegantly) only one bit (0 or 1). This representation offers no way to easily read the most significant byte, because it’s smeared over four octal digits. Therefore, hexadecimal is more commonly used in programming languages today, since two hexadecimal digits exactly specify one byte. Some platforms with a power-of-two word size still have instruction subwords that are more easily understood if displayed in octal; this includes the PDP-11 and Motorola 68000 family. The modern-day ubiquitous x86 architecture belongs to this category as well, but octal is rarely used on this platform, although certain properties of the binary encoding of opcodes become more readily apparent when displayed in octal, e.g. the ModRM byte, which is divided into fields of 2, 3, and 3 bits, so octal can be useful in describing these encodings.
Octal is sometimes used in computing instead of hexadecimal, perhaps most often in modern times in conjunction with file permissions under Unix systems (see chmod). It has the advantage of not requiring any extra symbols as digits (the hexadecimal system is base-16 and therefore needs six additional symbols beyond 0–9). It is also used for digital displays.
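As a concrete illustration of the chmod case, here is a minimal POSIX-only sketch (the file name example.txt is just a placeholder):

#include <cstdio>
#include <sys/stat.h>   // POSIX chmod()

int main()
{
    // 0644 is an octal literal: owner read/write, group read, others read.
    // Each octal digit maps directly onto one three-bit permission group.
    // "example.txt" is a placeholder file name for this sketch.
    if (chmod("example.txt", 0644) != 0)
        std::perror("chmod failed");
    return 0;
}

The same permissions could be set from the shell with chmod 644 example.txt.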
Hexadecimal
In mathematics and computing, hexadecimal (also base 16, or hex) is a positional numeral system with a radix, or base, of 16. It uses sixteen distinct symbols, most often the symbols 0–9 to represent values zero to nine, and A, B, C, D, E, F (or alternatively a, b, c, d, e, f) to represent values ten to fifteen. Hexadecimal numerals are widely used by computer systems designers and programmers. Several different notations are used to represent hexadecimal constants in computing languages; the prefix “0x” is widespread due to its use in Unix and C (and related operating systems and languages). Alternatively, some authors denote hexadecimal values using a suffix or subscript. For example, one could write 0x2AF3 or 2AF3₁₆, depending on the choice of notation.
A nibble (sometimes nybble) is a number composed of four bits. Being a half-byte, the nibble was named as a play on words: a person may need several nibbles for one bite of something, and similarly a nibble is a part of a byte. Because four bits allow for sixteen values, a nibble is sometimes known as a hexadecimal digit.
As an example, the hexadecimal number 2AF3₁₆ can be converted to an equivalent decimal representation. Observe that 2AF3₁₆ is equal to the sum 2000₁₆ + A00₁₆ + F0₁₆ + 3₁₆, obtained by decomposing the numeral into a series of place-value terms. Converting each term to decimal, one can further write:
(2₁₆ × 16³) + (A₁₆ × 16²) + (F₁₆ × 16¹) + (3₁₆ × 16⁰) = (2 × 4096) + (10 × 256) + (15 × 16) + (3 × 1) = 10995.
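The same place-value sum can be reproduced directly in C++, where the 0x prefix lets the compiler supply the hexadecimal form for comparison:

#include <iostream>

int main()
{
    // Rebuild 2AF3 (hex) digit by digit, exactly as in the sum above.
    int value = 2  * 4096   // 2 x 16^3
              + 10 * 256    // A x 16^2
              + 15 * 16     // F x 16^1
              + 3  * 1;     // 3 x 16^0

    std::cout << value << '\n';                                      // 10995
    std::cout << (value == 0x2AF3 ? "matches 0x2AF3" : "mismatch") << '\n';
    return 0;
}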
Each hexadecimal digit represents four binary digits (bits), and the primary use of hexadecimal notation is a human-friendly representation of binary-coded values in computing and digital electronics. One hexadecimal digit represents a nibble, which is half of an octet or byte (8 bits). For example, byte values can range from 0 to 255 (decimal), but may be more conveniently represented as two hexadecimal digits in the range 00 to FF. Hexadecimal is also commonly used to represent computer memory addresses.
Why use hexadecimal in computer programming?
The main reason we use hexadecimal numbers is that it is much easier to express binary values in hex than in any other base; computers do not actually work in hex. Let’s look at an example using a byte. A byte is typically 8 bits and can store the values 0 to 255 (0000 0000 to 1111 1111 in binary). For people, expressing numbers in binary is not convenient: you are not going to turn to a co-worker and say that your phone number is 101 101 101 001 010 001 010, for obvious reasons. Imagine having to work with that on a daily basis. A more convenient expression is needed for the human.
Since a byte is 8 bits, it makes sense to divide it into two groups: the top 4 bits and the low 4 bits. Since 4 bits give a range of 0 to 15, a base-16 system is easy to work with, especially if you are only familiar with alphanumeric characters. It is easier to express a binary value to another person as “A” than as “1010”, and two hex digits can represent a whole byte cleanly.
This way, even if you are poor at math, you only need to memorise the multiplication table up to 15. If you have the hex value CE, you can easily determine that (12 × 16) + 14 = 206 in decimal, and can just as easily write it out in binary as 1100 1110.
Converting directly from binary would instead require knowing what each place holder represents and adding all the values together (128 + 64 + 8 + 4 + 2 = 206). It is much easier to work with binary through hex than through any other base system.
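Putting the two-nibble idea into a short sketch, the following C++ lines split CE into its high and low four bits and check the decimal and binary forms given above:

#include <bitset>
#include <iostream>

int main()
{
    unsigned char byte = 0xCE;                       // hex CE

    unsigned int high = (byte >> 4) & 0x0F;          // top nibble: C -> 12
    unsigned int low  = byte & 0x0F;                 // low nibble: E -> 14

    std::cout << "decimal: " << static_cast<int>(byte) << '\n';  // 206
    std::cout << "check  : " << high * 16 + low << '\n';         // 12 * 16 + 14 = 206
    std::cout << "binary : " << std::bitset<8>(byte) << '\n';    // 11001110
    return 0;
}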
Decimal
The decimal value system is used within computer programming primarily to represent the numeric data values that a program will receive, store, manipulate and output to the end user of an application.
As an example, the C++ language can represent decimal values using the following types, listed here as type, typical width and typical value range (a short sketch after the list shows how to check the actual sizes and ranges on your own compiler):
char, 1 byte, -127 to 127 or 0 to 255
unsigned char, 1 byte, 0 to 255
signed char, 1 byte, -127 to 127
int, 4 bytes, -2,147,483,648 to 2,147,483,647
unsigned int, 4 bytes, 0 to 4,294,967,295
signed int, 4 bytes, -2,147,483,648 to 2,147,483,647
short int, 2 bytes, -32,768 to 32,767
unsigned short int, 2 bytes, 0 to 65,535
signed short int, 2 bytes, -32,768 to 32,767
long int, 4 bytes, -2,147,483,648 to 2,147,483,647
signed long int, 4 bytes, same as long int
unsigned long int, 4 bytes, 0 to 4,294,967,295
float, 4 bytes, +/- 3.4e +/- 38 (~7 digits)
double, 8 bytes, +/- 1.7e +/- 308 (~15 digits)
long double, 8 bytes, +/- 1.7e +/- 308 (~15 digits)
wchar_t, 2 or 4 bytes, 1 wide character
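Because the widths and ranges above are only typical, a short sketch like the following (using the standard <limits> and <climits> headers) will print what your own compiler actually uses:

#include <climits>
#include <iostream>
#include <limits>

int main()
{
    // Report what this particular compiler and platform actually use,
    // since the widths in the list above are only typical values.
    std::cout << "int   : " << sizeof(int) << " bytes, "
              << std::numeric_limits<int>::min() << " to "
              << std::numeric_limits<int>::max() << '\n';

    std::cout << "short : " << sizeof(short) << " bytes, "
              << SHRT_MIN << " to " << SHRT_MAX << '\n';

    std::cout << "double: " << sizeof(double) << " bytes, about "
              << std::numeric_limits<double>::digits10 << " decimal digits\n";
    return 0;
}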
As you can see, these decimal data types (including character types; alphanumeric characters are given a decimal value within a character table) exist at a higher level of representation than binary computation. Such decimal data types exist only within low- and high-level programming languages, which convert their decimal values into binary values so that the lower-level binary operations can be performed on them.
This numeric representation process is also reproduced in the case of octal and hexadecimal values, both of which can likewise be used to represent instructions within the CPU’s machine code instruction set.