C Primer Plus (5th Edition)
When a user interacts with a computer, he or she provides input for it (see Figure 1.1). In response, the computer processes the input, returning (hopefully) valuable output to the user. Input can take the form of commands given, text and numbers typed, and images scanned. Output could be the results of spreadsheet calculations, a letter printed on a printer, or a car moving onscreen during a racing game.
Figure 1.1. Input is processed and is returned as output.
For the user to provide input, the computer has several input devices as part of its hardware, the keyboard and the mouse being the most familiar.
To the user, output is typically provided with devices such as screens and printers. Other input or output devices are available, some of which are very specialized.
Two additional important hardware components, which are less obvious to the ordinary user, are the processor and the memory.
The processor is the device inside the computer that follows a program's instructions. Other terms used for the processor are CPU (Central Processing Unit) or chip. Many different commercial chips are available. A well-known example would be the Pentium chip.
The processor can only carry out simple instructions, such as very simple arithmetic calculations and moving items like numbers around among different locations. However, the speed with which the processor can perform these tasks is amazing and allows it to work through intricate combinations of instructions to perform very complex operations.
Computer memory can be divided into two categories: auxiliary memory and main memory.
Auxiliary memory consists of devices used by the computer to store data permanently. When needed, the data can be retrieved from these devices.
Typical examples of auxiliary memory are floppy disks, compact disks, and disk drives.
Main memory holds the running program and the results of the computer's intermediate calculations. It is also referred to as RAM or Random Access Memory.
Main memory is where a program, or parts of a program currently executing, is stored. It also holds much of the data being manipulated by the program. Examples of data could be the wind speed in different parts of a simulated tornado or the current location of a race car in a car game application. C# conveniently allows the programmer to name the data values of the program and simply refer to the values by their name. For example, he or she could call the tornado wind speed windSpeed or the car location carLocation. Thus, most of the time, the programmer is not exposed to the underlying mechanisms of main memory; it is all taken care of by C#. However, the programmer must, to a limited degree, still decide how these values are represented by main memory to write a valid program. Furthermore, if this is done appropriately, he or she can design leaner and faster programs. Consequently, the nature of main memory, discussed shortly, contains important aspects for you to understand.
The Nature of Main Memory
The main memory contains millions of tiny electrical circuits. The circuits are like light switches in that they can be in one of two states: 1 for "on" or 0 for "off." This two-stateness originates from the relative ease with which physical devices containing just two stable states can be designed and manufactured.
At the machine level of the computer, it is only possible to do the following three things:
- Set a circuit to 1
- Set a circuit to 0
- Check the state of a circuit
These very primitive operations might seem too restrictive to allow the computer to perform complicated tasks such as word processing and playing games. However, the great advantage of main memory is its astonishing speed and the ability to assign various meanings to individual circuits or groups of circuits.
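The three primitive operations listed above can be sketched in a few lines. This is only an illustration in Python, not how the hardware or C# does it: the integer `cell` standing in for a row of circuits and the helper names `set_bit`, `clear_bit`, and `check_bit` are assumptions made for this example.

```python
# A hypothetical row of circuits, modelled as a Python integer.
# Bit position 0 is the rightmost circuit.

def set_bit(cell, position):
    """Set the circuit at `position` to 1."""
    return cell | (1 << position)

def clear_bit(cell, position):
    """Set the circuit at `position` to 0."""
    return cell & ~(1 << position)

def check_bit(cell, position):
    """Return the state (0 or 1) of the circuit at `position`."""
    return (cell >> position) & 1

cell = 0                       # every circuit starts "off"
cell = set_bit(cell, 3)        # switch circuit 3 on
print(check_bit(cell, 3))      # prints 1
cell = clear_bit(cell, 3)      # switch circuit 3 off again
print(check_bit(cell, 3))      # prints 0
```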
To illustrate how you can use just one circuit (either 1 or 0) to contain important information, consider a computer game with two difficulty levels: Novice and Advanced. When the user starts the game, the program asks whether the user wants to play at the Novice or Advanced level. If the user chooses Novice, the computer will, if programmed accordingly, set a specific circuit to 0. If the user chooses Advanced, the circuit will be set to 1. So whenever the running program needs to know the current difficulty level during a game (maybe to determine whether your enemy in the game should be good at martial arts or not), the computer can check the state of the circuit.
By applying other interpretations to the state of a circuit, any meaning can be assigned to it by the program. Other examples could be connected (1) or not connected (0) to the network, ready (1) or not ready (0), underlined (1) or not underlined (0), bold (1) or not bold (0).
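As a rough sketch of how a program attaches meaning to single bits, the Python fragment below models the difficulty-level circuit and two of the formatting flags mentioned above. The names `NOVICE`, `ADVANCED`, `BOLD`, `UNDERLINED`, and `enemy_knows_martial_arts` are hypothetical and chosen only for this example.

```python
# One circuit holding the difficulty level: 0 means Novice, 1 means Advanced.
NOVICE, ADVANCED = 0, 1

def enemy_knows_martial_arts(difficulty_bit):
    """Check the circuit to decide how strong the enemy should be."""
    return difficulty_bit == ADVANCED

difficulty = ADVANCED                        # the user chose Advanced
print(enemy_knows_martial_arts(difficulty))  # prints True

# Several independent yes/no meanings, each given its own bit.
BOLD = 0b01
UNDERLINED = 0b10

style = BOLD                       # bold (1), not underlined (0)
print(bool(style & BOLD))          # prints True
print(bool(style & UNDERLINED))    # prints False
```

The circuit itself only ever holds 0 or 1; whether that means "Advanced" or "bold" is decided entirely by the program that reads it.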
New Term: Bit
A bit is a circuit or a digit that can have exactly two values, such as 1 and 0.
So far, I have explained how concepts with only two states can be represented. Far more complex information, such as a number like 6574635, or a sentence such as "This sentence is more complex." also needs to be represented in the main memory. Let's see how the numbers 0, 1, 2, and 3 can be represented in the main memory. To represent a number, we form a group of several circuits and combine their states. In this case, we need to combine two circuits and then decide on the interpretation, as shown in Figure 1.2. It is important to notice that the interpretation is arbitrary and not necessarily calculated. Consequently, we could have written 20 instead of 3 in the lower-right square of the figure but, because we have decided to represent 0, 1, 2, and 3, we stick to 3 for the moment.
Figure 1.2. Representing 0, 1, 2, and 3 with 2 bits.
When circuit 1 is 0 and circuit 2 is 0, we have decided the value to be 0 and so forth.
The computer program could easily apply another interpretation representing, for example, four colors, such as yellow, red, blue, and green.
How many numbers can we represent with 3 circuits? Figure 1.3 shows us the different combinations.
Figure 1.3. Representing 0, 1, 2, 3, 4, 5, 6, and 7 with 3 bits.
By involving more circuits and changing the interpretations of the states of the circuits, it is possible to represent any number, letter, color, and so on that we want.
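To make the counting concrete, the following Python sketch enumerates every combination of states for two and three circuits and then applies two different interpretations to the same two-circuit combinations. The helper name `circuit_states` and the order in which values are paired with combinations are assumptions for illustration; as the text notes, the pairing is arbitrary.

```python
from itertools import product

def circuit_states(circuit_count):
    """List every on/off combination for the given number of circuits."""
    return list(product((0, 1), repeat=circuit_count))

two_circuits = circuit_states(2)
three_circuits = circuit_states(3)
print(len(two_circuits), len(three_circuits))   # prints 4 8

# The same four combinations under two different interpretations.
as_numbers = dict(zip(two_circuits, [0, 1, 2, 3]))
as_colors = dict(zip(two_circuits, ["yellow", "red", "blue", "green"]))
print(as_numbers[(1, 1)], as_colors[(1, 1)])    # prints 3 green
```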
The Number Systems
The system of 1s and 0s used by computer hardware internally to represent numbers requires a new way for us humans to think about numbers. Suddenly, the symbols 1, 0 (10) inside the computer at the machine level are not equal to the number of fingers we have on our two hands; they merely represent the number of legs on a human, if interpreted as in the previous example. We were taught the fundamentals of arithmetic at an early age and most of us do not question the meaning of 10 anymore. Our ingrained way of thinking about 10 assumes two things:
- The presence of the 10 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. Our number system is for that reason called base 10 or the decimal number system. 10 (as in base 10) can also be found by adding one to the highest digit (9).
- Each position in which a digit is written has a specific positional value. For example, in the decimal number 853, we say that 3 is written in the ones position, 5 is written in the tens position, and 8 is written in the hundreds position. Accordingly, we read it as eight hundred fifty-three.
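The idea of positional value can be illustrated with a short Python sketch that pairs each digit of 853 with the value of the position it is written in. The function name `positional_terms` is a hypothetical helper introduced only for this example.

```python
def positional_terms(number_text, base=10):
    """Pair each digit with the value of the position it is written in."""
    highest_power = len(number_text) - 1
    return [(int(digit), base ** (highest_power - index))
            for index, digit in enumerate(number_text)]

terms = positional_terms("853")
print(terms)                                        # prints [(8, 100), (5, 10), (3, 1)]
print(sum(digit * value for digit, value in terms)) # prints 853
```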
The number system used by the computer is called the binary number system or base 2 because only two digits (0 and 1) are used to represent various numbers. Appendix D, "Number Systems," provides a detailed discussion of the binary number system.
Consider the binary number 1010100001. By applying the positional values as described in Appendix D, we can convert this number to base 10:
1010100001 (base 2) = 1 x 2^9 + 1 x 2^7 + 1 x 2^5 + 1 x 2^0 = 512 + 128 + 32 + 1 = 673 (base 10)
1010100001 has a relatively large number of digits compared to its base-10 counterpart, 673.
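The conversion can be checked with a small Python sketch that adds up the positional value (a power of 2) of every digit that is 1. The variable names are illustrative; `int(text, 2)` is Python's built-in way of doing the same conversion.

```python
binary_text = "1010100001"

# Walk the digits from right to left; position 0 is the rightmost digit.
decimal_value = sum(2 ** power
                    for power, digit in enumerate(reversed(binary_text))
                    if digit == "1")

print(decimal_value)                                # prints 673
print(int(binary_text, 2))                          # prints 673
print(len(binary_text), len(str(decimal_value)))    # prints 10 3
```

Ten binary digits are needed for a value that takes only three decimal digits.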
In general, binary numbers tend to be considerably longer than their corresponding decimal numbers. Consequently, programmers manipulating numbers at the machine level find it very cumbersome to work with base 2 numbers. Luckily, it is possible to abbreviate binary numbers in a convenient manner using two other number systems called the octal number system (base 8) and the hexadecimal number system (base 16).
For a more detailed discussion of the octal and hexadecimal number systems and how to abbreviate binary numbers, please see Appendix D, "Number Systems."
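As a brief preview of the abbreviation idea, the Python sketch below groups the bits of 1010100001 in threes (one octal digit per group) and in fours (one hexadecimal digit per group). The helper `group_bits` is an assumption made for this example; Appendix D remains the reference for the method itself.

```python
def group_bits(binary_text, group_size):
    """Split a binary string into groups, padding with zeros on the left."""
    group_count = -(-len(binary_text) // group_size)    # ceiling division
    padded = binary_text.zfill(group_count * group_size)
    return [padded[start:start + group_size]
            for start in range(0, len(padded), group_size)]

binary_text = "1010100001"
octal_groups = group_bits(binary_text, 3)
hex_groups = group_bits(binary_text, 4)

# Each group becomes exactly one digit in the shorter notation.
octal_text = "".join(format(int(group, 2), "o") for group in octal_groups)
hex_text = "".join(format(int(group, 2), "X") for group in hex_groups)

print(octal_groups, octal_text)   # prints ['001', '010', '100', '001'] 1241
print(hex_groups, hex_text)       # prints ['0010', '1010', '0001'] 2A1
```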
Bytes
To keep track of the data stored in memory, the computer needs to know what is stored where. This is facilitated in main memory through a very long list of numbered locations called bytes. Each byte contains a list of eight bits so, instead of the 2 and 3 bits shown in Figures 1.2 and 1.3, we now are dealing with the equivalent of eight circuits. How many different numbers can then be represented by 1 byte? This is one of the questions at the end of the chapter.
One byte can store only a limited amount of data, such as a relatively small number or a character from a restricted set. If larger numbers or texts with many characters need to be stored, the computer provides the necessary memory space by grouping together several adjacent bytes.
An address is attached to each byte and used by the computer to locate a particular byte when its data needs to be retrieved. In the case of several adjacent bytes holding one piece of larger data (such as a large number or a string of text) as mentioned previously, these bytes are considered to form a single memory location, represented by the address of the first byte in the group. Figure 1.4 illustrates how bytes can be positioned in the main memory of a typical computer.
Figure 1.4. Bytes and their locations.
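The grouping of adjacent bytes can be sketched in Python with a `bytearray` standing in for main memory. Everything here is an assumption made for illustration: the 16-byte memory, the address 4, the helper names `store` and `load`, and the byte order (least significant byte first), which real machines may or may not share.

```python
# A hypothetical main memory of 16 bytes, addressed 0 through 15.
memory = bytearray(16)

def store(address, value, size):
    """Spread `value` over `size` adjacent bytes starting at `address`."""
    memory[address:address + size] = value.to_bytes(size, "little")

def load(address, size):
    """Reassemble the number held in `size` bytes starting at `address`."""
    return int.from_bytes(memory[address:address + size], "little")

store(4, 6574635, 4)        # too large for one byte, so four are used
print(list(memory[4:8]))    # prints [43, 82, 100, 0]
print(load(4, 4))           # prints 6574635
```

The whole four-byte group is referred to by the address of its first byte, 4 in this sketch.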
The memory locations and their boundaries are determined by the software running on the computer, so they are not directly influenced by the hardware. In Chapter 2 and Chapter 3, "A Guided Tour Through C#: Part I," I will introduce the idea of a named variable as a convenient means of referring to specific locations in memory. This enables us to abstract away from the arcane memory addresses and allows us to give meaningful names to the data we are working with in our programs. For example, to represent a population size inside a program, we can simply call this size populationSize at our convenience instead of having to write something similar to "4 bytes at location 4027."
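A named variable can be thought of as an entry in a lookup table that maps a name to an address and a size. The Python sketch below is only a simplified model of that idea, not a description of how C# implements variables; the table `symbols`, the memory size, the byte order, and the value 250000 are all hypothetical, while the name populationSize and "4 bytes at location 4027" come from the text.

```python
# Hypothetical main memory, large enough to contain address 4027.
memory = bytearray(8192)

# Name -> (address of first byte, number of bytes).
symbols = {"populationSize": (4027, 4)}

def write_variable(name, value):
    """Store `value` at the location the name refers to."""
    address, size = symbols[name]
    memory[address:address + size] = value.to_bytes(size, "little")

def read_variable(name):
    """Fetch the value from the location the name refers to."""
    address, size = symbols[name]
    return int.from_bytes(memory[address:address + size], "little")

write_variable("populationSize", 250000)
print(read_variable("populationSize"))   # prints 250000
```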
Files
Equipped with the knowledge of bits and bytes, let us return to the permanent auxiliary memory for a moment. This memory stores large collections of bits called files on the various storage devices mentioned earlier (floppy disks, hard disks, and so on). Each file usually has an arbitrary name (for example, MyDocument) and an extension that indicates the type of file we are dealing with (such as .doc, indicating a document written in the Microsoft Word processor). So, a fully qualified name could look like MyDocument.doc.
Files can contain almost any sort of data. Examples could be an image, a sound file, a computer program, a letter, or just a list of numbers. Various conventions are used to encode and interpret the contents of a file, similar to the interpretations used for bits and bytes in main memory discussed earlier. Conventions you might have encountered are MPEG (extension .mpg) for storing multimedia files and JPEG (extension .jpg) for storing images.
A specific type of file you will encounter frequently in this book is a text file with the extension .cs (for C sharp). These files contain C# source code. A source code file could, for example, be called MyProgram.cs. Files are frequently arranged into groups of files called folders or directories. These help the user organize files into coherent groups.
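The split between a file's name and its extension can be illustrated with Python's standard `pathlib` module. The lookup table `conventions` and the function `describe` are hypothetical and cover only the extensions mentioned in this section.

```python
from pathlib import PurePath

# Extensions mentioned in the text and the kind of content they indicate.
conventions = {
    ".doc": "Microsoft Word document",
    ".mpg": "MPEG multimedia file",
    ".jpg": "JPEG image",
    ".cs": "C# source code",
}

def describe(file_name):
    """Return the name, the extension, and what the extension indicates."""
    path = PurePath(file_name)
    return path.stem, path.suffix, conventions.get(path.suffix, "unknown")

print(describe("MyDocument.doc"))   # prints ('MyDocument', '.doc', 'Microsoft Word document')
print(describe("MyProgram.cs"))     # prints ('MyProgram', '.cs', 'C# source code')
```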