Win32 API Programming with Visual Basic

  5. Signed and Unsigned Data Types  
   
  In this chapter, we will take a close look at signed and unsigned data types. The topic is one of special importance in view of the prevalence of unsigned data types in the Win32 API, their general absence in Visual Basic, and the frequent difficulty of translating an unsigned value into its corresponding signed value.  
 
  Signed and Unsigned Representations  
   
  We have seen that VC++ and the Win32 API use both signed and unsigned integral data types, whereas VB has only one unsigned type: the Byte data type. This can create a problem when an API function either expects or returns an unsigned data type. To understand just what is involved, we need to take a look at the internal workings of these data types and how they are represented in memory.
   
  The place to start is with some carefully defined terminology. We will couch our examples in terms of 16-bit words, but what we say applies equally to words of any length.  
   
  A 16-bit word is simply a string of 16 binary bits, as in:  
 
  w = 1111000011110000  
   
  The key point is that a binary word is not a number until and unless it is given an interpretation as a number. It is just a string of bits.  
   
  There are two common ways in which integers are represented as 16-bit words in most computers, including PCs: the unsigned representation is used to represent only nonnegative integers, and the two's complement signed representation is used to represent both negative and nonnegative integers. The latter is generally abbreviated in Microsoft's documentation as the signed representation, and we may do so as well, although there are other types of signed representations (including the one's complement signed representation and the signed-magnitude representation).
   
  It is important to understand that it is not the number itself that is signed or unsigned; it is the representation of the number as a binary word that is signed or unsigned. Numbers are neither signed nor unsigned. (A number can be positive, negative, or zero, but that is not the same thing as signed or unsigned: all numbers have a sign.) Thus, the commonly used term signed integer is highly misleading. It should be read as signed representation of the integer. Nevertheless, this terminology is so common and so convenient that we will use it as well.
   
  When we declare an integer variable in Visual Basic and give it a value, as in:  
 
  Dim i As Integer
  i = 5

 
   
  VB represents the integer using the two's complement signed representation. In this, we have no choice. Put another way, when VB interprets a binary word as an integer, it does so using the two's complement binary representation of that word. Period. On the other hand, VC++ is more flexible, allowing us to choose the representation, as in:
 
  unsigned int ui;
  ui = 65000;

  int i;      // or signed int i;
  i = -30000;

 
   
  Here ui is an unsigned integer, that is, ui is represented in memory by VC++ as a (32-bit) binary word using the unsigned representation. On the other hand, i is a signed integer, that is, i is represented in memory using the two's complement signed representation.  
 
  Why Two Different Representations?  
   
  The reason for using an unsigned representation for integers is simple: by using an unsigned representation, we can represent larger positive integers than when using a signed representation. In exchange, we give up the ability to represent negative numbers.  
   
  In particular, a 16-bit word that uses the two's complement signed representation can represent any integer in the range -32768 to 32767, whereas a 16-bit word that uses the unsigned representation can represent integers in the range 0 to 65535.  
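
  These ranges can be confirmed in C++, which exposes both representations. The following is a minimal sketch, assuming the fixed-width types std::int16_t and std::uint16_t from <cstdint> are available; the constant names are illustrative only and are not part of any API.

```cpp
#include <cstdint>
#include <limits>

// Illustrative constants (hypothetical names): the extreme values that a
// 16-bit word can represent under each interpretation.
constexpr long kSigned16Min   = std::numeric_limits<std::int16_t>::min();   // -2^15
constexpr long kSigned16Max   = std::numeric_limits<std::int16_t>::max();   //  2^15 - 1
constexpr long kUnsigned16Min = std::numeric_limits<std::uint16_t>::min();  //  0
constexpr long kUnsigned16Max = std::numeric_limits<std::uint16_t>::max();  //  2^16 - 1

// Both interpretations cover exactly 2^16 = 65536 distinct values; they
// simply place that window at different positions on the number line.
static_assert(kSigned16Max - kSigned16Min + 1 == 65536, "signed range size");
static_assert(kUnsigned16Max - kUnsigned16Min + 1 == 65536, "unsigned range size");
```

  Both checks pass at compile time: neither representation holds more values than the other; the unsigned one just spends all of them on nonnegative integers.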
   
  There are compelling reasons to include both signed and unsigned representations in a programming language. It is probably not necessary to comment on the fact that a language would be hampered significantly if it could not represent negative numbers. On the other hand, an unsigned representation is useful for two reasons.  
   
  When we want to do arithmetic with positive numbers only, we get a larger range of values using an unsigned representation. This happens with addresses, for instance. If the word size of a computer is, say, 32 bits, then the most natural way to access all 2^32 possible memory addresses is by using an unsigned representation.
   
  Many numeric data types do not require the use of arithmetic. For instance, window handles are 32-bit numbers, but it makes no sense to add, subtract, or otherwise manipulate these numbers. They are strictly for identification purposes. Thus, the HANDLE data type is an unsigned long data type.  
   
  It is time that we consider how the two's complement signed and unsigned interpretations actually work.  
 
  Unsigned Representation  
   
  When a 16-bit word is interpreted as unsigned, we simply count up from 0 in binary, as in the following list (the double arrow stands for "represents"):

  0000 0000 0000 0000  <->  0
  0000 0000 0000 0001  <->  1
  0000 0000 0000 0010  <->  2
  .
  .
  .
  0111 1111 1111 1111  <->  2^15 - 1 = 32767
  1000 0000 0000 0000  <->  2^15 = 32768
  .
  .
  .
  1111 1111 1111 1111  <->  2^16 - 1 = 65535

 
   
  Thus, the unsigned interpretation of 16-bit words allows us to represent all integers in the range 0 to 2^16 - 1 = 65535.
   
  As you probably know, each position in a binary number represents a power of 2, just as each position in a decimal number represents a power of 10. Table 5-1 is a template for creating unsigned representations of numbers. Each column represents a successive power of 2.  
  Table 5-1. A Template for Unsigned Representations
  2^15   2^14   2^13   2^12   2^11   2^10   2^9    2^8    2^7    2^6    2^5    2^4    2^3    2^2    2^1    2^0
  32768  16384  8192   4096   2048   1024   512    256    128    64     32     16     8      4      2      1

   
  Table 5-2 shows an example of filling in this template to find the unsigned representation of the integer 50000. By successively subtracting powers of 2, starting with the largest one that fits, we arrive at:  
 
  50000 = 32768 + 16384 + 512 + 256 + 64 + 16  
   
  Next, we place these numbers in the third row of Table 5-2 and then put 1s underneath the numbers and 0s everywhere else. This gives a fourth row in Table 5-2, from which we get the unsigned representation:  
 
  1100 0011 0101 0000  <->  (unsigned) 50000

  Table 5-2. An Unsigned Example: Representing 50000
  2^15   2^14   2^13   2^12   2^11   2^10   2^9    2^8    2^7    2^6    2^5    2^4    2^3    2^2    2^1    2^0
  32768  16384  8192   4096   2048   1024   512    256    128    64     32     16     8      4      2      1
  32768  16384                              512    256           64            16
  1      1      0      0      0      0      1      1      0      1      0      1      0      0      0      0
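
  The successive-subtraction procedure used to fill in Table 5-2 can be sketched in C++ as follows. This is only an illustration: the function name unsigned_word is hypothetical, and the sketch assumes a C++14 compiler and an input in the range 0 to 65535.

```cpp
#include <cstdint>

// Build the 16-bit unsigned representation of `value` by successively
// subtracting powers of 2, largest first, as in Table 5-2.
// Assumes 0 <= value <= 65535 (hypothetical helper, not a library function).
constexpr std::uint16_t unsigned_word(std::uint32_t value) {
    std::uint16_t word = 0;
    for (int bit = 15; bit >= 0; --bit) {
        const std::uint32_t power = std::uint32_t{1} << bit;  // 2^bit
        if (value >= power) {                                 // this power "fits"
            value -= power;                                   // subtract it out
            word = static_cast<std::uint16_t>(word | power);  // record a 1 in that column
        }
    }
    return word;
}

// 1100 0011 0101 0000 is C350 in hexadecimal.
static_assert(unsigned_word(50000) == 0xC350, "matches the last row of Table 5-2");
```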

 
  Signed Representation  
   
  The strategy used in two's complement representation is to use the leftmost bit as an indicator of the sign of the number. The leftmost bit is called the sign bit. If the sign bit is a 0, the word is interpreted as a nonnegative integer. If the sign bit is 1, the number is interpreted as a negative integer:  
 
  0xxx xxxx xxxx xxxx  <->  nonnegative integer
  1xxx xxxx xxxx xxxx  <->  negative integer

 
   
  The Signed-Magnitude Representation  
   
  The most obvious way to fill in the other bits is with the magnitude (or absolute value) of the number. For example, to represent the positive number 5, we would write:  
 
  0000 0000 0000 0101  <->  5

  since the binary representation of 5 is 101. For the negative number -5, we would simply change the sign bit:

  1000 0000 0000 0101  <->  -5

  This method of representing both positive and negative numbers is called the signed-magnitude representation. It is very simple, but not very useful. One problem is that arithmetic with numbers represented in this way requires taking special cases based on the sign of the number. (Just try adding the binary representations for 5 and -5.) Also, there are two representations of the number 0 (+0 and -0):

  0000 0000 0000 0000  <->  0
  1000 0000 0000 0000  <->  0
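
  Both drawbacks of the signed-magnitude representation are easy to see in a small C++ sketch. The function and constant names here are hypothetical, chosen only for illustration.

```cpp
#include <cstdint>

// Value of a 16-bit word under the signed-magnitude interpretation:
// the low 15 bits give the magnitude, the leftmost bit gives the sign.
constexpr int signed_magnitude_value(std::uint16_t w) {
    return (w & 0x8000u) ? -static_cast<int>(w & 0x7FFFu)
                         :  static_cast<int>(w & 0x7FFFu);
}

constexpr std::uint16_t kPlusFive  = 0x0005;  // 0000 0000 0000 0101
constexpr std::uint16_t kMinusFive = 0x8005;  // 1000 0000 0000 0101

// Ordinary binary addition of the two words gives 1000 0000 0000 1010,
// which reads as -10 rather than the expected 0.
constexpr std::uint16_t kRawSum =
    static_cast<std::uint16_t>(kPlusFive + kMinusFive);
static_assert(signed_magnitude_value(kRawSum) == -10, "5 + (-5) comes out as -10");

// Two different words both represent zero.
static_assert(signed_magnitude_value(0x0000) == 0, "+0");
static_assert(signed_magnitude_value(0x8000) == 0, "-0");
```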

 
   
  The Two's Complement Representation  
   
  The two's complement representation is a much better approach and is used by most modern computers. It is easy to describe using a table. The analog of Table 5-1 for the two's complement signed interpretation is Table 5-3. The only difference between this table and Table 5-1 is the negative sign in the first column.  
  Table 5-3. A Template for Two's Complement Signed Representations
  -2^15  2^14   2^13   2^12   2^11   2^10   2^9    2^8    2^7    2^6    2^5    2^4    2^3    2^2    2^1    2^0
  -32768 16384  8192   4096   2048   1024   512    256    128    64     32     16     8      4      2      1

   
  To illustrate, Table 5-4 computes the signed representation of the integer -15536. Note that it is the same as the unsigned representation of 50000:  
 
  1100 0011 0101 0000  <->  (signed) -15536

  Table 5-4. A Signed Example: Representing -15536
  -2^15  2^14   2^13   2^12   2^11   2^10   2^9    2^8    2^7    2^6    2^5    2^4    2^3    2^2    2^1    2^0
  -32768 16384  8192   4096   2048   1024   512    256    128    64     32     16     8      4      2      1
  -32768 16384                              512    256           64            16
  1      1      0      0      0      0      1      1      0      1      0      1      0      0      0      0

   
  Since the only difference between Table 5-1 and Table 5-3 is that the numbers in the first column are negative, it is clear that a binary word with a sign bit of 0 represents the same integer using either the signed or the unsigned representation. Put another way, for integers in the range 0 to 32767, the two representations are identical.  
   
  Also, since the sum of all of the numbers in the second row of Table 5-3 is -1 (the positive entries add up to only 32767, which cannot outweigh the -32768 in the first column), it is clear that a number is negative if and only if its sign bit is 1.
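
  The two templates (Table 5-1 and Table 5-3) can be expressed as a single routine in which only the weight of the leftmost column differs. The following C++ sketch assumes a C++14 compiler; the names weighted_sum, unsigned_value, and signed_value are hypothetical.

```cpp
#include <cstdint>

// Sum the column weights of a 16-bit word. The low 15 columns always have
// weights 2^0 .. 2^14; `top_weight` is the weight of the leftmost column:
// +32768 for Table 5-1 (unsigned), -32768 for Table 5-3 (signed).
constexpr std::int32_t weighted_sum(std::uint16_t w, std::int32_t top_weight) {
    std::int32_t total = 0;
    for (int bit = 0; bit < 15; ++bit) {
        if (w & (1u << bit)) total += std::int32_t{1} << bit;
    }
    if (w & 0x8000u) total += top_weight;
    return total;
}

constexpr std::int32_t unsigned_value(std::uint16_t w) { return weighted_sum(w, 32768); }
constexpr std::int32_t signed_value(std::uint16_t w)   { return weighted_sum(w, -32768); }

// The word 1100 0011 0101 0000 (hex C350) from Tables 5-2 and 5-4.
static_assert(unsigned_value(0xC350) == 50000, "Table 5-2");
static_assert(signed_value(0xC350) == -15536, "Table 5-4");

// With a sign bit of 0 the two interpretations agree.
static_assert(unsigned_value(0x7FFF) == signed_value(0x7FFF), "same below 32768");
```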
   
  Following is a list that shows how signed representation works. The list is ordered by increasing binary word. Note the sudden change from positive to negative integers in the middle of the list.  
 
  0000 0000 0000 0000  <->  0
  0000 0000 0000 0001  <->  1
  0000 0000 0000 0010  <->  2
  .
  .
  .
  0111 1111 1111 1111  <->  2^15 - 1 = 32767     ' positive
  1000 0000 0000 0000  <->  -2^15 = -32768       ' negative
  1000 0000 0000 0001  <->  -2^15 + 1 = -32767
  .
  .
  .
  1111 1111 1111 1101  <->  -3
  1111 1111 1111 1110  <->  -2
  1111 1111 1111 1111  <->  -1

 
   
  Why Is It Called Two's Complement?  
   
  The reason that it is called this has to do with how we take the negative of a number that is represented in this form. Consider any number x written in two's complement form. Let us use the number in Table 5-4 (x = -15536):  
 
  x  <->  1100 0011 0101 0000

  Consider now the ordinary complement of this binary word; that is, the word obtained by changing all 0s to 1s and all 1s to 0s:

  xc  <->  0011 1100 1010 1111

  Adding the two binary numbers gives:

  x + xc  <->  1111 1111 1111 1111

 

   
  Note that this will be the result no matter what number we start with.  
   
  But the binary word consisting of all 1s is the representation of the number -1, so we have:  
 
  x + xc = -1  

 

   
  from which it follows that:  
 
  x + (xc + 1) = 0  

 

   
  or:  
 
  -x = xc + 1  

 

   
  Thus, to get the negative of a number, we take the complement of its signed representation and then add 1. The resulting binary word is called the two's complement of the original binary word. In other words, to negate a number that is represented in two's complement form, just take the two's complement of the number's binary representation.
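
  The complement-and-add-1 rule can be written directly in C++. This is a sketch under the assumption that we work with raw 16-bit words held in std::uint16_t; the function names are hypothetical.

```cpp
#include <cstdint>

// Ordinary (one's) complement: flip every bit of the 16-bit word.
constexpr std::uint16_t ones_complement(std::uint16_t w) {
    return static_cast<std::uint16_t>(~w & 0xFFFFu);
}

// Two's complement: complement, then add 1 (any carry out of bit 15 is dropped).
constexpr std::uint16_t twos_complement(std::uint16_t w) {
    return static_cast<std::uint16_t>((ones_complement(w) + 1u) & 0xFFFFu);
}

// x = 1100 0011 0101 0000 (hex C350), the representation of -15536.
static_assert(ones_complement(0xC350) == 0x3CAF, "xc = 0011 1100 1010 1111");
static_assert(0xC350 + ones_complement(0xC350) == 0xFFFF, "x + xc is all 1s");
static_assert(twos_complement(0xC350) == 0x3CB0, "hex 3CB0 is +15536");
```

  One word is its own two's complement: 1000 0000 0000 0000, because +32768 has no 16-bit signed representation.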

 
  Translating Between Signed and Unsigned Representations  
   
  Now we come to the heart of the matter: translation between signed and unsigned representations. There are two issues to consider.
   
  First, we may need to pass to an API function a number that is too large to fit in the corresponding VB signed data type. For instance, we may need to pass a 16-bit representation of a number in the upper "unsigned" range 32768 to 65535, say for example the number 50000. In VC++, we could simply write:  
 
  unsigned short usVar;
  usVar = 50000;

 
   
  but in VB, the code:  
 
  Dim iVar As Integer
  iVar = 50000

 
   
  will produce an Overflow runtime error. Note that we cannot use the code:  
 
  Dim lVar As Long
  lVar = 50000

 
   
  because the function is expecting a 16-bit binary word.  
   
  The second problem is the reverse. Suppose, for example, that an API function wants to return a 16-bit value in the range 32768 to 65535, such as 50000. Of course, the return value must be in a VB variable (since we are working in VB). But VB will interpret the variable as having a signed data type. In fact, it will interpret the number as -15536 because, as we have seen, this number has the same signed representation as the number 50000 has unsigned representation. So, the question is: "How do we recover the intended value?"  
   
  These problems are easy to solve if we look at them in the correct light. Referring to Figure 5-1, the point is that VB will give a 16-bit binary word a signed interpretation, whereas Win32 will give it an unsigned interpretation (we are assuming here that Win32 is expecting or returning an unsigned short integer).  
   
   
   
  Figure 5-1. Passing numbers between VB and Win32

 
   
  As in Figure 5-1, if w is a 16-bit binary word, let us write un(w) to denote the number obtained by thinking of w as an unsigned representation and si(w) as the number obtained by thinking of w as a signed representation. Thus, from our previous example in Table 5-2 and Table 5-4, we have:  
 
  un(1100 0011 0101 0000) = 50000

si(1100 0011 0101 0000) = -15536

 
   
  The point to keep in mind is that we are actually passing or receiving a binary word, not a number. VB and Win32 will both interpret this binary word as a number. The difficulty comes when they use different interpretations. VB uses the signed integer interpretation, and Win32 (we are assuming) uses the unsigned short integer interpretation.  
   
  Thus, referring to Figure 5-1, in passing a number un(w) to an API function, we need to tell VB to pass the number si(w), since Win32 will interpret the binary word w that is actually passed (on the stack) as un(w). Conversely, in receiving a number, VB will see it as si(w), and we need to make the translation to un(w), which, by the way, will require using a larger VB data type to hold the value.
   
  So the whole problem boils down to translating between si(w) and un(w).  
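
  The idea that only a binary word crosses the boundary can be illustrated from the C++ side. In the sketch below, the hypothetical function as_seen_by_callee stands in for an API function that receives the 16 bits of a VB Integer and reads them as an unsigned short. It relies on the fact that converting a std::int16_t to std::uint16_t keeps the bit pattern and merely reinterprets it.

```cpp
#include <cstdint>

// The value an unsigned-short parameter takes when the caller supplies the
// 16-bit word whose signed interpretation is `si` (hypothetical helper).
constexpr std::uint16_t as_seen_by_callee(std::int16_t si) {
    return static_cast<std::uint16_t>(si);  // same bits, unsigned reading
}

// VB holds -15536; the callee sees 50000.
static_assert(as_seen_by_callee(-15536) == 50000, "same word, two readings");

// With a sign bit of 0, both sides see the same number.
static_assert(as_seen_by_callee(12345) == 12345, "no translation needed");
```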
   
  We can see how to make these translations by noting that the only difference between Table 5-1 and Table 5-3 is the negative sign in the first column. Accordingly, there are two cases to consider.  
   
  The first case is when the number si(w) is nonnegative or, equivalently, the number un(w) is in the lower half of the unsigned range (0 to 32767). (Whether we are passing or receiving, we will know one of these numbers!) In this case, the sign bit of w is 0. Hence, as we have seen:  
 
  un(w) = si(w)  
   
  Thus, in this case, we can use an ordinary VB integer to pass the number, and, in the other direction, the return value in a VB integer is the actual number (no changes are necessary).  
   
  On the other hand, suppose that the number si(w) is negative or, equivalently, un(w) is in the upper range 32768 to 65535. In this case, the sign bit of w is 1. This bit, being in the first column of Table 5-4, contributes a total of 2^15 to the number un(w). On the other hand, it contributes a total of -2^15 to the number si(w). Since the contributions from all other columns are the same in both un(w) and si(w), subtracting out the contributions from the first column should produce equal values, that is:
 
  un(w) - 2^15 = si(w) - (-2^15)  
   
  From this, a little algebra gives the two formulas:  
 
  un(w) = si(w) + 2^16

si(w) = un(w) - 2^16

 
   
  These formulas are the key to all. We can now summarize.  
   
  Integers  
   
  To pass a number un(w) in the range 0 to 65535 in a VB integer variable, put the number si(w) in the variable. In the other direction, if a VB integer variable receives a number and VB shows this number to be si(w), then the number passed is actually un(w). Here is the relationship between si(w) and un(w).  
   
  For si(w) >= 0 or 0 <= un(w) <= 32767 (= 2^15 - 1):

  un(w) = si(w)

  For si(w) < 0 or 32768 <= un(w) <= 65535 (= 2^16 - 1):
 
  un(w) = si(w) + 2^16

si(w) = un(w) - 2^16

 
   
  Sometimes a picture is worth a thousand words. Figure 5-2 shows the situation here. When a number lies in the range that is common to the signed and unsigned ranges (0 to 32767), then no changes are required when sending or receiving the number. To send a number in the upper unsigned range, subtract 2^16 to bring it into the signed range before sending the number. When receiving a number in the lower signed range, add 2^16 to get the actual number sent (in the unsigned range).
   
   
   
  Figure 5-2. Translating between signed and unsigned integers
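
  The 16-bit translation rules summarized above amount to two one-line functions. The following C++ sketch uses a 32-bit type for both arguments and results so that every value in either range fits, mirroring the need for a larger VB data type on the receiving side. The names are hypothetical.

```cpp
#include <cstdint>

constexpr std::int32_t kTwo16 = 65536;  // 2^16

// si(w) from un(w): the number to place in a 16-bit signed variable so that
// a callee reading the same word as unsigned sees `un` (0..65535).
constexpr std::int32_t si_from_un16(std::int32_t un) {
    return un <= 32767 ? un : un - kTwo16;
}

// un(w) from si(w): the number actually sent when a 16-bit signed variable
// shows `si` (-32768..32767).
constexpr std::int32_t un_from_si16(std::int32_t si) {
    return si >= 0 ? si : si + kTwo16;
}

static_assert(si_from_un16(50000) == -15536, "sending 50000");
static_assert(un_from_si16(-15536) == 50000, "receiving -15536");
```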

 
   
  Longs  
   
  Of course, the same principle applies to 32-bit longs.  
   
  To pass a number un(w) in the range 0 to 2^32 - 1 in a VB long variable, put the number si(w) in the variable. In the other direction, if a VB long variable receives a number and VB shows this number to be si(w), then the number passed is actually un(w). Here is the relationship between si(w) and un(w):

  For si(w) >= 0 or 0 <= un(w) <= 2^31 - 1:

  un(w) = si(w)

  For si(w) < 0 or 2^31 <= un(w) <= 2^32 - 1:
 
  un(w) = si(w) + 2^32

si(w) = un(w) - 2^32

 
   
  Figure 5-3 illustrates the translation process.  
   
   
   
  Figure 5-3. Translating between signed and unsigned longs

 
   
  Bytes  
   
  The situation for bytes is actually the reverse of that for integers and longs, since the VB Byte type is unsigned. The problem here occurs when the API function expects or returns a signed byte. Nevertheless, the principle is exactly the same.  
   
  To pass a number si(w) in the range -128 to 127 in a VB byte variable, put the number un(w) in the variable. In the other direction, if a VB byte variable receives a number and VB shows this number to be un(w), then the number passed is actually si(w). Here is the relationship between si(w) and un(w):

  For si(w) >= 0 or 0 <= un(w) <= 127 (= 2^7 - 1):

  un(w) = si(w)

  For si(w) < 0 or 128 <= un(w) <= 255 (= 2^8 - 1):
 
  un(w) = si(w) + 2^8

si(w) = un(w) - 2^8

 
   
  Figure 5-4 illustrates the translation process.  
   
   
   
  Figure 5-4. Translating signed and unsigned bytes
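
  Since the byte, integer, and long cases differ only in the word size, the three pairs of formulas can be folded into one pair that takes the number of bits as a parameter. This is a sketch only: the names are hypothetical, and 64-bit arithmetic is assumed so that 2^32 and every 32-bit unsigned value fit.

```cpp
#include <cstdint>

// 2^bits as a 64-bit value; intended for word sizes of 8, 16, or 32 bits.
constexpr std::int64_t two_to(int bits) { return std::int64_t{1} << bits; }

// un(w) = si(w) when si(w) >= 0, otherwise si(w) + 2^bits.
constexpr std::int64_t un_from_si(std::int64_t si, int bits) {
    return si >= 0 ? si : si + two_to(bits);
}

// si(w) = un(w) when un(w) < 2^(bits-1), otherwise un(w) - 2^bits.
constexpr std::int64_t si_from_un(std::int64_t un, int bits) {
    return un < two_to(bits - 1) ? un : un - two_to(bits);
}

// Bytes: a VB Byte showing 200 was sent as the signed char -56.
static_assert(si_from_un(200, 8) == -56, "byte case");
// Longs: a VB Long showing -1 was sent as 2^32 - 1.
static_assert(un_from_si(-1, 32) == 4294967295LL, "long case");
```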

 
   
  Examples  
   
  Here are some examples:  
   
  Pass a number in the range 0 to 65535 that currently resides in a VB long variable lng to an API function with an unsigned short parameter:  
 
   APIFunction(unsigned short param)   
 
  with VB declaration:  
 
  Declare   APIFunction(param As Integer)   
 
  Solution:  
 
  Dim param As Integer ' same size as the API function's parameter
  If lng >= 0 And lng <= 32767 Then
     param = lng
  ElseIf lng >= 32768 And lng <= 65535 Then
     param = CInt(lng - 2^16)
  Else
     MsgBox "Value out of range for an unsigned short", vbCritical
  End If
  Call APIFunction(param)

 
   
  Pass a number in the range 0 to 2^32 - 1 that currently resides in a VB Currency variable cVar to an API function with an unsigned int or unsigned long parameter:
 
   APIFunction(unsigned int param)   
 
  with VB declaration:  
 
  Declare   APIFunction(param As Long)   
 
  Solution:

  Dim param As Long ' same size as the API function's parameter
  If cVar >= 0 And cVar <= 2^31 - 1 Then
     param = cVar
  ElseIf cVar >= 2^31 And cVar <= 2^32 - 1 Then
     param = CLng(cVar - 2^32)
  Else
     MsgBox "Value out of range for an unsigned int", vbCritical
  End If
  Call APIFunction(param)

 
   
  In the next example, the situation is reversed: the API function expects a signed value, but the VB Byte data type is unsigned.
   
  Pass a number in the range -128 to 127 currently residing in a VB integer variable iVar to an API function with a signed char parameter (recall that the VB Byte type is unsigned):  
 
   APIFunction(signed char param)   
 
  with VB declaration:  
 
  Declare   APIFunction(param As Byte)   
 
  Solution:  
 
  Dim param As Byte  ' same size as the API function's parameter
  If iVar >= 0 And iVar <= 127 Then
     param = iVar
  ElseIf iVar >= -128 And iVar <= -1 Then
     param = CByte(iVar + 2^8)
  End If
  Call APIFunction(param)

 
   
  Receive a number in the range 0 to 65535 in a VB integer variable iVar, from an API function that has an OUT parameter of type unsigned short:  
 
   APIFunction(unsigned short param)   
 
  with VB declaration:  
 
  Declare   APIFunction(param As Integer)   
 
  Solution:  
 
  Dim lRealValue As Long      ' to hold the real value passed to VB
  If iVar >= -32768 And iVar <= -1 Then
     lRealValue = CLng(iVar) + 2^16
  ElseIf iVar >= 0 And iVar <= 32767 Then
     lRealValue = CLng(iVar)
  End If

 
   
  Receive a number in the range 0 to 2^32 - 1 in a VB long variable lVar, from an API function that has an OUT parameter of type unsigned int:
 
   APIFunction(unsigned int param)   
 
  with VB declaration:  
 
  Declare   APIFunction(param As Long)   
 
  Solution:  
 
  Dim cRealValue As Currency ' to hold the real value passed to VB
  If lVar >= -2^31 And lVar <= -1 Then
     cRealValue = CCur(lVar) + 2^32
  ElseIf lVar >= 0 And lVar <= 2^31 - 1 Then
     cRealValue = CCur(lVar)
  End If

 
   
  Receive a number in the range -128 to 127 in a VB byte variable bVar, from an API function that has an OUT parameter of type signed char:

   APIFunction(signed char param)

  with VB declaration:

  Declare   APIFunction(param As Byte)

  Solution:

  Dim iRealValue As Integer ' to hold the real value passed to VB
  If bVar >= 128 And bVar <= 255 Then
     iRealValue = CInt(bVar) - 2^8
  ElseIf bVar >= 0 And bVar <= 127 Then
     iRealValue = CInt(bVar)
  End If

 
 
  Converting Between Word Lengths  
   
  Let us conclude our discussion of signed data types with the issue of converting between word lengths. This issue does not arise in API programming, so you may skip it if desired.  
   
  To illustrate, suppose we have a number in the signed integer range -32768 to 32767 and we want to place it in a Long variable. What does VB do to the 16-bit signed representation of the number to get a 32-bit signed representation?  
   
  If the number is positive, the answer is as expected: VB just puts 16 additional 0s on the left. For instance,

                      0000 0000 0000 0101  <->  5
  0000 0000 0000 0000 0000 0000 0000 0101  <->  5

 
   
  On the other hand, what about a negative number?  
   
  As an example, consider the negative number -32765, with signed representation:  
 
  1000 0000 0000 0011  <->  -32765

  Putting 16 0s on the left would produce a positive number, so this is not correct. Also, just setting the new sign bit to 1 does not help: the word
 
  1000 0000 0000 0000 1000 0000 0000 0011  
   
  represents:  
 
  -2^31 + 2^15 + 2 + 1 = -2147450877  
   
  which is certainly not -32765.  
   
  Suppose instead that we put 16 1s on the left, changing:  
 
  1000 0000 0000 0011  <->  -32765

  to:

  1111 1111 1111 1111 1000 0000 0000 0011  <->  x
   
  To compute the value of x, we look at the contributions of the new bits.  
   
  Since the original sign bit contributes -2^15 to the number -32765, but now contributes 2^15 to the number x, the increase in going from -32765 to x from this bit alone is:
 
  2 * (2^15) = 2^16  
   
  In addition, the new 1s in positions 16 through 30 contribute an increase of:  
 
  2^16 + 2^17 + ... + 2^30
   
  to the value of x. Finally, the 31st bit, which is the new sign bit, contributes a negative quantity -2^31. Adding up all of the changes gives the net change in going from -32765 to x:

  2*2^15 + 2^16 + 2^17 + ... + 2^30 - 2^31

  Some algebra that I guess you would prefer that I omit shows that this net increase is actually 0 (in brief, 2*2^15 = 2^16, and adding each power of 2 to itself gives the next power, so the positive terms collapse to 2^31):

  2*2^15 + 2^16 + 2^17 + ... + 2^30 - 2^31 = 0
   
  In other words, there is no change. Hence, x = -32765!  
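
  For readers who would rather check the omitted algebra than take it on faith, the sum can be evaluated term by term. The C++ sketch below assumes a C++14 compiler; net_change is a hypothetical name.

```cpp
#include <cstdint>

// Net change in value when 16 ones are prepended to a negative 16-bit word.
constexpr std::int64_t net_change() {
    std::int64_t total = 2 * (std::int64_t{1} << 15);  // old sign bit: -2^15 becomes +2^15
    for (int bit = 16; bit <= 30; ++bit) {
        total += std::int64_t{1} << bit;               // new 1s in positions 16 through 30
    }
    total -= std::int64_t{1} << 31;                    // new sign bit contributes -2^31
    return total;
}

static_assert(net_change() == 0, "prepending sixteen 1s leaves the value unchanged");
```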
   
  We have shown that adding 16 1s on the left does not change the number. Put another way, to get the 32-bit signed representation of a negative number from the number's 16-bit signed representation, we just put 16 1s on the left.  
   
  We can combine both cases (positive and negative numbers) as follows:  
  To get the 32-bit signed representation of a number from the number's 16-bit signed representation, just copy the sign bit (whether it be a 0 or a 1) to the left 16 times. This process is called sign extension.  
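
  Sign extension itself takes only a line of C++. The sketch below works on raw words; the function names are hypothetical, and signed_value_32 simply applies the formula si(w) = un(w) - 2^32 from earlier in the chapter to read the result.

```cpp
#include <cstdint>

// Sign-extend a 16-bit word to 32 bits by copying its sign bit into the
// upper 16 positions.
constexpr std::uint32_t sign_extend_16_to_32(std::uint16_t w) {
    return (w & 0x8000u) ? (0xFFFF0000u | w) : static_cast<std::uint32_t>(w);
}

// Value of a 32-bit word under the two's complement interpretation.
constexpr std::int64_t signed_value_32(std::uint32_t w) {
    return (w & 0x80000000u)
               ? static_cast<std::int64_t>(w) - (std::int64_t{1} << 32)
               : static_cast<std::int64_t>(w);
}

// 1000 0000 0000 0011 (hex 8003) represents -32765 in 16 bits...
static_assert(sign_extend_16_to_32(0x8003) == 0xFFFF8003u, "sixteen 1s on the left");
// ...and the extended word still represents -32765 in 32 bits.
static_assert(signed_value_32(sign_extend_16_to_32(0x8003)) == -32765, "value preserved");
```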
