Expression Evaluation and Type Conversion

Now that you have seen all the standard PostgreSQL data types, it's time to talk about how you can combine values of different types into complex expressions.

First, you should understand that an expression represents a value. In a well-designed language, you can use an expression anywhere you can use a value. An expression can be as simple as a single value: 3.14159 is an expression. A complex expression is created by combining two simple expressions with an operator. An operator is a symbol that represents some sort of operation to be applied to one or two operands. For example, the expression "customer_balance * 1.10 " uses the multiplication operator (*) to multiply customer_balance by 1.10. In this example, customer_balance is the left operand, * is the operator, and 1.10 is the right operand. This expression combines two different kinds of values: customer_balance is (presumably) a column in one of your tables; whereas 1.10 is a literal value (informally called a constant). You can combine column values, literal values, function results, and other expressions to build complex expressions.

Most operators (such as *, +, and <) require two operands: these are called binary operators. Other operators (such as !!, the factorial operator) work with a single value: these are called unary operators[17]. Some operators (such as -) can function as either.

[17] You may also see the terms dyadic (meaning two-valued) and monadic (meaning single-valued). These terms have the distinct advantage that you will never have to worry about accidentally saying "urinary operator" in polite company.

For some expressions, particularly those expressions that mix data types, PostgreSQL must perform implicit type conversions[18]. For example, there is no predefined operator that allows you to add an INT2 to a FLOAT8. PostgreSQL can convert the INT2 into a FLOAT8 before performing the addition, and there is an operator that can add two FLOAT8 values. Every computer language defines a set of rules[19] that govern automatic type conversion; PostgreSQL is no exception.

[18] A type conversion that is automatically provided by PostgreSQL is called a coercion. A type conversion caused explicitly by the programmer (using the CAST() or '::' operator) is called a cast.

[19] A given language might simply prohibit automatic type conversion, but most languages try to help out the programmer a bit.

PostgreSQL is rather unique in its depth of support for user-defined data types. In most RDBMSs, you can define new data types, but you are really just providing a different name for an existing data type (although you might be able to constrain the set of legal values in the new type). With PostgreSQL, you can add new data types that are not necessarily related to the existing data types. When you add a new data type to PostgreSQL, you can also define a set of operators that can operate on the new type. Each operator is implemented as an operator function; usually, but not necessarily, written in C. When you use an operator in an expression, PostgreSQL must find an operator function that it can use to evaluate the expression. The point of this short digression is that although most languages can define a static set of rules governing type conversion, the presence of user-defined data types requires a more dynamic approach. To accommodate user-defined data types, PostgreSQL consults a table named pg_operator. Each row in the pg_operator contains an operator name (such as + or #), the operand data types, and the data type of the result. For example, (in PostgreSQL version 7.1.2) there are 31 rows in pg_operator that describe the + operator: One row describes the + operator when applied to two POINT values, another row describes the + operator when applied to two INTERVAL values, and a third row describes the + operator when applied to an INT2 and an INT4.

You can see the complete list of operators using the "do " command in the psql query tool.

When searching for an operator function, PostgreSQL first searches the pg_operator table for an operator that exactly matches data types involved in the expression. For example, given the expression:

CAST( 1.2 AS DECIMAL ) + CAST( 5 AS INTEGER )

PostgreSQL searches for a function named '+ ' that takes a DECIMAL value as the left operand and an INTEGER value as right operand. If it can't find a function that meets those criteria, the next step is to determine whether it can coerce one (or both) of the values into a different data type. In our example, PostgreSQL could choose to convert either value: The DECIMAL value could be converted into an INTEGER, or the INTEGER value could be converted into a DECIMAL. Now we have two operator functions to choose from: One function can add two DECIMAL values and the other can add two INTEGER values. If PostgreSQL chooses the INTEGER + INTEGER operator function, it will have to convert the DECIMAL value into an INTEGER this will result in loss of precision (the fractional portion of the DECIMAL value will be rounded to the nearest whole number). Instead, PostgreSQL will choose the DECIMAL + DECIMAL operator, coercing the INTEGER value into a DECIMAL.

So to summarize, PostgreSQL first looks for an operator function in which the operand types exactly match the expression being evaluated. If it can't find one, PostgreSQL looks through the list of operator functions that could be applied by coercing one (or both) operands into a different type. If type coercion would result in more than one alternative, PostgreSQL tries to find the operator function that will maintain the greatest precision.

The process of selecting an operator function can get complex and is described more fully in Chapter 5 of the PostgreSQL User's Guide.

Table 2.30 lists the type conversion functions supplied with a standard PostgreSQL distribution.

Table 2.30. Explicit Type Conversion Functions

Result Type

Source Type

BOX

CIRCLE, POLYGON

DATE

TIMESTAMPTZ, DATE, TEXT

INTERVAL

INTERVAL, TEXT, TIME

LSEG

BOX

MACADDR

TEXT

NUMERIC

BIGINT, SMALLINT, INTEGER, REAL, DOUBLE PRECISION

OID

TEXT

PATH

POLYGON

POINT

PATH, LSEG, BOX, POLYGON, CIRCLE

POLYGON

PATH, CIRCLE, BOX

TEXT

INET, DOUBLE PRECISION, NAME, OID, SMALLINT, INTEGER, INTERVAL, TIMESTAMP WITH TIME ZONE, TIME WITH TIME ZONE, TIME, BIGINT, DATE, MACADDR, CHAR, REAL

TIME

TEXT, TIME, TIMESTAMP WITH TIME ZONE, INTERVAL

Категории