Language Structure
PL/pgSQL is termed a block-structured language. A block is a sequence of statements between a matched set of DECLARE/BEGIN and END statements. Blocks can be nestedmeaning that one block can entirely contain another block, which in turn can contain other blocks, and so on. For example, here is a PL/pgSQL function:
1 -- 2 -- ch07.sql 3 -- 4 5 CREATE OR REPLACE FUNCTION my_factorial(value INTEGER) RETURNS INTEGER AS $$ 6 DECLARE 7 arg INTEGER; 8 BEGIN 9 10 arg := value; 11 12 IF arg IS NULL OR arg < 0 THEN 13 RAISE NOTICE 'Invalid Number'; 14 RETURN NULL; 15 ELSE 16 IF arg = 1 THEN 17 RETURN 1; 18 ELSE 19 DECLARE 20 next_value INTEGER; 21 BEGIN 22 next_value := my_factorial(arg - 1) * arg; 23 RETURN next_value; 24 END; 25 END IF; 26 END IF; 27 END; 28 $$ LANGUAGE 'plpgsql';
The body of my_factorial() is actually the string between the opening dollar quotes (following the word AS) and the closing dollar quotes (just before the word LANGUAGE).
This function contains two blocks of code. The first block starts at line 6 and ends at line 27. The second block, which is nested inside the first, starts at line 19 and ends at line 24. The first block is called an outer block because it contains the inner block.
I'll talk about variable declarations in more detail in a moment, but I want to point out a few things here. At line 7, we declare a variable named arg. This variable has a well-defined lifetime. arg comes into existence when the function reaches the first DECLARE statement and goes out of existence as soon as the function reaches the END statement at line 27. The lifetime of a variable is also referred to as its scope. You can refer to a variable in any statement within the block that defines the scope of the variable. If you try to refer to a variable outside of its scope, you will receive a compilation error. Remember that you have two (nested) blocks in this function: the outer block and the inner block. Variables declared in an outer block can be used in inner blocks, but the reverse is not true. At line 22 (which is in the inner block), we use the arg variable, which was declared in the outer block. The variable next_value is declared within the inner block: If you try to use next_value in the outer block, you'll get an error.
This function (my_factorial()) contains two blocks, one nested within the other. You can nest blocks as deeply as you need to. You can also define blocks that are not nested. Here is the my_factorial() function again, but this time, I've included a few more blocks:
1 -- 2 -- ch07.sql 3 -- 4 5 CREATE FUNCTION my_factorial( value INTEGER ) RETURNS INTEGER AS $$ 6 DECLARE 7 arg INTEGER; 8 BEGIN 9 10 arg := value; 11 12 IF arg IS NULL OR arg < 0 THEN 13 BEGIN 14 RAISE NOTICE 'Invalid Number'; 15 RETURN NULL; 16 END; 17 ELSE 18 IF arg = 1 THEN 19 BEGIN 20 RETURN 1; 21 END; 22 ELSE 23 DECLARE 24 next_value INTEGER; 25 BEGIN 26 next_value := my_factorial(arg - 1) * arg; 27 RETURN next_value; 28 END; 29 END IF; 30 END IF; 31 END; 32 $$ LANGUAGE 'plpgsql';
This version still has an outer block (lines 6 through 31), but you have multiple inner blocks: lines 13 through 16, lines 19 through 21, and lines 23 through 28. As I said earlier, variables declared in an outer block can be used in inner blocks but the reverse is not true. If you had declared any variables in the block starting at line 19, you could not use any of those variables past the end of the block (at line 21).
Notice that you can indicate the beginning of a block with a DECLARE statement or with a BEGIN statement. If you need to declare any variables within a block, you must include a DECLARE section. If you don't need any local variables within a block, the DECLARE section is optional (an empty DECLARE section is perfectly legal).
Quoting Embedded Strings
Prior to version 8.0, including string literals in a PL/pgSQL function was difficult and error prone. Because the body of a PL/pgSQL function is itself a string, you had to double up the quote characters around any string literals within the function.
Take a close look at line 14 in the previous example:
RAISE NOTICE 'Invalid Number';
Notice that the string literal Invalid Number is surrounded by a set of single quotes. You can write an embedded string value that way because the body of the function is defined in a string delimited by PostgreSQL's new dollar-quoting mechanism. If you don't use dollar-quoting to define function body, you must double up the quotes, like this:
RAISE NOTICE ''Invalid Number'';
If you're using a version of PostgreSQL older than 8.0, you can't use dollar-quoting and you'll have to write embedded string literals in one of the other forms described in Chapter 2, "Working with Data in PostgreSQL." You could have written the embedded string in any of the three following forms:
RAISE NOTICE ''Invalid Number''; RAISE NOTICE 'Invalid Number'; RAISE NOTICE 47Invalid Number 47;
CREATE FUNCTION
Now, let's go back and look at the components of a function in more detail.
You define a new PL/pgSQL function using the CREATE FUNCTION command. The CREATE FUNCTION command comes in two forms. The first form is used for language interpreters that are embedded into the PostgreSQL serverPL/pgSQL functions fall into this category:
CREATE [OR REPLACE] FUNCTION name ( [[argname] argtype [, ...] ] ) RETURNS return_type AS $$definition$$ LANGUAGE langname [ WITH ( attribute [, ...] ) ]
The second form is used to define functions that are defined in an external language and compiled into a dynamically loaded object module:
CREATE [OR REPLACE] FUNCTION name ( [[argname] argtype [, ...] ] ) RETURNS return_type AS $$obj_file$$, $$link_symbol$$ LANGUAGE langname [ WITH ( attribute [, ...] ) ]
I covered compiled functions in more detail in Chapter 6, "Extending PostgreSQL." For this chapter, I'll focus on the first form. Don't forget, if you're using a version of PostgreSQL older than 8.0, you can't use $$ to delimit string values and you'll have to carefully quote embedded strings as described earlier in this chapter.
Each function has a name. However, the name alone is not enough to uniquely identify a PostgreSQL function. Instead, the function name and the data types of each argument (if any) are combined into a signature. A function's signature uniquely identifies the function within a database. This means that you can define many my_factorial() functions:
CREATE FUNCTION my_factorial( INTEGER )... CREATE FUNCTION my_factorial( REAL )... CREATE FUNCTION my_factorial( NUMERIC )...
Each of these functions is uniquely identified by its signature. When you call one of these functions, you provide the function name and an argument; PostgreSQL determines which function to use by comparing the data type of the arguments that you provide with the function signatures. If an exact match is found, PostgreSQL uses that function. If PostgreSQL can't find an exact match, it tries to find the closest match.
When you create a new function, you specify a list of arguments required by that function. In most programming languages, you would declare a name and a type for each function argument. In PL/pgSQL, you declare only the data type. The first argument is automatically named "$1", the second argument is named "$2", and so forth, up to a maximum of 32 arguments (if you're using a version of PostgreSQL older than 8.0, you're limited to 16 arguments per function). Starting with PostgreSQL version 8.0, you can include argument names in the CREATE FUNCTION command. That means that you can define the my_factorial() function like this:
CREATE FUNCTION my_factorial( inputArgument INTEGER )... CREATE FUNCTION my_factorial( inputArgument REAL )... CREATE FUNCTION my_factorial( inputArgument NUMERIC )...
Inside of my_factorial(), you can refer to the first argument as $1 or as inputArgument. If you include argument names in the CREATE FUNCTION command, the names are not considered to be part of the function signature. If you define a function such as
CREATE FUNCTION my_factorial( inputArgument INTEGER )...
the function's signature is my_factorial( INTEGER )you can DROP the function without specifying argument names (in fact, if you do specify argument names in a DROP FUNCTION command, the names are ignored).
You can use predefined data types, user-defined data types, and arrays of those types in a PL/pgSQL function.
It is important to remember that PL/pgSQL does not support default parameters. If you define a function that requires three parameters, you cannot call that function with fewer (or more) parameters. If you find that you need a function with a variable argument list, you can usually overload your function to obtain the same effect. When you overload a function, you define two (or more) functions with the same name but different argument lists. For example, let's define a function to compute the due date for a tape rental:
1 -- 2 -- ch07.sql 3 -- 4 5 CREATE FUNCTION compute_due_date(DATE) RETURNS DATE AS $$ 6 DECLARE 7 8 due_date DATE; 9 rental_period INTERVAL := '7 days'; 10 11 BEGIN 12 13 due_date := $1 + rental_period; 14 15 RETURN due_date; 16 17 END; 18 $$ LANGUAGE 'plpgsql';
This function takes a single parameter, a DATE value, and returns the date one week later. You might want a second version of this function that expects the rental date and a rental period:
20 -- ch07.sql 21 -- 22 CREATE FUNCTION compute_due_date(DATE, INTERVAL) RETURNS DATE AS $$ 23 BEGIN 24 25 RETURN( $1 + $2 ); 26 27 END; 28 $$ LANGUAGE 'plpgsql';
Now you have two functions named compute_due_date(). One function expects a DATE value, and the other expects a DATE value and an INTERVAL value. The first function compute_due_date(DATE), provides the equivalent of a default parameter. If you call compute_due_date() with a single argument, the rental_period defaults to seven days.
I'd like to point out two things about the compute_due_date(DATE, INTERVAL) function.
First, a stylistic issuethe RETURN statement takes a single argument, the value to be returned to the caller. You can RETURN any expression that evaluates to the return_type of the function (we'll talk more about a function's return_type in a moment). I find it easier to read a RETURN statement if the expression is enclosed in parentheses (see line 25).
Second, you'll notice that I did not DECLARE any local variables. You can treat parameter variables just like any other variableI used them in an expression in line 25. It's a rare occasion when you should settle for the automatic variable names supplied for function parameters. The name "$1" doesn't convey much meaning beyond telling you that this variable happens to be the first parameter. You should really provide a meaningful name for each parameter; this gives the reader some idea of what you intended to do with each parameter.
If you're using an older version of PostgreSQL (or you're writing code that must work on an older version), you can use the ALIAS statement to give a second, more meaningful name to a parameter. Here is the compute_due_date(DATE, INTERVAL) function again, but this time I have given alternate names to the parameters:
20 -- ch07.sql 21 -- 22 CREATE FUNCTION compute_due_date(DATE, INTERVAL) RETURNS DATE AS ' 23 DECLARE 24 rental_date ALIAS FOR $1; 25 rental_period ALIAS FOR $2; 26 BEGIN 27 28 RETURN( rental_date + rental_period ); 29 30 END; 31 ' LANGUAGE 'plpgsql';
ALIAS gives you an alternate name for a parameter: you can still refer to an aliased parameter using the $n form, but I don't recommend it. Why bother to give a meaningful name to a parameter and then ignore it?
Starting with PostgreSQL version 8.0, you can skip the ALIAS commands and simply name the arguments in the CREATE FUNCTION command, like this:
CREATE FUNCTION compute_due_date(rental_date DATE, rental_period INTERVAL) RETURNS DATE AS ...
When you create a function, you must declare the data type of the return value. Our compute_due_date() functions return a value of type DATE. A value is returned from a function using the RETURN expression statement. Keep in mind that PL/pgSQL will try to convert the returned expression into the type that you specified when you created the function. If you tried to RETURN( ''Bad Value'' ) from the compute_due_date() function, you would get an error (Bad Date External Representation). We'll see a special data type a little later (trIGGER, or in versions older than 8.0, OPAQUE) that can be used only for trigger functions.
If you're writing a PL/pgSQL function that you want to run in a version of PostgreSQL older than 7.3, you must ensure that the function returns a value, even it if it only returns NULL. Starting with version 7.3, you can define functions that return type void. A function that returns type void doesn't actually return a valueyou would call such a function for the side effects provided by the function.
I'll skip over the function body[1] for the moment and look at the final component[2] required to define a new function. PostgreSQL functions can be written in a variety of languages. When you create a new function, the last component that you specify is the name of the language in which the body of the function is written. All the functions that you will see in this chapter are written in PL/pgSQL, which PostgreSQL knows as LANGUAGE 'plpgsql'.
[1] The function body is everything between the AS keyword and the LANGUAGE keyword. The function body is specified in the form of a string.
[2] When you create a function, you can also specify a set of optional attributes that apply to that function. These attributes tell PostgreSQL about the behavior of the function so that the query optimizer can know whether it can take certain shortcuts when evaluating the function. See the CREATE FUNCTION section in the PostgreSQL Programmer's Guide for more information.
DROP FUNCTION
Before you experiment much more with PL/pgSQL functions, it might be useful for you to know how to replace the definition of a function.
If you are using PostgreSQL 7.2 or later, you can use the CREATE OR REPLACE FUNCTION ... syntax. If a function with the same signature already exists, PostgreSQL will silently replace the old version of the function; otherwise, a new function is created.
If you are using a version of PostgreSQL older than 7.2, you will have to DROP the old function before you can create a new one. The syntax for the DROP FUNCTION command is
DROP FUNCTION name( [[argname] argtype [, ...] ] );
Notice that you have to provide the complete signature when you drop a function; otherwise, PostgreSQL would not know which version of the function to remove.
Of course, you can use the DROP FUNCTION command to simply remove a functionyou don't have to replace it with a new version.