Coding Style and Conventions
Software developers are a very privileged bunch. We don't have to work in dangerous environments, and our jobs aren't physically taxing (though carpal tunnel syndrome is always a threat). We are paid to think about things, and then to write down our thoughts in the form of code. This code is then used and maintained by others, sometimes for decades. Now just think of your code as a form of poetry and rejoice in your fortunate circumstances!
Given this situation, we all have a responsibility to write code that can be easily understood and maintained (and, c'mon, let's admit our secret desires, admired) by developers who follow in our footsteps.
|
STYL-01: Adopt a consistent, readable format that is easy to maintain |
Your code should have a "signature," a style that is consistent (all your programs look the same), readable (anyone can pick up your code and make sense of it), and maintainable (a minor change in the code shouldn't require 15 minutes of reformatting).
Ideally, everyone in your organization would adopt a similar style, so that everyone can easily understand everyone else's code. This can be tricky, as programmers sometimes take a dogmatic approach to such issues as size of indentation and use of whitespace. However, research and experience confirm that the benefit of adopting a similar style is not so much that any one standard confers a significant benefit over another, but rather that the use of a consistent standard throughout an organization improves efficiency and reduces maintenance costs.
STYL-02: Adopt logical, consistent naming conventions for modules and data structures |
Adopt and promote standard ways to define names of program elements. Choose a level of "formality" of naming conventions based on your needs. If, for example, you have a team of two developers working on a small code base, you can probably get away with naming conventions that don't go far beyond "use meaningful names." If you are building a massive application involving dozens of developers, you probably need to define more comprehensive rules.
Here are some general recommendations for conventions:
- Identify the scope of a variable in its name: A global variable can be prefaced with v_, for example.
- Use a prefix or suffix to identify the types of structures being defined: Consider, for example, declarations of cursors. A standard approach to declaring such a structure is _csr. Cursors are quite different from variables; you should be able to identify the difference with a glance.
- Use a readable format for your names: Since the stored program language isn't case sensitive, the "camel notation" (as in minBalanceRequired), for example, is probably not a good choice for constructing names. Instead, use separators such as _ (underscore) to improve readability (as in min_balance_required). While MySQL allows names to be extremely long (compared with other databases and/or languages), keep them short, as well as readable.
- Consider portability: If you ever want to port your code to an alternate RDBMS (perish the thought!) you should consider adopting a naming convention that will work across RDBMS types. You can find a summary of the conventions for the "other" databases at http://www.dbazine.com/db2/db2-disarticles/gulutzan5.
It isn't possible to provide a comprehensive list of naming conventions in this book. The particular conventions you choose, furthermore, aren't nearly as important as the fact that you set some standard for naming conventions.
STYL-03: Self-document using block and loop labels |
While block and loop labels are often necessary to allow for variable scoping or as targets for LEAVE or ITERATE statements, they can also be a big help in improving the readability of code.
Use a label directly in front of loops and nested blocks:
- To name that portion of code and thereby self-document what it's doing
- So that you can repeat that name with the END statement of that block or loop
- To provide a target for a LEAVE or ITERATE statement
This recommendation is especially important when you have multiple nestings of loops (and possibly inconsistent indentation), as in the following:
WHILE condition DO some code WHILE condition DO some code END WHILE; some code END WHILE;
Example
In this example we use labels for a block and two nested loops, and then apply them in the appropriate END statements. We can now easily see which loop and block are ending, no matter how badly the code is indented!
CREATE PROCEDURE display_book_usage( ) READS SQL DATA BEGIN DECLARE v_month INT; DECLARE v_x INT; DECLARE yearly_analysis_csr CURSOR FOR SELECT ...; DECLARE monthly_analysis_csr CURSOR FOR SELECT ...; OPEN yearly_analysis_csr; yearly_analysis: LOOP FETCH yearly_analysis_csr INTO v_month; OPEN monthly_analysis_csr; monthly_analysis: LOOP FETCH monthly_analysis_csr INTO v_x; ... Lots of monthly analysis code ... END LOOP monthly_analysis; ...Lots of yearly analysis code END LOOP yearly_analysis;
Benefits
If you use labels, it's much easier to read your code, especially if it contains loops and nested blocks that have long bodies (i.e., the loop starts on page 2 and ends on page 7, with three other loops inside that outer loopnot that we recommend this!).
STYL-04: Express complex expressions unambiguously using parentheses |
The rules of operator precedence in the MySQL stored program language follow the commonly accepted precedence of algebraic operators. The rules of precedence often make many parentheses unnecessary. When an uncommon combination of operators occurs, however, it may be helpful to add parentheses even when the precedence rules apply.
The rules of evaluation do specify left-to-right evaluation for operators that have the same precedence level. However, this is the most commonly overlooked rule of evaluation when checking expressions for correctness.
Many developers apply a consistent rule for improved readability in this area: always use parentheses around every Boolean expression, including IF, ELSEIF, and WHILE statements, as well as variable assignments, regardless of the simplicity of the expressions. So, rather than:
IF min_balance < 1000 THEN ...
you instead write:
IF ( min_balance < 1000 ) THEN ...
Example
You might not want a standard that requires you to always use parentheses, but in some situations, parentheses are all but required for readability. Consider the following expression:
5 + Y**3 MOD 10
MySQL will not be the least bit confused by this statement; it will apply its unambiguous rules and come up with an answer. Developers, however, may not have such an easy time of it. You are better off writing that same line of code as follows:
5 + ((Y ** 3) MOD 10)
Benefits
Everyone, including the author of the code, can more easily understand the logic and intent (which is crucial for maintenance) of complex expressions.
STYL-05: Use vertical code alignment to emphasize vertical relationships |
A common code formatting technique is vertical alignment. Here is an example in a SQL WHERE clause:
WHERE COM.company_id = SAL.company_id AND COM.company_type_cd = TYP.company_type_cd AND TYP.company_type_cd = CFG.company_type_cd AND COM.region_cd = REG.region_cd AND REG.status = RST.status;
You should use vertical alignment only when the elements that are lined up vertically have a relationship with each other that you want to express. In the WHERE clause shown here, however, there is no relationship between the right sides of the various expressions. The relationship is between the left and right sides of each individual expression. This is, therefore, a misuse of vertical alignment.
Example
Developers often (and justifiably) use vertical alignment with program parameter lists, as in:
CREATE PROCEDURE maximize_profits ( IN advertising_budget NUMERIC(12,2), INOUT bribery_budget NUMERIC(12,2), IN merge_and_purge_on DATE , OUT obscene_bonus NUMERIC(12,2))
Vertical alignment allows you to easily see the different parameter modes and data types.
Vertical alignment is also handy when declaring many variables, as in:
CREATE PROCEDURE genAPI( ) DETERMINISTIC BEGIN DECLARE c_table CHAR(5) DEFAULT 'TABLE'; DECLARE c_column CHAR(6) DEFAULT 'COLUMN'; DECLARE c_genpky CHAR(6) DEFAULT 'GENPKY'; DECLARE c_genpkyonly CHAR(10) DEFAULT 'GENPKYONLY'; DECLARE c_sequence CHAR(7) DEFAULT 'SEQNAME'; DECLARE c_pkygenproc CHAR(10) DEFAULT 'PKYGENPROC'; DECLARE c_pkygenfunc CHAR(10) DEFAULT 'PKYGENFUNC'; DECLARE c_usingxmn CHAR(8) DEFAULT 'USINGXMN'; DECLARE c_fromod2k CHAR(8) DEFAULT 'FROMOD2K';
In this case, we want to be able to scan the list of values to make sure they are unique. I can also easily compare lengths of strings with the CHAR declarations, avoiding nuisance truncation exceptions on initialization.
Benefits
Careful and appropriate use of vertical alignment enhances readability. Used inappropriately, however, vertical alignment actually makes it harder to see what is really going on in your code.
Challenges
Vertical alignment is a "high maintenance" format. Add a new, long variable name, and you find yourself reformatting 20 other lines of code to match.
STYL-06: Comment tersely with value-added information |
The best way to explain what your code is doing is to let that code speak for itself. You can take advantage of many self-documentation techniques, including:
- Use meaningful variable, procedure, and function names.
- Use the language construct that best reflects the code you are writing (choose the right kind of loop for your logic, label loops and BEGIN-END blocks, etc.).
Whenever you find yourself adding a comment to your code, first consider whether it is possible to modify the code itself to express your comment. Good reasons to add comments include:
- Program headers, explanations of workarounds, patches, operating-system dependencies, and other "exceptional" circumstances
- Complex or opaque logic
Example
Let's follow a trail of unnecessarily commented code to self-documenting code. We start with:
-- If the first properties element is N... IF properties1 = 'N'
Yikes! Our line of code was incomprehensible and our comment simply repeated the code using the English language, rather than the stored program language. No added value, no real assistance, yet not at all uncommon. The least we can do is use the comment to "translate" from computer-talk to business requirement:
-- If the customer is not eligible for a discount... IF properties1 = 'N'
That's better, but we have created a redundancy: if our requirement ever changes, We have to change the comment and the code. Why not change the names of our variables and literals so that the code explains itself?
IF customer_discount_flag = const_ineligible
Much better! Now we no longer need a comment. Our remaining concern with this line of code is that it "exposes" a business rule; it shows how (at this moment in time) we determine whether a customer is eligible for a discount. Business rules are notorious for changing over timeand for being referenced in multiple places throughout our application. So our best bet is to hide the rule behind a self-documenting function call:
IF NOT eligible_for_discount (customer_id)