Wednesday, November 4, 2009

PHP Coding Standards

I wrote this a few years ago in order to create a more standard approach to developing PHP code. Enjoy...

/**
* @author Jonathon Hibbard
*/

PHP Coding Standards

Contents
1 PHP Coding Standards
2 Benefits of Coding Standards
3 Coding Standard Used in this Document
4 Identifier Naming
5 Keep Functions to a Maximum of One Page
6 Whitespace and Indentation
7 Placement of Parenthesis and Curly Braces
8 Line Up Like Values
9 Add a Space between Variable Definitions
10 Filesystem Naming Conventions (Files/Directories)
11 Comments
12 A Full Example
13 Conclusion
14 References



PHP Coding Standards
This document provides a complete coding standard to resolve conflicts that may arise when developers share and/or debug code. It is important for every developer to understand the benefits of such standards, how to use them, and how ignoring them affects others. Each developer settles into their own methods, and forcing them to change usually backfires.

With that being said, you are encouraged rather than required to follow this standard. Coding standards do not have to be difficult to conform to. They do far more good than harm, especially in an environment where no such standards exist yet. This document's intention is to put standards to use and make for a more team-friendly development environment.


Benefits of Coding Standards
Standards benefit developers and businesses alike. They apply conventions that make it significantly easier for both current and subsequent developers to understand how the application they work on operates, and how to fix or extend existing code for new needs.

Example:
$a = $b * $c;

The above is 100% valid and works, but its true meaning is very unclear. What do the variables actually represent? How much harder will it get when we have more variables like $aa, $az, $z, or others that obfuscate this further?

A Standard-Correct Example:
$weekly_pay = $hours_worked * $pay_rate;

This describes the intent and meaning of the variables as well as the end result, and makes the line very easy to decipher for anyone else who may stumble upon it.

Coding standards are also beneficial in that, when used, they make code more transparent, ease debugging, and increase maintainability.

Quote from the Java Standards Webpage:
  "Why have code conventions?
Code conventions are important to programmers for a number of reasons:
* 80% of the lifetime cost of a piece of software goes to maintenance.
* Hardly any software is maintained for its whole life by the original author.
* Code conventions improve the readability of the software,
allowing engineers to understand new code more quickly and thoroughly."



Coding Standard Used in this Document
The coding standard used in this document is closely tied to the Kernighan and Ritchie (K&R) standard, which comes from the C (and C++) language.

There are other standards that developers use (including the ANSI C++ standard, the Java standard, the Pascal standard and hundreds of others), but this standard is much more universal in many ways. Primarily, it conforms to other adopted models outside of programming (such as the standards of MySQL and other SQL database servers) and has been widely used for well over a decade now.


Identifier Naming
Identifiers are names given to variables, functions, objects, etc.

Given this understanding, you should follow these rules when defining an Identifier:
1) Do not use a leading underscore. Leading underscores are usually reserved for compiler or internal variables of a language (PHP's own $_GET and $_POST, for example) and can mislead a developer into thinking the identifier is a SYSTEM identifier instead of a custom-built one. Example of an incorrect name: $_first_name;

2) Do not use names that match reserved words or standard system identifiers. Those words are reserved for a reason, and reusing them makes the application more difficult to understand. Example of an incorrect name: $for;

3) Give identifiers descriptive names. Example of a bad name: $fn. Example of a good name: $first_name. However, do not get too descriptive, since long names mean a lot of typing for everyone who uses them. Keep names short but to the point. This rule can be relaxed for short-lived variables such as the $i counter in a for loop.

4) Give functions descriptive names that combine a verb and a noun, separated by underscores.
Bad Example:
function first_name() {

Good Example:
function get_first_name() {

Once again, this is considered the K&R standard. The alternative is the ANSI C++ Standard, which capitalizes the first letter of each word after the first (getFirstName). Since this document uses the K&R Standard, you are required to follow the K&R naming conventions. Naming is easy to neglect in the heat of the moment: it is very tempting to call a function "doit" instead of "get_first_name".
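
The four naming rules can be illustrated with a small sketch. The function get_full_name and the sample data are made up for this illustration and are not part of any library:

```php
<?php
// Hypothetical helper: verb + noun, words separated by underscores.
function get_full_name($first_name, $last_name) {
  // Descriptive variable names, no leading underscore, no reserved words.
  $full_name = trim($first_name . ' ' . $last_name);

  return $full_name;
}

// Made-up sample data for the illustration.
$names      = array(array('John', 'Doe'), array('Jane', 'Roe'));
$full_names = array();

// A short counter name such as $i is acceptable inside a for loop.
for($i = 0; $i < count($names); $i++) {
  $full_names[] = get_full_name($names[$i][0], $names[$i][1]);
}
```

Anyone reading the loop can tell what it collects without opening the function, which is the point of the rules above.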


Keep Functions to a Maximum of One Page
This is not an absolute rule so much as a helpful practice. Functions that exceed 1 page are harder to decipher, both while writing them and while debugging them. Scrolling up and down can cause confusion as to which function you are in, where the braces line up, and how something at the beginning of the function worked. This becomes more evident when opening a file in a console editor like vi or Emacs.

Plus, if you find yourself writing more than a page for a function, it may be a very good indication that you might need to break the function down to 2 or more separate functions, that the function could instead be used as a recursive function, or simply that the method you are choosing is not the quickest route.

This does not mean that you should obfuscate your code in order to meet this requirement. If the function absolutely must exceed a page (usually the case when formatting arrays correctly), then go ahead and do it. Just keep in mind that if you can put a piece of code in its own separate function to be reused later, do it.
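
As a rough sketch of breaking work into separate functions, the pay calculation below is split into three small pieces. The function names, the 40-hour threshold and the 1.5 overtime multiplier are all assumptions chosen for the example:

```php
<?php
// Hypothetical example: each step lives in its own small, reusable function
// instead of one long function. The threshold of 40 hours is an assumption.
function get_regular_hours($hours_worked) {
  return min($hours_worked, 40);
}

function get_overtime_hours($hours_worked) {
  return max($hours_worked - 40, 0);
}

function get_weekly_pay($hours_worked, $pay_rate) {
  $regular_pay  = get_regular_hours($hours_worked) * $pay_rate;
  // The 1.5 overtime multiplier is also an assumption.
  $overtime_pay = get_overtime_hours($hours_worked) * $pay_rate * 1.5;

  return $regular_pay + $overtime_pay;
}
```

Each helper fits on a few lines and can be reused or tested on its own, for instance get_weekly_pay(45, 10) combines 40 regular hours with 5 overtime hours.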


Whitespace and Indentation
Most modern editors today allow you to define how many spaces a tab can be. They also offer the ability to decide if you would rather use the tab (\t) character when the TAB key is pressed, or if the TAB should be interpreted as X number of spaces.

The preferred method is that all TABS be handled as spaces, with 2 spaces per tab (4 is usually the system default). If you are used to the standard Windows tab of 4 spaces, this may seem a little awkward at first. Rest assured, it will not only reduce the number of columns your code takes up, but also give it a more structured and easier flow.

Finally, it should also be common practice to change how new lines are handled by your IDE. The default should be set to write new lines as the Unix \n (line feed) instead of the Windows default \r\n (carriage return followed by line feed). The reason is that \n is understood by virtually every modern editor and operating system, so no matter what OS your code is opened in, each line will appear on its own line instead of everything running together on one continuous line.
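
If a file has already been saved with the wrong settings, it can be cleaned up in code. This is only a minimal sketch; the function names normalize_line_endings and convert_tabs_to_spaces are invented here, and in practice your editor's own settings are the better place to fix this:

```php
<?php
// Convert Windows (\r\n) and classic Mac (\r) line endings to Unix (\n).
function normalize_line_endings($text) {
  // "\r\n" is listed first so a Windows line ending becomes a single "\n".
  return str_replace(array("\r\n", "\r"), "\n", $text);
}

// Replace each tab character with 2 spaces, as this standard prefers.
function convert_tabs_to_spaces($text) {
  return str_replace("\t", '  ', $text);
}

// Made-up input that uses Windows line endings and a tab for indentation.
$source  = "if(\$ready) {\r\n\treturn true;\r\n}";
$cleaned = convert_tabs_to_spaces(normalize_line_endings($source));
```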

Placement of Parenthesis and Curly Braces
This is perhaps one of the most difficult subjects, not because it is complex, but because it cuts against developers' ingrained habits. The ANSI C++ standard puts curly braces on their own separate lines after ifs, loops, function declarations, etc. The K&R standard, however, puts the opening curly brace on the same line.

Example ANSI C++ Standard
if ($this_condition == true)
{
  //do something
}
else
{
  //do something else
}

K&R 1TBS Standard Example

if($this_condition == true) {
  //do something
} else {
  //do something else
}

This document requires you to use the K&R Standard, as defined at the beginning of this text. It is very important that this method be used so that other developers do not need to reformat your code to bring it in line with this standard.
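
The same brace placement applies to functions and loops. The sketch below shows all three together; count_even_numbers is a made-up function used only to have something to put braces around:

```php
<?php
// Hypothetical function showing same-line brace placement on a function,
// a loop, and an if/else, with 2-space indentation.
function count_even_numbers($numbers) {
  $even_count = 0;

  foreach($numbers as $number) {
    if($number % 2 == 0) {
      $even_count++;
    } else {
      // Odd numbers are skipped.
      continue;
    }
  }

  return $even_count;
}
```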


Line Up Like Values
When you have a list of variable declarations, or a large array definition, it's a good idea to line these values up. It helps with readability and is key to making your code clean for other developers.

Bad Example
$first_name = 'John';
$last_name = 'Doe';
$city = 'Cincinnati';

Good Example

$first_name = 'John';
$last_name  = 'Doe';
$city       = 'Cincinnati';

As you can see, the second example is much more legible. One more example would be for an array:

Bad Example:
Array('first_name' => 'John', 'last_name' => 'Doe', 'city' => 'Cincinnati', 'address' => array('street' => 'PO Box 20', 'zip' => '45241'));

Good Example:

Array('first_name' => 'John',
      'last_name'  => 'Doe',
      'city'       => 'Cincinnati',
      'address'    => array('street' => 'PO Box 20',
                            'zip'    => '45241',
                           ),
     );



Add a Space between Variable Definitions
When defining a variable to be equal to a value, always make sure to put a space between the variable and the equal sign, and also a space between the equal sign and the value.

Bad Example:
$i=0;
function foo($bar=1){
}

Good Example:
$i = 0;
function foo($bar = 1) {
}


Filesystem Naming Conventions (Files/Directories)
Both files and directories should be named in all lower-case by default. Names should be descriptive, but also limited in length. Lower-case names matter because Linux and Unix servers use case-sensitive file systems, so List_Users.php and list_users.php are different files there. They also match what users see in the address bar: browsers convert a domain name typed in capitals to lower-case (e.g. WWW.GOOGLE.COM becomes www.google.com on Enter).

A typical project should have the following directory structure:

/admin/
/apps/
|- classes
|- includes
/css/
/js/
/utils/
/xml/


The admin directory is where backend application files that require a login (the pages used to administer the site) should go. The apps directory has 2 subdirectories: includes and classes. The includes directory holds files that are included simply to reduce repetition (such as header or footer files), whereas the classes directory holds files containing class definitions. The css directory, as the name implies, is where style sheet (css) files go. The js directory is where JavaScript files go. The utils directory is where either 3rd party applications or reusable applications (such as thumbnail generators) go. The xml directory is where xml files go.

Example of a bad filename
1.php

Example of a GOOD filename
list_users.php
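
Part of this convention can be checked automatically. The sketch below is one possible check, not an official rule set: the function name is_valid_filename, the 40-character limit, and the requirement that a name start with a letter are all assumptions. Whether a name is actually descriptive still needs a human reviewer.

```php
<?php
// Hypothetical check: lower-case letters, digits and underscores only,
// starting with a letter and ending in .php.
function is_valid_filename($filename) {
  // Assumed limit; the standard itself does not give a number.
  $max_length = 40;

  if(strlen($filename) > $max_length) {
    return false;
  }

  return preg_match('/^[a-z][a-z0-9_]*\.php\z/', $filename) === 1;
}
```

With these assumptions, list_users.php passes, while 1.php and List_Users.php are rejected.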


Comments
Comments play a major role in any development environment. They not only help an individual debug their own code, but also serve as a manual for how the code works. Commenting should follow the PHPDoc standard for commenting.

The following is an example of this:

/**
* A brief description of what you are commenting goes here.
*/


Commenting is vital with any team. Over-commenting can make your code very confusing, while under-commenting can leave many questions about what your code actually does.

You should always comment in these circumstances:

* Whenever you define a function or class.
* Whenever you define a variable that does not describe itself properly.
* Whenever you write a "hack".
* Whenever a block of code is foreign to you.
* Whenever you create a new page, which should have a description at the top indicating its purpose.
* Whenever something you do is not 100% complete and needs to be rewritten in the future.
* Whenever something you do is not 100% obvious.


The PHPDoc comment style is very intuitive and is supported by most popular IDEs.

Some of the most popular tags used with this commenting style are:

@author // Indicates the author's name
@param // Describes a parameter for a function
@copyright // The copyright information (if applicable).
@return // The return value of a function
@todo // Gives you the ability to write-up a "todo" list.
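
Applied to a single function, these tags might look like the sketch below. The function get_first_name and the shape of the $user array are assumptions made up for the example:

```php
<?php
/**
 * Returns the first name stored in a user record.
 *
 * @author John Doe
 * @param array $user Associative array with a 'first_name' key (assumed shape)
 * @return string The first name, or an empty string if it is missing
 * @todo Decide how a missing name should be reported.
 */
function get_first_name($user) {
  if(isset($user['first_name'])) {
    return $user['first_name'];
  }

  return '';
}
```

The short description comes first, followed by the tags, so an IDE can show both the summary and the parameter details when the function is used.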



A Full Example

/**
 * This class' purpose is to provide a control area for handling database connections,
 * errors and results.
 *
 * @author John Doe
 * @copyright ACME
 * @todo This class requires a new method of error control.
 */
class db_handle {
  /**
   * This is the database handler variable.
   */
  public $dbh;

  /**
   * This is the constructor of the class.
   * @param string $strDSN // String containing connection information
   * @return object // Returns the db_handle object.
   */
  public function __construct($strDSN) {
    // Do Something.
  }
}



Conclusion
This is the end of the current Coding Standards document. Other concepts, such as private/public variables, unit naming, and more advanced programming techniques, could also be covered, but they are left out because this document is about coding standards, not about how to program.


References
http://en.wikipedia.org/wiki/Coding_standard - Wikipedia Definition of a Coding Standard
http://en.wikibooks.org/wiki/C%2B%2B_Programming/Code_Style - Code Standards
http://en.wikipedia.org/wiki/Indent_style - K&R Coding Style Definition
http://dn.codegear.com/article/10280/Code - Great Article relating to the K&R Coding Style
