Core Web Application Development with PHP and MySQL

The main theme of this book is writing web applications using PHP and MySQL, but we have not yet defined what exactly we mean by this. It is important to understand that when we say web application, we are talking about something very different from a simple web site that serves up files such as HTML, XML, and media.

Terminology

We will define a web application as a dynamic program that uses a web-based interface and a client-server architecture. This is not to say that web applications have to be complicated and difficult to implement (we will demonstrate some extremely simple ones in later chapters), but they definitely have a more dynamic and code-oriented nature than simple sites.

When we talk about the server, we are primarily referring to the machine or group of machines that acts as the web server and executes PHP code. It is important to note that "the server" does not have to be one machine. Any number of components that the server uses to execute and serve the application might reside on different machines, such as the application databases, web services used for credit card processing, and so on. The web servers might reside on multiple machines to help handle large demand.

As for the client, we are referring to a computer that accesses the web application via HTTP by using a web browser. As we write user interfaces for our web applications, we will resist the urge to use features specific to individual browsers, ensuring the largest possible audience for our products. As we mentioned before, some of these clients will have high-speed connections to the Internet and will be able to transfer large amounts of data, while others will be restricted to modem-based connections with a maximum bandwidth of 56K.

Basic Layout

Every web application that you write will have a different layout, flow of execution, and way of behaving. It can prove helpful to follow some general strategies for organizing code and functionality to help with concerns such as maintainability, performance, scalability, and security.

As far as the average end user is concerned, web applications and web sites have a very simple architecture, as you can see in Figure 13-3.

Figure 13-3. How end users and clients view interactions with web sites.

The user strictly sees a program on his computer talking to another computer, which is doing all sorts of things, such as consulting databases, services, and so on. As we will mention again in Chapter 14, "Implementing a User Interface," users largely do not think of browsers and the content they serve up as different things; it is all part of the Internet to them.

As authors of web application software, the initial temptation for us might be to put all of the code for the web application in one place, writing scripts that query the database for information, print the results for the user, and then do some credit card processing or other things.

While this does have the advantage of being reasonably quick to code, the drawbacks become quickly apparent:

  • The code becomes difficult to maintain when we try to change and upgrade the functionality. Instead of finding clear and well-known places for particular pieces of functionality, we have to go through various files to find what needs to be changed.

  • The possibilities for code reuse are reduced. For example, there is no clear place where user management takes place; if we wanted to add a new page to do user account maintenance, we would end up duplicating a lot of existing code.

  • If we wanted to completely change the way our databases were laid out and move to a new model or schema, we would have to touch all of our files and make large numbers of potentially destabilizing changes.

  • Our monolithic program becomes very difficult to analyze in terms of performance and scalability. If one operation in the application is taking an unusually long time, it is challenging (if not impossible) to pinpoint what part of our system is causing this problem.

  • With all of our functionality in one place, we have limited options for scaling the web application to multiple systems and splitting the various pieces of functionality into different modules.

  • With one big pile of code, it is also difficult to analyze its behavior with regard to security. Analyzing the code and identifying potential weak points or tracking down known security problems ends up becoming harder than it needs to be.

Thus, we will choose to use a multi-tiered approach for the server portion of our web application, where we split key pieces of functionality into isolatable units. We will use a common "3-tiered" approach, which you can see in Figure 13-4.

Figure 13-4. Dividing our applications logically into multiple tiers.

This architecture gives us a good balance: it modularizes our code for all of the reasons listed previously, but not so heavily that the modularization itself becomes a problem. (See the later section titled "n-Tier Architectures.")

Please note that even though these different modules or tiers are logically separate, they do not have to reside on different computers or different processes within a given computer. In fact, for a vast majority of the samples we provide, these divisions are more logical than anything else. In particular, the first two tiers reside in the same instance of the web server and PHP language engine. The power of web applications lies in the fact that they can be split apart and moved to new machines as needs change, which lets us scale our systems as necessary.

We will now discuss the individual pieces of our chosen approach.

User Interface

The layer with which the end user interacts most directly is the user-interface portion of our application, or the front end. This module acts as the main driving force behind our web application and can be implemented any way we want. We could write a client for Windows or the Apple Macintosh or come up with a number of ways to interact with the application.

However, since the purpose of this book is to demonstrate web applications, we will focus our efforts on HTML, specifically XHTML, an updated version of HTML that is fully compliant with XML and is generally cleaner and easier to parse than HTML. XML is a highly organized markup language in which tags must be followed by closing tags, and the rules for how the tags are placed are more clearly specified; in particular, no overlapping is allowed.

Thus, the following HTML code

<br> <br> <B>This is an <em>example of </B>overlapping tags</em>

is not valid in XHTML, since it is not valid XML. (See Chapter 23 for more detail.) The <br> tags need to be closed either by an accompanying </br> tag or by replacing them with the empty tag, <br/>. Similarly, the <b> and <em> tags are not allowed to overlap. To write this code in XHTML, we simply need to change it to

<br/> <br/> <b>This is an <em>example of </em></b><em>overlapping tags</em>

For those who are unfamiliar with XHTML, we will provide more details on it in Chapter 23. For now, it is worth noting that it is not very different from regular HTML, barring the exceptions we mentioned previously. Throughout this book, we will use HTML and XHTML interchangeably; when we mention the former, chances are we are writing about the latter.
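The well-formedness requirement can be checked mechanically. The following is a minimal sketch, assuming PHP's DOM extension is available; the helper name is_well_formed_fragment is hypothetical and not part of any library. It only tests whether a fragment parses as XML, which is a necessary condition for valid XHTML but not a full validation against the XHTML rules.

```php
<?php
// Hypothetical helper (not from the book): reports whether a markup
// fragment is well-formed XML, a precondition for valid XHTML.
function is_well_formed_fragment(string $fragment): bool
{
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    // A fragment may contain several top-level elements, so wrap it in
    // a single root element before handing it to the XML parser.
    $doc = new DOMDocument();
    $ok = $doc->loadXML('<root>' . $fragment . '</root>');

    // Restore the previous error-handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}

$html  = '<br> <br> <B>This is an <em>example of </B>overlapping tags</em>';
$xhtml = '<br/> <br/> <b>This is an <em>example of </em></b><em>overlapping tags</em>';

var_dump(is_well_formed_fragment($html));   // bool(false)
var_dump(is_well_formed_fragment($xhtml));  // bool(true)
```

The first fragment fails because the <br> tags are never closed and the <B> and <em> tags overlap; the second parses cleanly.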

When designing the user interface for our application, it is important to think of how we want the users to interact with it. If we have a highly functional web application with all the features we could possibly want, but it is completely counterintuitive and indecipherable to the end user, we have failed in our goal of providing a useful web application.

As we will see in Chapter 14, it is very important to plan the interface to our application in advance, have a few people review it, and even prototype it in simple HTML (without any of the logic behind it hooked up) to see how people react to it. More time spent planning at the beginning of the project translates into less time spent on painful rewrites later on.

Business Logic

As we mentioned in the "Basic Layout" section, if our user interface code were to talk to all of the backend components in our system, such as databases and web services, we would quickly end up with a mess of "spaghetti code." We would find ourselves in serious trouble if we wanted to remove one of those components and replace it with something completely different.

Abstracting Functionality

To avoid this problem, we are going to create a middle tier in our application, often referred to as the "business logic" or "biz-logic" part of the program. In this, we can create and implement an abstraction of the critical elements in our system. Any complicated logic for rules, requirements, or relationships is managed here, and our user interface code does not have to worry about it.

The middle tier is more of a logical abstraction than a separate system in our program. Given that our options for abstracting functionality into different processes or services are limited in PHP, we will implement our business logic by putting it into separate classes and separate directories, and by otherwise keeping the code separate but still operating from the same PHP scripts. However, we will be sure that the implementation maintains these abstractions: the user interface code will only talk to the business logic, and the business logic will be the code that manages the databases and auxiliary files necessary for implementation.

For example, our business logic might want to have a "user" object. As we design the application, we might come up with a User and a UserManager object. For both objects, we would define a set of operations and properties for them that our user interface code might need.

User:
{
  Properties:
    - user id
    - name
    - address, city, province/state, zip/postal code, country
    - phone number(s)
    - age
    - gender
    - account number
  Operations:
    - Verify Account is valid
    - Verify Old Enough to create Account
    - Verify Account has enough funds for Requested Transactions
    - Get Purchase History for User
    - Purchase Goods
}

UserManager:
{
  Properties:
    - number of users
  Operations:
    - Add new user
    - Delete User
    - Update User Information
    - Find User
    - List All Users
}

Given these crude specifications for our objects, we could then implement them and create a bizlogic directory and a number of .inc files that implement the various classes needed for this functionality. If our UserManager and User objects were sufficiently complex, we could create a separate userman directory for them and put other middle-tier functionality, such as payment systems, order tracking systems, or product catalogues into their own directories. Referring back to Chapter 3, "Code Organization and Reuse," we might choose a layout similar to the following:

Web Site Directory Layout:

www/
  generatepage.php
  uigeneration.inc
  images/
    homepage.png
bizlogic/
  userman/
    user.inc
    usermanager.inc
  payments/
    payment.inc
    paymentmanager.inc
  catalogues/
    item.inc
    catalogue.inc
    cataloguemanager.inc
  orders/
    order.inc
    ordermanager.inc

If we ever completely changed how we wanted to store the information in the database or how we implemented our various routines (perhaps our "Find User" method had poor performance), our user interface code would not need to change. However, we should try not to let the database implementation and middle tier diverge too much. If our middle tier is spending large amounts of time creating its object model on top of a database that is now completely different, we are unlikely to have an efficient application. In this case, we might need to think about whether we want to change the object model exposed by the business logic (and therefore also modify the front end) to match this, or think more about making the back end match its usage better.
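To make the separation concrete, here is a rough sketch of how a small part of the User and UserManager specification might look in PHP. Everything beyond those two class names is an assumption made for illustration: the UserStore interface, the InMemoryUserStore stand-in for a database, the method names, and the minimum age of 18 are all hypothetical. The point is only that the user interface code talks to UserManager, while the storage behind it can be replaced.

```php
<?php
// All names other than User and UserManager are illustrative
// assumptions; the book specifies these objects only informally.

// A user with one business rule attached.
class User
{
    public $userId;
    public $name;
    public $age;

    public function __construct(int $userId, string $name, int $age)
    {
        $this->userId = $userId;
        $this->name = $name;
        $this->age = $age;
    }

    // "Verify Old Enough to create Account"; the default of 18 is assumed.
    public function isOldEnough(int $minimumAge = 18): bool
    {
        return $this->age >= $minimumAge;
    }
}

// The back end is hidden behind this interface, so it can be replaced
// (for example by a MySQL-backed class) without touching callers.
interface UserStore
{
    public function save(User $user): void;
    public function find(int $userId): ?User;
    public function delete(int $userId): void;
    public function count(): int;
}

// Stand-in back end that keeps users in an array instead of a database.
class InMemoryUserStore implements UserStore
{
    private $users = [];

    public function save(User $user): void { $this->users[$user->userId] = $user; }
    public function find(int $userId): ?User { return $this->users[$userId] ?? null; }
    public function delete(int $userId): void { unset($this->users[$userId]); }
    public function count(): int { return count($this->users); }
}

// Middle tier: the only class the user interface code talks to.
class UserManager
{
    private $store;

    public function __construct(UserStore $store)
    {
        $this->store = $store;
    }

    // "Add new user": enforces the age rule before storing anything.
    public function addUser(User $user): bool
    {
        if (!$user->isOldEnough()) {
            return false;
        }
        $this->store->save($user);
        return true;
    }

    public function findUser(int $userId): ?User { return $this->store->find($userId); }
    public function deleteUser(int $userId): void { $this->store->delete($userId); }
    public function numberOfUsers(): int { return $this->store->count(); }
}

// User interface code sees only UserManager, never the store.
$manager = new UserManager(new InMemoryUserStore());
$manager->addUser(new User(1, 'Alice', 34));
$manager->addUser(new User(2, 'Bob', 12));   // rejected: too young
echo $manager->numberOfUsers(), "\n";        // prints 1
```

Swapping InMemoryUserStore for a class that talks to a real database would leave the last four lines, which play the role of the user interface code, unchanged.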

What Goes into the Business Logic

Finally, in the middle tier of our web applications, we should change how we consider certain pieces of functionality. For example, if we had a certain set of rules to which something must comply, there would be an instinctive temptation to add PHP code to verify and conform to those rules in the implementation of the middle tier. However, rules change, often frequently. By putting these rules in our scripts, we have made it harder for ourselves to find and change them and increased the likelihood of introducing bugs and problems into our web application.

We would be better served by storing as many of these rules in our database as possible and implementing a more generic system that knows how to process these rules from tables. This helps reduce the risk and cost (testing, development, and deployment) whenever any of the rules change. For example, if we were implementing an airline frequent flyer miles management system, we might initially write code to see how miles can be used:

<?php
if ($user_miles < 50000) {
    // compare with ==, not =, so the category is tested rather than assigned
    if ($desired_ticket_cat == "Domestic") {
        if ($destination == "NYC" or $destination == "LAX") {
            $too_far = TRUE;
        } else if (in_range($desired_date, "Dec-15", "Dec-31")) {
            $too_far = TRUE;
        } else if ($full_moon == TRUE and is_equinox($desired_date)) {
            $too_far = FALSE; // it's ok to buy this ticket
        }
    } else {
        $too_far = TRUE;
    }
} else {
    // etc.
}
?>

Any changes to our rules for using frequent flyer miles would require changing this code, with a very high likelihood of introducing bugs. However, if we were to codify the rules into the database, we could implement a system for processing these rules.

MILES_REQUIRED  DESTINATION  VALID_START_DATE  VALID_END_DATE
55000           LAX          Jan-1             Dec-14
65000           LAX          Dec-15            Dec-31
55000           NYC          Jan-1             Dec-14
65000           NYC          Dec-15            Dec-31
45000           DOMESTIC     Jan-1             Dec-14
55000           DOMESTIC     Dec-15            Dec-31
etc...
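A generic processor for such a table might look roughly like the following. This is a sketch under several assumptions: the rows are shown as a PHP array rather than fetched from the database, the column names are lowercased versions of the table headings, and the function names (month_day_key, miles_required, can_buy_ticket) are hypothetical.

```php
<?php
// Rows as they might come back from a query on the rules table; the
// array shape and the function names are assumptions for illustration.
$rules = [
    ['miles_required' => 55000, 'destination' => 'LAX',      'valid_start_date' => 'Jan-1',  'valid_end_date' => 'Dec-14'],
    ['miles_required' => 65000, 'destination' => 'LAX',      'valid_start_date' => 'Dec-15', 'valid_end_date' => 'Dec-31'],
    ['miles_required' => 55000, 'destination' => 'NYC',      'valid_start_date' => 'Jan-1',  'valid_end_date' => 'Dec-14'],
    ['miles_required' => 65000, 'destination' => 'NYC',      'valid_start_date' => 'Dec-15', 'valid_end_date' => 'Dec-31'],
    ['miles_required' => 45000, 'destination' => 'DOMESTIC', 'valid_start_date' => 'Jan-1',  'valid_end_date' => 'Dec-14'],
    ['miles_required' => 55000, 'destination' => 'DOMESTIC', 'valid_start_date' => 'Dec-15', 'valid_end_date' => 'Dec-31'],
];

// Converts a "Mon-D" string such as "Dec-15" into a sortable integer
// (month * 100 + day), so dates within a year can be compared.
function month_day_key(string $monthDay): int
{
    static $months = [
        'Jan' => 1, 'Feb' => 2, 'Mar' => 3, 'Apr' => 4,  'May' => 5,  'Jun' => 6,
        'Jul' => 7, 'Aug' => 8, 'Sep' => 9, 'Oct' => 10, 'Nov' => 11, 'Dec' => 12,
    ];
    [$month, $day] = explode('-', $monthDay);
    return $months[$month] * 100 + (int) $day;
}

// Returns the miles needed for a destination on a date, or null when
// no rule covers that combination.
function miles_required(array $rules, string $destination, string $date): ?int
{
    $key = month_day_key($date);
    foreach ($rules as $rule) {
        if ($rule['destination'] === $destination
            && $key >= month_day_key($rule['valid_start_date'])
            && $key <= month_day_key($rule['valid_end_date'])) {
            return $rule['miles_required'];
        }
    }
    return null;
}

// True when a matching rule exists and the user has enough miles.
function can_buy_ticket(array $rules, int $userMiles, string $destination, string $date): bool
{
    $needed = miles_required($rules, $destination, $date);
    return $needed !== null && $userMiles >= $needed;
}

var_dump(can_buy_ticket($rules, 60000, 'LAX', 'Jul-4'));   // bool(true): 55000 needed
var_dump(can_buy_ticket($rules, 60000, 'LAX', 'Dec-20'));  // bool(false): 65000 needed
```

With this shape, changing a rule means updating a row rather than editing and redeploying PHP code.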

We will see more about how we organize our middle tier in later chapters, when we introduce new functionality that our business logic might want to handle for us.

Back End/Server

Our final tier is the place without which none of our other tiers would have anything to do. It is where we store the data for the system, validate the information we are sending and receiving, and manage the existing information. For most of the web applications we will be writing, it is our database engine and any additional information we store on the file system (most notably .xml files, which we will see in Chapter 23).

As we discussed in Chapter 8, "Introduction to Databases," and Chapter 9, "Designing and Creating Your Database," a fair amount of thought should go into the exact layout of your data. If improperly designed, it can end up being a bottleneck in your application that slows things down considerably. Many people fall into the trap of assuming that once the data is put in a database, accessing it is always going to be fast and easy.

As we will show you in the examples throughout this book, while there are no hard and fast rules for organizing your data and other back end information, there are some general principles we will try to follow (performance, maintainability, and scalability).

While we have chosen to use MySQL as the database for the back-end tier of our application, this is not a hard requirement of all web applications. We also mentioned in Chapter 8 that there are a number of database products available that are excellent in their own regard and suited to certain scenarios. We have chosen to use MySQL due to its popularity, familiarity to programmers, and ease of setup.

However, as we show in Appendix B, "Database Function Equivalents," there is nothing preventing us from trying others. And, having chosen a 3-tier layout for our web applications, the cost of switching from one database to another is greatly reduced (though still not something to be taken lightly).

n-Tier Architectures

For complicated web applications that require many different pieces of functionality and many different technologies to implement, we can take the abstraction of tiers a bit further and abstract other major blocks of functionality into different modules.

If we were designing a major E-Commerce application that had product catalogs, customer databases, payment processing systems, order databases, and inventory systems, we might want to abstract many of these components so that changes in any of them would have minimal impact on the rest of the system (see Figure 13-5).

Figure 13-5. An n-tiered web application setup example.

We should be constantly thinking about maintaining a balance between abstraction and its cost. The more we break areas of functionality into their own modules, the more time we spend implementing and executing code to communicate between the different modules. If we see something that is likely to be reused, has a particular implementation that lends itself to an isolated implementation, or is very likely to change, then we would probably consider putting it in its own module. Otherwise, we should be asking ourselves about the benefits of doing this.

Any way that we decide to implement our web application, we will stress the importance of planning in advance. Before you start writing code, you should have an idea of how the various layers of the application will be implemented, where they will be implemented, and what the interface will look like. Not having this in mind before you begin is a sure way to guarantee wasted time as problems crop up. (This is not to say that a well-planned system will not encounter problems as it is being implemented, but the goal is to make them fewer and less catastrophic.)

Performance and Stability

We have frequently mentioned that the performance and scalability of our web applications are of high concern to us, but we must first define these. Without this, it is difficult to state what our goals are or to decide how to measure our success against them.

Performance

Performance is very easy to define for our web applications. It is simply a measure of how much time elapses between when the user asks for something to happen and his receiving confirmation of its happening, usually in the form of a page. For example, an E-Commerce shoe store that takes 20 seconds to show you any pair of shoes you ask to see is not going to be perceived as having favorable performance. A site where that same operation takes 2 seconds will be better received by the user. The potential locations for performance problems are many. If our web application takes forever to compute things before sending a response to the user, our database server is particularly slow, or our computer hardware is out of date, performance may suffer.
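One simple way to start measuring the part of this time that is spent in our own code is to wrap an operation in a timer. The sketch below uses PHP's microtime function; the helper name time_call and the simulated 50-millisecond operation are assumptions made for illustration. Note that it captures only the server-side portion of the delay, not network transfer or browser rendering time.

```php
<?php
// Hypothetical helper: runs a callable and reports how long it took.
// Returns a two-element array: [result of the callable, elapsed seconds].
function time_call(callable $operation): array
{
    $start = microtime(true);              // wall-clock seconds as a float
    $result = $operation();
    $elapsed = microtime(true) - $start;
    return [$result, $elapsed];
}

// Stand-in for a slow operation such as a catalogue query.
[$result, $seconds] = time_call(function () {
    usleep(50000);                         // pretend to work for 50 ms
    return 'shoe page';
});

printf("%s generated in %.3f seconds\n", $result, $seconds);
```

Timing individual operations this way is one means of narrowing down which part of a request is responsible when a page is slow.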

However, beyond things over which we have complete control, there are other problems that can adversely affect our performance. If our ISP (Internet Service Provider) or the entire Internet slows down significantly (as it has a tendency to do during large news events or huge virus/worm outbreaks), our web site's performance could suffer. If we find ourselves under a denial-of-service (DoS) attack, our web site could appear unusually slow and take many seconds to respond to seemingly simple requests.

The key item to remember is that the user does not know (and probably does not care) what the source of the problem is. To his eyes, our web application is slow, and after a certain point, it irritates or discourages him from using our application. As web application authors, we need to constantly be thinking about the potential problems we might encounter and ways to deal with them as best we can.

Scalability

While there is a tendency to group them together, scalability and performance are very different beasts. We have seen that performance is a measure of how quickly a web site or web application responds to a particular request. Scalability, on the other hand, measures the degree to which the performance of our web site degrades under an increasing load.

A web application that serves up a product catalog page in 0.5 seconds when 10 users are accessing the site but 5 seconds when 1,000 users are using the site takes ten times as long to respond under the heavier load (a 900 percent increase in response time). An application that serves up a product catalog in 10 seconds for 10 users and 11 seconds for 1,000 users responds only 10 percent more slowly under the same load and has better scalability (but obviously worse performance)!
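The comparison can be expressed as a simple ratio of response time under load to response time at baseline. The following sketch uses the figures from the example above; the function name slowdown_factor is hypothetical.

```php
<?php
// Ratio of loaded to baseline response time; 1.0 means no degradation.
function slowdown_factor(float $baselineSeconds, float $loadedSeconds): float
{
    if ($baselineSeconds <= 0) {
        throw new InvalidArgumentException('Baseline time must be positive.');
    }
    return $loadedSeconds / $baselineSeconds;
}

// Response times at 10 users versus 1,000 users for each application.
$fastButFragile = slowdown_factor(0.5, 5.0);
$slowButSteady  = slowdown_factor(10.0, 11.0);

printf("First application:  %.1fx slower under load\n", $fastButFragile);  // 10.0x
printf("Second application: %.1fx slower under load\n", $slowButSteady);   // 1.1x
```

A factor close to 1.0 indicates good scalability regardless of how fast or slow the baseline response is.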

While we will not argue that the latter server is the better setup, it serves to demonstrate that observing great performance on a simple test system is not sufficient; we need to think about how that same application will respond to many users accessing it at the same time. As we will discuss in Chapter 29, "Development and Deployment," you will want to consider using testing tools that simulate high loads on your machines to see how the application responds.

Improving Performance and Scalability

Many people, when faced with the possibility of suboptimal performance or scalability, suggest that you "throw more hardware" at the problem: buy a few more servers and spread the load across more machines. While this is a possible approach to improving performance, it should never be the first one taken or considered.

If our web application were so poorly designed that adding a $20,000 server only gave us the ability to handle 10 more users at a time or improved the speed of our product catalog listings by 5 percent, we would not be spending our money wisely.

Instead, we should focus on designing our applications for performance, thinking about potential bottlenecks, and taking the time to test and analyze our programs as we are writing them. Unfortunately, there are no hard and fast rules or checklists we can consult to find the magic answers to our performance problems. Every application has different requirements and performance goals, and each application is implemented in a very different way.

However, we will endeavor to show you how we think about performance in the design of any samples or web applications we develop.
