The more jobs flowing through Gearman, the more likely something will go wrong. Queues can get backed up, workers can crash and performance can degrade. It is important to monitor the status of the Gearman ecosystem and be proactive about fixing problems. We can do this using the gearadmin tool.
Gearman comes with a few tools that make development and testing easier. One of them is the gearman program, which ships by default with the gearmand package. Do not confuse gearman with gearmand. The gearmand daemon is what manages the queue, clients and workers. The gearman program is a tool for quickly creating simple clients and workers. The options for gearman can be slightly confusing, so I will go through a set of examples on how to use them.
The Gearman examples on php.net are a great primer for grokking how the Gearman client and worker interact with each other. One gripe I have is that the examples declare global functions for the worker to register. I feel this leads developers down the wrong path. With PHP 5.3, there is an easier solution though: anonymous functions.
The Gearman project has been slowly migrating from C to C++. This migration has gone under the radar due to the popularity of CentOS 5 and the gearmand version it provides, 0.14. This version of gearmand worked with any version of pecl/gearman and there was never any compelling reason to upgrade gearmand. That changed with the release of pecl/gearman 1.0.
The Standard PHP Library (SPL) is a powerful set of tools that is often overlooked. It is very common to see an SPL talk at conferences, but those talks usually just introduce each SPL class to the audience without giving any real-world examples. I am going to show you a real-world example of how to use the SPL FilterIterator in an ecommerce website.
Writing the scaffolding for gearman workers is a pretty trivial task using the pecl/gearman extension. Keeping that scaffolding consistent between all the gearman workers in your application can get tough. I created a script that will remove the boilerplate gearman code and allow gearman worker scripts to simply be function definitions.
I read a really good post from Vic on his move away from ORM frameworks. I did not agree with everything he said though and wanted to start a discussion. Unfortunately, there is no way to leave comments on his blog. The next best thing is to post it here.
Gearman is one of my favorite technologies to use. So much so, in fact, that I recently decided to take over the maintenance of pecl/gearman. While asynchronous tasks are a great feature, I find the ability to run multiple tasks in parallel to be much more useful. One of the biggest shortcomings of this approach was that uncaught worker exceptions would be treated as a successful completion of a job. I used to wrap all my workers in a generic try/catch block to prevent this from happening. With the latest commits to pecl/gearman, I can now use the exception callback to properly track the exceptions.
Coding standards are religious in nature, ranking high on the list near vim vs emacs. Paul Reinheimer woke up many in the twitterverse with a simple post:
In my last post I explained how to build a development version of a pecl extension. Now we will go through the bug lifecycle in the pecl/memcache extension. Besides writing the actual C code to fix the bug, it is considered a best practice to write a test that verifies the bug has been fixed. I will use PECL bug #16442 - memcache_set fail with integer value as an example, even though it has already been fixed.
In my last post I explained how to efficiently check out the php svn repository. Now we need to start building pecl extensions and even php itself. I prefer to use CentOS for my linux needs and naturally use RPMs to track all my packages. This means I have a stable version of php installed with all the various extensions that I could want. Rather than messing with this stable version, I am going to build a custom debug build of php in /usr/local. I say "debug", because this build of php will use the --enable-debug option to allow easy debugging using gdb. Since I am doing pecl extension development, I don't want to build the trunk version of php. I want to build my pecl extensions against the most recent stable version of php to isolate environmental issues as much as possible.
The svn repository for PHP is rather large. Trying to check out the entire repo is both time-consuming and a waste of space. Most people, including myself, are only concerned with a subset of the repository. One of the advantages svn has over git is the ability to do partial checkouts of the repository. I am going to borrow from an old email Rasmus sent that details how to do a partial checkout of the PHP source code.
The gearman job queue is great for farming out work. After reading a great post about Poison Jobs, I limited the number of times the gearman daemon will retry a job. This seemed fairly straightforward to me: if a job fails, then the gearman daemon will retry the job the specified number of times. I learned the hard way that it was not that simple. There are specific criteria the gearman daemon follows in order to retry a job.
I just recently stumbled upon PHPUnit's --bootstrap flag. I used to bootstrap each of my unit tests using a require statement at the top of the file. I always found this very tedious, but did not want to write some script to wrap each unit test. The --bootstrap flag solves this problem quite nicely.
Reuse is a term often used amongst developers. It usually carries with it a positive connotation and a developer writing reusable code is seen as a good thing. I think there are a lot of developers who have a completely different understanding of what code reuse means. When I talk about code reuse, I am talking about reusing logic within the code. Based on code reviews, it seems the most common definition of reuse is: anytime a function or method is used by two or more callers. This definition fails to realize the true meaning of reuse and can lead to problems. In the name of "reuse", I have noticed some developers group common code at the application level by creating functions or methods that solve some pseudo-generic problem. I call this type of reuse inverted reuse.
I sometimes help update the PHP documentation. I have not done it in a while since I started maintaining pecl/memcache. However, there was a recent bug submission where I felt the documentation for pecl/memcache should be updated. A lot of work has been done to the documentation tools since I last updated documentation. I went to http://doc.php.net for a quick primer on how to generate some new documentation output so I could test my changes and found the documentation for generating documentation a little hard to follow.
It has been said that all languages, over time, implement a dialect of Lisp. PHP appears to be no exception.
If you separate your business logic from your data access logic, the last thing you want to do is make your business logic unit tests reliant on the database. This is normally not a big deal: retrieve the data, store it in an array and pass it off to the class with the business logic. Mocking the data for the unit test simply requires you to hardcode some array information in the test. However, I recently ran into a case where I wanted to pass Zend_Db_Table_Row and Zend_Db_Table_Rowset objects to the business logic and mocking them was not so easy.
If you have ever visited StackOverflow.com you may have noticed the ads for Splunk. Splunk aggregates log files together and provides a web interface to search through those logs. The setup for php is easy: set the php.ini error_log value to "syslog". The Splunk instructions show you how to add a single line to your syslog.conf to have syslog send those messages over to Splunk.
I was writing some code today and not using Test-Driven development. The reason was that I did not have a good understanding of what I was writing, so I decided to write some of the guts before writing the tests. In the process of writing the guts, I recognized that I was paying very close attention to how I was going to later test each of the methods I was writing. I was paying especially close attention to the Law of Demeter. The idea behind the Law of Demeter is to keep units of code distinct from one another. So how did this relate to my code? To put it simply, my business logic methods did not use get methods.
A while back I wrote a post about using Facebook's Thrift. One comment asked me to post the PHP client used to connect to the C++ server I was demo'ing. Most of the client is boiler-plate code generated by Thrift, so I chose to omit it at the time. Here it is:
Phing is a PHP port of Java Ant. It is a great tool to use in development. It standardizes a lot of build scripts you would have to maintain internally. Unfortunately, examples seem to be lacking. As a quick introduction to Phing, I will show how you can check all your php scripts for syntax errors.
The Zend Framework ships with SOAP functionality and one especially neat class called Zend_Soap_AutoDiscover. This class uses a comment docblock to auto-generate a WSDL at runtime. I won't go into the details of how it works here, but you can check the Zend Framework documentation for an example. When using this class at work, I noticed the WSDL would not always generate correctly. After a lot of digging around, I found the cause: eAccelerator.
There is a new magic constant in PHP 5.3: __DIR__. This new constant does not actually do anything new. It replaces the use of dirname(__FILE__) that is commonly found in code written for prior PHP versions.
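As a minimal sketch of that equivalence, the snippet below compares the two forms and builds a path relative to the current script. The file name config.php is a hypothetical placeholder used only for illustration.

```php
<?php
// Both expressions resolve to the directory that contains the current file.
$old = dirname(__FILE__);   // idiom used before PHP 5.3
$new = __DIR__;             // magic constant available from PHP 5.3

var_dump($old === $new);    // bool(true)

// Typical use: build a path relative to this script.
// 'config.php' is a hypothetical file name, not something from the post.
$configPath = __DIR__ . '/config.php';
echo $configPath, PHP_EOL;
```

Because __DIR__ is resolved at compile time, it also avoids a function call on every include, although the difference is unlikely to matter in practice.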
Using non-relative URLs during early development can alleviate a lot of growing pains. This may seem counter-intuitive at first, but hear me out. We all learned long ago to stop hard-coding the domain name into the href attribute of an anchor tag. Instead, we used relative URLs such as '/index.php' to make our code much more portable. However, relative URLs become a pain point when trying to scale your website. Let's review some common scenarios that can be averted with some proper planning.
The namespace operator in PHP 5.3 is a backslash (\). One of the criticisms of this operator is that the code starts to look like directory paths on Windows. The added side benefit of this is that spl_autoload() knows how to autoload classes that use a namespace style that matches the directory layout.
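To illustrate the idea of a namespace mirroring the directory layout, here is a rough sketch. It does not reproduce the built-in spl_autoload() lookup rules; it registers an explicit closure with spl_autoload_register() instead. The Shop\Cart\Item class, the classToPath() helper and the temporary directory are all hypothetical names made up for this example.

```php
<?php
// Hypothetical base directory; a temp dir keeps the sketch self-contained.
$baseDir = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($baseDir . '/Shop/Cart', 0777, true);

// Write a namespaced class to the path that mirrors its namespace.
file_put_contents(
    $baseDir . '/Shop/Cart/Item.php',
    "<?php\nnamespace Shop\\Cart;\nclass Item { public function name() { return 'item'; } }\n"
);

// Translate a fully qualified class name into a file path under $baseDir.
function classToPath($baseDir, $class)
{
    return $baseDir . '/' . str_replace('\\', '/', ltrim($class, '\\')) . '.php';
}

// Assumption: an explicit autoloader, not the default spl_autoload() behavior.
spl_autoload_register(function ($class) use ($baseDir) {
    $path = classToPath($baseDir, $class);
    if (is_file($path)) {
        require $path;
    }
});

$item = new \Shop\Cart\Item();   // the class file is loaded on first use
echo $item->name(), PHP_EOL;
```

The only real work is swapping backslashes for directory separators, which is why the Windows-path resemblance turns out to be convenient.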
When I began taking over the web development project at work, I noticed a developer using a lot of static members and methods in his class definitions. His explanation was that it was an optimization he used to improve performance. Unfortunately, he had no metrics to back the statement up. So I set out to do some of my own.
Stefan Esser gave a presentation on Secure Programming with the Zend Framework at the 2009 Dutch PHP Conference. While the presentation was good, one thing that bothered me was the way authentication was being handled.
I started using the phpDocumentor for Vim (PDV) script written by Tobias Schlitt. Very quickly I found a bug with one of the regular expressions used to parse apart the class definition. Tobias does not seem to be maintaining this plugin anymore, so I decided I would fix the bug and submit a new version to vim.org. I packaged up the new version of the script and went to update the PDV page on vim.org only to find out I can't. There isn't even a mechanism for me to post a comment.
At work we were using VIM for all editing, except PHP. Back when the decision was made to use PHP for all web development, consultants told us we needed an IDE that offered all kinds of tools that VIM lacks. So we shopped around for IDEs and eventually bought Zend Studio licenses for everyone. Today those licenses are collecting dust. Even with the feature-rich toolset of Zend Studio, and other IDEs, none seem to satisfy our needs like VIM does.
At work I maintain a handful of custom PHP extensions. When someone reports a problem with one of the extensions, I want to fire up gdb right away and see exactly what is going on. In order to do this, I build a custom php binary with debugging enabled. I leave this binary inside my home directory so as not to affect my installed production php binary. I should now be able to rebuild my custom extensions with debugging enabled and start debugging. But wait, the configure script rejects the flag.