PHP 5.4 And Short Syntax

I’m actually pretty excited about PHP 5.4’s release. I still manage to write a fair amount of PHP these days.

I suspect it will be quite some time until I have enough PHP 5.4 targets to use some of the newer features like traits and the short array syntax, but that’s OK. Performance and memory improvements are always welcome. I doubt I’ll touch the built-in web server much. It seems to be intended only for testing, and I’ve rarely run PHP without Apache (and when I did, the CLI was what I wanted).

One slight disappointment is that there is a short array syntax, but no short object syntax. I guess you could always use casting to make it happen like this:

$obj = (object) array('foo' => 'bar');
$obj = (object) ['foo' => 'bla', 'bar' => 2]; // PHP 5.4+

But that still isn’t ideal. I’d prefer to see:

$obj = {'foo' => 'bla', 'bar' => 2};

Maybe next time around.
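
Part of why the cast isn’t ideal: it is shallow, so nested arrays stay arrays. A rough sketch of a recursive workaround follows. Note that to_object() is a hypothetical helper, not something PHP ships with, and the rule it uses (any string key means "object") is just one assumption about what you’d want:

```php
<?php
// Hypothetical helper: recursively convert associative arrays to stdClass.
// Arrays with only integer keys are kept as arrays (their elements are still converted).
function to_object($value)
{
    if (!is_array($value)) {
        return $value;
    }

    // Convert children first; array_map() keeps the keys when given one array.
    $converted = array_map('to_object', $value);

    foreach (array_keys($value) as $key) {
        if (is_string($key)) {
            return (object) $converted;
        }
    }

    return $converted;
}

$obj = to_object(array('foo' => 'bla', 'bar' => array('baz' => 2)));
echo $obj->bar->baz, "\n";
```

With the plain (object) cast, $obj->bar would still be an array and you’d have to write $obj->bar['baz'].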

PHP’s include_once() Is Insanely Expensive

I’ve always heard that include_once() and require_once() were computationally expensive in PHP, but I never knew how much. I tested the following on my 2010 i7 MacBook Pro using PHP 5.3.4 as shipped by Apple.

This first test uses include_once() and lets PHP keep track of whether the file has already been included:

$includes = array();
$file = 'benchmarkinclude';

for($i = 0; $i < 1000000; $i++){
    include_once($file . '.php');
}

Took: 10.020140171051 sec

This second test uses include() with in_array() to track whether the file has already been loaded:

$includes = array();
$file = 'benchmarkinclude';

for($i = 0; $i < 1000000; $i++){
    if(!in_array($file, $includes)){
        include($file . '.php');
        $includes[] = $file;
    }
}

Took: 0.27652382850647 sec

For both, the include had the following computation:

$x = 1 + 1;
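
The timing code itself isn’t shown here. A minimal sketch of how such a measurement could be taken, assuming microtime(true) for wall-clock time; time_it() is a hypothetical helper and the loop body is only a stand-in for the tests above:

```php
<?php
// Hypothetical helper: run $fn once and return the elapsed wall-clock seconds.
function time_it($fn)
{
    $start = microtime(true);
    $fn();
    return microtime(true) - $start;
}

$elapsed = time_it(function () {
    for ($i = 0; $i < 1000000; $i++) {
        $x = 1 + 1; // stand-in for the include being benchmarked
    }
});

printf("Took: %f sec\n", $elapsed);
```

One caveat: code inside the closure runs in its own scope, so an include placed there sees the closure’s variables rather than the global ones.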

Lesson learned: avoid the _once functions when you can.

Update: That means something like this will theoretically be faster:

$rja_includes = Array();
function rja_include_once($file){
    global $rja_includes;
    if(!in_array($file, $rja_includes)){
        include($file);
        $rja_includes[] = $file;
    }
}
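
A variation on the same idea, as a sketch: in_array() scans the whole list on every call, so with many files it may be cheaper to store the loaded paths as array keys and check them with isset(). The function name is hypothetical, and it assumes callers always pass the same path string for the same file (no realpath() normalization is done):

```php
<?php
// Hypothetical variant of rja_include_once(): loaded files are array keys,
// so the check is a hash lookup instead of a linear in_array() scan.
function rja_include_once_keyed($file)
{
    static $loaded = array();

    if (isset($loaded[$file])) {
        return false; // already included, nothing to do
    }

    $loaded[$file] = true;
    include $file; // note: the file runs in this function's scope
    return true;
}
```

Because the include happens inside a function, any variables the included file defines are local to that call unless it uses global or $GLOBALS.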

PHP 5.3.4 Changes rand(), Filled My Error Log, Spikes Load

I ran into a peculiar situation with a PHP web application that went from working for several years without incident to suddenly resulting in timeouts and spiking the load on my server. Some investigation traced it back to a seemingly benign and obscure change to PHP’s rand() implementation between 5.3.3 and 5.3.4.

To summarize several hundred lines of code: it gets a value from an array where the index is a random number between X and Y. X and Y are highly unpredictable by nature of the application. It keeps trying with different values until something is returned. Something like:

function random($a, $b){
   if(MT_RAND){
       return mt_rand($a, $b);
   }
   return rand($a, $b);
}
 
$value = 0;
while($value == 0){
    $value = $arr[random($x, $y)];
}
return $value;

See it? If you don’t, don’t feel bad. I didn’t see it initially either.

Prior to PHP 5.3.4, rand() and mt_rand() did not check whether the max is greater than the min. This changed as a result of bug 46587. That four-line change made an impact.

Take this example code:

print "right: " . mt_rand(1, 5) . "\n";
print "wrong: " . mt_rand(5, 1) . "\n";

In PHP 5.3.3 you’d get:

$ php test.php
right: 3
wrong: 4

$ php test.php
right: 2
wrong: 5

Despite the incorrect order of max/min it actually worked just fine. It had done so at least since PHP 4.3 (circa 2003) as far as I’m aware.

In PHP 5.3.4:

$ php test.php
right: 2
PHP Warning:  mt_rand(): max(1) is smaller than min(5) in /test/test.php on line 4
wrong: 

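As a result, the while loop never terminated until the timeout was reached: with the bounds reversed, every call emitted a warning and returned nothing usable, so the lookup never produced a non-zero value.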

The solution is obviously trivial once you actually trace this bug back:

function random($a, $b){

    // If a is greater, flip order
    if($a > $b){
        $tmp = $a;
        $a = $b;
        $b = $tmp;
    }

    if(MT_RAND){
        return mt_rand($a, $b);
    }
    return rand($a, $b);
}

The bug produced several GB of warnings in my error log in a matter of hours. You can also see how it (the brown area) dropped off once the fix was deployed, as measured by % of wall clock time:

[Chart: Transaction %]

It’s the little things sometimes that cause all the trouble.
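
A more defensive take on the retry loop, as a sketch: normalize the bounds and cap the number of attempts, so a bad range can’t spin until the timeout. pick_nonzero() and the $maxAttempts default are hypothetical and not taken from the original application:

```php
<?php
// Hypothetical helper: return a non-zero element of $arr chosen by random index
// between $min and $max, or null if none is found within $maxAttempts tries.
function pick_nonzero(array $arr, $min, $max, $maxAttempts = 100)
{
    // Flip reversed bounds so mt_rand() always gets min <= max.
    if ($min > $max) {
        list($min, $max) = array($max, $min);
    }

    for ($i = 0; $i < $maxAttempts; $i++) {
        $index = mt_rand($min, $max);
        if (isset($arr[$index]) && $arr[$index] != 0) {
            return $arr[$index];
        }
    }

    return null; // give up instead of looping forever
}
```

The caller then has to handle null explicitly, which is a lot easier to spot in a log than a request that quietly runs into the timeout.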

Another Brick In The Facebook Wall

I ran across a problem recently while trying to write to a user’s wall using the Facebook API. The Facebook documentation is hardly sane as it’s a mix of languages, not entirely up to date, and lacks good examples. The error messages are hardly ideal either. “A session key is required” at least leads me in the right direction. “Invalid parameter” is just unacceptable and makes me stabby.

So here’s some cleaned up pseudocode I pulled together that will hopefully be of use to others who bang their heads against the wall. This “works for me” in my limited testing over several days:

require_once('./facebook-platform/php/facebook.php');

$facebook = new Facebook($apiKey, $appSecret);

// This gets us the uid
$canvasUser = $facebook->get_canvas_user();

// And the session key
$sessionKey = $facebook->api_client->session_key;

// You need both of these permission bits
$user = $facebook->require_login($required_permissions = 'publish_stream,offline_access');

// You'll likely have an application sitting here and at
// some point in your application be doing the following

// Here's where we actually set the status
$facebook->api_client->call_method("facebook.status.set", array(
    'uid' => $canvasUser,
    'status' => "All in all it's just another brick in the wall.",
    'session_key' => $sessionKey
));

Getting the right permissions is key.

The thing that ends up being the most confusing is the session_key. After reading the docs, I was inclined to do:

$token = $facebook->api_client->auth_createToken();
$sessionKey = $facebook->api_client->auth_getSession($token);

What you really want is:

$sessionKey = $facebook->api_client->session_key;

You can also adapt this to use stream.publish if you’d like.

Facebook’s HipHop For PHP

I mentioned the other day that Facebook was about to open source a method for speeding up PHP. Today they announced HipHop, a code transformation tool that converts PHP into C++ and compiles it using g++. There is apparently a server component to this strategy as well.

I’m slightly skeptical that this approach will have much more success than the other attempts in the past. This approach may make sense for Facebook, but I don’t think it will pay off for most smaller (relatively speaking) sites.

I think for most users, something similar to Unladen Swallow would be best. That effort for Python is trying to build a custom virtual machine with a JIT on top of LLVM. Perhaps even Nanojit could be an option.

I suspect HipHop will be a fork more than anything else. Regardless, it’s a pretty cool project and some really interesting technology.

Facebook’s New PHP “Runtime”

According to SDTimes, Facebook is about to release a new open source project where it has either rewritten the PHP runtime (unlikely) or built a PHP compiler (more likely).

There is another possibility. It could be a Zend extension acting as an opcode cache (APC, XCache, etc.) and a FastCGI replacement.

It’s also possible they used Quercus as either a starting point or inspiration and it’s actually Java based, but that sounds unlikely.

Regardless, it will be interesting to see what comes of this.

PHP Namespacing

The PHP folks have finally announced that PHP will get namespacing in the form of '\', the universal escape character. They really should have gone with the standard '::' or ':::'. Using '\' is going to work well.

I was thinking something like this would be more appropriate (background on compatibility here):

......................................__................................................
.............................,-~*`¯lllllll`*~,..........................................
.......................,-~*`lllllllllllllllllllllllllll¯`*-,....................................
..................,-~*llllllllllllllllllllllllllllllllllllllllllll*-,..................................
...............,-*llllllllllllllllllllllllllllllllllllllllllllllllllllll.\.......................... .......
.............;*`lllllllllllllllllllllllllll,-~*~-,llllllllllllllllllll\................................
..............\lllllllllllllllllllllllllll/.........\;;;;llllllllllll,-`~-,......................... ..
...............\lllllllllllllllllllll,-*...........`~-~-,...(.(¯`*,`,..........................
................\llllllllllll,-~*.....................)_-\..*`*;..)..........................
.................\,-*`¯,*`)............,-~*`~................/.....................
..................|/.../.../~,......-~*,-~*`;................/.\..................
................./.../.../.../..,-,..*~,.`*~*................*...\.................
................|.../.../.../.*`...\...........................)....)¯`~,..................
................|./.../..../.......)......,.)`*~-,............/....|..)...`~-,.............
..............././.../...,*`-,.....`-,...*`....,---......\..../...../..|.........¯```*~-,,,,
...............(..........)`*~-,....`*`.,-~*.,-*......|.../..../.../............\........
................*-,.......`*-,...`~,..``.,,,-*..........|.,*...,*...|..............\........
...................*,.........`-,...)-,..............,-*`...,-*....(`-,............\.......
......................f`-,.........`-,/...*-,___,,-~*....,-*......|...`-,..........\........ 
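
For reference, a minimal sketch of what the backslash separator looks like in practice. The namespace and function names are made up for illustration:

```php
<?php
// Hypothetical namespace and function names, for illustration only.
namespace Acme\Util;

function greet($name)
{
    return "Hello, $name";
}

// A fully qualified name starts with a leading backslash.
echo \Acme\Util\greet('world'), "\n";

// Namespace names inside quoted strings need care, since "\" also starts
// escape sequences there. Single quotes with doubled backslashes are the safe form.
$callback = 'Acme\\Util\\greet';
echo $callback('again'), "\n";
```

The string form is where the escape character choice starts to bite: in double quotes, a name like "Acme\new" would silently turn into a newline.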

Embedded JavaScript For Web 3.0

John Resig has an interesting blog post on embedded JavaScript. It’s something I’ve been thinking about for a little while.

It would be awesome to see a PHP extension to embed SpiderMonkey into PHP. As far as I’m aware Facebook is the only one that’s taken a step in that direction with FBJS, which uses Mozilla source code. Perhaps that could be a starting point.

Considering the ubiquity of JavaScript, using SpiderMonkey, which is already available for Perl and Python, or Rhino (for Java) would make sense. It would allow for JavaScript to be for logic what XML is for data. In my mind that is nirvana for the web.

XML made our data portable. JavaScript can make our logic portable. Seems practical enough, right?

For those who question security, it’s really up to the client to decide if it should parse JS, and what subset it should allow (perhaps no eval()). Having an API based on JS is really no less secure than any other language, including one that’s homemade. Its advantage is that it’s used everywhere else and makes your API easier to work with.

This could be a cornerstone of Web 3.0. Web 2.0 was largely about shared data and isolated small services. Web 3.0 could be about shared data and shared services.

Summer Of Code 2008

Google announced the project lists for Summer Of Code 2008. Some of the more interesting projects from my perspective:

Adium

Dojo Foundation

FFmpeg

Gallery

Inkscape

Joomla!

The Mozilla Project

MySQL

PHP

Pidgin

WebKit

WordPress

The Winner For Most Embedded Is: SQLite

So the format war of Blu-ray vs. HD-DVD is over. There are still several other rather significant battles going on in the tech world right now that aren’t Microsoft vs. Apple or Yahoo vs. Google. For example:

Adobe Air vs. Mozilla Prism vs. Microsoft Silverlight

Google Gears vs. HTML5 Offline support

Android vs. iPhone SDK vs. Symbian

Ruby On Rails vs. PHP

Not every case will have a true “winner”. That’s not really a bad thing. Choice is good. In some cases they will merge to form one standard, such as what’s likely for offline web applications.

What is interesting is that SQLite really dominates right now. It shows up in Adobe Air, Mozilla Prism, Google Gears, Android, the iPhone SDK (likely through the Core Data API), Symbian, Ruby On Rails (default DB in 2.0), and PHP 5 (bundled but disabled in php.ini by default). It’s becoming harder and harder to ignore that SQL survived the transition from mainframe to server, and is now going from server to client.

No longer is the term “database” purely referring to an expensive RAID5 machine in a datacenter running Oracle, MySQL, DB2 or Microsoft SQL Server. It can now refer to someone’s web browser, or mobile phone.

This has really just begun to have an impact on things. The availability of good information storage, retrieval, and sorting means far fewer poorly concocted solutions and much better applications. Client side databases are the next AJAX.
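
To make the "storage, retrieval, and sorting" point concrete, here is a small sketch using SQLite from PHP through PDO. It assumes the pdo_sqlite extension is enabled, and the table and column names are invented for the example:

```php
<?php
// Assumes pdo_sqlite is available; "notes" and "title" are made-up names.
$db = new PDO('sqlite::memory:'); // throwaway in-memory database, no server needed
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE notes (id INTEGER PRIMARY KEY, title TEXT NOT NULL)');

// Store a few rows with a prepared statement.
$insert = $db->prepare('INSERT INTO notes (title) VALUES (:title)');
foreach (array('pidgin', 'adium', 'inkscape') as $title) {
    $insert->execute(array(':title' => $title));
}

// Retrieve them sorted; the database does the ordering for us.
$titles = $db->query('SELECT title FROM notes ORDER BY title')
             ->fetchAll(PDO::FETCH_COLUMN);

print_r($titles);
```

Swapping 'sqlite::memory:' for a file path is all it takes to persist the data, which is much of the appeal for client-side use.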

Edit [2/27/2008 9:14 AM EST]: Added Symbian, since they also use SQLite. Thanks Chris.