My programming life is not very glamorous. Most of my time is spent in loops, usually working over arrays of objects that are loose wrappers around records stored in a database. Pretty much the bread and butter of all web applications.
Which is how I became fast friends with the foreach control structure in PHP:
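A minimal sketch of that pattern, assuming a hypothetical `Post` model and a hypothetical `load_posts()` loader (neither name comes from the original code):

```php
<?php
// Hypothetical stand-in for a model class that wraps one database record.
class Post
{
    public $id;
    public $title;

    public function __construct($id, $title)
    {
        $this->id = $id;
        $this->title = $title;
    }
}

// Hypothetical loader: builds every object up front and returns them all.
function load_posts()
{
    return array(
        new Post(1, 'First post'),
        new Post(2, 'Second post'),
    );
}

// The loop itself: one pass over an array of record-backed objects.
$titles = array();
foreach (load_posts() as $post) {
    $titles[] = $post->title;
}

echo implode(', ', $titles), "\n";
```

Note that `load_posts()` has already created every `Post` before the loop body runs once.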
Because of this, I’ve written a lot of code that returns large arrays of objects, only to be iterated over using a foreach. The problem with this method is that each object has to be instantiated in advance and shoved into an array before any work can be done on it. The longer the array of objects, the more memory and time required.
What I needed was a way to use foreach to instantiate a new object at the beginning of each iteration—and then discard it at the end—so that at no point would more than one object exist in memory.
It was looking like I’d have to leave my precious foreach behind (for a while loop) when I discovered that in PHP 5 I can define a class that implements PHP’s internal Iterator interface. The interface requires the five methods that let a foreach iterate over an object (rewind, valid, current, key, next), which gives me the power to decide when the individual objects in the collection are instantiated.
Meaning my code can continue to use the elegantly readable foreach, but instead of passing it an array of objects, I can pass it a custom Collection object with the ability to instantiate each child only when foreach requests it. Here’s the code:
<?php
class Collection implements Iterator
{
    private $class_name;
    private $rst;
    private $key = -1;
    private $value;
    private $length = false;

    public function __construct($class_name, $sql)
    {
        $this->class_name = $class_name;
        if (strtolower(substr(trim($sql), 0, 6)) == 'select') {
            // this is here for illustrative purposes
            // you probably want to wrap this in a DB class
            // (note: the mysql_* functions were removed in PHP 7;
            // use mysqli or PDO on current versions)
            $conn = @mysql_pconnect(DB_SERVER, DB_USER, DB_PASSWORD);
            @mysql_select_db(DB_NAME, $conn);
            $this->rst = @mysql_query($sql, $conn);
            $this->rewind();
        } else {
            // only select queries make sense for a collection
            throw new InvalidArgumentException('Collection expects a select query');
        }
    }

    // Jump back to the first row, unless we are already there.
    public function rewind()
    {
        if ($this->key != 0) {
            $this->key = 0;
            @mysql_data_seek($this->rst, 0);
            $this->cacheNext();
        }
    }

    // Fetch the next row and instantiate its object; false marks the end.
    private function cacheNext()
    {
        if ($row = mysql_fetch_assoc($this->rst)) {
            $this->value = new $this->class_name($row['id']);
        } else {
            $this->value = false;
        }
    }

    public function current()
    {
        return $this->value;
    }

    public function key()
    {
        return $this->key;
    }

    // Advancing replaces the cached object, so only one exists at a time.
    public function next()
    {
        $this->key++;
        $this->cacheNext();
        return $this->current();
    }

    public function valid()
    {
        return $this->current() !== false;
    }

    // Number of rows in the result set, looked up once and then cached.
    public function length()
    {
        if ($this->length === false) {
            $this->length = mysql_num_rows($this->rst);
        }
        return $this->length;
    }
}
?>
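To see the lazy instantiation without a database, here is a rough sketch of the same idea backed by a plain array of rows. `ArrayBackedCollection` and `Article` are hypothetical names made up for this illustration, and this version builds each object inside `current()` rather than caching it ahead of time as the class above does.

```php
<?php
// Hypothetical model class; the static counter records how many
// instances have been constructed so far.
class Article
{
    public static $instances = 0;
    public $id;

    public function __construct($id)
    {
        $this->id = $id;
        self::$instances++;
    }
}

// Assumption: rows come from a plain array rather than a MySQL result,
// so the sketch runs without a database.
class ArrayBackedCollection implements Iterator
{
    private $class_name;
    private $rows;
    private $key = 0;

    public function __construct($class_name, array $rows)
    {
        $this->class_name = $class_name;
        $this->rows = array_values($rows);
    }

    // The attribute below silences PHP 8.1+ return-type deprecation
    // notices; older PHP versions read it as an ordinary # comment.
    #[\ReturnTypeWillChange]
    public function rewind()
    {
        $this->key = 0;
    }

    #[\ReturnTypeWillChange]
    public function valid()
    {
        return isset($this->rows[$this->key]);
    }

    #[\ReturnTypeWillChange]
    public function current()
    {
        // The child object is built only when foreach asks for it.
        $class = $this->class_name;
        return new $class($this->rows[$this->key]['id']);
    }

    #[\ReturnTypeWillChange]
    public function key()
    {
        return $this->key;
    }

    #[\ReturnTypeWillChange]
    public function next()
    {
        $this->key++;
    }
}

$rows = array(array('id' => 10), array('id' => 20), array('id' => 30));
$ids = array();
$created = array();
foreach (new ArrayBackedCollection('Article', $rows) as $index => $article) {
    $ids[$index] = $article->id;
    $created[] = Article::$instances; // instances built so far
}
```

Inside the loop, `Article::$instances` grows by one per iteration, which is the behavior the Collection class is after: nothing is instantiated until foreach asks for it.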
One possible improvement: each constructor of our model classes accepts an id parameter, which is used to load the rest of the fields for that record from the DB. But since the Collection object already has to execute a select query to get the id from the DB, it might as well grab the rest of the fields for that record at the same time, and then use them to instantiate a child object without an additional database select. What I don’t know is how the time saved by a single select * from table compares to the memory saved by running select id from table and then select * from table where id = $id for each record.
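One way that improvement could look, sketched with a hypothetical `Author` model whose "table" is just an in-memory array. The `fromRow()` factory, the `$queries` counter, and `allRows()` are all assumptions for illustration. The sketch only counts simulated lookups, so it says nothing about the real time-versus-memory trade-off.

```php
<?php
// Hypothetical model. The constructor loads a record by id (one lookup
// per object); fromRow() fills the object from fields already in hand.
class Author
{
    public static $queries = 0; // counts simulated per-record lookups

    // Stand-in for a database table.
    private static $table = array(
        1 => array('id' => 1, 'name' => 'Ada'),
        2 => array('id' => 2, 'name' => 'Grace'),
    );

    public $id;
    public $name;

    public function __construct($id = null)
    {
        if ($id !== null) {
            // Stands in for: select * from table where id = $id
            self::$queries++;
            $this->fill(self::$table[$id]);
        }
    }

    public static function fromRow(array $row)
    {
        $author = new self();
        $author->fill($row);
        return $author;
    }

    private function fill(array $row)
    {
        $this->id = $row['id'];
        $this->name = $row['name'];
    }

    // Stands in for the collection's own select query.
    public static function allRows()
    {
        return array_values(self::$table);
    }
}

// Strategy A: the collection knows only the id; each object looks itself up.
foreach (Author::allRows() as $row) {
    $author = new Author($row['id']);
}
$lookups_by_id = Author::$queries;

// Strategy B: the collection hands the whole row to the object.
Author::$queries = 0;
$names = array();
foreach (Author::allRows() as $row) {
    $names[] = Author::fromRow($row)->name;
}
$lookups_from_row = Author::$queries;
```

Strategy B never triggers a per-record lookup, at the cost of the collection's query carrying every field of every row.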
While I was away in France, they reorganized the office, and the tech group moved to the north “wing” which is nice because it has a door.
As much as I sometimes complain about the time it takes having to bus all the way to Sausalito (though working to change that) I’ve probably got the killerest view I’ll ever have from where I work.
I look right, and I see this. Water and an island!
That is, when I don’t have the blinds down to keep the glare off my screen :)
It must be a tight job market when the creator of PHP is trying to lure PHP developers to Yahoo, but if you’d rather work on a small team for an awesome startup supporting bloggers, Federated Media is also hiring PHP developers. Come work with me, Andre, Andy, Jonathan, and Ivan on the future of publishing!
Interested, but want to know more? Feel free to leave questions in the comments or send me an email.
I’d be remiss not to mention we fire up the grill for lunch sometimes. Like today.
At Federated Media, any individual can buy ads on any of our 80+ sites. When they submit their creative, it’s very likely it’ll be animated. There’s little doubt that animated ads are more effective at attracting attention than their static counterparts, but that’s also what makes them so annoying.
To strike a balance—respecting our authors and their audiences above all else—we require that any animation lasts no longer than 15 seconds.
The only problem is, we had no way to enforce this. So Andre, who first built FM’s platform, would watch every banner ad for at least 15 seconds to make sure it stopped animating. A nefarious ad designer could have successfully gotten past the Andre-check by animating the banner for 12 seconds, pausing it for 20 seconds, and then repeating continuously. What usually happens though is that someone submits a banner with animation that lasts for about 2-3 seconds, pauses for 3, and then repeats continuously.
Whenever an ad like this was submitted, we’d have to email the person who submitted it, ask if they could resubmit an ad with animation that stops after 15 seconds, etc etc etc. It was a really manual process.
So earlier this week I set out to see if there was an easy way to probe a GIF file programmatically to determine whether or not it was configured to loop continuously. PHP doesn’t do this natively, and I couldn’t find anything on Google that even came close to analyzing a GIF’s animation metadata—in any language.
So, I decided to write my own.
With a hex editor in one hand and the GIF 89a standard in the other, I set about understanding the image format and the Netscape 2.0 Application Extension (below) that added the option to loop the animation continuously (or for a specific number of iterations):
byte  1        : 33 (hex 0x21) GIF Extension code
byte  2        : 255 (hex 0xFF) Application Extension Label
byte  3        : 11 (hex 0x0B) Length of Application Block
                 (eleven bytes of data to follow)
bytes 4 to 11  : "NETSCAPE"
bytes 12 to 14 : "2.0"
byte  15       : 3 (hex 0x03) Length of Data Sub-Block
                 (three bytes of data to follow)
byte  16       : 1 (hex 0x01)
bytes 17 to 18 : 0 to 65535, an unsigned integer in
                 lo-hi byte format. This indicates the
                 number of iterations the loop should
                 be executed.
byte  19       : 0 (hex 0x00) a Data Sub-block Terminator.
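As a sanity check on that layout, the block can be assembled byte by byte. This is only a sketch; `netscape_loop_block()` is a hypothetical helper name, not part of any library.

```php
<?php
// Build the 19-byte block described above for a given iteration count.
function netscape_loop_block($iterations)
{
    return "\x21\xFF\x0B"         // bytes 1-3: extension code, label, block length 11
        . 'NETSCAPE' . '2.0'      // bytes 4-14: application name and version
        . "\x03\x01"              // bytes 15-16: sub-block length 3, then the value 1
        . pack('v', $iterations)  // bytes 17-18: 16-bit unsigned, low byte first
        . "\x00";                 // byte 19: sub-block terminator
}

$block = netscape_loop_block(5);
$length = strlen($block);

// unpack('C*') returns a 1-based array, which lines up with the
// byte numbers used in the table.
$bytes = unpack('C*', $block);
```

Here `$bytes[17]` holds the low byte of the iteration count and `$bytes[18]` the high byte.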
In the end, I didn’t write a complete GIF parser, but opted for a simple regular expression probe of the GIF file. Bonus: figuring out how to convert little-endian unsigned ints from hexadecimal to decimal.
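A regular expression probe along those lines might look like the sketch below. The function name and the exact pattern are my assumptions, not the original code. Because it scans the raw bytes instead of parsing the file, it could in principle match the same byte sequence inside image data. In the Netscape extension, a stored count of 0 means "loop forever".

```php
<?php
// Returns the iteration count stored in the NETSCAPE2.0 block, or null
// when the file has no such block. A stored value of 0 means loop forever.
function gif_loop_count($data)
{
    // The s modifier lets "." match any byte, including 0x0A.
    $pattern = '/\x21\xFF\x0BNETSCAPE2\.0\x03\x01(..)\x00/s';
    if (!preg_match($pattern, $data, $m)) {
        return null;
    }
    // Little-endian: the low byte comes first, so value = lo + hi * 256.
    return ord($m[1][0]) + (ord($m[1][1]) << 8);
}

// A made-up byte string: a GIF signature, a loop block with the
// count 300 (0x012C, stored as 2C 01), and the GIF trailer byte.
$looping = "GIF89a" . "\x21\xFF\x0BNETSCAPE2.0\x03\x01\x2C\x01\x00" . "\x3B";
$static  = "GIF89a" . "\x3B";

$loops = gif_loop_count($looping);
$none  = gif_loop_count($static);
```

PHP’s `unpack('v', ...)` performs the same low-byte-first conversion if you would rather not do the arithmetic by hand.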
So far I’ve tested the code on some pretty long GIF animations, and it seems accurate. So if you need to find the duration of a GIF animation and/or the number of times it loops, here’s the code to do so in PHP—which should be pretty easily translated into just about any other language.
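For the duration half, here is a rough sketch under the same regex-probe approach; the function name and pattern are assumptions for illustration. In GIF89a, each frame’s delay lives in a Graphic Control Extension: the bytes 0x21 0xF9 0x04, one byte of packed flags, a two-byte little-endian delay in hundredths of a second, a transparent color index, and a 0x00 terminator. Summing those delays gives the length of one pass through the animation.

```php
<?php
// Sums the per-frame delays found in Graphic Control Extension blocks
// and returns the length of one pass through the animation, in seconds.
// Like any byte-level probe, it can be fooled if image data happens to
// contain the same byte sequence.
function gif_pass_duration($data)
{
    $pattern = '/\x21\xF9\x04.(..).\x00/s';
    preg_match_all($pattern, $data, $matches);

    $hundredths = 0;
    foreach ($matches[1] as $raw) {
        $delay = unpack('v', $raw); // 16-bit unsigned, low byte first
        $hundredths += $delay[1];
    }
    return $hundredths / 100;
}

// A made-up byte string with two frames: delays of 150 (96 00) and
// 275 (13 01) hundredths of a second. Image data is omitted.
$gif = "GIF89a"
    . "\x21\xF9\x04\x00\x96\x00\x00\x00"
    . "\x21\xF9\x04\x00\x13\x01\x00\x00"
    . "\x3B";

$seconds = gif_pass_duration($gif);
```

Combined with a loop count of 0 from the Netscape block, a nonzero pass duration tells you the ad never stops animating, which is the case the 15-second rule is meant to catch.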
During the summer of 2004, Boing Boing was experimenting with sponsorship ads, partly to cover skyrocketing hosting costs and partly to make money. Underneath those ads was the text: Interested in Sponsoring? Email Us.
By August I was getting annoyed by the array of Suicide Girls ads showing up next to my beloved Boing Boing, so I decided to let them know. This is the email I sent.
The SG ads on boingboing are distracting. I have no problem with SG or porn, but I’d rather they not encroach on my ability to focus on the boingboing content.
Which makes me think about all the other “ads”. Typepad, O’Reilly, Wired. I have no problem with the BB editors hawking their books, but the real ads are boring. They are too static. They are the antithesis of a frequently updated blog like boingboing.
So why not create a means for frequently updated, user-submitted ads. Think of it as adwords meets blogs. People submit a small text ad plus URL, and for a buck (via paypal, credit card?) it sits at the top of the right hand column. Maybe you ensure that it will be displayed for at least X page views, and then the next one will appear on top. Maybe you’ll display five at once and they scroll through as each hits its max page views.
Just a thought.
I remembered writing this (presumably to Mark Frauenfelder) because last month (exactly two years after I sent it) I implemented the ability to sell and serve flat rate text ads in Federated Media’s advertising platform—which manages Boing Boing’s ads.
Here’s the part that really bakes my noodle. Turns out that the Interested in Sponsoring? Email Us text linked to the email address of John Battelle, founder of Federated Media, so he is the one I had actually sent the email to (not Mark).
Which means that two years ago I had an idea, and two years later I implemented it, now working for the person I originally shared the idea with.