If you’ve left your feedreader and visited my homepage in the last few days, you may have been surprised to see that the most recent post in the leftmost column is “15 seconds of fame” from Tuesday, February 13, 2007 11:01am. Yes, 2007.
Of course that’s not actually the most recent post. It should be showing “Caught between PulseAudio and a quiet place” from Wednesday, May 21, 2008 4:39pm (and now this). But what’s altogether weirder is that the neatlinks column is completely up to date.
In my index.php template file, I actually have two loops: the main one, which shows every post except neatlinks, and the neatlinks loop, which only shows posts from the neatlinks category. To exclude neatlinks from the main column, I just add query_posts('cat=-23'); above the loop.
After a little digging into the PHP code behind WordPress, I had it spit out the query it sends to the database to pull out the data for the posts in “The Loop”. What’s crazy (at first) is that rather than generating one big query joining every possible table to wp_posts, WordPress runs preliminary queries based on the query_posts attributes to return a list of eligible or ineligible post IDs. Then it does the final posts select using giant NOT IN() lists in the WHERE clause. The main reason it structures it this way is so that it can use the LIMIT clause for paging.
SELECT SQL_CALC_FOUND_ROWS wp_posts.* FROM wp_posts WHERE 1=1 AND wp_posts.ID NOT IN ('876', '877', '878', '879', '880'...'5456', '5458', '5460', '5461', '5462', '5463') AND wp_posts.post_type = 'post' AND (wp_posts.post_status = 'publish') GROUP BY wp_posts.ID ORDER BY wp_posts.post_date DESC LIMIT 0, 8
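To make the two-step pattern concrete, here is a minimal sketch of it. It is not WordPress code: it uses Python with an in-memory SQLite database instead of PHP and MySQL, and the posts and post_categories tables and the main_column_posts helper are made up for illustration. The only thing borrowed from my setup is the category ID 23.

```python
import sqlite3

# Hypothetical, heavily simplified schema; the real WordPress tables differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, post_date TEXT);
    CREATE TABLE post_categories (post_id INTEGER, cat_id INTEGER);
""")

NEATLINKS_CAT = 23  # category id from the query_posts('cat=-23') call

# Ten posts; every even-numbered one is filed under the neatlinks category.
for i in range(1, 11):
    conn.execute("INSERT INTO posts VALUES (?, ?, ?)",
                 (i, f"post {i}", f"2008-05-{i:02d}"))
    if i % 2 == 0:
        conn.execute("INSERT INTO post_categories VALUES (?, ?)",
                     (i, NEATLINKS_CAT))


def main_column_posts(conn, excluded_cat, offset, per_page):
    """Return one page of post ids, skipping posts in excluded_cat."""
    # Step 1: preliminary query that collects the ineligible post ids.
    excluded = [row[0] for row in conn.execute(
        "SELECT post_id FROM post_categories WHERE cat_id = ?",
        (excluded_cat,))]
    # Step 2: final select with a NOT IN () list, so that LIMIT/OFFSET
    # pages over the eligible posts only.
    placeholders = ", ".join("?" * len(excluded))
    sql = (f"SELECT id FROM posts WHERE id NOT IN ({placeholders}) "
           "ORDER BY post_date DESC LIMIT ? OFFSET ?")
    return [row[0] for row in conn.execute(sql, (*excluded, per_page, offset))]


first_page = main_column_posts(conn, NEATLINKS_CAT, offset=0, per_page=3)
second_page = main_column_posts(conn, NEATLINKS_CAT, offset=3, per_page=3)
```

With only five excluded IDs the NOT IN() list is tiny here; on my blog the same list holds thousands of entries, which is where things go wrong.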
When I ran this giant query directly on the live database, I got back the old blog posts visible on my homepage. But when I ran it on my laptop against a copy of my WordPress data, I got back the correct list of recent posts. When I deleted a few IDs from the thousands in the NOT IN() part of the query and ran that on the live database, I got the correct posts back.
So it appears there’s a devastating bug in MySQL’s handling of long IN() lists that’s since been fixed. Unfortunately, Dreamhost is running MySQL v5.0.24a, which was released in August 2006, almost two years ago! Locally I’m running 5.0.51a from January 2008.
The MySQL docs say that the IN() list is limited only by the max_allowed_packet size, which is 16 megabytes by default, and which this query, as giant as it is, is nowhere near reaching. I skimmed through the release notes between those two versions of MySQL, and the closest thing I could find to a related bug was in the 5.0.41 release notes from May 2007:
For expr IN(value_list), the result could be incorrect if BIGINT UNSIGNED values were used for expr or in the value list. (Bug#19342)
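As an aside on the packet-size point, a rough upper bound is easy to work out. The sketch below is Python and rests on an assumption: it pretends every ID between 876 and 5463 (the endpoints visible in the query above) is in the list, which overstates the real list because some IDs are skipped. The 16 MB limit is simply the figure quoted above.

```python
# Assumption: every id from 876 to 5463 is listed, which overestimates
# the real NOT IN () list because some ids are skipped in it.
ids = range(876, 5464)

# Build the list the same way it appears in the query: '876', '877', ...
in_list = ", ".join(f"'{i}'" for i in ids)
list_bytes = len(in_list.encode("utf-8"))

packet_limit = 16 * 1024 * 1024  # the 16 MB figure quoted in the post
fraction = list_bytes / packet_limit

print(f"{len(ids)} ids, {list_bytes} bytes, {fraction:.4%} of the limit")
```

Even this overestimate comes to a few tens of kilobytes, a tiny fraction of the limit, so packet size is not a plausible culprit.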
So what did I immediately go do?
alter table wp_posts modify ID int unsigned not null auto_increment;
Once again I’m caught between a rock and a hard place. On one hand, I have no idea how likely Dreamhost is to upgrade MySQL on their database servers (though I sent them a very detailed support request), and on the other hand, my blog is broken, and that makes me very grumpy. I generally like the laid-back Dreamhost culture, but it might be time to pay for something a little more dedicated…