Meta Archives, page 6

Not only do I blog about blogging, I also blog about this very blog! So meta!

Blogger’s block

I want to write, but I don’t know what to write about. I feel like I’ve kind of painted myself into a corner where I only really blog about two different things. Hiking trips I’ve taken and technical howtos. (I’m not counting neatlinks, that’s social bookmarking, not blogging.)

The technical howtos are easy, they practically write themselves, and I actually find them really satisfying to write because they’re so damn simple. That and they tend to be the posts I refer back to again and again. But I wouldn’t want my blog to consist of only those. Thank goodness their source of inspiration is sporadic.

The hiking posts are labors of love, heavy emphasis on labor. Writing about a hike is essentially writing about walking. Sometimes it’s hard to make interesting. And I find that the effort involved to select, resize, and retouch a bunch of photos and then to write about the experience requires totally different parts of my brain—thus two or more days to execute. Reading my blog you might think that my life consists of hopping from National Park to National Park. (That actually sounds like kind of a nice life.)

It’s just that I feel like I’ve really gotten away from writing about the impromptu, the daily life stuff, which is half the fun of blogging. I also feel like I’ve gotten away from taking risks, either in function or form. But for crying out loud, I’ve been blogging here non-stop for over 6 years. It’s kind of hard to stay original for that long.

So yeah, I really hate blogging about blogging, or vomiting a bunch of stream-of-consciousness stuff out on my blog (especially when I have nothing else concrete to write). I guess I’m just trying to get this out of my system. Like maybe admitting I have a problem is the first step to recovery. So there. I’ve said it. I don’t know what to write about. I’m occasionally bored of my blog. And it’s autumn. I blame autumn. Fall is a depressing season. It always bums me out.

Broken blog = man obsessed

Databases are supposed to be these monolithic paragons of correctness and uniformity. So it’s a little unsettling to discover a database that returns grossly incorrect results. At least I wouldn’t expect WordPress to put version 5 of MySQL through its paces.

Here’s the scenario. Dreamhost, my webhost, is running MySQL Server v5.0.24a, released in August 2006. On my laptop I’m running MySQL Server v5.0.51a, released in January 2008. The difference between the two is a year and a half of bugfixes and performance improvements.

On each I’ve got a copy of the same database table, wp_posts, from my WordPress 2.5.1 install for Justinsomnia. And on each system I’ve got a copy of the giant SQL select statement generated by WordPress that returns the first 8 blog posts that should appear on my homepage. The unusual thing about this particular query is that it contains an insanely long NOT IN () list to filter out my neatlinks from the main column.

SELECT SQL_CALC_FOUND_ROWS  wp_posts.* 
FROM wp_posts  
WHERE 1=1  
AND wp_posts.ID NOT IN ('876', '877', '878', '879', '880'...'5456', '5458', '5460', '5461', '5462', '5463') 
AND wp_posts.post_type = 'post' 
AND (wp_posts.post_status = 'publish') 
GROUP BY wp_posts.ID 
ORDER BY wp_posts.post_date DESC 
LIMIT 0, 8

The problem is that running the same SQL statement on each server returns different results—and I wanted to figure out why.

So I prefixed the query with EXPLAIN, to see what each version of MySQL would tell me about how they were dealing with the same query. I expected the results to be the same. They turned out to be quite different.

MySQL 5.0.24a:
+----+-------------+----------+-------+--------------------------------------+---------+---------+------+------+----------------------------------------------+
| id | select_type | table    | type  | possible_keys                        | key     | key_len | ref  | rows | Extra                                        |
+----+-------------+----------+-------+--------------------------------------+---------+---------+------+------+----------------------------------------------+
|  1 | SIMPLE      | wp_posts | range | PRIMARY,post_status,type_status_date | PRIMARY | 4       | NULL |  611 | Using where; Using temporary; Using filesort |
+----+-------------+----------+-------+--------------------------------------+---------+---------+------+------+----------------------------------------------+

MySQL 5.0.51a:
+----+-------------+----------+------+--------------------------------------+------------------+---------+-------------+------+-------------+
| id | select_type | table    | type | possible_keys                        | key              | key_len | ref         | rows | Extra       |
+----+-------------+----------+------+--------------------------------------+------------------+---------+-------------+------+-------------+
|  1 | SIMPLE      | wp_posts | ref  | PRIMARY,post_status,type_status_date | type_status_date | 25      | const,const | 5350 | Using where | 
+----+-------------+----------+------+--------------------------------------+------------------+---------+-------------+------+-------------+

As you scroll to the right, you can see that v5.0.24a chose the PRIMARY key to find the most recent posts, while v5.0.51a opted for type_status_date. I’m not quite sure what that means, except that v5.0.24a obviously made the wrong choice. Which gave me an idea for an interim solution (while I wait for Dreamhost to upgrade MySQL): index hints.

If I could force MySQL to ignore the PRIMARY key, maybe it would “fix” my blog. So I added IGNORE INDEX (PRIMARY) to the query that selects posts, and voila, my blog homepage started working again! Here’s the diff of the code change I made. Luckily I have WordPress “installed” via Subversion, so any changes I make to core files get carried over with each upgrade.

Index: wp-includes/query.php
===================================================================
--- wp-includes/query.php	(revision 7987)
+++ wp-includes/query.php	(working copy)
@@ -1424,7 +1424,7 @@
 		if ( !empty($limits) )
 			$found_rows = 'SQL_CALC_FOUND_ROWS';
 
-		$request = " SELECT $found_rows $distinct $fields FROM $wpdb->posts $join WHERE 1=1 $where $groupby $orderby $limits";
+		$request = " SELECT $found_rows $distinct $fields FROM $wpdb->posts IGNORE INDEX (PRIMARY) $join WHERE 1=1 $where $groupby $orderby $limits";
 		$this->request = apply_filters('posts_request', $request);
 
 		$this->posts = $wpdb->get_results($this->request);

Hopefully I haven’t inadvertently broken some other aspect of my blog. If anyone sees anything out of order, please leave a comment, mmmk?

Update: for Nate, here’s the EXPLAIN on 5.0.24a with the IGNORE INDEX:

+----+-------------+----------+------+------------------------------+------------------+---------+-------------+------+----------------------------------------------+
| id | select_type | table    | type | possible_keys                | key              | key_len | ref         | rows | Extra                                        |
+----+-------------+----------+------+------------------------------+------------------+---------+-------------+------+----------------------------------------------+
|  1 | SIMPLE      | wp_posts | ref  | post_status,type_status_date | type_status_date | 25      | const,const | 5351 | Using where; Using temporary; Using filesort |
+----+-------------+----------+------+------------------------------+------------------+---------+-------------+------+----------------------------------------------+

Yes I know my blog is busted

For anyone who’s left their feedreader and visited my homepage in the last few days, you may have been surprised to see that the most recent post in the leftmost column is 15 seconds of fame from Tuesday, February 13, 2007 11:01am. Yes, 2007.

Justinsomnia screenshot busted

Of course that’s not actually the most recent post—it should be showing Caught between PulseAudio and a quiet place from Wednesday, May 21, 2008 4:39pm, (and now this). But what’s altogether weirder is that the neatlinks column is completely up to date.

In my index.php template file, I actually have two loops, the main one which shows all posts without neatlinks, and the neatlinks loop which only shows posts from the neatlinks category. To exclude neatlinks from the main column, I just add query_posts('cat=-23'); above the loop.

After a little digging into the PHP code behind WordPress, I had it spit out the query it sends to the database to pull out the data for the posts in the “The Loop”. What’s crazy (at first) is that rather than generating one big query joining every possible table to wp_posts, WordPress runs preliminary queries based on the query_posts attributes, to return a list of eligible or ineligible post IDs. Then it does the final posts select using giant IN() or NOT IN() lists in the where clause. The main reason it structures it this way is so that it can use the LIMIT clause for paging.

SELECT SQL_CALC_FOUND_ROWS  wp_posts.* 
FROM wp_posts  
WHERE 1=1  
AND wp_posts.ID NOT IN ('876', '877', '878', '879', '880'...'5456', '5458', '5460', '5461', '5462', '5463') 
AND wp_posts.post_type = 'post' 
AND (wp_posts.post_status = 'publish') 
GROUP BY wp_posts.ID 
ORDER BY wp_posts.post_date DESC 
LIMIT 0, 8

When I ran this giant query directly on the database, I got back the old blog posts visible on my homepage. But when I ran it on my laptop with a copy of my WordPress data, I got back the correct list of recent posts. When I deleted a few IDs from the thousands in the NOT IN() part of the query, and ran that on the database, I got the correct posts back.

So it appears there’s a devastating bug in MySQL’s handling of long IN() lists that’s since been fixed. Unfortunately Dreamhost is running MySQL v5.0.24a, which was released in August 2006, almost two years ago! Locally I’m running 5.0.51a from January 2008.

The MySQL docs say that the IN() list is limited only by the max_allowed_packet size, which is 16 megabytes by default, and which this query, as giant as it is, is nowhere near reaching. I skimmed through the release notes between those two versions of MySQL, and the closest thing I could find to a related bug was in the 5.0.41 release notes from May 2007:

For expr IN(value_list), the result could be incorrect if BIGINT UNSIGNED values were used for expr or in the value list. (Bug#19342)

So what did I immediately go do?

alter table wp_posts modify ID int unsigned not null auto_increment;

No difference.

Once again I’m caught between a rock and a hard place. On one hand, I have no idea what’s the likelihood of Dreamhost upgrading MySQL on their database servers (though I sent them a very detailed support request), and on the other hand, my blog is broken, and that makes me very grumpy. I generally like the laid-back Dreamhost culture, but it might be time to pay for something a little more dedicated…

Update: Broken blog = man obsessed

From the Belly of the Beasts, a eulogy

When I worked at O’Reilly Media, the Online Publishing Group (OPG) had this external-facing blog they called “From the Belly of the Beasts”, a reference to O’Reilly’s famous animal-branded technical books. The blog lived at blogs.oreillynet.com/beasts/.

Screenshot of OPG's 'From the Belly of the Beasts' blog

It was started in October 2004 by Terrie Miller, the group’s manager. I arrived in May 2005, hired partly to assist with O’Reilly’s blog and feed projects. I loved that our group had a blog of its own, so I started posting to it within days of my arrival.

In order to justify blogging at work (to myself), I treated it as a knowledge sharing tool. I’d usually post useful things I found online and then pass the URL around to my co-workers. Frequently I’d refer back to an old post to jog my memory (or someone else’s). It was also a place to be a little silly, to lighten the mood. I posted there frequently until I left O’Reilly in June 2006. After my last post only two more posts appeared, one by Mark and one by Terrie, and from that point on the blog has lain dormant. Update: Terrie left O’Reilly in March 2008.

I’ve always been a little surprised, and of course happy that those old posts still exist out there on the web. They come from a special time in my life, my first year in California, my first job outside of the university. I definitely felt I was leaving behind something dear when I left O’Reilly. I expected the blog to just cease to exist one day, presumably after someone unplugged some dusty server in the corner that happened to be blogs.oreillynet.com.

Anyway yesterday I heard from Mark that the Beasts blog is indeed getting “unplugged”, and he wanted to know whether I wanted to save anything beforehand. Yikes, yes! So I’m going to recreate each of the posts I wrote there on Justinsomnia, categorize them in such a way that they can be pulled out together, and postdate them appropriately. I’m actually really excited about this, to me it’s like getting back a piece of myself for my archives.

QR Code

QR Code for http://justinsomnia.org/