Category Archives: writing

It’s an apostrophe

Even the smartest folks I know can’t get apostrophes right. (Please don’t read all my blog posts and find the mistakes!) Let me see if I can help.

  • “It’s” is equivalent to “it is”. If you write “it’s” in a sentence, check it makes sense if you replace it with “it is”. If yes, good. If no, you probably meant “its”
  • “Its” is a possessive. “The dog looked at its tail”. As in, the tail attached to the dog was stared at by the aforementioned canine
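
The substitution check in the first bullet is mechanical enough to sketch in code. Here’s a minimal Python sketch, assuming all you want is to see each sentence with “it’s” expanded so you can read it back. The name `expand_its` is made up for illustration, and the sketch doesn’t try to handle “it’s” used for “it has”.

```python
import re

# Matches "it's" / "It's" written with either a straight or a curly apostrophe.
ITS_PATTERN = re.compile(r"\b([Ii])t['’]s\b")


def expand_its(sentence: str) -> str:
    """Rewrite every "it's" as "it is" so a human can read the result back."""
    # Keep the original capitalisation of the leading "i".
    return ITS_PATTERN.sub(lambda m: f"{m.group(1)}t is", sentence)


for sentence in [
    "It’s almost impossible to compare the slices.",
    "The dog looked at it’s tail.",
]:
    print(expand_its(sentence))
```

If the expanded sentence still reads fine, the apostrophe belongs there. If it reads wrong, as the second one does, you probably meant “its”.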

Get those right, and you’re in the top 98% of apostrophe users.

Don’t write “In the 1980’s, rock music was…”. You mean “In the 1980s, …”. As in, the plural: the ten years that constitute the decade that began in 1980. These are also correct: “He collected LPs” or “She installed LEDs instead of incandescent globes”. You’ll find some people argue about these: for example, some folks write “mind your P’s and Q’s”, and argue correctness. I personally think it’s wrong: there are many Ps and Qs, so it should be “mind your Ps and Qs”.

Watch out for possession of non-plurals that end in “s”. “Hugh William’s blogs are annoying” and “Hugh Williams’ blogs are annoying” are both wrong. “Hugh Williams’s blogs are annoying” is right (in more ways than one?).

One trick I use is this: if you say the “s”, add the “s”. Hugh Williams’s blog. Ross’s dad. The boss’s desk. If you don’t say it, don’t add it. His Achilles’ heel. Those genres’ meaning.

Have a fun week!

Don’t use a Pie Chart

I don’t like pie charts. Why? It’s almost impossible to compare the relative size of the slices. Worse still, it is actually impossible to compare the slices between two pie charts.

Pie charts in action.

Take the example above and look at Pie Chart A. The blue slice looks the same size as the red or green slice. You might draw the conclusion they’re roughly the same. In fact, the histogram below shows that red is 17 units, blue is 18 units, and green is 20 units. The histogram is informative, useful for comparison, and clear for communication. (There’s still a cardinal sin here though: no labels on the axes; I’ll rant about that some other day.)

Compare pie charts B and C above. It sure looks like there’s the same quantity of blue in B and C, and about the same amount of green. The quantities aren’t the same: the histograms below show that green is 19 and 20 respectively, and blue is 20 and 22.

You might argue that pie charts are useful for comparing relative quantities. I’d argue it’s possible but difficult to interpret. Take a look at the three yellow slices in charts A, B, and C. They look to be similar percentages of the pies, but it’s hard to tell: they are oriented slightly differently, making the comparison unclear. What’s the alternative? Use a histogram with the y-axis representing the percentage.
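
To make the percentage idea concrete, here’s a rough Python sketch that turns raw values into percentage shares and prints one labeled text bar per category. The function name `percentage_bars` and the bar width are assumptions for illustration. The data is only the red, blue, and green values quoted for Pie Chart A. The yellow value isn’t stated, so it is left out, and the shares are computed over those three values alone.

```python
def percentage_bars(values: dict[str, float], width: int = 40) -> list[str]:
    """Return one text bar per category, scaled so a full-width bar is 100%."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("values must sum to a positive number")
    lines = []
    for label, value in values.items():
        share = 100 * value / total
        bar = "#" * round(width * value / total)
        # Label on the left, bar in the middle, exact percentage on the right.
        lines.append(f"{label:<6} {bar:<{width}} {share:5.1f}%")
    return lines


# Values quoted for Pie Chart A; yellow is omitted because its value isn't given.
chart_a = {"red": 17, "blue": 18, "green": 20}

print("share of total (%)")  # say what the bars measure
for line in percentage_bars(chart_a):
    print(line)
```

Because every bar starts at the same baseline and carries its own number, small differences that vanish in a pie slice are easy to read off.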

Pie charts get even uglier when there are lots of slices, or some of the slices are ridiculously small. If you’re going to use them (I still don’t think it’s a good idea), then use them only when there are a few values and the smallest slices can be labeled. The pie chart below is ridiculous.

A pie chart that’s nearly impossible to read. I’m not sure what conclusions you could draw from this one.

Why do I care? I’m in the business of communicating information, and I spend much of my time reviewing information that people share with me. Clarity is important — decisions are made on data. See you next time.

Reflecting on my hash table post

My blog post on hash tables was shared 400 times, had 12,000 views in one day, and 28,000 views in the month it was published. There were long threads on reddit and Hacker News.

The sentiment in those threads is more negative than positive: folks point out that the article was obvious, pick apart one or more points, or state that I don’t know what I’m doing. There are valid points. There are incorrect points too.

This blog went relatively crazy last October. The cause was my post on hash tables.

When I speak to people, the sentiment is mostly positive. They thank me for explaining something they’ve long forgotten, and for asking them to think before they use a library function.

I enjoy writing simple, accessible posts: my first book sold over 100,000 copies because I wrote plainly in an accessible way. That’s always how I’ve taught too: by explaining complex concepts through analogy or in simple terms. I’m just not the guy to explain concepts using math or to offer the shortest, densest, most information-rich explanations.

I agree that my hash table post paints a simplified picture for the average reader. Yes, that makes it problematic at the edges. For example, you’d want to be careful of a hash table’s worst-case complexity in a mission-critical or cryptography application. Yes, there are hash functions that distribute string keys better; this one is reasonable, and I said it was fast. There are other valid points too.
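
The worst-case point can be made concrete with a small experiment. This is a rough Python sketch, not the hash function from the hash table post: `longest_chain`, `first_char_hash`, and `polynomial_hash` are names invented here, and the keys are made up. It counts how many keys land in the fullest bucket of a chained hash table. The longer that chain, the closer a lookup gets to a linear scan.

```python
def longest_chain(keys, hash_fn, num_buckets=64):
    """Bucket the keys with hash_fn and return the size of the fullest bucket."""
    buckets = [0] * num_buckets
    for key in keys:
        buckets[hash_fn(key) % num_buckets] += 1
    return max(buckets)


def first_char_hash(key: str) -> int:
    # Deliberately weak: every key starting with the same character collides.
    return ord(key[0])


def polynomial_hash(key: str) -> int:
    # A simple polynomial string hash (multiplier 31), used only as a contrast.
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) & 0xFFFFFFFF
    return h


# Hypothetical keys that all share a prefix, which is common in real data.
keys = [f"user{i}" for i in range(1000)]

print("weak hash, longest chain:", longest_chain(keys, first_char_hash))
print("polynomial hash, longest chain:", longest_chain(keys, polynomial_hash))
```

With the weak function every key shares one bucket, so a lookup may have to walk all 1,000 entries. The polynomial function spreads the same keys across many buckets and keeps the chains much shorter.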

Here’s where I was coming from: Are most developers using a library hash function without thinking about it? I think so. Will reading my article help them think and make better choices? I hope so. That’s the intended audience. No apologies from me — I’ll stir up more trouble soon with a post on how trees work.

See you next week.

Six months, fifty thousand visits

Thanks for traveling on my blog journey. It’s been fun to have you along. It’s great to be writing again.

I’ve had around 50,000 views of my 31 posts so far. Here are a few other factoids from the journey thus far:

  • The most popular post was about eBay’s size and scale, which I published on June 26. It’s had around 5,000 views, and made it into WordPress’s “Freshly Pressed” section. It’s also received the most comments and likes, and contributed to the blog’s busiest day, June 27
  • The least popular post was about my keynote at the PHP UK 2012 conference. That had 176 views
  • This story about Bing’s image search is one of my favorite posts, and my biggest surprise with only around 400 views
  • Referrals to the blog come from a few popular sources: Twitter (3500 referrals), Google (2200), Facebook (2300), LinkedIn (1400), and WordPress (1200)
  • Referrals to the blog don’t come from Bing (58 referrals), Ask (13), or Yahoo! Search (8)
  • I’ve had exactly 25,000 visits from the United States (where I live), 2600 from the UK, and 2400 from Australia (where I’m originally from)
  • When people search on Google (and subsequently land on a blog page), the most popular queries they type are: ebay.com, five variations of my name, query rewrite, and byte versus bit inverted index
  • There are a few sites out there that occasionally highlight my posts, which I appreciate very much. The awesome highscalability.com has driven 900 views, Jason Haley’s great “interesting finds” blog has driven 58, and Y Combinator’s Hacker News about 80
  • If gossip in the corridors were a measure, this post about fitness and nutrition caused the biggest stir and got people talking the most
  • Ardent Logophile has offered the most comments on the blog, and thoughtful ones at that. Thanks, Ardent!
  • A bizarre factoid: someone translated this post of mine on successful teams into Japanese, and added cool pictures (including Star Wars stormtroopers having a meeting)
  • This is far from the most popular site I own. That honor goes to the (no longer maintained) webdatabasebook.com that I built as a companion to my first book

The open question is what I should write about next. More on search engines? Management? Fitness and nutrition? eBay? Something else? Your thoughts?

Have a great week.

You, me, and the comma

Writing requires precision. You need to be clear.

There are five “Elementary Rules of Usage” that relate to commas in the legendary Strunk and White. I talk about one in this post; I’ll come back to the others in future posts. I’ve reproduced the relevant page below.

Strunk and White’s Elementary Rules of Usage. Using commas in lists is discussed at the top of the page.

Suppose you want to write a list of three or more things. Put a comma before the last item in the list. Here are some examples:

  • The American and Australian flags are red, white, and blue.
  • The choices of shirts are red, blue, black, and white.
  • The city was bursting with cars, trains, automobiles, trucks, bicycles, and motorcycles.

The key point is that there’s a comma before the last item in the list.

Why’s this important? If you omit the comma, there’s ambiguity. Take the second example: “The choices of shirts are red, blue, black, and white”. It’s clear that there are four choices of shirt colors. If you omit the final comma, we’d have the following: “The choices of shirts are red, blue, black and white”. Are there three choices? Is the last choice a black and white shirt? Or are there four choices? If there were indeed three choices, this would be correctly written as “The choices of shirts are red, blue, and black and white”.
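
If you generate lists like these in code, the same rule is easy to apply. Here’s a minimal Python sketch, assuming plain English lists joined with “and”. The function name `join_with_serial_comma` is made up for illustration.

```python
def join_with_serial_comma(items: list[str], conjunction: str = "and") -> str:
    """Join items into an English list, with a comma before the final conjunction."""
    if len(items) <= 1:
        return "".join(items)
    if len(items) == 2:
        # Two items need no comma at all: "red and blue".
        return f"{items[0]} {conjunction} {items[1]}"
    # Three or more: comma after every item, including the second-to-last.
    return f"{', '.join(items[:-1])}, {conjunction} {items[-1]}"


# Four shirt choices.
print(join_with_serial_comma(["red", "blue", "black", "white"]))
# Three shirt choices, the last being a black and white shirt.
print(join_with_serial_comma(["red", "blue", "black and white"]))
```

The two calls print “red, blue, black, and white” and “red, blue, and black and white”, so the number of choices is clear in each case.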

There is one exception to this rule. That’s when it’s a list of people in a company name. In that case, it’s obvious there’s no ambiguity. Here are some examples:

  • Togut, Segal & Segal LLP
  • Amper, Politziner & Mattia
  • Berry, Dunn, McNeil & Parker

Grab a copy of Strunk and White (my latest copy is this beautifully illustrated, hardcover edition). Read the first ten pages (and then decide whether to read the rest, or pop it on your bookshelf and get street cred from your colleagues).