Category Archives: stories

Writing a Book

One day I’ll write another book. Perhaps a sports book about people and their stories, or the story of search engines and the people that build them.

I wrote my first book in 2001 with David Lane, and we rewrote it in 2003 for the second edition. I wrote another book with Saied Tahaghoghi in 2004 – the truth is I started it, and he picked up the pieces when I changed careers and countries; he’s a good man. The first book sold over 100,000 copies over the two editions (I still get a royalty check quarterly) and the second modestly (Saied and I earned our advance back). They’re both dated, old books now.

Web Database Applications with PHP and MySQL. My first book in its second English edition.

Web Database Applications with PHP and MySQL. My first book in its second English edition.

It’s thousands of hours of work to write a book. I spent at least 20 hours per week for 18 months on the first edition of the first book – that’s 1500 hours at least. I got out of bed at 5:30am and I did a few hours of writing before work. I’d also squeeze in a little more after work (typically some proof reading), and a longer period of writing on the weekend (where I’d still get out of bed at 5:30am).

Did I get rich? No. Typically, the authors get less than 10% of the wholesale book price — a couple of dollars per book sold at most. I got more than the minimum wage for the first book (roughly dividing the royalties by the number of hours by the number of authors to get an answer). The second book didn’t pay its way.

The longer I worked in a single sitting, the more productive I was. It takes a certain fixed amount of startup time to begin writing – you reread what you’ve written, edit it a little, get the context back, think about the structure of what you want to say next, and then start writing anew. But I can’t write for an extended period – it’s tiring, and I need to stop and take time away to think about what I want to do next. Three or four hour stints are the most productive for me.

When I wrote the first book, I’d count how many words I wrote in a session, and use that as a measure of success. I’d decide that I was going to write 1000 words before I took a break. It turns out, that doesn’t work for me: I’ve learnt that what’s important is sustained output, averaged over a month or so. Some days, I’ve got writer’s block. But I’ve learnt that that’s when I am doing valuable thinking – I’m working through a larger problem, or thinking through structure, or solving something that’s been bugging me for a while; sometimes, this is a subconscious activity. Other days, I’m a machine: I write as fast as I can type, and thousands of words flow. A whole chapter has been known to flow after a writer’s block.

My second book, Learning MySQL in its one-and-only edition.

My second book, Learning MySQL in its one-and-only edition.

Writing slows down as the book takes shape. I’ll be in the middle of a new section, and I’ll want to reference something else I wrote using something like “as you learned in Chapter <x>, the <something>”. Then I have to figure out what chapter it was, and what exactly was that <something> – that takes time. And, as the book gets longer, you repeat yourself – at least, my memory isn’t amazing enough to make sure I only say the same thing once. I’ll find myself waxing lyrical about some great idea, only to discover that it is somewhere else in the book too. Then it’s a case of figuring out where it should be – which is going to lead to editing a was-finished chapter elsewhere in the book or rethinking what I’m writing today.

I worked with a major publisher, O’Reilly Media Inc., and the wonderful editorial skills of Andy Oram helped me on both books. Editors are awesome – they push, prod, ask questions, push for clarity, and say things that make your book better. But they create a ton of work – you’ll get feedback such as “Perhaps you could merge those two chapters?” or “Chapter 4 needs to be broken into two chapters, and you need to really go into much more depth”. That often happens after you think you’re finished. You’ll get asked for extra chapters, rewrites, more or less content, and complete changes in style. It makes your book better. Between the first edition and second edition of my first book, Andy helped me change my style from a formal computer science style to a more conversational, chatty style – the kind that I use in this blog.

Writing a book isn’t a social experience. You need to enjoy being alone with your thoughts. Prepare for one hour of inspiration and conversation to turn into a hundred hours of hard labor and iteration. If it takes a couple of thousand hours to write a book, ten of them are the inspirational ones where you create the fundamental ideas. I wrote much of the second book in my parent’s Winnebago – parked on the grass, a good distance from the house, deliberately without wifi, and with nothing inside to distract me.

So, why do it? I enjoy writing – there’s rewarding impact in sharing knowledge with thousands of people. If you’re lucky, someone’s success or their impact will be because of what you shared. Or maybe you change how someone thinks or sees the world. Or you make their life better. It’s also cool to see your name on the cover, and to feel it in your hands – it’s even cooler when it’s translated into a language you don’t understand. I wonder what joy there is in knowing people are reading a digital copy? The digital sales on the royalty statement have never quite inspired me the same way.

Have fun. See you soon.

Changing platforms at Scale: Lessons from eBay

At eBay, we’ve been on a journey to modernize our platforms, and rebuild old, crufty systems as modern, flexible ones.

In 2011, Sri Shivananda, Mark Carges, and I decided to modernize our front-end development stack. We were building our web applications in v4, an entirely home-grown framework. It wasn’t intuitive to new engineers who joined eBay, they were familiar with industry standards we didn’t use, and we’d also built very tightly coupled balls of spaghetti code over many years. (That’s not a criticism of our engineers – every system gets crufty and unwieldy eventually; software has an effective timespan and then it needs to be replaced.)

Sri Shivananda, eBay Marketplaces' VP of Platform and Intrastructure

Sri Shivananda, eBay Marketplaces’ VP of Platform and Intrastructure

We set design goals for our new Raptor framework, including that we wanted to do a better job separating presentation from business logic. We also wanted better tools for engineers, faster code build times, better monitoring and alerting when problems occur, the ability to test changes without restarting our web servers, and a framework that was intuitive to engineers who joined from other companies. It was an ambitious project, and one that Sri’s lead as a successful revolution in the Marketplaces business. We now build software much faster than ever before, and we’ve rewritten major parts of the front-end systems. (And we’ve open sourced part of the framework.)

That’s the context, but what this post is really about is how you execute a change in platforms in a large company with complex software systems.

The “Steel Thread” Model

Mark Carges, eBay's Chief Technical Officer

Mark Carges, eBay’s Chief Technical Officer

Our CTO, Mark Carges, advocates building a “steel thread” use case when we rethink platforms. What he means is that when you build a new platform, build it at the same time as a single use case on top of the platform. That is, build a system with the platform, like a steel thread running end-to-end through everything we do.

A good platform team thinks broadly about all systems that’ll be built on the platform, and designs for the present and the future. The risk is they’ll build the whole thing – including features that no one ultimately needs for use cases that are three years away. Things change fast in this world. Large platform projects can go down very deep holes, and sometimes never come out.

The wisdom of the “steel thread” model is that the platform team still does the thinking, but it’s pushed by an application team to only fully design and build that parts that are immediately needed. The tension forces prioritization, tradeoffs, and a pragmatism in the platform team. Once you’re done with the first use case, you can move onto subsequent ones and build more of the platform.

Rebuilding the Search Results Page

Our first steel thread use case on Raptor was the eBay Marketplaces Search Results Page (the SRP). We picked this use case because it was hard: it’s our second-most trafficked page, and one of our most complex; building the SRP on Raptor would exercise the new platform extensively.

We co-located our new Raptor platform team – which was a small team by design – together with one of our most mission critical teams, the search frontend team. We declared that their success was mutually dependent: we’re not celebrating until the SRP is built on Raptor.

We asked the team to rebuild the SRP together. We asked for an aggressive timeline. We set bold goals. But there was one twist: build the same functionality and look-and-feel as the existing SRP. That is, we asked the team to only change one variable: change the platform. We asked them not to change an important second variable: the functionality of the site.

This turned out to be important. The end result – after much hard work – was a shiny new SRP code base:  modular, cleaner, simpler, and built on a modern platform.  But it looked and behaved the same as the old one. This allowed us to test one thing: is it equivalent for our customers to the old one?

Testing the new Search Results Page

We ran a few weeks of A/B tests, where we showed different customer populations the old and new search results page. Remember, they’re pretty much the same SRPs from the customers’ perspective. What we were looking for were subtle problems: was the new experience slower for some scenarios than the old one? Did it break in certain browsers on some platforms? Was it as reliable? Could we operate it as effectively? We could compare the populations and spot the differences reasonably easily.

This was a substantial change in our platforms and systems, and the answer wasn’t always perfect. We took the new SRP out of service a few times, fixed bugs, and put it back in. Ultimately, we deemed it a fine replacement in North America, and turned it on for all our customers in North America. The next few months saw us repeat the process across our other major markets (where there are subtle differences between our SRPs).

What’s important is that we didn’t change the look and feel or functionality at first: if we’d done that, we may not have seen several of the small problems we did see as fast as we saw them.

Keeping the old application

Another wise choice was we didn’t follow that old adage of “out with the old, and in with the new”. We kept the old SRP around running in our data centers for a few months, even though it wasn’t in service.

This gave us a fallback plan: when you make major changes, it’s never going to be entirely plain sailing. We knew that the new SRP would have problems, and that we’d want to take it out of service. When we did, we could put the old one back in service while we fixed the problem.

Eventually, we reached the confidence with our new SRP that we didn’t need the old one. And so it was retired, and the hardware put to other uses. That was over a year ago – it has been smooth sailing since.

The curse of dual-development

You might ask why we set bold goals and pushed the teams hard to build the new Raptor platform and the SRP. We like to do that at eBay, but there’s also a pragmatic reason: while there’s two SRP code bases, there’s twice the engineering going on.

Imagine that we’ve got a new idea for an improvement to the SRP. While we’re building the new SRP, the team has to add that idea to the new code base.  The team also has to get the idea into the old code base too – both so we can get it out to our customers, and so that we can carry out that careful testing I described earlier.

To prevent dual development slowing down our project, we declared a moratorium on features in the SRP for a couple of months. This was tough on the broader team – lots of folks want features from the search team, and we delayed all requests. The benefit was we could move much faster in building the new SRP, and getting it out to customers. Of course, a moratorium can’t go on for too long.

And then we changed the page

After we were done with the rollout, the SRP application team could move with speed on modernizing the functionality and look-and-feel of the search results page.

Ultimately, this became an important part of eBay 2.0, a  refresh of the site that we launched in 2012. And they’re now set up to move faster whenever they need to: we are testing more new ideas that improve the customer experience than we’ve been able to before, and that’s key to the continued technology-driven revolution at eBay.

See you next week.

The size, scale, and numbers of

I work in the Marketplaces business at eBay. That’s the part of the company that builds,,,, and most of the other worldwide marketplaces under the eBay brand. (The other major parts of eBay Inc are PayPal, GSI Commerce, x.commerce, and StubHub.)

I am lucky to have opportunities to speak publicly about eBay, and about the technology we’re building. It’s an exciting time to give a talk – we are in the middle of rewriting our search engine, we’ve improved search substantially, we’re automating our data centers, we’re retooling our user experience development stack, and much more.

At the beginning of most talks, I get the chance to share a few facts about our scale and size. I thought I’d share some with you:

  • We have over 10 petabytes of data stored in our Hadoop and Teradata clusters. Hadoop is primarily used by engineers who use data to build products, and Teradata is primarily used by our finance team to understand our business
  • We have over 300 million items for sale, and over a billion accessible at any time (including, for example, items that are no longer for sale but that are used by customers for price research)
  • We process around 250 million user queries per day (which become many billions of queries behind the scenes – query rewriting implies many calls to search to provide results for a single user query, and many other parts of our system use search for various reasons)
  • We serve over 2 billion pages to customers every day
  • We have over 100 million active users
  • We sold over US$68 billion in merchandize in 2011
  • We make over 75 billion database calls each day (our database tables are denormalized because doing relational joins at our scale is often too slow – and so we precompute and store the results, leading to many more queries that take much less time each)

They’re some pretty large numbers, ones that make our engineering challenges exciting and rewarding to solve.

Any surprises for you?

The race to build better search: a Reuters article

I spent a couple of hours with Alistair Barr from Reuters discussing search at eBay, and our Project Cassini rewrite of eBay’s search platform. Alistair published the story yesterday, and it’s a good, short read. Thanks Alastair for sharing the story with your readers.

Alistair discusses Walmart’s search rewrite (which didn’t take too long by the sounds of it — my recent blog post suggests why), quotes responses from Google’s PR team, and shares insights from my good friend Oren Etzioni who works at both the University of Washington and the rather awesome shopping decision engine, He mentions Google’s features that obviously do a little light image processing to match color and shape intents in queries such as “red dress” or “v-neck dress” against the content in the images.

Google does do lots of things well, but they’re often not the first to do them — we built that color search into Bing’s image search in 2008 (try this “red dress” query). On a related note, eBay has a rather cool image search feature, which we really should make more prominent in our search experience (mental note: must work on that). Try this “red dress” query, and you’ll see results that use visual image features to find related items.

I’ll be back with part #3 of my Ranking at eBay series soon.

Ideas and Invention (and the story of Bing’s Image Search)

I was recently told that I am an ideas guy. Probably the best compliment I’ve received. It got me thinking, and I thought I’d share a story.

With two remarkable people, Nick Craswell and Julie Farago, I invented infinite scroll in MSN Search’s image search in 2005. Use Bing’s image search, you’ll see that there’s only one page of results – you can scroll and more images are loaded, unlike web search where you have to click on a pagination control to get to page two. Google released infinite scroll a couple of years ago in their image search, and Facebook, Twitter, and others use a similar infinite approach. Image search at Bing was the first to do this.

MSN Search's original image search with infinite scroll

How’d this idea come about? Most good ideas are small, obvious increments based on studying data, not lightning-bolt moments that are abstracted from the current reality. In this case, we began by studying data from image search engines, back when all image search engines had a pagination control and roughly twenty images per page.

Over a pizza lunch, Nick, Julie, and I spent time digging in user sessions from users who’d used web and image search. We learnt a couple of things in a few hours. At least, I recall a couple of things – somewhere along the way we invented a thumbnail slider, a cool hover-over feature, and a few other things. But I don’t think that was over pizza.

Back to the story. The first thing we learnt was that users paginate in image search.  A lot. In web search, you’ll typically see that for around 75% of queries, users stay on page one of the results; they don’t like pagination. In image search, it’s the opposite: 43% of queries stay on page 1 and it takes until page 8 to hit the 75% threshold.

Second, we learnt that users inspect large numbers of images before they click on a result. Nick remembers finding a session where a user went to page twenty-something before clicking on an image of a chocolate cake. That was a pretty wow moment – you don’t see that patience in web search (though, as it turns out, we do see it at eBay).

If you were there, having pizza with us, perhaps you would have invented infinite scroll. It’s an obvious step forward when you know that users are suffering through clicking on pagination for the bulk of their queries, and that they want to consume many images before they click. Well, perhaps the simplest invention would have been more than 20 images per page (say, 100 images per page) – but it’s a logical small leap from there to “infinity”. (While we called it “infinite scroll”, the limit was 1,000 images before you hit the bottom.) It was later publicly known as “smart scroll”.

To get from the inspiration to the implementation, we went through many incarnations of scroll bars, and ways to help users understand where they were in the results set (that’s the problem with infinite scroll – infinity is hard to navigate). In the end, the scroll bar was rather unremarkable looking – but watch what it does as you aggressively scroll down. It’s intuitive but it isn’t obvious that’s how the scroll bar should have worked based on the original idea.

This idea is an incremental one. Like many others, it was created through understanding the customer through data, figuring out what problem the customers are trying to solve, and having a simple idea that helps. It’s also about being able to let go of ideas that don’t work or clutter the experience – my advice is don’t hang onto a boat anchor for too long. (We had an idea of a kind of scratchpad where users could save their images. It was later dropped from the product.)

The Live Search scratchpad. No longer in Bing's Image Search

I’m still an ideas guy. That’s what fires me up. By the way, Nick’s still at Microsoft. Julie’s at Google. And I am at eBay.

(Here’s the patent document, and here’s a presentation I made about image search during my time at Microsoft. All of the details in this blog post are taken from the presentation or patent document.)