Category Archives: opinions

Putting Email on a Diet

A wise friend of mine once said: try something new for 30 days, and then decide if you want to make it permanent.

Here’s my latest experiment: turning off email on my iPhone. Why? I found I was in work meetings, or spending time with the family, and I’d frequently pick up my phone and check my email. The result was I wasn’t participating in what I’d chosen to be part of — I was distracted, disrespectful of the folks I was with, and fostering a culture of rapid-fire responses to what was supposed to be an asynchronous communication medium. So, I turned email off on my iPhone.

What happened? I am enjoying and participating in meetings more. I am paying attention to the people and places I have chosen to be. And I’m not falling behind on email — I do email when I choose to do it, and it’s a more deliberate and effective effort.

Have I strayed? Yes, I have. When I’m truly mobile (traveling and away from my computer), I choose to turn it on and stay on top of my inbox — that’s a time when I want to multitask and make the best use of my time by actually choosing to do email. And then I turn it off again.

My calendar and contacts are still enabled. On the go, I want to know where and when I need to be somewhere, and to be able to consciously check my plans. I also want to be able to contact people with my phone.

Will I stick with it? I think so. Give it a try.

See you next time.

Don’t use a Pie Chart

I don’t like pie charts. Why? It’s almost impossible to compare the relative size of the slices. Worse still, it is actually impossible to compare the slices between two pie charts.

Pie charts in action.

Take the example above and look at Pie Chart A. The blue slice looks the same size as the red or green slice, and you might conclude they’re roughly the same. In fact, the histogram below shows that red is 17 units, blue is 18 units, and green is 20 units. The histogram is informative, useful for comparison, and clear for communication. (There’s still a cardinal sin here though: no labels on the axes; I’ll rant about that some other day.)

Compare pie charts B and C above. It sure looks like there’s the same quantity of blue in B and C, and about the same amount of green. The quantities aren’t the same: the histograms below show that green is 19 and 20 respectively, and blue is 20 and 22.

You might argue that pie charts are useful for comparing relative quantities. I’d argue the comparison is possible but difficult. Take a look at the three yellow slices in charts A, B, and C. They look to be similar percentages of the pies, but it’s hard to tell: they’re oriented slightly differently, making the comparison unclear. What’s the alternative? Use a histogram with the y-axis representing the percentage.
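
If you want to see the difference for yourself, here’s a minimal matplotlib sketch that redraws Chart A’s values from above (red 17, blue 18, green 20 units) as the kind of histogram shown below the pies (a bar chart in matplotlib terms). The styling choices are just illustrative assumptions:

    import matplotlib.pyplot as plt

    # Chart A's values from the text: red 17, blue 18, green 20 units.
    labels = ["red", "blue", "green"]
    units = [17, 18, 20]
    total = sum(units)
    percentages = [100.0 * u / total for u in units]

    plt.bar(labels, percentages, color=labels)
    plt.xlabel("Slice")  # label your axes (see the cardinal sin above)
    plt.ylabel("Percentage of total (%)")
    plt.title("Pie Chart A, redrawn as a bar chart")
    plt.show()

Drawn this way, the one-unit gap between red and blue is immediately visible, and two charts built the same way can be compared side by side.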

Pie charts get even uglier when there are lots of slices, or when some of the slices are ridiculously small. If you’re going to use them — I still don’t think it’s a good idea — then use them when there are only a few values and the smallest slices can be labeled. The pie chart below is ridiculous.

A pie chart that’s nearly impossible to read. I’m not sure what conclusions you could draw from this one.

Why do I care? I’m in the business of communicating information, and I spend much of my time reviewing information that people share with me. Clarity is important — decisions are made on data. See you next time.

Music everywhere with Sonos

I’ve embraced Sonos as the way to enjoy music and radio in my house.

What’s Sonos?

I was late to the game too, so don’t worry if you haven’t heard of Sonos or don’t quite know what it does. Sonos is a company, and they make several powered speakers, that is, nice little units that contain an amplifier and speakers. They also make a product that allows you to connect your existing amplifier to the Sonos system.

The Sonos family of powered speakers and integration products. At the rear left is their subwoofer. The Play:3, Play:5, and Play:1 are grouped in the middle rear. At the front is Playbar for home theater. At the rear right are the integration products.

One thing that’s cool about Sonos is that the powered speakers don’t need to be wired to a system. You put them where you want, and they connect wirelessly to a base station that’s plugged into your home wireless Internet router. Alternatively, you can wire them to a standard Ethernet socket if you’ve wired your house. Sonos call their base station a bridge, and right now one of those comes free with any of Sonos’s speakers.

What makes a Sonos system cool, though, isn’t just that it’s portable and unwired. It’s that it sounds pretty darn good, and it integrates reasonably nicely with popular music services such as Slacker and TuneIn Radio. That means you can pay a few bucks a month and play a large library of music, and you can listen to a vast array of radio stations. You control this experience using your smartphone, tablet, or PC.

Playing music

It’s pretty simple to play music. You select the room you want to play — the available rooms are shown on the left in the image below. Then you select a source you want to play — you can choose from your own music library, or one of the streaming services, or a line-in input into one of the devices.

The Sonos Mac OS X application. Very similar to the Sonos iPad app. On the left are rooms, on the right are sound sources.

You can group rooms together to create a zone, and have the same source playing throughout part or all of your house. For example, I often put on the radio, and group together my bedroom, main living areas, garage gym, and outside patio so that I can listen to them as I move around the house.

I’ve got a turntable, and I’ve connected that to one of Sonos’s larger Play:5 systems; the smaller Play:1 and Play:3 don’t have a line-in input. I needed a pre-amp between the turntable and the Play:5, and picked up a reasonable one at an online store. With this setup, I can listen to vinyl throughout the house in the same way as I can listen to the rest of my music.

I sometimes plug other sources into another line-in socket in another Play:5. For example, when I want to listen to Major League Baseball, I fire up the MLB At Bat app on my iPhone, and connect the iPhone to the Play:5. Then, I select the Line-in as a source in the Sonos app, and we’ve got baseball in the house. (Go Mariners!) The drawback is that if I want to adjust volume or settings, I have to walk to the Play:5 and fiddle with the iPhone.

What’s Great

Here are the top five things I love about Sonos:

  1. Sounds good to great. I can’t get over how much sound is in the Play:1 for the size and price. The thing is about as big as a coffee tin, and it has nice bass response and looks good. The bigger Play:5 is a serious unit, and has five amplifiers and five speakers — when you pair two together to create a stereo system, and add a subwoofer, you’ve got a serious sound system (and it’s priced like one too — you’re talking US$1500)
  2. Music and radio everywhere. Buy a few units, put them around the house, and your life will be better. You’ll be better connected to the world through radio, and you’ll enjoy your music even more
  3. Easy to set up. When you buy a new speaker, you can use any Sonos app on any device to register the unit. It takes about two minutes to add the unit to your house
  4. Range. I can put speakers anywhere in my house — in locations where I don’t get wifi on my laptop or phone — and it works just fine. I can take one of them out in the yard, and all is well
  5. It’s an alarm clock. It’s easy to set up an alarm on any Sonos device, and choose a source. I wake up to KQED radio, and it gently fades in. It turns off after an hour (that’s configurable). The rest of my family uses this feature too

What Needs Work

Here’s where there’s room for improvement:

  1. It’s expensive. The Play:1 is the first sub-$200 offering from Sonos, the Play:3 is $299, and it’s upward from there. The Play:1 is great value, but fitting out your house is an investment. Be warned: these things multiply; you’ll buy one or two, and you’ll be back for more
  2. The service integration is a bit clunky. I really like Slacker’s iPhone app — but you only get a fraction of the features when you use the Sonos app to stream the Slacker service. The Sonos folks use the APIs that these streaming companies provide, rather than the streaming companies integrating Sonos capabilities natively into their apps. You can also tell Sonos has no relationship with Apple — the music library integration is pretty clunky, it’s at the file system level
  3. The apps need a little bit of a rethink and redesign; they lack the beauty and simplicity of the hardware. The app paradigm is that you select a room, then you select music. That isn’t always how you think — sometimes you want to dive into the music, and then select the room. You can do it, but it’s a little clunky (and sometimes you’ll surprise someone in your house with a blast of music). Still, I’ve seen tweens using it easily enough
  4. The apps or the network or something can be sluggish. I find that my iPhone is a little frustrating as the interface to Sonos — my iPad and Mac are much better. It sometimes takes a while for the iPhone app to find my Sonos system, and the app can be unresponsive to interactions sometimes. It’s also not a reliable device for streaming my music library
  5. It needs power. The Play:1 looks portable, but you need an electrical outlet

All up?

Pretty awesome. A game changer at my house. The hardware is amazing — and that’s what’s actually important. Software and music service integrations can be fixed, and they’re improving with every version.

See you again soon.

My Tesla Model S

Beta testing the Tesla Model S

My Tesla Model S

I bought a Tesla Model S earlier this year. It’s a dream car: comfortable, responsive, spacious, and great looking. It’s a total geek dream gadget, and I feel good about owning an environmentally sensible electric car. It’s 95% of the way to perfect – and it’s fun being part of the ongoing experiment to find the last 5%.

Scheduled Software Updates

Tesla updates the car occasionally – the car has a 3G cell connection. A dialog box on the massive 17” screen says an update is available, you schedule it, and wake up to an improved car. It’s like updating iOS on your iPhone. Indeed, it’s very similar – your car could be quite different after the update, and it’s clear the car is designed to be a flexible software-driven platform. This is mostly where the beta testing feeling comes in.

Scheduled Charging

Scheduled charging. My car is configured to begin charging at 1am when it’s plugged in at home.

The most recent update added scheduled charging. You plug the car into its charge point, and it’ll start charging when you tell it – this allows you to take advantage of lower electricity rates in the early hours of the morning. What’s cool is that it is location-aware: you can set different charge behaviors for different locations, and the car remembers those. So, for example, you could have it charge as soon as it’s plugged in at work, and beginning at 1am at home – and once it’s set, it just works. Pretty neat. (I’m glad this feature arrived – I was beginning to figure out how to install a timer on my 50 Amp 220 volt plug at home.)

Plugged in for charging with the mobile charging cable.

I actually got this new feature about a week ahead of everyone else. How? Well, I scheduled an update and it failed. I woke up to a dialog box that told me to call Tesla Service. The climate control didn’t work, the odometer read 0 miles, and a few other things were a little off – but the car was completely drivable. I called Tesla service, dreading the need to take it to their service center – but it was way simpler than that. The guy on the phone asked me when I’d next have the car parked for a couple of hours. They later logged into my car, remarking that “the packages were all there but didn’t unpack properly” (suggesting a Linux flavor to the car), and “cleaned things up”. When I got back to the car, all was great – everything back to normal, and I’m the first guy on the block with the latest software that includes scheduled charging.

Climate Control Problems

Climate control must be a harder problem than you’d think. It’s entirely automatic by default: you set the temperature, and the Model S looks after maintaining it. However, I get blasted with cold air most of the time – if you jump in the car when it’s warm outside, and ask for 70 degrees inside, it’ll get you there as fast as it can. And once it’s there, it’ll lower the fan speed until (I guess) it gets a couple of degrees warmer, and then it’ll Arctic blast again. It always feels like it’s not quite doing what I want – sometimes 70 degrees feels rather too warm, and other times I’m freezing. There must be subtlety in making this an awesome feature (maybe other car companies took a long time to get this right?): you want the occupants to be comfortable as soon as possible, but you also want them to have a pleasant time getting there. I bet there’s a software update coming.

Spinal Tap humor: the volume control and even the climate control fan settings go all the way to 11

The web browser and nav apps fall short

The giant 17” screen includes a web browser and a navigation application. The browser is about as basic as you’ll get: it doesn’t have autocomplete (with much-needed spelling correction), it doesn’t save form data, and it randomly seems to lose its history and cookies. It’s also got problems with its touch interface: you need to press a little above any link you want to click, and often a few times. The navigation application is ok, but has a few quirks: it’s always oriented so that north is facing up, which isn’t how I like to use navigation, and traffic data seems to update on its own frequency (even if you turn traffic on and off) – which can lead you into a jam. I am not quite sure whether the traffic data is used to determine routes – I suspect not yet; it’s certainly not configurable to tell the navigation app whether you’d prefer a faster, shorter, non-highway, or highway route as in many other nav tools.

If the 17” screen has issues, you can reboot it by holding the two scroll wheels on the steering wheel. You can do this while you’re driving. You can reboot the screen behind the steering wheel separately by holding the two buttons above the scroll wheels. Again, no problem while you’re driving. This suggests there are several physical or virtual machines in the Model S – at least one for each of the screens, and more behind them running what’s needed to drive the car.

Am I unhappy? No. The future has arrived early – a car that’s as much software as hardware, and that can be iterated on and improved without you going near a service center. Is it entirely baked? Not yet. Do I love my Tesla Model S? Best car I’ve owned easily.

See you next week.

By the way, while I own a Tesla, I don’t own any shares in the company nor do I plan to buy any. I wish I did, after their spectacular rise in the past couple of weeks.

Why Facebook shouldn’t have dumped HTML5

We all want to build fast, reliable mobile apps. Facebook couldn’t make its HTML5 mobile app deliver on that goal, and decided to build its own native app. In practice, this means retiring an app that is a browser-like shell that renders web pages (a thin client), and launching a fully-fledged app on the mobile device (a thick client, with apparently a sprinkling of HTML5 inside).

That’s a step backwards, and flies in the face of history. Haven’t we just been through a twenty-year evolution of thicker clients in general being replaced by thinner clients? How many apps did you install on your PC this year compared to ten years ago? How many tabs do you have open in your browser?

Should Facebook have just made the web faster and more reliable? Rather than mostly abandon HTML5, why didn’t they evolve the standard and make the web better? It wouldn’t be the first time that has happened — I’m sure many of you remember web standards evolving in the 1990s, and you can thank those days for the better experiences we all have today. So, an opportunity lost — but I am sure the story is not over, and indeed it sounds like their new app is a “hybrid app” (where there is some HTML5 inside the native app’s framework).

This change also makes experimentation much harder. On the web, most major companies are running test versus control experiments or A/B tests. We put a population of users in an experiment, and compare their behaviors with those who aren’t in the test — for example, at eBay, we try out improvements to search ranking on a small population of customers and compare their behaviors to those who are seeing the regular results. The great thing about the web is you can do this fast — pretty much as fast as you can code and test the changes — and test large numbers of simultaneous variants. The outcome is you make fast progress in improving your product on behalf of your customers.

Building native apps makes experimentation harder. You could build an “A” and a “B” experience into an iPhone native app, get it through Apple’s approval process, and try out the two experiences on the customers. But the barrier to entry is much higher — you can only run a couple of experiments, and you probably only release once a month at most. You’re not going to evolve your application as fast as your customers want.

There are always tensions and tradeoffs: in this case it is speed and reliability on one side, and the future of the web and experimentation on the other. I would have fought hard to stay in the latter camp.

Knowing Your Customer with Data

Are you really data driven? Here’s what I’ve learnt about making decisions using quantitative data.

A Typical Test versus Control Experiment

Let’s get on the same page about what we’re discussing. Most web companies run test versus control experiments, or A/B tests. The idea is simple:

  1. Divide the customers into populations (a sketch of this step follows the list)
  2. Show one population the control (default, “A”) experience
  3. Show one or more populations the test (new, altered, “B”) experience
  4. Collect data from each population
  5. Compute metrics from the data
  6. Understand the relative results between the test and the control
  7. Make decisions: either keep the control, or replace it with a new, better experience from a positive test
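
A quick note on step 1: most implementations hash a stable customer identifier so that assignment is deterministic and sticky. Here’s a minimal sketch, assuming a string customer ID; the bucket count, salting scheme, and function name are my own illustrative choices:

    import hashlib

    def assign_population(customer_id: str, experiment: str,
                          test_fraction: float = 0.5) -> str:
        """Deterministically assign a customer to 'control' or 'test'.

        Salting the hash with the experiment name keeps assignments
        sticky per customer and independent across experiments.
        """
        digest = hashlib.md5(f"{experiment}:{customer_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 1000  # 1,000 buckets of 0.1% each
        return "test" if bucket < test_fraction * 1000 else "control"

    # The same customer always lands in the same population.
    print(assign_population("customer-42", "new-layout", test_fraction=0.1))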

Explaining how to really know your customer with data at the 2012 eBay Data Conference

It’s critical in Step 5 to compute confidence intervals, that is, statistical measures that tell you the probability that the phenomenon you’re seeing is real. For example, using a one-sided t-test, you might learn that there’s a 90% probability that the test experience is better than the control.
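
Here’s what that computation can look like in practice: a minimal scipy sketch on made-up per-customer samples. Reading “roughly 90% confident” from 1 minus the p-value follows the loose framing above rather than strict statistical language:

    from scipy import stats

    # Made-up per-customer metric samples (say, pages viewed per visit).
    control = [4.1, 3.8, 4.0, 4.4, 3.9, 4.2, 4.0, 4.3]
    test = [4.3, 4.5, 4.1, 4.6, 4.2, 4.4, 4.7, 4.3]

    # One-sided Welch's t-test: is the test mean greater than the control's?
    # (The alternative= keyword needs scipy 1.6 or later.)
    t_stat, p_value = stats.ttest_ind(test, control, equal_var=False,
                                      alternative="greater")
    print(f"p = {p_value:.3f}; roughly {100 * (1 - p_value):.0f}% confident "
          "the test experience is better")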

Let’s suppose you’ve reorganized the layout of your site, and what you’ve learnt is that customers abandon the pages much less. Through your test, you’re 90% confident that a new experience you’ve tested is better than the default, control experience. On that basis, you might want to launch the new, test experience — but I’d caution you to learn more before you make a decision.

Where does the behavior come from?

I recommend you always dig deep into your data. Learn as much as you can before you decide. I like to see data “cut” (broken into sub populations) by:

  • Device (Mobile vs. tablet vs. desktop. Break it down by brand, make, and model [for example, Apple iPad HD])
  • Operating system (Linux vs. Mac OS X vs. Windows, break it out by versions)
  • Browser (Chrome vs. IE vs. Firefox vs. Safari, break it out by version)
  • Channel (Visits from within your site vs. visits from Google search vs. visits from paid advertising)

When you do this, and add in your confidence intervals, you will almost always learn something. Is the new experience working as expected on the dreaded IE6 and IE7? Any issues on a mobile device? Does it work better when customers are navigating within your site versus landing in the middle of it from a Google search?
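
As a concrete illustration of one cut, here’s a minimal pandas sketch; the columns and values are made up, not real data:

    import pandas as pd

    # Hypothetical per-visit log: population, device, and whether the
    # visitor abandoned the page.
    df = pd.DataFrame({
        "population": ["control", "test", "test", "control", "test", "control"],
        "device": ["desktop", "desktop", "mobile", "mobile", "tablet", "tablet"],
        "abandoned": [1, 0, 1, 1, 0, 0],
    })

    # Abandonment rate cut by device and population. A surprise in any
    # one cell is usually a bug or a data error, not a discovery.
    cut = df.groupby(["device", "population"])["abandoned"].mean().unstack()
    print(cut)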

Ask yourself: what can I improve before I make a decision? And always ask: knowing this detail, am I still comfortable with my decision? Be very careful about launching new experiences that help most of the population, and hurt some of it — ask whether you can live with the worst case experience.

When you do these cuts, make sure the data makes sense. I’ve learnt over the years that when you see something that you don’t expect, it’s almost always a bug, or an error in the data. Never explain away surprises with complex theories — something is probably broken.

Who or what is affected by the change?

You can think of the previous section as suggesting you cut the data funnel — where the behaviors come from. You should also cut the data by who or what it affects on your site:

  • Which customers are affected? (Old versus new, first time visitors versus returning, regular versus occasional, international versus domestic, near versus far, and so on)
  • What categories are affected? (Fashion versus electronics, browse versus buy, and so on)
  • Which queries are affected? (A search-centric view. Long versus short queries, English versus non-English, Navigational versus Informational, and so on)
  • Which sessions are affected? (Long research sessions versus short purchase sessions, multi-query sessions versus single-query sessions, multi-click sessions versus single-click sessions, and so on)
  • Which pages are affected?

All the same caveats and suggestions from the previous section apply here.

I also love to compute many different metrics. While you’ll often have a “north star” metric that you’re trying to move — whether it’s relevance of the experience, abandonment of your site, or the dollar value of goods sold — it’s great to have supporting data to inform your decision. When you compute more metrics, you will almost always see contradictions that make your decisions harder: but it’s always better to know more than to have your head in the sand. It takes smart, sensible debate to make most launch decisions.

The mean average hides the truth

Here’s an over-simplified example. Suppose six customers rate your site on a scale of 1 (horrible) to 10 (amazing). The three in the control rate you as 4, 5, and 6; the three in the test rate you as 1, 4, and 10. The control and the test both have a mean average rating of 5. (Ignore the statistical significance for this simple example.)

On this basis, you might abandon the work on the new experience — it’s no better than the control. But if you dig in the data, you’d see that some customers love the new experience, and some hate it. Imagine if you can fix whatever is causing customers to hate it — if you could get that 1 to be a 5, you’d see a mean average of over 6 for the test. The fastest way to move a mean is to fix the outliers: focusing on what’s broken.

I don’t like mean averages because they hide the interesting nuggets. I like to see 90th and 95th percentiles — show me the performance of the best and worst 10% or 5% of customer experiences respectively. In our simple example, I’d love to know that the worst customer experience was 1 in the test and 4 in the control, and the best experience was 10 and 6. Knowing this, I’m excited about the potential of the test, but worried that something is very wrong about it for some customers. That guides me where to put my energy.
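
The arithmetic of the example is worth seeing in a few lines of Python. With only three ratings per population, the min and max stand in for the percentile tails:

    import statistics

    control = [4, 5, 6]
    test = [1, 4, 10]

    for name, ratings in [("control", control), ("test", test)]:
        print(f"{name}: mean = {statistics.mean(ratings):.1f}, "
              f"worst = {min(ratings)}, best = {max(ratings)}")

    # Fix the outlier: if the 1 became a 5, the test mean jumps past 6.
    print(f"fixed test mean = {statistics.mean([5, 4, 10]):.2f}")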

Don’t be myopic

It’s common to measure your feature in the product, and ignore the ecosystem. For example, you might be working on an improvement on some part of a page — imagine that you’re working on Facebook’s news feed. You’ve figured out an improvement, run the test, seen much better customer engagement, and you’re excited to launch.

But did you worry about what you’ve done to the sponsored links on the right side of the page? Did you hurt the performance of another part of the product owned by another team? It’s common for features to hurt performance of others, and often cause the overall result to be neutral. This happens between features on one page, and between pages. Make sure you always measure overall page and site performance too.

Tests don’t tell you everything

Tests don’t tell you what you don’t measure. Measure as much as you can.

Even if you do measure as much as you can, there’ll be much happening outside your test that’s important. For example, if you run a test for a week, you don’t learn anything about the long term effects on customer retention. You don’t know anything about how customers will adapt to using the feature. You won’t know whether the effects are seasonal, or what might happen if some of your assumptions change — for example, what if another team changes something else on the page or site in the future?

This can be ok. Just realize the limitations, and be aware that retesting in the future might be a smart choice.

Quantitative testing also won’t tell you anything qualitative about what you’re working on. That’s a whole other theme of testing — and one I plan to come back to in the future.

Afterword

Around 1,000 people attended the employee-only eBay Data Conference recently. I had the opportunity to speak to them through my opening keynote address, and this post is based on that presentation. Thanks to Bob Page for inviting me.

Rebooting: a trick to avoid bugs

We all know that rebooting the home computer, router, backup device, DVR, or iPhone often solves mystery problems. (Have you noticed how frequently you’re rebooting your once-was-reliable iPhone?)

This works in large, distributed systems too: if you’ve got buggy code, a memory leak, or a shaky operating system, rebooting machines helps just as it does at home. I’ve seen this in practice: periodic, scheduled reboots of boxes to reduce memory use, reduce CPU load, or generally cause a return to a known state.

Indeed, I’ve seen plenty of problems that occur when this doesn’t happen. A system remains untouched for a while, and things go south. After a problem has occurred, I’ve heard quite a few folks say “we hadn’t rolled out code to that pool for a while” or “that box wasn’t rebooted for a few months”. In many cases, the issue was the gentle creep of increasing CPU or memory use.
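
As a sketch of what that periodic hygiene can look like, here’s a hypothetical watchdog that schedules a reboot when memory use creeps past a threshold. The threshold, the use of psutil, and the function name are all my assumptions; a plain cron-scheduled reboot achieves much the same thing:

    import subprocess

    import psutil

    MEMORY_PERCENT_LIMIT = 90  # hypothetical threshold; tune for your fleet

    def reboot_if_creeping() -> None:
        """Schedule a reboot if memory use has crept past the limit.

        Run this from cron on low-criticality boxes to force a return
        to a known state before the gentle creep becomes an outage.
        """
        used = psutil.virtual_memory().percent
        if used > MEMORY_PERCENT_LIMIT:
            subprocess.run(["/sbin/shutdown", "-r", "+5",
                            f"memory at {used:.0f}%, rebooting"], check=True)

    if __name__ == "__main__":
        reboot_if_creeping()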

Perhaps it’s good practice to ensure boxes are rebooted periodically. It’s probably wise when the machines are out-of-sight and out-of-mind: those less critical, less monitored, sometimes unowned services. It’s perhaps not even a bad thing: one of the wonderful properties of web services is they don’t have to be perfect, since you’re in control and the software’s running on your choice of hardware (unless you’re on someone’s virtual machine in some opaque cloud).