Podcast of Nate Bolt’s rap on Remote UX Research
Nate Bolt, President of Bolt|Peters
Remote UX is an INCREDIBLY small niche
Nate started off with a nifty comparison: “Less than 5% of all research is remote. Research in general is a niche industry. User research within that is pretty small. UX research within that is even smaller. Remote UX research is even smaller.”
Why do people love remote?
- Remote is a cheap way to get access to someone’s physical environment
- You want to intercept somebody who is “in the moment.” Ex: They are in the middle of entering their credit card to buy something.
- Easier to recruit
- Easier to observe and record their actions
- Can get participants from all around the world
When not to use remote
So when is remote research a bad idea? If you need specific skill sets. Example: If you want feedback on a video game from 13-year-old kids who are French and Chilean but speak English and have web cams. This is a really bad idea for remote testing. Why?
- Bandwidth is a huge issue
- Translation is a huge issue
- It’s hard to get consent to talk to kids
- It’s hard to make behavioral inferences from web cams
- Security is a problem
- People mis-assess their language skills. Nate mentioned that they ran into this issue a lot. A quick tip to overcome this is to ask a question such as “What happens to a balloon when you let it go? Up? Down?” as preliminary screening to make sure they actually speak the language. Always better than asking “Do you speak English?”
Designing the actual study
Nate outlined an example study he did for Intuit trying to figure out why folks were bailing out of checkout. The remote UX study setup had:
- 10 participants live with a screen sharing tool
- 10 participants from usertesting.com (self-moderated, which meant the participants were talking to a mic on their own; nobody was on the phone with them)
- Ethnio used to intercept users as they were about to check out and ask them what was going on in their minds
Two tips to remember
- Screen whether people will be good participants. A good question to ask before you recruit a participant is: What did you come to this site to do today? Folks that answer “I have the last model and am looking for the next one” are killer participants! Folks that say “Just because” are the ones you’ve got to X out.
- Try to avoid professional survey takers. Folks that say: “Oh yeah – I would totally use that!” are the ones you want to get rid of. Try to ask people to perform a task instead of asking them about their opinion. Nate did a study testing Survey Monkey’s new UI. Nate asked the users to create a survey for something they needed to test. He could see the people that truly cared about the tool. Those are the ones that he recruited for more testing.
The talk had a lot more details and information which you can get by listening to the full podcast above as you page through the slides. We'd like to thank Nate Bolt once again for showing us the ins and outs of Remote UX Research!
Nate: In the last six months there have been more UX research tools created than in the preceding history of mankind. When you talk to people about their lives in the context of their usage you generate more surprises. They were streaming the video from the car live to the designers in Santa Monica and the engineers in Germany.
Moderator: Thanks everyone for coming to ZURBsoapbox. I'm stoked to have Nate Bolt here. Nate Bolt, who's conducted over 230 user research studies for companies such as Electronic Arts, Oracle, and Sony; who is the mastermind behind Ethnio, a nifty little app that has recruited hundreds of thousands, millions of participants for user research studies around the world in only the seven years that it's been out. Of course Nate Bolt, who's published on [inaudible @01:04] and [inaudible @01:06], who's written an awesome book, Remote User Research from Rosenfeld Media, which all of you should get. So with that, let's welcome Nate Bolt, the one, the only, number one on Google for robotic treat dispenser, Nate Bolt.
Nate: Thank you, Moderator, that was an awesome intro. So since we have a pretty small group today I thought we could custom tailor this. I'm not just going to like go through the spiel. We can I think find out what you guys are interested in about remote research and we can focus on those. My guess is that it's going to be about the tools, so I've got a bunch of stuff in here that we usually get into the different categories of UX apps. Since you guys have your own UX research apps I thought that would be the most fun to dive into.
It's just worth a reminder that we're in this incredibly small niche. Research in general is sort of a niche industry, user research within that is pretty small and then UX research within that is even smaller and remote UX research or online is even smaller. This is our scientific guess at how much of overall user research is conducted remotely: certainly less than 5%, especially when you think about the world of ad agencies conducting focus groups, and that dwarfs our world of designers and developers, right?
There's just more of that industry. My question for you guys is, how much moderated remote stuff do you guys do if any? How many people have done any moderated like screen sharing type of remote testing here? None. Zero. Fantastic. We'll definitely focus on the tools. This is kind of the way we split it up in terms of the world of researching people. Most of our work tends to be on the behavioral side. We do some opinion-based research, but we're most interested in people's functional interactions with interfaces, whether it's software or hardware or whatever.
A lot of people want to know what scenarios are good for doing moderated remote research, but I think the same applies to when to do online UX research too. So let's say we've got our fictional character here. This guy's name is Marv. He's a creative director, UX interaction designer, product manager, developer, IA and this is his boss. His boss is kind of a dick, but he wants to know how they can basically, I don't know what the birds have to do with it, but he wants to save money on research right, so it's a common motivating factor to get into online research.
Again, we kind of split up remote or online research into this watching people or using a tool. On the moderated side we encourage people to think about it as a cheap way to get access to somebody's physical environment. Typical lab research where you have somebody come in. They're on your terms. They're coming into your facility, they're sitting in your office. Even if it's guerrilla style and you're in a coffee shop there's still an artificial physical environment.
The cool thing about doing some kind of remote moderated thing is they're in their moment, they're in their own physical environment, they're in their own computing environment. You can get sort of what we call time aware research, which is just when you intercept somebody and they're buying something or doing something and you say, hey do you have 10 seconds to share your screen with me and I want to follow along with you in what you're doing and talk to you about it. That's sort of a way to get on their timeline.
We call it time aware research, but you're obviously giving up physical presence. So it's really easy to get observers. You can do it with people all over the world clearly, but there's certainly times when this, and again this is for the moderated side, when it's a bad idea. So if you did want feedback on a video game with 13-year-old kids who are actually French and Chilean, but spoke English and had webcams, that's a really bad idea for remote testing because bandwidth is a huge issue, translation is a huge issue, it's hard to get consent for kids.
The actual degree to which you can make behavioral inferences from webcams is kind of arguable. It's just pretty shitty resolution. If there's any security that's a problem. And that's actually a safe in our office. We did the testing on Spore a few years ago and they made us store the code in a safe and the only place we could keep the safe was in our bathroom and it now holds toilet paper. This one where they wanted to do the web cams with kids, that was a real study too.
It was for this game called Habbo, which is big for like 12-year-olds. This was like the French screener that we put up on their site and it was as close to a complete disaster as we've ever had doing moderated remote intercepts because kids don't play video games when their parents are home. We needed parental consent in order to do the interviews. The kids also lie about their ability to speak English over the phone, so French kids, German kids, and Chilean kids all lied consistently.
They basically said, yes I can speak English, on the phone and then we'd call them and they didn't speak any English, at least not spoken. This is kind of a bunch of things that went wrong with the study, but the most interesting is that when they did mis-assess their speaking skills, if you ever are looking to recruit people online from other countries to do English speaking research of any kind, even if it's a tool or if it's moderated.
We ask the question, what happens when you let go of a balloon, and then we have a dropdown that says, up, down, left, right, depends on what's in the balloon, depends on what planet you're standing on, but anything that kind of makes somebody prove that they speak English is a much better question than just asking somebody if they do. Talking about designing an actual study. Let's say the evil boss wants some insight on why people are bailing out at checkout.
This is one we just did for Intuit and what we came up with is to do 10 users live with a screen sharing tool and then 10 users with usertesting.com. Have you guys ever heard of usertesting.com? Okay. It's a service where they'll find random people who are on their panel and for $39 per person that person will talk into their own computer mic so we call it self-moderated because they're talking so you get some sort of human feedback from them, but you're not on the phone with them. Their experience is a little weird, it's kind of like you're sitting alone in the dark talking to your computer being like, OK I got this, I understood that. It's kind of cool and we've been using it more.
We try to come up with a way to say okay, if you're about to purchase this product, we use our Ethnio, which is our tool to intercept people right in the middle of checkout and ask them about the product that they're about to purchase. So this is kind of the difference between when you ask somebody to pretend to care about purchasing a product. When they're really about to buy it the criteria by which they make click decisions are totally different. You have a bunch of things on your mind when you're about to submit your credit card that aren't necessarily on your mind when you're doing sort of make believe.
This is what it looks like when we do moderated remote testing where all the clients and stakeholders we ask them to come in the same room because the physical collaboration on the client team is great, it's just that the participant is somewhere else. A lot of people ask me at this point if you can do prototypes. I know you guys know that you can. Probably about 60% of our projects involve some type of prototypes and there's tons of tools to do and we've just done a ton of these prototype studies and dinosaur videos, so this is the part that I kind of wanted to get to for you guys.
The tools, the UX tools, have been exploding. In the last six months there have been more UX research tools created than in the preceding history of mankind. It's really interesting. The way we define a UX tool is that it has to be at least a little bit behavioral right? So your stuff falls into this category. It's not a survey. Surveys have been around forever and sure, that's a web or an online form of research, but we all know that it's strictly opinions.
There's no relation to functional interaction in a survey. It might ask, how do you find this site, do you find this site easy to use, but people's own characterizations of their experience are notoriously shitty. A survey is not really a user experience tool and web analytics, again you can get great data, we're huge fans of it, but it's not really a UX tool because it has no human face to those statistics. You can get tons of richness, but you don't know the story behind those numbers.
We kind of put the tools into five categories. There's moderated, which is just any kind of remote research where you're talking to somebody. It could be with screen sharing, it could be like a tool that shows their click behavior, but you're having some kind of contact with them. Self-moderated like we just talked about. Automated, this is you guys, there's two subcategories of automated, but it's anything where you're giving the user a task and then recording their interactions with either a static image or a functional site.
Have you guys heard of Loop11? Okay. It's like a live site. Then there's this whole category of fancy analytics like CrazyEgg that have gotten a ton of press in the UX sort of world about being rad and they just do different visualizations of web analytics, there's no task involved and we'll talk about that in a second. Then there's longitudinal studies, there's diary studies, so there's only one tool that we know of that's good called Revelation where people over long periods of time sort of enter in data either from their cell phones or from a website and they say, oh, I've been using this new Sony product for a week now and I love it.
It's been three weeks and they kind of check in. Those are sort of longitudinal or for a long time. Here's just some example products. I'm sure you guys recognize a bunch of these. I would put, in fact I thought I did put, what's the new one? Verify yeah, under automated static because you're giving questions and tasks and then asking people to interact with a static image or a set of static images, which is a little bit different from automated live like Loop11 users and Keynote, they spend all their time and resources being able to give a task in a little iFrame and then watch people interact with a live site or a prototype.
Another way to think about these is you talk to people, people talk to a machine, you get tasks on a live site, you get tasks on images and then just fancy pants analytics. Another way to think about that is we tend to generate the most raw insight from one-on-one interaction, which is probably why ethnography became a buzzword in corporate America in the first place. When you talk to people about their lives and the context of their usage you generate more surprises than you do as you get more into what I sometimes call computers watching people using computers.
Another way to think about it is you tend to get more opinions as you go down the line in this direction, because people are self-reporting in a lot of the tools. They're just saying, you know, I think honestly that color sucks, so it's not as much behavioral feedback. Ad agencies tend to love it more the farther you go down, and also you see this term.
This is where our world of UX and development touches up against ad agencies and branding and if you type online qualitative research into Google there's this huge world that's ten times bigger of online focus groups and online panels and things like that that ad agencies, when they're going to do a campaign, an outdoor campaign or even a web campaign or something, they want to spend $100,000 to interview people about the concepts of the ads. We don't really tend to do that kind of stuff.
Another way to think about that is that Jared Spool hates it. But we were getting ready to do a project for him and he said, "Well, that's not true. I don't actually hate it," and he sent me this long email and he said hate is not the right word. It's just that you shouldn't blame DVD players for bad Jim Carrey movies. It's not the tools' fault. It's just that these tools, in his opinion, encourage sloppy inferences for user experience decisions because they make it so easy to just execute a test. With Five Second Test or Feedback Army you can just send off a test and be done with it, and you get the feedback and you can just basically go with whatever people say.
So you could get direct inferences and his point is it takes a lot of training to really get to understand user behavior and by the time you've done that you might as well just do some sort of moderated observation instead of just doing tools. He's not against the tools themselves, he does hate eye tracking though, no, he said he doesn't. So we just have some examples of usability. It's hard to see over the top, but this is a task we had from a recent test that we did for Levi's and Facebook in conjunction.
They wanted to know how the like button is working because it was a recent launch, and the task was: now you've gone to a product page, click on what you would look at or click on. This is the heatmap of where people would click, and the second interesting thing is that the like button obviously has almost no clicks. It's funny because we very deliberately didn't put "where would you click on what you liked about this page," because that would be a horrible leading task, but they got a big kick out of reading through the comments.
The other thing is it's kind of hard to see, but these notes, when you use Usabilla you guys probably know people justify where they click so you get a bunch of notes. I think this is similar to you guys' tools. This is another task that we had which was, imagine the first thing you want to do is make a quiz, where would you click to do that? This is an educational software company, but what's funny is if you just look at the page everybody just clicked on quizzes pretty much, but it says, to insert the quiz click the arrange button, which kind of throws you for a loop.
Most people didn't click the arrange button, there's like 4-5 people that clicked there, maybe 6, but 45 people clicked on the quizzes. It's one of those things. This is a perfect example of a straight-up usability issue, no insight required, right? Like duh, that quiz button needs to actually work. There was a bunch of reasons why it couldn't at first, you know, technical constraints, but this is fantastic for just illustrating that. This is what the task looked like on Loop11.
It's got a Harry Potter task and then the difference here is that people self-report when they've completed a task or abandoned it. So it's up to them to say yes, I've finished the task or no, I abandoned it. So the statistics you get about their completion rate are sort of dependent on people's own ability to judge themselves. And we found that if you would watch them some people would clearly have succeeded, but then mark themselves as abandoned and sometimes vice versa, not always, but that happens. We tend to use GoToMeeting for a lot of moderated stuff, Ethnio for recruiting.
The reason we use Ethnio for recruiting and we would've been happy to never have developed a tool, but there was no way to get live responses. Like if you use Wufoo or Survey Monkey or something like that to execute a survey on your site and you're looking for people to participate in either automated or moderated research. All those survey tools, you have to look at the results. They're built for analyzing aggregate data and all Ethnio does is just dynamically dump the people who are filling out your questions into the categories and you can set up filters and so, are you going to buy this product, is a common question that our clients want to use.
We only want people to participate in research that are actually prospective buyers and then we use the open-ended responses like, what did you come to this site to do today, to get a feel for if people are going to be good participants. You get a lot of insight from somebody typing, yeah, I had the last Sony flat screen from the previous model and I really want to upgrade to the new 3D OLED or whatever and I'm looking to find that. You know that person is going to be a killer participant because they're articulate and they have a good task versus the 10 people that just wrote info.
They might not be like the right fit. Here is you guys' stuff. I've been pointing out that I really like the emotion scale and I noticed too that there's another, this is a new one, PlainFrame. You can upload a potential IA and it removes the context from the IA and you can just test the navigation and it will tell you like where people click and stuff like that. But this is the one, so I think it's funny when people introduce new jargon like emotion analytics to what's basically a Likert scale, similar to your thing, it's a form of a Likert scale, but they're calling it emotion analytics.
But really it's just automated feedback on a static image. It's the same category of research with a big frilly graphic. Let's talk about recruiting for a second. Let's say the boss wants to know how to find people, there's a kind of gradient of realness and I was at South By this past year and there was a panel with FreshBooks and Wufoo talking about what kind of research they did if any in the development of their products and they spent the whole time with the panel talking about AB testing. They were all about AB testing.
What somebody asked was, did you do that for optimizing your marketing site or to come up with the core interface of Wufoo and FreshBooks? They were like, actually all we did for the core interface was grab people in the hallway and like guerrilla testing and significant others and nothing about the core interaction of Wufoo and FreshBooks ever had a single AB test. All the AB tests are for things like email campaigns and registration, getting signed up for Wufoo and how many people can we get to buy per visit. So what's interesting to me is, there's nothing wrong with that method.
We're huge fans of guerrilla methods. You just get more accuracy the more down the spectrum you go and sort of the bottom of the spectrum I would say was Craigslist. We do it sometimes, we don't like to talk about it at parties, but it's a legit way to get participants and obviously you can do agencies. Do you guys for your tools do most people just enter in emails or do they put something on their site?
Audience Member: On Twitter maybe [inaudible @21:31] URL [inaudible @21:32] Twitter . . .
Nate: Oh okay, they Tweet it out. Okay.
Audience Member: [inaudible @21:33] like email something out.
Audience Member: We pretty much give them a link.
Nate: And then who knows what they do with it? Got it. All that is pretty legit. Obviously, Wufoo is great for any kind of complicated form.
Let's see here. We used to think that any kind of existing customer panel is really bad, but I've been changing that recently to only sort of bad, because the idea was that you'd only have people that were like professional survey takers and those people are really just getting paid to give you an opinion and they're just going to tell you what they think you want to hear. Oh yeah, this is great. I would totally use this in my job. That kind of feedback makes me want to shoot myself.
They call it the Hawthorne effect, giving the tester what they want to hear. But I've sort of changed my mind because Usertesting's panel includes a rating system so every time you do a user for $30 or whatever it is, you get to say did they seem like they bullshitted? That's a really interesting scale because you end up getting people who have a pretty authentic way of pretending to get involved in your task basically, so this is kind of some meta stuff here, but we used Usertesting for Survey Monkey, who is a client of ours, so we did Usertesting on 10 users about Survey Monkey's new product and we just asked people, pretend to come up with a survey that you could use for your job or, and we had some academic people, at school and they really did.
If I sat down and thought about it I could probably come up with a survey that I do need to run and they did take the time to do that and that made all the difference, so even though they were panelists they took the time to sit and be like, okay, what are my projects this year? One guy was a developer and he was really curious like if he could convince 10 of his friends to take the quick survey.
He was curious about their development environment, how they set up their stations and I forget what language he was using, but you could tell as he was creating the survey he was like, oh yeah, and actually I keep meaning to ask people, how do they deal with this one setup task that I hate for Visual Basic or something and you could hear him care. That made a big difference, so he was technically a panelist, we talked about that already.
We do a lot of live recruiting with Ethnio. Again, this is on the moderated side, I have a feeling you guys are more interested in the tool side, but if you ever do any moderated testing you need about 10,000 uniques per day to be able to nab people, pull them out of thin air and then talk to them on the phone, because only about 200 people out of 10,000 will fill out a screener, it's like a 1.6% response rate, and then only 10 of those are going to be people you want to talk to. The rest aren't going to be interested in buying or fit the criteria, and then of those 10, when you call people on the phone and say, hey you filled out a screener on zurb.com.
We're doing a study on our site. Will you join a GoToMeeting and share your screen? Four of those people are going to be like, hell no, because that's crazy. They're like, I don't know who you are, I just filled out a form on your site, I'm not ready to share my screen with you. But 60% of people will because they're nice and maybe you're offering them a gift certificate and so that's how that works. We set up this handy dandy little short URL.
We use bolt.ps that's just super, super easy to read over the phone, so if you are doing any kind of moderated testing it's so much easier not to read somebody a 9-digit WebEx meeting or something stupid like that to get them going and then we do a lot of Clickwrap consent agreements so they're sort of minorly legally binding, but they're close. We tend to use Amazon gift certificates to pay incentives. The only reason we use Amazon is because they only require an email address to fulfill, so if you're trying to pay somebody and not everybody has PayPal out there in the world, so it's nice to just say, what's your email, here's your gift certificate.
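As a rough check on the live-recruiting numbers Nate quotes above (10,000 daily uniques, about 200 screeners, 10 people worth talking to, and 60% agreeing to share their screen), here is a minimal Python sketch of the funnel arithmetic. The function names and the target of 10 sessions are hypothetical, added only for illustration; they are not part of Ethnio or any tool mentioned in the talk. Note that 200 out of 10,000 works out to 2%, a little higher than the 1.6% figure mentioned in passing.

```python
def funnel_rates(uniques, screeners, qualified, accepted):
    """Return stage-to-stage conversion rates for a recruiting funnel."""
    return {
        "screener_rate": screeners / uniques,    # visitors who fill out the screener
        "qualified_rate": qualified / screeners,  # screeners worth calling
        "accept_rate": accepted / qualified,      # callers who agree to share their screen
        "overall_rate": accepted / uniques,       # visitors who end up in a session
    }


def uniques_needed(target_sessions, uniques, accepted):
    """Scale the observed funnel to a target session count (integer ceiling division)."""
    return -(-target_sessions * uniques // accepted)


# Figures quoted in the talk: 10,000 uniques -> 200 screeners -> 10 qualified -> 6 say yes (60%).
rates = funnel_rates(uniques=10_000, screeners=200, qualified=10, accepted=6)
for name, value in rates.items():
    print(f"{name}: {value:.2%}")

# Hypothetical target: how many uniques would 10 live sessions take at the same rates?
print(uniques_needed(target_sessions=10, uniques=10_000, accepted=6))
```

Under these assumed rates, only a small fraction of a percent of visitors end up in a session, which is consistent with Nate's point that live intercepts need a high-traffic site.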
A lot of crap goes wrong on the moderated side, people are sort of distrustful a little bit at first, but usually after you are talking to them and up and running they kind of let go and are down. There's a bunch of advanced techniques we could talk about. I see it's about 12:55 and I know we want to have time for questions. As I'm flashing through these advanced techniques, as you guys have questions start shouting them out. Like, what the hell is that?
Audience Member: Is there any time you don't want to be remote?
Nate: Yeah totally, all that stuff from that video game example. That was all meant to be like if bandwidth is an issue, if the movements on screen.
Audience Member: Just the tech stuff?
Nate: Yeah, tech stuff and then also who is the user audience? Like kids, remote is typically bad. Anything that involves like real sensitive data like the security stuff. Like banks pretty much never do remote stuff. They're like, we can't. Sometimes, like we did a big study for Wikipedia and they were dying to get people's full on facial expressions so they wanted that face to face, which is cool, so we did a ton of in-person testing and they loved it. I think that's a legitimate reason to not do remote if you want to see people's faces.
Audience Member: What percentage of your clients actually act on the research you produce?
Nate: I would say at least 1%.
Audience Member: Okay. Excellent. That's good news.
Nate: Yeah, I mean maybe bordering on 2 this year, pushing 2%. It's so funny, not a ton, sad but true.
Audience Member: I was going to say, how do you moderate, like you were showing at the end when you had them all watching I think it was the interrupted flow sitting around, like how do you conduct that?
Nate: The one-to-many? This one? The multi-threading?
Audience Member: No, it was the one where you had the room where you pan across and everybody from the company was watching what people were purchasing.
Audience Member: You're moderating with the client, how do you actually run that so that? Are people just starting to shout stuff out or usually one person might run with an opinion or something?
Nate: That's a super good question. That's something that took us like years to get honed down, but it takes two people, so you have to have a lead moderator and a support person there. We put everybody into a Campfire room, so we encourage people even though we're in the same room. They can talk, but we ask if they have any real issues to type them into chat and the support person sort of filters that out for the moderator.
The other thing is, the moderator a bunch of times during the interview will say to the participant, hey Frank, can you hold on a second while I adjust something on my end? They'll put the participant on mute and then everybody opens it up for discussion, like what did we just see here. An engineer will be like, I know it's not part of the study, but can we have them go check this one thing, we want to make sure it's working and we say sure or somebody else will say, can you explain what you thought you saw there? That kind of stuff. And we have a workshop too where we teach that specifically called escape the lab. It takes some practice. Shit can get crazy pretty fast.
Audience Member: You were talking about one time you were doing a study in a car?
Nate: Yeah, for Volkswagen.
Audience Member: How did that work out?
Nate: That's a good question. So that's this portable research one. So all we did was, they wanted to study people's interactions with the cockpit interface and their own personal devices, so we had the moderator in the back seat with a laptop, an external web cam and just a Sprint Mobile broadband card and they were streaming the video from the car live to the designers in Santa Monica and the engineers in Germany and all those people were IMing with the moderator. I'm sorry, that was all the support person.
The lead moderator was sitting in the front seat talking to the driver, I mean not all the time, just watching with a pen and paper, just very unobtrusive. The support person was in the back seat doing all the technological craziness. That was their only job was to record and IM back and forth with people. If something came up that was really important they would say to the moderator, can you ask what they just did there? Hans wants to know why did they stop the car and look up that device or whatever it was.
Audience Member: And the Internet was okay?
Nate: The Internet was okay. We set the expectation like hey, you know, it's cell phone data connection, it's probably going to drop and it did and they would rejoin. We used Ustream so we just had a page and the stream would stop and start sometimes, but everybody understood and nobody minded. Yeah, that was fun.
Audience Member: Was that the strangest testing you've ever done?
Nate: That's a good question. That was so much fun, I guess it was kind of strange. I feel like there have been some other strange ones. I can't think of any right now. I feel like some of the video game studies we've done have been weird. Oh yeah, we've had people come really drunk and stoned to video testing.
Audience Member: Awesome.
Nate: It was kind of awesome. And we have to say to the client, what's your preference here? We can ask them to leave really easily and nobody will know and they were like, no, fuck it, this is real life. So it's like, okay.
Audience Member: [inaudible @31:21]
Nate: Right. I mean, they still gave plenty of good feedback.
Audience Member: A little bit more honest.
Nate: Yeah, totally. This game sucks, never playing this.
Moderator: So it's almost 1:00 so thanks for coming.
Nate: All right. My pleasure.