In this episode, Ledge chats with Ryan Burgess, Software Engineering Manager at Netflix, about leading the Acquisition UI team, which optimizes the signup and login processes for one of the highest-usage apps on the planet. They discuss the challenges of coordinating among engineering teams at massive scale, working on technologies that span a multitude of different platforms, and how Netflix incorporates A/B testing at the core of everything they do while delivering your weekend video binge.
Ledge: Ryan, great to have you here, man! Thanks so much for joining us.
Ryan: Thanks so much for having me. I’m excited to join the podcast today.
Ledge: I know who you are and you’ve done some amazing things. But for the audience, can you give your two- or three-minute background introduction?
Ryan: Two or three minutes? They might be pretty bored with that. But I am a software engineering manager at Netflix. I lead a team that’s called the “Acquisition UI” team; and what we focus on is building signup: registration, payment, onboarding, login, really everything that gets our customer into using Netflix.
What’s unique about our team is that we’re doing that across platforms. We’re building UIs for iOS, Android, website, and TV. So there are really unique challenges that come with how to support cross-platform work on our teams.
Another thing that I’ve always really been a fan of at Netflix, and one of the big reasons why I joined, is the A/B testing culture. We’re not just shipping a feature. I always joke that we’re not really shipping features; we’re actually shipping an A/B test.
We’re constantly iterating in the product, where users are using playback. But even in the signup flow, we’re constantly iterating to build a great user experience; and you don’t always know what the best user experience is until you really understand the data and actually see it in real production.
I think that’s always been something that I’ve loved as an engineer, that constant iteration and feedback. And, sometimes, it’s painful because you think it’s a great user experience, and then it’s not.
But it’s good to know. Ultimately, I’d rather know if it’s a good user experience or a good feature that we’re rolling out and really get that feedback.
I’m also a panelist on a podcast called the “Front End Happy Hour.” We have drinks and we talk about front end, so you can hear us talking about different sorts of topics, typically around the front-end software engineering industry. That’s been a good one as well.
Ledge: We’ll get that in the show notes, for sure. And I’d be happy to be invited and talk about front end.
Ryan: You’re always welcome. The one thing that’s hard is you and I are remote right now. One thing that we’ve stayed true to because it is a happy hour and we’re sitting around talking in a room, we are legitimately in a room. There’s no remote call. We have yet to do that.
We’ve done episodes where we are live on a stage on a conference panel, things like that but we’ve yet to do a remote call; and we’re trying to really stick to that more in the conversational aspect.
We do travel. So next time I’m in your area or you’re in the Bay Area, we should definitely connect.
Ledge: That’s good. We’ll get to party together. You’re from Netflix. You work on one of the biggest apps in the entire universe. Before we get into the neat stuff about culture and all the things that you guys are famous for and the team and A/B and all that, I’ll have to ask for just a little picture of the journey of the tech that goes into a monstrously used app. I think that every entrepreneur and mobile developer would, at least, want to know that.
So give me the baseline.
Ryan: Wow! That’s a deep question and it probably has a long way to go about answering that.
For me, one of the things that’s unique and that people know about Netflix is that we’re known for microservices. And one thing that’s really been beneficial, I think, to Netflix, in general, be it in the culture or how we write and create features across the app is that, obviously, there are a lot of teams involved.
One thing that’s been really unique about having such a microservice culture and really splitting the applications up that way is it really allows us to run independently. We’re ultimately achieving the same goal. We’re all on that same mission of what we’re trying to achieve. You can really rely on teams to take their end of it and run with it and run independently and really focus on the quality of what they’re doing.
My team is focusing on the UI; they’re going to build as performant a user experience as possible, and that allows us to really specialize and think about that strategically.
So that’s been a really helpful aspect really speaking to the large aspect. And there are teams that I probably don’t even realize that we’re interacting with because there are so many layers down the stack which is great.
I don’t necessarily have to. The systems are working and I interact with maybe a team that eventually interacts with that. I think that’s kind of unique.
As a bit of a summary ─ I mean, we can go on for hours and hours going down the stack and thinking about how Netflix works in that way. I feel that there have been talks where we’ve shown diagrams of our system, and it’s a complicated system.
Ledge: It has to be. You’re probably the greatest load that could ever be documented of any service. The A/B split testing culture, I think, is huge. You hear about this with any mega-scale service.
Just talk through how that would work when you roll out a new feature or the things that you learn from it and how that impacts the actual day to day in your UX and front-end world.
Ryan: Let’s start from the beginning of how an A/B test is even started. At the end of the day, the start to an A/B test is really “What’s our hypothesis?”
Back to our sixth grade science class, “What’s your hypothesis to this? What do you expect to happen or what are you trying to test and achieve?”
Oftentimes, that’s a really big collaboration with our teams. My team, especially, is working closely with the product manager as well as the design, UX as well as our engineering teams to really think like “What’s the problem we’re trying to solve?” and not necessarily “Here’s the spec and feature; let’s roll it out as an A/B test.”
It’s really thinking about “How do we get to this hypothesis and what does that look like?”
Then, it’s really isolating the variables. Say we were doing something very simple like a copy test, where it’s “Sign up for Netflix free” versus “Sign up for a free month.” That’s a pretty simple thing to test, but you want to isolate all the variables and run them at the same time in production. Once the variables and features are created, we then actually run production traffic against all of them at the same time.
If there are two variations, that could be just a 50/50 split of the traffic that is coming to that feature. It’s randomly generated, so you and I won’t necessarily be in the same test. We could be. It’s randomized. We don’t really want to be like, “Well, everyone in the U.S. gets this feature and everyone in Japan gets this feature.”
That doesn’t work. You’re not going to hit anything significant that well and it’s a little biased when you do that.
Sometimes, we’ll have tests that have four or five variations or ten variations all running at the same time so we can all actually be experiencing Netflix in different aspects that are different from each other.
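A minimal sketch of the kind of randomized split Ryan describes, not Netflix’s actual allocation system. It assumes a hash-based assignment so that the same user always sees the same variation; the function name `assign_variation`, the test name, and the user IDs are all hypothetical.

```python
import hashlib
from typing import Sequence


def assign_variation(user_id: str, test_name: str, variations: Sequence[str]) -> str:
    """Deterministically map a user to one variation of a test.

    Assumption: hashing stands in for the randomization described in the
    conversation. Geography or any other user attribute plays no part.
    """
    if not variations:
        raise ValueError("variations must not be empty")
    # Mix the test name into the hash so one user can fall into
    # different buckets for different tests.
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variations)
    return variations[bucket]


# Hypothetical two-cell copy test; adding more entries gives A/B/C/... cells.
cells = ["Sign up for Netflix free", "Sign up for a free month"]
print(assign_variation("user-123", "signup_copy", cells))
```

With two entries the traffic splits roughly 50/50; a list of five or ten entries gives the multi-variation case with no other change.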
Ledge: You have such an enormous production user base. Is there any temptation or reason to ever do A, B, C, D, E, F, and G test?
Ryan: Absolutely! I guess that’s what I was alluding to, in the sense that it’s not necessarily just an A/B, 50/50 split. Sometimes, yes. But there might be five or six variations of a single test. So that would be like A, B, C, D, E, F, G or whatever that looks like.
Then, there’s also the potential that we’re running a test in one area and then another team is running another test and so forth. So there are times when you can be in multiple A/B tests. But there are also times when we have to caution that that could actually potentially not work well. Those two tests could butt up against each other and not build the best user experience.
It’s always an interesting discussion on how we run these tests and roll them out. But, essentially, yes, you can have multiple variations in one test.
Ledge: How do you get that organizational “working together” on that? It’s such a big thing ─ so many teams. I’m reminded of these websites that you go to and you’re like it’s very obvious that nobody communicated that they’re A/B split testing one thing while this ad pops up over it from the other team.
So how do you coordinate that whole product road map at such a massive scale?
Ryan: I think a lot of it is really communicating. I don’t think there’s a simple answer. I don’t think we always have the solution on this either. But it’s really understanding and communicating what you’re trying to achieve.
There are some really amazing forums within the product teams and product org where we’re actually bringing these test ideas, or ideas that we want to run as tests. We bring them to a forum and actually discuss. And it helps broadly share what we’re trying to achieve, and it allows each other to kind of poke holes in it and question it and then also bring up those pain points like “Hey, this probably can’t run at the same time as X, Y, and Z because this won’t work well. It will be a terrible user experience.”
But that’s still not perfect. I think it really takes a lot of coordination and just really socializing and sharing that context with as many teams as possible.
We write a lot of memos. We’re fans of Google Docs where you can actually share that with as many people as possible and weigh in and give comments and then also be able to share that with others that you think can affect the tests they’re running.
I don’t think there’s a perfect solution. It’s not easy. That’s for sure.
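The “this probably can’t run at the same time as X, Y, and Z” check could be sketched as below. This is an illustrative assumption, not a description of Netflix tooling: each planned test declares hypothetical “surface” labels for the parts of the UI it changes, and any two tests that share a surface are flagged for discussion.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import FrozenSet, List, Sequence, Tuple


@dataclass(frozen=True)
class PlannedTest:
    name: str
    # Hypothetical labels for the parts of the UI this test changes.
    surfaces: FrozenSet[str]


def find_conflicts(tests: Sequence[PlannedTest]) -> List[Tuple[str, str, List[str]]]:
    """Return (test, test, shared surfaces) for every overlapping pair."""
    conflicts = []
    for a, b in combinations(tests, 2):
        shared = a.surfaces & b.surfaces
        if shared:
            conflicts.append((a.name, b.name, sorted(shared)))
    return conflicts


planned = [
    PlannedTest("signup_copy", frozenset({"signup_landing"})),
    PlannedTest("extra_steps", frozenset({"signup_landing", "signup_email"})),
    PlannedTest("tv_login", frozenset({"tv_login"})),
]
for first, second, shared in find_conflicts(planned):
    print(f"{first} and {second} both touch {shared}")
```

A flagged pair is only a prompt for the kind of conversation described above; whether two tests can really coexist is still a judgment call.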
Ledge: You guys have largely escaped that, and I’m wondering if it’s the culture that you’re talking about and your mastery of this type of A/B testing world and sort of global communication, because I can’t think of a single time where you’ve rolled a feature out and your users went absolutely nuts and slaughtered you on the Internet. And that happens to so many big companies.
Do you attribute it to that culture and that sort of skill in this area?
Ryan: Yes, I think so. I think just because really, at the end of the day, we might think we have a great idea and we think “Hey, this is a great user experience.” There are often times when ─ I’ve been doing this for so many years that I think I know best like “Yes, this is a great user experience” or reducing steps is a great idea in something like a sign-up flow.
Not always! To me, going back to testing it and really putting it in front of production traffic is super important. I mean, data shouldn’t. It’s data and we’re making data-informed decisions.
So I do really attribute a lot of that to the A/B testing and really understanding that. Also, at the end of the day, we have a lot of traffic to do that with but we also don’t want to hurt the user experience just because we’re testing.
So I think it’s super important as we’re doing this test to really make sure that, at the same time, we’re not making a poor negative experience; and we’ll get those reads fast. If a feature is not good, we will know.
And it’s not just asking for feedback in a qualitative survey. It’s real world traffic using this product and they’re almost unaware that that feature is being tested on them.
Ledge: Can you tell us any really good stories of massive failure kind of Phoenix rising from the ashes of learning opportunities, shall we say?
Ryan: I love that you used the word “learning.” That’s actually something that we pride ourselves on in those A/B tests. Even something that doesn’t work as a feature, we should treat that as a learning. Just because it’s a failure doesn’t mean it’s absolutely a big fail.
I guess, if we keep trying to force it down the users’ throat, that might not be a great thing but I think we can learn from those aspects.
One that has really stuck out for me, and I’ve even shared this at some talks that I’ve given at conferences about A/B testing, is this aspect which I think I alluded to earlier. Reducing steps or friction is always going to be a better user experience, or that’s always been my thought and hypothesis. It’s like, yes, anytime you can reduce steps, that’s going to be better.
Well, we found this one test that we were running where we actually introduced some additional steps in our sign-up flow that actually performed better. And I’ll explain why just as a brief one. It’s hard to visualize if you’re not seeing it.
But we added these in-between pages between steps. It’s like you’re getting a page before you’re adding your email and password; and it’s giving you just a really brief description of what you’re doing like “Hey, you’re going to be giving us your email and password. Here’s why.”
There’s a “continue” button right in the exact same spot. That, I think, is super important. Throughout these additional steps (there are three extra steps that were added), it’s giving the user the ability to skip past them really fast.
So my opinion is the reason why this worked is because you’re able to give the information to someone who needs the additional information in understanding what the steps are and then also making it very easy for someone who says, “Yes, I get it.” They can skip really fast because that button is right in that exact same spot which allows people to kind of push forward.
But for someone who needs additional information, it’s there and you can give that to them and provide that because you don’t know unless you ask a question up front like “Do you need lots of information or do you want to just do this as fast as possible?”
I thought that was a really unique one where my initial thought when we first started putting it together was like, there’s no way this is going to be a better user experience; you’re adding more friction, and that can never be a good thing.
Ledge: I guess, in that learning, we’ve got to test everything. You have to be of the mindset that you can completely suspend all of your assumptions. And that’s a difficult mental model ─ to drop every bias that you have and all of your frameworks and try to go, what if I had a clear mind about this?
How do you even go about that? It’s like a creative process.
Ryan: Yes. I think it’s being open to varying ideas and opinions and really understanding, okay, that first one, I wouldn’t have agreed that that was a good idea necessarily.
Obviously, I did think, yes, we should test this and find out what happens. To me, being helpful in that way is really thinking strategically, okay, how do we best test this to make sure that we isolate the variables in a proper way to truly understand, when that data comes back, is this a good user experience or is it not?
If we try to add too many variables in a single variation, well, that might muddy up the data and you’re like, was it because we added those additional steps or was it because we changed some copy or changed some colors or a design layout?
You have to really be thoughtful about what you’re isolating between those variables. So I think that can really help.
And it’s just being open to “Yes, we should test this.” This seems like a plausible thing that we should understand.
And if it didn’t work, now, you’ve learned that “Yes, definitively, this probably isn’t the best user experience. How can we think strategically to iterate on that or move past it and do something completely different?”
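The point about isolating variables can be made concrete with a small sketch. It assumes, purely for illustration, that each variation is described as a dictionary of settings; the keys `copy`, `extra_steps`, and `button_color` and the helper names are hypothetical. A variation counts as isolated when it differs from the control in exactly one setting.

```python
from typing import Any, Dict, List

_MISSING = object()  # marks a setting that one side does not define


def changed_variables(control: Dict[str, Any], variation: Dict[str, Any]) -> List[str]:
    """List the settings whose values differ between control and variation."""
    keys = set(control) | set(variation)
    return sorted(
        key for key in keys
        if control.get(key, _MISSING) != variation.get(key, _MISSING)
    )


def is_isolated(control: Dict[str, Any], variation: Dict[str, Any]) -> bool:
    """True when exactly one variable changed, so a result can be attributed to it."""
    return len(changed_variables(control, variation)) == 1


control = {"copy": "Sign up for Netflix free", "extra_steps": 0, "button_color": "red"}
more_steps = dict(control, extra_steps=3)
muddied = dict(control, extra_steps=3, copy="Sign up for a free month")

print(changed_variables(control, more_steps), is_isolated(control, more_steps))
print(changed_variables(control, muddied), is_isolated(control, muddied))
```

In the second case two settings changed at once, which is the “was it the extra steps or the copy?” situation described above.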
Ledge: So we’re in the business of sort of finding and hiring and vetting and certifying the very best developers. That’s just our whole thing and we spend a great deal of time on our proprietary system. And we think we’re pretty good at it and it’s working out really well.
And yet, I’m talking to tech leads every day on the podcast and I’d really want to know, what are your heuristics? You’re in a position where you need to hire really excellent engineers. You work at Netflix and you guys are famous for your engineering culture and everybody is paying attention.
What is that process like? How do you know when you have A-plus players ready to join the team?
Ryan: Wow! I mean, it is a huge part of my job. I think it’s super important to build out a strong engineering team. And it’s not necessarily just the most technically sound and the best engineer out there; it’s really strategically about “What do they bring to the table? How do they integrate with the team? How thoughtful are they to also question the product and the user experience?”
To me, what is really important, too, is having an opinion, challenging some of these ideas, and really making sure that we’re rolling out the best product not just from a technical standpoint but also from a user perspective.
And I think it’s always important in our team to each have an opinion and thoughts. We should be sharing that.
Yes, it’s not an easy problem. I think that bar is set very high for our teams.
For me, I don’t necessarily always know right away if someone is the right fit. We also don’t have a rigid interview process. Netflix is somewhere where the culture is not very process driven. It’s a sense that we have a lot of freedom.
And so, I have the freedom to really decide “What does the best interview look like? What’s going to really let me know if this person is the right fit for Netflix or the right fit for my team? How can we vet those skills out?”
I don’t have to answer to HR or a recruiting team that says, “You have to do these exact steps.” They’re helpful as partners and say, “Hey, maybe you’re not vetting this” or “Let’s think about that.”
But, at the end of the day, we don’t have to follow a prescriptive path. And I love that. I think that’s really important. I feel like interviewing is not perfect. I still don’t think anyone does it right.
But not necessarily having to follow a set process has been very valuable, as has just being strategic along those lines and really constantly questioning.
It’s the same as the A/B test. It’s constantly questioning the process like “What’s working and what’s not?,” getting feedback from the engineers on the team that are interviewing ─ myself or the recruiting team ─ but also, “Hey, we just hired a new candidate who’s joining your team. Feedback on the interview. What did you think? What could have been better?”
And so, we’re constantly doing that and I think that continues to help the process and it also allows us to think strategically about that next hire.
Ledge: Last question: What’s next for you? Passion areas and projects ─ what are you speaking at this year? What kind of stuff is getting your mind moving?
Ryan: Wow, that’s a great question! I think there’s a lot of interesting work that my team is specifically doing that we haven’t always typically done. You think of Netflix as “We do a lot of video,” correct?
You know, there’s something to be said about that. My team has really focused on the sign-up flow and login and really thinking strategically around that. But I think there are also some ways that we can be pulling in video and really introducing Netflix earlier in the process.
And so, my team has been working on that which I’m super excited for us to really test and figure that out. I think it’s a new area and we’ll see what happens.
One that I’m always super excited about and passionate about that I don’t think we should ever lose is performance. Thinking about performance in all areas of the world, the connectivity can vary.
And, to me, it’s so important to build a great user experience everywhere from a 3G network to a very fast Silicon Valley connection.
How do we do that and be very strategic and thoughtful about that?
I think there’s always a continuous thought and way of testing. We’re trying different things that I always want to see our teams doing.
Ledge: Ryan, super cool to have you on, man! Thanks for the insights and spending the time with us.
Ryan: Awesome! Thanks so much for having me on. It was a pleasure being on. It’s great. Hopefully, there was some useful information.
Ledge: Absolutely! Thanks so much.