May 23, 2019 · 9 min read

Put your DevOps through P90X with Bob Chen of Beachbody

You’re one of the largest exercise brands in the world. How do you get your DevOps in shape? Ledge sits down with Bob Chen, Senior Director of Technical Operations at Beachbody, to chat DevOps, deployment, working with multiple distributed teams, and knocking off technical debt.

See, it turns out software teams often want to do their own thing… until they don’t. A good Ops team knows how to serve them, support them, and also give them room to set their own standards when needed. As Bob reminds us, “There’s always room for improvement!”

Bob Chen

Senior Director of Technology Operations for Beachbody LLC

Bob Chen is currently the Senior Director of Technology Operations for Beachbody LLC, leading their DevOps/DevSecOps team and strategy. Bob’s experience spans over a decade leading and building global teams within Technical/Engineering Operations. In addition to managing applications in corporate data centers, he also has extensive experience with multi-cloud strategies utilizing Amazon Web Services and Microsoft Azure. He specializes in platforms and apps within AWS.


Ledge: Bob, it’s good to have you.

Bob: Thank you, David. Thanks for inviting me.

Ledge: Can you give a two- or three-minute intro of yourself and your work for the listeners before we dive in?

Bob: Absolutely! I’m currently the senior director of technical operations at Beachbody. Many of you might not have heard of Beachbody but we’re the creators of P90X and Insanity and those workouts that everybody is very familiar with.

I’m managing the DevOps organization over at Beachbody. For the previous five years, I’ve also been running various operations groups, whether they’re DevOps or system operations and so forth.

I’ve been squarely in the cloud world. I’ve been working in AWS or on projects migrating from data centers to AWS.

And that’s kind of a quick nutshell in terms of my background most recently.

Ledge: I totally never do this but I heard a great joke and somebody saying, “I watched both seasons of P90X and it’s not working yet.”

Bob: I highly encourage you. You should try it out. You might break into a pretty big sweat.

Ledge: Crack a beer and sit on the couch and watch a couple of seasons.

Bob: I’ll probably be right there with you doing that ─

Ledge: Anyway, back to technology. I’m curious, what are the challenges that you face? Obviously, you guys probably do a tremendous amount of traffic. You’re probably thinking about all kinds of operational bottlenecks and scaling, horizontal and vertical.

What kind of stuff do you run into and what precisely does the stack look like to handle that kind of beating?

Bob: Our stack varies. We have our own data center stack for the enterprise teams that we support, as well as our digital stack that supports and builds Beachbody on Demand, which is our video streaming platform.

The stack varies from Java to Node to PHP, and it varies quite a bit.

The challenge is that we have a lot of legacy pipelines and jobs and we’re really trying to kind of take Beachbody to the next level from a DevOps perspective as well as automation.

It’s not that we haven’t done a great job but there’s always room for improvement and we’re just kind of taking it to the next level.

Ledge: So what are you doing on the deployment pipeline CI/CD? How many developers are you working with? What’s the organization? What’s the pipeline for getting code to production?

Bob: The DevOps organization at Beachbody works with two development teams, in general: our enterprise team as well as our digital team. And then each side ranges from 50 to 80 developers, depending on the number of projects that we’re working on.

What we were primarily using was Jenkins for kind of “build and deploy,” and we’re really taking that model and shifting it a little bit and breaking out the build and the deploy.

So CI uses its own CI tool set and CD uses its own CD tool set as well, without really co-mingling the two, because if both run through Jenkins, then a problem with Jenkins means you really can’t build or deploy at all.

Ledge: Right. Given all those different stacks, do you have to have different build strategies for some of the legacy code versus some of the new stuff and how you make those transitions like if you have to support all those different stacks and environments?

Bob: Our best strategy is really to provide a standardized process for development teams to follow and work through, and to really utilize the software engineers’ knowledge and skillset to actually create those CI pipelines.

To be honest with you, we’re not going to be as in-depth on the CI side as the engineers are. So we provide these standards and processes for them and then have the software engineers create the CI jobs.

Now, there are other teams where they really need a lot of help. I’m not certain if we want to just give them full control to create the CI job. That’s where we will step in and say, “Look, we’re going to create the best fit for you,” and then we meet with them constantly to understand how they do things and where things are. And it also kind of changes the way they’re doing things.

One thing we’re trying to do: before, we were building artifacts and deploying the artifacts as the code itself. We’re taking that model and really shifting towards containers, so we’re working with Docker as well as Kubernetes and building out Docker and Kubernetes artifacts.

Ledge: Do you end up finding that the demands from developers of different styles and classes, enterprise versus consumer or what have you, are different? What kind of stresses do you face as an organization trying to unify the process when there are different groups and different users from your standpoint?

Bob: Absolutely!

Ledge: How do you coalesce that into one thing that centralizes ops?

Bob: That’s a great question because that really is a big struggle. How do you work with multiple groups and multiple software teams and each team wants to do their own thing?

It’s been a challenge because I’ve had software engineering teams, current as well as past, say, “Look, we don’t want DevOps. You’re just going to slow us down. Forget it. We’re just going to do our own things.”

At the end, they always come back and say, “Hey, we need help.”

How we do that is really this: I work with one team. I try to focus on one team and create some really big successes for that team, and show how much we can automate, what we can do with these new processes, and how much time we save.

And based off that, we have other teams saying, “Look, actually, we want to use that tool and process as well.”

What we did was create a reference pipeline. Based on that reference pipeline, you plug in the language the team is using, the build process, and so forth. But the reference pipeline is really, “Okay, we’re going to use this tool. First, we’re going to do this. Second, we’re going to do some sort of code analysis. Third, we’re going to build the artifact, then scan it once the artifact is completed, and then go on to the CD phase.”

So that’s how we did it. We structure it so you can just really plug and play.
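As a rough illustration of the "plug and play" reference pipeline Bob describes, here is a minimal Python sketch. It is not Beachbody's actual pipeline: the function name `reference_pipeline`, the `BUILD_COMMANDS` table, the placeholder build commands, and the "checkout" first stage are all assumptions made for illustration. Only the overall order (code analysis, build the artifact, scan it, hand off to CD) comes from the conversation.

```python
from typing import Dict, List

# Hypothetical per-language build commands. A real team would plug in
# whatever its own build actually runs; these are placeholders.
BUILD_COMMANDS: Dict[str, str] = {
    "java": "mvn package",
    "node": "npm run build",
    "php": "composer install",
}


def reference_pipeline(language: str) -> List[str]:
    """Return the ordered stages for a team using the given language.

    Every team gets the same stage order; only the build step changes.
    """
    if language not in BUILD_COMMANDS:
        raise ValueError(f"no build step registered for {language!r}")
    return [
        "checkout",  # assumed first step; not specified in the interview
        "code analysis",
        f"build artifact: {BUILD_COMMANDS[language]}",
        "scan artifact",
        "hand off to CD",
    ]


if __name__ == "__main__":
    for stage in reference_pipeline("node"):
        print(stage)
```

The point of the shape is that a new team only supplies its language and build step, while the shared stages stay fixed across teams.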

Ledge: Do you use any type of Datadog, New Relic, anything like that to track the data and the deployment pipeline performance and things like that?

Bob: Yes. We have Dynatrace. We’ll use Dynatrace to capture at least the metrics from the APM side.

Ledge: And what have you learned from that process? I hear a lot of people say, “Well, we wish we had tracked these metrics to begin with; we could have fixed things right at the beginning.” What’s that been like as you build out and scale?

Bob: We recently acquired Dynatrace, and we’re still kind of working out the kinks in that process. In terms of the CI/CD, we really haven’t gotten to integrate Dynatrace in that aspect. We really just have been using Dynatrace from the operational perspective.

What we’re looking to do, obviously, as we deploy the artifact and run our QA tests, is to then look at the Dynatrace results and see if we are actually getting more errors or not.

So one thing we’re trying to do right now is also get better baselines. Baseline is always a tricky thing to get and I’m really helping the QA team and the application teams to understand the baselines and try to figure out what the baselines should be.
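To make the idea of checking a deployment against a baseline concrete, here is a small Python sketch. It assumes the error rates have already been exported from an APM tool as plain numbers; no Dynatrace API is used or implied, and the function name `exceeds_baseline` and the 10% tolerance are hypothetical choices for illustration only.

```python
from statistics import mean
from typing import Sequence


def exceeds_baseline(
    baseline_rates: Sequence[float],
    current_rate: float,
    tolerance: float = 0.10,  # hypothetical: allow 10% above the baseline
) -> bool:
    """Return True if the post-deploy error rate is worse than the baseline.

    The baseline is the mean of previously observed error rates
    (errors per request). `baseline_rates` must not be empty.
    """
    baseline = mean(baseline_rates)
    return current_rate > baseline * (1 + tolerance)


if __name__ == "__main__":
    history = [0.010, 0.012, 0.011]  # made-up example error rates
    print(exceeds_baseline(history, 0.020))
    print(exceeds_baseline(history, 0.0115))
```

A check like this is only as good as the baseline feeding it, which is why agreeing on that number with the QA and application teams matters.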

Ledge: Right, and it’s tempting to make your baseline as bad as possible. Of course, you’re not doing that.

Bob: That’s always a struggle. You’ve got the software manager ─ “My app can handle more than that.” He’s going to reset the baseline to whatever he wants it to be.

Ledge: The last question I ask everybody is we’re in the business of finding and evaluating and really vetting just super senior engineers ─ DevOps, software, architecture, etcetera ─ and we have a process that we go through for that. It’s very rigorous but I think we’re always reaching out for best practices.

So I love to ask every guest, “How do you evaluate and determine the very best DevOps engineers in front of you who are trying to join your team? How do you evaluate? What’s the interview process? What are the heuristics you use for somebody to get in the door that you feel comfortable hiring?”

Bob: Wow, that is a lot of questions right there, because that process is always tricky and it varies by team as well. But, first and foremost, I usually check their DevOps background and how they mix with the team.

I really want to understand how they do their CI/CD, what their ideal CI/CD process is, what tools they would utilize, and where that has gotten them, because your process, depending on where and how you start, is pretty critical, and everybody varies. And I look at that.

Based off that, I just dig into it through their experience as to how they utilize it and what their pain points are and what their successes have been and really understand their thinking that way.

I’ll let my engineers dig into their technical backgrounds, like “Okay, how do you do error handling?” and so forth. But, usually, I go about it with “Do they understand how to do CI/CD properly, in a way that I’m thinking about doing CI/CD?” because that’s how the team is doing it; and to shift it midway through and have somebody kind of just throw an egg into the pie is going to be pretty chaotic.

Ledge: We always want to avoid chaos.

Bob: You’re going to have chaos.

Ledge: Constructive chaos, right?

Bob: Exactly!

Ledge: Excellent! Bob, it’s really good to have you, man. I appreciate the insights. Best of luck getting through the holiday traffic!

Bob: Thank you. And you have a good holiday as well.