| # What’s Up With BUILD.gn |
| |
| This is a transcript of [What's Up With |
| That](https://www.youtube.com/playlist?list=PL9ioqAuyl6ULIdZQys3fwRxi3G3ns39Hq) |
| Episode 5, a 2023 video discussion between [Sharon (yangsharon@chromium.org) |
| and Nico (thakis@chromium.org)](https://www.youtube.com/watch?v=NcvJG3MqquQ). |
| |
| The transcript was automatically generated by speech-to-text software. It may |
| contain minor errors. |
| |
| --- |
| |
| Building Chrome is an integral part of being a Chrome engineer. What actually |
| happens when you build Chrome, and what exactly happens when you run those |
| build commands? Today, we have Nico, who was responsible for making Ninja the |
| Chrome default build system, to tell us more. |
| |
| Notes: |
| - https://docs.google.com/document/d/1iDFqA3cZAUo0TUFA69cu5wEKL4HjSoIGfcoLIrH3v4M/edit |
| |
| --- |
| |
| 00:00 SHARON: Hello, and welcome to "What's Up With That," the series that |
| demystifies all things Chrome. I'm your host, Sharon, and today, we're talking |
| about building Chrome. How do you go from a bunch of files on your computer to |
| running a browser? What are all the steps involved? Our special guest today is |
| Nico. He's responsible for making Ninja, the Chrome default build system, and |
| he's worked on Clang and all sorts of areas of the Chrome build. If you don't |
| know what some of those things are, don't worry. We'll get into it. Welcome, |
| Nico. |
| |
| 00:29 NICO: Hello, Sharon, and hello, internet. |
| |
| 00:29 SHARON: Hello. We have lots to cover, so let's get right into it. If I |
| want to build Chrome at a really quick overview, what are all the steps that I |
| need to do? |
| |
| 00:41 NICO: It's very easy. First, you download `depot_tools` and add that to |
| your path. Then you run `fetch chromium`. Then you type `cd src`, run `gclient |
| sync`, `gn gen out/GN`, and `ninja -C out/GN chrome`. And that's it. |
| |
| 00:53 SHARON: Wow. Sounds so easy. All right. We can wrap that up. See you guys |
| next time. OK. All right. Let's take it from the start, then, and go over in |
| more detail what some of those things are. So the first thing you mentioned is |
| `depot_tools`. What is that? |
| |
| 01:11 NICO: `depot_tools` is just a collection of random utilities for - like, |
| back in the day, for managing subversion repositories, nowadays for pulling |
| things from git. It contains Ninja and GN. Just adds a bunch of stuff to your |
| path that you need for working on Chrome. |
| |
| 01:25 SHARON: OK. Is this a Chrome-specific thing, or is this used elsewhere, |
| too? |
| |
| 01:33 NICO: In theory, it's fairly flexible. In practice, I think it's mostly |
| used by Chromium projects. |
| |
| 01:39 SHARON: OK, all right. And there, you mentioned Ninja and GN. And for |
| people - I think most people who are watching this have built Chrome at some |
| point. But what is the difference between Ninja and GN? Because you have your |
| build files, which are generally called BUILD.gn, and then you run a command |
| that has Ninja in it. So are those the same thing? Are those related? |
| |
| 01:57 NICO: Yes. So GN is short for Generate Ninja. So Ninja is a build system. |
| It's similar to Make. It basically gets a list of source files and a list of |
| build outputs. And then when you run Ninja, Ninja figures out which build steps |
| do I have to run, and then it runs them. So it's kind of like Make but simpler |
| and faster. And then GN - and Ninja doesn't have any conditionals or anything, |
| so GN is - just a built - it describes the build. And then it generates Ninja |
| files. |
| |
| 02:34 SHARON: OK. |
| |
| 02:34 NICO: So if you want to do, like, add these files only if you're building |
| for Windows, this is something you can do, say, in GN. But then it only |
| generates a Windows-specific Ninja file. |
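
As a rough illustration of the split Nico describes, the sketch below plays the role of a generator: the conditional lives in the generator, and the emitted Ninja text is a flat list of build statements. The file names (`foo.cc`, `foo_win.cc`) and the `generate_ninja` helper are made up for this example, and this is not how GN itself is implemented.

```python
def generate_ninja(target_os):
    """Return the text of a tiny build.ninja for one OS (illustrative only)."""
    sources = ["foo.cc"]
    # The "only on Windows" decision is made here, at generation time.
    if target_os == "win":
        sources.append("foo_win.cc")

    lines = [
        "rule cxx",
        "  command = clang++ -c $in -o $out",
        "",
    ]
    for src in sources:
        obj = "obj/" + src.replace(".cc", ".o")
        # Each source becomes one unconditional Ninja build statement.
        lines.append(f"build {obj}: cxx ../../{src}")
    return "\n".join(lines) + "\n"


print(generate_ninja("win"))
```

Generating for `"linux"` simply omits the `foo_win.cc` line, so the output file never needs a conditional of its own.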
| |
| 02:46 SHARON: All right. And in terms of when you mention OS, so there's a |
| couple places that you can specify different arguments for how you build |
| Chrome. So you have your gclient sync - sorry, your gclient file, and then you |
| have a separate args.gn. And in both of these places, you can specify different |
| arguments. And for example, the operating system you use - that can be |
| specified in both places. There's an OS option in both. So what is the purpose |
| of the gclient file, and what is the purpose of the args.gn file? |
| |
| 03:25 NICO: Yes. So gclient reads the DEPS file that is at the root of the |
| directory, and the DEPS file basically specifies dependencies that Chrome pulls |
| in. It's kind of similar to git submodules, but it predates git, so we don't |
| use git submodules also for other reasons. And so if you run gclient sync, that |
| reads the DEPS file at the Chrome root, and that downloads a couple hundred |
| repositories that Chrome depends on. And then it executes a bunch of so-called |
| hooks, which are just Python scripts, which also download a bunch of more |
| stuff. And the hooks and the dependencies are operating system dependent, so |
| gclient needs to know the operating system. But the build also needs to know |
| the operating system. And GN args are basic things that are needed for the |
| builds. So the OS is something that's needed in both places, but many GN args |
| gclient doesn't need to know about. For example, if you enable DCHECKs, like |
| Peter discussed a few episodes ago, that's a GN-only thing. |
| |
| 04:26 SHARON: All right. That sounds good. So let's see. When you actually - |
| OK. So when you run Chrome and you - say you build Chrome, right? A typical |
| example of a command to do that would be, say, `autoninja -C out/default |
| content`, right? And let's just go through each part of that and say what each |
| of those things is doing and what happens there. Because I think that's just an |
| example from one of the starter docs. That's just the copy and paste command |
| that they give you. So autoninja seems like it's based on Ninja. What is the |
| auto they're doing for us? |
| |
| 05:15 NICO: Yeah. So autoninja is also one of the things that's just in |
| `depot_tools`. It's a very - or it used to be a very thin wrapper around Ninja. Now |
| it's maybe a little thicker, but it's optional. You don't have to use autoninja |
| if you don't want to. But what it does is basically - like, it helps - So |
| Chrome contains a lot of code. So we have this system called Goma, which can |
| run all the C++ compilations in a remote data center. And if you do use the |
| system, then you want to build with a very high build parallelism. You want to, |
| say, `-j 1000` or what and run, like, a thousand build processes in parallel. But |
| if you're building locally, you don't want to do that. So what autoninja |
| basically does - it looks at your args.gn file, sees if you have enabled Goma, |
| and if so, it runs Ninja with many processes, and else, it runs it with just |
| one process per core, or something like that. So that's originally all that |
| autoninja does. Nowadays, I think it also uploads a bunch of stuff. But you can |
| just run `which autoninja`, and that prints some path, and you can just open that |
| in the editor and read it. I think it's still short enough to fairly quickly |
| figure out what it does. |
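
The decision Nico describes can be sketched in a few lines. This is a simplified stand-in, not the real autoninja script: the helper names, the `remote_jobs` default of 1000, and the line-based parsing of `args.gn` are all assumptions made for illustration.

```python
import os
import re


def goma_enabled(args_gn_text):
    """True if the args.gn text has an uncommented `use_goma = true` line."""
    for line in args_gn_text.splitlines():
        line = line.split("#", 1)[0]  # drop GN comments
        if re.match(r"\s*use_goma\s*=\s*true\s*$", line):
            return True
    return False


def pick_jobs(args_gn_text, cpu_count=None, remote_jobs=1000):
    """Many jobs when compiling remotely, roughly one per core otherwise."""
    cores = cpu_count or os.cpu_count() or 1
    return remote_jobs if goma_enabled(args_gn_text) else cores


print(pick_jobs("use_goma = true\n", cpu_count=8))
print(pick_jobs("is_debug = false\n", cpu_count=8))
```

The resulting number is what would be passed to Ninja as `-j`.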
| |
| 06:17 SHARON: OK. What does `-C` do? Because I think I've been using that this |
| whole time because I copied and pasted it from somewhere, and I've just always |
| had it. |
| |
| 06:28 NICO: It says - it changes the current directory where Ninja runs, like |
| in Make. So it basically says, change the current directory to out/GN, or |
| whatever your build directory is, and then run the build from there. So for |
| Chrome, the build always - the current directory during the build is always the |
| build directory. And then Ninja looks for a file called build.ninja in the |
| current directory, so GN writes build.ninja to out/GN, or whatever your build |
| directory is. And then Ninja finds it there and reads it and does its thing. |
| |
| 06:57 SHARON: All right. So the next part of this would be out/default, or out |
| slash something else. So what are out directories, and how do we make use of |
| them? |
| |
| 07:11 NICO: An out directory - it's just a build directory. That's where all |
| the build artifacts go to, all the generated object files, executables, random |
| things that are generated during the build. So it can be any directory, really. |
| You can make up any directory name that you like. You can build your Chrome in, |
| I don't know, fluffy/kitten, or whatever. But I think most people use out just |
| because it's in the global `.gitignore` file already. Then you want to use |
| something that's two directories deep so that the path from the directory to |
| the source is always `../..`. And that makes sure that this is deterministic. |
| We try to have a so-called deterministic build, where you get exactly the same |
| binary when you build Chrome at the same revision, independent of the host |
| machine, more or less. And the path from the build directory to the source file |
| is something that goes into debug info. So if you want to have the same build |
| output as everyone else, you want a build directory path that's two directories |
| deep. And the names of those two directories don't really matter. So what |
| some people do is they use out/debug for the debug builds and out/release for |
| their release builds. But it's really up to you. |
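
To make the "two directories deep" point concrete, here is a small sketch that computes the relative path from a build directory back to the source root. The checkout locations are hypothetical; the point is only that the relative path is the same on every machine as long as the depth is the same.

```python
import posixpath


def relative_source_root(build_dir, src_root):
    """Path from the build directory back to the source root."""
    return posixpath.relpath(src_root, start=build_dir)


# Two different (hypothetical) checkouts, both building two levels deep.
for src_root in ("/home/sharon/chromium/src", "/work/alex/chromium/src"):
    print(relative_source_root(src_root + "/out/Release", src_root))

# A build directory three levels deep gives a different relative path.
print(relative_source_root("/home/sharon/chromium/src/out/a/b",
                           "/home/sharon/chromium/src"))
```

The first two calls give the same `../..` even though the absolute paths differ, which is the property that matters when this path ends up in debug info.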
| |
| 08:26 SHARON: Right. Other common ones are, like, yeah. ASan is a common one, |
| different - |
| |
| 08:33 NICO: Right. |
| |
| 08:33 SHARON: OSes. Right. So you mentioned having a deterministic build. And |
| assuming you're on the same version of Chrome, at the same checkout, |
| tip-of-tree, or whatever as someone else, I would have expected that all of the |
| builds are just deterministic, but maybe that's because of work that people |
| like you and the build team have done. But what are things that could cause |
| that to be nondeterministic? Because you have all the same files. Where is the |
| actual nondeterminism coming from? Or is it just different configurations and |
| setups you have on your local machine? |
| |
| 09:09 NICO: Yeah, that's a great question. I always thought this would be very |
| easy to - but turns out it mostly isn't. We wrote a very long blog post that we |
| can link to from the show notes about this. But there's many things that can |
| go wrong. Like for example, in C++, there's the preprocessor macro `__DATE__`, |
| which embeds the current date into the build output. So if you do that, then |
| you're time dependent already. By default, I think you end up with absolute |
| paths to everything in debug information. So if you build under |
| `/home/sharon/blah`, then that's already different from all the people who are |
| not called Sharon. Then there's - we run tools as part of the build that |
| produce output. For example, the protobuf compiler or whatnot. And so if that |
| binary iterates over some map, some hash map, and that doesn't have |
| deterministic iteration order, then the output might be different. And there's |
| a long, long, long, long, long list of things. Making the build deterministic |
| was a very big project, and there's still a few open things. |
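
The hash-map iteration problem is easy to reproduce in miniature. The sketch below is a made-up code generator, not any tool from the Chrome build: iterating a Python `set` of strings can produce a different order from one interpreter run to the next, because string hashing is randomized per process by default, while sorting first gives stable output.

```python
def emit_unordered(symbols):
    """Order depends on set iteration, which can vary between runs."""
    return "\n".join(set(symbols))


def emit_deterministic(symbols):
    """Sorting before writing makes the output independent of hash order."""
    return "\n".join(sorted(set(symbols)))


print(emit_deterministic(["kFoo", "kBar", "kBaz"]))
```

The fix in a real tool usually has the same shape: either use an ordered container or sort before emitting.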
| |
| 10:08 SHARON: OK, cool. So I guess it's - yeah, it's not true nondeterminism, |
| maybe, but there's enough factors that go into it that to a typical person |
| interacting with it, it does seem - |
| |
| 10:21 NICO: Yeah, but there's also true nondeterminism. Like, every now and |
| then, when we update the compiler, the compiler will write different object |
| files on every run just because the compiler internally iterates about some - |
| over some hash map. And then we have to complain upstream, and then they fix |
| it. |
| |
| 10:34 SHARON: OK. Oh, wow. OK. That's very cool. Well, thank you for dealing |
| with this kind of stuff so people like us don't have to worry about it. OK. And |
| the last part of our typical build thing is content. So what is content in this |
| context? If you want to learn about content more in general, check out |
| episode 3. But in this case, what does that mean? |
| |
| 10:58 NICO: So just a build target. So I think people - at least I usually |
| build some executable. I usually build, I don't know, `base_unittests` or |
| `unit_tests` or `chrome` or `content_shell` or what. And it's just - so in the |
| Ninja files, there's basically - there's many, many lines that go, if you want |
| to build this file, you need to have these inputs and then run this command. If |
| you want to build this file, instead, you need these other files. You need to |
| run this other command. So for example, if you want to build `base_unittests`, |
| you need a couple thousand object files, and then you need to run the linkers, |
| what's in there. And so if you tell Ninja - the last thing you give it - |
| basically, it tells Ninja, what do you want to build? So if you say, `ninja -C |
| out/GN content_shell` or what, then Ninja is like, let's look at the line that |
| says `content_shell`. And then it checks - I need these files, so it builds all |
| the prerequisites, which usually means compiling a whole bunch of files. And |
| then it runs the final command and runs the linker. So Ninja basically decides |
| what it needs to do and then invokes other commands to do the actual work. |
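
A minimal sketch of that idea follows, assuming a toy build graph with invented file names and commands. It only works out which commands are needed and in what order; unlike real Ninja, it does not look at timestamps to skip steps that are already up to date, and it does not run anything.

```python
# output file -> (input files, command that produces the output)
EDGES = {
    "obj/a.o": (["../../a.cc"], "clang++ -c ../../a.cc -o obj/a.o"),
    "obj/b.o": (["../../b.cc"], "clang++ -c ../../b.cc -o obj/b.o"),
    "content_shell": (["obj/a.o", "obj/b.o"],
                      "clang++ obj/a.o obj/b.o -o content_shell"),
}


def plan(target, edges, done=None):
    """Return the commands needed for `target`, prerequisites first."""
    done = set() if done is None else done
    # Plain source files have no edge; already-planned outputs are skipped.
    if target not in edges or target in done:
        return []
    done.add(target)
    inputs, command = edges[target]
    commands = []
    for dep in inputs:
        commands.extend(plan(dep, edges, done))
    commands.append(command)
    return commands


for command in plan("content_shell", EDGES):
    print(command)
```

Asking for `obj/a.o` instead of `content_shell` would plan only the single compile step, which is why naming a smaller target builds less.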
| |
| 12:08 SHARON: OK, makes sense. So say I run the build - so say I built the |
| target Chrome, which is the one that actually is an executable, and that's |
| what - if you run that, the browser is built from it. So say I've built the |
| Chrome build target. How do I run that now? |
| |
| 12:31 NICO: Well, it's written - so normally, the thing you give to Ninja is |
| actually a file name. And the `-C` changes the current directory. So if you say, `-C |
| out/release chrome`, then this creates the file `out/release/chrome`. It just |
| creates that file in the out directory. So to run that, you just run |
| `out/release/chrome`, and hopefully it'll start up and work. |
| |
| 12:54 SHARON: Great. Sounds so easy. So you mentioned earlier something called |
| Goma, which had remote data centers and stuff. Is this something that's |
| available to people who don't work at Google, or is this one of the |
| Google-specific things? Because I think so far, everything mentioned is anyone, |
| anywhere can do all this. Is that the case with Goma, also? |
| |
| 13:14 NICO: Yeah. For the other things - so Ninja is actually something that |
| started in Chrome land, but that's been fairly widely adopted across the world. |
| Like, that's used by many projects. But yeah, Goma - I think it's kind of like |
| distcc. Like, it's a distributed compiler thing. I think the source code for |
| both the client and the server are open source. And we can link to that. But |
| the access to the service, I think, isn't public. So they have to work at |
| Google or at a partner company. I think we hand out access to a few partners. |
| And as far as I know, there's a few independent implementations of the |
| protocol, so other people also use something like Goma. But as far as I know, |
| these other services also aren't public. |
| |
| 13:53 SHARON: OK. Right. Yeah, because I think one of the main things is - I |
| mean, as someone who did an internship on Chrome, after, I was like, I'll |
| finish some of these remaining to do items once I go back to school, right? And |
| then I started to build Chrome on my laptop, just a decent laptop, but still a |
| laptop, and I was like, no, I guess I won't be doing that. |
| |
| 14:17 NICO: No, it's doable. You just need to be patient and strategic. Like, I |
| used to do that every now and then. You have to start the build at night, and |
| then when you get up, it's done. And if you only change one or two CC files, |
| it's reasonably fast. It's just, full builds take a very long time. |
| |
| 14:29 SHARON: Yeah, well, yeah. There was enough stuff going on that I was |
| like, OK. We maybe won't do this. Right. Going back to another thing you |
| mentioned is the compiler and Clang. So can you tell us a bit more about Clang |
| and how compiling fits into the build process? |
| |
| 14:50 NICO: Yeah, sure. I mean, compiling just means - almost all of Chrome |
| currently is written in C++, and compiling just means taking a CC file, like a |
| C++ file, and turning it into - turning that into an object file. And there are |
| a whole bunch of C++ compilers. And back in the day, we used to use many, many |
| different C++ compilers, and they're all slightly different, so that was a |
| little bit painful. And then the C++ language started changing more frequently, |
| like with C++ 11, 14, 17, 20, and so on. And so that was a huge drain on |
| productivity. Updating compilers was always a year-long project, and we had to |
| update, like, seven different compilers, one on Android, iOS, Windows, macOS, |
| Android, Fuchsia, whatnot. So over time, we moved to - we moved from using |
| basically the system compiler to using a hermetically built Clang that we |
| download as a gclient DEPS hook. So when you run gclient sync, that downloads a |
| prebuilt Clang binary. And we use that Clang binary to build Chrome on all |
| operating systems. So if one file builds for you on your OS, then chances are |
| it'll build on all the other OSes because it's built by the same compiler. And |
| that also enables things like cross builds, so you can build Chrome for Windows |
| on Linux if you want to because your compiler is right there. |
| |
| 16:11 SHARON: Oh, cool. All right. I didn't know that. Is there any reason, |
| historically, that Clang beat out these other compilers as the compiler of |
| choice? |
| |
| 16:24 NICO: Yes. So it's basically - I think when we looked at this - so Clang |
| is basically the native compiler on macOS and iOS, and GCC is kind of the |
| system compiler on Linux, I suppose. But Clang has always had very good GCC |
| compatibility. And then on Windows, the default choice is Visual Studio. And we |
| still want to link against the normal Microsoft library, so we need a compiler |
| that's ABI-compatible with the Microsoft ABI. And GCC couldn't do that. And |
| Clang also couldn't do that, but we thought if we teach Clang to do that, then |
| Clang basically can target all the operating systems we care about. And so we |
| made Clang work on Windows, also with others. But there was a team funded by |
| Chrome that worked on that for a few years. And also, Clang has pretty good |
| tooling interface. So for code search, we also use Clang. So we now use the |
| same code to compile Chrome and to index Chrome for code search. |
| |
| 17:28 SHARON: Oh, cool. I didn't know that either, so very interesting. OK. |
| We're just going to keep going back. And as you mention more things, we'll |
| cover that, and then go back to something you previously mentioned. So next on |
| the list is gclient sync. So I think for everyone who's ever worked on Chrome, |
| ever, especially at the start, you're like, I'll build Chrome. You build your |
| target, and you get these weird errors. And you look at it, and you think, oh, |
| this is in some random weird spot that I definitely didn't change. What's going |
| on? And you ask a senior team member, and they say to you, did you run gclient |
| sync? And you're like, oh, I did not. And then you run it, and suddenly, things |
| are OK. So what else is going - you mentioned a couple of things that happen. |
| So what exactly does gclient sync do? |
| |
| 18:13 NICO: Yeah. So as I - there's this file at the source root called DEPS, |
| D-E-P-S, all capital letters. And when you update - if you git pull the Chrome |
| repository, then that also updates the DEPS file. And then this DEPS file |
| contains a long list of revisions of dependent projects. And then when you run |
| gclient sync, it basically syncs all these other git repositories that are |
| mentioned in the DEPS file. And after that, it runs so-called hooks, which do |
| things like download a new Clang compiler and download a bunch of other binaries |
| from something called the CIPD, for example, GN. But yeah, basically makes sure |
| that all the dependencies that are in Chrome but that aren't in the Chrome |
| repository are also up to date. That's what it does. |
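
The "long list of revisions" can be pictured as a mapping from checkout paths to pinned repositories. The sketch below assumes the common `url@revision` string form for an entry; the URL, path, and revision are invented, and real DEPS files have more structure (variables, conditions, hooks) than is shown here.

```python
# A made-up, heavily simplified stand-in for the `deps` part of a DEPS file.
DEPS_SKETCH = {
    "src/third_party/libfoo/src": "https://example.com/libfoo.git@0123abcd",
}


def parse_pin(entry):
    """Split 'url@revision' into (url, revision)."""
    url, sep, revision = entry.rpartition("@")
    if not sep:
        raise ValueError(f"no pinned revision in {entry!r}")
    return url, revision


for path, entry in DEPS_SKETCH.items():
    url, revision = parse_pin(entry)
    print(f"{path}: sync {url} to {revision}")
```

Pulling a newer Chrome revision changes these pins, which is why a build can fail in code you never touched until the sync has been rerun.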
| |
| 19:06 SHARON: OK. Do you have a rough ballpark guess of how many dependencies |
| that includes? |
| |
| 19:13 NICO: It's operating system dependent. I think on Android we have way |
| more, but it's on the order of 200. Like, 150 to 250. |
| |
| 19:25 SHARON: Sounds like a lot. Very cool. OK. In terms of - speaking of other |
| dependencies, one of the top-level directories in Chrome is `//third_party`, |
| and that seems in the same kind of direction. So how does stuff in |
| `//third_party` work in terms of building? Can you just build them as targets? |
| What kind of stuff is in there? What can you and can you not build? Like, for |
| example, Blink is one of the things in `//third_party`, and lots of people - |
| that's a big part of it, right? But a lot of things in there are less active |
| and probably less big of a part of Chrome. So does `//third_party` just build |
| anything else, or what's different about it? |
| |
| 20:09 NICO: And that's a great question. So Blink being in `//third_party` is a |
| bit of a historical artifact. Like, most things - almost all of the things in |
| `//third_party` is basically code that's third-party code. That's code that we |
| didn't write ourselves. And Chrome's secret mission is to depend on every other |
| library out there in the world. No, we depend on things like libpng for reading |
| PNG files, libjpeg for reading all of - libjpeg-turbo these days, I guess, for |
| reading JPEG files, libxml for reading XML, and so on. And, well, that's many |
| dependencies. I won't list them all. And some of these third-party dependencies |
| are just listed in the DEPS file that we talked about. And so they basically - |
| like, when gclient sync runs, it pulls the code from some git repository that |
| contains the third-party code and puts it into your source tree. And for other |
| third-party code, we actually check in the code into the Chrome main repository |
| instead of DEPSing it in. There are trade-offs, which approach to choose. We do |
| both from time to time. But yeah. Almost no third-party dependency has a GN |
| file upstream, so usually what you do is you have to write your own BUILD.gn |
| file for the third-party dependency you want to add. And then after that, it's |
| fairly normal. So for a library, if you want to add a dependency on libfoo, |
| usually what we do is you add - you create `third_party/libfoo`, and you put |
| BUILD.gn in there. And then you add a DEPS entry that syncs the actual code to |
| `third_party/libfoo/src` or something. Yes. |
| |
| 21:37 SHARON: All right. Sounds good. Again, you mentioned BUILD.gn files, and |
| that's, as expected, a big part of how building works. And that's probably the |
| part that most people have interacted more with, outside of just actually |
| running whatever command it is to build Chrome. Because if you create, delete, |
| rename any files, you have to update it in some BUILD.gn file. So can you walk |
| us through the different things contained in a BUILD.gn file? What are all the |
| different parts? |
| |
| 22:12 NICO: Sure. So there's a great presentation by Brett, who wrote GN, that |
| we can link to. But in a summary, it's - BUILD.gn contains build targets, and |
| the build target normally is like - it doesn't have to be, but usually, it's a |
| list of CC files that belong together and that either make up a static library |
| or a shared library or an executable. So those are the main target types for CC |
| code. But then you can also have custom build actions that run just arbitrary |
| Python code, which, for example, if you compile a protobuf - proto files into |
| CC and H - into C++ and header files, then we basically have a Python script |
| that runs protoc, the proto compiler, to produce those. And so in that case, |
| the action generates C++ files, and then those get built. But the other, simple |
| answer is libraries or executables. |
| |
| 23:11 SHARON: OK. One part of GN files that has caused me personally some |
| confusion and difficulty - which I think is maybe, depending on the part of |
| Chrome you work on, less of an issue - is DEPS. So you have DEPS in your GN |
| files, and there's also something called external DEPS. And then you have |
| separate DEPS files that are just called capital D-E-P-S. |
| |
| 23:30 NICO: Yes. Yes, there, that's some redundant - that's, again, I guess for |
| historical reasons. So in GN, deps just means to build this target, you |
| first have to build these other targets. Like, this target depends - uses this |
| other code. And in different contexts, it kind of means different things. So |
| for example - I think if an executable depends on some other target, then that |
| external executable is linked - that other target is also linked in. If |
| `base_unittests` depends on the base library, which in a normal build is a static |
| library - like in a normal build? Like in a release build, by default, it's a |
| static library. And so if `base_unittests` is built, it first creates a static |
| library and then links to it. And then base itself might depend on a bunch of |
| third-party things, libraries, which means when `base_unittests` is linked, it |
| links base, but then it also links against base's dependencies. So that's one |
| meaning of DEPS. Another meaning, like these capital DEPS files, that's |
| completely distinct. Has nothing to do with GN, I'm sad to say. And that's just |
| for enforcing layering. Those predate GN, and they are for enforcing layering |
| at a fairly coarse level. They say, code in this directory may include code |
| from this other directory but not from this third directory. For example, a |
| third - like, Blink must not - may include stuff from base, but must not |
| include anything from, I don't know, the Chrome layer or something. |
| |
| 25:18 SHARON: Right, the classic content Chrome layering, where Chrome - |
| |
| 25:18 NICO: Right. And I think - |
| |
| 25:18 SHARON: content, but - |
| |
| 25:18 NICO: Right. And there's a step called checkdeps, and that checks the |
| capital DEPS files. |
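
The layering check can be sketched as matching an include path against `+` (allowed) and `-` (forbidden) prefixes. The rule list and the "longest matching prefix wins, deny by default" policy below are simplifying assumptions for illustration and are not a description of how the real checkdeps tool resolves rules.

```python
def include_allowed(include_path, rules):
    """Apply '+prefix' / '-prefix' rules; the longest matching prefix wins."""
    best_len, allowed = -1, False  # nothing matched yet: deny by default
    for rule in rules:
        sign, prefix = rule[0], rule[1:]
        matches = include_path == prefix or include_path.startswith(prefix + "/")
        if matches and len(prefix) > best_len:
            best_len, allowed = len(prefix), sign == "+"
    return allowed


# Hypothetical rules for some directory: base is fine, chrome is off limits.
RULES = ["+base", "-chrome"]

print(include_allowed("base/logging.h", RULES))
print(include_allowed("chrome/browser/foo.h", RULES))
```

A check like this works on `#include` lines alone, which is why it can run without touching the GN build graph.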
| |
| 25:24 SHARON: OK. Yeah, because before, I worked on some Fuchsia stuff, and |
| because we're adding a lot of new things, you're messing around with different |
| DEPS and stuff a lot more than I think if you worked in a typical part. Like, |
| now, I mostly just work within content. Unlikely that you're changing any |
| dependencies. But that was always a bit unclear because, for the most part, the |
| targets have very similar names - not exactly the same, but very similar. And |
| if you miss one, you get all these weird errors. And it was, yeah, generally |
| quite confusing. |
| |
| 25:55 NICO: Yeah, that's pretty confusing. One thing that the capital DEPS files |
| can do that the GN deps can't is if someone adds a dependency on your |
| library and they add an entry to their DEPS file, that means that now at code |
| review time, you need to approve that they depend on you. And that's not |
| something we can do at the GN level. And the advantage there is, I don't know, |
| if you have some library and then 50 teams start depending on it without |
| telling you, and now you're on the hook for keeping all these 50 things |
| working, then with this system, you at least have to approve every time someone |
| adds a dependency on you, you have to say, this is fine with me. Or you can |
| say, actually, this is - we don't want this to be used by anyone else. |
| |
| 26:45 SHARON: Is there an ideal state where we don't have these DEPS files and |
| maybe that functionality is built into the BUILD.gn files, or is this something |
| that's probably going to be sticking around for a while? |
| |
| 26:52 NICO: That's a great question. I don't know. It seems weird, right? It's |
| redundant. So I think the current system isn't ideal, but it's also not |
| horrible enough that we have to fix it immediately. So maybe one day we'll get |
| around to it. |
| |
| 27:10 SHARON: Yeah. I think I've mostly just worked on Chrome, so I've gotten |
| pretty used to it. But a common complaint is people who work in Google internal |
| things or other, bigger - the main build system of whatever company they work |
| at, they come to Chrome and they're like, oh, everything's so confusing. But if |
| you - you just got to get used to it, but - |
| |
| 27:27 NICO: Right. I think if you're confused by anything, it's great if you |
| come to us and complain. Because you kind of become blind to these problems, |
| right? I've been doing this for a long time. I'm used to all the foot guns. I |
| know how to dodge them. And yeah. So if you're confused by anything, please |
| tell me personally. And then if enough people complain about something, maybe |
| we'll fix it. |
| |
| 27:55 SHARON: All right. Yeah. That's what you said. The outcome of that - |
| we'll see. We'll see how that goes. We'll see how many complaints you suddenly |
| get. Right. OK. So another thing I was interested in is right now there's a lot |
| of work around Rust, getting more Rust things, introducing that, memory safety, |
| that's good. We like it. What is involved from a build perspective for getting |
| a whole other language into Chrome and into the build? Because we have most of |
| the things C++. There's some Java in all of the Android stuff. And in some |
| areas, you see - you'll see a list of - you'll see a file name, and then you'll |
| see file name underscore and then all the different operating systems, right? |
| And most those are some version of C++. The Mac ones are .mm. And you have Java |
| ones for Android. But if you want to add an entirely different language and |
| still be able to build Chrome, at a high level, what goes into that? |
| |
| 29:00 NICO: Yeah, there's also some Swift on iOS. It's many different things. |
| So at first, you have to teach GN how to generate Ninja files for that |
| language. So when a CC file is built, then basically the compiler writes out a |
| file that says, here are all the header files I depend on. So if one of them |
| gets touched, the compiler - or Ninja knows how to rebuild those. So you need |
| to figure out how the Rust compiler or the Swift compiler track dependencies. |
| You need to get that information out of the compiler into the build system |
| somehow. And C++ is fairly easy to build. It's like a per-file basis. I think |
| most languages are more on a module or package basis, where you build a few |
| files as a unit. Then you might want to think about, how can I make this work |
| with Goma so that the compilation can work remotely instead of locally? So |
| that's the build system part. Then also, especially for us, we want to use this |
| for some performance critical things, so it needs to be very fast. And we use a |
| bunch of toolchain optimization techniques to make Chrome very fast with |
| three-letter acronyms, such as PGO and LTO and whatnot. And LTO in particular, |
| that means a Link Time Optimization. That means the C++ or the Rust code is |
| built - is compiled into something called "bitcode." And then all the bitcode |
| files at link time are analyzed together so you can do cross-file inlining and |
| whatnot. And for that to work, the bitcodes - all the bitcode versions need to be |
| compatible, which means Clang and Rust need to be built against the same |
| version of LLVM, which is some - it's some internal compiler machinery that |
| defines the bitcode. So that means you have to - if you want to do |
| cross-language LTO, you have to update your C++ compiler and your Rust compiler |
| at the same time. And you have to build them at the same time. And when you |
| update your LLVM revision, it must break neither the C++ compiler nor the Rust |
| compiler. Yeah. And then you kind of want to build the Rust library from |
| source, so you have bitcode for all of that. So it's a fairly involved - but |
| yeah, we've been doing a lot of work on that. Not me, but other people. |
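
To make the dependency-tracking step concrete, here is a minimal Python sketch of what a build system does with the file the compiler writes out. It assumes a Makefile-style depfile (`output: input1 input2 ...`), which is the format C and C++ compilers commonly emit. The function names, the sample paths, and the `mtimes` dictionary standing in for file modification times are all hypothetical, and this is not how Ninja itself is implemented.

```python
def parse_depfile(text):
    """Parse a Makefile-style depfile of the form 'target: dep1 dep2'."""
    # Join backslash-continued lines, then split at the first colon.
    # (Paths that contain a colon, such as Windows drive letters, are not handled.)
    joined = text.replace("\\\n", " ")
    target, _, deps = joined.partition(":")
    return target.strip(), deps.split()


def needs_rebuild(output, deps, mtimes):
    """Return True if the output is missing or older than any dependency."""
    if output not in mtimes:
        return True
    # A dependency with no recorded mtime is treated as "changed".
    return any(mtimes.get(dep, float("inf")) > mtimes[output] for dep in deps)


# Hypothetical depfile contents for one translation unit.
DEPFILE = "obj/foo.o: foo.cc foo.h \\\n  base/bar.h\n"
target, deps = parse_depfile(DEPFILE)
```

With this in place, touching `base/bar.h` (giving it a newer timestamp than `obj/foo.o`) makes `needs_rebuild` return `True` for that one object file only. A language that compiles per module or package would need the same kind of information at that coarser granularity.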
| |
| 31:24 SHARON: Right. Sounds hard. And what does LTO stand for, since you used |
| it? |
| |
| 31:30 NICO: Link Time Optimization. |
| |
| 31:30 SHARON: All right. |
| |
| 31:30 NICO: And there's a blog post on the Chromium blog about this that we can |
| link to in the show notes that has a fairly understandable explanation what |
| this does. |
| |
| 31:43 SHARON: Yeah, all right. That sounds good. So linking, that was my next |
| question. As you build stuff, you sort out all of your just compile errors, you |
| got all your spelling mistakes out. The next type of error you might get is |
| linking error. So how does - can you tell us a bit more about linking in |
| general and how that fits into the build process? |
| |
| 32:01 NICO: I mean, linking - like, for C++, the compiler basically produces |
| one object file for every CC file. And then the linker takes, like, about |
| 50,000 to 100,000 object files and produces a single executable. And every |
| object file has a list of functions that are defined in that object file and a |
| list of functions that are undefined in that object file that it calls that are |
| needed from elsewhere. And then the linker basically makes one long list of all |
| the functions it finds. And at the end, all of them should be defined, and all |
| the non-inline ones should be defined in exactly one object file. And if |
| they're not - if that doesn't happen, then it emits an error, and else, it |
| emits a binary. And the linker is kind of interesting because the only thing |
| you really care about is that it does its job very quickly. But it has to read |
| through gigabytes of data before it writes the executable. And currently, we |
| use a linker called `lld`, which was also written by people on the Chrome team, |
| and which is also fairly popular outside of Chrome nowadays. And so we wrote our |
| own ELF linker, which is the file format used on Linux and Android, our own COFF |
| linker, which is the file format used on Windows, and our own Mach-O linker, |
| which is the file format on Apple - macOS and iOS. And our linkers are way, |
| way, way faster than the things that they replace. On Windows, we were, like, |
| 10 times faster than the Windows linker. And on Mac, we're, like, four times |
| faster than the system linker and whatnot. The other linker vendors have caught |
| up a little bit, but we - I feel like Chrome has really advanced the state and |
| performance of linking binaries across the industry, which I think is really |
| cool. |
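
The symbol bookkeeping described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: each object file is represented as a pair of sets (symbols it defines, symbols it needs from elsewhere), the `link` function and the object names are hypothetical, and inline functions, which may legitimately be defined in several object files, are ignored.

```python
from collections import defaultdict


def link(objects):
    """Check symbol resolution for a set of object files.

    `objects` maps an object-file name to (defined symbols, undefined symbols).
    Returns a sorted list of error strings; an empty list means the link succeeds.
    """
    # Build one long table of every definition the linker finds.
    definitions = defaultdict(list)
    for name, (defined, _) in objects.items():
        for symbol in defined:
            definitions[symbol].append(name)

    errors = []
    # Every (non-inline) symbol must be defined in exactly one object file.
    for symbol, owners in definitions.items():
        if len(owners) > 1:
            errors.append(f"duplicate symbol: {symbol} in {', '.join(sorted(owners))}")
    # Every symbol an object file needs must be defined somewhere.
    for name, (_, undefined) in objects.items():
        for symbol in undefined:
            if symbol not in definitions:
                errors.append(f"undefined symbol: {symbol} referenced by {name}")
    return sorted(errors)
```

For example, `link({"main.o": ({"main"}, {"helper"}), "util.o": ({"helper"}, set())})` returns an empty list. A real linker does the same check across tens of thousands of object files, which is why its speed matters so much.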
| |
| 33:44 SHARON: Yeah, that is really cool. And in a kind of similar vein to the |
| different OSes and all that kind of stuff is 32- versus 64-bit. There's some |
| stuff happening. I've seen people talk about it. It seems pretty important. Can |
| you just tell us a bit more about this in general? |
| |
| 34:04 NICO: Well, I guess most processors sold in the last decade or so are |
| 64-bit. So I think on some platforms, we only support 64-bit binaries, like - |
| and the bit just means how wide is a pointer and has some implications on which |
| instructions can the compiler use. But it's fairly transparent too, I think, at |
| the C++ level. You don't have to worry about it all that much. On macOS, we |
| only support 64-bit builds. Same on iOS. On Windows, we still have 32-bit and |
| 64-bit builds. On Linux, we don't publicly support 32-bit, but I think some |
| people try to build it. But it's really on Windows where you have both 32-bit |
| and 64-bit builds. But the default is 64-bit, and if you say |
| `target_cpu = "x86"`, I think, in your `args.gn`, then you get a 32-bit build. |
| But it should be fairly transparent to you as a developer, unless you write |
| assembly. |
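
As a small illustration of "how wide is a pointer", the following Python sketch reports the pointer width of whatever interpreter runs it. It only inspects the running process and says nothing about how Chrome itself was built; the function name `pointer_bits` is made up for this example.

```python
import struct


def pointer_bits():
    """Return the width, in bits, of a native pointer in this process."""
    # The "P" format code is a C `void *`; calcsize gives its size in bytes.
    return struct.calcsize("P") * 8
```

A 32-bit interpreter reports 32 and a 64-bit one reports 64.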
| |
| 35:02 SHARON: How big of an effort would it be to get rid of 32-bit on Windows? |
| Because Windows is probably the biggest Chrome-using platform, and also, |
| there's a lot of versions out there, right? So - |
| |
| 35:15 NICO: Oh, yeah. |
| |
| 35:15 SHARON: How doable? |
| |
| 35:15 NICO: I think that the biggest platform is probably Android. But yeah, |
| Android is also 32-bit, at least on some devices at the moment. That's true. I |
| don't know. I think we've looked into it and decided that we don't want to do |
| that at the moment. But I don't know details. |
| |
| 35:33 SHARON: And you mentioned ARM. So is there any - how much does the Chrome |
| build team - are they concerned with the architecture of these processors? Is |
| that something that, at the level that you and the build team have to worry |
| about, or is it far enough - a few layers down that that's - |
| |
| 35:47 NICO: It's something we have to worry about at the toolchain team. So we |
| update the Clang compiler every two weeks or so, which means we pull in all - |
| around 1,000 changes from upstream contributors that work on LLVM spread across |
| many companies. And we have to make sure this doesn't break Chrome on 32-bit ARM, |
| 64-bit ARM, 32-bit Intel, 64-bit Intel, across seven different operating |
| systems. And so fairly frequently, when we try to update Clang, tests start |
| failing on, I don't know, 32-bit Windows or on 64-bit iOS or some very specific |
| configuration. And then we have to go and debug and dissect and figure out |
| what's going on and work with upstream to get that fixed. So yeah. That's |
| something we have to deal with at the toolchain team, but hopefully, it's - |
| hopefully, like the normal Chrome developer is isolated from that for the most |
| part. |
| |
| 36:45 SHARON: I think so. It's not - if I weren't asking all these other |
| questions, it's something that almost never crosses my mind, right? So that |
| means you're all doing a very good job of that. Thank you very much. Much |
| appreciated. And jumping way back, you mentioned earlier indexing the code |
| base, code search. So I make a change. I submit it. I upload it. It eventually |
| ends up in code search. So how does that process work? And what goes into |
| indexing? Because before, when I was working on Fuchsia, all the Fuchsia code |
| wasn't indexed, so you couldn't do the handy thing of clicking a thing and |
| seeing where it was defined. You had to actually look it up. And once you got |
| that, it was like, oh my gosh, so much better. So can you just tell us a bit |
| more about that process? |
| |
| 37:30 NICO: Sure, yeah. Chrome has a pretty good code search feature, I |
| think, codesearch.chromium.org or cs.chromium.org. Basically, we have a bot |
| that runs, I think, every six hours or so, pulls the latest code, bundles it |
| up, sends it to some indexer service that then also uses Clang to analyze the |
| code. Like, for C++, I think we also index Java. We probably don't index Rust |
| yet, but eventually we will. And then it generates - for every word, it |
| generates metadata that says, this is a class. This is an identifier. And so if |
| you click on it, if you click on a function, you have the option of jumping to |
| the definition of the function, to the declaration, to all the calls, all the |
| overrides, and so on. And that updates ideally several times a day and is |
| fairly up to date. And we build the index, I think, for most operating systems. |
| So you can see this is called here on Linux, here on Windows, and whatnot. |
| |
| 38:32 SHARON: OK. Sounds good. Very useful stuff. And I don't know if this is |
| part of the build team's jurisdiction, but when you are working on things |
| locally, you have some git commands, and then you have some git-cl commands. |
| |
| 38:43 NICO: Mm-hmm. |
| |
| 38:48 SHARON: So the git commands are your typical ones - git pull, git rebase, |
| git stash, that kind of thing. And then you have git-cl commands, which relate |
| more to your actual CL in Gerrit. So git-cl upload, git-cl status. That'll show |
| you all your local branches and if they have a Gerrit change associated with |
| them. So what's the difference between git and git-cl commands? |
| |
| 39:18 NICO: I'm sorry. So this is basically a git feature. If you call git-foo, |
| then git looks for git-foo on your path. So you can add arbitrary commands to |
| git if you want to. And git-cl is just something that's in `depot_tools`. |
| Again, there's git-cl in `depot_tools`, and you can open that and see what it |
| does. And it'll redirect to `git_cl.py`, I think, which is a fairly long and |
| hairy Python script. But yeah. It's basically Gerrit integration, as you say. |
| So you can use that to send try jobs, `git cl try`. To upload, as you say, you |
| can use `git cl issue` to associate your current branch with a remote Gerrit |
| review, `git cl patch` to get a patch off Gerrit and patch it into your local |
| thing, `git cl web` to open the current thing in a web browser. Yeah, git-cl is |
| basically - `git cl help` to see all the git-cl commands, or - yeah. If you have |
| a change that touches, like, 1,000 files, you can run `git cl split`, and it'll |
| upload 500 reviews. But that's usually too granular, and I wouldn't recommend |
| doing that. But it's possible. |
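
The lookup rule for custom git subcommands can be sketched in Python as follows. This is a simplification: it only models the "search the path for `git-<name>`" step, whereas real git also checks its built-in commands and its own install directory. The helper name `find_git_subcommand` is hypothetical.

```python
import shutil


def find_git_subcommand(name, path=None):
    """Return the executable that `git <name>` could dispatch to, or None.

    `path` is an optional search path string; by default the PATH
    environment variable is used.
    """
    return shutil.which(f"git-{name}", path=path)
```

With `depot_tools` on the path, `find_git_subcommand("cl")` would be expected to point at the `git-cl` launcher inside it.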
| |
| 40:25 SHARON: Right. Do you have a - [DOORBELL DINGS] |
| |
| 40:25 NICO: Oops, sorry. |
| |
| 40:25 SHARON: commonly - yeah. |
| |
| 40:30 NICO: Oh, sorry. There was - the door just rang. Maybe you didn't hear |
| it. Sorry. |
| |
| 40:30 SHARON: All right. It's all good. Do you have a lesser known git or |
| git-cl command that you use a lot or - |
| |
| 40:41 NICO: Well, I - |
| |
| 40:41 SHARON: is your favorite? [LAUGHS] |
| |
| 40:46 NICO: It's not lesser known to me, so I wouldn't know. I don't know. I |
| use `git cl upload` a lot. |
| |
| 40:53 SHARON: Right. Well, you have to use `git cl upload`, right? |
| |
| 40:53 NICO: I use - |
| |
| 40:53 SHARON: Well, you don't - maybe not but - |
| |
| 40:53 NICO: `git cl try` to send try jobs from my terminal, `git cl web` to see |
| what's going on, `git cl patch` a lot to patch stuff in locally. If I'm doing a |
| code review and I want to play with it, I patch it in, build it locally, and see |
| how things are working. |
| |
| 41:12 SHARON: Yeah. When I patch in a thing, I go from the CL page on Gerrit |
| and then click the download patch thing, but - |
| |
| 41:21 NICO: No, even `git cl patch -b` and then some branch name, and then you |
| just patch - paste the Gerrit review URL. |
| |
| 41:28 SHARON: Oh, cool. |
| |
| 41:28 NICO: So it's just, yeah, Control-L to focus the URL bar. Control-C |
| Alt-Tab `git cl patch -b blah`, Paste, Enter, and then you have a local branch |
| with the thing. |
| |
| 41:36 SHARON: All right. Yeah, a lot of these things, once you learn about |
| them - at first you're like, whoa, and then you use them, and then they're not |
| lesser known to you, but you tell other people also a common - so another one |
| would be `git cl archive`, which will - |
| |
| 41:47 NICO: Oh, yeah, yeah. |
| |
| 41:47 SHARON: get rid of any local branches associated with a closed Gerrit |
| branch, so that's very handy, too. |
| |
| 41:53 NICO: Yes. |
| |
| 41:53 SHARON: So it's always fun to learn about things like that. |
| |
| 41:59 NICO: Are you fairly tidy with your branches? How many open branches do |
| you usually have? |
| |
| 41:59 SHARON: [LAUGHS] I used to be more tidy. When I tried to do a cleanup |
| thing, I had more branches. I think right now I've got around 20-something |
| branches. I like having not very many. I think to some people, that's a lot. To |
| some people, that's not very many. I mean, ideally, I have under five, right? |
| [LAUGHS] But - |
| |
| 42:18 NICO: I don't know. I usually have a couple 10, sometimes. Have a bunch |
| of machines. I think on some of them it's over 100, but yeah. Every now and |
| then, I run `git cl archive` and it removes half of them, but - |
| |
| 42:29 SHARON: Yes. All right, cool. Is there anything that we didn't cover so |
| far that you would like to share? So things that maybe you get asked all the |
| time, things that people could do better when it comes to build-related things? |
| Things that you can do that make the build better or don't make it worse, that |
| kind of thing? Or just anything else you would like to get out there? |
| |
| 42:58 NICO: I guess one thing that's maybe implicitly stated, but currently not |
| explicitly documented, as far as I know, but I'm hoping to change that, is - so |
| Chrome tries to have a quiet build. Like, if you build, there's zero build output, |
| except that one Ninja file, Ninja line that's changing, right? Whereas, well, |
| in other code bases - I think it's fairly common - there are many screenfuls |
| of warnings that scroll by. And we very explicitly try not to do that because if |
| the build emits lots of warnings, then people just learn to ignore warnings. So |
| we think something should either be a serious problem that people need to know |
| about, then it should be an error, or it should be not interesting. Then it |
| should be just quiet. So if you add a build step that runs a random script, the |
| script shouldn't print anything just about progress. Shouldn't say, doing |
| this, doing this, doing this. Should either print something and say something's |
| wrong and fail the build step or not say anything. So that's one thing. |
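
A minimal sketch of a build-step script that follows this rule might look like the Python below. The function `run_step` and its "inputs must end in `.txt`" check are invented for illustration; the point is only the shape: silent on success, an error message plus a nonzero exit code on failure, and no progress chatter in between.

```python
import sys


def run_step(inputs):
    """Hypothetical build step: returns an exit code, silent on success."""
    # Stand-in validation; a real step would do its actual work here.
    problems = [name for name in inputs if not name.endswith(".txt")]
    if problems:
        # Only speak up when something is wrong, and make it a hard error.
        print(f"error: unexpected inputs: {', '.join(problems)}", file=sys.stderr)
        return 1
    # Success: print nothing at all.
    return 0
```

A wrapper script would pass the result to `sys.exit`, so the build system sees the failure and stops, and a successful build stays at its one changing status line.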
| |
| 43:51 SHARON: That's - yeah, that's true. |
| |
| 43:51 NICO: And the other thing - |
| |
| 43:51 SHARON: Like, you only really get a bunch of terminal output if you have |
| a compile or a linker error, whatever. |
| |
| 43:57 NICO: Right. |
| |
| 43:57 SHARON: I hadn't ever considered that. If you build something and it |
| works, you get very few lines of output. And I hadn't ever thought that was |
| intentional before, but you're right in that if it was a ton, you would just |
| not look at any of it. So yeah, that's very cool. |
| |
| 44:09 NICO: Yeah. And on that same note, we don't do deprecation warnings |
| because we don't do any warnings. So if people - like, people like deprecating |
| things, but people don't like tidying up calls to deprecated functions. So if |
| you want to deprecate something in Chrome, the idea is basically, you remove |
| all callers, and then you remove the deprecated thing. And we don't allow you |
| to say - to add a warning that tells everyone, hey, please, everyone, remove |
| your calls. The onus is on the person who wants to deprecate something instead |
| of punting that to everyone else. |
| |
| 44:46 SHARON: Yeah, I mean, the thing that I was working on has a deprecating |
| effect, so removing callers, which is why I have so many branches. But I've |
| also seen presubmit warnings for if you include something deprecated. So - oh, |
| yeah, and there's presubmit, too. OK, we'll get to that also. [LAUGHS] Tell us |
| more about all of this. |
| |
| 45:05 NICO: About presubmits? Yeah, presubmits - presubmits are terrible. |
| That's the short summary. So if you run a `git cl presubmit`, it'll look at a |
| file called `PRESUBMIT.py`, I think, in the current directory, and maybe in all |
| the directories of files - of directories that contain files you touched or |
| something like that. But you can just open the top-level `PRESUBMIT.py` file, and |
| there's a couple thousand lines of Python where basically everyone can add |
| anything they want without much oversight, so it's a fairly long - at least |
| historically, that used to be the case. I don't know if that's still the case |
| nowadays. But yeah, it's basically like a long list of things that random |
| people thought are good if they - like, presubmits are something that are run |
| before you upload, also, implicitly. And so you're supposed to clean them up. |
| And [INAUDIBLE] many useful things. For example, nowadays we require most code |
| to be autoformatted so that people don't argue about where semicolons should go |
| or something silly like that. So one of the things it checks is, did you run |
| `git cl format`, which runs, I guess, clang-format for C++ code and a bunch of |
| custom Python scripts for other files. But it's also - presubmits have grown |
| organically, and there isn't - they're kind of unowned and they're very, very |
| slow. And I think some people have tried to improve them recently, and they're |
| better than they used to be, but I don't love presubmits, I guess is the |
| summary. But yeah, it's another thing to check invariants that we would like to |
| be true about our code base. |
| |
| 46:48 SHARON: Yeah. I mean, I think - yes, spelling is something I think it |
| also checks. |
| |
| 46:54 NICO: It checks spelling? OK. |
| |
| 46:54 SHARON: Or maybe that's a separate bot in Gerrit. |
| |
| 46:59 NICO: Oh, yeah, yeah, yeah, yeah. Like, there's this thing called - |
| what's its name? |
| |
| 47:06 SHARON: Trucium? Tricium? |
| |
| 47:06 NICO: Tricium, yeah. Tricium, right. Tricium is something that adds |
| comments to your - automatically adds comments to your change list when you |
| upload it. And Tricium can do spelling correction, but it can also - it runs |
| something called Clang Tidy, which is basically a static analysis engine which |
| has quite a few false positives, so sometimes it complains about something |
| that - but it's actually incorrect, and so we don't put that into the compiler |
| itself. So we've added a whole bunch of warnings to the compiler for things |
| that we think are fairly buggy. But Clang Tidy is - but these warnings have to |
| be - they have to have a very low false positive rate. Like, if they complain, |
| they should almost always be right. But sometimes, for static analysis, it's |
| hard to be right. Like, you can say this might be wrong. Please be sure. But |
| this is not something the compiler can say, so we have this other system called |
| Clang Tidy which also adds a comment to your C++ code which says, well, maybe |
| this should be a reference instead of a copy, and things like that. |
| |
| 48:04 SHARON: Yeah. And I think it - I've seen it - it checks for unused |
| variables and other - there's been useful stuff that's come from comments from |
| there, so definitely. All right. Very cool. So if people are interested in all |
| this build "infra-y" kind of stuff and they want to get more into it, what can |
| they do? |
| |
| 48:32 NICO: We have a public build@chromium.org mailing list. It's very low |
| volume, but if you want to reach out, you can send an email there and a few of |
| us will see your email and interact with you. And there's also I think the tag |
| Build on crbug. So you can just look for build bugs and fix all our bugs for |
| us. That'd be cool. |
| |
| 48:51 SHARON: [LAUGHS] |
| |
| 48:51 NICO: And if there's anything specific, just talk to local OWNERS. Or if |
| you feel this is just something you're generally interested in and you're |
| looking for a project, you can talk to me, and I probably have a long list of - |
| I do have a long list of somewhat beginner-friendly projects that people could |
| help out with, I guess. |
| |
| 49:15 SHARON: Yeah. I mean, I think being able to - if you're looking for a |
| 20%y kind of project or something else. But knowing how things actually get put |
| together is always a good skill and definitely applicable to other things. It's |
| the kind of thing where the more low-level knowledge you have, the more - it |
| works - it applies to things higher up, but not necessarily the other way |
| around, right? |
| |
| 49:34 NICO: Mm-hmm. |
| |
| 49:34 SHARON: So having that kind of understanding is definitely a good thing. |
| All right. Any last things you'd like to mention or shout out or cool things |
| that you want people to know about? [LAUGHS] |
| |
| 49:48 NICO: I guess - |
| |
| 49:48 SHARON: Or what - yeah, quickly, what is the future of the whole build |
| thing? Like, what's the ideal situation if - |
| |
| 49:55 NICO: Ideally, it'll all be way faster, I guess is the main thing. But |
| yeah, yeah, I think build speed is a big problem. And I'm not sure we have the |
| best handle on that. We're working on many things, but - not many. A bunch of |
| things. But it's - like, people keep adding so much code, so if y'all |
| could delete some code, too, that would help us a lot. I mean, having - |
| supporting more languages is something we have to - this is something that's |
| happening. Like, Rust is happening. On iOS, we are also using Swift. |
| Currently, we can't LTO Swift with the rest because that's on a different LLVM |
| version. There's this - in C++ - we keep upgrading C++ versions. So Peter |
| Kasting is working on moving us to C++20. And then 23, we'll have them, and so |
| on. There's maybe C++ modules at some point, which may or may not help with |
| build speed. And there's a bunch of tech debt that we need to clean up, but |
| that's not super interesting. |
| |
| 51:24 SHARON: I don't know. I think people in Chrome in general are more |
| interested and care about reducing tech debt in general, right? A lot of people |
| I know would be happy to just do tech debt clean-up things only, right? |
| Unfortunately, it doesn't really work out for job reasons. But a lot of people, |
| I think, are interested in, I think, in higher proportions than maybe other |
| places. |
| |
| 51:47 NICO: It depends on the tech debt. Some of it might work out for job |
| reasons. But, yeah. |
| |
| 51:54 SHARON: Yeah. I mean, some of it is easier than others, too, right? Some |
| of it is like, yeah, so, OK, well, go delete some code. Go clean up some |
| deprecated calls. [LAUGHS] All that. |
| |
| 52:08 NICO: Yeah, and again, I think finishing migrations is way harder than |
| starting them, so finish more migrations, start fewer migrations. That'd be |
| also cool. |
| |
| 52:16 SHARON: All right. I am sure everyone listening will go and do that right |
| away. |
| |
| 52:21 NICO: Yep. |
| |
| 52:21 SHARON: And things will immediately be better. |
| |
| 52:27 NICO: They've just been waiting to hear that from me, and now they're |
| like, ah, yeah, right. That makes sense. |
| |
| 52:27 SHARON: Yeah, yeah. All right. Well, you all heard it here first. Go do |
| that. Things will be better, et cetera. So all right. Well, thank you very |
| much, Nico, for being here answering all these questions. I learned a lot. A |
| lot of this is stuff that - everyone who works on Chrome builds Chrome, right? |
| But you can get by with a very minimal understanding of how these things are. |
| Like, you see your - you follow the Intro to Building Chrome doc. You copy the |
| things. You're like, OK, this works. And then you just keep doing that until |
| you have a problem. And depending on where you work, you might not have |
| problems. So it's very easy to know very little about this. But obviously, it's |
| so important because if we didn't have any of this infrastructure, nothing |
| would work. So one, I guess, thank you for doing all the stuff behind the |
| scenes, determinism, OSes, all that, making it a lot easier for everyone else, |
| but also thank you for sharing about it so people understand what's actually |
| going on when they run the commands they do every day. |
| |
| 53:31 NICO: Sure. Anytime. Thanks for having me. And it's good to hear that |
| it's possible to work on Chrome without knowing much about the build because |
| that's the goal, right? It should just work. |
| |
| 53:44 SHARON: Yeah. |
| |
| 53:44 NICO: Sometimes it does. |
| |
| 53:44 SHARON: [LAUGHS] Yeah. Well, thank you for all of it, and see you next |
| time. |
| |
| 53:51 NICO: Yeah. See you on the internet. Bye. |
| |
| 54:03 SHARON: OK. So we will stop recording - |
| |
| 54:03 NICO: Wee. Time for the second take. |
| |
| 54:03 SHARON: [LAUGHS] Let's do that, yeah, all over again. |
| |
| 54:11 NICO: Let's do it. |
| |
| 54:11 SHARON: I will stop recording. |