This document describes code under //content/browser/downloads, restricting the scope to code that handles Save-Page-As functionality (i.e. leaving out other downloads-related code). It focuses on a high-level overview and on aspects of the code that span multiple compilation units (in the hope that individual compilation units are described by their code comments or by their code structure).
SavePackage class
    WebContents (ref-counted today, but it is unnecessary - see https://crbug.com/596953)
SaveFileCreateInfo::SaveFileSource enum
    SaveItem and SaveFile processing into 2 flavours:
        SAVE_FILE_FROM_NET (see SaveFileResourceHandler)
        SAVE_FILE_FROM_DOM (see “Complete HTML” section below)
SaveItem class
    SavePackage
SaveFileManager class
    SavePackage and communicates results back to SavePackage on the UI thread.
    SaveFileManager::UpdateSaveProgress
    ResourceDispatcherHostImpl (ref-counted today, but it is unnecessary - see https://crbug.com/596953)
SaveFile class
    SaveFileManager
SaveFileResourceHandler class
    SaveFileManager (onto FILE-thread)
    ResourceDispatcherHostImpl::BeginSaveFile
SaveFileCreateInfo POD struct
MHTMLGenerationManager class
    MHTMLGenerationManager::Job
Save-Page-As flow starts with WebContents::OnSavePage. The flow is different depending on the save format chosen by the user (each flow is described in a separate section below).
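As a rough illustration of how the flow branches on the chosen save format, the following minimal sketch maps each format to the flow summarized in the sections below. The SaveFormat enum and DescribeFlow function are hypothetical names invented for this example; they are not the actual //content types.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum for illustration only; not the real //content type.
enum class SaveFormat { kHtmlOnly, kCompleteHtml, kMhtml };

// Returns a one-line summary of the flow that a save request would follow,
// based on the three flows described in this document.
std::string DescribeFlow(SaveFormat format) {
  switch (format) {
    case SaveFormat::kHtmlOnly:
      return "single SAVE_FILE_FROM_NET SaveItem via SaveFileManager";
    case SaveFormat::kCompleteHtml:
      return "SAVE_FILE_FROM_NET items first, then SAVE_FILE_FROM_DOM items";
    case SaveFormat::kMhtml:
      return "MHTMLGenerationManager::Job, one frame at a time";
  }
  // Not reachable for valid enum values.
  assert(false && "unknown SaveFormat");
  return "";
}
```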
Very high-level flow of saving a page as “Complete HTML”:
Step 1: SavePackage asks all frames for “savable resources” and creates a SaveItem for each of the files that need to be saved.
Step 2: SavePackage first processes SAVE_FILE_FROM_NET SaveItems and asks SaveFileManager to save them.
Step 3: SavePackage handles the remaining SAVE_FILE_FROM_DOM SaveItems and asks each frame to serialize its DOM/HTML (each frame gets from SavePackage a map covering local paths that need to be referenced by the frame). Responses from frames get forwarded to SaveFileManager to be written to disk.
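The ordering in steps 2 and 3 (network items first, DOM items afterwards) can be sketched as a stable partition over the list of items. This is only an illustration under assumptions: SaveItemSketch, SaveFileSource (as an enum class) and OrderForProcessing are hypothetical names, and the real SavePackage may track its items quite differently.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Mirrors the two flavours named in this document; the types are hypothetical.
enum class SaveFileSource { kFromNet, kFromDom };

struct SaveItemSketch {
  std::string url;
  SaveFileSource source;
};

// True for items that would be saved straight from the network (step 2).
bool IsFromNet(const SaveItemSketch& item) {
  return item.source == SaveFileSource::kFromNet;
}

// Returns the items in processing order: all net items first (step 2), then
// all DOM items (step 3). Relative order within each group is preserved.
std::vector<SaveItemSketch> OrderForProcessing(
    std::vector<SaveItemSketch> items) {
  std::stable_partition(items.begin(), items.end(), IsFromNet);
  // Postcondition: no DOM item precedes a net item.
  assert(std::is_partitioned(items.begin(), items.end(), IsFromNet));
  return items;
}
```

Keeping the DOM items last matches the text above: a frame can only be serialized once the local paths it needs to reference are known.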
Very high-level flow of saving a page as MHTML:
Step 1: WebContents::GenerateMHTML is called by SavePackage (for Save-Page-As UI), by Extensions (via the chrome.pageCapture extensions API), or by an embedder of WebContents (since this is a public API of //content).
Step 2: MHTMLGenerationManager creates a new instance of MHTMLGenerationManager::Job that coordinates generation of the MHTML file by sequentially (one-at-a-time) asking each frame to write its portion of MHTML to a file handle. Other classes (i.e. SavePackage and/or SaveFileManager) are not used at this step at all.
Step 3: When done, MHTMLGenerationManager destroys the MHTMLGenerationManager::Job instance and calls a completion callback, which in the case of Save-Page-As ends up in SavePackage::OnMHTMLGenerated.
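The sequential, one-frame-at-a-time coordination in steps 2 and 3 can be sketched roughly as follows. FrameSketch and MhtmlJobSketch are hypothetical stand-ins written for this example; the real job talks to renderer processes asynchronously and writes to a file handle, whereas this sketch appends to an in-memory string.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a frame; real frames live in renderer processes.
struct FrameSketch {
  std::string name;
};

// Hypothetical stand-in for MHTMLGenerationManager::Job: it visits frames one
// at a time, appends each frame's portion to a shared output, and runs the
// completion callback exactly once after the last frame.
class MhtmlJobSketch {
 public:
  MhtmlJobSketch(std::vector<FrameSketch> frames,
                 std::function<void(const std::string&)> on_done)
      : frames_(std::move(frames)), on_done_(std::move(on_done)) {}

  void Run() {
    assert(!finished_ && "a job runs only once");
    for (const FrameSketch& frame : frames_) {
      // Only one frame writes at any time; the next frame starts only after
      // the current one has finished its portion.
      output_ += "[part:" + frame.name + "]";
    }
    finished_ = true;
    on_done_(output_);
  }

 private:
  std::vector<FrameSketch> frames_;
  std::function<void(const std::string&)> on_done_;
  std::string output_;
  bool finished_ = false;
};
```

In the Save-Page-As case, the callback passed as on_done would play the role that SavePackage::OnMHTMLGenerated plays in the text above.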
Note: MHTML format is disabled by default in the Save-Page-As UI on Windows, macOS and Linux (it is the default on ChromeOS), but for testing this can be easily changed using the --save-page-as-mhtml command line switch.
Very high-level flow of saving a page as “HTML Only”:
SavePackage creates only a single SaveItem (always SAVE_FILE_FROM_NET) and asks SaveFileManager to process it (as in the Complete HTML individual SaveItem handling above).
Pointers to related code outside of //content/browser/download:
End-to-end tests:
    //chrome/browser/downloads/save_page_browsertest.cc
    //chrome/test/data/save_page/...
Other tests:
    //content/browser/downloads/*test*.cc
    //content/renderer/dom_serializer_browsertest.cc - single process... :-/
Elsewhere in //content:
    //content/renderer/savable_resources...
Blink:
    //third_party/WebKit/public/web/WebFrameSerializer...
    //third_party/WebKit/Source/web/WebFrameSerializerImpl... (used for Complete HTML today; should use FrameSerializer instead in the long-term - see https://crbug.com/328354).
    //third_party/WebKit/Source/core/frame/FrameSerializer... (used for MHTML today)
    //third_party/WebKit/Source/platform/mhtml/MHTMLArchive...