Reland "Use database as cache for passages and embeddings to bypass embedder"

This is a reland of commit 22c986f0ce1a5ef2efa704a2093a4d47eb18a63e

The feature param that makes the service use the database before the embedder is turned off by default to avoid the crash in https://crbug.com/347685218. I will continue debugging the crash.
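
For reviewers, the gated behavior can be sketched roughly as below. This is a minimal standalone illustration, not the code in this CL: `CacheParams`, `StoredEmbeddings`, and `ComputeEmbeddings` are hypothetical names, the real feature param is not a plain struct field, and keying the stored embeddings by passage text is an assumption made only to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

using Embedding = std::vector<float>;

// Hypothetical stand-in for the feature param. It defaults to off, matching
// the default described in this reland.
struct CacheParams {
  bool use_database_before_embedder = false;
};

// Hypothetical stand-in for passages and embeddings already stored in the
// database, keyed here by passage text (an assumption of this sketch).
using StoredEmbeddings = std::unordered_map<std::string, Embedding>;

// Returns one embedding per passage, in order. The embedder is invoked only
// for passages that are not served from `stored`.
std::vector<Embedding> ComputeEmbeddings(
    const std::vector<std::string>& passages,
    const StoredEmbeddings& stored,
    const CacheParams& params,
    const std::function<Embedding(const std::string&)>& embedder) {
  std::vector<Embedding> result;
  result.reserve(passages.size());
  for (const std::string& passage : passages) {
    if (params.use_database_before_embedder) {
      auto it = stored.find(passage);
      if (it != stored.end()) {
        // Cache hit: reuse the stored embedding and skip the model.
        result.push_back(it->second);
        continue;
      }
    }
    // Param off, or no stored embedding: fall back to the embedder.
    result.push_back(embedder(passage));
  }
  assert(result.size() == passages.size());
  return result;
}
```

With the param off, every passage goes through the embedder exactly as before the original change, which is why the default is safe while the crash is investigated.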

Original change's description:
> Use database as cache for passages and embeddings to bypass embedder
>
> This CL saves the embedder from having to recompute embeddings for
> passages that are already stored in the database when a visit is
> made to the same URL. Instead of computing the embedding with the
> model, the stored embedding is reused.
>
> Additionally, this CL adds histogram logging for cancellation of
> passage extraction, embedding, and storage. Various reason codes
> are logged so we can identify the step at which cancellation occurred.
> The tab helper should dominate, but later cancellation in the service
> might signal problems or opportunities to optimize.
>
> Also, this CL fixes b/347306992 by using the live loading tab count
> instead of a member value updated via notifications that could
> be missed when a loading tab is closed. (Random bug noticed during
> development.)
>
> Bug: 345819418
> Change-Id: Icaacf456717b9d88c0e14b3c8f3a1e31835440c6
> Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5629651
> Commit-Queue: Orin Jaworski <orinj@chromium.org>
> Reviewed-by: Sophie Chang <sophiechang@chromium.org>
> Reviewed-by: Orin Jaworski <orinj@chromium.org>
> Code-Coverage: findit-for-me@appspot.gserviceaccount.com <findit-for-me@appspot.gserviceaccount.com>
> Cr-Commit-Position: refs/heads/main@{#1315485}
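
The loading tab count fix mentioned in the original description can be illustrated with a small standalone sketch. The names `Tab`, `NotifiedLoadingCount`, and `CountLoadingTabs` are hypothetical and are not the classes touched by the CL; the sketch only shows why a notification-driven counter can drift while a live count cannot.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical tab model; only the loading state matters for this sketch.
struct Tab {
  bool is_loading = false;
};

// Notification-driven counter. If a tab is closed while loading and no
// "finished" notification arrives, the count stays too high.
struct NotifiedLoadingCount {
  std::size_t count = 0;
  void OnLoadStarted() { ++count; }
  void OnLoadFinished() {
    if (count > 0) {
      --count;
    }
  }
};

// Live count: derived from the current tab list on every call, so a tab
// that was closed mid-load simply no longer contributes.
std::size_t CountLoadingTabs(const std::vector<Tab>& tabs) {
  return static_cast<std::size_t>(
      std::count_if(tabs.begin(), tabs.end(),
                    [](const Tab& tab) { return tab.is_loading; }));
}
```

If a loading tab is erased from the list without `OnLoadFinished()` being called, `NotifiedLoadingCount::count` remains 1 while `CountLoadingTabs()` returns 0.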

Bug: 345819418
Change-Id: I3e15c76133ef60add563224ebb2dcbb4c0ca5cdf
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5656278
Reviewed-by: Sophie Chang <sophiechang@chromium.org>
Commit-Queue: Raj T <rajendrant@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1321126}
16 files changed
tree: 2ed30559b2c30b210350c0a3412c774c5798b71f