recorder: chunk content before input into T&S model

To prevent out-of-memory errors when processing long input sequences
with the T&S model (observed with inputs exceeding 8k tokens),
this change implements content chunking.

Instead of processing the entire input at once, the content is divided
into smaller chunks, and safety is verified individually for each chunk.
This approach mitigates excessive memory consumption by the T&S model,
particularly for longer inputs.
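
The approach above can be sketched roughly as follows. This is a minimal
illustration, not the code from this change: `chunkContent`, `isContentSafe`,
the `SafetyCheck` callback, and the use of a whitespace-separated word count
as a stand-in for the model's token count are all assumptions.

```typescript
// Hypothetical sketch: all names and the word-based size limit are
// assumptions for illustration, not taken from the actual change.
type SafetyCheck = (text: string) => Promise<boolean>;

// Split content into chunks of at most `maxWords` whitespace-separated words.
function chunkContent(content: string, maxWords: number): string[] {
  if (maxWords <= 0) {
    throw new RangeError("maxWords must be positive");
  }
  const words = content.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push(words.slice(i, i + maxWords).join(" "));
  }
  return chunks;
}

// The content is treated as safe only if every chunk passes the check.
// Chunks are checked one at a time, so the model only ever sees one
// bounded-size input; the loop stops at the first failing chunk.
async function isContentSafe(
  content: string,
  maxWords: number,
  check: SafetyCheck,
): Promise<boolean> {
  for (const chunk of chunkContent(content, maxWords)) {
    if (!(await check(chunk))) {
      return false;
    }
  }
  return true;
}
```

Checking chunks sequentially rather than in parallel keeps peak memory
bounded by a single chunk, at the cost of some latency. A fixed split can
also separate text that is only problematic when read together, so a real
implementation might choose its boundaries more carefully.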

In testing on Navi with an 11k-token input, the safety check takes around
2 seconds after chunking.

BUG=b:385078550
TEST=manual; test on brya with Xss model & rauru with Xs model

Change-Id: I3bd1ebe2031fd8603d0deaa2fab30de1e21445ff
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/6108466
Reviewed-by: Pi-Hsun Shih <pihsun@chromium.org>
Commit-Queue: Yuan-Chieh Cheng <yuanchieh@google.com>
Reviewed-by: Jennifer Ling <hsuanling@google.com>
Cr-Commit-Position: refs/heads/main@{#1404001}
2 files changed
tree: e8fd77f3ab646a620bfdeae0bf5f9fbf1747606d
README.md

Chromium

Chromium is an open-source browser project that aims to build a safer, faster, and more stable way for all users to experience the web.

The project's web site is https://www.chromium.org.

To check out the source code locally, don't use git clone! Instead, follow the instructions on how to get the code.

Documentation in the source is rooted in docs/README.md.

Learn how to Get Around the Chromium Source Code Directory Structure.

For historical reasons, there are some small top-level directories. The current guidance is that new top-level directories are for products (e.g. Chrome, Android WebView, Ash). Even if these products have multiple executables, the code should be in subdirectories of the product.

If you found a bug, please file it at https://crbug.com/new.