multiword: remove invalid utf-8 bytes of the preceding text

Currently we are using TrimText to get the most recent 100 "characters" (more correctly it is 100 bytes). This will often cause invalid utf-8 characters, for example, if the cutoff is at the middle of an emoji. This causes suggestion problem for Multi-word prediction.

This CL ensures that the invalid utf-8 chars are removed at the cutoff position.

Change-Id: I12eb93619bd82a1250c742573ce3653a93a79012
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4546450
Commit-Queue: Chuong Ho <hdchuong@google.com>
Reviewed-by: Mehrab N <mehrab@chromium.org>
Reviewed-by: Darren Shen <shend@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1147744}
4 files changed
tree: f29210e52ed74a01ace57efe05b6522fee0dac07
  1. android_webview/
  2. apps/
  3. ash/
  4. base/
  5. build/
  6. build_overrides/
  7. buildtools/
  8. cc/
  9. chrome/
  10. chromecast/
  11. chromeos/
  12. codelabs/
  13. components/
  14. content/
  15. courgette/
  16. crypto/
  17. dbus/
  18. device/
  19. docs/
  20. extensions/
  21. fuchsia_web/
  22. gin/
  23. google_apis/
  24. google_update/
  25. gpu/
  26. headless/
  27. infra/
  28. ios/
  29. ipc/
  30. media/
  31. mojo/
  32. native_client_sdk/
  33. net/
  34. pdf/
  35. ppapi/
  36. printing/
  37. remoting/
  38. rlz/
  39. sandbox/
  40. services/
  41. skia/
  42. sql/
  43. storage/
  44. styleguide/
  45. testing/
  46. third_party/
  47. tools/
  48. ui/
  49. url/
  50. weblayer/
  51. .clang-format
  52. .clang-tidy
  53. .eslintrc.js
  54. .git-blame-ignore-revs
  55. .gitattributes
  56. .gitignore
  57. .gn
  58. .mailmap
  59. .rustfmt.toml
  60. .vpython3
  61. .yapfignore
  62. ATL_OWNERS
  63. AUTHORS
  64. BUILD.gn
  65. CODE_OF_CONDUCT.md
  66. codereview.settings
  67. DEPS
  68. DIR_METADATA
  69. LICENSE
  70. LICENSE.chromium_os
  71. OWNERS
  72. PRESUBMIT.py
  73. PRESUBMIT_test.py
  74. PRESUBMIT_test_mocks.py
  75. README.md
  76. WATCHLISTS
README.md

Logo Chromium

Chromium is an open-source browser project that aims to build a safer, faster, and more stable way for all users to experience the web.

The project's web site is https://www.chromium.org.

To check out the source code locally, don't use git clone! Instead, follow the instructions on how to get the code.

Documentation in the source is rooted in docs/README.md.

Learn how to Get Around the Chromium Source Code Directory Structure .

For historical reasons, there are some small top level directories. Now the guidance is that new top level directories are for product (e.g. Chrome, Android WebView, Ash). Even if these products have multiple executables, the code should be in subdirectories of the product.

If you found a bug, please file it at https://crbug.com/new.