Lookalikes: Compute hostname variants with and without diacritics

Presently, we generate supplemental hostnames from an input hostname
after removing the hostname's diacritics. This allows us to normalize
a hostname so that attackers can't evade lookalike protections by adding
diacritics.

However, some characters in a hostname can become confusable with other
characters when diacritics are added to them. For example, LATIN SMALL
LETTER L is normally not confusable with "t", but LATIN SMALL LETTER L
WITH STROKE (ł) is on some fonts and platforms.

The solution for this would be to add a multiple skeleton mapping for ł
so that the generated supplemental hostnames would have both "l" and "t"
variants in them. This currently doesn't work, though: because of the
diacritic removal mentioned above, a hostname like łest[.]com is passed
to supplemental hostname generation as lest[.]com, so the ł mapping is
never consulted.

This CL runs the supplemental hostname generation over both the
diacritic and non-diacritic versions of the hostname. This allows us to
generate both lest[.]com and test[.]com as supplemental hostnames.
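
As a rough illustration of the idea (not the actual Chromium code), the
sketch below uses two hypothetical one-entry tables, DIACRITIC_FOLD and
SKELETON_MAP, and hypothetical helper names. It only shows why running
generation on the diacritic-stripped hostname alone loses the "t"
variant, and how running it over both versions recovers it.

```python
from itertools import product

# Hypothetical, tiny stand-ins for the real tables; assumptions for
# illustration only.
DIACRITIC_FOLD = {"ł": "l"}          # diacritic removal: ł -> l
SKELETON_MAP = {"ł": ["l", "t"]}     # multiple skeleton mapping for ł


def remove_diacritics(hostname):
    """Replace each character that has a fold entry with its base letter."""
    return "".join(DIACRITIC_FOLD.get(ch, ch) for ch in hostname)


def supplemental_hostnames(hostname):
    """Expand every character into all of its mapped skeleton characters."""
    choices = [SKELETON_MAP.get(ch, [ch]) for ch in hostname]
    return {"".join(combo) for combo in product(*choices)}


def hostname_variants(hostname):
    """Run generation over both the diacritic and non-diacritic versions."""
    variants = set()
    for candidate in (hostname, remove_diacritics(hostname)):
        variants |= supplemental_hostnames(candidate)
    return variants


# Stripping first hides ł from the skeleton map, so only "lest.com" appears.
print(sorted(supplemental_hostnames(remove_diacritics("łest.com"))))
# Running over both versions yields the "l" and "t" variants.
print(sorted(hostname_variants("łest.com")))
```

In this toy model the union of the two runs is what makes the ł mapping
take effect; hostnames without mapped characters are unaffected.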

As a result of this change, we'll be able to prevent additional spoofs
that use characters with diacritics that are confusable with other
characters.

Bug: 1250993
Change-Id: I592602281bffd7f75fc97f32280a46ffcf8fd3ed
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3598360
Reviewed-by: Joe DeBlasio <jdeblasio@chromium.org>
Commit-Queue: Mustafa Emre Acer <meacer@chromium.org>
Cr-Commit-Position: refs/heads/main@{#994949}
2 files changed
tree: bfc196b2aad2699d8153f9fc02d3d4feb82a192f