Chrome Screen AI Library

Purpose

The ScreenAI service provides accessibility helpers. It is downloaded and initialized on demand, and stays on disk for 30 days after its last use.
The service is created per profile and stays alive as long as the profile lives.
See services/screen_ai/README.md for more.

How to Use for OCR

Depending on your use case restrictions, choose one of the following approaches.

  1. If you are adding a new client for OCR, add a new enum value to screen_ai::mojom::OcrClientType; otherwise choose an appropriate existing value to use in the next steps.
  2. Join the chrome-ocr-clients@ group to get notified of major updates.
  3. Using OpticalCharacterRecognizer::CreateWithStatusCallback, create an OCR object and wait until the callback is called. This triggers download and startup of the service (if needed) and reports the result.
    Once the callback is called with true, use OpticalCharacterRecognizer::PerformOCR.
    The object can only be created on the UI thread.
  4. If you cannot use the callback, create the object using OpticalCharacterRecognizer::Create and poll OpticalCharacterRecognizer::is_ready until it reports that the service is ready.
    Then use OpticalCharacterRecognizer::PerformOCR as above.
    The object can only be created on the UI thread.
  5. If neither of the above works, call screen_ai::ScreenAIServiceRouterFactory::GetForBrowserContext in the browser process and call GetServiceStateAsync on the returned router to trigger library download and service initialization; the result is delivered in a callback.
    Once you know the service is ready, connect to the screen_ai::mojom::ScreenAIAnnotator interface from your process.
    Before calling any of the PerformOCR functions, call SetClientType once to set the client type.
    For an example see components/pdf/renderer/pdf_ocr_helper.cc.
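
The callback-based and polling-based flows above can be sketched as follows. This is a minimal, self-contained illustration only: FakeRecognizer, SimulateInitialization, and RecognizeWhenReady are hypothetical stand-ins, and their signatures are assumptions rather than the real OpticalCharacterRecognizer API. Check the actual headers before writing client code.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for illustration only; this is NOT the real
// OpticalCharacterRecognizer, and its signatures are assumptions.
class FakeRecognizer {
 public:
  using StatusCallback = std::function<void(bool)>;

  // Callback flavor: `callback` runs once initialization succeeds or fails.
  static std::unique_ptr<FakeRecognizer> CreateWithStatusCallback(
      StatusCallback callback) {
    std::unique_ptr<FakeRecognizer> recognizer(new FakeRecognizer());
    recognizer->status_callback_ = std::move(callback);
    return recognizer;
  }

  // Polling flavor: no callback; the caller checks is_ready() later.
  static std::unique_ptr<FakeRecognizer> Create() {
    return std::unique_ptr<FakeRecognizer>(new FakeRecognizer());
  }

  bool is_ready() const { return ready_; }

  // Test hook that simulates the service finishing download and startup.
  void SimulateInitialization(bool success) {
    ready_ = success;
    StatusCallback callback = std::exchange(status_callback_, nullptr);
    if (callback) callback(success);
  }

  // Returns placeholder text; only valid once the recognizer is ready.
  std::string PerformOCR(const std::string& image_name) const {
    assert(ready_);
    return "recognized:" + image_name;
  }

 private:
  FakeRecognizer() = default;
  bool ready_ = false;
  StatusCallback status_callback_;
};

// Callback-based flow: run OCR only after the callback reports success.
std::string RecognizeWhenReady(const std::string& image_name) {
  std::string result;
  std::unique_ptr<FakeRecognizer> recognizer;
  recognizer = FakeRecognizer::CreateWithStatusCallback([&](bool successful) {
    if (successful) result = recognizer->PerformOCR(image_name);
  });
  recognizer->SimulateInitialization(true);  // Stands in for real startup.
  return result;
}
```

In this sketch the request is issued from inside the status callback, so PerformOCR is never reached while the service is still starting. The polling flavor instead requires the caller to check is_ready() before each use.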

How to Use Main Content Extraction

If you are adding a new client for Main Content Extraction (MCE), add a new enum value to screen_ai::mojom::MceClientType.
In the browser process, call screen_ai::ScreenAIServiceRouterFactory::GetForBrowserContext and call GetServiceStateAsync on the returned router to trigger library download and service initialization; the result is delivered in a callback.
Once you know the service is ready, connect to the screen_ai::mojom::Screen2xMainContentExtractor interface from your process.
Call SetClientType once to set the client type.
For an example see chrome/renderer/accessibility/ax_tree_distiller.cc.
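
The ordering described above (ask for the service state, connect once it is ready, set the client type once, then send requests) can be sketched with stand-in types. Everything in this block is hypothetical: FakeRouter, FakeExtractor, FakeMceClientType, and ExampleMceClient are not Chromium or mojom types, and the "even ids are main content" logic is a placeholder.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these are real Chromium or mojom types.
enum class FakeMceClientType { kUnset, kExampleClient };

class FakeRouter {
 public:
  using StateCallback = std::function<void(bool)>;

  // Queues `callback` until the service state is known.
  void GetServiceStateAsync(StateCallback callback) {
    pending_.push_back(std::move(callback));
  }

  // Test hook: reports the service state to every waiting caller.
  void SimulateServiceState(bool ready) {
    std::vector<StateCallback> callbacks = std::exchange(pending_, {});
    for (StateCallback& callback : callbacks) callback(ready);
  }

 private:
  std::vector<StateCallback> pending_;
};

class FakeExtractor {
 public:
  void SetClientType(FakeMceClientType type) {
    assert(client_type_ == FakeMceClientType::kUnset);  // Set exactly once.
    client_type_ = type;
  }

  // Placeholder logic: pretends that even node ids are main content.
  std::vector<int> ExtractMainContent(const std::vector<int>& ids) const {
    assert(client_type_ != FakeMceClientType::kUnset);
    std::vector<int> main_content;
    for (int id : ids) {
      if (id % 2 == 0) main_content.push_back(id);
    }
    return main_content;
  }

 private:
  FakeMceClientType client_type_ = FakeMceClientType::kUnset;
};

class ExampleMceClient {
 public:
  // The client must outlive the pending callback in this sketch.
  explicit ExampleMceClient(FakeRouter& router) {
    router.GetServiceStateAsync([this](bool ready) { OnServiceState(ready); });
  }

  bool connected() const { return connected_; }

  std::vector<int> Extract(const std::vector<int>& ids) const {
    assert(connected_);
    return extractor_.ExtractMainContent(ids);
  }

 private:
  void OnServiceState(bool ready) {
    if (!ready) return;  // Download or initialization failed.
    extractor_.SetClientType(FakeMceClientType::kExampleClient);
    connected_ = true;
  }

  FakeExtractor extractor_;
  bool connected_ = false;
};
```

The client type is set inside the state callback, immediately after the (simulated) connection, so no extraction request can be issued before it.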

Cautions and Best Practices

  1. OCR downsamples images that are larger than a certain threshold, which you can get through the GetMaxImageDimension function (available from version 138). Sending higher-resolution images does not improve recognition quality; it only increases allocated memory and processing time. If you resize the image that you send to OCR for any other reason, take this threshold into account.
  2. The ScreenAI service has a large memory footprint and should be purged from memory when it is not needed. To do so, it tracks when it was last used; if it is not used for some time (currently 3 seconds), it shuts down and restarts the next time it is needed.
  3. Add code to handle disconnection from the service and to reconnect if needed. Disconnection can happen because the service crashed or shut down after being idle.
  4. If the service crashes, it suspends itself for some time (increasing on subsequent crashes). Make sure your use case is compatible with this. You can get the actual delay for the nth crash through the SuggestedWaitTimeBeforeReAttempt function.
  5. If you have a batch job, send one request at a time to the service to avoid bloating the task queue. Mechanisms may be added soon to kill the process if it allocates too much memory. Also consider pausing after every few requests so that the job does not hold a large share of system resources for a long continuous period.
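
Two of the points above, resizing with the OCR threshold in mind and pacing a batch job, can be sketched in plain C++. This is an illustration under stated assumptions: FitWithinMaxDimension and RunBatchSequentially are hypothetical helpers, the threshold is passed in as a parameter (in real code it would come from GetMaxImageDimension), and the sketch assumes the threshold applies to the larger side of the image.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

struct Size {
  int width;
  int height;
};

// Scales `size` down so that its larger side is at most `max_dimension`,
// preserving the aspect ratio. Sizes already within the limit are unchanged.
// Assumption: the OCR threshold applies to the larger side of the image.
Size FitWithinMaxDimension(Size size, int max_dimension) {
  assert(size.width > 0 && size.height > 0 && max_dimension > 0);
  const int larger = std::max(size.width, size.height);
  if (larger <= max_dimension) return size;
  const double scale = static_cast<double>(max_dimension) / larger;
  return {std::max(1, static_cast<int>(std::lround(size.width * scale))),
          std::max(1, static_cast<int>(std::lround(size.height * scale)))};
}

// Sends `requests` strictly one at a time and calls `pause` after every
// `pause_every` requests, except after the last one. `send_and_wait` is
// expected to return only when its request has completed. Returns the
// number of pauses taken.
int RunBatchSequentially(
    const std::vector<std::string>& requests,
    const std::function<void(const std::string&)>& send_and_wait,
    const std::function<void()>& pause,
    std::size_t pause_every) {
  assert(pause_every > 0);
  int pauses = 0;
  for (std::size_t i = 0; i < requests.size(); ++i) {
    send_and_wait(requests[i]);
    const bool is_last = (i + 1 == requests.size());
    if (!is_last && (i + 1) % pause_every == 0) {
      pause();
      ++pauses;
    }
  }
  return pauses;
}
```

For example, with a hypothetical threshold of 2048, a 4096x2048 image would be reduced to 2048x1024 before being sent, while an 800x600 image would be left alone. The batch helper is synchronous for brevity; asynchronous client code would issue the next request from the completion callback of the previous one.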

Bugs Component

Chromium > UI > Accessibility > MachineIntelligence (component id: 1457124)