
Image Capture API

This folder contains the implementation of the W3C Image Capture API. Image Capture was shipped in Chrome M59; please consult the Implementation Status if you think a feature should be available and isn't.

This API is structured around the ImageCapture class and a number of extensions to the MediaStreamTrack feeding it (let's call them theImageCapturer and theTrack, respectively).

API Mechanics

takePhoto() and grabFrame()

  • takePhoto() returns the result of a single photographic exposure as a Blob which can be downloaded, stored by the browser or displayed in an img element. This method uses the highest available photographic camera resolution.

  • grabFrame() returns a snapshot of the live video in theTrack as an ImageBitmap object which could (for example) be drawn on a canvas and then post-processed to selectively change color values. Note that the ImageBitmap will only have the resolution of the video track — which will generally be lower than the camera's still-image resolution.
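As a rough illustration of the two calls, here is a minimal sketch. `ImageCapturerLike` and `captureBoth` are hypothetical names introduced only for this example. The generic parameters stand in for `Blob` and `ImageBitmap` so the snippet type-checks without DOM typings.

```typescript
// Minimal structural view of ImageCapture (hypothetical interface name).
// P stands in for Blob and F for ImageBitmap.
interface ImageCapturerLike<P, F> {
  takePhoto(): Promise<P>;
  grabFrame(): Promise<F>;
}

// Hypothetical helper: requests a still photo and a video-frame snapshot
// from the same capturer and returns both once they have resolved.
async function captureBoth<P, F>(
  capturer: ImageCapturerLike<P, F>,
): Promise<{ photo: P; frame: F }> {
  const photo = await capturer.takePhoto(); // single photographic exposure
  const frame = await capturer.grabFrame(); // snapshot at the track's resolution
  return { photo, frame };
}
```

In a page, theImageCapturer would typically be created with `new ImageCapture(theTrack)` and passed to a helper like this one.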

(Adapted from the blog post)

Photo settings and capabilities

The photo-specific options and settings are associated with theImageCapturer or theTrack depending on whether a given capability/setting has an immediately recognisable effect on theTrack, in other words whether it is “live” or not. For example, changing the zoom level is instantly reflected on theTrack, while enabling red eye reduction, if available, is not.

| Object | Type | Example |
|---|---|---|
| PhotoCapabilities | non-live capabilities | theImageCapturer.getPhotoCapabilities() |
| MediaTrackCapabilities | live capabilities | theTrack.getCapabilities() |
| PhotoSettings | non-live settings | theImageCapturer.takePhoto(photoSettings) |
| MediaTrackSettings | live settings | theTrack.getSettings() |
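
To make the “live” side concrete, the sketch below prepares a zoom change for theTrack. It assumes the zoom capability is reported as a range with `min`, `max` and `step`, as MediaSettingsRange does. `SettingsRange`, `snapToRange` and `zoomConstraints` are hypothetical names used only for illustration.

```typescript
// Shape of the range read from theTrack.getCapabilities().zoom
// (local, hypothetical type so the sketch needs no DOM typings).
interface SettingsRange {
  min: number;
  max: number;
  step: number;
}

// Hypothetical helper: clamp the requested value into [min, max] and snap it
// to the nearest step counted from min.
function snapToRange(range: SettingsRange, requested: number): number {
  const clamped = Math.min(range.max, Math.max(range.min, requested));
  if (range.step <= 0) return clamped; // no usable step reported
  const steps = Math.round((clamped - range.min) / range.step);
  return Math.min(range.max, range.min + steps * range.step);
}

// Builds a constraint set that could be passed to theTrack.applyConstraints().
function zoomConstraints(range: SettingsRange, requested: number) {
  return { advanced: [{ zoom: snapToRange(range, requested) }] };
}
```

Because zoom is a live setting, the applied value should then be readable back through theTrack.getSettings() rather than through theImageCapturer.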

Other topics

Are takePhoto() and grabFrame() the same?

These methods do not produce the same results, as explained in this issue comment:

Let me reconstruct the conversion steps each image goes through in CrOS/Linux; [...]

a) Live video capture produces frames via V4L2CaptureDelegate::DoCapture() [1]. The original data (from the WebCam) comes in either YUY2 (a 4:2:2 format) or MJPEG, depending on whether the capture is smaller than 1280x720p or not, respectively.

b) This V4L2CaptureDelegate sends the captured frame to a conversion stage to I420 [2]. I420 is a 4:2:0 format, so it has lost some information irretrievably. This I420 format is the one used for transporting video frames to the renderer.

c) This I420 is the input to grabFrame(), which produces a JS ImageBitmap, unencoded, after converting the I420 into RGBA [3] of the appropriate endian-ness.

What happens to takePhoto()? It takes the data from the Webcam in a) and either returns a JPEG Blob [4] or converts the YUY2 [5] and encodes it to PNG using the default compression value (6 in a 0-10 scale IIRC) [6].

IOW:

  - for smaller video resolutions:

  OS -> YUY2 ---> I420 --> RGBA --> ImageBitmap     grabFrame()
             |
             +--> RGBA --> PNG ---> Blob            takePhoto()

  - and for larger video resolutions:

  OS -> MJPEG ---> I420 --> RGBA --> ImageBitmap    grabFrame()
              |
              +--> JPG ------------> Blob           takePhoto()

Where every conversion to-I420 loses information and so does the encoding to PNG. Even a conversion RGBA --> I420 --> RGBA would not produce the original image. (Plus, when you show ImageBitmap and/or Blob on an <img> or <canvas> there are more stages of decoding and even colour correction involved!)

With all that, I'm not surprised at all that the images are not pixel accurate! :-)
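
The RGBA --> I420 --> RGBA point can be sketched on a single 2x2 block of pixels. The code below is only an illustration: it assumes BT.601 full-range coefficients and simple averaging of the chroma samples, which may differ from the constants and filters Chromium's conversion routines actually use. All names in it are hypothetical.

```typescript
type Rgb = [number, number, number];

const clamp8 = (v: number): number => Math.max(0, Math.min(255, Math.round(v)));

// RGB -> YUV using BT.601 full-range coefficients (an assumption).
function rgbToYuv([r, g, b]: Rgb): [number, number, number] {
  const y = 0.299 * r + 0.587 * g + 0.114 * b;
  const u = -0.168736 * r - 0.331264 * g + 0.5 * b + 128;
  const v = 0.5 * r - 0.418688 * g - 0.081312 * b + 128;
  return [y, u, v];
}

// Inverse of rgbToYuv, rounded and clamped to 8 bits per channel.
function yuvToRgb(y: number, u: number, v: number): Rgb {
  const r = y + 1.402 * (v - 128);
  const g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128);
  const b = y + 1.772 * (u - 128);
  return [clamp8(r), clamp8(g), clamp8(b)];
}

// Round-trips one 2x2 block through a 4:2:0 layout: every pixel keeps its own
// luma, but all four pixels share one averaged (U, V) pair.
function roundTrip420(block: Rgb[]): Rgb[] {
  const yuv = block.map(rgbToYuv);
  const u = yuv.reduce((sum, p) => sum + p[1], 0) / yuv.length;
  const v = yuv.reduce((sum, p) => sum + p[2], 0) / yuv.length;
  return yuv.map(([y]) => yuvToRgb(y, u, v));
}
```

A block of four identical pixels survives this round trip, while a block mixing red, green, blue and white does not, because the per-pixel chroma is replaced by the block average.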

Why are PhotoCapabilities.fillLightMode and MediaTrackCapabilities.torch separated?

Because they are different things: torch means flash constantly on/off whereas fillLightMode means flash always-on/always-off/auto when taking a photographic exposure.

torch lives in theTrack because the effect can be seen “live” on it, whereas fillLightMode lives in theImageCapturer because the effect of modifying it can only be seen after taking a picture.
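
The split shows up in where each value is passed. The sketch below builds the two dictionaries, using the fill light mode values "auto", "off" and "flash" from the Image Capture specification. `FlashMode`, `torchConstraints` and `flashPhotoSettings` are hypothetical names for illustration.

```typescript
// Fill light modes as defined by the Image Capture specification
// (local, hypothetical alias name).
type FlashMode = "auto" | "off" | "flash";

// Live: a constraint set intended for theTrack.applyConstraints().
function torchConstraints(on: boolean) {
  return { advanced: [{ torch: on }] };
}

// Non-live: a PhotoSettings dictionary intended for
// theImageCapturer.takePhoto(photoSettings).
function flashPhotoSettings(mode: FlashMode) {
  return { fillLightMode: mode };
}
```

A page would then call something like `theTrack.applyConstraints(torchConstraints(true))` to switch the light on for the live video, and `theImageCapturer.takePhoto(flashPhotoSettings("flash"))` to fire the flash for a single exposure. Whether either one is available depends on what the device reports in its capabilities.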

Testing

Image Capture web tests are located in web_tests/external/mediacapture-image.