1 NetBSD UVC driver for USB Video Class devices

1.1 Deliverables

1.2 Similar software

1.3 Driver Overview

Device operation is described in the *USB Device Class Definition for Video Devices* specification (UVC).

The driver needs to attach to a USB device or interface. It must communicate with the device to set controls, negotiate an image format, and transfer images.

Userspace will interact with the driver through some API described below. The driver must respond to open(), read(), mmap(), munmap(), ioctl(), select(), poll(), and close(), and optionally write() for output devices.

1.4 Driver development

1.4.1 Notes and Current Status

<2008-06-02> Driver has been updated to work with the USB changes in -current as of May 28 or so. It currently prints out descriptors upon attaching and can query and set some controls.

1.4.2 Read and understand various documentation regarding USB drivers

This has already started and will be ongoing throughout the project. Relevant documentation in addition to that described above includes autoconf(9), driver(9), usbdi(9), the NetBSD coding standards, and discussions with mentors and other NetBSD developers.

1.4.3 DONE Purchase one or two UVC webcams

DEADLINE: 2008-04-30 Wed

Purchased Logitech QuickCam Deluxe for Notebooks. Possibility to obtain Cisco's UVC webcam from friend.

1.4.4 Skeleton driver with initialization operations [6/8]

DEADLINE: 2008-05-30 Fri

  • [X] Add UVC UICLASS_* to usb.h
  • [X] Create uvideo.c and uvideoreg.h
  • [X] Ask jmcneill about CFATTACH_DECL_NEW
  • [X] Add definition to sys/dev/usb/FILES
  • [X] Add definitions to sys/dev/usb/files.usb (how to do correctly?)
  • [X] Add to (test) kernel configuration file
  • [X] Add debug info to usb_port.h
  • [X] Should recognize UVC devices by device class and/or interface class during the match() phase. Should also attach() and detach().
  • [X] Build basic device data structure
  • [X] Query the device and print out something useful upon attaching such as the name.

    The Video Interface Class Code is defined as 0x0E in the UVC spec. Two subclasses are VideoControl (for various controls) and VideoStreaming (for reading or writing video streams). There is also the Video Interface Collection: e.g. a webcam+microphone may expose one collection that contains VideoControl and VideoStreaming and a second collection containing the audio functions (what does a driver do here? Will a separate audio device driver handle the audio concurrently with the independent video driver?).

    Driver will focus on video capture devices (cameras). Should leave the door open for video output devices (sending video to be recorded by a camcorder for example), but these devices tend to be more expensive (or maybe there is a cheap USB -> RCA analog video device?). Should even implement the functions, but they will be untested.
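The class-based recognition during match() could be sketched roughly as follows. This is only an illustration, not the driver's code: `struct iface_desc`, `enum match_result`, and `uvideo_match_sketch()` are hypothetical stand-ins for the real usbdi(9) structures and match function; only the numeric class and subclass codes are taken from the UVC specification.

```c
#include <stdint.h>

/* Class and subclass codes as given in the UVC specification. */
#define CC_VIDEO                      0x0E
#define SC_VIDEOCONTROL               0x01
#define SC_VIDEOSTREAMING             0x02
#define SC_VIDEO_INTERFACE_COLLECTION 0x03

/* Hypothetical, simplified view of a USB interface descriptor. */
struct iface_desc {
	uint8_t bInterfaceClass;
	uint8_t bInterfaceSubClass;
};

/* Hypothetical result type: which role, if any, the interface plays. */
enum match_result { MATCH_NONE = 0, MATCH_CONTROL, MATCH_STREAMING };

/* Decide whether an interface belongs to a UVC function. */
static enum match_result
uvideo_match_sketch(const struct iface_desc *id)
{
	if (id->bInterfaceClass != CC_VIDEO)
		return MATCH_NONE;

	switch (id->bInterfaceSubClass) {
	case SC_VIDEOCONTROL:
		return MATCH_CONTROL;
	case SC_VIDEOSTREAMING:
		return MATCH_STREAMING;
	default:
		return MATCH_NONE;
	}
}
```

An audio interface on the same webcam would fall through to MATCH_NONE here, leaving it free for a separate audio driver to claim.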

1.4.5 Define UVC descriptors [20/24]

DEADLINE: 2008-05-30 Fri

The UVC specification defines a number of USB descriptors for use in device communication. These will be structs declared with the UPACKED attribute, using the uByte-style typedefs and the UGETW/USETW macros defined in usb.h to get and set values in an endian-safe way.

  • [X] UVC Video Control Descriptors [8/8]
    • [X] Interface Header
    • [X] Output Terminal
    • [X] Input Terminal
    • [X] Camera Terminal
    • [X] Selector Unit
    • [X] Processing Unit
    • [X] Extension Unit
    • [X] Interrupt Endpoint
  • [X] Video Streaming Descriptors [11/14]
    • [X] Interface Input Header
    • [X] Interface Output Header
    • [X] Still Image Frame
    • [X] Video Format uncompressed
    • [X] Video Frame uncompressed
    • [X] Video Format MJPEG
    • [X] Video Frame MJPEG
    • [X] Video Format MPEG2TS
    • [X] Video Format DV
    • [X] Color Matching
    • [X] Video Format frame-based
    • [X] Video Frame frame-based
    • [X] Video Format stream-based
    • [X] Video Streaming Endpoint Descriptors. These are all standard USB endpoint descriptors.
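As an illustration of the struct style described above, here is a minimal sketch of the VideoControl Interface Header descriptor. The typedefs and macros are defined locally so the example compiles on its own; they are stand-ins that imitate what usb.h provides, and the packed attribute assumes GCC or Clang. `struct vc_header_desc` and `vc_header_parse()` are hypothetical names, not the ones used in uvideoreg.h.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins imitating the usb.h definitions (assumption). */
typedef uint8_t uByte;
typedef uint8_t uWord[2];
typedef uint8_t uDWord[4];
#define UPACKED   __attribute__((__packed__))
#define UGETW(w)  ((uint16_t)((w)[0] | ((w)[1] << 8)))
#define UGETDW(w) ((uint32_t)(w)[0] | ((uint32_t)(w)[1] << 8) | \
		   ((uint32_t)(w)[2] << 16) | ((uint32_t)(w)[3] << 24))

/* Fixed part of the class-specific VC Interface Header descriptor. */
struct vc_header_desc {
	uByte	bLength;
	uByte	bDescriptorType;
	uByte	bDescriptorSubtype;
	uWord	bcdUVC;
	uWord	wTotalLength;
	uDWord	dwClockFrequency;
	uByte	bInCollection;
	/* followed by bInCollection interface numbers */
} UPACKED;

/* Copy the fixed part out of a raw buffer; returns 0 on success, -1 if short. */
static int
vc_header_parse(const uint8_t *buf, size_t len, struct vc_header_desc *out)
{
	if (len < sizeof(*out) || buf[0] < sizeof(*out))
		return -1;
	memcpy(out, buf, sizeof(*out));
	return 0;
}
```

Because every multi-byte field is a byte array read through UGETW or UGETDW, the same code gives the same values on big-endian and little-endian hosts.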

1.4.6 Obtain image from camera [0/2]

Images may be transferred either as isochronous or bulk USB transfers depending on the camera or mode. Isochronous transfer reserves bandwidth, providing a guaranteed transfer rate. Bulk transfers compete with other devices for bandwidth. Bulk transfers lack any guarantee of timeliness but have potentially higher transfer rates.

Video conferencing is best with isochronous. Transfer from storage (e.g. videotape) should use bulk and may have retries etc.

The payload is divided into two format classes and various payload formats.

  • frame-based (uncompressed, MJPEG, DV)
  • stream-based (MPEG-2 TS, MPEG-1, etc.)

Each frame or codec segment may require assembly from multiple payloads.

(implied by the UVC FAQ) Device sets MaxPacketSize which is used to determine bandwidth needs and should (?) determine size of driver's internal buffers. For compressed payloads, some bandwidth and space is wasted. Actually, the "Uncompressed Video Frame Descriptor" includes a dwMaxVideoFrameBufferSize which determines buffer size, at least for uncompressed frames.

Dynamic Format Change: device sets error bit in payload with error code control "Format Change". Host (driver) must query new stream state, adjust buffer sizes etc., and commit to the new state or negotiate a new format.
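Each payload starts with a small header whose flag bits tell the driver about frame boundaries and errors. The sketch below shows how that header might be split from the data. The bit positions follow my reading of the UVC payload header (frame ID in bit 0, end of frame in bit 1, error in bit 6); the macro names, `struct payload_info`, and `payload_parse()` are hypothetical. Querying the error code control after an error bit is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag bits in the second header byte (bmHeaderInfo); names are made up here. */
#define HDR_FID 0x01	/* frame ID, toggles on each new frame */
#define HDR_EOF 0x02	/* last payload of the frame */
#define HDR_ERR 0x40	/* error, e.g. a dynamic format change */

struct payload_info {
	int		fid;
	int		end_of_frame;
	int		error;
	const uint8_t	*data;		/* first byte after the header */
	size_t		data_len;
};

/* Split one payload into header flags and data; returns -1 if malformed. */
static int
payload_parse(const uint8_t *buf, size_t len, struct payload_info *pi)
{
	uint8_t hlen;

	if (len < 2)
		return -1;
	hlen = buf[0];			/* bHeaderLength includes itself */
	if (hlen < 2 || hlen > len)
		return -1;

	pi->fid = (buf[1] & HDR_FID) != 0;
	pi->end_of_frame = (buf[1] & HDR_EOF) != 0;
	pi->error = (buf[1] & HDR_ERR) != 0;
	pi->data = buf + hlen;
	pi->data_len = len - hlen;
	return 0;
}
```

When `error` is set, the driver would stop filling the current buffer and query the stream state before committing or renegotiating, as described above.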

  • [X] Assemble USB packets into an image buffer. The USB Video FAQ diagrams an algorithm something like the following pseudo-code for frame-based formats, given a current buffer and a pool of available buffers, and incoming data that may be flagged as "new frame" or "end of frame" or intermediate data.

    loop {
        fetch_data()    /* either isoc callback or bulk transfer */

        if (new_frame && !empty_buffer) {
            complete_current_buffer()    /* give to userspace */
            current_buffer = new_buffer()
        }

        if (current_buffer_ready) {
            store_data_in_current_buffer()
            if (end_of_frame || buffer_full) {
                complete_current_buffer()
                current_buffer = new_buffer()
            }
        } else {
            discard_data()
        }
    }

    (pg. 23 of USB Device Class Definition for Video Devices)

  • [X] Repeat until device is closed or format change is requested.
  • Acquire basic image data
    DEADLINE: 2008-06-06 Fri

    Driver should communicate with device, initiate and receive an image capture in any available format.

  • Negotiate different formats
    DEADLINE: 2008-06-13 Fri

    Driver should be able to enumerate available formats, negotiate different formats, and acquire images in multiple formats and at multiple framerates. The specifics of this depend on the available formats in the specific hardware device.

  • Handle Bulk and Isochronous transfers
    DEADLINE: 2008-06-20 Fri

    Looking in usbdi.c and usbdi_util.c, isoc transfers appear to be callback based while bulk transfers appear to be driver directed.

    The bulk transfer in usbdi_util.c uses tsleep() and wakeup(), so the transfer will wait until data is received but should not lock up the kernel. (How to deal with e.g. brightness change while waiting for a transfer?)

  • API and syscall support
    DEADLINE: 2008-06-27 Fri

    Should handle open(), close(), read(), select(), and poll().

    The V4L2 API allows but does not require multiple opens by different processes. The example given is an audio-mixer-like app to handle the controls and a separate app to capture the video.

    If the driver's buffers are read-only (except for the one buffer currently acquiring data from the device), then even multiple data opens should be possible. Of course, major problems might happen if the video format changes.

    Will need to learn how to handle select() and poll() to indicate to userspace when image data is ready.

  • API and syscall support
    DEADLINE: 2008-07-25 Fri

    Should handle mmap(), munmap(), and some ioctl() calls. In V4L2, these allow the user to map the driver's image buffers into the process memory after appropriate setup with the VIDIOC_REQBUFS ioctl.
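Returning to the frame-assembly loop from the first item of this list, a minimal C sketch of the per-payload step might look like the following. It is a simplification under stated assumptions: a single fixed-size buffer stands in for the pool of buffers, "completing" a buffer only records its length, and all names (`struct assembler`, `assembler_feed()`, `FRAME_MAX`) are hypothetical. A real driver would size the buffer from dwMaxVideoFrameBufferSize.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_MAX 16	/* tiny on purpose; illustration only */

struct assembler {
	uint8_t		buf[FRAME_MAX];
	size_t		used;		/* bytes stored in the current buffer */
	int		ready;		/* nonzero if a buffer is available */
	unsigned	completed;	/* frames handed off so far */
	size_t		last_len;	/* length of the last completed frame */
};

static void
assembler_init(struct assembler *a)
{
	memset(a, 0, sizeof(*a));
	a->ready = 1;
}

/* Stands in for "give to userspace" followed by taking a fresh buffer. */
static void
complete_current_buffer(struct assembler *a)
{
	a->completed++;
	a->last_len = a->used;
	a->used = 0;
}

/* Process one payload's data together with its new-frame/end-of-frame flags. */
static void
assembler_feed(struct assembler *a, const uint8_t *data, size_t len,
    int new_frame, int end_of_frame)
{
	/* A new frame began although the previous one never saw end-of-frame. */
	if (new_frame && a->used != 0)
		complete_current_buffer(a);

	if (!a->ready)
		return;			/* no buffer available: discard data */

	if (len > FRAME_MAX - a->used)
		len = FRAME_MAX - a->used;	/* truncate on overflow */
	memcpy(a->buf + a->used, data, len);
	a->used += len;

	if (end_of_frame || a->used == FRAME_MAX)
		complete_current_buffer(a);
}
```

The same function could be called from an isochronous callback or after a bulk transfer returns, so the assembly logic need not depend on the transfer type.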

1.4.7 Read device terminals and units

A UVC device is composed of a tree of Units. A Unit has multiple input pins and one output pin. A Terminal is like a unit but has only a single pin, either input or output. For example, a CCD sensor is a Terminal with a single output (the raw image data).

We should be able to read the graph of connected Units and Terminals and print out some representation of it.
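One possible way to print the graph is to start at an Output Terminal and walk backwards through each entity's source IDs. The sketch below assumes the descriptors have already been parsed into a flat table; `struct entity`, `find_entity()`, and `print_chain()` are hypothetical names, and the depth limit is just a guard against a malformed, cyclic topology.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_SOURCES 4

/* Hypothetical in-memory form of a Unit or Terminal. */
struct entity {
	uint8_t		id;			/* unit or terminal ID */
	const char	*name;
	uint8_t		nsources;		/* number of input pins */
	uint8_t		source_id[MAX_SOURCES];	/* IDs feeding the input pins */
};

static const struct entity *
find_entity(const struct entity *tab, size_t n, uint8_t id)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tab[i].id == id)
			return &tab[i];
	return NULL;
}

/* Print an entity and, indented below it, everything that feeds it.
 * Returns the number of entities printed. */
static int
print_chain(const struct entity *tab, size_t n, uint8_t id, int depth)
{
	const struct entity *e = find_entity(tab, n, id);
	int count = 1;
	int i;

	if (e == NULL || depth > 16)
		return 0;

	printf("%*s%u: %s\n", depth * 2, "", (unsigned)e->id, e->name);
	for (i = 0; i < e->nsources; i++)
		count += print_chain(tab, n, e->source_id[i], depth + 1);
	return count;
}
```

For a simple webcam, calling `print_chain()` on the Output Terminal would list the Processing Unit beneath it and the Camera Terminal beneath that.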

1.4.8 Video Controls

Some of these (e.g. focus) will be associated with a Camera Terminal. Others (e.g. brightness) will be associated with a Processing Unit. There may be other units with controls as well.

  • Read all available video controls and set them [7/8]
    DEADLINE: 2008-07-04 Fri
    • [X] Detect which controls are available and print this list.
    • [X] Read and print control attributes
      • [X] Current setting. Note: this is returning "stalled" when reading the value of "zoom" on my Panasonic PV-GS9; max and min etc. are read correctly. Need to see if this changes after image format negotiation.
      • [X] Minimum
      • [X] Maximum
      • [X] Resolution
      • [X] Step size
    • [X] Write value of controls and read back to verify

      Some controls are updated asynchronously if the update action may take a long time ("long time" defined in UVC spec). I don't believe my Panasonic or Logitech have such controls. The update process consists of sending an update request and later responding to an interrupt indicating completion.

  • Provide necessary ioctl() calls to query and update from userspace
    DEADLINE: 2008-07-25 Fri

    Details of this depend on the actual API that is decided upon. This may require more time than the USB communication driver portions.
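To make the control queries above more concrete, here is a rough sketch of how the fields of a class-specific control request could be filled in, plus a helper that clamps a requested value to the range the device reports. The request codes and the field layout (control selector in the high byte of wValue, unit ID and interface number in wIndex) follow my reading of the UVC specification. `struct ctrl_setup` is a hypothetical host-order view rather than the wire format, and the function names are made up.

```c
#include <stdint.h>

/* Class-specific request codes from the UVC specification. */
#define UVC_SET_CUR 0x01
#define UVC_GET_CUR 0x81
#define UVC_GET_MIN 0x82
#define UVC_GET_MAX 0x83
#define UVC_GET_RES 0x84

/* Hypothetical host-order view of a control request's setup fields. */
struct ctrl_setup {
	uint8_t		bmRequestType;
	uint8_t		bRequest;
	uint16_t	wValue;
	uint16_t	wIndex;
	uint16_t	wLength;
};

/* Fill in the setup fields for a control on a given unit or terminal. */
static struct ctrl_setup
uvc_ctrl_setup(uint8_t request, uint8_t unit_id, uint8_t ifaceno,
    uint8_t selector, uint16_t len)
{
	struct ctrl_setup s;

	/* GET requests have bit 7 set: class, interface, device-to-host. */
	s.bmRequestType = (request & 0x80) ? 0xA1 : 0x21;
	s.bRequest = request;
	s.wValue = (uint16_t)(selector << 8);
	s.wIndex = (uint16_t)((unit_id << 8) | ifaceno);
	s.wLength = len;
	return s;
}

/* Clamp a value to [min, max] and round it down to a multiple of the step. */
static int32_t
ctrl_clamp(int32_t want, int32_t min, int32_t max, int32_t res)
{
	if (want < min)
		want = min;
	if (want > max)
		want = max;
	if (res > 0)
		want = min + ((want - min) / res) * res;
	return want;
}
```

For example, reading the current brightness (selector 0x02) from a Processing Unit with ID 2 on interface 0 would use `uvc_ctrl_setup(UVC_GET_CUR, 2, 0, 0x02, 2)`.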

1.4.9 Handle interrupt endpoint

The interrupt endpoint is optional in general, but it is mandatory for devices that support a hardware trigger for still image capture (i.e. a button on the webcam). It's not clear how the V4L2 API should react in this situation.

It could either notify userspace that a button press occurred, or it could go ahead and perform a still image capture and notify userspace of the availability of the result. In either case, V4L2 does not appear to provide a way to notify userspace, and it is unclear how this would be implemented if it did. If using select() for example, each time the file descriptor is ready for reading, userspace would need to check if image data is ready or if the button was pressed.
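Whatever the API ends up doing, the driver first has to tell a button press apart from other status events. A minimal sketch of decoding a status packet from the interrupt endpoint follows. The layout used here (status type in the low nibble of byte 0, originator in byte 1, event in byte 2, value in byte 3) is my reading of the UVC specification and should be checked against it; `enum status_event` and `status_decode()` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event classification for the rest of the driver. */
enum status_event {
	EV_NONE = 0,
	EV_CONTROL_CHANGE,
	EV_BUTTON_PRESS,
	EV_BUTTON_RELEASE
};

/* Decode one status packet received on the interrupt endpoint. */
static enum status_event
status_decode(const uint8_t *pkt, size_t len)
{
	if (len < 4)
		return EV_NONE;

	switch (pkt[0] & 0x0F) {	/* originator type */
	case 1:				/* VideoControl interface */
		return (pkt[2] == 0x00) ? EV_CONTROL_CHANGE : EV_NONE;
	case 2:				/* VideoStreaming interface */
		if (pkt[2] != 0x00)	/* only the button event is handled */
			return EV_NONE;
		return pkt[3] ? EV_BUTTON_PRESS : EV_BUTTON_RELEASE;
	default:
		return EV_NONE;
	}
}
```

A control-change event is also how an asynchronous control update would report completion, so the same decoder could serve both purposes.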

1.4.10 Implement V4L2 API (?) and Test Suite

  • Video4Linux 2

    This may be the easiest way to get application support since a number of applications already support v4l2. Should have discussion with NetBSD devs who are working on other video drivers (such as video capture or tuner cards).

    V4L2 lacks a generic interface to controls. It does provide space for custom controls, but this could lead to incompatibilities. V4L2 defines focus controls as extended controls.

    The API header file will be derived from the V4L2 specification. Copying from Linux would raise GPL issues I believe? Implementation of this API will be largely in parallel with the other steps in the driver.

    Implementing the entire API may be beyond the scope of this project, but enough should be implemented to allow frame grab and basic controls via an existing V4L2 application.

  • bktr

    This is the API currently implemented by the TV card driver. Xawtv and some other apps support it.

  • Other general video API

    Could use UVC spec as a starting point. If other devs have major problems with v4l2, this may be the way to go, otherwise it's probably needless work and would cause incompatibilities.

    UVC is conceptually Terminals and Units wired together. A Unit has input pins and a single output pin. A Terminal has either a single output pin (e.g. a CCD sensor) or a single input pin (e.g. a video out). An example Unit is a Processing Unit that does brightness or white balance correction. VideoControls are associated with Units and Terminals: a Camera Terminal may have a focus control, and a Processing Unit may have a brightness control.

  • Test Suite

    A UVC driver test suite will be developed concurrently with the driver. This may be a graphical program (it should at least display camera image) but should have a text mode that runs through various tests to ensure that, e.g. controls can be read and set, images (or "data" at least) can be read. Should do long term image capture, attempt format or bandwidth changes, etc.

    Should look into the Automated Testing Framework (ATF) and use that if it makes sense. Many of these tests can be automated, but some will require a human to look at a video.

  • Fully Document Driver API
    DEADLINE: 2008-08-01 Fri

    Whatever is used, the API should be well documented in man pages much like bktr(4).

1.5 Application porting and development

DEADLINE: 2008-08-11 Mon

Would like at least two applications to demonstrate the driver. Possible Applications to port or get working:

Will work with Stephen to integrate these into pkgsrc for easy installation on NetBSD systems.

1.6 My Past Projects

Frameworks - stop-motion frame capture written in C on Linux. As described above, this is a simple application to acquire still frames from webcams using the Video4Linux API. Began as a patch to Gqcam to port from Gtk+ to Gtk+2, then evolved to handle basic stop-motion. This is a one-man project currently using Darcs revision control software.

AAlib-Ruby - a wrapper around the AA-lib C library written in Ruby using Ruby/DL to interface with the C library. A fun little diversion.

1.7 Misc

Author: Patrick Mahoney

Date: 2008/06/02 08:28:54 AM