This project will be an independent implementation.
Device operation is described in the *USB Device Class Definition for Video Devices* specification (UVC).
The driver needs to attach to a USB device or Interface. It must communicate with the device to set controls, negotiate image format, and transfer images.
Userspace will interact with the driver through some API described below. The driver must respond to open(), read(), mmap(), munmap(), ioctl(), select(), poll(), and close(), and optionally write() for output devices.
<2008-06-02> Driver has been updated to work with the USB changes in -current as of May 28 or so. It currently prints out descriptors upon attaching and can query and set some controls.
This has already started and will be ongoing throughout the project. Relevant documentation in addition to that described above includes autoconf(9), driver(9), usbdi(9), the NetBSD coding standards, and discussions with mentors and other NetBSD developers.
Purchased Logitech QuickCam Deluxe for Notebooks. Possibility to obtain Cisco's UVC webcam from friend.
The Video Interface Class Code is defined as 0x0E in the UVC spec. Its two subclasses are VideoControl (for various controls) and VideoStreaming (for reading or writing video streams). There is also the Video Interface Collection: e.g. a webcam+microphone may expose one collection that contains VideoControl and VideoStreaming and a second collection containing the audio interfaces (what does a driver do here? will a separate audio device driver handle the audio concurrently with the independent video driver?).
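As a rough illustration of how the driver's match step could tell these interfaces apart, here is a minimal sketch. The class code 0x0E comes from the text above; the subclass values (0x01, 0x02, 0x03) are quoted from memory of the UVC spec and should be checked against it. The struct and function names are hypothetical and are not part of any NetBSD API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Class code from the UVC spec; subclass values should be verified there. */
#define UVC_CLASS_VIDEO             0x0E
#define UVC_SUBCLASS_VIDEOCONTROL   0x01
#define UVC_SUBCLASS_VIDEOSTREAMING 0x02
#define UVC_SUBCLASS_COLLECTION     0x03 /* reported by the interface association */

enum uvc_iface_kind {
	UVC_IFACE_NONE,       /* not a video interface */
	UVC_IFACE_CONTROL,
	UVC_IFACE_STREAMING,
	UVC_IFACE_COLLECTION
};

/* Hypothetical stand-in for the class/subclass bytes of a descriptor. */
struct uvc_iface_id {
	uint8_t bClass;
	uint8_t bSubClass;
};

/* Decide what kind of video interface (if any) the descriptor names. */
enum uvc_iface_kind
uvc_classify(const struct uvc_iface_id *id)
{
	assert(id != NULL);
	if (id->bClass != UVC_CLASS_VIDEO)
		return UVC_IFACE_NONE;
	switch (id->bSubClass) {
	case UVC_SUBCLASS_VIDEOCONTROL:
		return UVC_IFACE_CONTROL;
	case UVC_SUBCLASS_VIDEOSTREAMING:
		return UVC_IFACE_STREAMING;
	case UVC_SUBCLASS_COLLECTION:
		return UVC_IFACE_COLLECTION;
	default:
		return UVC_IFACE_NONE;
	}
}
```

An audio interface in the same physical device would return UVC_IFACE_NONE here, which is consistent with leaving it to a separate audio driver.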
Driver will focus on video capture devices (cameras). Should leave the door open for video output devices (sending video to be recorded by a camcorder for example), but these devices tend to be more expensive (or maybe there is a cheap USB -> RCA analog video device?). Should even implement the functions, but they will be untested.
The UVC specification defines a number of USB descriptors for use in device communication. These will be structs declared with the UPACKED attribute, using the uByte-style typedefs and the UGETW/USETW-style macros defined in usb.h to get and set multi-byte values in an endian-safe way.
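A minimal sketch of what such a descriptor struct might look like. The typedefs and macros below are local stand-ins modelled on those in usb.h so the example compiles outside the kernel; in the driver the real ones would be used. The field layout of the Uncompressed Video Frame Descriptor is written from memory of the UVC payload spec and must be checked against it; only the fixed 26-byte part is shown, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins modelled on usb.h (assumption: same shape as the originals). */
typedef uint8_t uByte;
typedef uint8_t uWord[2];
typedef uint8_t uDWord[4];
#define UPACKED __attribute__((__packed__))
#define UGETW(w)    ((uint16_t)((w)[0] | ((w)[1] << 8)))
#define USETW(w, v) ((w)[0] = (uint8_t)(v), (w)[1] = (uint8_t)((v) >> 8))
#define UGETDW(w)   ((uint32_t)(w)[0] | ((uint32_t)(w)[1] << 8) | \
                     ((uint32_t)(w)[2] << 16) | ((uint32_t)(w)[3] << 24))

/* Fixed part of the frame descriptor; frame intervals follow on the wire. */
struct uvc_frame_uncompressed_desc {
	uByte  bLength;
	uByte  bDescriptorType;
	uByte  bDescriptorSubtype;
	uByte  bFrameIndex;
	uByte  bmCapabilities;
	uWord  wWidth;
	uWord  wHeight;
	uDWord dwMinBitRate;
	uDWord dwMaxBitRate;
	uDWord dwMaxVideoFrameBufferSize;
	uDWord dwDefaultFrameInterval;
	uByte  bFrameIntervalType;
} UPACKED;

_Static_assert(sizeof(struct uvc_frame_uncompressed_desc) == 26,
    "descriptor must match the wire layout");

/* Frame width in pixels, or 0 if the raw descriptor is too short. */
uint16_t
uvc_frame_width(const uint8_t *raw, size_t len)
{
	struct uvc_frame_uncompressed_desc d;

	assert(raw != NULL);
	if (len < sizeof(d))
		return 0;
	memcpy(&d, raw, sizeof(d));
	return UGETW(d.wWidth);
}

/* Buffer size the device asks for, or 0 if the descriptor is too short. */
uint32_t
uvc_frame_buffer_size(const uint8_t *raw, size_t len)
{
	struct uvc_frame_uncompressed_desc d;

	assert(raw != NULL);
	if (len < sizeof(d))
		return 0;
	memcpy(&d, raw, sizeof(d));
	return UGETDW(d.dwMaxVideoFrameBufferSize);
}
```

Because every field is a byte or a byte array, the accessors give the same result on little-endian and big-endian hosts, which is the point of using the macros instead of native integer fields.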
Images may be transferred either as isochronous or bulk USB transfers depending on the camera or mode. Isochronous transfer reserves bandwidth, providing a guaranteed transfer rate. Bulk transfers compete with other devices for bandwidth. Bulk transfers lack any guarantee of timeliness but have potentially higher transfer rates.
Video conferencing is best with isochronous. Transfer from storage (e.g. videotape) should use bulk and may have retries etc.
The payload is divided into format classes and various payload formats.
Each frame or codec segment may require assembly from multiple payloads.
(implied by the UVC FAQ) Device sets MaxPacketSize which is used to determine bandwidth needs and should (?) determine size of driver's internal buffers. For compressed payloads, some bandwidth and space is wasted. Actually, the "Uncompressed Video Frame Descriptor" includes a dwMaxVideoFrameBufferSize which determines buffer size, at least for uncompressed frames.
Dynamic Format Change: device sets error bit in payload with error code control "Format Change". Host (driver) must query new stream state, adjust buffer sizes etc., and commit to the new state or negotiate a new format.
loop {
    fetchdata()    /* either isoc callback or bulk transfer */
    if (newframe && !emptybuffer) {
        completecurrentbuffer()    /* give to userspace */
        currentbuffer = newbuffer()
    }
    if (currentbufferready) {
        storedataincurrentbuffer()
        if (endofframe || bufferfull) {
            completecurrentbuffer()
            currentbuffer = newbuffer()
        }
    } else {
        discarddata()
    }
}
(pg. 23 of USB Device Class Definition for Video Devices)
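The loop above could look roughly like this in C for a single payload. This is a sketch under assumptions: the payload header is taken to be a length byte followed by a flags byte in which bit 0 is the frame ID toggle and bit 1 marks end of frame (as I read the UVC spec; verify before relying on it), "give to userspace" is reduced to bumping a counter, and all struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UVC_HDR_FID 0x01 /* toggles on every new frame */
#define UVC_HDR_EOF 0x02 /* last payload of the frame */

struct frame_assembler {
	uint8_t *buf;       /* current frame buffer, owned by the caller */
	size_t   cap;       /* capacity of buf */
	size_t   len;       /* bytes stored so far */
	int      last_fid;  /* -1 until the first payload is seen */
	unsigned completed; /* frames handed over so far */
	size_t   last_len;  /* size of the most recently completed frame */
};

/* Stand-in for handing the buffer to userspace and taking a fresh one. */
static void
complete_frame(struct frame_assembler *fa)
{
	fa->completed++;
	fa->last_len = fa->len;
	fa->len = 0;
}

/* Returns 1 if this payload completed a frame, 0 if not, -1 if malformed. */
int
assemble_payload(struct frame_assembler *fa, const uint8_t *p, size_t n)
{
	int done = 0;
	int fid;
	size_t data;

	assert(fa != NULL && fa->cap > 0 && fa->len <= fa->cap);
	if (p == NULL || n < 2 || p[0] < 2 || p[0] > n)
		return -1; /* discard: header missing or inconsistent */

	/* newframe && !emptybuffer: flush what we have before storing. */
	fid = p[1] & UVC_HDR_FID;
	if (fa->last_fid != -1 && fid != fa->last_fid && fa->len > 0) {
		complete_frame(fa);
		done = 1;
	}
	fa->last_fid = fid;

	/* Store the data after the header, truncating at the buffer end. */
	data = n - p[0];
	if (data > fa->cap - fa->len)
		data = fa->cap - fa->len;
	memcpy(fa->buf + fa->len, p + p[0], data);
	fa->len += data;

	/* endofframe || bufferfull */
	if (((p[1] & UVC_HDR_EOF) && fa->len > 0) || fa->len == fa->cap) {
		complete_frame(fa);
		done = 1;
	}
	return done;
}
```

The same function can be called from an isochronous callback or after a bulk read, since it only looks at one payload at a time.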
Driver should communicate with device, initiate and receive an image capture in any available format.
Driver should be able to enumerate available formats, negotiate different formats, and acquire images in multiple formats and at multiple framerates. The specifics of this depend on the available formats in the specific hardware device.
Looking in usbdi.c and usbdi_util.c, isoc transfers appear to be callback based while bulk transfers appear to be driver directed.
The bulk transfer in usbdi_util.c uses tsleep() and wakeup(), so the transfer will wait until data is received but should not lock up the kernel. (How to deal with e.g. brightness change while waiting for a transfer?)
Should handle open(), close(), read(), select(), and poll().
The V4L2 API allows but does not require multiple opens by different processes. The example given is an audio-mixer-like app to handle the controls and a separate app to capture the video.
If the driver's buffers are read-only (except for the one buffer currently acquiring data from the device), then even multiple data opens should be possible. Of course, major problems might happen if the video format changes.
Will need to learn how to handle select() and poll() to indicate to userspace when image data is ready.
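From the userspace side, waiting for image data with poll() might look like the following sketch. It uses only POSIX calls; the function names are hypothetical, and it assumes the driver marks the descriptor readable (POLLIN) once a complete frame is available, which is a design decision for the driver and not something the text above settles.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if fd is readable, 0 on timeout, -1 on error. */
int
wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int rv;

	/* Retry if a signal interrupts the wait. */
	do {
		rv = poll(&pfd, 1, timeout_ms);
	} while (rv == -1 && errno == EINTR);

	if (rv <= 0)
		return rv;
	return (pfd.revents & POLLIN) ? 1 : -1;
}

/* Wait for data, then read it. On timeout returns -1 with errno ETIMEDOUT. */
ssize_t
read_frame(int fd, void *buf, size_t len, int timeout_ms)
{
	int rv;

	assert(buf != NULL);
	rv = wait_readable(fd, timeout_ms);
	if (rv == 0) {
		errno = ETIMEDOUT;
		return -1;
	}
	if (rv < 0)
		return -1;
	return read(fd, buf, len);
}
```

An application would open the video device node, call read_frame() in a loop, and treat a timeout as "no frame yet" instead of as a fatal error.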
Should handle mmap(), munmap(), and some ioctl() calls. In V4L2, these allow the user to map the driver's image buffers into the process memory after appropriate setup with the VIDIOC_REQBUFS ioctl.
A UVC device is composed of a tree of Units. A Unit has multiple input pins and one output pin. A Terminal is like a unit but has only a single pin, either input or output. For example, a CCD sensor is a Terminal with a single output (the raw image data).
We should be able to read the graph of connected Units and Terminals and print out some representation of it.
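One possible way to hold and print that graph, as a sketch only: each entity records the IDs of the entities feeding its input pins, and printing walks from an output back towards the sources. The struct, the field names, and the text format are all hypothetical; in the driver the entries would be filled in from the VideoControl descriptors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define UVC_MAX_SOURCES 4
#define UVC_MAX_DEPTH   8 /* guards against malformed, cyclic descriptors */

struct uvc_entity {
	uint8_t     id;                       /* unit or terminal ID */
	const char *name;                     /* human-readable kind */
	uint8_t     nsources;                 /* number of input pins */
	uint8_t     sources[UVC_MAX_SOURCES]; /* IDs feeding those pins */
};

static const struct uvc_entity *
find_entity(const struct uvc_entity *ents, size_t n, uint8_t id)
{
	for (size_t i = 0; i < n; i++)
		if (ents[i].id == id)
			return &ents[i];
	return NULL;
}

/* Append entity `id` and everything upstream of it to `out`, indented. */
void
uvc_describe(const struct uvc_entity *ents, size_t n, uint8_t id,
    int depth, char *out, size_t outlen)
{
	const struct uvc_entity *e;
	size_t used;

	assert(out != NULL && outlen > 0);
	e = find_entity(ents, n, id);
	used = strlen(out);
	if (e == NULL || depth > UVC_MAX_DEPTH || used >= outlen - 1)
		return;

	snprintf(out + used, outlen - used, "%*s%s (id %u)\n",
	    depth * 2, "", e->name, (unsigned)e->id);
	for (uint8_t i = 0; i < e->nsources; i++)
		uvc_describe(ents, n, e->sources[i], depth + 1, out, outlen);
}
```

For a simple webcam this would print the output terminal on the first line, the processing unit indented below it, and the camera terminal indented below that.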
Some of these (e.g. focus) will be associated with a Camera Terminal. Others (e.g. brightness) will be associated with a Processing Unit. There may be other units with controls as well.
Some controls are updated asynchronously if the update action may take a long time ("long time" defined in UVC spec). I don't believe my Panasonic or Logitech have such controls. The update process consists of sending an update request and later responding to an interrupt indicating completion.
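To make the association concrete, here is a sketch of how a control request could be addressed to a particular unit or terminal. The request codes, control selectors, and the wValue/wIndex packing are written from memory of the UVC spec and should be verified against it. The struct is a plain illustration with hypothetical names; it is not the usb_device_request_t from usbdi(9).

```c
#include <assert.h>
#include <stdint.h>

/* Class-specific request codes (assumed from the UVC spec). */
#define UVC_SET_CUR 0x01
#define UVC_GET_CUR 0x81

/* Control selectors (assumed from the UVC spec). */
#define UVC_PU_BRIGHTNESS_CONTROL     0x02 /* Processing Unit */
#define UVC_CT_FOCUS_ABSOLUTE_CONTROL 0x06 /* Camera Terminal */

/* Illustrative setup packet in host byte order. */
struct uvc_ctrl_request {
	uint8_t  bmRequestType;
	uint8_t  bRequest;
	uint16_t wValue;  /* control selector in the high byte */
	uint16_t wIndex;  /* entity ID in the high byte, interface in the low */
	uint16_t wLength; /* size of the control's data */
};

struct uvc_ctrl_request
uvc_make_request(uint8_t request, uint8_t entity_id, uint8_t iface,
    uint8_t selector, uint16_t len)
{
	struct uvc_ctrl_request r;

	assert(len > 0);
	/* GET_* requests have the top bit set and read from the device. */
	r.bmRequestType = (request & 0x80) ? 0xA1 : 0x21;
	r.bRequest = request;
	r.wValue = (uint16_t)(selector << 8);
	r.wIndex = (uint16_t)((entity_id << 8) | iface);
	r.wLength = len;
	return r;
}
```

Reading brightness from a Processing Unit with ID 2 on interface 0 would then be uvc_make_request(UVC_GET_CUR, 2, 0, UVC_PU_BRIGHTNESS_CONTROL, 2), where the entity ID comes from the parsed descriptors.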
Details of this depend on the actual API that is decided upon. This may require more time than the USB communication driver portions.
The interrupt endpoint is optional in general, but it is mandatory for a device with a hardware trigger for still image capture (i.e. a button on the webcam). It's not clear how the V4L2 API should react in this situation.
The driver could either notify userspace that a button press occurred, or it could go ahead and perform a still image capture and notify userspace of the availability of the result. In either case, V4L2 does not appear to offer a way to deliver such a notification to userspace, and it is not clear how one would be implemented. If using select() for example, each time the file descriptor is ready for reading, userspace would need to check whether image data is ready or the button was pressed.
This may be the easiest way to get application support since a number of applications already support v4l2. Should have discussion with NetBSD devs who are working on other video drivers (such as video capture or tuner cards).
V4L2 lacks a generic interface to controls. It does provide space for custom controls, but this could lead to incompatibilities. V4L2 defines focus controls as extended controls.
The API header file will be derived from the V4L2 specification. Copying from Linux would raise GPL issues I believe? Implementation of this API will be largely in parallel with the other steps in the driver.
Implementing the entire API may be beyond the scope of this project, but enough should be implemented to allow frame grab and basic controls via an existing V4L2 application.
This is the API currently implemented by the TV card driver. Xawtv and some other apps support it.
Could use UVC spec as a starting point. If other devs have major problems with v4l2, this may be the way to go, otherwise it's probably needless work and would cause incompatibilities.
UVC is conceptually Terminals and Units wired together. A Unit has input pins and a single output pin. A Terminal has either a single output pin (e.g. a CCD sensor) or a single input pin (e.g. a video out). An example Unit is a Processing Unit that does brightness or white balance correction. VideoControls are associated with Units. A Camera Terminal may have a focus control, and a Processing Unit may have a brightness control.
A UVC driver test suite will be developed concurrently with the driver. This may be a graphical program (it should at least display camera image) but should have a text mode that runs through various tests to ensure that, e.g. controls can be read and set, images (or "data" at least) can be read. Should do long term image capture, attempt format or bandwidth changes, etc.
Should look into the Automated Testing Framework (ATF) and use that if it makes sense. Many of these tests can be automated, but some will require a human to look at a video.
Whatever is used, the API should be well documented in man pages much like bktr(4).
Would like at least two applications to demonstrate the driver. Possible Applications to port or get working:
Will work with Stephen to integrate these into pkgsrc for easy installation on NetBSD systems.
Frameworks - stop-motion frame capture written in C on Linux. As described above, this is a simple application to acquire still frames from webcams using the Video4Linux API. Began as a patch to Gqcam to port from Gtk+ to Gtk+2, then evolved to handle basic stop-motion. This is a one-man project currently using Darcs revision control software. http://frameworks.polycrystal.org/
AAlib-Ruby - a wrapper around the AA-lib C library written in Ruby using Ruby/DL to interface with the C library. A fun little diversion. http://aalib-ruby.rubyforge.org/
Date: 2008/06/02 08:28:54 AM