USB Video Class (UVC) is a specification for USB video devices such as webcams. This driver provides support for these devices on the NetBSD operating system.

Source is available via CVS from the NetBSD SoC CVS repository:

cvs login (Password: just press ENTER)
cvs -z3 co -P uvc


UVC Devices: Cameras

Compliance with the UVC spec is a requirement for the "Certified for Windows Vista" logo, so it is expected that many future webcams will be supported by this driver. For those interested in buying a webcam, be sure to check the box for "Certified for Vista". Many cameras say "Works with Vista", but this is not enough.

From a NetBSD system, plug the camera in and run usbctl (found in the usbutil package). A UVC device is identified by at least one INTERFACE of class 14 (0x0E):

INTERFACE descriptor 1:
bLength=9 bDescriptorType=interface(4) bInterfaceNumber=1 bAlternateSetting=0
bNumEndpoints=0 bInterfaceClass=14 bInterfaceSubClass=2
bInterfaceProtocol=0 iInterface=0()

Note that devices complying with the UVC spec may still use proprietary video formats and thus be unusable. Information on cameras known to work will be posted as it becomes available.

Xbox LIVE Camera                         MJPEG
Logitech QuickCam Deluxe for Notebooks   MJPEG
Panasonic PV-GS9                         MJPEG

I currently have a Panasonic PV-GS9 camcorder and a Logitech QuickCam Deluxe for Notebooks. Based on some web searching, I believe the less expensive Envision V-CAM is UVC, but I'm not certain. The camera in the Asus Eee PC is a UVC device. The Linux UVC driver page has a list of many webcams as well.


See the status page for more details.

Friday, 15 August - Nearly Complete Status Update

UVC driver (uvideo)

Video driver (video)

TODO before end of GSoC

Maybe TODO depending on time

Thursday, 14 August - Working with V4L2 app

Successfully viewed the Xbox camera with MPlayer. To compile MPlayer from pkgsrc, I first added --enable-tv-v4l2 to the configure arguments and then patched the file stream/tvi_v4l2.c to include sys/videoio.h.

$ mplayer tv:// -tv driver=v4l2:device=/dev/video0 -fps 30
[Image: MPlayer on NetBSD with the Xbox camera]

diff -ur ./mplayer-orig/Makefile ./mplayer/Makefile
--- ./mplayer-orig/Makefile     2008-08-06 01:03:01.000000000 -0600
+++ ./mplayer/Makefile  2008-08-06 02:21:14.000000000 -0600
@@ -9,7 +9,7 @@
 .include "../../multimedia/mplayer-share/Makefile.common"
-CONFIGURE_ARGS+=       --disable-mencoder
+CONFIGURE_ARGS+=       --disable-mencoder --enable-tv-v4l2
 CONFIGURE_ARGS+=       --confdir=${PREFIX}/share/mplayer
 # Solaris/x86 has Xv, but the header files live in /usr/X11R6, not
diff -ur ./mplayer-orig/work/MPlayer-1.0rc2/stream/tvi_v4l2.c ./mplayer/work/MPlayer-1.0rc2/stream/tvi_v4l2.c
--- ./mplayer-orig/work/MPlayer-1.0rc2/stream/tvi_v4l2.c        2008-08-06 01:02:57.000000000 -0600
+++ ./mplayer/work/MPlayer-1.0rc2/stream/tvi_v4l2.c     2008-08-06 02:58:13.000000000 -0600
@@ -38,8 +38,9 @@
 #include <sys/sysinfo.h>
-#include <linux/types.h>
-#include <linux/videodev2.h>
+/* #include <linux/types.h>
+#include <linux/videodev2.h> */
+#include <sys/videoio.h> /* NetBSD V4L2 */
 #include "mp_msg.h"
 #include "libmpcodecs/img_format.h"
 #include "libaf/af_format.h"

Friday, 18 July

The current plan for an external API is to implement a video driver (similar to the audio driver) that implements the Video4Linux2 API atop a number of hardware drivers (uvideo in this case).

Video4Linux2 has two types of controls: "normal" controls represented by a single value (e.g. brightness) and "extended" controls that operate on an array of controls. V4L2 defines an array of Camera Controls that includes various controllable hardware parts such as physical zoom, pan, and tilt. The UVC spec defines each control individually, and some controls such as "pan relative" require two values: a direction and a speed. This speed value is not defined by the current V4L2 (revision 0.24)...

Using the Video4Linux2 solution prevents full use of UVC hardware, at least in the current V4L2 API. Changing V4L2 to more closely match UVC is one possibility, but UVC is not the only video device specification. Creating a generic control system for use between the video driver and hardware drivers (and possibly eventual use by a non-V4L2 video API) can be done in a number of ways, none of which seem very appealing:

  1. Explicitly define all controls, including those with multiple values that must be set together. The video driver (or userspace) passes pointers to pre-defined structs to the hardware driver which fills them with current values, min and max values, etc. This is essentially the V4L2 solution except that it doesn't map perfectly to UVC.
  2. Include a "group" number with each control. For example, "pan direction" and "pan speed" would have the same group number and it would be up to the user or video layer to make note of this and set them together. Video4Linux2 has "extended controls" that work more or less in this fashion.
  3. Require a two-step process to acquire control information. The first step requests the length (i.e. number of values under a given control), the second step malloc's and provides the variable size buffer to be filled by the hardware driver.

To be V4L2 compatible, it must be possible to address an arbitrary control by id, e.g. "brightness" (ids include defined constants and driver-specific ids), so the video.c layer may or may not need to do some bookkeeping/juggling depending on which method is chosen.

I'm currently going with solution 2, having a group number to indicate which controls must be set together. However, V4L2 only defines three very broad groups (e.g. "camera control" includes exposure, focus, and pan/tilt settings). There appears to be no way to indicate that, e.g. pan/tilt must be set together (as in UVC) but focus is separate, other than to create custom finer-grained groups. I don't think this violates the spec, but is probably against the intent.

Monday, 30 June

I've been floundering a bit lately, trying to develop a method of choosing an appropriate alternate interface (which determines the video stream endpoint's wMaxPacketSize) and properly setting up the isochronous transfer. I have code that works, but I don't fully understand everything.

I have debugging information on the isoc transfers using my Logitech camera on USB full-speed (not high-speed). My understanding is that with USB full-speed, a given endpoint can output one data packet per 1 ms USB frame. The debugging info below shows the packets sent by the Logitech camera. At the start of each USB frame, the data begins with a header which contains the frame number (1 or 0), the presentation time stamp (PTS), and whether this packet marks the end of frame (eof).

The max frame buffer size for the format used was 53333 bytes, and the frame interval 66.6666 ms for a maximum bandwidth requirement of 800 bytes/ms. The alt interface in use can only provide 384 bytes/ms if my understanding of USB is correct, but in practice the JPEG frames never hit the max.

Note that some packets contain only the 12-byte header. For some reason, the PTS does not change at the F0/F1 transition as I expect. I'm not sure whether I'm reading the PTS incorrectly, doing something else wrong, or whether the device is at fault (probably the least likely).

uvideo: datalen=384 hdrlen=12 F0-pts(200194)
uvideo: datalen=352 hdrlen=12 F0-pts(200194)
uvideo: datalen=384 hdrlen=12 F0-pts(200194)
uvideo: datalen=356 hdrlen=12 F0-pts(200194)
uvideo: datalen=384 hdrlen=12 F0-pts(200194)
uvideo: datalen=384 hdrlen=12 F0-pts(200194)
uvideo: datalen=24 hdrlen=12 F0-pts(200194)
uvideo: datalen=384 hdrlen=12 F0-pts(200194)
uvideo: datalen=196 hdrlen=12 F0-pts(200194)
uvideo: datalen=384 hdrlen=12 F0-pts(200194)
uvideo: datalen=192 hdrlen=12 F0-pts(200194)
uvideo: datalen=272 hdrlen=12 F0-pts(200194)
uvideo: datalen=380 hdrlen=12 F0-eof-pts(200194)
note F0/F1 change
uvideo: datalen=12 hdrlen=12 F1-pts(200194)
uvideo: datalen=12 hdrlen=12 F1-pts(200194)
uvideo: datalen=12 hdrlen=12 F1-pts(200194)
uvideo: datalen=12 hdrlen=12 F1-pts(200194)
uvideo: datalen=12 hdrlen=12 F1-pts(200194)
uvideo: datalen=12 hdrlen=12 F1-pts(200194)
uvideo: datalen=12 hdrlen=12 F1-pts(200194)
uvideo: datalen=384 hdrlen=12 F1-pts(200194)
uvideo: datalen=336 hdrlen=12 F1-pts(200194)

Monday, 16 June

Currently working on a Gtk+ based test program extracted from my Frameworks stop-motion software. Implementing minimal poll() functionality, among other small details.

With the test program, it should be easier to test things like controls, changing video formats, etc.

Tuesday, 10 June

Finally, after much struggle, I have acquired a JPEG frame from the Logitech camera. I think my mental model for isochronous transfers is wrong; using usbd_setup_isoc_xfer(), I receive data when I specify sufficient nframes (USB frames) to cover one entire video sample (video frame), but if I specify fewer nframes, I get empty data packets...

[Images: blurry first image from UVC camera on NetBSD; telephone on desk from UVC camera on NetBSD]

The Logitech's MJPEG data is simply a series of JPEG images. However, these JPEG images lack Huffman tables; the correct tables are given as an example in the JPEG standard, which I extracted from ISO/IEC 10918-1:1993(E). I wrote a small Ruby script to insert these Huffman tables into a JPEG file. It's probably best to keep this in userspace rather than in the kernel because maximum buffer sizes are reported during format negotiation. The lack of Huffman tables cannot be detected until image data arrives, and inserting the tables might overflow the maximum buffer size.

Wednesday, 4 June

Working on negotiating the streaming format and acquiring an image. The negotiation process consists of sending USB requests to set the currently desired format, reading the current format proposed by the device (which may be different from or more explicit than what was requested), and either repeating or committing to the proposed format.

The set/commit steps appear to be working, but upon reading the current format with usbd_do_request(), I get error 13, USBD_IOERROR, even though the format data appears to have been successfully read.