Here's what we know about the USB devices.
Kinect has the following USB devices connected through its USB cable:
- Generic USB Hub
- Xbox NUI Audio
- Xbox NUI Motor
- Xbox NUI Camera
The devices do not conform to any standard USB class such as HID, camera, or audio devices. This suggests that using the Kinect as a plug-and-play webcam or microphone is probably not possible.
Plugging in Kinect
When you plug the Kinect into a PC running Windows, it finds the Generic USB Hub and the Xbox NUI Motor. Windows has no driver for the Motor, so the PnP configuration fails and the device is not started. At this point, the Kinect still appears to be powered off.
Josh Blake tested creating boilerplate drivers using a Jungo WinDriver USB trial. When he generated and installed the driver for the Xbox NUI Motor, the following happened:
- The Kinect device started flashing green on the indicator light.
- The Xbox NUI Audio and Xbox NUI Camera devices became available on the USB hub (also with no drivers initially).
The flashing green light is consistent with the standard Kinect startup process. At this point it is waiting for the Xbox to boot up and probably for a handshake protocol or other command to activate the devices.
It appears that starting the Motor device also controls power to the rest of Kinect.
After the Audio and Camera devices were powered on, Josh also generated boilerplate drivers for them.
Talking to Kinect
Josh was able to send standard USB requests and get appropriate responses from all three devices.
He attempted to send vendor-type setup packets with various values for the other parameters, but this was mostly unproductive. There were a few interesting results, documented below.
Until analysis is done on the USB protocol exchange between the Xbox and Kinect, controlling or getting data from Kinect is likely not possible.
Xbox NUI Motor
The Motor device controls the actual motor for tilting/panning the Kinect. It also seems to control power to the rest of the Kinect devices. After the driver for the Motor is installed/activated, Kinect displays a flashing green light and the other devices connect to the internal USB hub.
The Motor has only a single control endpoint on 0x00.
Sending a vendor-specific setup packet read request (0xC2) with zeros for the request, value, and index and 0x02 for the length returns 0x80. This might indicate the current position of the motor as 0x80 is the middle of possible values (0x00 to 0xFF) and would be a good initial position for the motor.
Sending a vendor-specific setup packet write request (0x02) with a request of 0x00 and zeros for the value, index, and length appeared to be a no-op rather than an error (this still needs to be double-checked).
Sending a vendor-specific setup packet write request (0x02) with a request of 0x01 and zeros for the value, index, and length resulted in the Motor device immediately disconnecting from the USB port, effectively resetting it.
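As a rough aid for reading the request values above, the sketch below decodes a bmRequestType byte using the bit layout from the USB 2.0 specification and packs the five setup fields into the 8-byte setup packet. The function names are illustrative and not part of any tool Josh used. Under that layout, 0xC2 decodes as a vendor-type IN request addressed to an endpoint, while 0x02 decodes as a standard-type OUT request addressed to an endpoint; a vendor-type write with the same recipient would be 0x42. Whether the 0x02 quoted above is the literal byte that was sent is not clear from these notes.

```python
import struct

# Field meanings follow the USB 2.0 bmRequestType layout:
# bit 7 = direction, bits 6..5 = type, bits 4..0 = recipient.
TYPES = {0: "standard", 1: "class", 2: "vendor", 3: "reserved"}
RECIPIENTS = {0: "device", 1: "interface", 2: "endpoint", 3: "other"}


def decode_request_type(bm_request_type):
    """Split a bmRequestType byte into (direction, type, recipient)."""
    direction = "IN" if bm_request_type & 0x80 else "OUT"
    req_type = TYPES[(bm_request_type >> 5) & 0x03]
    recipient = RECIPIENTS.get(bm_request_type & 0x1F, "reserved")
    return direction, req_type, recipient


def build_setup_packet(bm_request_type, b_request, w_value, w_index, w_length):
    """Pack the five setup fields into the 8-byte little-endian wire format."""
    return struct.pack("<BBHHH", bm_request_type, b_request,
                       w_value, w_index, w_length)


if __name__ == "__main__":
    print(decode_request_type(0xC2))  # ('IN', 'vendor', 'endpoint')
    print(decode_request_type(0x02))  # ('OUT', 'standard', 'endpoint')
    # The read described above: request, value, index = 0 and length = 2.
    print(build_setup_packet(0xC2, 0x00, 0, 0, 2).hex())  # c200000000000200
```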
Xbox NUI Audio
The Audio device provides combined audio from Kinect's four microphones. It also provides hardware noise cancellation by subtracting the TV's game audio and accounting for the 3D layout of the room.
The Audio device has several endpoints, the purpose of which is not yet clear. They provide enough bandwidth for full audio both in and out of the device. This is consistent with a hardware noise processor and with game audio being fed to Kinect for processing purposes only.
In addition to the default 0x00 control endpoint, the Audio device has:
- Endpoint 0x81 - IN, Bulk, 512 bytes
- Endpoint 0x01 - OUT, Bulk, 512 bytes
- Endpoint 0x82 - IN, Isochronous, 524 bytes
- Endpoint 0x02 - OUT, Isochronous, 72 bytes
Note that IN and OUT are relative to the USB host. OUT means sending data from the Xbox (or your computer) to Kinect.
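The direction is encoded in the endpoint address itself: in a standard USB endpoint descriptor, bit 7 of the address is set for IN endpoints and the low four bits hold the endpoint number. A minimal sketch of that decoding, with a helper name chosen only for illustration:

```python
def decode_endpoint_address(address):
    """Return (endpoint_number, direction) for a USB endpoint address byte."""
    number = address & 0x0F                       # low four bits: endpoint number
    direction = "IN" if address & 0x80 else "OUT"  # bit 7: 1 = device-to-host
    return number, direction


if __name__ == "__main__":
    # The four Audio endpoints listed above.
    for addr in (0x81, 0x01, 0x82, 0x02):
        number, direction = decode_endpoint_address(addr)
        print(f"0x{addr:02X}: endpoint {number}, {direction}")
```

So 0x81 and 0x01 are the IN and OUT halves of endpoint 1, and 0x82 and 0x02 are the IN and OUT halves of endpoint 2.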
Xbox NUI Camera
The most interesting and useful device, the Camera provides both an RGB and a depth map image.
The Camera device has two endpoints. Presumably, one of these provides the RGB image data and the other provides the depth map data.
In addition to the default 0x00 control endpoint, the Camera device has:
- Endpoint 0x81 - IN, Isochronous, 1920 bytes (color camera)
- Endpoint 0x82 - IN, Isochronous, 1760 bytes (depth camera)
Note that the packet size is not large enough to fit an entire image at either 320x240 or 640x480, so a single frame will have to be transmitted in, and reconstituted from, multiple packets. This also implies there may be some packet headers to help combine the image.
Interestingly, 1920 bytes divided by 320 pixels is 6 bytes, so without a header a single packet could contain three rows of 16-bit pixels, and 80 packets would be required for a single 320x240 16-bit frame.
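The arithmetic above can be checked with a short sketch. It assumes 16-bit pixels and packets with no header, exactly as in the estimate above; the real pixel format and any header size are not known, and the function names are only illustrative.

```python
def rows_per_packet(width, bytes_per_pixel, packet_size):
    """Number of whole image rows that fit in one packet (no header assumed)."""
    return packet_size // (width * bytes_per_pixel)


def packets_per_frame(width, height, bytes_per_pixel, packet_size):
    """Packets needed to carry one full frame, rounding up (no header assumed)."""
    frame_bytes = width * height * bytes_per_pixel
    return -(-frame_bytes // packet_size)  # ceiling division


if __name__ == "__main__":
    # 1920-byte packets on endpoint 0x81, 16-bit pixels assumed.
    print(rows_per_packet(320, 2, 1920))         # 3
    print(packets_per_frame(320, 240, 2, 1920))  # 80
    print(packets_per_frame(640, 480, 2, 1920))  # 320
    # 1760-byte packets on endpoint 0x82 do not divide a 320x240 frame evenly.
    print(packets_per_frame(320, 240, 2, 1760))  # 88
```

The 1760-byte size does not divide evenly into a 320x240 or 640x480 16-bit frame, which fits the guess that the packets carry a header or that the depth data uses a different pixel size.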
Sending any vendor-specific setup packet, whether read (0xC2) or write (0x02), results in a timeout, and the device must then be reset.
Josh suspects that a handshake protocol has to occur, possibly with the Motor device, before the other devices will respond. Alternatively, the camera might respond to a class setup packet or a non-standard setup packet type.