A while ago I was trying to get descriptors from devices asynchronously. In some situations, such as catching a Mass Storage device while it's busy servicing a CDB, the device can drop control pipe requests on the floor, and DeviceIoControl then blocks until Windows loses patience with the device and resets it (about five seconds).

The naïve approach is to use an OVERLAPPED structure with DeviceIoControl, since that's how you do async IO on Windows, but this doesn't work. It's up to the device driver to determine whether the call will be completed synchronously or not, and the Windows USB drivers complete all calls synchronously unless the pipe gets stalled by the device (normally between a URB submission and completion the device just NAKs until the result is ready). This is extremely uncommon, and impossible for the control pipe (which is where descriptor requests go) because the control pipe is used to clear stalls. Stall the control pipe and you wedge the device, leaving a device reset as the only option. In five seconds.

The solution I wound up using is not one I'm particularly proud of, but it does have the advantage of working. I didn't mind a synchronous call (in fact, it made things easier), but I didn't want to get wedged when a device went out to lunch, so I spawned a thread[1], made the call there, and used WaitForSingleObject to set my timeout. Plan on waiting 80ms or so per quarter-kilobyte expected. Measured very unscientifically, a config descriptor without interfaces (nine bytes) takes less than 10ms, average 4.7ms, standard deviation 0.00831.
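A minimal, portable sketch of that pattern follows. It is not the original code: it uses `std::thread` and `std::future::wait_for` as stand-ins for a Windows thread and WaitForSingleObject, and the names `descriptor_timeout` and `call_with_timeout` are made up for illustration. Rounding the 80ms rule of thumb up to whole 256-byte blocks is also an assumption.

```cpp
#include <chrono>
#include <cstddef>
#include <future>
#include <optional>
#include <thread>
#include <utility>

// Timeout budget: 80 ms per started 256-byte block (assumed rounding),
// with a minimum of one block.
std::chrono::milliseconds descriptor_timeout(std::size_t expected_bytes) {
    const std::size_t blocks = (expected_bytes + 255) / 256;
    return std::chrono::milliseconds(80 * (blocks == 0 ? 1 : blocks));
}

// Runs a blocking call on its own thread and stops waiting after `timeout`.
// `fn` stands in for "open the handle, then issue the blocking request".
// It must return a non-void value, and anything it captures must outlive
// the worker, because a timed-out worker is abandoned, not cancelled.
template <typename Fn>
auto call_with_timeout(Fn fn, std::chrono::milliseconds timeout)
    -> std::optional<decltype(fn())> {
    using Result = decltype(fn());
    std::packaged_task<Result()> task(std::move(fn));
    std::future<Result> result = task.get_future();
    std::thread worker(std::move(task));
    worker.detach();  // the caller never joins a wedged request
    if (result.wait_for(timeout) != std::future_status::ready) {
        return std::nullopt;  // timed out; the worker finishes (or not) on its own
    }
    return result.get();
}
```

A caller would put the whole "open handle, call DeviceIoControl, close handle" sequence inside the callable and pass something like `descriptor_timeout(256)` as the budget.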

Note that you want to open the HANDLE you're using in the DeviceIoControl call in the thread too—if there's another request in flight, opening the file handle will block until it's completed, which is exactly what we're trying so hard to avoid. You shouldn't really be calling for another descriptor from the same device right after it's failed though, because it's likely about to be reset by Windows.

Bonus pitfall: For configuration and string descriptors, you don't know how big the descriptor is, so you can grab the first n bytes of it, get the bLength (wTotalLength for configuration descriptors) and use that to get the rest of the descriptor, or you can just request UINT16_MAX and cross your fingers.
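For reference, pulling the reported size out of a partial read might look like the sketch below. The field offsets come from the standard USB descriptor layout (bLength at offset 0, bDescriptorType at offset 1, little-endian wTotalLength at offsets 2 and 3 of a configuration descriptor); the function name `reported_length` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// USB descriptor type code for CONFIGURATION descriptors.
constexpr std::uint8_t kConfigurationDescriptorType = 0x02;

// Returns the full size the descriptor reports for itself, or nullopt if the
// bytes read so far are too few to contain the relevant length field.
std::optional<std::uint16_t> reported_length(const std::uint8_t* data,
                                             std::size_t size) {
    if (size < 2) {
        return std::nullopt;  // need bLength and bDescriptorType
    }
    if (data[1] == kConfigurationDescriptorType) {
        if (size < 4) {
            return std::nullopt;  // wTotalLength occupies offsets 2..3
        }
        // wTotalLength is little-endian and covers the whole configuration,
        // including interface and endpoint descriptors.
        return static_cast<std::uint16_t>(data[2] | (data[3] << 8));
    }
    return data[0];  // bLength is the complete size for other descriptors
}
```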

It turns out that neither of these approaches works for all devices, whatever initial request size n you pick. To save some space, but keep the number of round trips to the device low, I always used a 256-byte buffer in my requests, and failed over to a larger buffer if the descriptor was too big. Unfortunately, in the classic "just bang on it until it starts working" approach of USB vendors, some devices will just not respond if you make a request with a buffer size that is not the exact size of the descriptor or sizeof(UsbConfigurationDescriptor) (nine bytes). The reason for this appears to be one of convenience: Windows itself only requests either nine bytes or the entirety of descriptors, so some device firmwares were written with only these two cases in mind.
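One way to stay within those two cases for configuration descriptors is to mimic what Windows does: ask for nine bytes, read wTotalLength, then ask for exactly that many. The sketch below assumes a hypothetical `RequestFn` transport callback that issues a single request of the given size and returns nothing on failure or timeout; it is an illustration of the idea, not the code described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical transport: asks the device for exactly `size` bytes of its
// configuration descriptor; nullopt means the request failed or timed out.
using RequestFn = std::function<std::optional<Bytes>(std::size_t size)>;

constexpr std::size_t kConfigHeaderSize = 9;  // bare configuration descriptor

std::optional<Bytes> fetch_configuration_descriptor(const RequestFn& request) {
    // Step 1: the nine-byte request that picky firmware is known to expect.
    std::optional<Bytes> header = request(kConfigHeaderSize);
    if (!header || header->size() != kConfigHeaderSize) {
        return std::nullopt;
    }
    // wTotalLength is little-endian at offsets 2..3.
    const std::size_t total = (*header)[2] | ((*header)[3] << 8);
    if (total < kConfigHeaderSize) {
        return std::nullopt;  // malformed descriptor
    }
    if (total == kConfigHeaderSize) {
        return header;  // no interfaces; we already have everything
    }
    // Step 2: request exactly the reported size, never a rounded-up buffer.
    std::optional<Bytes> full = request(total);
    if (!full || full->size() != total) {
        return std::nullopt;
    }
    return full;
}
```

This costs a second round trip for every device with interfaces, which is the trade-off against the single 256-byte request.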


[1] More accurately: reused a vthread, but the internal implementation of our threading library is not germane to this article.