EDIT: just to be completely clear and avoid spreading any speculation as fact if someone reads this post a year from now: this is all 100% speculation based on casual listening and could easily be completely wrong!
My suspicion is that the converters always run at 24 bit, and if you're working at 16 the audio gets converted down in software without any kind of dither (just truncating the least significant 8 bits). The way working at 16 affects the sound quality is VERY similar (to my ears, anyway) to the difference between correctly dithered and undithered bit depth conversion: a subtle "plastic" quality (I know it's a really vague term, but it absolutely captures the sound, and I don't think it's a coincidence that a lot of people use it), and a kind of one-dimensional quality.

When I first got a proper interface (a MOTU 828, not really that great sounding but a lot better than the bunch of Sound Blaster AWE64s I'd been using before that), I used to describe the difference between 16 and 24 bit audio in general as being like the difference between looking at a very high quality, life size photo of a window vs. looking out of an actual window. It turned out I wasn't dithering correctly (back then DAWs didn't always dither by default), and the difference got smaller once I fixed that. The OT at 16 bit has a very similar sound.
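If anyone wants to hear what I mean, here's a rough sketch of the two conversions in Python/NumPy. To be clear, this is just my own illustration of truncation vs. TPDF dither in general; it's obviously not Elektron's actual code, and all the names and numbers are mine:

```python
import numpy as np

rng = np.random.default_rng()

def to_16bit_truncate(x24):
    """Just drop the 8 least significant bits (no dither). The
    quantization error stays correlated with the signal, which is
    what I suspect causes that "plastic" quality."""
    return (x24 >> 8).astype(np.int16)

def to_16bit_tpdf(x24):
    """Add TPDF dither (two uniform noises summed, +/-1 LSB total at
    the 16-bit level = +/-256 at the 24-bit level) before
    requantizing, which turns the error into plain uncorrelated
    noise instead of distortion."""
    dither = rng.integers(-128, 128, x24.shape) + rng.integers(-128, 128, x24.shape)
    y = np.clip(x24 + dither, -(2**23), 2**23 - 1)
    return (y >> 8).astype(np.int16)

# A quiet signal makes the difference obvious: a 1 kHz sine at
# roughly -60 dBFS only has ~5 bits left after truncation to 16.
t = np.arange(48000)
sine24 = np.round(2**13 * np.sin(2 * np.pi * 1000 * t / 48000)).astype(np.int32)
truncated = to_16bit_truncate(sine24)
dithered = to_16bit_tpdf(sine24)
```

Write both outputs to a wav file and loop them: the truncated one buzzes along with the signal, the dithered one just has a steady noise floor.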
EDIT: Also, if I prepare a sample on the computer and load it into the OT over USB, I typically save it as 16 bit and don't hear the same loss in sound quality once it's in the OT. That could mean the converters themselves don't work properly at 16 bit, but it could also just be that when I convert to 16 on the computer I'm always dithering correctly. On top of everything else, if people are generally recording a bit on the low side in the OT (because it's easier to err on the side of caution than to push it too hot and get digital clipping), that would also bring the truncation artifacts up relative to the audio.
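Rough numbers on that last point, using the usual ~6 dB per bit rule of thumb (my own back-of-the-envelope, nothing measured on the OT):

```python
# Every ~6 dB of unused headroom is effectively one bit of
# resolution thrown away, so truncation noise sits that much
# closer to the audio. Rule-of-thumb math, nothing OT-specific.
def effective_bits(bit_depth, headroom_db):
    return bit_depth - headroom_db / 6.02

for headroom in (0, 6, 12, 18):
    print(f"{headroom:2d} dB below full scale -> "
          f"~{effective_bits(16, headroom):.1f} effective bits")
```

So recording 12 dB shy of clipping at 16 bit leaves you with roughly a 14 bit signal sitting on an undithered noise floor, which would make the truncation artifacts noticeably easier to hear.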