TCL (Tool Command Language) Scripting
Data loss with USB-serial transfer
I'm reading data from a USB serial device, and the data comes in blocks of 65536 bytes. The baud rate is 230400, so the stuff arrives quite fast. The channel is set to non-blocking, with a fileevent handler to pick up the data as it arrives.

Every so often, a transfer will fail because fewer than 65536 bytes are received. The amount actually received varies. When this does happen, there will have been one or more triggers of the fileevent handler in which exactly 3968 bytes were received. In almost all other cases, the fileevent handler will have seen smaller amounts of data in each invocation, but in one transfer that I have seen so far, there was a case where 3978 bytes were found in one fileevent trigger, and the correct number of bytes was still transferred.

The buffer size set by fconfigure was the default 4096 bytes. I tried increasing it to 8192 but it made no difference. The transfer mode is binary.

Googling hasn't helped, so I'm wondering whether anyone here can shed any light on what may be happening, and what to do about it. I'd prefer to find a solution that didn't involve reducing the baud rate, because of the size of the data transfer.

I'm running Tcl 8.4.13 on a Linux i686 box with OS version 2.6.19-1.2911.fc6, and I'm ready to apologize if the answer is that I should upgrade Tcl.

TIA
John Foster
-- jsjf dot demon dot co dot uk, username john
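For readers following the thread, the setup described above might look roughly like the sketch below. This is not the poster's actual script: the proc names, the `rx` array, the timeout value and the device path in the usage note are all assumptions made for illustration. It shows a non-blocking binary channel, a fileevent handler that accumulates data, and a per-invocation record of how many bytes each trigger delivered.

```tcl
# Sketch only: names, timeout and block-size constant are assumptions,
# not the original poster's code.
set BLOCK_SIZE 65536

# Arm the channel for one block and start a timeout, since a short
# serial transfer does not produce EOF.
proc startBlock {chan {timeoutMs 10000}} {
    global rx
    array set rx {data "" sizes {} status pending}
    fconfigure $chan -blocking 0 -translation binary -buffersize 4096
    fileevent $chan readable [list onReadable $chan]
    set rx(timer) [after $timeoutMs [list finishBlock $chan timeout]]
}

# Stop listening and publish the outcome; vwait on rx(status) wakes up here.
proc finishBlock {chan status} {
    global rx
    after cancel $rx(timer)
    fileevent $chan readable {}
    set rx(status) $status
}

# Called by the event loop whenever data is available on the channel.
proc onReadable {chan} {
    global rx BLOCK_SIZE
    set chunk [read $chan]              ;# everything currently available
    set n [string length $chunk]
    if {$n > 0} {
        append rx(data) $chunk
        lappend rx(sizes) $n            ;# bytes seen in this invocation
    }
    if {[string length $rx(data)] >= $BLOCK_SIZE} {
        finishBlock $chan complete
    } elseif {[eof $chan]} {
        finishBlock $chan short
    }
}
```

A hypothetical use would be to open the device (for example `set dev [open /dev/ttyUSB0 r+]`, where the path is an assumption), send the command that triggers the transfer, call `startBlock $dev`, and then `vwait rx(status)`. Afterwards `rx(sizes)` holds the per-trigger byte counts, which is where a run of 3968-byte reads would show up.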
On May 13, 12:31 am, John Foster <nos@example.com> wrote:
> I'm reading data from a USB serial device, and the data comes in blocks
> of 65536 bytes. The baud rate is 230400, so the stuff arrives quite
> fast. The channel is set to non-blocking, with a fileevent handler to
> pick up the data as it arrives.
> Every so often, a transfer will fail because fewer than 65536 bytes are
> received....
> I'm running Tcl 8.4.13 on a Linux i686 box with os version
You're very lucky to be doing this under Unix!
Use the strace, Luke...

Indeed, there is a suspicion of data overrun, since the serial link (true or emulated) has no flow control. The [fconfigure -buffersize] may be large, but it will do no good if the kernel buffer has already overflowed. Admittedly, if [vwait] has a chance to get back to listening quickly enough, this shouldn't happen. However, I don't know the size of this kernel buffer, and maybe a small latency is enough.

Use strace with the fine-grained timestamp option, writing to a file (otherwise scrolling the terminal will slow everything down), and try to see how much time is spent outside the poll() (i.e. the [vwait]).

You can also do the same experiment with 'cat' reading the same serial device and writing to a file. See if the saved file is complete. If it's not, you're in trouble because it means the kernel buffer is tiny. If it is, then cat beats Tcl (or at least your script) for now. Try to reduce the latency from your various handlers.

And *please* report back ;-)

-Alex
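To put a rough number on "maybe a small latency is enough", the arithmetic can be sketched as below. It assumes 10 bits on the wire per byte (8N1 framing) and uses 4096 bytes purely as a hypothetical buffer size, since the real kernel buffer size is not known in this thread; whether the nominal baud rate even limits throughput on a USB-emulated port is also an assumption.

```tcl
# Rough arithmetic only. Assumes 10 bits per byte (8N1 framing); the
# buffer sizes passed in are hypothetical, not measured values.
proc fillTimeMs {bytes baud {bitsPerByte 10}} {
    return [expr {1000.0 * $bytes * $bitsPerByte / $baud}]
}

foreach n {3968 4096 65536} {
    puts [format "%6d bytes at 230400 baud: %8.1f ms" $n [fillTimeMs $n 230400]]
}
```

Under those assumptions a 4096-byte buffer would fill in roughly 178 ms, the observed 3968 bytes correspond to about 172 ms, and a whole 65536-byte block takes about 2.8 s. A stall of a couple of hundred milliseconds outside the poll() would therefore be enough to overrun a buffer of that size.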
Alexandre Ferrieux wrote:
Alex, thank you very much for such a quick response!
> On May 13, 12:31 am, John Foster <nos@example.com> wrote:
>> I'm reading data from a USB serial device, and the data comes in
>> blocks of 65536 bytes. The baud rate is 230400, so the stuff arrives
>> quite fast. The channel is set to non-blocking, with a fileevent
>> handler to pick up the data as it arrives.
>> Every so often, a transfer will fail because fewer than 65536 bytes
>> are received....
>> I'm running Tcl 8.4.13 on a Linux i686 box with os version
> You're very lucky to be doing this under Unix!
> Use the strace, Luke...
> Indeed, there is a suspicion of data overrun, since the serial link
> (true or emulated) has no flow control.
Yes, that's what bothered me. It does look like a flow control problem, yet I can't see any way of fixing it.

> The [fconfigure -buffersize] may be large, it will do no good if the
> kernel buffer has already overflowed.
> Admittedly, if [vwait] has a chance to get back to listening quickly
> enough, this shouldn't happen.
> However, I don't know the size of this kernel buffer, and maybe a
> small latency is enough.
> Use strace with the fine-grained timestamp option, and writing to a
> file (otherwise scrolling the terminal will slow everything down).
> And try to see how much time is spent outside the poll() (i.e. the
> [vwait]).
Right now, I don't quite understand what you're saying here. But I've never used strace, so I'll start that research now...

> You can also do the same experiment with 'cat' reading the same serial
> device and writing to a file.
> See if the saved file is complete. If it's not, you're in trouble
> because it means the kernel buffer is tiny.
> If it is, then cat beats Tcl (or at least your script) for now. Try to
> reduce the latency from your various handlers.
Tricky to use 'cat', because I have to send commands to the device in order to trigger the output. And because this thing doesn't happen every time, to capture it I really do need a script.

Reducing the latency from the handlers would be a possibility if there were any running, other than the fileevent handler. But there aren't any. It seems to be the operating system latency that's causing the problem. But if that's the case, how come? The USB standard doesn't seem to say that flow control is ever necessary. I'm baffled.

> And *please* report back ;-)
> -Alex
-- jsjf dot demon dot co dot uk, username john
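The "time spent outside the poll()" that strace would reveal can be approximated from inside the script by timestamping each fileevent invocation and looking for the longest gap between two of them. The helpers below are a hypothetical sketch (the names `logEvent`, `longestGap` and `eventLog` are made up here), and the Tcl-level gap only approximates what strace reports, since it also includes time the kernel spent with no data pending.

```tcl
# Hypothetical helpers: record a timestamp and byte count on every
# fileevent invocation, then look for the longest gap afterwards.
set eventLog {}

# Call this from the fileevent handler with the number of bytes just read.
proc logEvent {nbytes} {
    global eventLog
    lappend eventLog [list [clock clicks -milliseconds] $nbytes]
}

# Returns {gapMs nbytes}: the longest interval between consecutive handler
# invocations, and the byte count seen at the end of that interval.
proc longestGap {log} {
    set worst {0 0}
    set prev ""
    foreach entry $log {
        foreach {t n} $entry break
        if {$prev ne "" && ($t - $prev) > [lindex $worst 0]} {
            set worst [list [expr {$t - $prev}] $n]
        }
        set prev $t
    }
    return $worst
}
```

After a transfer, `longestGap $eventLog` gives the worst stall. If the long gaps line up with the invocations that delivered 3968 bytes, that would be consistent with the overrun theory; if the gaps stay small even in failed transfers, the loss is more likely happening below the Tcl layer.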
John Foster wrote:
> I'm reading data from a USB serial device, and the data comes in blocks
> of 65536 bytes. The baud rate is 230400, so the stuff arrives quite
> fast. The channel is set to non-blocking, with a fileevent handler to
> pick up the data as it arrives.
Is your device a _real_ serial device that is connected via one of various RS232 <> USB adapters, or a device that behaves like a class serial device?

I have used these
067b:2303 Prolific Technology, Inc. PL2303 Serial Port
to connect to embedded systems and _downloaded_ (i.e. outbound) large amounts of data @115kB. Never any data loss.

Can your device lose blocks of data due to not enough local memory?

G!
uwe
Uwe Klein wrote:
> John Foster wrote:
>> I'm reading data from a USB serial device, and the data comes in
>> blocks of 65536 bytes. The baud rate is 230400, so the stuff arrives
>> quite fast. The channel is set to non-blocking, with a fileevent
>> handler to pick up the data as it arrives.
> is your device a _real_ serial device that is connected via
> one of various RS232 <> USB adapters
> or a device that behaves like a class serial device?
It's one that behaves like a serial device.

> I have used these
> 067b:2303 Prolific Technology, Inc. PL2303 Serial Port
> to connect to embedded systems and _downloaded_ i.e.
> outbound large amounts of data @115kB. never any data loss.
> can your device lose blocks of data due to not enough
> local memory?
I don't believe so - it collects the data in dedicated memory and then transmits it, using the USB output from an Atmel micro. -- jsjf dot demon dot co dot uk, username john
John Foster wrote:
> I don't believe so - it collects the data in dedicated memory and then
> transmits it, using the USB output from an Atmel micro.
I have only worked with Cypress EZ-USB FX2 using the limited buffer space provided by the USB engine (~1MB/s continuous in Highspeed). I had issues until I used triple buffering. Transfers _can_ be in spurts and bouts.

Do you have any cpu hogs and drainers running like KDE/Gnome/Beagle ...?

My first debug step was to have an overrun flag in the upstream dataformat ;-)

uwe
Uwe Klein wrote:
> John Foster wrote:
>> I don't believe so - it collects the data in dedicated memory and
>> then transmits it, using the USB output from an Atmel micro.
> I have only worked with Cypress EZ-USB FX2 using the limited
> buffer space provided by the USB engine ( ~1MB/s continuous in
> Highspeed).
> I had issues until I used triple buffering.
> Transfers _can_ be in spurts and bouts.
> Do you have any cpu hogs and drainers running like KDE/Gnome/Beagle
> ...?
I'm certainly running Gnome, though 'top' says things are pretty quiet. On the other hand, kernel or scheduling latency could certainly be a factor.

> My first debug step was to have an overrun flag in the upstream
> dataformat ;-)
I don't have that much control over the device, unfortunately. But is the message here that USB *can* silently lose data? My understanding is very limited, but I found these words through Google:

"Isochronous data transfer offers prenegotiated bandwidth with possible data loss; often used when on-time data delivery is more important than data accuracy, such as streaming audio and video."

"Bulk data transfer delivers large data transfers with no loss of data; often used for applications where lots of data must be transferred with no loss of data such as external hard drives."

I would have assumed that serial transfer would be of the bulk variety, in which case I simply don't understand how data loss can be happening.

Hmm. This seems not to be a Tcl issue - maybe I should take it elsewhere.

-- jsjf dot demon dot co dot uk, username john
John Foster wrote:
> I don't have that much control over the device, unfortunately. But is
> the message here that USB *can* silently lose data? My understanding is
You don't lose data in the USB transfer doing bulk transfers. Your device may lose data before it ever gets on the bus!

Can you offload the upstream/in data transfer to a separate process while still doing downstream/out commanding via tcl?

Apropos: do you get any usb-errors in /var/log/messages? (afair there is a debug flag you can set for the usb subsystem either at runtime or while building a new kernel.)

There are still issues with certain hardware combinations. For example, some cameras I use don't work behind certain hub types. In another case data transfers get stuck. (I haven't been able to resolve these things yet.)

uwe
Uwe Klein wrote:
> John Foster wrote:
>> I don't have that much control over the device, unfortunately. But is
>> the message here that USB *can* silently lose data? My understanding
>> is
> You don't lose data in the USB transfer doing bulk transfers.
> Your device may lose data before it ever gets on the bus!
I have wondered about that, but I'm going to have a job proving it! And I'm still at the point where I'm more ready to believe that it's something in my setup.

> Can you offload the upstream/in data transfer to a separate process
> while still doing downstream/out commanding via tcl?
It'd be very difficult - in effect, I'd have to write a new application just for that purpose. I'm not sure it would help - after sending the request for the data, the application does nothing else except to receive and store it via the fileevent handler, until the transfer ends. So I'm as sure as I can be that latency in the application itself is not a factor.

> Apropos: do you get any usb-errors in /var/log/messages? ( afair there
> is a debug flag you can set for the usb subsystem either at runtime or
> while building a new kernel.)
No, no error messages in /var/log/messages. (Thanks for the reminder - I'd forgotten to look there.) Presumably a kernel buffer overrun would be expected to show up there?

As for the debug flag, I'll have a look around, but I don't have much of a clue as to where to start looking. I'm busy experimenting with strace at the moment, as suggested by Alexandre, but so far it's just confirming what I've already reported.

Cheers, and thanks for the help
John
-- jsjf dot demon dot co dot uk, username john
Been there? Subscribed? (It was useful for me.)

https://lists.sourceforge.net/lists/listinfo/linux-usb-users

Linux USB FAQ
http://www.linux-usb.org/FAQ.html

Q: How do I see the "USB Verbose Debug Messages" that I enabled in the kernel config?
http://www.linux-usb.org/FAQ.html#ts7

uwe
Ah! Now that looks like a useful set of references, and I hadn't found them. I'll go and check them out.. Cheers and thanks again John -- jsjf dot demon dot co dot uk, username john
And what do I find there but these two recent posts (warning: long urls which I've wrapped):

http://sourceforge.net/mailarchive/forum.php
?thread_name=5486cca80704140747u74f92c8bj564618b1cdc09e43
%40mail.gmail.com&forum_name=linux-usb-users

and

http://sourceforge.net/mailarchive/forum.php
?thread_name=5486cca80705040138r6ac16e9bp77e4f6217720ea8
%40mail.gmail.com&forum_name=linux-usb-users

They're discussing an FTDI driver problem involving data loss - and the device I'm having trouble with does, I now realize, use an FTDI chip.

Looks as though my best plan is to lower the baud rate for now, and check for driver updates. It certainly doesn't look as though there's much point in further investigation on my part.

Cheers, and many thanks for the pointers.
John
-- jsjf dot demon dot co dot uk, username john
You may want to ask on linux_usb_users or contact Oliver Neukum directly; he has been looking for testers with various usb devices recently.

uwe
Uwe Klein wrote:
> you may want to ask on linux_usb_users or contact Oliver Neukum
> directly he has been looking for testers with various usb devices
> recently.
Good idea - I'll try it. Cheers John -- jsjf dot demon dot co dot uk, username john