Subject: File corruption problems copying via finder.
From: Rouillard, John (RouillardJ@brevard.cc.fl.us)
Date: Mon May 22 2000 - 17:25:53 EDT


I am having a problem with files being corrupted in transfer. Using the
Finder, I copy a file from a local drive to a network drive. Here is some
background info:

Source machines: Mac G4 running MacOS 8, native 10/100bT Ethernet card
                 PowerMac (8200?) running 7.5.3 w/ OT 1.1.1,
                 Asante AUI-10bT adapter

Server: RedHat Linux 6.1 box with kernel 2.2.12, running netatalk
        1.4.2b_asun2.1.3.
        The Linux box is connected to a 10bT network hub.
        CR/LF translation is not turned on for the volume in the
        AppleVolumes.default file.

The files are 32 MB movie files, and some frames in the movie get trashed.
The file sizes on the server and client are the same; only the data is
corrupted. In each corrupted section, four bytes are repeated a number of
times, then totally different data is substituted (see below). Most
corrupted files contain multiple regions of corruption. Each region is a
few hundred bytes long (111, 223, 223 and 495 bytes, for example), and the
sizes can vary within a single file.

  If the G4 is connected to the 100bT switch running in 100bT mode, the
     file is corrupted approx. 50% of the time.

  If the G4 is connected to the 10bT hub, the file is corrupted only 10%
     of the time.

  If the Power Mac is connected to the 100bT switch, it runs in 10bT mode
     and the file gets corrupted approx. 10-20% of the time.

The network looks like:

                           G4 alternate position
                           |
   G4 ---- switch ----- hub ---- fiber ---- hub --- linux box
              |
             power mac

Since I can markedly reduce the errors by changing the Ethernet connection
of the source G4 machine, I think the Linux box is in the clear as a source
of the corruption. Still, it seems that data corrupted by a network device
should get caught by a checksum.

The corruption itself is weird too. When the data gets corrupted, it looks
like this (columns are byte offset in decimal, correct byte, bad byte; the
byte values are octal dumps):

     pos   G   B
17224222  22 255
17224223  23 255
17224224  16 336
17224225  21 377
17224226  22 255
17224227  15 255
17224228  20 336
17224229  21 377
17224230  14 255
17224231  17 255
17224232  20 336
17224233  14 377
17224234  17 255
17224235  17 255
17224236  13 336
17224237  16 377
17224238  16   5
17224239  12   5
17224240  14   4
...

The first 4 bytes of the corruption sometimes differ among the corrupted
files, but the 4-byte pattern always seems to repeat 3 or 4 times before
different data takes over.
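
For anyone who wants to reproduce this kind of analysis, here is a minimal
sketch of how the corrupted regions could be located and checked for the
repeating 4-byte pattern. It assumes both copies fit in memory (fine for
32 MB files). The function names, the max_gap parameter and the file paths
in the usage note are made up for illustration; this is not the tool that
produced the dump above.

```python
from typing import List, Tuple


def diff_regions(good: bytes, bad: bytes, max_gap: int = 0) -> List[Tuple[int, int]]:
    """Return (offset, length) for each run of differing bytes.

    Differences separated by at most max_gap matching bytes are merged,
    since a corrupted byte can equal the correct one by coincidence.
    """
    regions = []
    start = None  # offset of the first differing byte in the current run
    last = None   # offset of the most recent differing byte
    for i, (g, b) in enumerate(zip(good, bad)):
        if g != b:
            if start is None:
                start = i
            elif i - last - 1 > max_gap:
                regions.append((start, last - start + 1))
                start = i
            last = i
    if start is not None:
        regions.append((start, last - start + 1))
    return regions


def leading_repeats(data: bytes, period: int = 4) -> int:
    """Count back-to-back copies of the first `period` bytes at the start of data."""
    pattern = data[:period]
    if len(pattern) < period:
        return 0
    count = 0
    while data[count * period:(count + 1) * period] == pattern:
        count += 1
    return count
```

Usage would be along the lines of reading both copies with
open(path, "rb").read() (say a hypothetical local.mov and server.mov),
calling diff_regions(good, bad, max_gap=8), and then running
leading_repeats(bad[off:off + length]) on each region returned.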

Usually (about 75% of the time, 3 out of 4 failures), when a file is
corrupted I also see an increase in received frame errors on the Ethernet
interface of the Linux server (as reported by ifconfig). Exactly what a
frame error is, I am not sure. The frame error count is in the 1900 range,
and the server has been up for 22 days.
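
To tie a given corrupted copy to a jump in that counter, the value could
be sampled before and after each transfer. A rough sketch follows,
assuming the server exposes per-interface counters in /proc/net/dev with
the usual column order (receive bytes, packets, errs, drop, fifo, frame,
...). That layout and the interface name eth0 are assumptions; check the
header lines of the file on the actual box first.

```python
def rx_frame_errors(proc_net_dev: str, iface: str) -> int:
    """Return the receive 'frame' counter for iface from /proc/net/dev text."""
    for line in proc_net_dev.splitlines():
        # Data lines look like "  eth0: 1234 56 ..."; header lines have no colon.
        name, sep, rest = line.partition(":")
        if sep and name.strip() == iface:
            fields = rest.split()
            # Assumed receive columns: bytes packets errs drop fifo frame ...
            return int(fields[5])
    raise KeyError(iface)


def read_frame_errors(iface: str = "eth0") -> int:
    """Read the live counter; the interface name is an assumption."""
    with open("/proc/net/dev") as f:
        return rx_frame_errors(f.read(), iface)
```

Calling read_frame_errors() right before and right after a Finder copy,
and logging the difference next to whether that copy came out corrupted,
would give a firmer number than "3 out of 4".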

As I understand netatalk's implementation, it uses DDP directly rather than
UDP encapsulation (like CAP). Being unfamiliar with DDP, I have the
following questions:

 Does DDP checksum its data packets?
 If checksumming is optional, how do you turn it on for MacOS 8, 7.5.3 and
   netatalk? (From looking at the kernel DDP code, it looks like
   checksumming is turned on by default.)
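
For reference while reading the kernel code: as far as I can tell from
Inside AppleTalk and the Linux DDP source, the DDP checksum is a 16-bit
add-and-rotate over the bytes that follow the checksum field, and a
checksum field of zero means no checksum was computed. A small sketch of
that calculation (the function name is mine, and this has not been
checked against packets captured on this network):

```python
def ddp_checksum(data: bytes) -> int:
    """16-bit add-and-rotate checksum over the bytes after the checksum field."""
    total = 0
    for byte in data:
        total = (total + byte) & 0xFFFF                     # add, discarding overflow
        total = ((total << 1) | (total >> 15)) & 0xFFFF     # rotate left by one bit
    # Zero is reserved for "no checksum", so a computed zero is sent as 0xFFFF.
    return total or 0xFFFF
```

If that reading is right, a packet sniffer showing a zero checksum field
in DDP headers from one of the Macs would mean that sender is not
checksumming at all.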

Also, asun's patches should have the capability of running over
TCP/IP rather than DDP. I assume I am using DDP since I see the
line:

  appletalk 17376 13 (autoclean)

in my lsmod output. Is there any way of telling what
transport mechanism is being used for a given AppleShare connection?
Also, can somebody provide a pointer for turning on the TCP/IP transport
mechanism? Will it work for MacOS < 8.0?
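
One possible check, sketched below: AFP over TCP normally uses TCP port
548, so an established connection to that port on the server would point
to the TCP/IP transport, while no such connection during a copy would
suggest DDP. The sketch parses the text of /proc/net/tcp. It assumes a
little-endian host (addresses are printed byte-reversed on x86) and that
the server listens on the standard port; both are assumptions about this
setup, and the function name is made up.

```python
from typing import List


def afp_tcp_peers(proc_net_tcp: str, port: int = 548) -> List[str]:
    """Return remote IPs with an ESTABLISHED connection to the local AFP port."""
    peers = []
    for line in proc_net_tcp.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 4:
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        local_port = int(local.rsplit(":", 1)[1], 16)
        if local_port == port and state == "01":  # 01 = ESTABLISHED
            raddr = remote.split(":")[0]
            # Assumes a little-endian host: reverse the bytes for dotted quad.
            ip = ".".join(str(b) for b in reversed(bytes.fromhex(raddr)))
            peers.append(ip)
    return peers
```

Running this on the contents of /proc/net/tcp while a Mac has the volume
mounted, and looking for that Mac's address in the result, would show
which transport that session is using.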

Any other quips, comments, evasion, questions, answers, tests to run, things
to not look at etc.? Thanks.

-- rouilj


