Subject: replicated DID error
From: Mike Johnson (ffp_randjohnson@yahoo.com)
Date: Tue Dec 19 2000 - 02:47:44 EST
Hello again everybody!
I'm planning on writing a pretty detailed email so here's the hook: I
managed to replicate the DID conflict error. I felt it was important to
first replicate the error before it can be reliably fixed.
(I'm rather unfamiliar with bug reporting and the like, so bear with me.
I'll just be as detailed as possible and hopefully somebody can discern
something useful from this. I've got to learn C!)
Anyway, let me first explain my test.
I've been using netatalk for some time now and I've never experienced
anything like what many of the people on this list have reported. No DID
conflicts, dancing icons, and thankfully no disappearing files.
The rpms from the sourceforge site install like a dream and work very
well with the uams.
But the difference between my netatalk use and others' is that I rarely
use it in production. I'm pretty much the only one who ever logs
into that machine, and it's just a temporary file dumping zone and a
backup server. (I would like to use a similar setup for production but
the aforementioned problems have kinda scared me away.:)
It seemed pretty obvious to me that whatever problems have arisen, they
were exclusively multi-user problems, which would explain why I haven't
heard of anybody successfully replicating them.
I devised three things that might possibly cause errors. I was looking to
cause any errors at all.
1st - High network traffic and buffer overruns.
2nd - Removing .AppleDouble files using a Windows machine while a mac was
copying folders.
3rd - Use a Windows machine to rename folders while a mac is actively
trying to copy files into them.
After hearing Roland's explanation of DID errors and how they get resolved, I
thought this might be a good first test.
Let me explain the hardware too:
1. Linux Redhat 7.0 Server, Pentium 90 MHz Vectra. Three IDE hard drives.
32 Megs of ram. (Hey, it's a test machine, after all.) I named it
challenger.... Redhat 2.2.16-22 kernel.
2. My laptop. A Dell Inspiron 3000. lots o ram. SuSE 6.4. Not that this
matters.
3. Another similar Dell laptop running Windows 2000.
4. A mac. It's a 7200 running mac OS 8.6.
I installed a 100 Mbit/s hub for just these computers. All computers except
the 7200 were running 100 Mbit/s NICs, the mac 10 Mbit/s.
I took detailed notes through this whole process, but let me just run
through the juicy parts.
From the linux dell, I ran "ping -f" to flood the server. I wanted to
simulate lots of network traffic. It's a ridiculous amount really, but
very effective.
Before and after this I checked the system log. There were no signs of
errors or buffer overruns during the whole test. Maybe next time I'll
ping from more computers.
I ran the ping for a while, then stopped it to check the collisions. 967
packets collided, 251,648 sent. 3 dropped.
Next, I restarted the ping and mounted challenger with my mac. There was
no noticeable slowdown.
I started a copy of 400+ megs of assorted data files to challenger -- all
of them were just random things. Copied home directories, pro/e files,
IGES, stuff like that.
Things were going well (or should I say badly) for my little test. The
collisions increased every time the IDE drives kicked in, but this was
expected.
Just for kicks, I decided to do a find from the windows computer. Using
samba, I found all of the .AppleDouble files as they were being copied to
challenger from the mac. I deleted them as they came, but it didn't
disturb the copy or cause any log entries. I don't really know what
information in those files could hinder a copy, but apparently
nothing.
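
For anyone who wants to repeat this step without a Windows client, here is a rough Python sketch of the same idea, meant to be run against the share's directory on the server (or any mounted copy of it). The function name remove_appledouble is made up for this example and is not part of netatalk or samba. It assumes an entry named .AppleDouble may be either a directory or a plain file and handles both.

```python
import os
import shutil


def remove_appledouble(root):
    """Delete every entry named .AppleDouble under root; return the removed paths."""
    removed = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        target = os.path.join(dirpath, ".AppleDouble")
        if ".AppleDouble" in dirnames:
            shutil.rmtree(target)
            # Tell os.walk not to descend into the directory we just deleted.
            dirnames.remove(".AppleDouble")
            removed.append(target)
        elif ".AppleDouble" in filenames:
            os.remove(target)
            removed.append(target)
    return removed
```

To imitate "deleting them as they came", it could be called repeatedly in a loop with a short sleep while the mac is still copying.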
Here's the last one I ran:
About halfway through the copy, I went again to the windows computer.
Using samba, I tried to rename the top level folder that the mac was
copying. It was called "pocketscience" or something like that. I couldn't
rename it (file being used error on the windows side), but I could rename
the two folders directly underneath "pocketscience".
Those two folders were called "viking" and "from_andrews_home". I renamed
them to "viking2" and "from_andrews_home2". The copy kept moving,
unfettered by this name change.
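
The rename step is easy to script if somebody wants to reproduce it from the server side instead of through samba. This is only a sketch under my own assumptions: rename_subfolders is a hypothetical helper, and I can't say whether a local rename triggers the conflict the same way a rename through samba does.

```python
import os


def rename_subfolders(top, suffix="2"):
    """Rename each immediate subdirectory of top by appending suffix."""
    renamed = []
    for name in sorted(os.listdir(top)):
        old = os.path.join(top, name)
        # Skip plain files and hidden entries such as .AppleDouble.
        if not os.path.isdir(old) or name.startswith("."):
            continue
        new = old + suffix
        os.rename(old, new)
        renamed.append((old, new))
    return renamed
```

Run against the folder being copied (the path would be wherever the share lives on your server), it would turn "viking" into "viking2" and "from_andrews_home" into "from_andrews_home2" while the mac keeps writing.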
I finally stopped the ping to check the stats. Some 20,000 packets
disappeared, either dropped or collided, with about 2 million sent.
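
To put the two sets of ping numbers side by side, here is the arithmetic as a small Python sketch. Adding collisions and drops together for the first run is my own simplification, and the second run's figures are the rounded ones quoted above, so treat the result as a ballpark only.

```python
def loss_percent(lost, sent):
    """Return lost packets as a percentage of packets sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * lost / sent


# First run: 967 collided plus 3 dropped, out of 251,648 sent.
first_run = loss_percent(967 + 3, 251_648)
# Second run: roughly 20,000 lost out of roughly 2 million sent.
second_run = loss_percent(20_000, 2_000_000)

print(f"first run:  {first_run:.2f}%")
print(f"second run: {second_run:.2f}%")
```

That works out to a bit under 0.4% for the ping on its own and about 1% once the copy was running alongside it.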
I let the copy go on for a little while; still nothing in the syslog.
Tired of watching the progress bar, I canceled the copy. I immediately
realized, however, that though I could double-click on "pocketscience" and
read the contents, the two subdirectories didn't have the new names I had
given them.
When I double-clicked on each of "from_andrews_home" and "viking", the mac
yelped an error and refused to open the directories.
Then, finally, a syslog entry:
-------------------------
Dec 18 15:16:00 challenger afpd[1765]: WARNING: DID conflict for 'pa' and
'pa2'. Are these the same file?
Dec 18 15:16:00 challenger afpd[1765]: WARNING: DID conflict for
'from_andrews_home' and 'from_andrews_home2'. Are these the same file?
<snipping DHCP requests>
Dec 18 15:17:59 challenger afpd[1765]: WARNING: DID conflict for 'pa' and
'pa2'. Are these the same file?
Dec 18 15:17:59 challenger afpd[1765]: WARNING: DID conflict for
'from_andrews_home' and 'from_andrews_home2'. Are these the same file?
Dec 18 15:18:47 challenger PAM_pwdb[1765]: (netatalk) session closed for
user miketec
Dec 18 15:18:47 challenger afpd[1765]: 273819.15KB read, 5.18KB written
Dec 18 15:18:48 challenger afpd[1765]: done
--------------------------
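
If you try this yourself and want to pull just the conflict warnings out of a busy syslog, something like the following Python sketch should do. The regular expression is written only against the message text shown above, so other netatalk versions may word it differently, and count_did_conflicts is a name I made up. It expects each log entry on a single line, as in the real log file rather than the wrapped text of this email.

```python
import re
from collections import Counter

# Matches the afpd warning text seen in the log excerpt above.
DID_CONFLICT = re.compile(
    r"afpd\[\d+\]: WARNING: DID conflict for '([^']*)' and\s+'([^']*)'"
)


def count_did_conflicts(lines):
    """Count DID conflict warnings per (first_name, second_name) pair."""
    counts = Counter()
    for line in lines:
        match = DID_CONFLICT.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


sample = [
    "Dec 18 15:16:00 challenger afpd[1765]: WARNING: DID conflict for 'pa' and 'pa2'. Are these the same file?",
    "Dec 18 15:17:59 challenger afpd[1765]: WARNING: DID conflict for 'pa' and 'pa2'. Are these the same file?",
    "Dec 18 15:18:48 challenger afpd[1765]: done",
]
for pair, n in count_did_conflicts(sample).items():
    print(pair, n)
```

For a real log you would pass an open file object instead of the sample list, since iterating over a file yields one line at a time.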
Here I had a choice, either try to fix the problem and salvage the files,
or try to make them disappear. I chose to try to fix the problem; I'll
try making them disappear again some other time.
I noticed that simply closing the windows yet still keeping the mac's
challenger mount didn't refresh the windows. The only thing that
successfully resolved the DID conflict was disconnecting from the server
and connecting again.
Restarting the mac didn't make any difference to this test; unmounting
and remounting was what appeared to matter most.
Thanks for reading this far. The reason I'm sending this to the admin
list and not to the developers is that I was hoping someone might help me
out. If somebody has some time and the equipment to reproduce these
errors following the steps above, it might speed things along. Then I'll
know for sure whether it was something to do with my equipment, how I
worked, or if it truly was the error I was looking for.
Thanks,
- Mike Johnson
This archive was generated by hypermail 2b28 : Wed Jan 17 2001 - 14:32:47 EST