Possible Bug: OpenBSD 3.0/SPARC
Bug Overview:
Kernel panic: alignment fault
Occurs when using basic bridging with basic pf rules
Bridging MUST be enabled and both Ethernet links must be plugged in
OpenBSD 3.0, 32-bit SPARC port, default install
Occurs with the default 3.0 kernel as well as with kernels compiled
from fully patched source (through patch 013).
Hardware:
Sun Microsystems SPARCstation 5 with 170 MHz CPU (I understand this
chip has a slightly different microarchitecture. Linux/SPARC
does not work at all with these chips, but prior OpenBSD versions
worked very well. I haven't seen any noticeable problems when
using this machine for non-bridging purposes, e.g. as a workstation.)
Sun QuadFastEthernet SBus card (4x HappyMeal controllers)
128 MB RAM (I've tried numerous different sets of known-good Sun
RAM)
Seagate 2 GB SCA SCSI drive (tried two of these drives, as well as
one Quantum 1 GB, to ensure swap space corruption was not
involved)
Sun CGthree SBus video (tried two of these)
Software and configuration:
Default OpenBSD 3.0/SPARC install
set "pf=YES" in /etc/rc.conf
uncomment "pass in all" and "pass out all" in /etc/pf.conf
set /etc/hostname.hme0 to "up"
set /etc/hostname.hme1 to "up"
set /etc/bridgename.bridge0 to three lines: "add hme0", "add hme1", "up"
rebooted so that the changes take effect automatically at boot.
Network Environment:
Testing in a private office LAN (10.1.1/24)
IPv4 traffic only (no IPv6)
about 20 Linux machines, about 20 Windows workstations
two completely idle test machines (one Red Hat, one Debian) are on
the hme1 side of the bridge. The only traffic they deal with is
network broadcasts.
Error:
After a random number of seconds of Ethernet connectivity to both network
segments (and successful pings going through), the following error comes
up and drops me to the ddb> prompt:
trap type 0x7: pc=0xf80ca2a4 npc=0xf80ca2a8 psr=50007c6<s,PS>
panic: alignment fault
Stopped at _Debugger+0x4: jmpl [%o7 + 0x8], %g0
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!
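For anyone decoding the trap line above: on 32-bit SPARC (V8), trap type 0x7 is the memory-address-not-aligned trap, which matches the panic message. The psr value can also be split into its fields by hand. The following is only a minimal C sketch of that decoding, assuming the standard SPARC V8 PSR bit layout; the names psr_fields and decode_psr are made up for illustration and are not taken from the OpenBSD kernel.

```c
#include <stdint.h>

/* Fields of the SPARC V8 processor state register (PSR).
 * Hypothetical helper type for illustration, not kernel code. */
struct psr_fields {
    unsigned impl; /* bits 31:28, implementation             */
    unsigned ver;  /* bits 27:24, version                    */
    unsigned icc;  /* bits 23:20, integer condition codes    */
    unsigned ec;   /* bit 13, coprocessor enabled            */
    unsigned ef;   /* bit 12, FPU enabled                    */
    unsigned pil;  /* bits 11:8, processor interrupt level   */
    unsigned s;    /* bit 7, supervisor mode                 */
    unsigned ps;   /* bit 6, previous supervisor mode        */
    unsigned et;   /* bit 5, traps enabled                   */
    unsigned cwp;  /* bits 4:0, current window pointer       */
};

/* Split a raw PSR value into its fields. */
static struct psr_fields decode_psr(uint32_t psr)
{
    struct psr_fields f;
    f.impl = (psr >> 28) & 0xf;
    f.ver  = (psr >> 24) & 0xf;
    f.icc  = (psr >> 20) & 0xf;
    f.ec   = (psr >> 13) & 0x1;
    f.ef   = (psr >> 12) & 0x1;
    f.pil  = (psr >> 8)  & 0xf;
    f.s    = (psr >> 7)  & 0x1;
    f.ps   = (psr >> 6)  & 0x1;
    f.et   = (psr >> 5)  & 0x1;
    f.cwp  = psr & 0x1f;
    return f;
}
```

Calling decode_psr(0x050007c6) gives S=1 and PS=1 (the two flags printed after the psr value in the trap line), ET=0, PIL=7 and CWP=6. In other words the fault was taken in supervisor mode, which is consistent with a fault inside the kernel rather than in a user process.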
Here's the trace (hopefully no typos!):
ddb> trace
_trap(0x7, 0x50007c6, 0xf80ca2a4, 0xf821fcc0, 0x0, 0xf85e8170) at _trap+0xdc
_kernel_text(0xf0000000, 0x40000000, 0xf85e9600, 0xf85e9600, 0x0, 0x0) at
_kernel_text+0x4694
_bridge_filter(0xf85e9632, 0x1, 0xf850b08c, 0xf21fe68, 0xf85e9600, 0x0) at
_bridge_filter+0xa0
_bridge_broadcast(0xf85b5000, 0xf850b28c, 0xf821fe68, 0xf85e9500, 0x14,
0xf8228800) at _bridge_broadcast+0x14c
_bridgeintr_frame(0xf85b5000, 0xf85e9500, 0x100, 0xffc00000, 0x0,
0xf8227800) at _bridgeintr_frame+0x2f0
_bridgeintr(0x20000000, 0xf82703c4, 0x100, 0x2, 0xffff, 0x2800) at
_bridgeintr+0x4c
_soft01intr(0x20000000, 0xf81eebe8, 0x3c63f8ad, 0x9eb63, 0x53, 0xfc0cdaf8)
at _soft01intr+0x100
_kernel_text(0x0, 0x0, 0xf800a988, 0x0, 0xffffffff, 0xf00) at
_kernel_text+0x4900
_mi_switch(0xf94, 0x3, 0xfc0fa220, 0xf8266388, 0x0, 0x34) at
_mi_switch+0x19c
_ltsleep(0x0, 0x11a, 0xf80a81c8, 0x32, 0x0, 0x0) at _ltsleep+0x1e4
_bpfread(0x16, 0xfc0fbe28, 0x0, 0xf80a81cc, 0xf84fa1a0, 0x0) at
_bpfread+0xec
_spec_read(0xfc0fbd30, 0xf8099220, 0x30, 0xf85b6500, 0xf8263800,
0xf8228800) at _spec_read+0xc4
_ufsspec_read(0xfc0fbd30, 0xf815efbc, 0x30, 0xf85b6900, 0x0, 0x1) at
_ufsspec_read+0x34
_VOP_READ(0xfc0fe548, 0xfc0fbe28, 0x0, 0xf84ff680, 0x2800, 0x2800) at
_VOP_READ+0x30
_vn_read(0x0, 0xfc0e1440, 0xfc0fbe28, 0xf84ff680, 0xf8090220, 0xfc0cdaf8)
at _vn_read+0x68
_dofileread(0xfc0cdaf8, 0x3, 0x8000, 0x4e000, 0x8000, 0xfc0e1440) at
_dofileread+0x78
_sys_read(0xfc0cdaf8, 0xfc0fbf28, 0xfc0fbf20, 0xf806b580, 0x0, 0x34) at
_sys_read+0x58
_syscall(0x3, 0xfc0fbfb0, 0x0, 0x11, 0x0, 0x0) at _syscall+0x1f4
_kernel_text(0x3, 0x4e00, 0x8000, 0x11, 0x0, 0x0) at _kernel_text+0x47b0
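One detail in the trace that may be relevant: the first value shown in the _bridge_filter frame, 0xf85e9632, is 2-byte aligned but not 4-byte aligned. SPARC traps on any 32-bit load or store whose address is not a multiple of 4, and a 14-byte Ethernet header leaves whatever follows it misaligned unless the frame itself starts 2 bytes past a 4-byte boundary. The C sketch below only illustrates that arithmetic and one portable way to read a 32-bit field from an unaligned address. It is not a diagnosis of this panic: which of the values in the frame is actually a packet pointer is an assumption, and is_aligned and load_be32 are hypothetical helper names, not kernel functions.

```c
#include <stddef.h>
#include <stdint.h>

#define ETHER_HDR_LEN 14 /* dst MAC (6) + src MAC (6) + ethertype (2) */

/* Return nonzero if addr is a multiple of align (align must be a power of two). */
static int is_aligned(uintptr_t addr, size_t align)
{
    return (addr & (uintptr_t)(align - 1)) == 0;
}

/* Read a big-endian 32-bit value one byte at a time. Because it never
 * issues a 32-bit load, it is safe at any address, aligned or not. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```

For example, is_aligned(0xf85e9632, 4) is false while is_aligned(0xf85e9632, 2) is true. Likewise, a frame starting at a 4-byte boundary such as 0xf85e9600 puts the data after the Ethernet header at 0xf85e960e, which is not 4-byte aligned, whereas a frame starting at 0xf85e9602 puts it at 0xf85e9610, which is.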
Here's the ps:
ddb> ps
PID PPID PGRP UID S FLAGS WAIT COMMAND
7262 1 7262 0 3 0x4086 ttyin csh
10533 1 10533 0 3 0x84 nanosleep cron
17934 1 17934 0 3 0x84 select screenblank
12274 1 12274 0 3 0x84 select sshd
26250 1 26250 0 3 0x84 bpf pflogd
31878 1 31878 0 3 0x84 syslogd
6 0 0 0 3 0x100204 crypto_wa crypto
5 0 0 0 3 0x100204 syncer update
4 0 0 0 3 0x100204 cleaner cleaner
3 0 0 0 3 0x100204 reaper reaper
2 0 0 0 3 0x100204 daemon_sl pagedaemon
1 0 1 0 3 0x4084 wait init
0 -1 0 0 3 0x80204 scheduler swapper
ddb>
Notes:
I've gone through about half a dozen rebuilds of this machine to
ensure that there is no hardware fault. However, I have not tested this
configuration (basic bridging) on any other 32-bit SPARC using OpenBSD
3.0. I have used bridging on OpenBSD 2.6 through 2.9 with other SPARCs
(not the SS5/170).
I was hoping that the available kernel patches (as of Feb 7, 2002)
would fix whatever is causing this problem. Unfortunately, these patches
didn't make any difference.
It seems that I can trigger the crash by trying to ping one of
the test machines on the hme1 side of the bridge. A couple of pings get
through, and then the panic shown above occurs.
Laird Bedore
VectorStar Networks
marasmus@vectorstar.net