
Possible Bug: OpenBSD 3.0/SPARC



Bug Overview:
	Kernel panic: alignment fault
	Occurs when using basic bridging with basic pf rules
	Bridging MUST be enabled and both ethernet links must be plugged in
	OpenBSD 3.0, 32-bit SPARC port, default install
	Happens with the default 3.0 kernel as well as with kernels
	compiled from fully patched source (through patch 013).

Hardware:
	Sun Microsystems SPARCstation 5 with 170 MHz CPU (I understand this
	 chip has a slightly different microarchitecture. Linux/SPARC
	 does not work at all with these chips, but prior OpenBSD versions
	 worked very well. I haven't seen any noticeable problems when
	 using this machine as a workstation, i.e. without bridging.)
	Sun QuadFastEthernet SBus card (4x HappyMeal controllers)
	128 MB RAM (I've tried numerous different sets of known-good Sun
	 RAM)
	Seagate 2 GB SCA SCSI drive (tried two of these drives, as well as
	 one Quantum 1 GB, to ensure swap space corruption was not
	 involved)
	Sun CGthree SBus video (tried two of these)

Software and configuration:
	Default OpenBSD 3.0/SPARC install
	set "pf=YES" in /etc/rc.conf
	uncomment "pass in all" and "pass out all" in /etc/pf.conf
	set hostname.hme0 to "up"
	set hostname.hme1 to "up"
	set bridgename.bridge0 to "add hme0","add hme1","up"
	rebooted to let changes take effect on automatic boot.
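
	For reference, the steps above should leave the files looking
	roughly like the sketch below. This is reconstructed from the
	list, not copied off the machine. The /etc/ paths and the
	one-directive-per-line layout of bridgename.bridge0 are
	assumptions, and the "#" lines are only labels showing which
	file each fragment belongs to, not file contents.

```
# /etc/rc.conf (only the changed line)
pf=YES

# /etc/pf.conf (the two rules uncommented from the default file)
pass in all
pass out all

# /etc/hostname.hme0
up

# /etc/hostname.hme1
up

# /etc/bridgename.bridge0
add hme0
add hme1
up
```

	Neither hme interface is given an IP address here; both are
	only brought up and added to bridge0.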

Network Environment:
	Testing in a private office LAN (10.1.1/24)
	IPv4 traffic only (no IPv6)
	about 20 Linux machines, about 20 Windows workstations
	two completely idle test machines (one Red Hat, one Debian) are on
	 the hme1 side of the bridge. The only traffic they deal with is
	 network broadcasts.


Error:
After a random number of seconds of ethernet connectivity to both network
segments (and successful pings going through), the following error comes
up and drops me to ddb>:

trap type 0x7: pc=0xf80ca2a4 npc=0xf80ca2a8 psr=50007c6<s,PS>
panic: alignment fault
Stopped at      _Debugger+0x4:  jmpl            [%o7 + 0x8], %g0
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!

Here's the trace (hopefully no typos!):

ddb> trace
_trap(0x7, 0x50007c6, 0xf80ca2a4, 0xf821fcc0, 0x0, 0xf85e8170) at _trap+0xdc
_kernel_text(0xf0000000, 0x40000000, 0xf85e9600, 0xf85e9600, 0x0, 0x0) at
_kernel_text+0x4694
_bridge_filter(0xf85e9632, 0x1, 0xf850b08c, 0xf21fe68, 0xf85e9600, 0x0) at
_bridge_filter+0xa0
_bridge_broadcast(0xf85b5000, 0xf850b28c, 0xf821fe68, 0xf85e9500, 0x14,
0xf8228800) at _bridge_broadcast+0x14c
_bridgeintr_frame(0xf85b5000, 0xf85e9500, 0x100, 0xffc00000, 0x0,
0xf8227800) at _bridgeintr_frame+0x2f0
_bridgeintr(0x20000000, 0xf82703c4, 0x100, 0x2, 0xffff, 0x2800) at
_bridgeintr+0x4c
_soft01intr(0x20000000, 0xf81eebe8, 0x3c63f8ad, 0x9eb63, 0x53, 0xfc0cdaf8)
at _soft01intr+0x100
_kernel_text(0x0, 0x0, 0xf800a988, 0x0, 0xffffffff, 0xf00) at
_kernel_text+0x4900
_mi_switch(0xf94, 0x3, 0xfc0fa220, 0xf8266388, 0x0, 0x34) at
_mi_switch+0x19c
_ltsleep(0x0, 0x11a, 0xf80a81c8, 0x32, 0x0, 0x0) at _ltsleep+0x1e4
_bpfread(0x16, 0xfc0fbe28, 0x0, 0xf80a81cc, 0xf84fa1a0, 0x0) at
_bpfread+0xec
_spec_read(0xfc0fbd30, 0xf8099220, 0x30, 0xf85b6500, 0xf8263800,
0xf8228800) at _spec_read+0xc4
_ufsspec_read(0xfc0fbd30, 0xf815efbc, 0x30, 0xf85b6900, 0x0, 0x1) at
_ufsspec_read+0x34
_VOP_READ(0xfc0fe548, 0xfc0fbe28, 0x0, 0xf84ff680, 0x2800, 0x2800) at
_VOP_READ+0x30
_vn_read(0x0, 0xfc0e1440, 0xfc0fbe28, 0xf84ff680, 0xf8090220, 0xfc0cdaf8)
at _vn_read+0x68
_dofileread(0xfc0cdaf8, 0x3, 0x8000, 0x4e000, 0x8000, 0xfc0e1440) at
_dofileread+0x78
_sys_read(0xfc0cdaf8, 0xfc0fbf28, 0xfc0fbf20, 0xf806b580, 0x0, 0x34) at
_sys_read+0x58
_syscall(0x3, 0xfc0fbfb0, 0x0, 0x11, 0x0, 0x0) at _syscall+0x1f4
_kernel_text(0x3, 0x4e00, 0x8000, 0x11, 0x0, 0x0) at _kernel_text+0x47b0


Here's the ps:

ddb> ps
 PID  PPID   PGRP   UID  S      FLAGS  WAIT       COMMAND
 7262     1   7262     0  3     0x4086  ttyin      csh
10533     1  10533     0  3       0x84  nanosleep  cron
17934     1  17934     0  3       0x84  select     screenblank
12274     1  12274     0  3       0x84  select     sshd
26250     1  26250     0  3       0x84  bpf        pflogd
31878     1  31878     0  3       0x84             syslogd
6         0      0     0  3   0x100204  crypto_wa  crypto
5         0      0     0  3   0x100204  syncer     update
4         0      0     0  3   0x100204  cleaner    cleaner
3         0      0     0  3   0x100204  reaper     reaper
2         0      0     0  3   0x100204  daemon_sl  pagedaemon
1         0      1     0  3     0x4084  wait       init
0        -1      0     0  3    0x80204  scheduler  swapper
ddb>


Notes:
	I've gone through about half a dozen rebuilds of this machine to
ensure that there is no hardware fault. However, I have not tested this
configuration (basic bridging) on any other 32-bit SPARC using OpenBSD
3.0. I have used bridging on OpenBSD 2.6 through 2.9 with other SPARCs
(not the SS5/170).
	I was hoping that the available kernel patches (as of Feb 7, 2002)
would fix whatever is causing this problem. Unfortunately, these patches
didn't make any difference.
	It seems that I can trigger the crash by pinging one of the test
machines on the hme1 side of the bridge. A couple of pings get through,
and then the machine panics as shown above.


Laird Bedore
VectorStar Networks
marasmus@vectorstar.net