Re: occasional kernel freezes possibly related to aac(4) 2410SA
Ingo Schwarze said:
> Hi Per, hi misc,
>
>> I have (maybe had...) server freezing problems (they came during the
>> test phase, a couple of days apart) with my new home server, which is a
>> Dell 400SC with an Adaptec 2410SA (4.1-0[5934]) set up as a mirror. This
>> server is new, and I have only run OpenBSD 3.6 on it, and only with my
>> 2410SA card. When the server locked up I could not even see anything on
>> the attached monitor. I have not enabled any debugging in the kernel yet
>> to track down whether it's the Adaptec card that causes my hangs. I was
>> going to try what "Antonios Anastasiadis" did (see misc thread) and
>> disable both the uhci and the ehci driver to see if I get rid of the
>> freezes (he did, and pointed to a very buggy *hci driver).
>
> As i don't need USB anyway, i just compiled the kernel without any
> usb support. Let's see whether that will change anything...
>
>> However... before disabling the *hci in the kernel I tried to disable the
>> driver cache on my 2410 card (a long shot), and the hangs have not
>> shown up since.
>
> When i first configured the controller, i enabled all cache options,
> too. But right after the first hang, i disabled the write cache on
> all SATA devices - after all, when the machine hangs with cache
> enabled, you lose data.
>
> I just checked the BIOS settings of the card.
> They are currently as follows:
>
> Controller configuration:
> Drive write cache - disabled
> Runtime BIOS - enabled
> Automatic Failover - enabled
> Array background consistency check - disabled
> Array based BBS support - disabled
> Physical device display during POST - enabled
> Alarm control - enabled
>
> SATA device configuration:
> Write cache - all four channels - no
>
>> Can you try to disable all cache and reply with status?
>
> The last hang of my machine occurred with firmware 4.2-0 (=Build 7348)
> while the write cache on all SATA devices was explicitly switched
> off.
>
>> I really want to solve my issue AND have an enabled cache.
>
> You mean, having paid cash for cache, you would like to actually
> use it? Hmmmm...
>
> Well, first of all, i should like to get the controller firmware
> to work reliably *at all*. Locking up every second week is not
> that nice.
>
> Yours
> Ingo
>
>
Hi Ingo,

It seems like I can only wait for a crash then... After some thought,
maybe my problem is more like Antonios' (search for "3.6 random lockups"
on misc), where the *hci was the problem. He described that he could not
see anything on the screen when the system hung. I actually experience
the same thing. But I have not seen a hang since I disabled the driver
cache... Maybe it's just a matter of time...

I upgraded to firmware 7348 from 5934 even though I have not yet
experienced a hang after disabling the driver cache. My problem is that I
have never run the new Dell 400SC server or the new 2410SA card under
anything but 3.6. So I have a new server, a new RAID card and a new
OpenBSD release all at once. Maybe it's the RAID card... Maybe it's the
*hci... I don't know.

Don't you think a very buggy 2410SA firmware would cause problems on
Windows servers as well? Could it really be a totally unusable Adaptec
card? Are we the only people running a 2410SA with OpenBSD? Why does it
work for everyone else... since we seem to be the only posters whose
2410SA cards freeze? The knowledge base at Adaptec has no postings about
this issue.

By the way... are you, like me, running the 64-bit 2410SA card in a
32-bit slot?

While waiting for Adaptec updates, can we check the USB side (the
Antonios track)? What does "dmesg|grep hci" show on your system?
My system shows...
xanadu# dmesg|grep hci
uhci0 at pci0 dev 29 function 0 "Intel 82801EB/ER USB" rev 0x02: irq 11
usb0 at uhci0: USB revision 1.0
uhci1 at pci0 dev 29 function 1 "Intel 82801EB/ER USB" rev 0x02: irq 10
usb1 at uhci1: USB revision 1.0
uhci2 at pci0 dev 29 function 2 "Intel 82801EB/ER USB" rev 0x02: irq 9
usb2 at uhci2: USB revision 1.0
uhci3 at pci0 dev 29 function 3 "Intel 82801EB/ER USB" rev 0x02: irq 11
usb3 at uhci3: USB revision 1.0
ehci0 at pci0 dev 29 function 7 "Intel 82801EB/ER USB" rev 0x02: irq 5
ehci0: EHCI version 1.0
ehci0: companion controllers, 2 ports each: uhci0 uhci1 uhci2 uhci3
usb4 at ehci0: USB revision 2.0
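
For reference, the uhci and ehci drivers listed above can probably be
switched off for a test without recompiling, by using the kernel
configuration editor from config(8). This is only a sketch and I have not
run it on this machine: it assumes the stock kernel is at /bsd, and the
output name /bsd.nousb is made up for this example.

```shell
# Write a modified copy of the kernel with all uhci and ehci devices disabled.
# /bsd.nousb is a hypothetical file name; the original /bsd is left untouched.
# The three lines fed on stdin are the same commands one would type at the
# interactive prompt of "config -e".
printf 'disable uhci\ndisable ehci\nquit\n' | config -e -o /bsd.nousb /bsd
```

Booting /bsd.nousb from the boot prompt would then run without the *hci
drivers. The same two "disable" commands can also be typed at the UKC>
prompt after "boot -c" for a one-time test.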
/Per-Olov