thinkpad t41, acpi + ICH4 woes



> Date: Fri, 2 Jan 2004 12:34:23 -0800
> From: t12 <donnie_(_at_)_opus_(_dot_)_bloomcounty_(_dot_)_org>
> Sender: owner-freebsd-mobile_(_at_)_freebsd_(_dot_)_org
> 
> I recently got a new thinkpad t41 (23739FU), and have been working on
> FreeBSD 5.2-RC2. Everything has been excellent, except of course for
> suspend/resume.
> 
> In both ACPI and APM suspends, I get a hang on resume. With APM, occasionally
> it would resume safely. This appears to be a common problem, across multiple
> FreeBSD releases. I've been tinkering with the ata driver, which seems to be at
> least involved with the problems. I'm at an impasse at this point, and am
> wondering if anyone would like to help, or has input.
> 
> Here are my findings:
> 
> With both ACPI and APM, if I suspend in single-user mode, resume works fine.
> In multiuser mode, booted normally, it crashes on resume. Notably, the hard
> drive light stays on during this crash. I cannot escape to the kernel debugger
> from this state.
> 
> According to intel's docs on the ICH4-M:
> (http://www.intel.com/design/mobile/datashts/25233701.pdf section 5.15.2.4)
> 
> "
> Native Mode IDE/ACPI S3 Resume Hang Avoidance:
> 
> System BIOS must clear the interrupt bit (Bit 2) in Bus Master IDE Status Register for BOTH
> primary and secondary channels prior to passing control to the OS during resume from S3 state
> (STR). This ensures that the pending IDE interrupt(s) are cleared when the control is passed to the
> OS. The registers are locked in I/O space via BM_BASE register (Bus 0: Device 31: Register 20-
> 23h) at offset 02h and offset 0Ah, respectively. Failure to do this may result in system hang when
> the OS starts executing resume sequence from S3 (STR) under certain conditions. These conditions
> include a combination of the following:
> 
> - Only a single channel of IDE is enabled (either Primary or Secondary)
> - Native IDE Mode capability is reported by the BIOS
> - OS is capable of dynamically switching from Legacy IDE Mode to Native IDE Mode.
> 
> A system hang may occur if there exists a pending IDE Interrupt status bit during the legacy IDE
> Mode to Native IDE mode sequence, the OS software may not clear the IDE interrupt(s), resulting
> in an apparent hang condition (interrupt storm).
> "
> 
> This seems to be about what's happening. Of course I don't really have
> the ability to change things like that in the BIOS, but I attempted to
> clear those bits in the ata driver in ata_resume(). It didn't make a
> difference. I noted that sometimes the interrupt bit is set before I change
> it, but most of the time it isn't.
> 
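For anyone wanting to repeat the experiment, here is a minimal sketch of
the clearing step the datasheet excerpt describes. It assumes the usual
Bus Master IDE status layout, where bit 2 (interrupt) and bit 1 (error)
are write-1-to-clear and bits 5-6 are plain read/write. The port_io
accessors, the helper names, and the fake_bmide register file are
hypothetical and exist only so the logic can be run outside a kernel;
this is not the code that was tried in ata_resume().

```c
#include <stdint.h>

/* Bus Master IDE status register offsets from BM_BASE (per the quoted text). */
#define BMIDE_STATUS_PRIMARY    0x02
#define BMIDE_STATUS_SECONDARY  0x0A

/* Status register bits. Interrupt and Error are write-1-to-clear. */
#define BMIDE_STAT_ERROR        0x02
#define BMIDE_STAT_INTERRUPT    0x04
#define BMIDE_STAT_DMA_CAPABLE  0x60  /* drive 0/1 DMA capable, read/write */

/* Hypothetical port accessors; a real driver would use its bus-space API. */
struct port_io {
    uint8_t (*read8)(void *ctx, uint16_t port);
    void    (*write8)(void *ctx, uint16_t port, uint8_t val);
    void    *ctx;
};

/* Decode an I/O-space PCI BAR: the low two bits are flags, not address. */
static uint16_t bm_base_from_bar(uint32_t bar)
{
    return (uint16_t)(bar & ~0x3u);
}

/* Clear a pending interrupt on one channel without touching the error bit. */
static void bmide_clear_interrupt(const struct port_io *io,
                                  uint16_t bm_base, uint16_t offset)
{
    uint16_t port = (uint16_t)(bm_base + offset);
    uint8_t status = io->read8(io->ctx, port);

    /* Preserve the read/write bits; write 1 only to the interrupt bit. */
    io->write8(io->ctx, port,
               (uint8_t)((status & BMIDE_STAT_DMA_CAPABLE) | BMIDE_STAT_INTERRUPT));
}

/* The datasheet asks for BOTH channels, even if only one is enabled. */
static void bmide_clear_both_channels(const struct port_io *io, uint16_t bm_base)
{
    bmide_clear_interrupt(io, bm_base, BMIDE_STATUS_PRIMARY);
    bmide_clear_interrupt(io, bm_base, BMIDE_STATUS_SECONDARY);
}

/* Simulated register file, present only so the sketch can be exercised. */
struct fake_bmide {
    uint16_t base;
    uint8_t  regs[16];
};

static uint8_t fake_read8(void *ctx, uint16_t port)
{
    struct fake_bmide *f = ctx;
    return f->regs[(port - f->base) & 0x0F];
}

static void fake_write8(void *ctx, uint16_t port, uint8_t val)
{
    struct fake_bmide *f = ctx;
    uint8_t *reg = &f->regs[(port - f->base) & 0x0F];
    uint8_t w1c = BMIDE_STAT_ERROR | BMIDE_STAT_INTERRUPT;

    /* Read/write bits take the written value; W1C bits clear where 1 is written. */
    *reg = (uint8_t)((*reg & ~BMIDE_STAT_DMA_CAPABLE) | (val & BMIDE_STAT_DMA_CAPABLE));
    *reg = (uint8_t)(*reg & ~(val & w1c));
}
```

Writing back only the interrupt bit, rather than the whole status byte,
avoids clearing a latched error bit as a side effect.
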
> Another thing I ran across, which could possibly be related:
> (http://support.microsoft.com/default.aspx?scid=kb;EN-US;323153)
> 
> 
> "
> SYMPTOMS
> A computer with two IDE drives on one cable may stop responding (hang) when the computer resumes from standby.
> 
> CAUSE
> The IDE disk driver does not check the busy bit in the drive status before resetting the drive.
> This creates a problem with computers that use the S3 sleep state for standby. This state cuts
> power to the drive instead of issuing a sleep command to the drive as in the S2 standby state.
> This causes the drives to be reset before the drive negotiation is finished when the drives restart.
> The specification for ATAPI drives calls for checking the busy bit before resetting the drive.
> The specification also states that the drive ignore resets until the drives are synchronized and
> ready to operate after negotiation. Some drives that are currently being sold do not do this.
> This can cause computers with two drives to start with both drives programmed to be the master
> on the same cable, causing the computer to hang.
> "
> 
> It's really not the same situation, but it seemed related enough to include.
> From what I can tell, ata_reset() does not check the busy bit before issuing
> a reset, but does wait for it afterwards.
> 
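The check described in the Microsoft article can be sketched roughly as
follows: poll the ATA status register until BSY (bit 7) clears, and only
then issue the reset. The function names, the callback types, and the
fake_drive used to exercise it are all hypothetical; this is not what
ata_reset() currently does, only an illustration of the idea under the
assumption that the status register can be read at that point.

```c
#include <stdint.h>

#define ATA_STATUS_BSY 0x80  /* bit 7 of the ATA status register */

/* Hypothetical callback that returns the drive's current status byte. */
typedef uint8_t (*ata_status_fn)(void *ctx);

/*
 * Poll until BSY clears or the attempts run out.
 * Returns the number of busy polls seen, or -1 on timeout.
 */
static int ata_wait_not_busy(ata_status_fn read_status, void *ctx, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if ((read_status(ctx) & ATA_STATUS_BSY) == 0)
            return i;
        /* A real driver would delay between polls (e.g. DELAY() in the kernel). */
    }
    return -1;
}

/* Issue the reset only once the drive has stopped reporting busy. */
static int ata_reset_if_idle(ata_status_fn read_status,
                             void (*do_reset)(void *ctx),
                             void *ctx, int max_polls)
{
    if (ata_wait_not_busy(read_status, ctx, max_polls) < 0)
        return -1;  /* still busy: do not reset */
    do_reset(ctx);
    return 0;
}

/* Fake drive that reports busy for a fixed number of status reads. */
struct fake_drive {
    int busy_reads_left;
    int resets;
};

static uint8_t fake_status(void *ctx)
{
    struct fake_drive *d = ctx;
    if (d->busy_reads_left > 0) {
        d->busy_reads_left--;
        return ATA_STATUS_BSY;
    }
    return 0x50;  /* DRDY | DSC: drive ready */
}

static void fake_reset(void *ctx)
{
    struct fake_drive *d = ctx;
    d->resets++;
}
```

With a drive that stays busy past the poll limit, ata_reset_if_idle()
returns -1 and never calls the reset callback, which is the behaviour
the article says the specification calls for.
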
> As far as I can tell from kernel debugging attempts, the system hangs
> immediately after queueing the command to change the master device on ata0
> from PIO_MAX to DMA_MAX.  A paraphrased trace looks something like:
> 
> ata_controlcmd()
> ata_intel_new_setmode()
> ata_identify_devices()
> ata_reinit()
> ata_resume()
> ...
> 
> Other related ACPI problems:
> 
> USB is unhappy on resume. This is also documented in the ICH4-M spec,
> something about "SE0 during Resume Causes Disconnect". It implies
> that, depending on OS handling, the device will either resume late or
> not resume at all.

This is a well-known problem. It has been under investigation for some
time, but it remains unresolved.

> When I run acpiconf -s 3, the system doesn't really go to sleep.
> The hard drive spins down, the LCD backlight powers off, and the fan spins
> down, but at the right angle you can tell that the LCD is still displaying
> text. The sleep light doesn't come on, and the DVD drive's power light stays
> on. With APM, the system really goes to sleep.

I believe that this has either been fixed or partly fixed in CURRENT. At
least my sleep light goes on now. My LCD has always turned off, although
the back-light does not turn off.

> I can't find it at the moment, but I ran across something that mentioned
> that if the AC97 Audio stuff is enabled on shutdown, the machine
> will not really enter ACPI S3. If I remember right, the document
> referred to the ICH4 rather than the ICH4-M, but they seem pretty
> similar.

Interesting. I'll have to play with this on my T30.

> I've read about a few problems with X11/AGP and APM/ACPI resume, and
> have run into them occasionally, but I'm trying to tackle the possible
> ata problems first.

As long as you switch to a vty on S3 and back out on resume, I have not
seen any of these problems. The suspend and resume scripts now work
correctly with ACPI in CURRENT.

> It'd be great if anyone has advice or insight into any of this stuff.
> I have plenty of time to work on this, but I'm running low on
> knowledge. Feel free to email me for other/further info.

While testing 5.2-RC is probably a very good idea (I'm guilty of not
doing so), ACPI is being updated at a rather rapid pace and you will
likely find fewer issues with CURRENT.

Unfortunately, the ATA issue is a killer. The primary person supporting
ACPI is nate_(_at_)_root_(_dot_)_org, but he closely monitors the ACPI list and you
would probably be best off sending this there (and subscribing to it):
acpi-jp_(_at_)_jp_(_dot_)_FreeBSD_(_dot_)_org. The primary ATA support is from Søren Schmidt
(sos_(_at_)_freebsd_(_dot_)_org). Make sure he gets copies of mail relating to the ATA
problem. I don't think he monitors the mobile list, but he is on
current. Or just cc: him.

A great many people would be very happy to see all of this ironed
out. Laptop support is a secondary priority for the FreeBSD developers, as
servers are the primary targets with embedded systems coming in
second. But LOTS of people including many committers run regularly on
laptops and are eager to get the bugs fixed as time allows. (E.g. Nate
has a ThinkPad.)

Thanks for the research you have already put into this.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman_(_at_)_es_(_dot_)_net			Phone: +1 510 486-8634