firewire: ohci: handle register access failure in SClk domain

One of the changes from OHCI-1394 v1.0 to v1.1 is that the PHY's SClk
signal may sometimes be absent during normal operation, in which case
accesses to certain registers fail.  See OHCI-1394 v1.1 sections
1.4.1, 4., and 6.1.

The specification does not tell us, though, how to recover from this
condition.

This patch adds a check for this condition at each and every register
access within the SClk domain.  If an access failure is encountered,
the access is retried 20 times in a busy loop in atomic contexts, or
with a 50 ms period (i.e. 1 second total retry time) in process
contexts.  If the failure persists, the error is passed up to higher
layers.  For example, if loss of SClk persists when the controller is
initialized or is woken up in PM resume, the controller is not enabled
and the pci_probe or resume fails.
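
The retry policy described above can be sketched roughly as follows.
This is a userspace simulation, not the driver code: struct fake_ohci,
raw_read(), pause_between_tries(), read_sclk_reg(), and the constant
names are hypothetical, the regAccessFail bit position is an assumption,
and the sleep is only accounted for rather than performed.  Whether the
count of 20 includes the first attempt is not stated here; the sketch
makes 20 attempts in total.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the driver's actual identifiers. */
#define REG_ACCESS_FAIL  0x00040000u  /* assumed regAccessFail bit in IntEvent */
#define MAX_TRIES        20
#define RETRY_PERIOD_MS  50

struct fake_ohci {
	uint32_t int_event;      /* simulated IntEvent register */
	uint32_t sclk_reg;       /* simulated register in the SClk domain */
	int      failing_reads;  /* upcoming accesses that hit missing SClk */
	int      slept_ms;       /* simulated sleep time, for illustration */
};

/* Simulated MMIO read: flags regAccessFail while SClk is absent. */
static uint32_t raw_read(struct fake_ohci *o)
{
	if (o->failing_reads > 0) {
		o->failing_reads--;
		o->int_event |= REG_ACCESS_FAIL;
		return 0;	/* value is not trustworthy */
	}
	return o->sclk_reg;
}

/* Process context: wait one period.  Atomic context: just spin. */
static void pause_between_tries(struct fake_ohci *o, bool may_sleep)
{
	if (may_sleep)
		o->slept_ms += RETRY_PERIOD_MS;	/* a driver would sleep here */
}

/* Returns 0 and stores the value on success, -1 if the failure persists. */
static int read_sclk_reg(struct fake_ohci *o, bool may_sleep, uint32_t *val)
{
	for (int i = 0; i < MAX_TRIES; i++) {
		o->int_event &= ~REG_ACCESS_FAIL;	/* clear stale flag */
		*val = raw_read(o);
		if (!(o->int_event & REG_ACCESS_FAIL))
			return 0;
		pause_between_tries(o, may_sleep);
	}
	return -1;	/* caller passes the error up to higher layers */
}
```

With may_sleep set, a persistent failure accumulates 20 * 50 ms = 1 s
of waiting before the error is returned, which matches the total retry
time given above.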

Since some of the accesses are in performance-sensitive paths, notably
cycleTimer access in the interrupt handler, this patch should perhaps
be followed up by an optimization for OHCI 1.0 controllers and maybe
some known good OHCI 1.1 controllers which don't need the regAccessFail
check, to skip the expensive MMIOs on them.

regAccessFail has been seen with the following devices:

Texas Instruments PCIxx21 FireWire + CardBus + flash memory card
controller in a Toshiba Satellite:
https://bugzilla.redhat.com/show_bug.cgi?id=608544

O2 Micro FireWire + flash memory card controller in various Dell
laptops:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/801719
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/881688
http://marc.info/?l=linux1394-devel&m=132309283531423
http://marc.info/?l=linux1394-devel&m=132368567907469
and several more reports.

Pinnacle MovieBoard:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=7f7e37115a8b
http://marc.info/?l=linux1394-devel&m=130714243325962

I don't have access to TI PCIxx21 and O2 Micro, hence tested this only
on several good controllers which never raise regAccessFail and on the
MovieBoard.  In case of the latter, the driver now detects the condition
as intended but still ends up in a lock-up due to an interrupt storm or
even in a panic when SClk loss happens; so we need to keep the MovieBoard
disabled for now.

Tests on an affected Toshiba or Dell laptop would be much appreciated.
My hope is that their PM resume problem is fixed by the access retries
with pauses until SClk is on in ohci_enable.

Cc: Ming Lei <ming.lei@canonical.com>
Reported-by: Bjørn Forbord <bforbord@broadpark.no>
Reported-by: Joel Bourrigaud <joel@bourrigaud.info>
Reported-by: Klaus Pedersen <projectu@gmail.com>
Reported-by: Marc Legris
Reported-by: Michael Heutzwer
Reported-by: Nikita Kitaev <nikitakit@gmail.com>
Reported-by: Nils Cant <nils@krash.be>
Reported-by: Robrecht Dewaele <robrecht.dewaele@gmail.com>
Reported-by: Steve Kroon <kroon@sun.ac.za>
Reported-by: Vianney <vidac2000@yahoo.fr>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
9 files changed