Re: [PATCH] scsi_error fix

Mike Anderson (andmike@us.ibm.com)
Fri, 7 Mar 2003 13:17:32 -0800


Andries.Brouwer@cwi.nl [Andries.Brouwer@cwi.nl] wrote:
> From: Patrick Mansfield <patmans@us.ibm.com>
>
> > [Further discussion and things I did not yet investigate:
> > What was changed to make this fail first in 2.5.63?
> > Experience shows that we get into a loop when something else
> > than SUCCESS is returned here. Probably that is a bug elsewhere.
> > Probably the commands that cause problems should never have been
> > sent in the first place.]
>
> The scsi error handler is also used to retrieve sense data for
> adapters/drivers that do not auto retrieve it. In such cases, it should
> not issue any aborts, resets etc.
>
> Indeed.
>
> Your change effectively disables that support - we never hit the code in
> scsi_eh_get_sense() to request sense. It would be very nice if we could
> fix (or audit) all the scsi drivers, apply your change and remove
> scsi_eh_get_sense, but AFAIK that has not and is not happening.
>
> No. What happened before was that we got into an infinite loop.
> The right action is to read the code, understand why it gets
> into a loop, and fix it. Once that has happened we may decide
> to undo my change. Or we may decide to ask for sense at that very spot.
>
> Today both James and Mike say that they can reproduce the loop,
> so probably they'll fix that part. If not, I'll have a look again.

Sorry about that Patrick. I had sent some mail last might and thought it
went to the list.

Both James and I can reproduce this problem. I was able to reproduce
using a hack to scsi_debug.

The loop problem is related to scsi error handling using the common code
of scsi_queue_insert / scsi_requesT_fn . When a command gets started the
scsi_init_cmd_errh function is called which sets retries to 0.

There maybe another issue with the scsi_eh_get_sense function, but I am
still looking at it.

I believe James is still pursuing a solution also.

-andmike

--
Michael Anderson
andmike@us.ibm.com

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/