> [Further discussion and things I did not yet investigate:
> What was changed to make this fail first in 2.5.63?
> Experience shows that we get into a loop when something else
> than SUCCESS is returned here. Probably that is a bug elsewhere.
> Probably the commands that cause problems should never have been
> sent in the first place.]
The scsi error handler is also used to retrieve sense data for
adapters/drivers that do not auto retrieve it. In such cases, it should
not issue any aborts, resets etc.
Indeed.
Your change effectively disables that support - we never hit the code in
scsi_eh_get_sense() to request sense. It would be very nice if we could
fix (or audit) all the scsi drivers, apply your change and remove
scsi_eh_get_sense, but AFAIK that has not and is not happening.
No. What happened before was that we got into an infinite loop.
The right action is to read the code, understand why it gets
into a loop, and fix it. Once that has happened we may decide
to undo my change. Or we may decide to ask for sense at that very spot.
Today both James and Mike say that they can reproduce the loop,
so probably they'll fix that part. If not, I'll have a look again.
Andries
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/