Re: PATCH 2.5.x disable BAR when sizing

Linus Torvalds (torvalds@transmeta.com)
Fri, 20 Dec 2002 09:05:53 -0800 (PST)


On Fri, 20 Dec 2002, David Mosberger wrote:
>
> Could you please stop this ia64 paranoia and instead explain to me why
> it's OK to relocate a PCI device to (0x100000000-PCI_dev_size)
> temporarily? That just seems horribly unsafe to me.

No. It's not horribly unsafe at all. It's very safe, for one simple
reason: it's how PCI probing has _always_ been done. Exactly because the
alternatives simply do not work.

I can also tell you why it does work, and why it's supposed to work: by
writing 0xffffffff to the BAR register, you basically move the BAR to high
PCI memory - even if it was enabled before. Which is fine, as long as
nobody else is in that high memory. So the secondary rule to "don't turn
off MEM or IO accesses" is "never allocate real PCI BAR resources at the
top of memory".

Think about it: if you move the BAR to high memory, you basically disable
only _that_ bar, and nothing else. You don't clobber any other associated
functions, or anything like that. It's clearly a _less_ disruptive thing
than disabling access to the whole device.

Anyway, why do you really want to add the MEM/IO disable? Do you actually
have a device that wants it, or are you just blinded by documentation
written by somebody who had no idea about what real life is all about?

> The PCI spec
> seems to say the same as it says pretty clearly that memory decoding
> should be disabled during BAR-sizing. If certain bridges cause
> problems, perhaps those need to be special-cased?

No. Do it the other way around: if there is some specific chipset that
actually _needs_ the disable, you do THAT special cased.

Because the current code works on everything that we know about, and we
_know_ that the disable doesn't work. As Ivan already pointed out, it's
not just bridges. When you disable IO/MEM accesses, you often disable
_everything_, which can break pretty much anything.

(I say "often", because it does actually depend on the chipset. Some chips
only seem to disable the BAR entries. Others disable all the "extended"
ports too, and leave only config space accessible after you've disable
IO/MEM)

The cases I've seen are northbridges that stop forwarding DMA, and USB
controllers that are still in legacy mode (because the BAR probing happens
before the USB driver has had a chance to _change_ it), where disabling IO
and/or MEM will cause the SMM code that does the legacy handling to just
lock up, since suddenly the hardware they expect to be there doesn't
respond any more.

Ivan pointed out that it also disables things like VGA legacy registers.

It will disable IDE legacy registers too, btw. I'd also expect it to
disable IDE DMA access, so if you happen to be trying to probe the BAR's
after somebody started IO on the IDE, you just made that IO fail
spectacularly, and I'd not be surprised if the IDE controller just locked
up as a result.

Let me re-iterate the "turn power off at the master switch in a house when
switching a light bulb" analogy. Yes, it's a good idea if you are nervous,
but you do that only when you _know_ who is in the house and you know what
they are doing and it's ok by them.

For example, it would be fine for a low-level driver (who has already
taken control of the device) to turn the device off. But it is NOT fine to
do it in general.

One solution in the long term may be to not even probe the BAR's at all in
generic code, and only do it in the pci_enable_dev() stuff. That way it
would literally only be done by the driver, who can hopefully make sure
that the device is ok with it.

Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/