    genirq/chip: Use the first chip in irq_chip_compose_msi_msg() · 13b90cad
    Thomas Gleixner authored
    The documentation of irq_chip_compose_msi_msg() claims that with
    hierarchical irq domains the first chip in the hierarchy which has an
    irq_compose_msi_msg() callback is chosen. But the code just keeps
    iterating after it finds a chip with a compose callback.
    
    The x86 HPET MSI implementation relies on that behaviour, but that does not
    make it more correct.
    
    The message should always be composed at the domain which manages the
    underlying resource (e.g. APIC or remap table) because that domain knows
    about the required layout of the message.
    
    On X86 the following hierarchies exist:
    
    1)   vector -------- PCI/MSI
    2)   vector -- IR -- PCI/MSI
    
    The vector domain has a different message format than the IR (remapping)
    domain. So obviously the PCI/MSI domain can't compose the message without
    having knowledge about the parent domain, which is exactly the opposite of
    what hierarchical domains want to achieve.
    
    X86 actually has two different PCI/MSI chips where #1 has a compose
    callback and #2 does not. #2 delegates the composition to the remap domain
    where it belongs, but #1 does it at the PCI/MSI level.
    
    For the upcoming device MSI support it's necessary to change this and just
    let the first domain which can compose the message take care of it. That
    way the top level chip does not have to worry about it and the device MSI
    code does not need special knowledge about topologies. It just sets the
    compose callback to NULL and lets the hierarchy pick the first chip which
    has one.
    
    Due to that, the attempt to move the compose callback from the direct
    delivery PCI/MSI domain to the vector domain made the system fail to boot
    with interrupt remapping enabled, because in the remapping case
    irq_chip_compose_msi_msg() keeps iterating and chooses the compose callback
    of the vector domain, which obviously creates the wrong format for the remap
    table.
    
    Break out of the loop when the first irq chip with a compose callback is
    found and fix up the HPET code temporarily. That workaround will be removed
    once the direct delivery compose callback is moved to the place where it
    belongs in the vector domain.
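
    The difference between the two lookups can be sketched in plain C as
    follows. This is a minimal userspace illustration, not the kernel code:
    the *_sketch types, compose_by_chip(), the -1 error value and
    remapped_composer() are hypothetical stand-ins, and the chain models
    hierarchy #2 under the assumption that the vector domain already has a
    compose callback.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct msi_msg: only records who composed it. */
struct msi_msg_sketch {
	const char *composed_by;
};

struct irq_data_sketch;

/* Hypothetical stand-in for struct irq_chip with an optional compose callback. */
struct irq_chip_sketch {
	const char *name;
	void (*compose)(struct irq_data_sketch *d, struct msi_msg_sketch *msg);
};

/* Hypothetical stand-in for struct irq_data: one level of the hierarchy. */
struct irq_data_sketch {
	struct irq_chip_sketch *chip;
	struct irq_data_sketch *parent;
};

static void compose_by_chip(struct irq_data_sketch *d, struct msi_msg_sketch *msg)
{
	msg->composed_by = d->chip->name;
}

/* Old behaviour: keep iterating, so the match closest to the root wins. */
static int compose_last_match(struct irq_data_sketch *data, struct msi_msg_sketch *msg)
{
	struct irq_data_sketch *pos = NULL;

	for (; data; data = data->parent) {
		if (data->chip && data->chip->compose)
			pos = data;
	}
	if (!pos)
		return -1; /* assumed error value for "no chip can compose" */
	pos->chip->compose(pos, msg);
	return 0;
}

/* New behaviour: leave the loop as soon as the first match is found. */
static int compose_first_match(struct irq_data_sketch *data, struct msi_msg_sketch *msg)
{
	struct irq_data_sketch *pos;

	for (pos = NULL; !pos && data; data = data->parent) {
		if (data->chip && data->chip->compose)
			pos = data;
	}
	if (!pos)
		return -1; /* assumed error value for "no chip can compose" */
	pos->chip->compose(pos, msg);
	return 0;
}

/* Build PCI/MSI -> IR -> vector and report which chip composed the message. */
static const char *remapped_composer(int first_match)
{
	static struct irq_chip_sketch vector_chip = { "vector", compose_by_chip };
	static struct irq_chip_sketch ir_chip = { "IR", compose_by_chip };
	static struct irq_chip_sketch pci_msi_chip = { "PCI/MSI", NULL };
	struct irq_data_sketch vector = { &vector_chip, NULL };
	struct irq_data_sketch ir = { &ir_chip, &vector };
	struct irq_data_sketch pci_msi = { &pci_msi_chip, &ir };
	struct msi_msg_sketch msg = { NULL };
	int ret;

	ret = first_match ? compose_first_match(&pci_msi, &msg)
			  : compose_last_match(&pci_msi, &msg);
	return ret ? NULL : msg.composed_by;
}
```

    In this sketch remapped_composer(1) stops at the IR chip, while
    remapped_composer(0) walks on to the vector chip, which is the wrong
    composer for the remapping case.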

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Marc Zyngier <maz@kernel.org>
    Link: https://lore.kernel.org/r/20200826112331.047917603@linutronix.de
     