    x86: Clean up late e820 resource allocation · 1f987577
    Linus Torvalds authored
    This makes the late e820 resources use 'insert_resource_expand_to_fit()'
    instead of doing a 'reserve_region_with_split()', and also avoids
    marking them as IORESOURCE_BUSY.
    
    This results in us being perfectly happy to use pre-existing PCI
    resources even if they were marked as being in a reserved region, while
    still avoiding any _new_ allocations in the reserved regions.  It also
    makes for a simpler and more accurate resource tree.
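
To make the "expand to fit" idea concrete, here is a small userspace sketch. It is not the kernel implementation: `toy_res`, `toy_add_child()` and `toy_insert_expand_to_fit()` are hypothetical names, and the model only handles a single level below the root (it always adopts overlapping entries and never descends into an existing entry). It shows a late-arriving range growing to cover whatever it overlaps and then becoming the parent of those entries.

```c
#include <stddef.h>

struct toy_res {
	unsigned long start, end;		/* inclusive range */
	const char *name;
	struct toy_res *parent, *child, *sibling;
};

static int toy_overlaps(const struct toy_res *a, const struct toy_res *b)
{
	return a->start <= b->end && a->end >= b->start;
}

/* Link 'res' under 'parent' in address order (assumes it overlaps no sibling). */
static void toy_add_child(struct toy_res *parent, struct toy_res *res)
{
	struct toy_res **p = &parent->child;

	while (*p && (*p)->start < res->start)
		p = &(*p)->sibling;
	res->sibling = *p;
	res->parent = parent;
	*p = res;
}

/* Toy model: insert 'new' under 'root', growing it over anything it overlaps. */
static void toy_insert_expand_to_fit(struct toy_res *root, struct toy_res *new)
{
	struct toy_res **p, **tail, *c;
	int grew;

	/* Step 1: grow 'new' until it fully covers every entry it overlaps. */
	do {
		grew = 0;
		for (c = root->child; c; c = c->sibling) {
			if (!toy_overlaps(c, new))
				continue;
			if (c->start < new->start) { new->start = c->start; grew = 1; }
			if (c->end > new->end) { new->end = c->end; grew = 1; }
		}
	} while (grew);

	/* Step 2: move the covered entries underneath 'new', keeping their order. */
	new->child = NULL;
	tail = &new->child;
	p = &root->child;
	while ((c = *p) != NULL) {
		if (toy_overlaps(c, new)) {
			*p = c->sibling;	/* unlink from root */
			c->sibling = NULL;
			c->parent = new;
			*tail = c;
			tail = &c->sibling;
		} else {
			p = &c->sibling;
		}
	}

	/* Step 3: 'new' no longer overlaps anything at this level, so link it. */
	toy_add_child(root, new);
}

/* Firmware entries exist first; the e820 range arrives late and wraps them. */
int toy_demo_children_under_reserved(void)
{
	struct toy_res root = { 0, ~0UL, "root", NULL, NULL, NULL };
	struct toy_res bus = { 0xfe800000UL, 0xfe8fffffUL, "PCI Bus 0000:01", NULL, NULL, NULL };
	struct toy_res ioapic = { 0xfec00000UL, 0xfec00fffUL, "IOAPIC 0", NULL, NULL, NULL };
	struct toy_res hpet = { 0xfed00000UL, 0xfed003ffUL, "HPET 0", NULL, NULL, NULL };
	struct toy_res rsv = { 0xe0000000UL, 0xfed003ffUL, "reserved", NULL, NULL, NULL };
	struct toy_res *c;
	int n = 0;

	toy_add_child(&root, &bus);
	toy_add_child(&root, &ioapic);
	toy_add_child(&root, &hpet);
	toy_insert_expand_to_fit(&root, &rsv);

	for (c = rsv.child; c; c = c->sibling)
		n++;
	/* -1 signals that 'reserved' is not the only entry under the root. */
	return (root.child == &rsv && rsv.sibling == NULL) ? n : -1;
}

/* A range that cuts HPET in half is grown to cover all of it. */
unsigned long toy_demo_expanded_end(void)
{
	struct toy_res root = { 0, ~0UL, "root", NULL, NULL, NULL };
	struct toy_res hpet = { 0xfed00000UL, 0xfed003ffUL, "HPET 0", NULL, NULL, NULL };
	struct toy_res rsv = { 0xfe900000UL, 0xfed001ffUL, "reserved", NULL, NULL, NULL };

	toy_add_child(&root, &hpet);
	toy_insert_expand_to_fit(&root, &rsv);
	return rsv.end;
}
```

In this model the pre-existing entries are never split or moved; the late range simply ends up as their common parent, which is the shape of the tree this patch aims for.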
    
    Example resource allocation from Jonathan Corbet, whose firmware has an
    e820 reserved entry covering a big range (e0000000-fed003ff), with
    various PCI resources inside it already set up by firmware.
    
    With old kernels, the reserved range would force us to re-allocate all
    pre-existing PCI resources, and his reserved range would end up looking
    like this:
    
    	e0000000-fed003ff : reserved
    	  fec00000-fec00fff : IOAPIC 0
    	  fed00000-fed003ff : HPET 0
    
    where only the pre-allocated special regions (IOAPIC and HPET) were kept
    around.
    
    With 2.6.28-rc2, which uses 'reserve_region_with_split()', Jonathan's
    resource tree looked like this:
    
    	e0000000-fe7fffff : reserved
    	fe800000-fe8fffff : PCI Bus 0000:01
    	 fe800000-fe8fffff : reserved
    	fe900000-fe9d9aff : reserved
    	fe9d9b00-fe9d9bff : 0000:00:1f.3
    	 fe9d9b00-fe9d9bff : reserved
    	fe9d9c00-fe9d9fff : 0000:00:1a.7
    	 fe9d9c00-fe9d9fff : reserved
    	fe9da000-fe9dafff : 0000:00:03.3
    	 fe9da000-fe9dafff : reserved
    	fe9db000-fe9dbfff : 0000:00:19.0
    	 fe9db000-fe9dbfff : reserved
    	fe9dc000-fe9dffff : 0000:00:1b.0
    	 fe9dc000-fe9dffff : reserved
    	fe9e0000-fe9fffff : 0000:00:19.0
    	 fe9e0000-fe9fffff : reserved
    	fea00000-fea7ffff : 0000:00:02.0
    	 fea00000-fea7ffff : reserved
    	fea80000-feafffff : 0000:00:02.1
    	 fea80000-feafffff : reserved
    	feb00000-febfffff : 0000:00:02.0
    	 feb00000-febfffff : reserved
    	fec00000-fed003ff : reserved
    	 fec00000-fec00fff : IOAPIC 0
    	 fed00000-fed003ff : HPET 0
    
    Because the reserved entry had been split and pushed down into the
    individual resources, and because each piece carried the
    IORESOURCE_BUSY flag, the drivers that wanted to _use_ those resources
    could not attach to them:
    
    	e1000e 0000:00:19.0: BAR 0: can't reserve mem region [0xfe9e0000-0xfe9fffff]
    	HDA Intel 0000:00:1b.0: BAR 0: can't reserve mem region [0xfe9dc000-0xfe9dffff]
    
    With this patch, the resource tree instead becomes:
    
    	e0000000-fed003ff : reserved
    	  fe800000-fe8fffff : PCI Bus 0000:01
    	  fe9d9b00-fe9d9bff : 0000:00:1f.3
    	  fe9d9c00-fe9d9fff : 0000:00:1a.7
    	    fe9d9c00-fe9d9fff : ehci_hcd
    	  fe9da000-fe9dafff : 0000:00:03.3
    	  fe9db000-fe9dbfff : 0000:00:19.0
    	    fe9db000-fe9dbfff : e1000e
    	  fe9dc000-fe9dffff : 0000:00:1b.0
    	    fe9dc000-fe9dffff : ICH HD audio
    	  fe9e0000-fe9fffff : 0000:00:19.0
    	    fe9e0000-fe9fffff : e1000e
    	  fea00000-fea7ffff : 0000:00:02.0
    	  fea80000-feafffff : 0000:00:02.1
    	  feb00000-febfffff : 0000:00:02.0
    	  fec00000-fec00fff : IOAPIC 0
    	  fed00000-fed003ff : HPET 0
    
    i.e. the one reserved region now ends up surrounding all the PCI resources
    that were allocated inside of it by firmware, and because it is not
    marked BUSY, drivers have no problem attaching to the pre-allocated
    resources.
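
The difference between the two layouts can be sketched in userspace as follows. This is a toy model, not kernel code: `sim_res`, `SIM_BUSY` and `sim_can_claim()` are hypothetical stand-ins, and the claim logic is only loosely modelled on how a region request walks the tree (it nests inside a non-busy entry that contains the range and gives up when it hits a busy one).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_BUSY 0x1UL	/* stand-in for IORESOURCE_BUSY; the value is arbitrary */

struct sim_res {
	unsigned long start, end;	/* inclusive range */
	unsigned long flags;
	struct sim_res *child;		/* first child */
	struct sim_res *sibling;	/* next entry at the same level */
};

/* Return the first child of 'parent' that overlaps [start, end], or NULL. */
static struct sim_res *sim_find_conflict(struct sim_res *parent,
					 unsigned long start, unsigned long end)
{
	struct sim_res *c;

	for (c = parent->child; c; c = c->sibling)
		if (c->start <= end && c->end >= start)
			return c;
	return NULL;
}

/* Toy model of a driver trying to claim [start, end] somewhere below 'root'. */
bool sim_can_claim(struct sim_res *root, unsigned long start, unsigned long end)
{
	struct sim_res *parent = root;

	assert(start <= end);
	for (;;) {
		struct sim_res *c = sim_find_conflict(parent, start, end);

		if (!c)
			return true;	/* nothing in the way under 'parent' */
		if (c->flags & SIM_BUSY)
			return false;	/* already owned by someone else */
		if (c->start > start || c->end < end)
			return false;	/* only a partial overlap */
		parent = c;		/* nest inside the non-busy entry */
	}
}

/* 2.6.28-rc2 layout: a BUSY "reserved" piece sits below the device BAR. */
bool sim_demo_split_busy(void)
{
	struct sim_res rsv = { 0xfe9e0000UL, 0xfe9fffffUL, SIM_BUSY, NULL, NULL };
	struct sim_res bar = { 0xfe9e0000UL, 0xfe9fffffUL, 0, &rsv, NULL };
	struct sim_res root = { 0, ~0UL, 0, &bar, NULL };

	return sim_can_claim(&root, 0xfe9e0000UL, 0xfe9fffffUL);
}

/* Layout with this patch: one non-busy "reserved" entry above the device BAR. */
bool sim_demo_nested_reserved(void)
{
	struct sim_res bar = { 0xfe9e0000UL, 0xfe9fffffUL, 0, NULL, NULL };
	struct sim_res rsv = { 0xe0000000UL, 0xfed003ffUL, 0, &bar, NULL };
	struct sim_res root = { 0, ~0UL, 0, &rsv, NULL };

	return sim_can_claim(&root, 0xfe9e0000UL, 0xfe9fffffUL);
}
```

In the first layout the claim for the e1000e BAR runs into the busy "reserved" piece and fails; in the second it nests through the non-busy reserved entry and the BAR, and succeeds.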
    
    Reported-and-tested-by: Jonathan Corbet <corbet@lwn.net>
    Cc: Yinghai Lu <yinghai@kernel.org>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Robert Hancock <hancockr@shaw.ca>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
e820.c 34.9 KB