Commit bcd11afa authored by Linus Torvalds, committed by Thomas Gleixner

x86/speculation/l1tf: Change order of offset/type in swap entry

If pages are swapped out, the swap entry is stored in the corresponding
PTE, which has the Present bit cleared. CPUs vulnerable to L1TF speculate
on PTE entries which have the Present bit cleared and would treat the swap
entry as a physical address (PFN). To mitigate that, the upper bits of the
PTE must be set so the PTE points to non-existent memory.

The swap entry stores the type and the offset of a swapped out page in the
PTE. The type is stored in bits 9-13 and the offset in bits 14-63. The
hardware ignores the bits beyond the physical address space limit, so to
make the mitigation effective it is required to start 'offset' at the lowest
possible bit so that even large swap offsets do not reach into the physical
address space limit bits.

Move offset to bits 9-58 and type to bits 59-63, which are the bits that
hardware generally doesn't care about.
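
As a rough illustration of the new layout, the following standalone C sketch mirrors the shift-up/shift-down encoding on a plain 64-bit integer. The helper names (swp_encode, swp_type, swp_offset) are hypothetical and not kernel API, and the value 8 for the PROT_NONE bit is an assumption taken from the "G (8)" note in the layout comment.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the PROT_NONE indicator aliases the G bit, i.e. bit 8. */
#define PAGE_BIT_PROTNONE    8
#define SWP_TYPE_BITS        5
#define SWP_OFFSET_FIRST_BIT (PAGE_BIT_PROTNONE + 1)                /* bit 9 */
#define SWP_OFFSET_SHIFT     (SWP_OFFSET_FIRST_BIT + SWP_TYPE_BITS) /* 14 */

/*
 * Hypothetical helper: shift the offset up "too far" by the type width,
 * then back down, so it lands in bits 9-58 with bits 59-63 cleared.
 * The type is then placed in bits 59-63.
 */
static inline uint64_t swp_encode(uint64_t type, uint64_t offset)
{
    return (offset << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS)
         | (type << (64 - SWP_TYPE_BITS));
}

/* Hypothetical helper: the type is simply the top five bits. */
static inline uint64_t swp_type(uint64_t val)
{
    return val >> (64 - SWP_TYPE_BITS);
}

/* Hypothetical helper: shift up to drop the type, then down to bit 0. */
static inline uint64_t swp_offset(uint64_t val)
{
    return val << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT;
}
```

For example, swp_encode(3, 1) yields a value with only bit 9 and the type bits set, and swp_type/swp_offset recover 3 and 1 from it.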

That, in turn, means that if you are on a desktop chip with only 40 bits of
physical addressing, now that the offset starts at bit 9, there need to be
30 bits of offset actually *in use* until bit 39 ends up being set, which
means when inverted it will again point into existing memory.

So that's 4 terabytes of swap space (because the offset is counted in pages,
so 30 bits of offset is 42 bits of actual coverage). With bigger physical
addressing, that obviously grows further, until the limit of the offset is
hit (at 50 bits of offset - 62 bits of actual swap file coverage).
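
The arithmetic above can be checked with a small C sketch. It assumes 4 KiB pages (a page shift of 12, which is what makes 30 bits of offset equal 42 bits of coverage); the function names are hypothetical, and phys_bits is assumed to be greater than 9.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT           12 /* assumption: 4 KiB pages */
#define SWP_OFFSET_FIRST_BIT 9  /* offset now starts at bit 9 */
#define SWP_OFFSET_BITS      50 /* offset occupies bits 9-58 */

/*
 * Hypothetical helper: how many low offset bits can be in use before the
 * entry sets the highest physical address bit (bit phys_bits - 1).
 * Capped at the width of the offset field. Assumes phys_bits > 9.
 */
static inline unsigned safe_offset_bits(unsigned phys_bits)
{
    unsigned bits = (phys_bits - 1) - SWP_OFFSET_FIRST_BIT;

    return bits > SWP_OFFSET_BITS ? SWP_OFFSET_BITS : bits;
}

/* Hypothetical helper: log2 of the swap size in bytes covered by those bits. */
static inline unsigned swap_coverage_bits(unsigned phys_bits)
{
    return safe_offset_bits(phys_bits) + PAGE_SHIFT;
}
```

With phys_bits = 40 this gives 30 offset bits and 42 bits of coverage, i.e. 1ULL << 42 bytes or 4 TiB; once the cap of 50 offset bits applies, coverage is 62 bits.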

This is a preparatory change for the actual swap entry inversion to protect
against L1TF.

[ AK: Updated description and minor tweaks. Split into two parts ]
[ tglx: Massaged changelog ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dave Hansen <dave.hansen@intel.com>
parent 50896e18
@@ -273,7 +273,7 @@ static inline int pgd_large(pgd_t pgd) { return 0; }
  *
  * |     ...            | 11| 10|  9|8|7|6|5| 4| 3|2| 1|0| <- bit number
  * |     ...            |SW3|SW2|SW1|G|L|D|A|CD|WT|U| W|P| <- bit names
- * | OFFSET (14->63) | TYPE (9-13)  |0|0|X|X| X| X|X|SD|0| <- swp entry
+ * | TYPE (59-63) |  OFFSET (9-58)  |0|0|X|X| X| X|X|SD|0| <- swp entry
  *
  * G (8) is aliased and used as a PROT_NONE indicator for
  * !present ptes. We need to start storing swap entries above
@@ -287,19 +287,28 @@ static inline int pgd_large(pgd_t pgd) { return 0; }
  * Bit 7 in swp entry should be 0 because pmd_present checks not only P,
  * but also L and G.
  */
-#define SWP_TYPE_FIRST_BIT (_PAGE_BIT_PROTNONE + 1)
-#define SWP_TYPE_BITS 5
-/* Place the offset above the type: */
-#define SWP_OFFSET_FIRST_BIT (SWP_TYPE_FIRST_BIT + SWP_TYPE_BITS)
+#define SWP_TYPE_BITS 5
+
+#define SWP_OFFSET_FIRST_BIT (_PAGE_BIT_PROTNONE + 1)
+
+/* We always extract/encode the offset by shifting it all the way up, and then down again */
+#define SWP_OFFSET_SHIFT (SWP_OFFSET_FIRST_BIT+SWP_TYPE_BITS)
 
 #define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS)
 
-#define __swp_type(x) (((x).val >> (SWP_TYPE_FIRST_BIT)) \
-                       & ((1U << SWP_TYPE_BITS) - 1))
-#define __swp_offset(x) ((x).val >> SWP_OFFSET_FIRST_BIT)
-#define __swp_entry(type, offset) ((swp_entry_t) { \
-                       ((type) << (SWP_TYPE_FIRST_BIT)) \
-                       | ((offset) << SWP_OFFSET_FIRST_BIT) })
+/* Extract the high bits for type */
+#define __swp_type(x) ((x).val >> (64 - SWP_TYPE_BITS))
+
+/* Shift up (to get rid of type), then down to get value */
+#define __swp_offset(x) ((x).val << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT)
+
+/*
+ * Shift the offset up "too far" by TYPE bits, then down again
+ */
+#define __swp_entry(type, offset) ((swp_entry_t) { \
+        ((unsigned long)(offset) << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) \
+        | ((unsigned long)(type) << (64-SWP_TYPE_BITS)) })
+
 #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val((pte)) })
 #define __pmd_to_swp_entry(pmd) ((swp_entry_t) { pmd_val((pmd)) })
 #define __swp_entry_to_pte(x) ((pte_t) { .pte = (x).val })