1. 26 Oct, 2021 6 commits
    • x86/arch_prctl: Add controls for dynamic XSTATE components · db8268df
      Chang S. Bae authored
      Dynamically enabled XSTATE features are by default disabled for all
      processes. A process has to request permission to use such a feature.
      
      To support this, implement an architecture-specific prctl() with the options:
      
         - ARCH_GET_XCOMP_SUPP
      
           Copies the supported feature bitmap into the user space provided
           u64 storage. The pointer is handed in via arg2.
      
         - ARCH_GET_XCOMP_PERM
      
           Copies the process wide permitted feature bitmap into the user space
           provided u64 storage. The pointer is handed in via arg2.
      
         - ARCH_REQ_XCOMP_PERM
      
           Request permission for a feature set. A feature set can be mapped to a
           facility, e.g. AMX, and can require one or more XSTATE components to
           be enabled.
      
           The feature argument is the number of the highest XSTATE component
           which is required for a facility to work.
      
           The request argument is not a user-supplied bitmap because that would
           make filtering harder (think seccomp), or even impossible: to support
           32-bit tasks the argument would have to be a pointer.
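      The three options can be exercised from user space roughly as follows.
      This is a minimal sketch, not code from the commit: the wrapper names
      are hypothetical, the option values are the ones found in the Linux uapi
      header <asm/prctl.h>, and XSTATE component 18 is assumed to be the AMX
      tile data component.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Option values as found in the Linux uapi header <asm/prctl.h>; repeated
 * here in case the installed headers predate this commit. */
#ifndef ARCH_GET_XCOMP_SUPP
#define ARCH_GET_XCOMP_SUPP 0x1021
#define ARCH_GET_XCOMP_PERM 0x1022
#define ARCH_REQ_XCOMP_PERM 0x1023
#endif

/* Assumption: XSTATE component 18 is AMX tile data. */
#define XFEATURE_XTILEDATA 18

/* Hypothetical helper: issue arch_prctl() directly via syscall(2). */
static long xcomp_prctl(int option, unsigned long arg2)
{
#ifdef SYS_arch_prctl
    return syscall(SYS_arch_prctl, option, arg2);
#else
    (void)option;
    (void)arg2;
    errno = ENOSYS;          /* not an x86 build */
    return -1;
#endif
}

/* Copy the supported feature bitmap into *bitmap. Returns 0 or -1. */
int xcomp_get_supported(uint64_t *bitmap)
{
    return (int)xcomp_prctl(ARCH_GET_XCOMP_SUPP, (unsigned long)bitmap);
}

/* Copy the process-wide permitted feature bitmap into *bitmap. */
int xcomp_get_permitted(uint64_t *bitmap)
{
    return (int)xcomp_prctl(ARCH_GET_XCOMP_PERM, (unsigned long)bitmap);
}

/* Request permission; the argument is a component number, not a bitmap. */
int xcomp_request(unsigned int feature_nr)
{
    return (int)xcomp_prctl(ARCH_REQ_XCOMP_PERM, feature_nr);
}
```

      A caller would typically check xcomp_get_supported() for bit
      XFEATURE_XTILEDATA first and only then call
      xcomp_request(XFEATURE_XTILEDATA). On kernels or CPUs without this
      support the calls fail with -1 and set errno.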
      
      The permission mechanism works this way:
      
         Task asks for permission for a facility and kernel checks whether that's
         supported. If supported it does:
      
           1) Check whether permission has already been granted
      
           2) Compute the sizes of the required kernel buffer and user space
              buffer (sigframe).
      
           3) Validate that no task has a sigaltstack installed
              which is smaller than the resulting sigframe size
      
           4) Add the requested feature bit(s) to the permission bitmap of
              current->group_leader->fpu and store the sizes in the group
              leader's fpu struct as well.
      
      If that is successful, the feature is still not enabled for any of the
      tasks. The first usage of a related instruction will result in a #NM
      trap. The trap handler validates the permission bit of the task's group
      leader and, if permitted, installs a larger kernel buffer and transfers
      the permission and size info to the new fpstate container, which makes all
      the FPU functions which require per-task information aware of the extended
      feature set.
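      The request flow and the first-use check can be modeled in a few lines
      of plain C. This is a toy model under stated assumptions, not the kernel
      implementation: every name is hypothetical, the size function is made
      up, and the request takes a feature bitmask instead of a component
      number to keep the example short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_THREADS 8

/* Hypothetical stand-in for the state cached in the group leader. */
struct perm_model {
    uint64_t supported;                   /* features the system supports */
    uint64_t permitted;                   /* process-wide permission bitmap */
    size_t   user_size;                   /* sigframe size for 'permitted' */
    size_t   altstack[MODEL_MAX_THREADS]; /* 0 = no sigaltstack installed */
    size_t   nr_threads;
};

/* Made-up size function: a base size plus a fixed cost per feature bit.
 * The real sizes come from the XSAVE layout reported by the CPU. */
static size_t model_sigframe_size(uint64_t features)
{
    size_t size = 512;

    for (; features; features &= features - 1)
        size += 1024;
    return size;
}

/* Returns 0 when permission is (or already was) granted, -1 otherwise. */
int model_request_perm(struct perm_model *p, uint64_t requested)
{
    uint64_t mask;
    size_t size, i;

    if ((requested & p->supported) != requested)
        return -1;                       /* facility not supported */
    if ((p->permitted & requested) == requested)
        return 0;                        /* 1) already granted */

    mask = p->permitted | requested;
    size = model_sigframe_size(mask);    /* 2) compute the new size */

    for (i = 0; i < p->nr_threads; i++)  /* 3) check every sigaltstack */
        if (p->altstack[i] && p->altstack[i] < size)
            return -1;

    p->permitted = mask;                 /* 4) record permission and size */
    p->user_size = size;
    return 0;
}

/* What the #NM handler decides on first use of the feature. */
bool model_first_use_allowed(const struct perm_model *p, uint64_t feature)
{
    return (p->permitted & feature) == feature;
}
```

      In the model, granting permission changes only the bitmap and the
      cached size. Nothing is allocated until the first-use check passes,
      which mirrors the split described above.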
      
        [ tglx: Adapted to new base code, added missing serialization,
                massaged namings, comments and changelog ]
      Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20211021225527.10184-7-chang.seok.bae@intel.com
    • x86/fpu: Add fpu_state_config::legacy_features · c33f0a81
      Thomas Gleixner authored
      The upcoming prctl() which is required to request the permission for a
      dynamically enabled feature will also provide an option to retrieve the
      supported features. If the CPU does not support XSAVE, the supported
      features would be 0 even when the CPU supports FP and SSE.
      
      Provide separate storage for the legacy feature set to avoid that and fill
      in the bits in the legacy init function.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20211021225527.10184-6-chang.seok.bae@intel.com
    • x86/fpu: Add members to struct fpu to cache permission information · 6f6a7c09
      Thomas Gleixner authored
      Dynamically enabled features can be requested by any thread of a running
      process at any time. The request neither enables the feature nor
      allocates larger buffers. It just stores the permission to use the feature
      by adding the features to the permission bitmap and by calculating the
      required sizes for kernel and user space.
      
      The reallocation of the kernel buffer happens when the feature is used
      for the first time which is caught by an exception. The permission
      bitmap is then checked and if the feature is permitted, then it becomes
      fully enabled. If not, the task dies similarly to a task which uses an
      undefined instruction.
      
      The size information is precomputed to allow proper sigaltstack size checks
      once the feature is permitted but not yet in use, because otherwise this
      would open race windows where too-small stacks could be installed, causing
      a later failure on signal delivery.
      
      Initialize them to the default feature set and sizes.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20211021225527.10184-5-chang.seok.bae@intel.com
    • x86/fpu/xstate: Provide xstate_calculate_size() · 84e4dccc
      Chang S. Bae authored
      Split out the size calculation from the paranoia check so it can be used
      for recalculating buffer sizes when dynamically enabled features are
      supported.
      Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
      [ tglx: Adapted to changed base code ]
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20211021225527.10184-4-chang.seok.bae@intel.com
    • x86/signal: Implement sigaltstack size validation · 3aac3ebe
      Thomas Gleixner authored
      For historical reasons MINSIGSTKSZ is a constant, and it already became
      too small with AVX512 support.
      
      Add a mechanism to enforce strict checking of the sigaltstack size against
      the real size of the FPU frame.
      
      The strict check can be enabled via a config option and can also be
      controlled via the kernel command line option 'strict_sas_size' independent
      of the config switch.
      
      Enabling it might break existing applications which allocate a too-small
      sigaltstack but 'work' because they never get a signal delivered. It can
      still be handy for filtering out binaries which are not yet aware of
      AT_MINSIGSTKSZ.
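      An application that is aware of AT_MINSIGSTKSZ can size its alternate
      stack from the auxiliary vector instead of the compile-time constant.
      The following is a minimal sketch, assuming glibc's getauxval(); the
      helper name and the 2048-byte fallback are choices made for the example,
      not something this commit prescribes.

```c
#include <stddef.h>
#include <sys/auxv.h>

#ifndef AT_MINSIGSTKSZ
#define AT_MINSIGSTKSZ 51   /* value used by the Linux uapi headers */
#endif

/* Fallback for kernels that do not export AT_MINSIGSTKSZ: the historical
 * 2k constant. This is an assumption made for the sketch. */
#define LEGACY_MINSIGSTKSZ 2048UL

/* Hypothetical helper: smallest altstack size the kernel reports as safe. */
size_t min_altstack_size(void)
{
    /* getauxval() returns 0 when the entry is not present. */
    unsigned long reported = getauxval(AT_MINSIGSTKSZ);

    return reported > LEGACY_MINSIGSTKSZ ? (size_t)reported
                                         : (size_t)LEGACY_MINSIGSTKSZ;
}
```

      The returned value is only the minimum needed for the signal frame. A
      real handler needs additional room for its own stack usage on top.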
      
      Also the upcoming support for dynamically enabled FPU features requires a
      strict sanity check to ensure that:
      
         - Enabling a dynamic feature, which changes the sigframe size, is only
           possible if the resulting sigframe fits into an enabled sigaltstack.

         - Installing a too-small sigaltstack after a dynamic feature has been
           added is not possible.
      
      Implement the base check which is controlled by config and command line
      options.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20211021225527.10184-3-chang.seok.bae@intel.com
    • signal: Add an optional check for altstack size · 1bdda24c
      Thomas Gleixner authored
      New x86 FPU features will be very large, requiring ~10k of stack in
      signal handlers.  These new features require a new approach called
      "dynamic features".
      
      The kernel currently tries to ensure that altstacks are reasonably
      sized. Right now, on x86, sys_sigaltstack() requires a size of >=2k.
      However, that 2k is a constant. Simply raising that 2k requirement
      to >10k for the new features would break existing apps which have a
      compiled-in size of 2k.
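      The existing constant check is easy to observe from user space: a
      request below the minimum is rejected with ENOMEM. The helper below is a
      sketch for illustration only; its name and return convention are
      hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stdlib.h>

/* Hypothetical helper: try to install an alternate signal stack of 'size'
 * bytes. Returns 0 on success, -1 if the allocation failed, or the errno
 * value reported by sigaltstack(). */
int try_install_altstack(size_t size)
{
    stack_t ss = { .ss_sp = malloc(size), .ss_size = size, .ss_flags = 0 };
    int err;

    if (!ss.ss_sp)
        return -1;

    if (sigaltstack(&ss, NULL) == 0)
        return 0;   /* the stack stays installed, so it is not freed */

    err = errno;
    free(ss.ss_sp);
    return err;
}
```

      With this helper, try_install_altstack(1024) is expected to return
      ENOMEM because 1024 bytes is below the minimum, while a generously
      sized request such as 64k succeeds.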
      
      Instead of universally enforcing a larger stack, prohibit a process from
      using dynamic features without properly-sized altstacks. This must be
      enforced in two places:
      
       * A dynamic feature cannot be enabled without a large-enough altstack
         for each process thread.
       * Once a dynamic feature is enabled, any request to install a too-small
         altstack will be rejected.
      
      The dynamic feature enabling code must examine each thread in a
      process to ensure that the altstacks are large enough. Add a new lock
      (sigaltstack_lock()) to ensure that threads can not race and change
      their altstack after being examined.
      
      Add the infrastructure in the form of a config option and provide empty
      stubs for architectures which do not need dynamic altstack size checks.
      
      This implementation will be fleshed out for x86 in a future patch called
      
        x86/arch_prctl: Add controls for dynamic XSTATE components
      
        [dhansen: commit message. ]
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20211021225527.10184-2-chang.seok.bae@intel.com
  2. 23 Oct, 2021 4 commits
    • x86/kvm: Convert FPU handling to a single swap buffer · d69c1382
      Thomas Gleixner authored
      For the upcoming AMX support it's necessary to do a proper integration with
      KVM. Currently KVM allocates two FPU structs which are used for saving the user
      state of the vCPU thread and restoring the guest state when entering
      vcpu_run() and doing the reverse operation before leaving vcpu_run().
      
      With the new fpstate mechanism this can be reduced to one extra buffer by
      swapping the fpstate pointer in current::thread::fpu. This makes the
      upcoming support for AMX and XFD simpler because then fpstate information
      (features, sizes, xfd) is always consistent and it does not require any
      nasty workarounds.
      
      Convert the KVM FPU code over to this new scheme.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20211022185313.019454292@linutronix.de
    • x86/fpu: Provide infrastructure for KVM FPU cleanup · 69f6ed1d
      Thomas Gleixner authored
      For the upcoming AMX support it's necessary to do a proper integration with
      KVM. Currently KVM allocates two FPU structs which are used for saving the user
      state of the vCPU thread and restoring the guest state when entering
      vcpu_run() and doing the reverse operation before leaving vcpu_run().
      
      With the new fpstate mechanism this can be reduced to one extra buffer by
      swapping the fpstate pointer in current::thread::fpu. This makes the
      upcoming support for AMX and XFD simpler because then fpstate information
      (features, sizes, xfd) is always consistent and it does not require any
      nasty workarounds.
      
      Provide:
      
        - An allocator which initializes the state properly
      
        - A replacement for the existing FPU swap mechanism
      
      Aside from the reduced memory footprint, this also makes state switching
      more efficient when TIF_FPU_NEED_LOAD is set. It does not require a
      memcpy as the state is already correct in the to-be-swapped-out fpstate.
      
      The existing interfaces will be removed once KVM is converted over.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20211022185312.954684740@linutronix.de
    • x86/fpu: Prepare for sanitizing KVM FPU code · 75c52dad
      Thomas Gleixner authored
      For the upcoming AMX support it's necessary to do a proper integration with
      KVM. To avoid more nasty hackery in KVM which violates encapsulation, extend
      struct fpu and fpstate so the fpstate switching can be consolidated and
      simplified.
      
      Currently KVM allocates two FPU structs which are used for saving the user
      state of the vCPU thread and restoring the guest state when entering
      vcpu_run() and doing the reverse operation before leaving vcpu_run().
      
      With the new fpstate mechanism this can be reduced to one extra buffer by
      swapping the fpstate pointer in current::thread::fpu. This makes the
      upcoming support for AMX and XFD simpler because then fpstate information
      (features, sizes, xfd) is always consistent and it does not require any
      nasty workarounds.
      
      Add fpu::__task_fpstate to save the regular fpstate pointer while the task
      is inside vcpu_run(). Add some state fields to fpstate to indicate the
      nature of the state.
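      The pointer swap can be pictured with a small model. This is only a
      sketch of the idea: the struct and function names are hypothetical
      simplifications, and the real code also has to save and restore register
      state around the swap.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct fpstate_model {
    bool is_guest;                      /* nature of the state */
};

struct fpu_model {
    struct fpstate_model *fpstate;      /* currently active state */
    struct fpstate_model *task_fpstate; /* saved while inside vcpu_run() */
};

/* Swap the active fpstate pointer when entering or leaving the guest. */
void model_swap_fpstate(struct fpu_model *fpu, struct fpstate_model *guest,
                        bool enter_guest)
{
    if (enter_guest) {
        fpu->task_fpstate = fpu->fpstate;  /* remember the task's state */
        fpu->fpstate = guest;              /* guest state becomes active */
    } else {
        fpu->fpstate = fpu->task_fpstate;  /* task state active again */
        fpu->task_fpstate = NULL;
    }
}
```

      Because only a pointer changes hands, a single extra buffer (the guest
      fpstate) is enough, and features and sizes travel with the fpstate that
      is currently active.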
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20211022185312.896403942@linutronix.de
  3. 22 Oct, 2021 3 commits
  4. 21 Oct, 2021 15 commits
  5. 20 Oct, 2021 12 commits