1. 09 Nov, 2023 3 commits
  2. 08 Nov, 2023 3 commits
    • Palmer Dabbelt's avatar
      Merge patch series "riscv: Fix set_memory_XX() and set_direct_map_XX()" · 05942f78
      Palmer Dabbelt authored
      Alexandre Ghiti <alexghiti@rivosinc.com> says:
      
      Those 2 patches fix the set_memory_XX() and set_direct_map_XX() APIs, which
      in turn fix STRICT_KERNEL_RWX and memfd_secret(). Those were broken since the
      permission changes were not applied to the linear mapping because the linear
      mapping is mapped using hugepages and walk_page_range_novma() does not split
      such mappings.
      
      To fix that, patch 1 disables PGD mappings in the linear mapping as it is
      hard to propagate changes at this level in *all* the page tables, this has the
      downside of disabling PMD mapping for sv32 and PUD (1GB) mapping for sv39 in
      the linear mapping (for specific kernels, we could add a Kconfig to enable
      ARCH_HAS_SET_DIRECT_MAP and STRICT_KERNEL_RWX if needed, I'm pretty sure we'll
      discuss that).
      
      patch 2 implements the split of the huge linear mappings so that
      walk_page_range_novma() can properly apply the permissions. The whole split is
      protected with mmap_sem in write mode, but I'm wondering if that's enough,
      any opinion on that is appreciated.
      
      * b4-shazam-merge:
        riscv: Fix set_memory_XX() and set_direct_map_XX() by splitting huge linear mappings
        riscv: Don't use PGD entries for the linear mapping
      
      Link: https://lore.kernel.org/r/20231108075930.7157-1-alexghiti@rivosinc.comSigned-off-by: default avatarPalmer Dabbelt <palmer@rivosinc.com>
      05942f78
    • Alexandre Ghiti's avatar
      riscv: Fix set_memory_XX() and set_direct_map_XX() by splitting huge linear mappings · 311cd2f6
      Alexandre Ghiti authored
      When STRICT_KERNEL_RWX is set, any change of permissions on any kernel
      mapping (vmalloc/modules/kernel text...etc) should be applied on its
      linear mapping alias. The problem is that the riscv kernel uses huge
      mappings for the linear mapping and walk_page_range_novma() does not
      split those huge mappings.
      
      So this patchset implements such split in order to apply fine-grained
      permissions on the linear mapping.
      
      Below is the difference before and after (the first PUD mapping is split
      into PTE/PMD mappings):
      
      Before:
      
      ---[ Linear mapping ]---
      0xffffaf8000080000-0xffffaf8000200000    0x0000000080080000      1536K PTE     D A G . . W R V
      0xffffaf8000200000-0xffffaf8077c00000    0x0000000080200000      1914M PMD     D A G . . W R V
      0xffffaf8077c00000-0xffffaf8078800000    0x00000000f7c00000        12M PMD     D A G . . . R V
      0xffffaf8078800000-0xffffaf8078c00000    0x00000000f8800000         4M PMD     D A G . . W R V
      0xffffaf8078c00000-0xffffaf8079200000    0x00000000f8c00000         6M PMD     D A G . . . R V
      0xffffaf8079200000-0xffffaf807e600000    0x00000000f9200000        84M PMD     D A G . . W R V
      0xffffaf807e600000-0xffffaf807e716000    0x00000000fe600000      1112K PTE     D A G . . W R V
      0xffffaf807e717000-0xffffaf807e71a000    0x00000000fe717000        12K PTE     D A G . . W R V
      0xffffaf807e71d000-0xffffaf807e71e000    0x00000000fe71d000         4K PTE     D A G . . W R V
      0xffffaf807e722000-0xffffaf807e800000    0x00000000fe722000       888K PTE     D A G . . W R V
      0xffffaf807e800000-0xffffaf807fe00000    0x00000000fe800000        22M PMD     D A G . . W R V
      0xffffaf807fe00000-0xffffaf807ff54000    0x00000000ffe00000      1360K PTE     D A G . . W R V
      0xffffaf807ff55000-0xffffaf8080000000    0x00000000fff55000       684K PTE     D A G . . W R V
      0xffffaf8080000000-0xffffaf8400000000    0x0000000100000000        14G PUD     D A G . . W R V
      
      After:
      
      ---[ Linear mapping ]---
      0xffffaf8000080000-0xffffaf8000200000    0x0000000080080000      1536K PTE     D A G . . W R V
      0xffffaf8000200000-0xffffaf8077c00000    0x0000000080200000      1914M PMD     D A G . . W R V
      0xffffaf8077c00000-0xffffaf8078800000    0x00000000f7c00000        12M PMD     D A G . . . R V
      0xffffaf8078800000-0xffffaf8078a00000    0x00000000f8800000         2M PMD     D A G . . W R V
      0xffffaf8078a00000-0xffffaf8078c00000    0x00000000f8a00000         2M PTE     D A G . . W R V
      0xffffaf8078c00000-0xffffaf8079200000    0x00000000f8c00000         6M PMD     D A G . . . R V
      0xffffaf8079200000-0xffffaf807e600000    0x00000000f9200000        84M PMD     D A G . . W R V
      0xffffaf807e600000-0xffffaf807e716000    0x00000000fe600000      1112K PTE     D A G . . W R V
      0xffffaf807e717000-0xffffaf807e71a000    0x00000000fe717000        12K PTE     D A G . . W R V
      0xffffaf807e71d000-0xffffaf807e71e000    0x00000000fe71d000         4K PTE     D A G . . W R V
      0xffffaf807e722000-0xffffaf807e800000    0x00000000fe722000       888K PTE     D A G . . W R V
      0xffffaf807e800000-0xffffaf807fe00000    0x00000000fe800000        22M PMD     D A G . . W R V
      0xffffaf807fe00000-0xffffaf807ff54000    0x00000000ffe00000      1360K PTE     D A G . . W R V
      0xffffaf807ff55000-0xffffaf8080000000    0x00000000fff55000       684K PTE     D A G . . W R V
      0xffffaf8080000000-0xffffaf8080800000    0x0000000100000000         8M PMD     D A G . . W R V
      0xffffaf8080800000-0xffffaf8080af6000    0x0000000100800000      3032K PTE     D A G . . W R V
      0xffffaf8080af6000-0xffffaf8080af8000    0x0000000100af6000         8K PTE     D A G . X . R V
      0xffffaf8080af8000-0xffffaf8080c00000    0x0000000100af8000      1056K PTE     D A G . . W R V
      0xffffaf8080c00000-0xffffaf8081a00000    0x0000000100c00000        14M PMD     D A G . . W R V
      0xffffaf8081a00000-0xffffaf8081a40000    0x0000000101a00000       256K PTE     D A G . . W R V
      0xffffaf8081a40000-0xffffaf8081a44000    0x0000000101a40000        16K PTE     D A G . X . R V
      0xffffaf8081a44000-0xffffaf8081a52000    0x0000000101a44000        56K PTE     D A G . . W R V
      0xffffaf8081a52000-0xffffaf8081a54000    0x0000000101a52000         8K PTE     D A G . X . R V
      ...
      0xffffaf809e800000-0xffffaf80c0000000    0x000000011e800000       536M PMD     D A G . . W R V
      0xffffaf80c0000000-0xffffaf8400000000    0x0000000140000000        13G PUD     D A G . . W R V
      
      Note that this also fixes memfd_secret() syscall which uses
      set_direct_map_invalid_noflush() and set_direct_map_default_noflush() to
      remove the pages from the linear mapping. Below is the kernel page table
      while a memfd_secret() syscall is running, you can see all the !valid
      page table entries in the linear mapping:
      
      ...
      0xffffaf8082240000-0xffffaf8082241000    0x0000000102240000         4K PTE     D A G . . W R .
      0xffffaf8082241000-0xffffaf8082250000    0x0000000102241000        60K PTE     D A G . . W R V
      0xffffaf8082250000-0xffffaf8082252000    0x0000000102250000         8K PTE     D A G . . W R .
      0xffffaf8082252000-0xffffaf8082256000    0x0000000102252000        16K PTE     D A G . . W R V
      0xffffaf8082256000-0xffffaf8082257000    0x0000000102256000         4K PTE     D A G . . W R .
      0xffffaf8082257000-0xffffaf8082258000    0x0000000102257000         4K PTE     D A G . . W R V
      0xffffaf8082258000-0xffffaf8082259000    0x0000000102258000         4K PTE     D A G . . W R .
      0xffffaf8082259000-0xffffaf808225a000    0x0000000102259000         4K PTE     D A G . . W R V
      0xffffaf808225a000-0xffffaf808225c000    0x000000010225a000         8K PTE     D A G . . W R .
      0xffffaf808225c000-0xffffaf8082266000    0x000000010225c000        40K PTE     D A G . . W R V
      0xffffaf8082266000-0xffffaf8082268000    0x0000000102266000         8K PTE     D A G . . W R .
      0xffffaf8082268000-0xffffaf8082284000    0x0000000102268000       112K PTE     D A G . . W R V
      0xffffaf8082284000-0xffffaf8082288000    0x0000000102284000        16K PTE     D A G . . W R .
      0xffffaf8082288000-0xffffaf808229c000    0x0000000102288000        80K PTE     D A G . . W R V
      0xffffaf808229c000-0xffffaf80822a0000    0x000000010229c000        16K PTE     D A G . . W R .
      0xffffaf80822a0000-0xffffaf80822a5000    0x00000001022a0000        20K PTE     D A G . . W R V
      0xffffaf80822a5000-0xffffaf80822a6000    0x00000001022a5000         4K PTE     D A G . . . R V
      0xffffaf80822a6000-0xffffaf80822ab000    0x00000001022a6000        20K PTE     D A G . . W R V
      ...
      
      And when the memfd_secret() fd is released, the linear mapping is
      correctly reset:
      
      ...
      0xffffaf8082240000-0xffffaf80822a5000    0x0000000102240000       404K PTE     D A G . . W R V
      0xffffaf80822a5000-0xffffaf80822a6000    0x00000001022a5000         4K PTE     D A G . . . R V
      0xffffaf80822a6000-0xffffaf80822af000    0x00000001022a6000        36K PTE     D A G . . W R V
      ...
      Signed-off-by: default avatarAlexandre Ghiti <alexghiti@rivosinc.com>
      Link: https://lore.kernel.org/r/20231108075930.7157-3-alexghiti@rivosinc.comSigned-off-by: default avatarPalmer Dabbelt <palmer@rivosinc.com>
      311cd2f6
    • Alexandre Ghiti's avatar
      riscv: Don't use PGD entries for the linear mapping · 629db01c
      Alexandre Ghiti authored
      Propagating changes at this level is cumbersome as we need to go through
      all the page tables when that happens (either when changing the
      permissions or when splitting the mapping).
      
      Note that this prevents the use of 4MB mapping for sv32 and 1GB mapping for
      sv39 in the linear mapping.
      Signed-off-by: default avatarAlexandre Ghiti <alexghiti@rivosinc.com>
      Link: https://lore.kernel.org/r/20231108075930.7157-2-alexghiti@rivosinc.comSigned-off-by: default avatarPalmer Dabbelt <palmer@rivosinc.com>
      629db01c
  3. 07 Nov, 2023 15 commits
  4. 06 Nov, 2023 9 commits
  5. 05 Nov, 2023 10 commits