1. 17 May, 2023 2 commits
    • fs: d_path: include internal.h · df67cb4c
      Arnd Bergmann authored
      make W=1 warns about a missing prototype. The prototype is declared but
      not visible at the point where simple_dname() is defined:
      
      fs/d_path.c:317:7: error: no previous prototype for 'simple_dname' [-Werror=missing-prototypes]
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Message-Id: <20230516195444.551461-1-arnd@kernel.org>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
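
As a rough illustration of the simple_dname() warning in the commit above, the sketch below shows the general pattern behind -Wmissing-prototypes: a non-static function needs a visible declaration before its definition. The name `simple_dname_sketch` and its signature are hypothetical stand-ins, not the kernel's; in the kernel the declaration comes from the header named in the commit title.

```c
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for the declaration a shared header would provide.
 * Hypothetical name and signature, used only for illustration.
 */
char *simple_dname_sketch(char *buf, size_t buflen);

/*
 * The prototype above is visible here, so -Wmissing-prototypes stays
 * quiet. Without it, GCC reports "no previous prototype for
 * 'simple_dname_sketch'" when that warning is enabled.
 */
char *simple_dname_sketch(char *buf, size_t buflen)
{
    static const char name[] = "/sketch";

    if (buf == NULL || buflen < sizeof(name))
        return NULL;            /* buffer too small for name plus NUL */
    memcpy(buf, name, sizeof(name));
    return buf;
}
```

Compiling this file with `gcc -Wmissing-prototypes -c` should produce no warning; deleting the declaration is one way to reproduce the diagnostic.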
    • coredump: require O_WRONLY instead of O_RDWR · 88e46070
      Vladimir Sementsov-Ogievskiy authored
      The motivation for this patch has been to enable using a stricter
      apparmor profile to prevent programs from reading any coredump in the
      system.
      
      However, this became something else. The following details are based on
      Christian's and Linus' archeology into the history of the number "2" in
      the coredump handling code.
      
      To make sure we're not accidentally introducing some subtle behavioral
      change into the coredump code we set out on a voyage into the depths of
      history.git to figure out why this was O_RDWR in the first place.
      
      Coredump handling was introduced over 30 years ago in commit
      ddc733f4 ("[PATCH] Linux-0.97 (August 1, 1992)").
      The original code used O_WRONLY:
      
          open_namei("core",O_CREAT | O_WRONLY | O_TRUNC,0600,&inode,NULL)
      
      However, this changed in 1993 and starting with commit
      9cb9f18b ("[PATCH] Linux-0.99.10 (June 7, 1993)") the coredump code
      suddenly used the constant "2":
      
          open_namei("core",O_CREAT | 2 | O_TRUNC,0600,&inode,NULL)
      
      This was curious as in the same commit the kernel switched from
      constants to proper defines in other places such as KERNEL_DS and
      USER_DS and O_RDWR did already exist.
      
      So why was "2" used? It turns out that open_namei() - an early version
      of what later turned into filp_open() - didn't accept O_RDWR.
      
      A semantic quirk of the open() uapi is the definition of the O_RDONLY
      flag. It would seem natural to define:
      
          #define O_RDWR (O_RDONLY | O_WRONLY)
      
      but that isn't possible because:
      
          #define O_RDONLY 0
      
      This makes O_RDONLY effectively meaningless when passed to the kernel.
      In other words, there has never been a way - until O_PATH at least - to
      open a file without any permission; O_RDONLY was always implied on the
      uapi side while the kernel does in fact allow opening files without
      permissions.
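
A minimal userspace sketch of the consequence, assuming the Linux values where O_RDONLY is 0: the access mode is a small field that has to be masked out with O_ACCMODE and compared, because OR-ing O_RDONLY into anything changes nothing. The helper names `opens_for_read` and `opens_for_write` are hypothetical.

```c
#include <fcntl.h>

/* Nonzero if the open(2) flags request read access. */
int opens_for_read(int flags)
{
    int acc = flags & O_ACCMODE;    /* isolate the access-mode field */

    return acc == O_RDONLY || acc == O_RDWR;
}

/* Nonzero if the open(2) flags request write access. */
int opens_for_write(int flags)
{
    int acc = flags & O_ACCMODE;

    return acc == O_WRONLY || acc == O_RDWR;
}
```

With these helpers, `O_CREAT | O_WRONLY | O_TRUNC` reports write access only, while a test such as `flags & O_RDONLY` could never be true on such a system.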
      
      The trouble comes when trying to map the uapi flags onto the
      corresponding file mode flags FMODE_{READ,WRITE}. This mapping still
      happens today and still causes issues (we ran into this during the
      additions for openat2(), for example).
      
      So the special value "3" was used to indicate that the file was opened
      for special access:
      
          f->f_flags = flag = flags;
          f->f_mode = (flag+1) & O_ACCMODE;
          if (f->f_mode)
                  flag++;
      
      This allowed the file mode to be set to FMODE_READ | FMODE_WRITE mapping
      the O_{RDONLY,WRONLY,RDWR} flags into the FMODE_{READ,WRITE} flags. The
      special access then required read-write permissions and 0 was used to
      access symlinks.
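
The arithmetic in the snippet above can be checked in isolation. This is only a sketch: the `SK_` constants are hypothetical names that mirror the Linux values (O_RDONLY 0, O_WRONLY 1, O_RDWR 2, O_ACCMODE 3; read bit 1, write bit 2), and the function reproduces just the `(flag+1) & O_ACCMODE` step.

```c
/* Illustrative constants mirroring the Linux values; names are hypothetical. */
enum { SK_O_RDONLY = 0, SK_O_WRONLY = 1, SK_O_RDWR = 2, SK_O_ACCMODE = 3 };
enum { SK_FMODE_READ = 1, SK_FMODE_WRITE = 2 };

/*
 * Same arithmetic as the quoted snippet: adding 1 turns the access-mode
 * field into a read/write bitmask, and the mask drops all other flag bits.
 */
int sk_fmode_from_flags(int flags)
{
    return (flags + 1) & SK_O_ACCMODE;
}
```

Under these assumptions 0 maps to the read bit, 1 to the write bit, 2 to both, and the value 3 wraps around to a mode of 0.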
      
      But back when ddc733f4 ("[PATCH] Linux-0.97 (August 1, 1992)") added
      coredump handling, open_namei() took the FMODE_{READ,WRITE} flags as an
      argument. So the coredump handling introduced in that commit was buggy
      because O_WRONLY shouldn't have been passed. Since O_WRONLY is 1 but
      open_namei() took FMODE_{READ,WRITE}, it was passed FMODE_READ by
      accident.
      
      So 9cb9f18b ("[PATCH] Linux-0.99.10 (June 7, 1993)") was a bugfix
      for this: the 2 didn't really mean O_RDWR, it meant FMODE_WRITE, which
      was correct.
      
      The clue is that FMODE_{READ,WRITE} didn't exist yet and thus a raw "2"
      value was passed.
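
To make the mix-up concrete, the sketch below reads a raw value the way a callee expecting read/write mode bits would. The `SK_` names are hypothetical and assume read bit 1 and write bit 2, with O_WRONLY equal to 1 as stated above.

```c
/* Hypothetical names; values follow the text (O_WRONLY is 1). */
enum { SK_O_WRONLY = 1 };
enum { SK_FMODE_READ = 1, SK_FMODE_WRITE = 2 };

/* Interpret a raw value as mode bits: does it grant read access? */
int sk_mode_grants_read(int mode)
{
    return (mode & SK_FMODE_READ) != 0;
}

/* Interpret a raw value as mode bits: does it grant write access? */
int sk_mode_grants_write(int mode)
{
    return (mode & SK_FMODE_WRITE) != 0;
}
```

Passing `SK_O_WRONLY` to such a callee yields read access and no write access, while a raw 2 yields write access only, which matches the reading of the 1993 change as a bugfix.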
      
      Fast forward 5 years to around 2.2.4pre4 (February 16, 1999), when this
      code was changed to:
      
          -       dentry = open_namei(corefile,O_CREAT | 2 | O_TRUNC | O_NOFOLLOW, 0600);
          ...
          +       file = filp_open(corefile,O_CREAT | 2 | O_TRUNC | O_NOFOLLOW, 0600);
      
      At this point the raw "2" should have become O_WRONLY again as
      filp_open() didn't take FMODE_{READ,WRITE} but O_{RDONLY,WRONLY,RDWR}.
      
      Another 17 years later, the code was changed again, cementing the mistake
      and making it almost impossible to detect, when commit
      378c6520 ("fs/coredump: prevent fsuid=0 dumps into user-controlled directories")
      replaced the raw "2" with O_RDWR.
      
      And now, here we are with this patch that sent us on a quest to answer
      the big questions in life such as "Why are coredump files opened with
      O_RDWR?" and "Is it safe to just use O_WRONLY?".
      
      So with this commit we're reintroducing O_WRONLY and bringing this
      code back to its original state when it was first introduced in commit
      ddc733f4 ("[PATCH] Linux-0.97 (August 1, 1992)") over 30 years ago.
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
      Message-Id: <20230420120409.602576-1-vsementsov@yandex-team.ru>
      [brauner@kernel.org: completely rewritten commit message]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
  2. 15 May, 2023 5 commits
  3. 14 May, 2023 13 commits
  4. 13 May, 2023 17 commits
  5. 12 May, 2023 3 commits
    • x86/retbleed: Fix return thunk alignment · 9a48d604
      Borislav Petkov (AMD) authored
      SYM_FUNC_START_LOCAL_NOALIGN() adds an endbr leading to this layout
      (leaving only the last 2 bytes of the address):
      
        3bff <zen_untrain_ret>:
        3bff:       f3 0f 1e fa             endbr64
        3c03:       f6                      test   $0xcc,%bl
      
        3c04 <__x86_return_thunk>:
        3c04:       c3                      ret
        3c05:       cc                      int3
        3c06:       0f ae e8                lfence
      
      However, "the RET at __x86_return_thunk must be on a 64 byte boundary,
      for alignment within the BTB."
      
      Use SYM_START instead.
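
The misalignment in the listing can be verified with a little arithmetic. The helper below is a hypothetical sketch, not kernel code; it uses only the truncated addresses shown above, which is enough because a 64-byte boundary depends on the low six bits alone.

```c
#include <stdint.h>

/* Byte offset of an address within its 64-byte block; 0 means aligned. */
unsigned int offset_in_64b_block(uintptr_t addr)
{
    return (unsigned int)(addr & 63u);
}
```

For the RET at 0x3c04 this gives 4, the same as the length of the four-byte endbr64 in the listing, whereas 0x3c00 would give 0.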
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'for-6.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 76c7f887
      Linus Torvalds authored
      Pull more btrfs fixes from David Sterba:
      
       - fix incorrect number of bitmap entries for space cache if loading is
         interrupted by some error
      
       - fix backref walking, this breaks a mode of LOGICAL_INO_V2 ioctl that
         is used in deduplication tools
      
       - zoned mode fixes:
            - properly finish zone reserved for relocation
            - correctly calculate super block zone end on ZNS
            - properly initialize new extent buffer for redirty
      
       - make mount option clear_cache work with block-group-tree, to rebuild
         free-space-tree instead of temporarily disabling it that would lead
         to a forced read-only mount
      
       - fix alignment check for offset when printing extent item
      
      * tag 'for-6.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: make clear_cache mount option to rebuild FST without disabling it
        btrfs: zero the buffer before marking it dirty in btrfs_redirty_list_add
        btrfs: zoned: fix full zone super block reading on ZNS
        btrfs: zoned: zone finish data relocation BG with last IO
        btrfs: fix backref walking not returning all inode refs
        btrfs: fix space cache inconsistency after error loading it from disk
        btrfs: print-tree: parent bytenr must be aligned to sector size
    • Merge tag '6.4-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · fd88f147
      Linus Torvalds authored
      Pull cifs client fixes from Steve French:
      
       - fix for copy_file_range bug for very large files that are multiples
         of rsize
      
       - do not ignore "isolated transport" flag if set on share
      
       - set rasize default better
      
       - three fixes related to shutdown and freezing (fixes 4 xfstests, and
         closes deferred handles faster in some places that were missed)
      
      * tag '6.4-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: release leases for deferred close handles when freezing
        smb3: fix problem remounting a share after shutdown
        SMB3: force unmount was failing to close deferred close files
        smb3: improve parallel reads of large files
        do not reuse connection if share marked as isolated
        cifs: fix pcchunk length type in smb2_copychunk_range