1. 13 Apr, 2019 2 commits
    • Kirill Smelkov's avatar
      vfs: use &file->f_pos directly on files that have position · 0192236d
      Kirill Smelkov authored
      Long ago vfs read/write operations were passing ppos=&file->f_pos directly to
      .read / .write file_operations methods. That changed in 2004 in 55f09ec0
      ("read/write: pass down a copy of f_pos, not f_pos itself.") which started to
      pass ppos=&local_var trying to avoid simultaneous read/write/lseek stepping
      onto each other toes and overwriting file->f_pos racily. That measure was not
      complete and in 2014 commit 9c225f26 ("vfs: atomic f_pos accesses as per
      POSIX") added file->f_pos_lock to completely disable simultaneous
      read/write/lseek runs. After f_pos_lock was introduced the reason to avoid
      passing ppos=&file->f_pos directly due to concurrency vanished.
      
      Linus explains[1]:
      
      	In fact, we *used* to (long ago) pass in the address of "file->f_pos"
      	itself to the low-level read/write routines. We then changed it to do
      	that indirection through a local copy of pos (and
      	file_pos_read/file_pos_write) because we didn't do the proper locking,
      	so different read/write versions could mess with each other (and with
      	lseek).
      
      	But one of the things that commit 9c225f26 ("vfs: atomic f_pos
      	accesses as per POSIX") did was to add the proper locking at least for
      	the cases that we care about deeply, so we *could* say that we have
      	three cases:
      
      	 - FMODE_ATOMIC_POS: properly locked,
      	 - FMODE_STREAM: no pos at all
      	 - otherwise a "mostly don't care - don't mix!"
      
      	and so we could go back to not copying the pos at all, and instead do
      	something like
      
      	        loff_t *ppos = f.file->f_mode & FMODE_STREAM ? NULL : &file->f_pos;
      	        ret = vfs_write(f.file, buf, count, ppos);
      
      	and perhaps have a long-term plan to try to get rid of the "don't mix"
      	case entirely (ie "if you use f_pos, then we'll do the proper
      	locking")
      
      	(The above is obviously surrounded by the fdget_pos()/fdput_pos() that
      	implements the locking decision).
      
      Currently for regular files we always set FMODE_ATOMIC_POS and change that to
      FMODE_STREAM if stream_open is used explicitly on open. That leaves other
      files, like e.g. sockets and pipes, for "mostly don't care - don't mix!" case.
      
      Sockets, for example, always check that on read/write the initial pos they
      receive is 0 and don't update it. And if it is !0 they return -ESPIPE.
      That suggests that we can do the switch into passing &file->f_pos directly now
      and incrementally convert to FMODE_STREAM files that were doing the stream-like
      checking manually in their low-level .read/.write handlers.
      
      Note: it is theoretically possible that a driver updates *ppos inside
      even if read/write returns error. For such cases the conversion will
      change IO semantic a bit. The semantic that is changing here was
      introduced in 2013 in commit 5faf153e "don't call file_pos_write()
      if vfs_{read,write}{,v}() fails".
      
      [1] https://lore.kernel.org/linux-fsdevel/CAHk-=whJtZt52SnhBGrNMnuxFn3GE9X_e02x8BPxtkqrfyZukw@mail.gmail.com/Suggested-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Kirill Smelkov's avatarKirill Smelkov <kirr@nexedi.com>
      0192236d
    • Kirill Smelkov's avatar
      vfs: pass ppos=NULL to .read()/.write() of FMODE_STREAM files · 3df9bc6c
      Kirill Smelkov authored
      This amends commit 10dce8af ("fs: stream_open - opener for
      stream-like files so that read and write can run simultaneously without
      deadlock") in how position is passed into .read()/.write() handler for
      stream-like files:
      
      Rasmus noticed that we currently pass 0 as position and ignore any position
      change if that is done by a file implementation. This papers over bugs if ppos
      is used in files that declare themselves as being stream-like as such bugs will
      go unnoticed. Even if a file implementation is correctly converted into using
      stream_open, its read/write later could be changed to use ppos and even though
      that won't be working correctly, that bug might go unnoticed without someone
      doing wrong behaviour analysis. It is thus better to pass ppos=NULL into
      read/write for stream-like files as that don't give any chance for ppos usage
      bugs because it will oops if ppos is ever used inside .read() or .write().
      
      Note 1: rw_verify_area, new_sync_{read,write} needs to be updated
      because they are called by vfs_read/vfs_write & friends before
      file_operations .read/.write .
      
      Note 2: if file backend uses new-style .read_iter/.write_iter, position
      is still passed into there as non-pointer kiocb.ki_pos . Currently
      stream_open.cocci (semantic patch added by 10dce8af) ignores files
      whose file_operations has *_iter methods.
      Suggested-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: Kirill Smelkov's avatarKirill Smelkov <kirr@nexedi.com>
      3df9bc6c
  2. 12 Apr, 2019 1 commit
    • Kirill Smelkov's avatar
      vfs: document that read/write & friends are called with file.f_pos_lock held · 8ea60779
      Kirill Smelkov authored
      Commit 9c225f26 ("vfs: atomic f_pos accesses as per POSIX") added
      locking on read/write/lseek for regular files so that the file position
      is updated atomically.
      
      Commit 10dce8af ("fs: stream_open - opener for stream-like files so
      that read and write can run simultaneously without deadlock") removed
      that locking if a file is stream-like, so that e.g. read don't deadlock
      simultaneous write.
      
      Document that file.f_pos_lock is held on read/write & co for regular
      files.
      
      In the documentation we can ignore the fact that f_pos_lock is not taken
      if there is only one single thread using the file, because it is just an
      optimization and does not affect semantic, by which multiple read /
      write cannot run simultaneously.
      Signed-off-by: Kirill Smelkov's avatarKirill Smelkov <kirr@nexedi.com>
      8ea60779
  3. 11 Apr, 2019 1 commit
  4. 10 Apr, 2019 7 commits
  5. 09 Apr, 2019 3 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 869e3305
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Off by one and bounds checking fixes in NFC, from Dan Carpenter.
      
       2) There have been many weird regressions in r8169 since we turned ASPM
          support on, some are still not understood nor completely resolved.
          Let's turn this back off for now. From Heiner Kallweit.
      
       3) Signess fixes for ethtool speed value handling, from Michael
          Zhivich.
      
       4) Handle timestamps properly in macb driver, from Paul Thomas.
      
       5) Two erspan fixes, it's the usual "skb ->data potentially reallocated
          and we're holding a stale protocol header pointer". From Lorenzo
          Bianconi.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        bnxt_en: Reset device on RX buffer errors.
        bnxt_en: Improve RX consumer index validity check.
        net: macb driver, check for SKBTX_HW_TSTAMP
        qlogic: qlcnic: fix use of SPEED_UNKNOWN ethtool constant
        broadcom: tg3: fix use of SPEED_UNKNOWN ethtool constant
        ethtool: avoid signed-unsigned comparison in ethtool_validate_speed()
        net: ip6_gre: fix possible use-after-free in ip6erspan_rcv
        net: ip_gre: fix possible use-after-free in erspan_rcv
        r8169: disable ASPM again
        MAINTAINERS: ieee802154: update documentation file pattern
        net: vrf: Fix ping failed when vrf mtu is set to 0
        selftests: add a tc matchall test case
        nfc: nci: Potential off by one in ->pipes[] array
        NFC: nci: Add some bounds checking in nci_hci_cmd_received()
      869e3305
    • Linus Torvalds's avatar
      Merge branch 'fixes-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · a556810d
      Linus Torvalds authored
      Pull TPM fixes from James Morris:
       "From Jarkko: These are critical fixes for v5.1. Contains also couple
        of new selftests for v5.1 features (partial reads in /dev/tpm0)"
      
      * 'fixes-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        selftests/tpm2: Open tpm dev in unbuffered mode
        selftests/tpm2: Extend tests to cover partial reads
        KEYS: trusted: fix -Wvarags warning
        tpm: Fix the type of the return value in calc_tpm2_event_size()
        KEYS: trusted: allow trusted.ko to initialize w/o a TPM
        tpm: fix an invalid condition in tpm_common_poll
        tpm: turn on TPM on suspend for TPM 1.x
      a556810d
    • Linus Torvalds's avatar
      Merge tag 'xtensa-20190408' of git://github.com/jcmvbkbc/linux-xtensa · 10d43397
      Linus Torvalds authored
      Pull xtensa fixes from Max Filippov:
      
       - fix syscall number passed to trace_sys_exit
      
       - fix syscall number initialization in start_thread
      
       - fix level interpretation in the return_address
      
       - fix format string warning in init_pmd
      
      * tag 'xtensa-20190408' of git://github.com/jcmvbkbc/linux-xtensa:
        xtensa: fix format string warning in init_pmd
        xtensa: fix return_address
        xtensa: fix initialization of pt_regs::syscall in start_thread
        xtensa: use actual syscall number in do_syscall_trace_leave
      10d43397
  6. 08 Apr, 2019 26 commits