1. 11 Jul, 2013 1 commit
    • David Jeffery's avatar
      lockd: protect nlm_blocked access in nlmsvc_retry_blocked · 1c327d96
      David Jeffery authored
      In nlmsvc_retry_blocked, the check that the list is non-empty and acquiring
      the pointer of the first entry is unprotected by any lock.  This allows a rare
      race condition when there is only one entry on the list.  A function such as
      nlmsvc_grant_callback() can be called, which will temporarily remove the entry
      from the list.  Between the list_empty() and list_entry(),the list may become
      empty, causing an invalid pointer to be used as an nlm_block, leading to a
      possible crash.
      
      This patch adds the nlm_block_lock around these calls to prevent concurrent
      use of the nlm_blocked list.
      
      This was a regression introduced by
      f904be9c  "lockd: Mostly remove BKL from
      the server".
      
      Cc: Bryan Schumaker <bjschuma@netapp.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDavid Jeffery <djeffery@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      1c327d96
  2. 08 Jul, 2013 3 commits
    • J. Bruce Fields's avatar
      nfsd4: support minorversion 1 by default · d1091481
      J. Bruce Fields authored
      We now have minimal minorversion 1 support; turn it on by default.
      
      This can still be turned off with "echo -4.1 >/proc/fs/nfsd/versions".
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      d1091481
    • J. Bruce Fields's avatar
      nfsd4: allow destroy_session over destroyed session · f0f51f5c
      J. Bruce Fields authored
      RFC 5661 allows a client to destroy a session using a compound
      associated with the destroyed session, as long as the DESTROY_SESSION op
      is the last op of the compound.
      
      We attempt to allow this, but testing against a Solaris client (which
      does destroy sessions in this way) showed that we were failing the
      DESTROY_SESSION with NFS4ERR_DELAY, because we assumed the reference
      count on the session (held by us) represented another rpc in progress
      over this session.
      
      Fix this by noting that in this case the expected reference count is 1,
      not 0.
      
      Also, note as long as the session holds a reference to the compound
      we're destroying, we can't free it here--instead, delay the free till
      the final put in nfs4svc_encode_compoundres.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      f0f51f5c
    • J. Bruce Fields's avatar
      svcrpc: fix failures to handle -1 uid's · 0979292b
      J. Bruce Fields authored
      As of f025adf1 "sunrpc: Properly decode
      kuids and kgids in RPC_AUTH_UNIX credentials" any rpc containing a -1
      (0xffff) uid or gid would fail with a badcred error.
      
      Commit afe3c3fd "svcrpc: fix failures to
      handle -1 uid's and gid's" fixed part of the problem, but overlooked the
      gid upcall--the kernel can request supplementary gid's for the -1 uid,
      but mountd's attempt write a response will get -EINVAL.
      
      Symptoms were nfsd failing to reply to the first attempt to use a newly
      negotiated krb5 context.
      Reported-by: default avatarSven Geggus <lists@fuchsschwanzdomain.de>
      Tested-by: default avatarSven Geggus <lists@fuchsschwanzdomain.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      0979292b
  3. 01 Jul, 2013 20 commits
    • NeilBrown's avatar
      sunrpc: Don't schedule an upcall on a replaced cache entry. · 0bebc633
      NeilBrown authored
      When a cache entry is replaced, the "expiry_time" get set to
      zero by a call to "cache_fresh_locked(..., 0)" at the end of
      "sunrpc_cache_update".
      
      This low expiry time makes cache_check() think that the 'refresh_age'
      is negative, so the 'age' is comparatively large and a refresh is
      triggered.
      However refreshing a replaced entry it pointless, it cannot achieve
      anything useful.
      
      So teach cache_check to ignore a low refresh_age when expiry_time
      is zero.
      Reported-by: default avatarBodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      0bebc633
    • NeilBrown's avatar
      net/sunrpc: xpt_auth_cache should be ignored when expired. · 7715cde8
      NeilBrown authored
      commit d202cce8
          sunrpc: never return expired entries in sunrpc_cache_lookup
      
      moved the 'entry is expired' test from cache_check to
      sunrpc_cache_lookup, so that it happened early and some races could
      safely be ignored.
      
      However the ip_map (in svcauth_unix.c) has a separate single-item
      cache which allows quick lookup without locking.  An entry in this
      case would not be subject to the expiry test and so could be used
      well after it has expired.
      
      This is not normally a big problem because the first time it is used
      after it is expired an up-call will be scheduled to refresh the entry
      (if it hasn't been scheduled already) and the old entry will then
      be invalidated.  So on the second attempt to use it after it has
      expired, ip_map_cached_get will discard it.
      
      However that is subtle and not ideal, so replace the "!cache_valid"
      test with "cache_is_expired".
      In doing this we drop the test on the "CACHE_VALID" bit.  This is
      unnecessary as the bit is never cleared, and an entry will only
      be cached if the bit is set.
      Reported-by: default avatarBodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      7715cde8
    • NeilBrown's avatar
      sunrpc/cache: ensure items removed from cache do not have pending upcalls. · 013920eb
      NeilBrown authored
      It is possible for a race to set CACHE_PENDING after cache_clean()
      has removed a cache entry from the cache.
      If CACHE_PENDING is still set when the entry is finally 'put',
      the cache_dequeue() will never happen and we can leak memory.
      
      So set a new flag 'CACHE_CLEANED' when we remove something from
      the cache, and don't queue any upcall if it is set.
      
      If CACHE_PENDING is set before CACHE_CLEANED, the call that
      cache_clean() makes to cache_fresh_unlocked() will free memory
      as needed.  If CACHE_PENDING is set after CACHE_CLEANED, the
      test in sunrpc_cache_pipe_upcall will ensure that the memory
      is not allocated.
      
      Reported-by: <bstroesser@ts.fujitsu.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      013920eb
    • NeilBrown's avatar
      sunrpc/cache: use cache_fresh_unlocked consistently and correctly. · 2a1c7f53
      NeilBrown authored
      cache_fresh_unlocked() is called when a cache entry
      has been updated and ensures that if there were any
      pending upcalls, they are cleared.
      
      So every time we update a cache entry, we should call this,
      and this should be the only way that we try to clear
      pending calls (that sort of uniformity makes code sooo much
      easier to read).
      
      try_to_negate_entry() will (possibly) mark an entry as
      negative.  If it doesn't, it is because the entry already
      is VALID.
      So the entry will be valid on exit, so it is appropriate to
      call cache_fresh_unlocked().
      So tidy up try_to_negate_entry() to do that, and remove
      partial open-coded cache_fresh_unlocked() from the one
      call-site of try_to_negate_entry().
      
      In the other branch of the 'switch(cache_make_upcall())',
      we again have a partial open-coded version of cache_fresh_unlocked().
      Replace that with a real call.
      
      And again in cache_clean(), use a real call to cache_fresh_unlocked().
      
      These call sites might previously have called
      cache_revisit_request() if CACHE_PENDING wasn't set.
      This is never necessary because cache_revisit_request() can
      only do anything if the item is in the cache_defer_hash,
      However any time that an item is added to the cache_defer_hash
      (setup_deferral), the code immediately tests CACHE_PENDING,
      and removes the entry again if it is clear.  So all other
      places we only need to 'cache_revisit_request' if we've
      just cleared CACHE_PENDING.
      Reported-by: default avatarBodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: default avatarNeilBrown  <neilb@suse.de>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      2a1c7f53
    • NeilBrown's avatar
      sunrpc/cache: remove races with queuing an upcall. · f9e1aedc
      NeilBrown authored
      We currently queue an upcall after setting CACHE_PENDING,
      and dequeue after clearing CACHE_PENDING.
      So a request should only be present when CACHE_PENDING is set.
      
      However we don't combine the test and the enqueue/dequeue in
      a protected region, so it is possible (if unlikely) for a race
      to result in a request being queued without CACHE_PENDING set,
      or a request to be absent despite CACHE_PENDING.
      
      So: include a test for CACHE_PENDING inside the regions of
      enqueue and dequeue where queue_lock is held, and abort
      the operation if the value is not as expected.
      
      Also remove the early 'return' from cache_dequeue() to ensure that it
      always removes all entries: As there is no locking between setting
      CACHE_PENDING and calling sunrpc_cache_pipe_upcall it is not
      inconceivable for some other thread to clear CACHE_PENDING and then
      someone else to set it and call sunrpc_cache_pipe_upcall, both before
      the original threads completed the call.
      
      With this, it perfectly safe and correct to:
       - call cache_dequeue() if and only if we have just
         cleared CACHE_PENDING
       - call sunrpc_cache_pipe_upcall() (via cache_make_upcall)
         if and only if we have just set CACHE_PENDING.
      Reported-by: default avatarBodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarBodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      f9e1aedc
    • J. Bruce Fields's avatar
      nfsd4: return delegation immediately if lease fails · d08d32e6
      J. Bruce Fields authored
      This case shouldn't happen--the administrator shouldn't really allow
      other applications access to the export until clients have had the
      chance to reclaim their state--but if it does then we should set the
      "return this lease immediately" bit on the reply.  That still leaves
      some small races, but it's the best the protocol allows us to do in the
      case a lease is ripped out from under us....
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      d08d32e6
    • J. Bruce Fields's avatar
      nfsd4: do not throw away 4.1 lock state on last unlock · 0a262ffb
      J. Bruce Fields authored
      This reverts commit eb2099f3 "nfsd4:
      release lockowners on last unlock in 4.1 case".  Trond identified
      language in rfc 5661 section 8.2.4 which forbids this behavior:
      
      	Stateids associated with byte-range locks are an exception.
      	They remain valid even if a LOCKU frees all remaining locks, so
      	long as the open file with which they are associated remains
      	open, unless the client frees the stateids via the FREE_STATEID
      	operation.
      
      And bakeathon 2013 testing found a 4.1 freebsd client was getting an
      incorrect BAD_STATEID return from a FREE_STATEID in the above situation
      and then failing.
      
      The spec language honestly was probably a mistake but at this point with
      implementations already following it we're probably stuck with that.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      0a262ffb
    • J. Bruce Fields's avatar
      nfsd4: delegation-based open reclaims should bypass permissions · 89f6c336
      J. Bruce Fields authored
      We saw a v4.0 client's create fail as follows:
      
      	- open create succeeds and gets a read delegation
      	- client attempts to set mode on new file, gets DELAY while
      	  server recalls delegation.
      	- client attempts a CLAIM_DELEGATE_CUR open using the
      	  delegation, gets error because of new file mode.
      
      This probably can't happen on a recent kernel since we're no longer
      giving out delegations on create opens.  Nevertheless, it's a
      bug--reclaim opens should bypass permission checks.
      Reported-by: default avatarSteve Dickson <steved@redhat.com>
      Reported-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      89f6c336
    • J. Bruce Fields's avatar
      svcrpc: don't error out on small tcp fragment · 1f691b07
      J. Bruce Fields authored
      Though clients we care about mostly don't do this, it is possible for
      rpc requests to be sent in multiple fragments.  Here we have a sanity
      check to ensure that the final received rpc isn't too small--except that
      the number we're actually checking is the length of just the final
      fragment, not of the whole rpc.  So a perfectly legal rpc that's
      unluckily fragmented could cause the server to close the connection
      here.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      1f691b07
    • J. Bruce Fields's avatar
      svcrpc: fix handling of too-short rpc's · cf3aa02c
      J. Bruce Fields authored
      If we detect that an rpc is too short, we abort and close the
      connection.  Except, there's a bug here: we're leaving sk_datalen
      nonzero without leaving any pages in the sk_pages array.  The most
      likely result of the inconsistency is a subsequent crash in
      svc_tcp_clear_pages.
      
      Also demote the BUG_ON in svc_tcp_clear_pages to a WARN.
      
      Cc: stable@kernel.org
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      cf3aa02c
    • J. Bruce Fields's avatar
      nfsd4: minor read_buf cleanup · 590b7431
      J. Bruce Fields authored
      The code to step to the next page seems reasonably self-contained.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      590b7431
    • J. Bruce Fields's avatar
      nfsd4: fix decoding of compounds across page boundaries · 24750082
      J. Bruce Fields authored
      A freebsd NFSv4.0 client was getting rare IO errors expanding a tarball.
      A network trace showed the server returning BAD_XDR on the final getattr
      of a getattr+write+getattr compound.  The final getattr started on a
      page boundary.
      
      I believe the Linux client ignores errors on the post-write getattr, and
      that that's why we haven't seen this before.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarRick Macklem <rmacklem@uoguelph.ca>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      24750082
    • J. Bruce Fields's avatar
      nfsd4: clean up nfs4_open_delegation · 99c41515
      J. Bruce Fields authored
      The nfs4_open_delegation logic is unecessarily baroque.
      
      Also stop pretending we support write delegations in several places.
      
      Some day we will support write delegations, but when that happens adding
      back in these flag parameters will be the easy part.  For now they're
      just confusing.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      99c41515
    • Steve Dickson's avatar
      NFSD: Don't give out read delegations on creates · 9a0590ae
      Steve Dickson authored
      When an exclusive create is done with the mode bits
      set (aka open(testfile, O_CREAT | O_EXCL, 0777)) this
      causes a OPEN op followed by a SETATTR op. When a
      read delegation is given in the OPEN, it causes
      the SETATTR to delay with EAGAIN until the
      delegation is recalled.
      
      This patch caused exclusive creates to give out
      a write delegation (which turn into no delegation)
      which allows the SETATTR seamlessly succeed.
      Signed-off-by: default avatarSteve Dickson <steved@redhat.com>
      [bfields: do this for any CREATE, not just exclusive; comment]
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      9a0590ae
    • J. Bruce Fields's avatar
      nfsd4: allow client to send no cb_sec flavors · 57569a70
      J. Bruce Fields authored
      In testing I notice that some of the pynfs tests forget to send any
      cb_sec flavors, and that we haven't necessarily errored out in that case
      before.
      
      I'll fix pynfs, but am also inclined to default to trying AUTH_NONE in
      that case in case this is something clients actually do.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      57569a70
    • J. Bruce Fields's avatar
      nfsd4: fail attempts to request gss on the backchannel · b78724b7
      J. Bruce Fields authored
      We don't support gss on the backchannel.  We should state that fact up
      front rather than just letting things continue and later making the
      client try to figure out why the backchannel isn't working.
      
      Trond suggested instead returning NFS4ERR_NOENT.  I think it would be
      tricky for the client to distinguish between the case "I don't support
      gss on the backchannel" and "I can't find that in my cache, please
      create another context and try that instead", and I'd prefer something
      that currently doesn't have any other meaning for this operation, hence
      the (somewhat arbitrary) NFS4ERR_ENCR_ALG_UNSUPP.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      b78724b7
    • J. Bruce Fields's avatar
      nfsd4: implement minimal SP4_MACH_CRED · 57266a6e
      J. Bruce Fields authored
      Do a minimal SP4_MACH_CRED implementation suggested by Trond, ignoring
      the client-provided spo_must_* arrays and just enforcing credential
      checks for the minimum required operations.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      57266a6e
    • J. Bruce Fields's avatar
      svcrpc: store gss mech in svc_cred · 0dc1531a
      J. Bruce Fields authored
      Store a pointer to the gss mechanism used in the rq_cred and cl_cred.
      This will make it easier to enforce SP4_MACH_CRED, which needs to
      compare the mechanism used on the exchange_id with that used on
      protected operations.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      0dc1531a
    • J. Bruce Fields's avatar
      svcrpc: introduce init_svc_cred · 44234063
      J. Bruce Fields authored
      Common helper to zero out fields of the svc_cred.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      44234063
    • J. Bruce Fields's avatar
      Merge branch 'for-3.10' into 'for-3.11' · 0de93493
      J. Bruce Fields authored
      Merge bugfixes into my for-3.11 branch.
      0de93493
  4. 29 May, 2013 1 commit
  5. 28 May, 2013 1 commit
  6. 21 May, 2013 2 commits
  7. 15 May, 2013 3 commits
  8. 13 May, 2013 5 commits
  9. 12 May, 2013 3 commits
    • Dan Carpenter's avatar
      svcauth_gss: fix error code in use_gss_proxy() · 625cdd78
      Dan Carpenter authored
      This should return zero on success and -EBUSY on error so the type
      needs to be int instead of bool.
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      625cdd78
    • Linus Torvalds's avatar
      Linux 3.10-rc1 · f722406f
      Linus Torvalds authored
      f722406f
    • Linus Torvalds's avatar
      Merge tag 'trace-fixes-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 26b840ae
      Linus Torvalds authored
      Pull tracing/kprobes update from Steven Rostedt:
       "The majority of these changes are from Masami Hiramatsu bringing
        kprobes up to par with the latest changes to ftrace (multi buffering
        and the new function probes).
      
        He also discovered and fixed some bugs in doing so.  When pulling in
        his patches, I also found a few minor bugs as well and fixed them.
      
        This also includes a compile fix for some archs that select the ring
        buffer but not tracing.
      
        I based this off of the last patch you took from me that fixed the
        merge conflict error, as that was the commit that had all the changes
        I needed for this set of changes."
      
      * tag 'trace-fixes-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing/kprobes: Support soft-mode disabling
        tracing/kprobes: Support ftrace_event_file base multibuffer
        tracing/kprobes: Pass trace_probe directly from dispatcher
        tracing/kprobes: Increment probe hit-count even if it is used by perf
        tracing/kprobes: Use bool for retprobe checker
        ftrace: Fix function probe when more than one probe is added
        ftrace: Fix the output of enabled_functions debug file
        ftrace: Fix locking in register_ftrace_function_probe()
        tracing: Add helper function trace_create_new_event() to remove duplicate code
        tracing: Modify soft-mode only if there's no other referrer
        tracing: Indicate enabled soft-mode in enable file
        tracing/kprobes: Fix to increment return event probe hit-count
        ftrace: Cleanup regex_lock and ftrace_lock around hash updating
        ftrace, kprobes: Fix a deadlock on ftrace_regex_lock
        ftrace: Have ftrace_regex_write() return either read or error
        tracing: Return error if register_ftrace_function_probe() fails for event_enable_func()
        tracing: Don't succeed if event_enable_func did not register anything
        ring-buffer: Select IRQ_WORK
      26b840ae
  10. 11 May, 2013 1 commit
    • Linus Torvalds's avatar
      Merge tag 'stable/for-linus-3.10-rc0-tag-two' of... · 607eeb0b
      Linus Torvalds authored
      Merge tag 'stable/for-linus-3.10-rc0-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
      
      Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
       - More fixes in the vCPU PVHVM hotplug path.
       - Add more documentation.
       - Fix various ARM related issues in the Xen generic drivers.
       - Updates in the xen-pciback driver per Bjorn's updates.
       - Mask the x2APIC feature for PV guests.
      
      * tag 'stable/for-linus-3.10-rc0-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
        xen/pci: Used cached MSI-X capability offset
        xen/pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
        xen: clear IRQ_NOAUTOEN and IRQ_NOREQUEST
        xen: mask x2APIC feature in PV
        xen: SWIOTLB is only used on x86
        xen/spinlock: Fix check from greater than to be also be greater or equal to.
        xen/smp/pvhvm: Don't point per_cpu(xen_vpcu, 33 and larger) to shared_info
        xen/vcpu: Document the xen_vcpu_info and xen_vcpu
        xen/vcpu/pvhvm: Fix vcpu hotplugging hanging.
      607eeb0b