1. 12 Feb, 2019 40 commits
    • Merge branch 'classifier-no-rtnl' · ef718bc3
      David S. Miller authored
      Vlad Buslov says:
      
      ====================
      Refactor classifier API to work with chain/classifiers without rtnl lock
      
      Currently, all netlink protocol handlers for updating rules, actions and
      qdiscs are protected with single global rtnl lock which removes any
      possibility for parallelism. This patch set is a third step to remove
      rtnl lock dependency from TC rules update path.
      
      Recently, new rtnl registration flag RTNL_FLAG_DOIT_UNLOCKED was added.
      Handlers registered with this flag are called without RTNL taken. End
      goal is to have rule update handlers (RTM_NEWTFILTER, RTM_DELTFILTER,
      etc.) registered with UNLOCKED flag to allow parallel execution.
      However, there is no intention to completely remove or split rtnl lock
      itself. This patch set addresses specific problems in implementation of
      classifiers API that prevent its control path from being executed
      concurrently, and completes refactoring of cls API rules update handlers
      by removing rtnl lock dependency from code that handles chains and
      classifiers. Rules update handlers are registered with
      RTNL_FLAG_DOIT_UNLOCKED flag.
      
      This patch set substitutes global rtnl lock dependency on rules update
      path in cls API by extending its data structures with following locks:
      - tcf_block with 'lock' mutex. It is used to protect block state and
        life-time management fields of chains on the block (chain->refcnt,
        chain->action_refcnt, chain->explicitly_created, etc.).
      - tcf_chain with 'filter_chain_lock' mutex, that is used to protect list
        of classifier instances attached to chain. chain0->filter_chain_lock
        serializes calls to head change callbacks and allows them to rely on
        filter_chain_lock for serialization instead of rtnl lock.
      - tcf_proto with 'lock' spinlock that is intended to be used to
        synchronize access to classifiers that support unlocked execution.
      
      Classifiers are extended with reference counting to accommodate parallel
      access by unlocked cls API. Classifier ops structure is extended with
      additional 'put' function to allow reference counting of filters; it is
      intended to be used by classifiers that implement rtnl-unlocked API.
      Users of classifiers and individual filter instances are modified to
      always hold reference while working with them.
      
      Classifiers that support unlocked execution still need to know the
      status of rtnl lock, so their API is extended with additional
      'rtnl_held' argument that is used to indicate that caller holds rtnl
      lock. Cls API propagates rtnl lock status across its helper functions
      and passes it to classifier.
      
      Changes from V3 to V4:
      - Patch 1:
        - Extract code that manages chain 'explicitly_created' flag into
          standalone patch.
      - Patch 2 - new.
      
      Changes from V2 to V3:
      - Change block->lock and chain->filter_chain_lock type to mutex. This
        removes the need for async miniqp refactoring and allows calling
        sleeping functions while holding the block->lock and
        chain->filter_chain_lock locks.
      - Previous patch 1 - async miniqp is no longer needed, remove the patch.
      - Patch 1:
        - Change block->lock type to mutex.
        - Implement tcf_block_destroy() helper function that destroys
          block->lock mutex before deallocating the block.
        - Revert GFP_KERNEL->GFP_ATOMIC memory allocation flags of tcf_chain
          which is no longer needed after block->lock type change.
      - Patch 6:
        - Change chain->filter_chain_lock type to mutex.
        - Assume chain0->filter_chain_lock synchronization instead of rtnl
          lock in mini_qdisc_pair_swap() function that is called from head
          change callback of ingress Qdisc. With filter_chain_lock type
          changed to mutex it is now possible to call sleeping function while
          holding it, so it is now used instead of async implementation from
          previous versions of this patch set.
      - Patch 7:
        - Add local tp_next var to tcf_chain_flush() and use it to store
          tp->next pointer dereferenced with rcu_dereference_protected() to
          satisfy kbuild test robot.
        - Reset tp pointer to NULL at the beginning of tc_new_tfilter() to
          prevent its uninitialized usage in error handling code. This code
          was already implemented in patch 10, but must be in patch 8 to
          preserve code bisectability.
        - Put parent chain in tcf_proto_destroy(). In previous version this
          code was implemented in patch 1 which was removed in V3.
      
      Changes from V1 to V2:
      - Patch 1:
        - Use smp_store_release() instead of xchg() for setting
          miniqp->tp_head.
        - Move Qdisc deallocation to tc_proto_wq ordered workqueue that is
          used to destroy tcf proto instances. This is necessary to ensure
          that Qdisc is destroyed after all instances of chain/proto that it
          contains in order to prevent use-after-free error in
          tc_chain_notify_delete().
        - Cache parent net device ifindex in block to prevent use-after-free
          of dev queue in tc_chain_notify_delete().
      - Patch 2:
        - Use lockdep_assert_held() instead of spin_is_locked() for assertion.
        - Use 'free_block' argument in tcf_chain_destroy() instead of checking
          block's reference count and chain_list for second time.
      - Patch 7:
        - Refactor tcf_chain0_head_change_cb_add() to take block->lock and
          chain0->filter_chain_lock in correct order.
      - Patch 10:
        - Always set 'tp_created' flag when creating tp to prevent releasing
          the chain twice when tp with same priority was inserted
          concurrently.
      - Patch 11:
        - Add additional check to prevent creation of new proto instance when
          parent chain is being flushed to reduce CPU usage.
        - Don't call tcf_chain_delete_empty() if tp insertion failed.
      - Patch 16 - new.
      - Patch 17:
        - Refactor to only take rtnl lock once (at the beginning of rule
          update handlers).
        - Always release rtnl mutex in the same function that locked it.
          Remove unlock code from tcf_block_release().
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: unlock rules update API · 470502de
      Vlad Buslov authored
      Register netlink protocol handlers for message types RTM_NEWTFILTER,
      RTM_DELTFILTER, RTM_GETTFILTER as unlocked. Set rtnl_held variable that
      tracks rtnl mutex state to be false by default.
      
      Introduce tcf_proto_is_unlocked() helper that is used to check
      tcf_proto_ops->flag to determine if ops can be called without taking rtnl
      lock. Manually lookup Qdisc, class and block in rule update handlers.
      Verify that both Qdisc ops and proto ops are unlocked before using any of
      their callbacks, and obtain rtnl lock otherwise.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
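
The conditional locking described in this commit can be sketched as a small userspace model. This is only an illustration under stated assumptions: the flag names, struct layouts and the counter that stands in for the rtnl mutex are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values; the kernel's names and values may differ. */
enum {
	MODEL_CLASS_OPS_UNLOCKED = 1 << 0,
	MODEL_PROTO_OPS_UNLOCKED = 1 << 0,
};

struct class_ops { unsigned int flags; };
struct proto_ops { unsigned int flags; };

/* Counter standing in for the global rtnl mutex (assumption). */
static int rtnl_depth;

static void rtnl_lock_model(void)
{
	rtnl_depth++;
}

static void rtnl_unlock_model(void)
{
	assert(rtnl_depth > 0);
	rtnl_depth--;
}

/* Models a helper in the spirit of tcf_proto_is_unlocked(). */
static bool proto_is_unlocked(const struct proto_ops *ops)
{
	return (ops->flags & MODEL_PROTO_OPS_UNLOCKED) != 0;
}

/*
 * Models a rule update handler: rtnl_held starts as false and the lock is
 * taken only if either ops structure lacks the unlocked flag.
 * Returns whether the lock had to be taken.
 */
static bool rule_update(const struct class_ops *cops,
			const struct proto_ops *pops)
{
	bool rtnl_held = false;

	if (!(cops->flags & MODEL_CLASS_OPS_UNLOCKED) ||
	    !proto_is_unlocked(pops)) {
		rtnl_lock_model();
		rtnl_held = true;
	}

	/* ... callbacks would run here and receive rtnl_held ... */

	if (rtnl_held)
		rtnl_unlock_model();
	return rtnl_held;
}
```

In this model the lock is released in the same function that took it, which mirrors the "always release rtnl mutex in the same function that locked it" note in the cover letter.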
    • net: sched: refactor tcf_block_find() into standalone functions · 18d3eefb
      Vlad Buslov authored
      Refactor tcf_block_find() code into three standalone functions:
      - __tcf_qdisc_find() to lookup Qdisc and increment its reference counter.
      - __tcf_qdisc_cl_find() to lookup class.
      - __tcf_block_find() to lookup block and increment its reference counter.
      
      This change is necessary to allow netlink tc rule update handlers to call
      these functions directly in order to conditionally take rtnl lock
      according to Qdisc class ops flags before calling any of class ops
      functions.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: add flags to Qdisc class ops struct · dfcd2a2b
      Vlad Buslov authored
      Extend Qdisc_class_ops with flags. Create enum to hold possible class ops
      flag values. Add first class ops flags value QDISC_CLASS_OPS_DOIT_UNLOCKED
      to indicate that class ops functions can be called without taking rtnl
      lock.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: extend proto ops to support unlocked classifiers · 12db03b6
      Vlad Buslov authored
      Add 'rtnl_held' flag to tcf proto change, delete, destroy, dump, walk
      functions to track rtnl lock status. Extend users of these functions in cls
      API to propagate rtnl lock status to them. This allows classifiers to
      obtain rtnl lock when necessary and to pass rtnl lock status to extensions
      and driver offload callbacks.
      
      Add flags field to tcf proto ops. Add flag value to indicate that
      classifier doesn't require rtnl lock.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: extend proto ops with 'put' callback · 7d5509fa
      Vlad Buslov authored
      Add optional tp->ops->put() API to be implemented for filter reference
      counting. This new function is called by cls API to release filter
      reference for filters returned by tp->ops->change() or tp->ops->get()
      functions. Implement tfilter_put() helper to call tp->ops->put() only for
      classifiers that implement it.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
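
A minimal sketch of the optional-callback idea, assuming simplified stand-in types: the struct names, the single-argument put() signature and filter_put_counted() are hypothetical and chosen only for illustration. The helper calls put() only when the classifier provides it and a filter was actually returned.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a filter instance (assumption). */
struct filter {
	int refcnt;
};

struct proto_ops {
	/* Optional: classifiers without filter refcounting leave this NULL. */
	void (*put)(struct filter *f);
};

/* Example put() implementation for a classifier that counts references. */
static void filter_put_counted(struct filter *f)
{
	f->refcnt--;
}

/* Models a tfilter_put()-style helper: a no-op unless put() exists. */
static void tfilter_put_model(const struct proto_ops *ops, struct filter *f)
{
	if (ops->put && f)
		ops->put(f);
}
```

Keeping the callback optional lets classifiers that still rely on rtnl lock stay unchanged, since callers go through the helper instead of calling put() directly.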
    • net: sched: track rtnl lock status when validating extensions · ec6743a1
      Vlad Buslov authored
      Actions API is already updated to not rely on rtnl lock for
      synchronization. However, it needs to be provided with rtnl status when
      called from classifiers API in order to be able to correctly release the
      lock when loading kernel module.
      
      Extend extension validation function with 'rtnl_held' flag which is passed
      to actions API. Add new 'rtnl_held' parameter to tcf_exts_validate() in cls
      API. No classifier is currently updated to support unlocked execution, so
      pass hardcoded 'true' flag parameter value.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: prevent insertion of new classifiers during chain flush · 726d0612
      Vlad Buslov authored
      Extend tcf_chain with 'flushing' flag. Use the flag to prevent insertion of
      new classifier instances when chain flushing is in progress in order to
      prevent resource leak when tcf_proto is created by unlocked users
      concurrently.
      
      Return EAGAIN error from tcf_chain_tp_insert_unique() to restart
      tc_new_tfilter() and lookup the chain/proto again.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
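
The restart-on-EAGAIN flow can be illustrated with a toy model. Everything here is an assumption made for the sketch: the chain struct, the two static chains and chain_lookup() are hypothetical, and the lookup simply pretends that the second attempt finds a chain that is no longer being flushed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct chain {
	bool flushing;
	int nr_protos;
};

/* Refuse insertion while the chain is being flushed. */
static int chain_insert(struct chain *c)
{
	if (c->flushing)
		return -EAGAIN;
	c->nr_protos++;
	return 0;
}

static struct chain old_chain = { .flushing = true };
static struct chain fresh_chain;
static int lookups;

/*
 * Hypothetical lookup: the first call finds the chain that is being
 * flushed, later calls find a fresh chain.
 */
static struct chain *chain_lookup(void)
{
	return lookups++ == 0 ? &old_chain : &fresh_chain;
}

/* Models the handler: on -EAGAIN, look the chain up again and retry. */
static int new_filter(void)
{
	int err;

replay:
	err = chain_insert(chain_lookup());
	if (err == -EAGAIN)
		goto replay;
	return err;
}
```

Because the insertion is refused rather than queued, nothing is attached to the chain under flush, which is the resource-leak case the commit message describes.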
    • net: sched: refactor tp insert/delete for concurrent execution · 8b64678e
      Vlad Buslov authored
      Implement unique insertion function to atomically attach tcf_proto to chain
      after verifying that no other tcf proto with specified priority exists.
      Implement delete function that verifies that tp is actually empty before
      deleting it. Use these functions to refactor cls API to account for
      concurrent tp and rule update instead of relying on rtnl lock. Add new
      'deleting' flag to tcf proto. Use it to restart search when iterating over
      tp's on chain to prevent accessing potentially invalid tp->next pointer.
      
      Extend tcf proto with spinlock that is intended to be used to protect its
      data from concurrent modification instead of relying on rtnl mutex. Use it
      to protect 'deleting' flag. Add lockdep macros to validate that lock is
      held when accessing protected fields.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
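
A rough sketch of unique insertion by priority, written as a single-threaded model. The types are hypothetical, the list is assumed to be kept in ascending priority order, and returning -EEXIST for a duplicate is a simplification chosen for this example; in the kernel the check and the attach would have to happen under the chain lock to be atomic.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct proto {
	unsigned int prio;
	struct proto *next;
};

struct chain {
	struct proto *head;	/* assumed sorted by ascending prio */
};

/*
 * Attach tp only if no proto with the same priority exists.
 * Check and insert are one step, so a caller holding the chain lock
 * cannot race with another inserter using the same priority.
 */
static int chain_insert_unique(struct chain *c, struct proto *tp)
{
	struct proto **pp = &c->head;

	/* Find the first entry whose priority is not lower than tp's. */
	while (*pp && (*pp)->prio < tp->prio)
		pp = &(*pp)->next;

	if (*pp && (*pp)->prio == tp->prio)
		return -EEXIST;	/* simplification for this sketch */

	tp->next = *pp;
	*pp = tp;
	return 0;
}
```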
    • net: sched: traverse classifiers in chain with tcf_get_next_proto() · fe2923af
      Vlad Buslov authored
      All users of chain->filter_chain rely on rtnl lock and assume that no new
      classifier instances are added when traversing the list. Use
      tcf_get_next_proto() to traverse filters list without relying on rtnl
      mutex. This function iterates over classifiers by taking reference to
      current iterator classifier only and doesn't assume external
      synchronization of filters list.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
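
The iteration pattern, where only the current element is pinned by a reference, might look like the following single-threaded model. The types and the get_next_proto() name are hypothetical; in the kernel the step to the next element would be done under the chain lock, and the reference drop could trigger destruction, both of which are omitted here.

```c
#include <assert.h>
#include <stddef.h>

struct proto {
	int refcnt;
	struct proto *next;
};

struct chain {
	struct proto *head;
};

/*
 * Return the element after cur (or the first one if cur is NULL) with a
 * reference taken, then drop the reference held on cur. At any time the
 * iterator pins exactly one element.
 */
static struct proto *get_next_proto(struct chain *c, struct proto *cur)
{
	struct proto *next = cur ? cur->next : c->head;

	if (next)
		next->refcnt++;
	if (cur)
		cur->refcnt--;
	return next;
}
```

A caller would loop with `for (tp = get_next_proto(c, NULL); tp; tp = get_next_proto(c, tp))`; breaking out early would require dropping the reference on the current element by hand.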
    • net: sched: introduce reference counting for tcf_proto · 4dbfa766
      Vlad Buslov authored
      In order to remove dependency on rtnl lock and allow concurrent tcf_proto
      modification, extend tcf_proto with reference counter. Implement helper
      get/put functions for tcf proto and use them to modify cls API to always
      take reference to tcf_proto while using it. Only release reference to
      parent chain after releasing last reference to tp.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
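
As an illustration of the get/put scheme, here is a small model using C11 atomics. The struct layouts and function names are hypothetical, and the 'destroyed' flag stands in for the real teardown; the point shown is that the parent chain reference is dropped only by whoever releases the last reference to the proto.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct chain {
	atomic_int refcnt;
};

struct proto {
	atomic_int refcnt;
	struct chain *chain;	/* parent chain, pinned while proto lives */
	bool destroyed;		/* stand-in for real teardown */
};

static void proto_get(struct proto *tp)
{
	atomic_fetch_add(&tp->refcnt, 1);
}

static void proto_put(struct proto *tp)
{
	/* fetch_sub returns the previous value: 1 means last reference. */
	if (atomic_fetch_sub(&tp->refcnt, 1) == 1) {
		tp->destroyed = true;
		/* Release the parent only after the proto is gone. */
		atomic_fetch_sub(&tp->chain->refcnt, 1);
	}
}
```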
    • net: sched: protect filter_chain list with filter_chain_lock mutex · ed76f5ed
      Vlad Buslov authored
      Extend tcf_chain with new filter_chain_lock mutex. Always lock the chain
      when accessing filter_chain list, instead of relying on rtnl lock.
      Dereference filter_chain with tcf_chain_dereference() lockdep macro to
      verify that all users of chain_list have the lock taken.
      
      Rearrange tp insert/remove code in tc_new_tfilter/tc_del_tfilter to execute
      all necessary code while holding chain lock in order to prevent
      invalidation of chain_info structure by potential concurrent change. This
      also serializes calls to tcf_chain0_head_change(), which allows head change
      callbacks to rely on filter_chain_lock for synchronization instead of rtnl
      mutex.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: protect chain template accesses with block lock · a5654820
      Vlad Buslov authored
      When cls API is called without protection of rtnl lock, parallel
      modification of chain is possible, which means that chain template can be
      changed concurrently in certain circumstances. For example, when chain is
      'deleted' by new user-space chain API, the chain might continue to be used
      if it is referenced by actions, and can be 're-created' again by user. In
      such case same chain structure is reused and its template is changed. To
      protect from described scenario, cache chain template while holding block
      lock. Introduce standalone tc_chain_notify_delete() function that works
      with cached template values, instead of chains themselves.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: traverse chains in block with tcf_get_next_chain() · bbf73830
      Vlad Buslov authored
      All users of block->chain_list rely on rtnl lock and assume that no new
      chains are added when traversing the list. Use tcf_get_next_chain() to
      traverse chain list without relying on rtnl mutex. This function iterates
      over chains by taking reference to current iterator chain only and doesn't
      assume external synchronization of chain list.
      
      Don't take reference to all chains in block when flushing and use
      tcf_get_next_chain() to safely iterate over chain list instead. Remove
      tcf_block_put_all_chains() that is no longer used.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: protect block->chain0 with block->lock · 165f0135
      Vlad Buslov authored
      In order to remove dependency on rtnl lock, use block->lock to protect
      chain0 struct from concurrent modification. Rearrange code in chain0
      callback add and del functions to only access chain0 when block->lock is
      held.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: refactor tc_ctl_chain() to use block->lock · 2cbfab07
      Vlad Buslov authored
      In order to remove dependency on rtnl lock, modify chain API to use
      block->lock to protect chain from concurrent modification. Rearrange
      tc_ctl_chain() code to call tcf_chain_hold() while holding block->lock to
      prevent concurrent chain removal.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: protect chain->explicitly_created with block->lock · 91052fa1
      Vlad Buslov authored
      In order to remove dependency on rtnl lock, protect
      tcf_chain->explicitly_created flag with block->lock. Consolidate code that
      checks and resets 'explicitly_created' flag into __tcf_chain_put() to
      execute it atomically with rest of code that puts chain reference.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: protect block state with mutex · c266f64d
      Vlad Buslov authored
      Currently, tcf_block doesn't use any synchronization mechanisms to protect
      critical sections that manage lifetime of its chains. block->chain_list and
      multiple variables in tcf_chain that control its lifetime assume external
      synchronization provided by global rtnl lock. Converting chain reference
      counting to atomic reference counters is not possible because cls API uses
      multiple counters and flags to control chain lifetime, so all of them must
      be synchronized in chain get/put code.
      
      Use single per-block lock to protect block data and manage lifetime of all
      chains on the block. Always take block->lock when accessing chain_list.
      Chain get and put modify chain lifetime-management data and parent block's
      chain_list, so take the lock in these functions. Verify block->lock state
      with assertions in functions that expect to be called with the lock taken
      and are called from multiple places. Take block->lock when accessing
      filter_chain_list.
      
      In order to allow parallel update of rules on single block, move all calls
      to classifiers outside of critical sections protected by new block->lock.
      Rearrange chain get and put functions code to only access protected chain
      data while holding block lock:
      - Rearrange code to only access chain reference counter and chain action
        reference counter while holding block lock.
      - Extract code that requires block->lock from tcf_chain_destroy() into
        standalone tcf_chain_destroy() function that is called by
        __tcf_chain_put() in same critical section that changes chain reference
        counters.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • isdn_v110: mark expected switch fall-through · b67de691
      Gustavo A. R. Silva authored
      In preparation to enabling -Wimplicit-fallthrough, mark switch
      cases where we are expecting to fall through.
      
      This patch fixes the following warnings:
      
      drivers/isdn/i4l/isdn_v110.c: In function ‘EncodeMatrix’:
      drivers/isdn/i4l/isdn_v110.c:353:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
          if (line >= mlen) {
             ^
      drivers/isdn/i4l/isdn_v110.c:358:3: note: here
         case 128:
         ^~~~
      
      Warning level 3 was used: -Wimplicit-fallthrough=3
      
      Notice that, in this particular case, the code comment is modified
      in accordance with what GCC is expecting to find.
      
      This patch is part of the ongoing efforts to enable
      -Wimplicit-fallthrough.
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
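
For readers unfamiliar with the annotation, the following is a generic example of marking intentional fall-through with a comment that GCC recognizes at -Wimplicit-fallthrough=3. The function is hypothetical and is not taken from the ISDN drivers.

```c
#include <assert.h>

/* Hypothetical example: count how many cases run for a given level. */
static int count_steps(int level)
{
	int steps = 0;

	switch (level) {
	case 2:
		steps++;
		/* fall through */
	case 1:
		steps++;
		/* fall through */
	case 0:
		steps++;
		break;
	default:
		steps = -1;
	}
	return steps;
}
```

Without the comments, each of the first two cases would produce the "this statement may fall through" warning quoted in the commit message when the warning is enabled.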
    • isdn: i4l: isdn_tty: Mark expected switch fall-through · 56e9b6b9
      Gustavo A. R. Silva authored
      In preparation to enabling -Wimplicit-fallthrough, mark switch
      cases where we are expecting to fall through.
      
      This patch fixes the following warnings:
      
      drivers/isdn/i4l/isdn_tty.c: In function ‘isdn_tty_edit_at’:
      drivers/isdn/i4l/isdn_tty.c:3644:18: warning: this statement may fall through [-Wimplicit-fallthrough=]
             m->mdmcmdl = 0;
             ~~~~~~~~~~~^~~
      drivers/isdn/i4l/isdn_tty.c:3646:5: note: here
           case 0:
           ^~~~
      
      Warning level 3 was used: -Wimplicit-fallthrough=3
      
      Notice that, in this particular case, the code comment is modified
      in accordance with what GCC is expecting to find.
      
      This patch is part of the ongoing efforts to enable
      -Wimplicit-fallthrough.
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ser_gigaset: mark expected switch fall-through · b6cd7dd9
      Gustavo A. R. Silva authored
      In preparation to enabling -Wimplicit-fallthrough, mark switch
      cases where we are expecting to fall through.
      
      This patch fixes the following warning:
      
      drivers/isdn/gigaset/ser-gigaset.c: In function ‘gigaset_tty_ioctl’:
      drivers/isdn/gigaset/ser-gigaset.c:627:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
         switch (arg) {
         ^~~~~~
      drivers/isdn/gigaset/ser-gigaset.c:638:2: note: here
        default:
        ^~~~~~~
      
      Warning level 3 was used: -Wimplicit-fallthrough=3
      
      Notice that, in this particular case, the code comment is modified
      in accordance with what GCC is expecting to find.
      
      This patch is part of the ongoing efforts to enable
      -Wimplicit-fallthrough.
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Acked-by: Paul Bolle <pebolle@tiscali.nl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 's390-qeth-next' · 8a1343c5
      David S. Miller authored
      Julian Wiedmann says:
      
      ====================
      s390/qeth: updates 2019-02-12
      
      please apply one more round of qeth patches to net-next.
      This series targets the driver's control paths. It primarily brings improvements
      to the error handling for sent cmds and received responses, along with the
      usual cleanup and consolidation efforts.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: convert remaining legacy cmd callbacks · 742d4d40
      Julian Wiedmann authored
      This calls the existing errno translation helpers from the callbacks,
      adding trivial wrappers where necessary. For cmds that have no
      sophisticated errno translation, default to -EIO.
      
      For IPA cmds with no callback, fall back to a minimal default. This is
      currently being used by qeth_l3_send_setrouting().
      
      Having thus converted all callbacks, remove the legacy path in
      qeth_send_control_data_cb().
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: convert bridgeport callbacks · 1709ff8d
      Julian Wiedmann authored
      By letting the callbacks deal with error translation, we no longer need
      to pass the raw error codes back to the originator. This allows us to
      slim down the callback's private data, and nicely simplifies the code.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: allow cmd callbacks to return errnos · 4b7ae122
      Julian Wiedmann authored
      Error propagation from cmd callbacks currently works in a way where
      qeth_send_control_data_cb() picks the raw HW code from the response,
      and the cmd's originator later translates this into an errno.
      The callback itself only returns 0 ("done") or 1 ("expect more data").
      
      This is
      1. limiting, as the only means for the callback to report an internal
      error is to invent pseudo HW codes (such as IPA_RC_ENOMEM), that
      the originator then needs to understand. For non-IPA callbacks, we
      even provide a separate field in the IO buffer metadata (iob->rc) so
      the callback can pass back a return value.
      2. fragile, as the originator must take care to not translate any errno
      that is returned by qeth's own IO code paths (eg -ENOMEM). Also, any
      originator that forgets to translate the HW codes potentially passes
      garbage back to its caller. For instance, see
      commit 2aa48671 ("s390/qeth: translate SETVLAN/DELVLAN errors").
      
      Introduce a new model where all HW error translation is done within the
      callback, and the callback returns
      >  0, if it expects more data (as before)
      == 0, on success
      <  0, with an errno
      
      Start off with converting all callbacks to the new model that either
      a) pass back pseudo HW codes, or b) have a dependency on a specific
      HW error code. Also convert c) the one callback that uses iob->rc, and
      d) qeth_setadpparms_change_macaddr_cb() so that it can pass an error
      back to qeth_l2_request_initial_mac() even when the cmd itself
      was successful.
      
      The old model remains supported: if the callback returns 0, we still
      propagate the response's HW error code back to the originator.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
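
The new return convention (> 0 for "expect more data", 0 for success, < 0 for an errno) can be modelled roughly as below. The HW codes, the reply struct, the callback signature and the dispatcher are all hypothetical stand-ins for this sketch, not qeth's actual definitions; the -ETIMEDOUT result is likewise only a modelling choice for "more data was expected but never arrived".

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical HW return codes, for illustration only. */
enum { HW_RC_OK = 0, HW_RC_NOMEM = 1 };

struct reply {
	int hw_rc;
};

/*
 * New-model callback: translates HW codes itself.
 * Returns > 0 if it expects more data, 0 on success, < 0 with an errno.
 */
static int demo_cb(const struct reply *r, int remaining)
{
	if (r->hw_rc == HW_RC_NOMEM)
		return -ENOMEM;
	if (r->hw_rc != HW_RC_OK)
		return -EIO;	/* default when no specific translation exists */
	return remaining > 0 ? 1 : 0;
}

/* Feed replies to the callback until it reports completion or an error. */
static int process_replies(const struct reply *replies, int n,
			   int (*cb)(const struct reply *, int))
{
	for (int i = 0; i < n; i++) {
		int rc = cb(&replies[i], n - 1 - i);

		if (rc <= 0)
			return rc;	/* done: success or errno */
	}
	return -ETIMEDOUT;	/* model: expected data never arrived */
}
```

With this shape the originator receives a plain errno and never has to interpret raw HW codes.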
    • s390/qeth: cancel cmd on early error · 54daaca7
      Julian Wiedmann authored
      When sending cmds via qeth_send_control_data(), qeth puts the request
      on the IO channel and then blocks on the reply object until the response
      has been received.
      
      If the IO completes with error, there will never be a response and we
      block until the reply-wait hits its timeout. For this case, connect the
      request buffer to its reply object, so that we can immediately cancel
      the wait.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: simplify reply object handling · 0951c6ba
      Julian Wiedmann authored
      Current code enqueues & dequeues a reply object from the waiter list
      in various places. In particular, the dequeue & enqueue in
      qeth_send_control_data_cb() looks fragile - this can cause
      qeth_clear_ipacmd_list() to skip the active object.
      Add some helpers, and boil the logic down by giving
      qeth_send_control_data() the sole responsibility to add and remove
      objects.
      
      qeth_send_control_data_cb() and qeth_clear_ipacmd_list() will now only
      notify the reply object to interrupt its wait cycle. This can cause
      a slight delay in the removal, but that's no concern.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: limit trace to valid data of command request · 51581fd0
      Julian Wiedmann authored
      'len' specifies how much data we send to the HW, don't dump beyond this
      boundary.
      As of today this is no big concern - commands are built in full, zeroed
      pages.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: align csum offload with TSO control logic · 4386e34f
      Julian Wiedmann authored
      csum offload and TSO have similar programming requirements. The TSO code
      was reworked with commit "s390/qeth: enhance TSO control sequence",
      adjust the csum control flow accordingly. Primarily this means replacing
      custom helpers with more generic infrastructure.
      
      Also, change the LP2LP check so that it warns on TX offload (not RX).
      This is where reduced csum capability actually matters.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: enable only required csum offload features · 7e83747d
      Julian Wiedmann authored
      Current code attempts to enable all advertised HW csum offload features.
      Future-proof this by enabling only those features that we actually use.
      
      Also, the IPv4 header csum feature is only needed for TX on L3 devices.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
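
The feature selection described in the commit above amounts to masking the advertised features with the set the driver needs. The sketch below is a minimal illustration, not the qeth implementation: the feature bits, the enum and both function names are hypothetical, and treating TCP/UDP checksumming as the "actually used" set is an assumption. Only the rule that the IPv4 header checksum is needed for TX on L3 devices comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits; the real hardware encoding may differ. */
#define CSUM_F_IPV4_HDR 0x01u
#define CSUM_F_TCP      0x02u
#define CSUM_F_UDP      0x04u
#define CSUM_F_FUTURE   0x80u /* stands in for a feature the driver does not use */

enum csum_dir { CSUM_DIR_RX, CSUM_DIR_TX };

/* Features the driver relies on (assumed: TCP and UDP checksums). */
static uint32_t required_csum_features(enum csum_dir dir, bool is_l3)
{
    uint32_t required = CSUM_F_TCP | CSUM_F_UDP;

    assert(dir == CSUM_DIR_RX || dir == CSUM_DIR_TX);
    /* Per the commit message: IPv4 header csum only for TX on L3 devices. */
    if (dir == CSUM_DIR_TX && is_l3)
        required |= CSUM_F_IPV4_HDR;
    return required;
}

/* Enable the intersection of advertised and required features only. */
static uint32_t csum_features_to_enable(uint32_t advertised,
                                        enum csum_dir dir, bool is_l3)
{
    return advertised & required_csum_features(dir, is_l3);
}
```

With this shape, a feature bit that future hardware starts advertising is ignored until the driver explicitly adds it to the required set.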
    • s390/qeth: consolidate filling of low-level cmd length fields · c2153277
      Julian Wiedmann authored
      The code to fill the IPA length fields is duplicated three times across
      the driver:
      1. qeth_send_ipa_cmd() sets IPA_CMD_LENGTH, which matches the defaults
         in the IPA_PDU_HEADER template.
      2. for OSN, qeth_osn_send_ipa_cmd() bypasses this logic and inserts the
         length passed by the caller.
      3. SNMP commands (that can outgrow IPA_CMD_LENGTH) have their own way
         of setting the length fields, via qeth_send_ipa_snmp_cmd().
      
      Consolidate this into qeth_prepare_ipa_cmd(), which all originators of
      IPA cmds already call during setup of their cmd. Let qeth_send_ipa_cmd()
      pull the length from the cmd instead of hard-coding IPA_CMD_LENGTH.
      
      For now, the SNMP code still needs to fix-up its length fields manually.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: reduce data length for ARP cache query · 84dbea46
      Julian Wiedmann authored
      qeth_l3_query_arp_cache_info() indicates a data length that's much
      larger than the actual length of its request (i.e. the value passed to
      qeth_get_setassparms_cmd()). The confusion presumably comes from the
      fact that the cmd _response_ can be quite large - but that's no concern
      for the initial request IO.
      
      Fixing this up allows us to use the generic qeth_send_ipa_cmd()
      infrastructure.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'Add-ENETC-PTP-clock-driver' · a263f99c
      David S. Miller authored
      Yangbo Lu says:
      
      ====================
      Add ENETC PTP clock driver
      
      The new ENETC Ethernet controller has the same QorIQ 1588 timer IP
      block as the eTSEC/DPAA Ethernet controllers. However, it uses a
      different endianness (little-endian) and is handled by a PCI driver.
      
      To support the ENETC PTP driver, the ptp_qoriq driver needed to be
      reworked: make functions global for reuse, add little-endian
      support, add ENETC memory map support, and add an ENETC dependency
      for the ptp_qoriq driver.
      
      In addition, although the ENETC PTP driver is a PCI driver, the dts
      node can still be used. Currently the ls1028a dtsi, the only
      platform using ENETC so far, is not complete, so upstreaming the
      ENETC PTP node still depends on it. This will be done in the near
      future. The hardware timestamping support for ENETC is done but
      needs to be reworked with a new method in the internal git tree,
      and will be sent out soon.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames · 74abc07d
      Saeed Mahameed authored
      When an ethernet frame is padded to meet the minimum ethernet frame
      size, the padding octets are not covered by the hardware checksum.
      Fortunately the padding octets are usually zeros, which don't affect
      the checksum. However, this is not guaranteed. For example, switches
      might choose to make other use of these octets.
      This repeatedly causes kernel hardware checksum faults.
      
      Prior to the cited commit below, the skb checksum was forced to be
      CHECKSUM_NONE when padding was detected. After it, we need to keep
      skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires
      verifying and parsing IP headers, which is not worth the effort as the
      packets are so small that CHECKSUM_COMPLETE has no significant advantage.
      
      Future work: when reporting checksum complete is not an option for
      IP non-TCP/UDP packets, we can actually fall back to reporting checksum
      unnecessary, by looking at the cqe IPOK bit.
      
      Fixes: 88078d98 ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends")
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
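
The padding problem in the commit above can be reproduced outside the kernel. The sketch below is a user-space illustration under stated assumptions, not the mlx4 code: `ones_sum()`, `rx_csum_mode()` and the enum are hypothetical names, and the threshold of the 60-byte minimum frame plus a 4-byte FCS is an assumption about where "short" begins. It shows that zero padding leaves a 16-bit ones' complement sum unchanged, that non-zero padding does not, and that short frames can therefore simply be reported without a hardware checksum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_FRAME_LEN 60u /* minimum Ethernet frame length without FCS */
#define FCS_LEN       4u  /* frame check sequence length */

enum rx_csum_mode { RX_CSUM_NONE, RX_CSUM_COMPLETE };

/* 16-bit ones' complement sum over 'len' bytes (the sum, not its inverse). */
static uint16_t ones_sum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)buf[i] << 8 | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8; /* odd trailing byte */
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16); /* fold the carries back in */
    return (uint16_t)sum;
}

/* Assumed policy: frames short enough to carry padding get no HW checksum. */
static enum rx_csum_mode rx_csum_mode(size_t frame_len)
{
    return frame_len <= MIN_FRAME_LEN + FCS_LEN ? RX_CSUM_NONE
                                                : RX_CSUM_COMPLETE;
}
```

Summing a payload, then the same payload followed by zero octets, yields identical values; replacing the zeros with arbitrary octets changes the sum, which is the mismatch the stack reports as a hardware checksum fault.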
    • MAINTAINERS: add enetc_ptp driver into QorIQ PTP list · bb024c3b
      Yangbo Lu authored
      This patch adds the enetc_ptp driver to the QorIQ PTP list
      for maintenance.
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • enetc: add PTP clock driver · 19971f5e
      Yangbo Lu authored
      This patch adds a PTP clock driver for ENETC.
      The driver reuses the QorIQ PTP clock driver.
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ptp: add QorIQ PTP support for ENETC · ad6e1be6
      Yangbo Lu authored
      This patch adds QorIQ PTP support for ENETC.
      The ENETC PTP driver, a PCI driver for the same
      1588 timer IP block, will reuse the QorIQ PTP driver.
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ptp_qoriq: fix register memory map · d4e17687
      Yangbo Lu authored
      The 1588 timer on the eTSEC Ethernet controller uses a different
      register memory map from the DPAA Ethernet controller.
      The new ENETC Ethernet controller uses the same register
      memory map as DPAA. To support ENETC, use the register
      memory map of DPAA/ENETC by default.
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-binding: ptp_qoriq: add little-endian support · 2843bf51
      Yangbo Lu authored
      Specify the "little-endian" property if the 1588 timer IP block
      is in little-endian mode. The default endian mode is big-endian.
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ptp_qoriq: add little endian support · f038ddf2
      Yangbo Lu authored
      The new ENETC Ethernet controller has a QorIQ 1588 timer IP block.
      However, it uses little-endian mode, unlike the existing
      controllers. This patch adds little-endian support to the
      driver via the "little-endian" dts node property.
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>