1. 01 Nov, 2012 8 commits
    • sk-filter: Add ability to get socket filter program (v2) · a8fc9277
      Pavel Emelyanov authored
      The SO_ATTACH_FILTER option is currently set-only. I propose adding
      the ability to get the filter back by using SO_ATTACH_FILTER in
      getsockopt. To be easier on the eyes, a SO_GET_FILTER alias for it
      is declared. This ability is required by the checkpoint-restore
      project to be able to save the full state of a socket.
      
      There are two issues with getting the filter back.
      
      First, the kernel modifies sock_filter->code on filter load, so in
      order to return a filter element to the user we have to decode it
      back into user-visible constants. Fortunately, the modification in
      question is reversible.
      
      Second, for the BPF_S_ALU_DIV_K code the kernel modifies the command
      argument k to speed up the run-time division, by computing
      kernel_k = reciprocal(user_k). The bad news is that different user_k
      values may result in the same kernel_k, so we can't get the original
      user_k back. The good news is that we don't have to. What we need to
      do is calculate a user2_k such that
      
        reciprocal(user2_k) == reciprocal(user_k) == kernel_k
      
      i.e. if the filter is loaded again, the recompiled value will be
      exactly the same as it was. That said, user2_k can be calculated like this
      
        user2_k = reciprocal(kernel_k)
      
      with the exception that if kernel_k == 0, then user2_k == 1.
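
      The round trip described above can be sketched in user space. This is a
      minimal model, not the kernel code: it assumes reciprocal(k) is
      ceil(2^32 / k) truncated to 32 bits, and the helper names
      decode_div_k() and div_k_round_trips() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed model of reciprocal(): ceil(2^32 / k), truncated to 32 bits.
 * Under this model k == 1 yields 2^32, which truncates to 0. */
static uint32_t reciprocal_value(uint32_t k)
{
    assert(k != 0);                      /* division by zero is not modelled */
    uint64_t val = (1ULL << 32) + (k - 1);
    return (uint32_t)(val / k);
}

/* Hypothetical decode step: derive a user2_k from the stored kernel_k. */
static uint32_t decode_div_k(uint32_t kernel_k)
{
    if (kernel_k == 0)                   /* the exception named in the message */
        return 1;
    return reciprocal_value(kernel_k);
}

/* Returns 1 if re-loading the decoded value reproduces the same kernel_k. */
static int div_k_round_trips(uint32_t user_k)
{
    uint32_t kernel_k = reciprocal_value(user_k);
    return reciprocal_value(decode_div_k(kernel_k)) == kernel_k;
}
```

      Under this model user2_k need not equal user_k. For example,
      user_k = 0xFFFFFFFF decodes to 0x80000000, and both compile to the same
      kernel_k.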
      
      The optlen argument is treated like this: when it is zero, the kernel
      returns the number of sock_fprog elements in the filter; otherwise it
      must be large enough for the sock_fprog array.
      
      Changes since v1:
      * Declared SO_GET_FILTER in all arch headers
      * Added decode of vlan-tag codes
      Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tuntap: choose the txq based on rxq · 96442e42
      Jason Wang authored
      This patch implements a simple multiqueue flow steering policy for tun/tap:
      tx follows rx. The idea is simple: choose the txq based on the rxq the flow
      came from. Flows are identified through the rxhash of an skb, and the
      hash-to-queue mapping is recorded in an hlist with an ageing timer to retire
      stale mappings. A mapping is created when tun receives a packet from
      userspace, and it is queried in .ndo_select_queue().
      
      I ran a concurrent TCP_CRR test and didn't see any mapping manipulation
      helpers in perf top, so the overhead can be neglected.
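
      The hash-to-queue mapping with ageing can be modelled roughly as below.
      This is a simplified sketch, not the driver code: it swaps the hlist for
      a direct-mapped table, and the type names, table size and age limit are
      all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLOW_BUCKETS 1024   /* hypothetical table size */
#define FLOW_MAX_AGE 10     /* hypothetical age limit, in caller-defined ticks */

struct flow_entry {
    uint32_t rxhash;        /* flow identifier taken from the packet */
    uint16_t queue;         /* queue the flow was last seen on */
    uint64_t updated;       /* tick of the last update */
    int used;
};

struct flow_table {
    struct flow_entry slot[FLOW_BUCKETS];
};

/* Record (or refresh) the mapping when a packet arrives from userspace. */
static void flow_record(struct flow_table *t, uint32_t rxhash,
                        uint16_t queue, uint64_t now)
{
    assert(t != NULL);
    struct flow_entry *e = &t->slot[rxhash % FLOW_BUCKETS];
    e->rxhash = rxhash;
    e->queue = queue;
    e->updated = now;
    e->used = 1;
}

/* Look up the queue for a flow; returns -1 if no mapping is recorded. */
static int flow_lookup(const struct flow_table *t, uint32_t rxhash)
{
    assert(t != NULL);
    const struct flow_entry *e = &t->slot[rxhash % FLOW_BUCKETS];
    return (e->used && e->rxhash == rxhash) ? (int)e->queue : -1;
}

/* Stand-in for the ageing timer: retire mappings older than FLOW_MAX_AGE. */
static void flow_expire(struct flow_table *t, uint64_t now)
{
    assert(t != NULL);
    for (size_t i = 0; i < FLOW_BUCKETS; i++)
        if (t->slot[i].used && now - t->slot[i].updated > FLOW_MAX_AGE)
            t->slot[i].used = 0;
}
```

      In this model a colliding flow simply overwrites the slot, whereas a
      chained list like the hlist named in the message could keep both entries.
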
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tuntap: add ioctl to attach or detach a file form tuntap device · cde8b15f
      Jason Wang authored
      Sometimes userspace may need to activate or deactivate a queue; this can be
      done by detaching a file from, or attaching it to, the tuntap device.
      
      This patch introduces a new ioctl, TUNSETQUEUE, which can be used to do
      this. The flag IFF_ATTACH_QUEUE is introduced to do the attaching, while
      IFF_DETACH_QUEUE is introduced to do the detaching.
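
      From userspace, toggling a queue might look like the sketch below. It
      assumes the flag is passed in ifr_flags of a struct ifreq, and the
      helper names queue_request() and tun_set_queue() are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Hypothetical helper: build the request for attaching or detaching. */
static void queue_request(struct ifreq *ifr, int attach)
{
    assert(ifr != NULL);
    memset(ifr, 0, sizeof(*ifr));
    ifr->ifr_flags = attach ? IFF_ATTACH_QUEUE : IFF_DETACH_QUEUE;
}

/* Hypothetical helper: enable (attach != 0) or disable the queue behind fd.
 * Returns the ioctl result: 0 on success, -1 with errno set on failure. */
static int tun_set_queue(int fd, int attach)
{
    struct ifreq ifr;

    queue_request(&ifr, attach);
    return ioctl(fd, TUNSETQUEUE, &ifr);
}
```

      Here fd is expected to be a descriptor that was opened on the tuntap
      device beforehand.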
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tuntap: multiqueue support · c8d68e6b
      Jason Wang authored
      This patch converts tun/tap to a multiqueue device and exposes the queues
      as multiple file descriptors to userspace. Internally, each tun_file is
      abstracted as a queue, and an array of pointers to tun_file structures is
      stored in the tun_struct device, so multiple tun_files can be attached to
      the device as multiple queues.
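
      Opening several queues on one device could be sketched as follows. The
      flag name IFF_MULTI_QUEUE and the /dev/net/tun path are not given in this
      message and are taken as assumptions from the kernel's tuntap interface.
      The helper names are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Hypothetical helper: request a tap device in multiqueue mode. */
static void tun_mq_request(struct ifreq *ifr, const char *name)
{
    assert(ifr != NULL && name != NULL);
    memset(ifr, 0, sizeof(*ifr));
    ifr->ifr_flags = IFF_TAP | IFF_NO_PI | IFF_MULTI_QUEUE;
    strncpy(ifr->ifr_name, name, IFNAMSIZ - 1);   /* stays NUL-terminated */
}

/* Hypothetical helper: open nqueues descriptors on the same device.
 * Each descriptor is one queue. Returns 0 on success, -1 on failure
 * (all descriptors opened so far are closed again). */
static int tun_open_queues(const char *name, int nqueues, int *fds)
{
    struct ifreq ifr;

    tun_mq_request(&ifr, name);
    for (int i = 0; i < nqueues; i++) {
        fds[i] = open("/dev/net/tun", O_RDWR);
        if (fds[i] < 0 || ioctl(fds[i], TUNSETIFF, &ifr) < 0) {
            if (fds[i] >= 0)
                close(fds[i]);
            while (i-- > 0)
                close(fds[i]);
            return -1;
        }
    }
    return 0;
}
```

      Creating the device normally needs elevated privileges, so the call is
      expected to fail with -1 for an unprivileged user.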
      
      When choosing the txq, we first try to identify a flow through its rxhash;
      if the packet has none, we fall back to the recorded rxq and use it to
      choose the transmit queue. This policy may be changed in the future.
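
      The selection order can be illustrated with a small pure function. This
      is a sketch under assumptions: scaling the hash onto the queue range and
      wrapping the recorded rxq with a modulo are illustrative choices, and
      pick_txq() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical selector: prefer the flow hash, then the recorded rx queue. */
static uint32_t pick_txq(uint32_t rxhash, int rxq_recorded,
                         uint32_t rxq, uint32_t numqueues)
{
    assert(numqueues > 0);

    if (rxhash != 0)
        /* Map the 32-bit hash onto [0, numqueues) without a division. */
        return (uint32_t)(((uint64_t)rxhash * numqueues) >> 32);

    if (rxq_recorded)
        /* No hash available: reuse the recorded rx queue, wrapped to range. */
        return rxq % numqueues;

    return 0;   /* nothing to go on: fall back to the first queue */
}
```

      With four queues, a hash of 0x80000000 lands on queue 2, and a packet
      without a hash that was recorded on rxq 5 lands on queue 1.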
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tuntap: introduce multiqueue flags · bbb00994
      Jason Wang authored
      Add flags to be used when creating a multiqueue tuntap device.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tuntap: RCUify dereferencing between tun_struct and tun_file · 6e914fc7
      Jason Wang authored
      RCU is introduced in this patch to synchronize the dereferences between
      tun_struct and tun_file. All tun_{get|put} calls are replaced with RCU; the
      dereference from one to the other must be done under the rtnl lock or inside
      an RCU read-side critical section.
      
      This is needed for the following patches, since one of the goals of
      multiqueue tuntap is to allow adding or removing queues during workload.
      Without RCU, the control path would hold tx locks when adding or removing
      queues (which may cause some delay), and it is hard to change the number of
      queues without stopping the net device. With the help of RCU, there is also
      no need for tun_file to hold a refcnt on tun_struct.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tuntap: move socket to tun_file · 54f968d6
      Jason Wang authored
      Currently tuntap uses the socket receive queue as its tx queue. To
      implement multiple tx queues for tuntap and allow adding and removing
      queues during workload, the first step is to move the socket-related
      structures to tun_file. Then multiple fds/sockets can be attached to the
      tuntap device.
      
      This patch removes tun_sock and moves the socket-related structures from
      tun_sock or tun_struct to tun_file. Two exceptions are tap_filter and
      sock_fprog: they are still kept in tun_struct, since they are used to filter
      packets for the net device rather than per transmit queue (at least I see no
      requirement for per-queue filters). After those changes, the socket is
      created and destroyed during file open and close (instead of device creation
      and destruction), and the socket structures can be dereferenced from
      tun_file instead of from the tun_struct structure itself.
      
      For a persistent device, since we purge during detaching and don't queue any
      packets when no interface is attached, there is no behavior change before
      and after this patch, so the changes are transparent to userspace. To keep
      attributes such as sndbuf, socket filter and vnet header, those are
      re-initialized after a new interface is attached to a persistent device.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Jason Wang's avatar
      1e588338
  2. 31 Oct, 2012 18 commits
  3. 30 Oct, 2012 13 commits
  4. 29 Oct, 2012 1 commit