1. 18 Dec, 2015 9 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · 59ce9670
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter updates for net-next
      
      The following patchset contains the first batch of Netfilter updates for
      the upcoming 4.5 kernel. This batch contains userspace netfilter header
      compilation fixes, support for packet mangling in nf_tables, the new
      tracing infrastructure for nf_tables and cgroup2 support for iptables.
      More specifically, they are:
      
      1) Two patches to include dependencies in our netfilter userspace
         headers to resolve compilation problems, from Mikko Rapeli.
      
      2) Four cosmetic cleanup patches for the ebtables codebase, from Ian Morris.
      
      3) Remove duplicate include in the netfilter reject infrastructure,
         from Stephen Hemminger.
      
      4) Two patches to simplify the netfilter defragmentation code for IPv6,
         from Florian Westphal.
      
      5) Fix root ownership of /proc/net netfilter for unprivileged net
         namespaces, from Philip Whineray.
      
      6) Get rid of unused fields in struct nft_pktinfo, from Florian Westphal.
      
      7) Add mangling support to our nf_tables payload expression, from
         Patrick McHardy.
      
      8) Introduce a new netlink-based tracing infrastructure for nf_tables,
         from Florian Westphal.
      
      9) Change setter functions in nfnetlink_log to be void, from
         Rami Rosen.
      
      10) Add netns support to the cttimeout infrastructure.
      
      11) Add cgroup2 support to iptables, from Tejun Heo.
      
      12) Introduce nfnl_dereference_protected() in nfnetlink, from Florian.
      
      13) Add support for mangling pkttype in the nf_tables meta expression,
          also from Florian.
      
      BTW, I need you to pull net into net-next; I have another batch that
      requires changes that I don't yet see in net.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: call netif_carrier_off() during init · 4b402d71
      Jakub Kicinski authored
      Netdevs default to carrier on, so we should call netif_carrier_off()
      during initialization, since we handle carrier state changes in the
      driver.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'l3mdev-accept' · 6462de8c
      David S. Miller authored
      David Ahern says:
      
      ====================
      net: Allow accepted sockets to be bound to l3mdev domain
      
      Allow accepted sockets to derive their sk_bound_dev_if setting from the
      l3mdev domain in which the packets originated. This version adds a sysctl
      to control whether the setting is inherited, making the functionality
      similar to sk_mark and its sysctl_tcp_fwmark_accept setting.
      
      This effectively allows a process to have a "VRF-global" listen socket,
      with child sockets bound to the VRF device in which the packet originated.
      A similar behavior can be achieved using sk_mark, but a solution using marks
      is incomplete as it does not handle duplicate addresses in different L3
      domains/VRFs. Allowing sockets to inherit the sk_bound_dev_if from l3mdev
      domain provides a complete solution.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Allow accepted sockets to be bound to l3mdev domain · 6dd9a14e
      David Ahern authored
      Allow accepted sockets to derive their sk_bound_dev_if setting from the
      l3mdev domain in which the packets originated. A sysctl setting is added
      to control this behavior, similar to sk_mark and its
      sysctl_tcp_fwmark_accept setting.
      
      This effectively allows a process to have a "VRF-global" listen socket,
      with child sockets bound to the VRF device in which the packet originated.
      A similar behavior can be achieved using sk_mark, but a solution using marks
      is incomplete as it does not handle duplicate addresses in different L3
      domains/VRFs. Allowing sockets to inherit the sk_bound_dev_if from l3mdev
      domain provides a complete solution.
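
The inheritance rule described above can be modeled in a few lines. This is a minimal sketch, not the kernel code: the `Net` and `Sock` classes, the `masters` mapping, and the function names are hypothetical stand-ins, and `tcp_l3mdev_accept` is assumed here as the name of the sysctl the message mentions.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Net:
    """Hypothetical stand-in for a network namespace."""
    tcp_l3mdev_accept: bool = False          # assumed sysctl name
    # ifindex of an enslaved device -> ifindex of its l3mdev (VRF) master
    masters: Dict[int, int] = field(default_factory=dict)

    def master_ifindex_by_index(self, ifindex: int) -> int:
        """Return the l3mdev master of a device, or 0 if it has none."""
        return self.masters.get(ifindex, 0)


@dataclass
class Sock:
    bound_dev_if: int = 0                    # 0 means "not bound to a device"


def accept_child(net: Net, listener: Sock, ingress_ifindex: int) -> Sock:
    """Create the child socket for a connection that arrived on ingress_ifindex."""
    if listener.bound_dev_if == 0 and net.tcp_l3mdev_accept:
        # Unbound ("VRF-global") listener: the child inherits the L3 domain
        # in which the packet originated.
        return Sock(bound_dev_if=net.master_ifindex_by_index(ingress_ifindex))
    # Otherwise the child simply copies the listener's binding.
    return Sock(bound_dev_if=listener.bound_dev_if)
```

With the setting off, or with a listener that is already bound to a device, the child keeps the listener's binding, which matches the opt-in behaviour the message describes.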
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: l3mdev: Add master device lookup by index · 1a852479
      David Ahern authored
      Add a helper to look up the l3mdev master index given a device index.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: addrconf: use stable address generator for ARPHRD_NONE · cc9da6cc
      Bjørn Mork authored
      Add a new address generator mode, using the stable address generator
      with an automatically generated secret. This is intended as a default
      address generator mode for device types with no EUI64 implementation.
      The new generator is used for ARPHRD_NONE interfaces initially, adding
      default IPv6 autoconf support to e.g. tun interfaces.
      
      If the addrgenmode is set to 'random', either by default or manually,
      and no stable secret is available, then a random secret is used as
      input for the stable-privacy address generator.  The secret can be
      read and modified like manually configured secrets, using the proc
      interface.  Modifying the secret will change the addrgen mode to
      'stable-privacy' to indicate that it operates on a known secret.
      
      Existing behaviour of the 'stable-privacy' mode is kept unchanged. If
      a known secret is available when the device is created, then the mode
      will default to 'stable-privacy' as before.  The mode can be manually
      set to 'random' but it will behave exactly like 'stable-privacy' in
      this case. The secret will not change.
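
The mode transitions described above can be summarized as a small state model. This is a sketch under stated assumptions, not the addrconf implementation: the `AddrGen` class, its method names, and the 16-byte secret length are hypothetical choices made for illustration.

```python
import os
from typing import Optional


class AddrGen:
    """Toy model of the per-device address generator state (names are hypothetical)."""

    def __init__(self, known_secret: Optional[bytes] = None):
        if known_secret is not None:
            # A known secret exists at device creation: default as before.
            self.mode = "stable-privacy"
            self.secret = known_secret
        else:
            # No secret available: generate one automatically.
            self.mode = "random"
            self.secret = os.urandom(16)   # assumed length

    def write_secret(self, secret: bytes) -> None:
        """Model of modifying the secret through the proc interface."""
        self.secret = secret
        self.mode = "stable-privacy"       # now operating on a known secret

    def set_mode_random(self) -> None:
        """Manually select 'random'; an existing secret is left untouched."""
        self.mode = "random"
```

In this model the secret only ever changes through `write_secret()`, which mirrors the statement that switching the mode to 'random' by hand does not change the secret.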
      
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: 吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ila: add NETFILTER dependency · 8cb964da
      Arnd Bergmann authored
      The recently added generic ILA translation facility fails to
      build when CONFIG_NETFILTER is disabled:
      
      net/ipv6/ila/ila_xlat.c:229:20: warning: 'struct nf_hook_state' declared inside parameter list
      net/ipv6/ila/ila_xlat.c:235:27: error: array type has incomplete element type 'struct nf_hook_ops'
       static struct nf_hook_ops ila_nf_hook_ops[] __read_mostly = {
      
      This adds an explicit Kconfig dependency to avoid that case.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 7f00feaf ("ila: Add generic ILA translation facility")
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netfilter: meta: add support for setting skb->pkttype · b4aae759
      Florian Westphal authored
      This allows redirecting bridged packets to the local machine:
      
      ether type ip ether daddr set aa:53:08:12:34:56 meta pkttype set unicast
      Without 'set unicast', the IP stack discards PACKET_OTHERHOST skbs.
      
      It is also useful for adding support for a '-m cluster'-like nft rule
      (where the switch floods packets to several nodes, and each cluster
       node processes a subset of packets for load distribution).
      
      Mangling is restricted to HOST/OTHER/BROAD/MULTICAST, i.e. you cannot set
      skb->pkt_type to PACKET_KERNEL or change PACKET_LOOPBACK to PACKET_HOST.
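
The restriction can be illustrated with a short sketch. The numeric values follow Linux's `<linux/if_packet.h>`; the helper names `pkt_type_ok` and `set_pkt_type` are hypothetical, and the logic is a model of the rule stated above rather than the nft_meta code itself.

```python
# Packet type values as defined in Linux's <linux/if_packet.h>.
PACKET_HOST = 0
PACKET_BROADCAST = 1
PACKET_MULTICAST = 2
PACKET_OTHERHOST = 3
PACKET_LOOPBACK = 5
PACKET_KERNEL = 7


def pkt_type_ok(ptype: int) -> bool:
    """True for the four types that may be mangled (hypothetical helper)."""
    return ptype in (PACKET_HOST, PACKET_BROADCAST, PACKET_MULTICAST, PACKET_OTHERHOST)


def set_pkt_type(current: int, requested: int) -> int:
    """Return the packet type after a 'meta pkttype set' attempt."""
    if pkt_type_ok(current) and pkt_type_ok(requested):
        return requested
    return current   # restricted source or target: leave the skb unchanged
```

Checking both the current and the requested type is what rules out both cases named in the message: setting PACKET_KERNEL, and rewriting a PACKET_LOOPBACK skb.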
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · b3e0d3d7
      David S. Miller authored
      Conflicts:
      	drivers/net/geneve.c
      
      Here we had an overlapping change, where in 'net' the extraneous stats
      bump was being removed whilst in 'net-next' the final argument to
      udp_tunnel6_xmit_skb() was being changed.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 17 Dec, 2015 21 commits
  3. 16 Dec, 2015 10 commits
    • net: Pass ndm_state to route netlink FDB notifications. · b3379041
      Hubert Sokolowski authored
      Before this change, applications monitoring FDB notifications
      were not able to determine whether a new FDB entry is permanent
      or not:
      bridge fdb add f1:f2:f3:f4:f5:f8 dev sw0p1 temp self
      bridge fdb add f1:f2:f3:f4:f5:f9 dev sw0p1 self
      
      bridge monitor fdb
      
      f1:f2:f3:f4:f5:f8 dev sw0p1 self permanent
      f1:f2:f3:f4:f5:f9 dev sw0p1 self permanent
      
      With this change, ndm_state from the original netlink message
      is passed to the new netlink message sent as a notification.
      
      bridge fdb add f1:f2:f3:f4:f5:f6 dev sw0p1 self
      bridge fdb add f1:f2:f3:f4:f5:f7 dev sw0p1 temp self
      
      bridge monitor fdb
      f1:f2:f3:f4:f5:f6 dev sw0p1 self permanent
      f1:f2:f3:f4:f5:f7 dev sw0p1 self static
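
The difference between the two behaviours can be sketched as follows. The state bits are the values from Linux's `<linux/neighbour.h>`; the function names and the `STATE_NAMES` table are hypothetical, and which state bit a given `bridge` keyword sets is deliberately left out because the message does not say.

```python
# Neighbour state bits as defined in Linux's <linux/neighbour.h>.
NUD_NOARP = 0x40
NUD_PERMANENT = 0x80

# Labels a monitoring tool might print for these states (assumption).
STATE_NAMES = {NUD_NOARP: "static", NUD_PERMANENT: "permanent"}


def notify_before(request_state: int) -> int:
    """Old behaviour: the notification ignored the request and reported permanent."""
    return NUD_PERMANENT


def notify_after(request_state: int) -> int:
    """New behaviour: ndm_state from the original message is passed through."""
    return request_state
```

A listener that maps the notified state through `STATE_NAMES` sees "permanent" for every entry with `notify_before`, and the requested state with `notify_after`.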
      Signed-off-by: Hubert Sokolowski <hubert.sokolowski@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mac80211-for-davem-2015-12-15' of... · 4d4f3791
      David S. Miller authored
      Merge tag 'mac80211-for-davem-2015-12-15' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
      
      Johannes Berg says:
      
      ====================
      Another set of fixes:
       * memory leak fixes (from Ola)
       * operating mode notification spec compliance fix (from Eyal)
       * copy rfkill names in case pointer becomes invalid (myself)
       * two hardware restart fixes (myself)
       * get rid of "limiting TX power" log spam (myself)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • 82xx: FCC: Fixing a bug causing to FCC port lock-up · 79aa05a2
      Martin Roth authored
      This patch fixes an FCC port lock-up that occurs as a result of a bug
      during underrun/collision handling. Within the tx_startup() function
      in mac-fcc.c, the address of the last BD is not calculated correctly.
      As a result, the next transmitted BD may be set to an area outside
      the transmit BD ring. This causes a port lock-up that is not
      recoverable.
      Signed-off-by: Martin Roth <martin.roth@motorolasolutions.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • gianfar: Don't enable RX Filer if not supported · 7bff47da
      Hamish Martin authored
      After commit 15bf176d ("gianfar: Don't enable the Filer w/o the
      Parser"), 'TSEC' model controllers (for example as seen on MPC8541E)
      always have 8 bytes stripped from the front of received frames.
      Only 'eTSEC' gianfar controllers have the RX Filer capability (amongst
      other enhancements). Previously this was treated as always enabled
      for both 'TSEC' and 'eTSEC' controllers.
      In commit 15bf176d ("gianfar: Don't enable the Filer w/o the Parser")
      a subtle change was made to the setting of 'uses_rxfcb' to effectively
      always set it (since 'rx_filer_enable' was always true). This had the
      side-effect of always stripping 8 bytes from the front of received frames
      on 'TSEC' type controllers.
      
      We now only enable the RX Filer capability on controller types that
      support it, thereby avoiding the issue for 'TSEC' type controllers.
      Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
      Reviewed-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
      Signed-off-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
      Reviewed-by: Claudiu Manoil <claudiu.manoil@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dma-debug: Fix dma_debug_entry offset calculation · 0354aec1
      Daniel Mentz authored
      dma-debug uses struct dma_debug_entry to keep track of dma coherent
      memory allocation requests. The virtual address is converted into a pfn
      and an offset. Previously, the offset was calculated using an incorrect
      bit mask.  As a result, we saw incorrect error messages from dma-debug
      like the following:
      
      "DMA-API: exceeded 7 overlapping mappings of cacheline 0x03e00000"
      
      Cacheline 0x03e00000 does not exist on our platform.
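
The pfn/offset split mentioned above can be sketched like this. It assumes 4 KiB pages, and the function names are hypothetical. The message does not say which mask was wrong, so the second function shows one plausible failure, an inverted mask, purely as an illustration.

```python
PAGE_SHIFT = 12                     # assumption: 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT
PAGE_MASK = ~(PAGE_SIZE - 1)        # selects the page-number bits, as in the kernel macro


def split_address(addr: int) -> tuple:
    """Split an address into (pfn, offset within the page)."""
    pfn = addr >> PAGE_SHIFT
    offset = addr & ~PAGE_MASK      # low PAGE_SHIFT bits only
    return pfn, offset


def split_address_wrong_mask(addr: int) -> tuple:
    """Same split with the inverted mask: the 'offset' keeps the page bits instead."""
    return addr >> PAGE_SHIFT, addr & PAGE_MASK
```

A correct split is reversible, `(pfn << PAGE_SHIFT) + offset == addr`, whereas the wrong mask produces an "offset" larger than a page, which is the kind of value that can lead to reports about addresses that do not exist.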
      
      Cc: <stable@vger.kernel.org>
      Fixes: 0abdd7a8 ("dma-debug: introduce debug_dma_assert_idle()")
      Signed-off-by: Daniel Mentz <danielmentz@google.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm · a5e90b1b
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "Further ARM fixes:
         - Anson Huang noticed that we were corrupting a register we shouldn't
           be during suspend on some CPUs.
         - Shengjiu Wang spotted a bug in the 'swp' instruction emulation.
         - Will Deacon fixed a bug in the ASID allocator.
         - Laura Abbott fixed the kernel permission protection to apply to all
           threads running in the system.
         - I've fixed two bugs with the domain access control register
           handling, one to do with printing an appropriate value at oops
           time, and the other to further fix the uaccess_with_memcpy code"
      
      * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
        ARM: 8475/1: SWP emulation: Restore original *data when failed
        ARM: 8471/1: need to save/restore arm register(r11) when it is corrupted
        ARM: fix uaccess_with_memcpy() with SW_DOMAIN_PAN
        ARM: report proper DACR value in oops dumps
        ARM: 8464/1: Update all mm structures with section adjustments
        ARM: 8465/1: mm: keep reserved ASIDs in sync with mm after multiple rollovers
    • net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration · 7bbadd2d
      Hannes Frederic Sowa authored
      Docbook does not like the definition of macros inside a field declaration
      and adds a warning. Move the definition out.
      
      Fixes: 79462ad0 ("net: add validation for the socket syscall protocol argument")
      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: Fix walker list corruption · c6ff5268
      Herbert Xu authored
      Commit ba7c95ea ("rhashtable: Fix sleeping inside RCU critical
      section in walk_stop") introduced a new spin lock for the walker
      list.  However, it did not convert all existing users of the list
      over to the new spin lock.  Some continued to use the old mutex for
      this purpose.  This obviously led to corruption of the list.
      
      The fix is to use the spin lock everywhere we touch the list.
      
      This also allows us to do rcu_read_lock before we take the lock in
      rhashtable_walk_start.  With the old mutex this would've deadlocked
      but it's safe with the new spin lock.
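
The principle behind the fix, one lock guarding every access to the list, can be sketched in user space. This is only an analogy: `WalkerList` is a hypothetical class and `threading.Lock` stands in for the kernel spin lock.

```python
import threading


class WalkerList:
    """Toy walker list: every access goes through the same lock."""

    def __init__(self):
        self._lock = threading.Lock()   # stands in for the new spin lock
        self._walkers = []

    def add(self, walker) -> None:
        with self._lock:
            self._walkers.append(walker)

    def remove(self, walker) -> None:
        with self._lock:
            self._walkers.remove(walker)

    def snapshot(self) -> list:
        with self._lock:
            return list(self._walkers)
```

If some callers took a different lock than others, two of them could modify the list at the same time, which is the situation the message describes.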
      
      Fixes: ba7c95ea ("rhashtable: Fix sleeping inside RCU...")
      Reported-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge · 04ad3783
      David S. Miller authored
      Antonio Quartulli says:
      
      ====================
      Included changes:
      - change my email in MAINTAINERS and Doc files
      - create and export list of single hop neighs per interface
      - protect CRC in the BLA code by means of its own lock
      - minor fixes and code cleanups
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'geneve-udp-port-offload' · 897ca373
      David S. Miller authored
      Anjali Singhai Jain says:
      
      ====================
      Add support for Geneve udp port offload
      
      This patch series adds new ndo ops for Geneve add/del port, so as
      to help offload Geneve tunnel functionalities such as RX checksum,
      RSS, filters etc.
      
      i40e driver has been tested with the changes to make sure the offloads
      happen.
      
      We do understand that this is not the ideal solution and most likely
      will be redone with a more generic offload framework.
      But this certainly will enable us to start seeing benefits of the
      accelerations for Geneve tunnels.
      
      As a side note, we did find an existing issue in i40e driver where a
      service task can modify tunnel data structures with no locks held to
      help linearize access. A separate patch will be taking care of that issue.
      
      A question for the community regarding the driver Kconfig parameters
      for VxLAN and Geneve: it would be ideal to drop those if there is a way
      to resolve the vxlan/geneve_get_rx_port symbols while the tunnel modules
      are not loaded.
      
      Performance numbers:
      With the offloads enabled on X722 devices with remote checksum enabled
      and no other tuning in terms of cpu governor etc. on my test machine:
      
      With offload
      Throughput: 5527Mbits/sec with a single thread
      %cpu: ~43% per core with 4 threads
      
      Without offload
      Throughput: 2364Mbits/sec with a single thread
      %cpu: ~99% per core with 4 threads
      
      These numbers will improve for X722 as work on it continues. But
      this does bring out the delta in terms of when the stack is notified
      with csum_level 1 and CHECKSUM_UNNECESSARY vs not without the RX offload.
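
To put the delta in relative terms, the figures quoted above can be compared directly. This is plain arithmetic on the reported numbers; the `ratio` helper is a hypothetical name.

```python
def ratio(first: float, second: float) -> float:
    """How many times larger the first measurement is than the second."""
    return first / second


throughput_gain = ratio(5527, 2364)   # Mbit/s, single thread, with vs. without offload
cpu_ratio = ratio(43, 99)             # approximate %cpu per core, 4 threads
```

On these figures the offload gives roughly 2.3 times the single-thread throughput while using well under half the CPU per core.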
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>