1. 22 Dec, 2015 19 commits
  2. 20 Dec, 2015 1 commit
  3. 19 Dec, 2015 1 commit
    • netcp: fix regression in receive processing · 958d104e
      Arnd Bergmann authored
      A cleanup patch I did was unfortunately wrong and introduced
      multiple serious bugs in the netcp rx processing, as indicated
      by these correct gcc warnings:
      
      drivers/net/ethernet/ti/netcp_core.c:776:14: warning: 'buf_ptr' may be used uninitialized in this function [-Wuninitialized]
      drivers/net/ethernet/ti/netcp_core.c:687:14: warning: 'ptr' may be used uninitialized in this function [-Wuninitialized]
      
      I have checked the patch once more and found that a call to
      get_pkt_info() accidentally got removed in netcp_free_rx_desc_chain,
      and netcp_process_one_rx_packet no longer retrieved the correct
      buffer length. This patch should fix all the known problems,
      but I did not test on real hardware.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 89907779 ("netcp: try to reduce type confusion in descriptors")
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 18 Dec, 2015 17 commits
    • asix: silence log message from oversize packet · b70183db
      stephen hemminger authored
      Since it is possible for an external system to send oversize packets
      at any time, it is best for the driver not to print a message and spam
      the log (potential external DoS).
      
      Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=109471
      Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: diag: add support for request sockets to tcp_abort() · 07f6f4a3
      Eric Dumazet authored
      Adding support for SYN_RECV request sockets to tcp_abort()
      is quite easy after our tcp listener rewrite.
      
      Note that we also need to better handle listeners, or we might
      leak not yet accepted children, because of a missing
      inet_csk_listen_stop() call.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Lorenzo Colitti <lorenzo@google.com>
      Tested-by: Lorenzo Colitti <lorenzo@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bpf-misc-updates' · d73e5f41
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      Misc BPF updates
      
      This series contains a couple of misc updates to the BPF code, besides
      others a new helper bpf_skb_load_bytes(), moving clearing of A/X to the
      classic converter, etc. Please see individual patches for details.
      
      Thanks!
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf, test: add couple of test cases · 9dd2af83
      Daniel Borkmann authored
      Add a couple of test cases for the interpreter and also for the JITs,
      f.e. to test that when imm32 moves are done, the upper 32 bits of the
      registers are zero-extended.
      
      Without JIT:
      
        [...]
        [ 1114.129301] test_bpf: #43 MOV REG64 jited:0 128 PASS
        [ 1114.130626] test_bpf: #44 MOV REG32 jited:0 139 PASS
        [ 1114.132055] test_bpf: #45 LD IMM64 jited:0 124 PASS
        [...]
      
      With JIT (generated code can as usual be nicely verified with the help of
      bpf_jit_disasm tool):
      
        [...]
        [ 1062.726782] test_bpf: #43 MOV REG64 jited:1 6 PASS
        [ 1062.726890] test_bpf: #44 MOV REG32 jited:1 6 PASS
        [ 1062.726993] test_bpf: #45 LD IMM64 jited:1 6 PASS
        [...]
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
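The zero-extension property these tests check can be illustrated with a small userspace model. This is only a sketch of the intended semantics, not the kernel's test code; the function names `mov32_imm` and `mov64_imm` are hypothetical, and the model assumes that a 32-bit immediate move truncates to 32 bits while a 64-bit immediate move sign-extends the imm32.

```c
#include <stdint.h>

/* Model of a 32-bit immediate move (hypothetical helper name): the result
 * is truncated to 32 bits, so the upper half of the 64-bit register
 * ends up as zero regardless of what it held before. */
static uint64_t mov32_imm(uint64_t dst, int32_t imm)
{
    (void)dst;                      /* old register contents are overwritten */
    return (uint64_t)(uint32_t)imm;
}

/* Model of a 64-bit immediate move (hypothetical helper name): the imm32
 * is sign-extended to the full 64-bit register width. */
static uint64_t mov64_imm(uint64_t dst, int32_t imm)
{
    (void)dst;
    return (uint64_t)(int64_t)imm;
}
```

With an immediate of -1, the 32-bit model yields 0x00000000ffffffff and the 64-bit model yields 0xffffffffffffffff, which is the kind of difference a JIT could get wrong.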
    • bpf, x86: detect/optimize loading 0 immediates · 606c88a8
      Daniel Borkmann authored
      When structs or variables need to be initialized/'memset' to 0 in an
      eBPF C program, the x86 BPF JIT converts this to use immediates. We can
      however save a couple of bytes (f.e. even up to 7 bytes on a single emission
      of BPF_LD | BPF_IMM | BPF_DW) in the image by detecting such a case and
      using xor on the dst register instead.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
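A rough sketch of where the 7 bytes come from, assuming the standard x86-64 encodings: a 64-bit immediate load (`movabs`) takes 10 bytes, while a 64-bit `xor reg, reg` takes 3. The emitter functions below are hypothetical illustrations, not the JIT's actual code, and they take raw x86-64 register numbers (0 = rax, 8 = r8, and so on) rather than BPF registers.

```c
#include <stddef.h>
#include <stdint.h>

/* Emit 'movabs reg, imm64': REX.W prefix, opcode 0xB8+reg, 8-byte immediate. */
static size_t emit_mov_imm64(uint8_t *out, unsigned reg, uint64_t imm)
{
    out[0] = 0x48 | (reg >= 8 ? 0x01 : 0x00);   /* REX.W, plus REX.B for r8..r15 */
    out[1] = 0xB8 | (reg & 7);
    for (int i = 0; i < 8; i++)
        out[2 + i] = (uint8_t)(imm >> (8 * i)); /* little-endian immediate */
    return 10;
}

/* Emit 64-bit 'xor reg, reg': REX.W prefix, opcode 0x31, ModRM byte. */
static size_t emit_xor_self(uint8_t *out, unsigned reg)
{
    uint8_t rex = 0x48;                         /* REX.W */

    if (reg >= 8)
        rex |= 0x05;                            /* REX.R and REX.B for r8..r15 */
    out[0] = rex;
    out[1] = 0x31;
    out[2] = 0xC0 | ((reg & 7) << 3) | (reg & 7);
    return 3;
}

/* Pick the shorter encoding when the immediate is zero. */
static size_t emit_load_imm64(uint8_t *out, unsigned reg, uint64_t imm)
{
    return imm == 0 ? emit_xor_self(out, reg) : emit_mov_imm64(out, reg, imm);
}
```

Note that `xor` also modifies the flags register, which is harmless here only if no later instruction depends on flags set before the load; that is an assumption of this sketch.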
    • bpf: fix misleading comment in bpf_convert_filter · 23bf8807
      Daniel Borkmann authored
      Comment says "User BPF's register A is mapped to our BPF register 6",
      which is actually wrong as the mapping is on register 0. This can
      already be inferred from the code itself. So just remove it before
      someone makes assumptions based on that. Only code tells truth. ;)
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: move clearing of A/X into classic to eBPF migration prologue · 8b614aeb
      Daniel Borkmann authored
      Back in the days where eBPF (or back then "internal BPF" ;->) was not
      exposed to user space, and only the classic BPF programs internally
      translated into eBPF programs, we missed the fact that for classic BPF
      A and X needed to be cleared. It was fixed back then via 83d5b7ef
      ("net: filter: initialize A and X registers"), and thus classic BPF
      specifics were added to the eBPF interpreter core to work around it.
      
      This added some confusion for JIT developers later on that take the
      eBPF interpreter code as an example for deriving their JIT. F.e. in
      f75298f5 ("s390/bpf: clear correct BPF accumulator register"), at
      least X could leak stack memory. Furthermore, since this is only needed
      for classic BPF translations and not for eBPF (verifier takes care
      that read access to regs cannot be done uninitialized), more complexity
      is added to JITs as they need to determine whether they deal with
      migrations or native eBPF where they can just omit clearing A/X in
      their prologue and thus reduce image size a bit, see f.e. cde66c2d
      ("s390/bpf: Only clear A and X for converted BPF programs"). In other
      cases (x86, arm64), A and X are being cleared in the prologue also for
      the eBPF case, which is unnecessary.
      
      Let's move this into the BPF migration in bpf_convert_filter() where it
      actually belongs as long as the number of eBPF JITs is still small. It
      can thus be done generically; allowing us to remove the quirk from
      __bpf_prog_run() and to slightly reduce JIT image size in case of eBPF,
      while reducing code duplication on this matter in current(/future) eBPF
      JITs.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Cc: Zi Shen Lim <zlim.lnx@gmail.com>
      Cc: Yang Shi <yang.shi@linaro.org>
      Acked-by: Yang Shi <yang.shi@linaro.org>
      Acked-by: Zi Shen Lim <zlim.lnx@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: add bpf_skb_load_bytes helper · 05c74e5e
      Daniel Borkmann authored
      When hacking tc programs with eBPF, one of the issues that come up
      from time to time is to load addresses from headers. In eBPF as in
      classic BPF, we have BPF_LD | BPF_ABS | BPF_{B,H,W} instructions that
      extract a byte, half-word or word out of the skb data through helpers
      such as bpf_load_pointer() (interpreter case).
      
      F.e. extracting a whole IPv6 address could possibly look like ...
      
        union v6addr {
          struct {
            __u32 p1;
            __u32 p2;
            __u32 p3;
            __u32 p4;
          };
          __u8 addr[16];
        };
      
        [...]
      
        a.p1 = htonl(load_word(skb, off));
        a.p2 = htonl(load_word(skb, off +  4));
        a.p3 = htonl(load_word(skb, off +  8));
        a.p4 = htonl(load_word(skb, off + 12));
      
        [...]
      
        /* access to a.addr[...] */
      
      This work adds a complementary helper bpf_skb_load_bytes() (we also
      have bpf_skb_store_bytes()) as an alternative; from an eBPF program,
      the same operation would look like:
      
        ret = bpf_skb_load_bytes(skb, off, addr, sizeof(addr));
      
      Same verifier restrictions apply as in ffeedafb ("bpf: introduce
      current->pid, tgid, uid, gid, comm accessors") case, where stack memory
      access needs to be statically verified and thus guaranteed to be
      initialized in first use (otherwise verifier cannot tell whether a
      subsequent access to it is valid or not as it's runtime dependent).
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
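The idea behind a bulk load helper can be modelled in plain userspace C. The sketch below is not the kernel implementation: `struct pkt` and `model_skb_load_bytes()` are hypothetical stand-ins, and returning -EFAULT on an out-of-range request is an assumption made for the model. It shows a single bounds-checked copy of `len` bytes from an offset into a caller-provided buffer, instead of several word-sized loads.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the packet data a program may read. */
struct pkt {
    const uint8_t *data;
    size_t len;
};

/* Model of a bulk load: copy 'len' bytes starting at 'off' into 'to'.
 * Returns 0 on success, or -EFAULT (assumed error code) when the
 * requested range does not lie fully inside the packet. */
static int model_skb_load_bytes(const struct pkt *p, size_t off,
                                void *to, size_t len)
{
    if (off > p->len || len > p->len - off)
        return -EFAULT;
    memcpy(to, p->data + off, len);
    return 0;
}
```

The bytes arrive in packet (network) order, so in this model no per-word byte swapping is needed before accessing the address as a byte array.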
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · 59ce9670
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter updates for net-next
      
      The following patchset contains the first batch of Netfilter updates for
      the upcoming 4.5 kernel. This batch contains userspace netfilter header
      compilation fixes, support for packet mangling in nf_tables, the new
      tracing infrastructure for nf_tables and cgroup2 support for iptables.
      More specifically, they are:
      
      1) Two patches to include dependencies in our netfilter userspace
         headers to resolve compilation problems, from Mikko Rapeli.
      
      2) Four cosmetic cleanup patches for the ebtables codebase, from Ian Morris.
      
      3) Remove duplicate include in the netfilter reject infrastructure,
         from Stephen Hemminger.
      
      4) Two patches to simplify the netfilter defragmentation code for IPv6,
         from Florian Westphal.
      
      5) Fix root ownership of /proc/net netfilter for unprivileged net
         namespaces, from Philip Whineray.
      
      6) Get rid of unused fields in struct nft_pktinfo, from Florian Westphal.
      
      7) Add mangling support to our nf_tables payload expression, from
         Patrick McHardy.
      
      8) Introduce a new netlink-based tracing infrastructure for nf_tables,
         from Florian Westphal.
      
      9) Change setter functions in nfnetlink_log to be void, from
          Rami Rosen.
      
      10) Add netns support to the cttimeout infrastructure.
      
      11) Add cgroup2 support to iptables, from Tejun Heo.
      
      12) Introduce nfnl_dereference_protected() in nfnetlink, from Florian.
      
      13) Add support for mangling pkttype in the nf_tables meta expression,
          also from Florian.
      
      BTW, I need that you pull net into net-next, I have another batch that
      requires changes that I don't yet see in net.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: call netif_carrier_off() during init · 4b402d71
      Jakub Kicinski authored
      Netdevs default to carrier on, so we should call netif_carrier_off()
      during initialization since we handle carrier state changes in the
      driver.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'l3mdev-accept' · 6462de8c
      David S. Miller authored
      David Ahern says:
      
      ====================
      net: Allow accepted sockets to be bound to l3mdev domain
      
      Allow accepted sockets to derive their sk_bound_dev_if setting from the
      l3mdev domain in which the packets originated. This version adds a sysctl
      to control whether the setting is inherited, making the functionality
      similar to sk_mark and its sysctl_tcp_fwmark_accept setting.
      
      This effectively allows a process to have a "VRF-global" listen socket,
      with child sockets bound to the VRF device in which the packet originated.
      A similar behavior can be achieved using sk_mark, but a solution using marks
      is incomplete as it does not handle duplicate addresses in different L3
      domains/VRFs. Allowing sockets to inherit the sk_bound_dev_if from l3mdev
      domain provides a complete solution.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Allow accepted sockets to be bound to l3mdev domain · 6dd9a14e
      David Ahern authored
      Allow accepted sockets to derive their sk_bound_dev_if setting from the
      l3mdev domain in which the packets originated. A sysctl setting is added
      to control the behavior which is similar to sk_mark and
      sysctl_tcp_fwmark_accept.
      
      This effectively allows a process to have a "VRF-global" listen socket,
      with child sockets bound to the VRF device in which the packet originated.
      A similar behavior can be achieved using sk_mark, but a solution using marks
      is incomplete as it does not handle duplicate addresses in different L3
      domains/VRFs. Allowing sockets to inherit the sk_bound_dev_if from l3mdev
      domain provides a complete solution.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
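The inheritance rule described in this commit can be sketched as a small decision function. Everything below is a hypothetical userspace model, not the kernel code: the device table, the function names, and the rule that an already-bound listener keeps its own binding are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical table entry: a device index and the index of its
 * l3mdev master device (0 when the device has no master). */
struct dev_entry {
    int ifindex;
    int master_ifindex;
};

/* Look up the master index for a given device index; 0 if unknown. */
static int master_ifindex_by_index(const struct dev_entry *tbl, size_t n,
                                   int ifindex)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].ifindex == ifindex)
            return tbl[i].master_ifindex;
    return 0;
}

/* Decide the bound device for an accepted (child) socket.
 * Assumption: an unbound listener (0) with the setting enabled inherits
 * the l3mdev master of the ingress device; otherwise the child simply
 * copies the listener's binding. */
static int accepted_bound_dev_if(int listener_bound_dev_if, int accept_enabled,
                                 const struct dev_entry *tbl, size_t n,
                                 int ingress_ifindex)
{
    if (listener_bound_dev_if == 0 && accept_enabled)
        return master_ifindex_by_index(tbl, n, ingress_ifindex);
    return listener_bound_dev_if;
}
```

In this model, a single unbound listener can serve several VRFs, and each accepted socket ends up tied to the VRF device its packets arrived through.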
    • net: l3mdev: Add master device lookup by index · 1a852479
      David Ahern authored
      Add a helper to look up the l3mdev master index given a device index.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: addrconf: use stable address generator for ARPHRD_NONE · cc9da6cc
      Bjørn Mork authored
      Add a new address generator mode, using the stable address generator
      with an automatically generated secret. This is intended as a default
      address generator mode for device types with no EUI64 implementation.
      The new generator is used for ARPHRD_NONE interfaces initially, adding
      default IPv6 autoconf support to e.g. tun interfaces.
      
      If the addrgenmode is set to 'random', either by default or manually,
      and no stable secret is available, then a random secret is used as
      input for the stable-privacy address generator.  The secret can be
      read and modified like manually configured secrets, using the proc
      interface.  Modifying the secret will change the addrgen mode to
      'stable-privacy' to indicate that it operates on a known secret.
      
      Existing behaviour of the 'stable-privacy' mode is kept unchanged. If
      a known secret is available when the device is created, then the mode
      will default to 'stable-privacy' as before.  The mode can be manually
      set to 'random' but it will behave exactly like 'stable-privacy' in
      this case. The secret will not change.
      
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: 吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ila: add NETFILTER dependency · 8cb964da
      Arnd Bergmann authored
      The recently added generic ILA translation facility fails to
      build when CONFIG_NETFILTER is disabled:
      
      net/ipv6/ila/ila_xlat.c:229:20: warning: 'struct nf_hook_state' declared inside parameter list
      net/ipv6/ila/ila_xlat.c:235:27: error: array type has incomplete element type 'struct nf_hook_ops'
       static struct nf_hook_ops ila_nf_hook_ops[] __read_mostly = {
      
      This adds an explicit Kconfig dependency to avoid that case.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 7f00feaf ("ila: Add generic ILA translation facility")
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netfilter: meta: add support for setting skb->pkttype · b4aae759
      Florian Westphal authored
      This allows redirecting bridged packets to the local machine:

      ether type ip ether daddr set aa:53:08:12:34:56 meta pkttype set unicast

      Without 'set unicast', the IP stack discards PACKET_OTHERHOST skbs.
      
      It is also useful to add support for a '-m cluster like' nft rule
      (where the switch floods packets to several nodes, and each cluster
       node processes a subset of packets for load distribution).
      
      Mangling is restricted to HOST/OTHER/BROAD/MULTICAST, i.e. you cannot set
      skb->pkt_type to PACKET_KERNEL or change PACKET_LOOPBACK to PACKET_HOST.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · b3e0d3d7
      David S. Miller authored
      Conflicts:
      	drivers/net/geneve.c
      
      Here we had an overlapping change, where in 'net' the extraneous stats
      bump was being removed whilst in 'net-next' the final argument to
      udp_tunnel6_xmit_skb() was being changed.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 17 Dec, 2015 2 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 73796d8b
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix uninitialized variable warnings in nfnetlink_queue, a lot of
          people reported this...  From Arnd Bergmann.
      
       2) Don't init mutex twice in i40e driver, from Jesse Brandeburg.
      
       3) Fix spurious EBUSY in rhashtable, from Herbert Xu.
      
       4) Missing DMA unmaps in mvpp2 driver, from Marcin Wojtas.
      
       5) Fix race with work structure access in pppoe driver causing
          corruptions, from Guillaume Nault.
      
       6) Fix OOPS due to sh_eth_rx() not checking whether netdev_alloc_skb()
          actually succeeded or not, from Sergei Shtylyov.
      
       7) Don't lose flags when setting IFA_F_OPTIMISTIC in ipv6 code, from
          Bjørn Mork.
      
       8) VXLAN_HD_RCO defined incorrectly, fix from Jiri Benc.
      
       9) Fix clock source used for cookies in SCTP, from Marcelo Ricardo
          Leitner.
      
      10) aurora driver needs HAS_DMA dependency, from Geert Uytterhoeven.
      
      11) ndo_fill_metadata_dst op of vxlan has to handle ipv6 tunneling
          properly as well, from Jiri Benc.
      
      12) Handle request sockets properly in xfrm layer, from Eric Dumazet.
      
      13) Double stats update in ipv6 geneve transmit path, fix from Pravin B
          Shelar.
      
      14) sk->sk_policy[] needs RCU protection, and as a result
          xfrm_policy_destroy() needs to free policies using an RCU grace
          period, from Eric Dumazet.
      
      15) SCTP needs to clone ipv6 tx options in order to avoid use after
          free, from Eric Dumazet.
      
      16) Missing kbuild export of ila.h, from Stephen Hemminger.
      
      17) Missing mdiobus_alloc() return value checking in mdio-mux.c, from
          Tobias Klauser.
      
      18) Validate protocol value range in ->create() methods, from Hannes
          Frederic Sowa.
      
      19) Fix early socket demux races that result in illegal dst reuse, from
          Eric Dumazet.
      
      20) Validate socket address length in pptp code, from WANG Cong.
      
      21) skb_reorder_vlan_header() uses incorrect offset and can corrupt
          packets, from Vlad Yasevich.
      
      22) Fix memory leaks in nl80211 registry code, from Ola Olsson.
      
      23) Timeout loop count handling fixes in mISDN, xgbe, qlge, sfc, and
          qlcnic.  From Dan Carpenter.
      
      24) msg.msg_iocb needs to be cleared in recvfrom() otherwise, for
          example, AF_ALG will interpret it as an async call.  From Tadeusz
          Struk.
      
      25) inetpeer_set_addr_v4 forgets to initialize the 'vif' field, from
          Eric Dumazet.
      
      26) rhashtable enforces the minimum table size not early enough,
          breaking how we calculate the per-cpu lock allocations.  From
          Herbert Xu.
      
      27) Fix FCC port lockup in 82xx driver, from Martin Roth.
      
      28) FOU sockets need to be freed using RCU, from Hannes Frederic Sowa.
      
      29) Fix out-of-bounds access in __skb_complete_tx_timestamp() and
          sock_setsockopt() wrt.  timestamp handling.  From WANG Cong.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (117 commits)
        net: check both type and procotol for tcp sockets
        drivers: net: xgene: fix Tx flow control
        tcp: restore fastopen with no data in SYN packet
        af_unix: Revert 'lock_interruptible' in stream receive code
        fou: clean up socket with kfree_rcu
        82xx: FCC: Fixing a bug causing to FCC port lock-up
        gianfar: Don't enable RX Filer if not supported
        net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration
        rhashtable: Fix walker list corruption
        rhashtable: Enforce minimum size on initial hash table
        inet: tcp: fix inetpeer_set_addr_v4()
        ipv6: automatically enable stable privacy mode if stable_secret set
        net: fix uninitialized variable issue
        bluetooth: Validate socket address length in sco_sock_bind().
        net_sched: make qdisc_tree_decrease_qlen() work for non mq
        ser_gigaset: remove unnecessary kfree() calls from release method
        ser_gigaset: fix deallocation of platform device structure
        ser_gigaset: turn nonsense checks into WARN_ON
        ser_gigaset: fix up NULL checks
        qlcnic: fix a timeout loop
        ...
    • team: Advertise tunneling offload features · 3268e5cb
      Eran Ben Elisha authored
      When the underlying device supports offloading encapsulated traffic,
      we need to reflect that through the hw_enc_features field of the
      team net-device.
      
      This will cause the xmit path in the core networking stack to provide
      team with encapsulated GSO frames to offload into the HW etc.
      
      Using this over Mellanox ConnectX3-pro (mlx4 driver) card that supports
      VXLAN offloads we got 36.0 Gbits/sec using eight iperf streams.
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>