1. 25 Apr, 2017 9 commits
  2. 24 Apr, 2017 12 commits
    • David S. Miller's avatar
      Merge branch 'dsa-b53-58xx-fixes' · 38a98bce
      David S. Miller authored
      Florian Fainelli says:
      
      ====================
      net: dsa: b53: BCM58xx devices fixes
      
      This patch series contains fixes for the 58xx devices (Broadcom Northstar
      Plus), which were identified thanks to the help of Eric Anholt.
      ====================
      Tested-by: default avatarEric Anholt <eric@anholt.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      38a98bce
    • Florian Fainelli's avatar
      net: dsa: b53: Fix CPU port for 58xx devices · bfcda65c
      Florian Fainelli authored
      The 58xx devices (Northstar Plus) do actually have their CPU port wired
      at port 8, it was unfortunately set to port 5 (B53_CPU_PORT_25) which is
      incorrect, since that is the second possible management port.
      
      Fixes: 991a36bb ("net: dsa: b53: Add support for BCM585xx/586xx/88312 integrated switch")
      Reported-by: default avatarEric Anholt <eric@anholt.net>
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bfcda65c
    • Florian Fainelli's avatar
      net: dsa: b53: Implement software reset for 58xx devices · 3fb22b05
      Florian Fainelli authored
      Implement the correct software reset sequence for 58xx devices by
      setting all 3 reset bits and polling for the SW_RST bit to clear itself
      without a given timeout. We cannot use is58xx() here because that would
      also include the 7445/7278 Starfighter 2 which have their own driver
      doing the reset earlier on due to the HW specific integration.
      
      Fixes: 991a36bb ("net: dsa: b53: Add support for BCM585xx/586xx/88312 integrated switch")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3fb22b05
    • Florian Fainelli's avatar
      net: dsa: b53: Include IMP/CPU port in dumb forwarding mode · a424f0de
      Florian Fainelli authored
      Since Broadcom tags are not enabled in b53 (DSA_PROTO_TAG_NONE), we need
      to make sure that the IMP/CPU port is included in the forwarding
      decision.
      
      Without this change, switching between non-management ports would work,
      but not between management ports and non-management ports thus breaking
      the default state in which DSA switch are brought up.
      
      Fixes: 967dd82f ("net: dsa: b53: Add support for Broadcom RoboSwitch")
      Reported-by: default avatarEric Anholt <eric@anholt.net>
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a424f0de
    • David S. Miller's avatar
      Merge tag 'mlx5-fixes-2017-04-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 38baf3a6
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5 fixes 2017-04-22
      
      This series contains some mlx5 fixes for net.
      
      For your convenience, the series doesn't introduce any conflict with
      the ongoing net-next pull request.
      
      Please pull and let me know if there's any problem.
      
      For -stable:
      ("net/mlx5: E-Switch, Correctly deal with inline mode on ConnectX-5") kernels >= 4.10
      ("net/mlx5e: Fix ETHTOOL_GRXCLSRLALL handling") kernels >= 4.8
      ("net/mlx5e: Fix small packet threshold")       kernels >= 4.7
      ("net/mlx5: Fix driver load bad flow when having fw initializing timeout") kernels >= 4.4
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      38baf3a6
    • David Ahern's avatar
      net: ipv6: send unsolicited NA if enabled for all interfaces · fc1f8f4f
      David Ahern authored
      When arp_notify is set to 1 for either a specific interface or for 'all'
      interfaces, gratuitous arp requests are sent. Since ndisc_notify is the
      ipv6 equivalent to arp_notify, it should follow the same semantics.
      Commit 4a6e3c5d ("net: ipv6: send unsolicited NA on admin up") sends
      the NA on admin up. The final piece is checking devconf_all->ndisc_notify
      in addition to the per device setting. Add it.
      
      Fixes: 5cb04436 ("ipv6: add knob to send unsolicited ND on link-layer address change")
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fc1f8f4f
    • Dan Carpenter's avatar
      ravb: Double free on error in ravb_start_xmit() · 9199cb76
      Dan Carpenter authored
      If skb_put_padto() fails then it frees the skb.  I shifted that code
      up a bit to make my error handling a little simpler.
      
      Fixes: a0d2f206 ("Renesas Ethernet AVB PTP clock driver")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9199cb76
    • Ansis Atteka's avatar
      udp: disable inner UDP checksum offloads in IPsec case · b40c5f4f
      Ansis Atteka authored
      Otherwise, UDP checksum offloads could corrupt ESP packets by attempting
      to calculate UDP checksum when this inner UDP packet is already protected
      by IPsec.
      
      One way to reproduce this bug is to have a VM with virtio_net driver (UFO
      set to ON in the guest VM); and then encapsulate all guest's Ethernet
      frames in Geneve; and then further encrypt Geneve with IPsec.  In this
      case following symptoms are observed:
      1. If using ixgbe NIC, then it will complain with following error message:
         ixgbe 0000:01:00.1: partial checksum but l4 proto=32!
      2. Receiving IPsec stack will drop all the corrupted ESP packets and
         increase XfrmInStateProtoError counter in /proc/net/xfrm_stat.
      3. iperf UDP test from the VM with packet sizes above MTU will not work at
         all.
      4. iperf TCP test from the VM will get ridiculously low performance because.
      Signed-off-by: default avatarAnsis Atteka <aatteka@ovn.org>
      Co-authored-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b40c5f4f
    • Jason A. Donenfeld's avatar
      macsec: avoid heap overflow in skb_to_sgvec · 4d6fa57b
      Jason A. Donenfeld authored
      While this may appear as a humdrum one line change, it's actually quite
      important. An sk_buff stores data in three places:
      
      1. A linear chunk of allocated memory in skb->data. This is the easiest
         one to work with, but it precludes using scatterdata since the memory
         must be linear.
      2. The array skb_shinfo(skb)->frags, which is of maximum length
         MAX_SKB_FRAGS. This is nice for scattergather, since these fragments
         can point to different pages.
      3. skb_shinfo(skb)->frag_list, which is a pointer to another sk_buff,
         which in turn can have data in either (1) or (2).
      
      The first two are rather easy to deal with, since they're of a fixed
      maximum length, while the third one is not, since there can be
      potentially limitless chains of fragments. Fortunately dealing with
      frag_list is opt-in for drivers, so drivers don't actually have to deal
      with this mess. For whatever reason, macsec decided it wanted pain, and
      so it explicitly specified NETIF_F_FRAGLIST.
      
      Because dealing with (1), (2), and (3) is insane, most users of sk_buff
      doing any sort of crypto or paging operation calls a convenient function
      called skb_to_sgvec (which happens to be recursive if (3) is in use!).
      This takes a sk_buff as input, and writes into its output pointer an
      array of scattergather list items. Sometimes people like to declare a
      fixed size scattergather list on the stack; othertimes people like to
      allocate a fixed size scattergather list on the heap. However, if you're
      doing it in a fixed-size fashion, you really shouldn't be using
      NETIF_F_FRAGLIST too (unless you're also ensuring the sk_buff and its
      frag_list children arent't shared and then you check the number of
      fragments in total required.)
      
      Macsec specifically does this:
      
              size += sizeof(struct scatterlist) * (MAX_SKB_FRAGS + 1);
              tmp = kmalloc(size, GFP_ATOMIC);
              *sg = (struct scatterlist *)(tmp + sg_offset);
      	...
              sg_init_table(sg, MAX_SKB_FRAGS + 1);
              skb_to_sgvec(skb, sg, 0, skb->len);
      
      Specifying MAX_SKB_FRAGS + 1 is the right answer usually, but not if you're
      using NETIF_F_FRAGLIST, in which case the call to skb_to_sgvec will
      overflow the heap, and disaster ensues.
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Cc: stable@vger.kernel.org
      Cc: security@kernel.org
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4d6fa57b
    • Robert Shearman's avatar
      ipv4: Avoid caching l3mdev dst on mismatched local route · b7c8487c
      Robert Shearman authored
      David reported that doing the following:
      
          ip li add red type vrf table 10
          ip link set dev eth1 vrf red
          ip addr add 127.0.0.1/8 dev red
          ip link set dev eth1 up
          ip li set red up
          ping -c1 -w1 -I red 127.0.0.1
          ip li del red
      
      when either policy routing IP rules are present or the local table
      lookup ip rule is before the l3mdev lookup results in a hang with
      these messages:
      
          unregister_netdevice: waiting for red to become free. Usage count = 1
      
      The problem is caused by caching the dst used for sending the packet
      out of the specified interface on a local route with a different
      nexthop interface. Thus the dst could stay around until the route in
      the table the lookup was done is deleted which may be never.
      
      Address the problem by not forcing output device to be the l3mdev in
      the flow's output interface if the lookup didn't use the l3mdev. This
      then results in the dst using the right device according to the route.
      
      Changes in v2:
       - make the dev_out passed in by __ip_route_output_key_hash correct
         instead of checking the nh dev if FLOWI_FLAG_SKIP_NH_OIF is set as
         suggested by David.
      
      Fixes: 5f02ce24 ("net: l3mdev: Allow the l3mdev to be a loopback")
      Reported-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Suggested-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: default avatarRobert Shearman <rshearma@brocade.com>
      Acked-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Tested-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b7c8487c
    • Dan Carpenter's avatar
      net: tc35815: move free after the dereference · 11faa7b0
      Dan Carpenter authored
      We dereference "skb" to get "skb->len" so we should probably do that
      step before freeing the skb.
      
      Fixes: eea221ce ("tc35815 driver update (take 2)")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      11faa7b0
    • Martin KaFai Lau's avatar
      net/mlx5e: Fix race in mlx5e_sw_stats and mlx5e_vport_stats · 1510d728
      Martin KaFai Lau authored
      We have observed a sudden spike in rx/tx_packets and rx/tx_bytes
      reported under /proc/net/dev.  There is a race in mlx5e_update_stats()
      and some of the get-stats functions (the one that we hit is the
      mlx5e_get_stats() which is called by ndo_get_stats64()).
      
      In particular, the very first thing mlx5e_update_sw_counters()
      does is 'memset(s, 0, sizeof(*s))'.  For example, if mlx5e_get_stats()
      is unlucky at one point, rx_bytes and rx_packets could be 0.  One second
      later, a normal (and much bigger than 0) value will be reported.
      
      This patch is to use a 'struct mlx5e_sw_stats temp' to avoid
      a direct memset zero on priv->stats.sw.
      
      mlx5e_update_vport_counters() has a similar race.  Hence, addressed
      together.  However, memset zero is removed instead because
      it is not needed.
      
      I am lucky enough to catch this 0-reset in rx multicast:
      eth0: 41457665   76804   70    0    0    70          0     47085 15586634   87502    3    0    0     0       3          0
      eth0: 41459860   76815   70    0    0    70          0     47094 15588376   87516    3    0    0     0       3          0
      eth0: 41460577   76822   70    0    0    70          0         0 15589083   87521    3    0    0     0       3          0
      eth0: 41463293   76838   70    0    0    70          0     47108 15595872   87538    3    0    0     0       3          0
      eth0: 41463379   76839   70    0    0    70          0     47116 15596138   87539    3    0    0     0       3          0
      
      v2: Remove memset zero from mlx5e_update_vport_counters()
      v1: Use temp and memcpy
      
      Fixes: 9218b44d ("net/mlx5e: Statistics handling refactoring")
      Suggested-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Suggested-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Acked-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1510d728
  3. 22 Apr, 2017 7 commits
  4. 21 Apr, 2017 12 commits
    • Linus Torvalds's avatar
      Merge tag 'nfsd-4.11-2' of git://linux-nfs.org/~bfields/linux · 94836ecf
      Linus Torvalds authored
      Pull nfsd bugfix from Bruce Fields:
       "Fix a 4.11 regression that triggers a BUG() on an attempt to use an
        unsupported NFSv4 compound op"
      
      * tag 'nfsd-4.11-2' of git://linux-nfs.org/~bfields/linux:
        nfsd: fix oops on unsupported operation
      94836ecf
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 057a650b
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Don't race in IPSEC dumps, from Yuejie Shi.
      
       2) Verify lengths properly in IPSEC reqeusts, from Herbert Xu.
      
       3) Fix out of bounds access in ipv6 segment routing code, from David
          Lebrun.
      
       4) Don't write into the header of cloned SKBs in smsc95xx driver, from
          James Hughes.
      
       5) Several other drivers have this bug too, fix them. From Eric
          Dumazet.
      
       6) Fix access to uninitialized data in TC action cookie code, from
          Wolfgang Bumiller.
      
       7) Fix double free in IPV6 segment routing, again from David Lebrun.
      
       8) Don't let userspace set the RTF_PCPU flag, oops. From David Ahern.
      
       9) Fix use after free in qrtr code, from Dan Carpenter.
      
      10) Don't double-destroy devices in ip6mr code, from Nikolay
          Aleksandrov.
      
      11) Don't pass out-of-range TX queue indices into drivers, from Tushar
          Dave.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits)
        netpoll: Check for skb->queue_mapping
        ip6mr: fix notification device destruction
        bpf, doc: update bpf maintainers entry
        net: qrtr: potential use after free in qrtr_sendmsg()
        bpf: Fix values type used in test_maps
        net: ipv6: RTF_PCPU should not be settable from userspace
        gso: Validate assumption of frag_list segementation
        kaweth: use skb_cow_head() to deal with cloned skbs
        ch9200: use skb_cow_head() to deal with cloned skbs
        lan78xx: use skb_cow_head() to deal with cloned skbs
        sr9700: use skb_cow_head() to deal with cloned skbs
        cx82310_eth: use skb_cow_head() to deal with cloned skbs
        smsc75xx: use skb_cow_head() to deal with cloned skbs
        ipv6: sr: fix double free of skb after handling invalid SRH
        MAINTAINERS: Add "B:" field for networking.
        net sched actions: allocate act cookie early
        qed: Fix issue in populating the PFC config paramters.
        qed: Fix possible system hang in the dcbnl-getdcbx() path.
        qed: Fix sending an invalid PFC error mask to MFW.
        qed: Fix possible error in populating max_tc field.
        ...
      057a650b
    • Tushar Dave's avatar
      netpoll: Check for skb->queue_mapping · c70b17b7
      Tushar Dave authored
      Reducing real_num_tx_queues needs to be in sync with skb queue_mapping
      otherwise skbs with queue_mapping greater than real_num_tx_queues
      can be sent to the underlying driver and can result in kernel panic.
      
      One such event is running netconsole and enabling VF on the same
      device. Or running netconsole and changing number of tx queues via
      ethtool on same device.
      
      e.g.
      Unable to handle kernel NULL pointer dereference
      tsk->{mm,active_mm}->context = 0000000000001525
      tsk->{mm,active_mm}->pgd = fff800130ff9a000
                    \|/ ____ \|/
                    "@'/ .. \`@"
                    /_| \__/ |_\
                       \__U_/
      kworker/48:1(475): Oops [#1]
      CPU: 48 PID: 475 Comm: kworker/48:1 Tainted: G           OE
      4.11.0-rc3-davem-net+ #7
      Workqueue: events queue_process
      task: fff80013113299c0 task.stack: fff800131132c000
      TSTATE: 0000004480e01600 TPC: 00000000103f9e3c TNPC: 00000000103f9e40 Y:
      00000000    Tainted: G           OE
      TPC: <ixgbe_xmit_frame_ring+0x7c/0x6c0 [ixgbe]>
      g0: 0000000000000000 g1: 0000000000003fff g2: 0000000000000000 g3:
      0000000000000001
      g4: fff80013113299c0 g5: fff8001fa6808000 g6: fff800131132c000 g7:
      00000000000000c0
      o0: fff8001fa760c460 o1: fff8001311329a50 o2: fff8001fa7607504 o3:
      0000000000000003
      o4: fff8001f96e63a40 o5: fff8001311d77ec0 sp: fff800131132f0e1 ret_pc:
      000000000049ed94
      RPC: <set_next_entity+0x34/0xb80>
      l0: 0000000000000000 l1: 0000000000000800 l2: 0000000000000000 l3:
      0000000000000000
      l4: 000b2aa30e34b10d l5: 0000000000000000 l6: 0000000000000000 l7:
      fff8001fa7605028
      i0: fff80013111a8a00 i1: fff80013155a0780 i2: 0000000000000000 i3:
      0000000000000000
      i4: 0000000000000000 i5: 0000000000100000 i6: fff800131132f1a1 i7:
      00000000103fa4b0
      I7: <ixgbe_xmit_frame+0x30/0xa0 [ixgbe]>
      Call Trace:
       [00000000103fa4b0] ixgbe_xmit_frame+0x30/0xa0 [ixgbe]
       [0000000000998c74] netpoll_start_xmit+0xf4/0x200
       [0000000000998e10] queue_process+0x90/0x160
       [0000000000485fa8] process_one_work+0x188/0x480
       [0000000000486410] worker_thread+0x170/0x4c0
       [000000000048c6b8] kthread+0xd8/0x120
       [0000000000406064] ret_from_fork+0x1c/0x2c
       [0000000000000000]           (null)
      Disabling lock debugging due to kernel taint
      Caller[00000000103fa4b0]: ixgbe_xmit_frame+0x30/0xa0 [ixgbe]
      Caller[0000000000998c74]: netpoll_start_xmit+0xf4/0x200
      Caller[0000000000998e10]: queue_process+0x90/0x160
      Caller[0000000000485fa8]: process_one_work+0x188/0x480
      Caller[0000000000486410]: worker_thread+0x170/0x4c0
      Caller[000000000048c6b8]: kthread+0xd8/0x120
      Caller[0000000000406064]: ret_from_fork+0x1c/0x2c
      Caller[0000000000000000]:           (null)
      Signed-off-by: default avatarTushar Dave <tushar.n.dave@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c70b17b7
    • Nikolay Aleksandrov's avatar
      ip6mr: fix notification device destruction · 723b929c
      Nikolay Aleksandrov authored
      Andrey Konovalov reported a BUG caused by the ip6mr code which is caused
      because we call unregister_netdevice_many for a device that is already
      being destroyed. In IPv4's ipmr that has been resolved by two commits
      long time ago by introducing the "notify" parameter to the delete
      function and avoiding the unregister when called from a notifier, so
      let's do the same for ip6mr.
      
      The trace from Andrey:
      ------------[ cut here ]------------
      kernel BUG at net/core/dev.c:6813!
      invalid opcode: 0000 [#1] SMP KASAN
      Modules linked in:
      CPU: 1 PID: 1165 Comm: kworker/u4:3 Not tainted 4.11.0-rc7+ #251
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
      01/01/2011
      Workqueue: netns cleanup_net
      task: ffff880069208000 task.stack: ffff8800692d8000
      RIP: 0010:rollback_registered_many+0x348/0xeb0 net/core/dev.c:6813
      RSP: 0018:ffff8800692de7f0 EFLAGS: 00010297
      RAX: ffff880069208000 RBX: 0000000000000002 RCX: 0000000000000001
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88006af90569
      RBP: ffff8800692de9f0 R08: ffff8800692dec60 R09: 0000000000000000
      R10: 0000000000000006 R11: 0000000000000000 R12: ffff88006af90070
      R13: ffff8800692debf0 R14: dffffc0000000000 R15: ffff88006af90000
      FS:  0000000000000000(0000) GS:ffff88006cb00000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fe7e897d870 CR3: 00000000657e7000 CR4: 00000000000006e0
      Call Trace:
       unregister_netdevice_many.part.105+0x87/0x440 net/core/dev.c:7881
       unregister_netdevice_many+0xc8/0x120 net/core/dev.c:7880
       ip6mr_device_event+0x362/0x3f0 net/ipv6/ip6mr.c:1346
       notifier_call_chain+0x145/0x2f0 kernel/notifier.c:93
       __raw_notifier_call_chain kernel/notifier.c:394
       raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
       call_netdevice_notifiers_info+0x51/0x90 net/core/dev.c:1647
       call_netdevice_notifiers net/core/dev.c:1663
       rollback_registered_many+0x919/0xeb0 net/core/dev.c:6841
       unregister_netdevice_many.part.105+0x87/0x440 net/core/dev.c:7881
       unregister_netdevice_many net/core/dev.c:7880
       default_device_exit_batch+0x4fa/0x640 net/core/dev.c:8333
       ops_exit_list.isra.4+0x100/0x150 net/core/net_namespace.c:144
       cleanup_net+0x5a8/0xb40 net/core/net_namespace.c:463
       process_one_work+0xc04/0x1c10 kernel/workqueue.c:2097
       worker_thread+0x223/0x19c0 kernel/workqueue.c:2231
       kthread+0x35e/0x430 kernel/kthread.c:231
       ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
      Code: 3c 32 00 0f 85 70 0b 00 00 48 b8 00 02 00 00 00 00 ad de 49 89
      47 78 e9 93 fe ff ff 49 8d 57 70 49 8d 5f 78 eb 9e e8 88 7a 14 fe <0f>
      0b 48 8b 9d 28 fe ff ff e8 7a 7a 14 fe 48 b8 00 00 00 00 00
      RIP: rollback_registered_many+0x348/0xeb0 RSP: ffff8800692de7f0
      ---[ end trace e0b29c57e9b3292c ]---
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      723b929c
    • Daniel Borkmann's avatar
      bpf, doc: update bpf maintainers entry · cdb90499
      Daniel Borkmann authored
      Add various related files that have been missing under
      BPF entry covering essential parts of its infrastructure
      and also add myself as co-maintainer.
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cdb90499
    • Dan Carpenter's avatar
      net: qrtr: potential use after free in qrtr_sendmsg() · 6f60f438
      Dan Carpenter authored
      If skb_pad() fails then it frees the skb so we should check for errors.
      
      Fixes: bdabad3e ("net: Add Qualcomm IPC router")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6f60f438
    • David Miller's avatar
      bpf: Fix values type used in test_maps · 89087c45
      David Miller authored
      Maps of per-cpu type have their value element size adjusted to 8 if it
      is specified smaller during various map operations.
      
      This makes test_maps as a 32-bit binary fail, in fact the kernel
      writes past the end of the value's array on the user's stack.
      
      To be quite honest, I think the kernel should reject creation of a
      per-cpu map that doesn't have a value size of at least 8 if that's
      what the kernel is going to silently adjust to later.
      
      If the user passed something smaller, it is a sizeof() calcualtion
      based upon the type they will actually use (just like in this testcase
      code) in later calls to the map operations.
      
      Fixes: df570f57 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_ARRAY")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      89087c45
    • David Ahern's avatar
      net: ipv6: RTF_PCPU should not be settable from userspace · 557c44be
      David Ahern authored
      Andrey reported a fault in the IPv6 route code:
      
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] SMP KASAN
      Modules linked in:
      CPU: 1 PID: 4035 Comm: a.out Not tainted 4.11.0-rc7+ #250
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      task: ffff880069809600 task.stack: ffff880062dc8000
      RIP: 0010:ip6_rt_cache_alloc+0xa6/0x560 net/ipv6/route.c:975
      RSP: 0018:ffff880062dced30 EFLAGS: 00010206
      RAX: dffffc0000000000 RBX: ffff8800670561c0 RCX: 0000000000000006
      RDX: 0000000000000003 RSI: ffff880062dcfb28 RDI: 0000000000000018
      RBP: ffff880062dced68 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      R13: ffff880062dcfb28 R14: dffffc0000000000 R15: 0000000000000000
      FS:  00007feebe37e7c0(0000) GS:ffff88006cb00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000205a0fe4 CR3: 000000006b5c9000 CR4: 00000000000006e0
      Call Trace:
       ip6_pol_route+0x1512/0x1f20 net/ipv6/route.c:1128
       ip6_pol_route_output+0x4c/0x60 net/ipv6/route.c:1212
      ...
      
      Andrey's syzkaller program passes rtmsg.rtmsg_flags with the RTF_PCPU bit
      set. Flags passed to the kernel are blindly copied to the allocated
      rt6_info by ip6_route_info_create making a newly inserted route appear
      as though it is a per-cpu route. ip6_rt_cache_alloc sees the flag set
      and expects rt->dst.from to be set - which it is not since it is not
      really a per-cpu copy. The subsequent call to __ip6_dst_alloc then
      generates the fault.
      
      Fix by checking for the flag and failing with EINVAL.
      
      Fixes: d52d3997 ("ipv6: Create percpu rt6_info")
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      557c44be
    • Ilan Tayari's avatar
      gso: Validate assumption of frag_list segementation · 43170c4e
      Ilan Tayari authored
      Commit 07b26c94 ("gso: Support partial splitting at the frag_list
      pointer") assumes that all SKBs in a frag_list (except maybe the last
      one) contain the same amount of GSO payload.
      
      This assumption is not always correct, resulting in the following
      warning message in the log:
          skb_segment: too many frags
      
      For example, mlx5 driver in Striding RQ mode creates some RX SKBs with
      one frag, and some with 2 frags.
      After GRO, the frag_list SKBs end up having different amounts of payload.
      If this frag_list SKB is then forwarded, the aforementioned assumption
      is violated.
      
      Validate the assumption, and fall back to software GSO if it not true.
      
      Change-Id: Ia03983f4a47b6534dd987d7a2aad96d54d46d212
      Fixes: 07b26c94 ("gso: Support partial splitting at the frag_list pointer")
      Signed-off-by: default avatarIlan Tayari <ilant@mellanox.com>
      Signed-off-by: default avatarIlya Lesokhin <ilyal@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      43170c4e
    • David S. Miller's avatar
      Merge branch 'skb_cow_head' · 918b7024
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      net: use skb_cow_head() to deal with cloned skbs
      
      James Hughes found an issue with smsc95xx driver. Same problematic code
      is found in other drivers.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      918b7024
    • Eric Dumazet's avatar
      kaweth: use skb_cow_head() to deal with cloned skbs · 39fba783
      Eric Dumazet authored
      We can use skb_cow_head() to properly deal with clones,
      especially the ones coming from TCP stack that allow their head being
      modified. This avoids a copy.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: James Hughes <james.hughes@raspberrypi.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      39fba783
    • Eric Dumazet's avatar
      ch9200: use skb_cow_head() to deal with cloned skbs · 6bc6895b
      Eric Dumazet authored
      We need to ensure there is enough headroom to push extra header,
      but we also need to check if we are allowed to change headers.
      
      skb_cow_head() is the proper helper to deal with this.
      
      Fixes: 4a476bd6 ("usbnet: New driver for QinHeng CH9200 devices")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: James Hughes <james.hughes@raspberrypi.org>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6bc6895b