1. 03 Oct, 2015 10 commits
    • netlink, mmap: transform mmap skb into full skb on taps · 62f43b58
      Daniel Borkmann authored
      [ Upstream commit 1853c949 ]
      
      Ken-ichirou reported that running netlink in mmap mode for receive in
      combination with nlmon will throw a NULL pointer dereference in
      __kfree_skb() on nlmon_xmit(); in my case I can also trigger an "unable
      to handle kernel paging request". The problem is the skb_clone() in
      __netlink_deliver_tap_skb() for skbs that are mmapped.
      
      I.e. the cloned skb doesn't have a destructor, whereas the mmap netlink
      skb has it pointed to netlink_skb_destructor(), set in the handler
      netlink_ring_setup_skb(). There, skb->head is being set to NULL, so
      that in such cases, __kfree_skb() doesn't perform a skb_release_data()
      via skb_release_all(), where skb->head is possibly being freed through
      kfree(head) into slab allocator, although netlink mmap skb->head points
      to the mmap buffer. Similarly, the same has to be done also for large
      netlink skbs where the data area is vmalloced. Therefore, as discussed,
      make a copy for these rather rare cases for now. This fixes the issue
      on my and Ken-ichirou's test-cases.
      
      Reference: http://thread.gmane.org/gmane.linux.network/371129
      Fixes: bcbde0d4 ("net: netlink: virtual tap device management")
      Reported-by: Ken-ichirou MATSUZAWA <chamaken@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Ken-ichirou MATSUZAWA <chamaken@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: dsa: bcm_sf2: Fix 64-bits register writes · 12e082bc
      Florian Fainelli authored
      [ Upstream commit 03679a14 ]
      
      The macro to write 64-bit quantities to the 32-bit registers swapped
      the value and offset arguments. We want to preserve the ordering of the
      arguments with respect to how writel() is implemented, for instance:
      value first, offset/base second.
      
      Fixes: 246d7f77 ("net: dsa: add Broadcom SF2 switch driver")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
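The argument-order convention from the bcm_sf2 64-bit write fix can be illustrated with a small userspace sketch. The names `fake_regs`, `reg_writel`, `reg_writeq` and `reg_readq`, and the layout of a low word followed by a high word at the next 32-bit offset, are assumptions made for illustration; this is not the driver's actual macro.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped register block. */
static uint32_t fake_regs[16];

/* Follows the writel() convention: value first, offset second. */
static void reg_writel(uint32_t val, unsigned int off)
{
    assert(off % 4 == 0 && off / 4 < 16);
    fake_regs[off / 4] = val;
}

/* Write a 64-bit quantity as two 32-bit writes (assumed layout:
 * low word at off, high word at off + 4). Same argument order. */
static void reg_writeq(uint64_t val, unsigned int off)
{
    reg_writel((uint32_t)(val & 0xffffffffu), off);
    reg_writel((uint32_t)(val >> 32), off + 4);
}

/* Read the two halves back and recombine them. */
static uint64_t reg_readq(unsigned int off)
{
    return ((uint64_t)fake_regs[off / 4 + 1] << 32) | fake_regs[off / 4];
}
```

Keeping both helpers in the same (value, offset) order means a caller who swaps the arguments gets it wrong consistently, which is easier to spot in review than a 64-bit helper that silently uses the opposite order.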
    • ipv6: fix multipath route replace error recovery · e60f4a39
      Roopa Prabhu authored
      [ Upstream commit 6b9ea5a6 ]
      
      Problem:
      The ECMP route replace support for IPv6 in the kernel deletes the
      existing ECMP route too early, i.e. when it installs the first nexthop.
      If there is an error in installing the subsequent nexthops, it's too late
      to recover the already deleted existing route, leaving the FIB
      in an inconsistent state.
      
      This patch reduces the possibility of this by doing the following:
      a) Changes the existing multipath route add code to a two stage process:
         build rt6_infos + insert them. The ip6_route_add rt6_info creation
         code is moved into ip6_route_info_create.
      b) This ensures that most errors are caught during building rt6_infos
         and we fail early.
      c) Separates multipath add and del code, because add needs the special
         two stage mode in a) and delete essentially does not care.
      d) In any event, if the code fails during inserting a route again, a
         warning is printed (this should be unlikely).
      
      Before the patch:
      $ip -6 route show
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
      
      /* Try replacing the route with a duplicate nexthop */
      $ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
      fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
      swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
      RTNETLINK answers: File exists
      
      $ip -6 route show
      /* previously added ecmp route 3000:1000:1000:1000::2 disappears from
       * kernel */
      
      After the patch:
      $ip -6 route show
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
      
      /* Try replacing the route with a duplicate nexthop */
      $ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
      fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
      swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
      RTNETLINK answers: File exists
      
      $ip -6 route show
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
      
      Fixes: 27596472 ("ipv6: fix ECMP route replacement")
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
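The two-stage idea in the multipath replace fix (build everything first, then insert) can be sketched in userspace C. This is a simplified model, not the kernel code: `struct route`, `build_nexthops` and `replace_route` are hypothetical names, nexthops are plain integers, and the second stage here cannot fail, whereas the commit message notes that the real insert step still can (hence the warning).

```c
#include <assert.h>

#define MAX_NH 8

/* Hypothetical stand-in for a multipath route and its nexthops. */
struct route {
    int nh[MAX_NH];
    int count;
};

/* Stage 1: build the candidate set and catch errors (here: bad
 * counts and duplicate nexthops) before the live route is touched. */
static int build_nexthops(const int *nh, int n, struct route *out)
{
    if (n <= 0 || n > MAX_NH)
        return -1;
    out->count = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < out->count; j++)
            if (out->nh[j] == nh[i])
                return -1; /* duplicate, comparable to "File exists" */
        out->nh[out->count++] = nh[i];
    }
    return 0;
}

/* Stage 2: replace the live route only after the build succeeded. */
static int replace_route(struct route *live, const int *nh, int n)
{
    struct route staged = {0};

    if (build_nexthops(nh, n, &staged) != 0)
        return -1; /* live route is left untouched */
    *live = staged;
    return 0;
}
```

With this ordering, a replace request containing a duplicate nexthop fails in stage 1 and the previously installed nexthops remain in place, which matches the "After the patch" output in the message above.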
    • net: dsa: bcm_sf2: Fix ageing conditions and operation · 5548af0c
      Florian Fainelli authored
      [ Upstream commit 39797a27 ]
      
      The comparison check between cur_hw_state and hw_state is currently
      invalid because cur_hw_state is right shifted by G_MISTP_SHIFT, while
      hw_state is not, so we end up comparing bits 2:0 with bits 7:5, which is
      going to cause an additional ageing to occur. Fix this by not shifting
      cur_hw_state while reading it, but instead masking the value with the
      appropriately shifted bitmask.
      
      The other problem with the fast-ageing process is that we did not set
      the EN_AGE_DYNAMIC bit to request the ageing to occur for dynamically
      learned MAC addresses. Finally, write back 0 to the FAST_AGE_CTRL
      register to avoid leaving spurious bits set from one operation to the
      other.
      
      Fixes: 12f460f2 ("net: dsa: bcm_sf2: add HW bridging support")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
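The shifted-versus-unshifted comparison described in the ageing fix can be reproduced in a few lines of C. The value 5 for `G_MISTP_SHIFT` and the 3-bit `G_MISTP_MASK` are assumptions inferred from the "bits 2:0 with bits 7:5" wording in the message, and both function names are hypothetical; the real driver definitions may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: a 3-bit state field located at bits 7:5. */
#define G_MISTP_SHIFT 5
#define G_MISTP_MASK  (0x7u << G_MISTP_SHIFT)

/* Buggy variant: the register value is shifted down to bits 2:0 and
 * then compared with hw_state, which still sits at bits 7:5. */
static int state_changed_buggy(uint32_t reg, uint32_t hw_state)
{
    uint32_t cur_hw_state = reg >> G_MISTP_SHIFT;

    return cur_hw_state != hw_state;
}

/* Fixed variant: mask in place so both operands use bits 7:5. */
static int state_changed_fixed(uint32_t reg, uint32_t hw_state)
{
    uint32_t cur_hw_state = reg & G_MISTP_MASK;

    return cur_hw_state != hw_state;
}
```

In the buggy variant any non-zero state compares as changed even when the register already holds it, which is the spurious extra ageing the message describes.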
    • net/ipv6: Correct PIM6 mrt_lock handling · f5f10834
      Richard Laing authored
      [ Upstream commit 25b4a44c ]
      
      In the IPv6 multicast routing code the mrt_lock was not being released
      correctly in the MFC iterator. As a result, adding or deleting a MIF would
      cause a hang because the mrt_lock could not be acquired.
      
      This fix is a copy of the code for the IPv4 case and ensures that the lock
      is released correctly.
      Signed-off-by: Richard Laing <richard.laing@alliedtelesis.co.nz>
      Acked-by: Cong Wang <cwang@twopensource.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: eth: altera: fix napi poll_list corruption · c8bf2008
      Atsushi Nemoto authored
      [ Upstream commit 4548a697 ]
      
      tse_poll() calls __napi_complete() with irq enabled.  This leads to napi
      poll_list corruption and may stop all napi drivers from working.
      Use napi_complete() instead of __napi_complete().
      Signed-off-by: Atsushi Nemoto <nemoto@toshiba-tops.co.jp>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: fec: clear receive interrupts before processing a packet · 496e7b36
      Russell King authored
      [ Upstream commit ed63f1dc ]
      
      This patch just re-submits the patch "db3421c1", because the
      patch "4d494cdc" removed the change.
      
      Clear any pending receive interrupt before we process a pending packet.
      This helps to avoid any spurious interrupts being raised after we have
      fully cleaned the receive ring, while still allowing an interrupt to be
      raised if we receive another packet.
      
      The position of this is critical: we must do this prior to reading the
      next packet status to avoid potentially dropping an interrupt when a
      packet is still pending.
      Acked-by: Fugang Duan <B38611@freescale.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv6: fix exthdrs offload registration in out_rt path · dd35e5b8
      Daniel Borkmann authored
      [ Upstream commit e41b0bed ]
      
      We previously register IPPROTO_ROUTING offload under inet6_add_offload(),
      but in the error path, we try to unregister it with inet_del_offload(). This
      doesn't seem correct; it should actually be inet6_del_offload(). Also,
      ipv6_exthdrs_offload_exit() from that commit seems rather incorrect (it
      also uses rthdr_offload twice), but it got removed entirely later on.
      
      Fixes: 3336288a ("ipv6: Switch to using new offload infrastructure.")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sock, diag: fix panic in sock_diag_put_filterinfo · 4230591d
      Daniel Borkmann authored
      [ Upstream commit b382c086 ]
      
      diag socket's sock_diag_put_filterinfo() dumps classic BPF programs
      upon request to user space (ss -0 -b). However, native eBPF programs
      attached to sockets (SO_ATTACH_BPF) cannot be dumped with this method:
      
      Their orig_prog is always NULL. However, sock_diag_put_filterinfo()
      unconditionally tries to access its filter length and wants to copy
      the filter insns from there. Internal cBPF to eBPF transformations
      attached to sockets don't have this issue, as orig_prog state is kept.
      
      It's currently only used by packet sockets. If we want to add
      native eBPF support in the future, this needs to be done through
      a different attribute than PACKET_DIAG_FILTER to not confuse possible
      user space disassemblers that work on diag data.
      
      Fixes: 89aa0758 ("net: sock: allow eBPF programs to be attached to sockets")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
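The failure mode in the sock_diag fix comes down to a missing NULL check before dereferencing orig_prog. A minimal userspace sketch of such a guard follows. The struct layouts and the function `dump_filter` are hypothetical simplifications, and reporting "nothing to dump" for native eBPF programs is an assumption here, since the message does not say what the upstream fix returns in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct orig_prog {
    unsigned short len;         /* number of classic BPF instructions */
    const unsigned int *insns;  /* instruction words (simplified) */
};

struct prog {
    const struct orig_prog *orig_prog; /* NULL for native eBPF */
};

/* Copy the original classic BPF instructions to dst. Returns the
 * number of instructions copied, or 0 when there is nothing to dump. */
static size_t dump_filter(const struct prog *p, unsigned int *dst, size_t cap)
{
    size_t n;

    if (!p || !p->orig_prog)
        return 0; /* guard: native eBPF keeps no original program */

    n = p->orig_prog->len;
    if (n > cap)
        n = cap;
    memcpy(dst, p->orig_prog->insns, n * sizeof(*dst));
    return n;
}
```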
    • phylib: fix device deletion order in mdiobus_unregister() · 001fc2f5
      Mark Salter authored
      [ Upstream commit b6c6aedc ]
      
      commit 8b63ec18 ("phylib: Make PHYs children of their MDIO bus, not
      the bus' parent.") uncovered a problem in mdiobus_unregister() which
      leads to this warning when I reboot an APM Mustang (arm64) platform:
      
        WARNING: CPU: 7 PID: 4239 at fs/sysfs/group.c:224 sysfs_remove_group+0xa0/0xa4()
        sysfs group fffffe0000e07a10 not found for kobject 'xgene-mii-eth0:03'
        ...
        CPU: 7 PID: 4239 Comm: reboot Tainted: G            E   4.2.0-0.18.el7.test15.aarch64 #1
        Hardware name: AppliedMicro Mustang/Mustang, BIOS 1.1.0 Aug 26 2015
        Call Trace:
        [<fffffe000009739c>] dump_backtrace+0x0/0x170
        [<fffffe000009752c>] show_stack+0x20/0x2c
        [<fffffe00007436f0>] dump_stack+0x78/0x9c
        [<fffffe00000c2cb4>] warn_slowpath_common+0xa0/0xd8
        [<fffffe00000c2d60>] warn_slowpath_fmt+0x74/0x88
        [<fffffe0000293d3c>] sysfs_remove_group+0x9c/0xa4
        [<fffffe00004a8bac>] dpm_sysfs_remove+0x5c/0x70
        [<fffffe000049b388>] device_del+0x44/0x208
        [<fffffe000049b578>] device_unregister+0x2c/0x7c
        [<fffffe000050dc68>] mdiobus_unregister+0x48/0x94
        [<fffffe000052afd0>] xgene_enet_mdio_remove+0x28/0x44
        [<fffffe000052d3f0>] xgene_enet_remove+0xd0/0xd8
        [<fffffe000052d424>] xgene_enet_shutdown+0x2c/0x3c
        [<fffffe00004a204c>] platform_drv_shutdown+0x24/0x40
        [<fffffe000049d4f4>] device_shutdown+0xf0/0x1b4
        [<fffffe00000e31ec>] kernel_restart_prepare+0x40/0x4c
        [<fffffe00000e32f8>] kernel_restart+0x1c/0x80
        [<fffffe00000e3670>] SyS_reboot+0x17c/0x250
      
      The problem is that mdiobus_unregister() deletes the bus device before
      unregistering the phy devices on the bus. This wasn't a problem before
      because the phys were not children of the bus:
      
        /sys/devices/platform/APMC0D05:00/net/eth0/xgene-mii-eth0:03
        /sys/devices/platform/APMC0D05:00/net/eth0/xgene-mii-eth0
      
      But now that they are:
      
        /sys/devices/platform/APMC0D05:00/net/eth0/xgene-mii-eth0/xgene-mii-eth0:03
      
      when mdiobus_unregister deletes the bus device, the phy subdirs are
      removed from sysfs also. So when the phys are unregistered afterward,
      we get the warning. This patch changes the order so that phys are
      unregistered before the bus device is deleted.
      
      Fixes: 8b63ec18 ("phylib: Make PHYs children of their MDIO bus, not the bus' parent.")
      Signed-off-by: Mark Salter <msalter@redhat.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
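The ordering problem in the mdiobus_unregister() fix can be modelled with a toy parent/child structure. Everything in this sketch is hypothetical (`struct bus`, `delete_bus_dir`, `unregister_phy`, and the two teardown functions); it only mimics the sysfs behaviour described in the message, where deleting the bus directory also removes the phy subdirectories beneath it.

```c
#include <assert.h>

#define MAX_PHYS 4

/* Hypothetical model of a bus directory with phy subdirectories. */
struct bus {
    int bus_present;
    int phy_present[MAX_PHYS];
    int warnings;
};

static void bus_init(struct bus *b)
{
    b->bus_present = 1;
    b->warnings = 0;
    for (int i = 0; i < MAX_PHYS; i++)
        b->phy_present[i] = 1;
}

/* Deleting the bus directory removes everything beneath it. */
static void delete_bus_dir(struct bus *b)
{
    for (int i = 0; i < MAX_PHYS; i++)
        b->phy_present[i] = 0;
    b->bus_present = 0;
}

/* Unregistering a phy whose directory is already gone warns. */
static void unregister_phy(struct bus *b, int i)
{
    if (!b->phy_present[i]) {
        b->warnings++;
        return;
    }
    b->phy_present[i] = 0;
}

/* Old order: bus first, then phys. Returns the warning count. */
static int unregister_bus_first(struct bus *b)
{
    delete_bus_dir(b);
    for (int i = 0; i < MAX_PHYS; i++)
        unregister_phy(b, i);
    return b->warnings;
}

/* Fixed order: phys first, then the bus. Returns the warning count. */
static int unregister_phys_first(struct bus *b)
{
    for (int i = 0; i < MAX_PHYS; i++)
        unregister_phy(b, i);
    delete_bus_dir(b);
    return b->warnings;
}
```

In this model the old order produces one warning per phy, while tearing down children before the parent produces none.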
  2. 29 Sep, 2015 30 commits