1. 30 Jun, 2015 3 commits
  2. 22 Jun, 2015 1 commit
  3. 18 Jun, 2015 1 commit
  4. 17 Jun, 2015 23 commits
  5. 16 Jun, 2015 12 commits
    • tcp/ipv6: fix flow label setting in TIME_WAIT state · b61fe8d6
      Florent Fourcot authored
      commit 21858cd0 upstream.
      
      commit 1d13a96c ("ipv6: tcp: fix flowlabel value in ACK messages
      send from TIME_WAIT") added the flow label in the last TCP packets.
      Unfortunately, it was not cast properly.
      
      This patch replaces the buggy shift with be32_to_cpu/cpu_to_be32.
      
      Fixes: 1d13a96c ("ipv6: tcp: fix flowlabel value in ACK messages")
      Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
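To illustrate why a plain shift is not a substitute for a byte-order conversion, here is a minimal Python sketch that models how a little-endian CPU sees a big-endian 32-bit word. The 12-bit shift, the helper names and the host-order mask are assumptions made for illustration; the commit message only says that a "buggy shift" was replaced with be32_to_cpu/cpu_to_be32.

```python
# Host-order mask for the 20-bit IPv6 flow label (illustrative name).
FLOWLABEL_MASK_HOST = 0x000FFFFF


def swab32(value):
    """Reverse the byte order of a 32-bit word."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")


# On a little-endian CPU both conversions are a plain byte swap.
be32_to_cpu = swab32
cpu_to_be32 = swab32


def roundtrip_with_shift(raw_be32):
    """Assumed buggy variant: shift the raw big-endian word into a 20-bit field and back."""
    stored = (raw_be32 >> 12) & 0xFFFFF
    return (stored << 12) & 0xFFFFFFFF


def roundtrip_with_swap(raw_be32):
    """Fixed variant: convert to host order, keep 20 bits, convert back."""
    stored = be32_to_cpu(raw_be32) & FLOWLABEL_MASK_HOST
    return cpu_to_be32(stored)


label = 0x12345
raw = cpu_to_be32(label)  # the on-wire word as a little-endian CPU loads it
print(hex(raw), hex(roundtrip_with_shift(raw)), hex(roundtrip_with_swap(raw)))
```

In this model only the byte-swap round trip returns the original word; the shift variant loses part of the label.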
    • ipv6: fix ECMP route replacement · 4d1de9e9
      Michal Kubeček authored
      commit 27596472 upstream.
      
      When replacing an IPv6 multipath route with "ip route replace", i.e.
      NLM_F_CREATE | NLM_F_REPLACE, fib6_add_rt2node() replaces only the first
      matching route without fixing its siblings, resulting in a corrupted
      sibling linked list; removing one of the siblings can then end in an
      infinite loop.
      
      The IPv6 ECMP implementation differs somewhat from IPv4, so route
      replacement cannot work in exactly the same way. This should be a
      reasonable approximation:
      
      1. If the new route is ECMP-able and there is a matching ECMP-able one
      already, replace it and all its siblings (if any).
      
      2. If the new route is ECMP-able and no matching ECMP-able route exists,
      replace the first matching non-ECMP-able route (if any) or just add the
      new one.
      
      3. If the new route is not ECMP-able, replace the first matching
      non-ECMP-able route (if any) or add the new route.
      
      We also need to remove the NLM_F_REPLACE flag after replacing the old
      route(s) with the first nexthop of an ECMP route so that each subsequent
      nexthop does not replace the previous one.
      
      Fixes: 51ebd318 ("ipv6: add support of equal cost multipath (ECMP)")
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
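The three replacement rules and the NLM_F_REPLACE handling described in this commit can be sketched as a small Python model. This is not the kernel code: the `Route` type and both function names are hypothetical, routes are reduced to a nexthop plus an "ECMP-able" flag, and all matching ECMP-able routes are assumed to be siblings of one another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    nexthop: str
    ecmp_able: bool


def replace_route(matching, new):
    """Apply rules 1-3 to the routes that already match the new route."""
    if new.ecmp_able and any(r.ecmp_able for r in matching):
        # Rule 1: the new route takes the place of the first ECMP-able
        # match; that match and all of its siblings are dropped.
        result, placed = [], False
        for r in matching:
            if not r.ecmp_able:
                result.append(r)
            elif not placed:
                result.append(new)
                placed = True
        return result
    # Rules 2 and 3: replace the first non-ECMP-able match, or just add.
    for i, r in enumerate(matching):
        if not r.ecmp_able:
            return matching[:i] + [new] + matching[i + 1:]
    return matching + [new]


def replace_multipath(matching, nexthops):
    """Replace with the first nexthop only; later nexthops are plain adds."""
    replace = True
    for nh in nexthops:
        new = Route(nh, ecmp_able=True)
        matching = replace_route(matching, new) if replace else matching + [new]
        replace = False  # stands in for clearing NLM_F_REPLACE
    return matching
```

Without the `replace = False` step, each nexthop of the new multipath route would replace the one added just before it, which is the problem the last paragraph of the commit message describes.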
    • net/ipv6/udp: Fix ipv6 multicast socket filter regression · eecd238f
      Henning Rogge authored
      commit 33b4b015 upstream.
      
      Commit 5cf3d461 ("udp: Simplify __udp*_lib_mcast_deliver")
      simplified the filter for incoming IPv6 multicast but removed
      the check of the local socket address and the UDP destination
      address.
      
      This patch restores the filter to prevent sockets bound to an IPv6
      multicast address from receiving other UDP traffic, such as unicast.
      
      Signed-off-by: Henning Rogge <hrogge@gmail.com>
      Fixes: 5cf3d461 ("udp: Simplify __udp*_lib_mcast_deliver")
      Cc: "David S. Miller" <davem@davemloft.net>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • ipv6: do not delete previously existing ECMP routes if add fails · 60f93af1
      Michal Kubeček authored
      commit 35f1b4e9 upstream.
      
      If adding a nexthop of an IPv6 multipath route fails, the comment in
      ip6_route_multipath() says we are going to delete all nexthops already
      added. However, the current implementation also deletes routes it
      hasn't even tried to add yet. For example, running
      
        ip route add 1234:5678::/64 \
            nexthop via fe80::aa dev dummy1 \
            nexthop via fe80::bb dev dummy1 \
            nexthop via fe80::cc dev dummy1
      
      twice results in removing all routes the first command added.
      
      Limit the second (delete) run to nexthops that succeeded in the first
      (add) run.
      
      Fixes: 51ebd318 ("ipv6: add support of equal cost multipath (ECMP)")
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
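The rollback rule from this commit (delete only what the current run actually added) can be sketched in Python as follows. The routing table is modelled as a plain set of nexthop strings and `add_multipath` is a hypothetical name; an already-present nexthop stands in for a failed add.

```python
def add_multipath(table, nexthops):
    """Add every nexthop; on failure, undo only the adds made by this call."""
    added = []
    for nh in nexthops:
        if nh in table:
            # The add fails (route exists): roll back this run's work only.
            for done in added:
                table.remove(done)
            return False
        table.add(nh)
        added.append(nh)
    return True


table = set()
nexthops = ["fe80::aa", "fe80::bb", "fe80::cc"]
first = add_multipath(table, nexthops)
second = add_multipath(table, nexthops)
print(first, second, sorted(table))
```

Running the same add twice fails the second time but leaves the three routes from the first run in place, which is the behaviour the fix is aiming for.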
    • ipv4: Avoid crashing in ip_error · a1745635
      Eric W. Biederman authored
      commit 381c759d upstream.
      
      ip_error does not check if in_dev is NULL before dereferencing it.
      
      The following sequence of calls is possible:
      CPU A                          CPU B
      ip_rcv_finish
          ip_route_input_noref()
              ip_route_input_slow()
                                     inetdev_destroy()
          dst_input()
      
      With the result that a network device can be destroyed while processing
      an input packet.
      
      A crash was triggered with only unicast packets in flight, and
      forwarding enabled on the only network device. The error condition
      was created by the removal of the network device.
      
      As such it is likely that the error code was -EHOSTUNREACH, and the
      action taken by ip_error (if in_dev had been accessible) would have
      been to not increment any counters and to have tried and likely failed
      to send an ICMP error as the network device is going away.
      
      Therefore handle this weird case by just dropping the packet if
      !in_dev.  It will result in dropping the packet sooner, and will not
      result in an actual change of behavior.
      
      Fixes: 251da413 ("ipv4: Cache ip_error() routes even when not forwarding.")
      Reported-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
      Tested-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
      Signed-off-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • cdc_ncm: Fix tx_bytes statistics · 98f0430b
      Bjørn Mork authored
      commit 44f6731d upstream.
      
      The tx_curr_frame_payload field is u32. When we try to calculate a
      small negative delta based on it, we end up with a positive integer
      close to 2^32 instead. So the tx_bytes counter increases by about
      2^32 for every transmitted frame.
      
      Fix by calculating the delta as a signed long.
      
      Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Reported-by: Florian Bruhin <me@the-compiler.org>
      Fixes: 7a1e890e ("usbnet: Fix tx_bytes statistic running backward in cdc_ncm")
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
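The wrap-around described in this commit is easy to reproduce. The Python sketch below emulates unsigned 32-bit arithmetic with a mask; the function names and the sample lengths are made up for illustration and are not taken from the driver.

```python
U32_MASK = 0xFFFFFFFF


def delta_u32(frame_len, payload):
    """Delta as it comes out when the subtraction is done in u32 arithmetic."""
    return (frame_len - payload) & U32_MASK


def delta_signed(frame_len, payload):
    """Delta computed as a signed value, as the fix does."""
    return frame_len - payload


tx_bytes = 1000
tx_bytes_buggy = tx_bytes + delta_u32(1500, 1504)
tx_bytes_fixed = tx_bytes + delta_signed(1500, 1504)
print(tx_bytes_buggy, tx_bytes_fixed)
```

A delta of -4 turns into 2^32 - 4 in the unsigned case, so the counter jumps by roughly 2^32 instead of shrinking by 4.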
    • bridge: fix parsing of MLDv2 reports · 4b822a14
      Thadeu Lima de Souza Cascardo authored
      commit 47cc84ce upstream.
      
      When more than one multicast address is present in an MLDv2 report, all
      but the first address are ignored, because the code breaks out of the
      loop if there has not been an error adding that address.
      
      This has caused failures when two guests connected through the bridge
      tried to communicate using IPv6. Neighbor discoveries would not be
      transmitted to the other guest when both used a link-local address and a
      static address.
      
      This only happens when there is an MLDv2 querier in the network.
      
      The fix will only break out of the loop when there is a failure adding a
      multicast address.
      
      The mdb before the patch:
      
      dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp
      dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp
      dev ovirtmgmt port bond0.86 grp ff02::2 temp
      
      After the patch:
      
      dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp
      dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp
      dev ovirtmgmt port bond0.86 grp ff02::fb temp
      dev ovirtmgmt port bond0.86 grp ff02::2 temp
      dev ovirtmgmt port bond0.86 grp ff02::d temp
      dev ovirtmgmt port vnet0 grp ff02::1:ff00:76 temp
      dev ovirtmgmt port bond0.86 grp ff02::16 temp
      dev ovirtmgmt port vnet1 grp ff02::1:ff00:77 temp
      dev ovirtmgmt port bond0.86 grp ff02::1:ff00:def temp
      dev ovirtmgmt port bond0.86 grp ff02::1:ffa1:40bf temp
      
      Fixes: 08b202b6 ("bridge br_multicast: IPv6 MLD support.")
      Reported-by: Rik Theys <Rik.Theys@esat.kuleuven.be>
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
      Tested-by: Rik Theys <Rik.Theys@esat.kuleuven.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
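The inverted loop-exit condition described in this commit can be shown with a short Python sketch. `add_groups` and `make_mdb` are hypothetical helpers; an error code of 0 means the group was added successfully.

```python
def add_groups(groups, add_group, fixed=True):
    """Walk the group records of a report and add each one to the mdb."""
    err = 0
    for grp in groups:
        err = add_group(grp)
        # Buggy variant: stop on success. Fixed variant: stop on failure.
        stop = bool(err) if fixed else not err
        if stop:
            break
    return err


def make_mdb():
    """Return an empty mdb list and an add function that always succeeds."""
    mdb = []

    def add_group(grp):
        mdb.append(grp)
        return 0

    return mdb, add_group


report = ["ff02::fb", "ff02::2", "ff02::d"]
mdb_before, add_before = make_mdb()
add_groups(report, add_before, fixed=False)
mdb_after, add_after = make_mdb()
add_groups(report, add_after, fixed=True)
print(mdb_before, mdb_after)
```

With the buggy condition only the first group of the report reaches the mdb, which matches the shorter "before the patch" listing.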
    • net: phy: Allow EEE for all RGMII variants · 40ea3526
      Florian Fainelli authored
      commit 7e140696 upstream.
      
      RGMII interfaces come in multiple flavors: RGMII with transmit or
      receive internal delay, no delays at all, or delays in both directions.
      
      This change extends the initial check for PHY_INTERFACE_MODE_RGMII to
      cover all of these variants. EEE should be allowed for any of these
      modes because it is a property of RGMII, and hence of the Gigabit PHY
      capability, rather than of the RGMII electrical interface and its delays.
      
      Fixes: a59a4d19 ("phy: add the EEE support and the way to access to the MMD registers")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
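As a rough illustration of the widened check, the Python sketch below models only the RGMII part of the interface-mode test. The enum mirrors the kernel's PHY_INTERFACE_MODE_* names, but the two predicate functions are hypothetical and ignore any other conditions the real check may apply.

```python
from enum import Enum, auto


class PhyInterfaceMode(Enum):
    MII = auto()
    GMII = auto()
    RGMII = auto()
    RGMII_ID = auto()    # internal delay in both directions
    RGMII_RXID = auto()  # receive internal delay
    RGMII_TXID = auto()  # transmit internal delay


RGMII_VARIANTS = frozenset({
    PhyInterfaceMode.RGMII,
    PhyInterfaceMode.RGMII_ID,
    PhyInterfaceMode.RGMII_RXID,
    PhyInterfaceMode.RGMII_TXID,
})


def rgmii_check_before(mode):
    """Original check: only plain RGMII qualifies."""
    return mode is PhyInterfaceMode.RGMII


def rgmii_check_after(mode):
    """Extended check: every RGMII variant qualifies."""
    return mode in RGMII_VARIANTS
```

A PHY configured as `RGMII_TXID`, for example, fails the first predicate and passes the second.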
    • conntrack: RFC5961 challenge ACK confuse conntrack LAST-ACK transition · 1f997b1a
      Jesper Dangaard Brouer authored
      commit b3cad287 upstream.
      
      In compliance with RFC5961, the network stack sends a challenge ACK in
      response to spurious SYN packets, since commit 0c228e83 ("tcp:
      Restore RFC5961-compliant behavior for SYN packets").
      
      This poses a problem for netfilter conntrack in state LAST_ACK, because
      this challenge ACK is (falsely) seen as ACKing the last FIN, causing a
      false state transition (into TIME_WAIT).
      
      The challenge ACK is hard to distinguish from the real last ACK. Thus,
      the solution introduces a flag that tracks the potential for seeing a
      challenge ACK, in case a SYN packet is let through and the current state
      is LAST_ACK.
      
      When the conntrack transition from LAST_ACK to TIME_WAIT happens, this
      flag is used to determine whether we are expecting a challenge ACK.
      
      A Scapy-based reproducer script is available here:
       https://github.com/netoptimizer/network-testing/blob/master/scapy/tcp_hacks_3WHS_LAST_ACK.py
      
      Fixes: 0c228e83 ("tcp: Restore RFC5961-compliant behavior for SYN packets")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
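The flag-based approach described in this commit can be sketched as a toy state machine in Python. The flag name, the `Conn` class and the `track` function are hypothetical, and packet direction, sequence checks and the other TCP states are left out.

```python
EXPECT_CHALLENGE_ACK = 0x1  # hypothetical flag bit


class Conn:
    def __init__(self):
        self.state = "LAST_ACK"
        self.flags = 0


def track(conn, packet):
    """Update the tracked state for a 'SYN' or 'ACK' packet; return the state."""
    if conn.state == "LAST_ACK":
        if packet == "SYN":
            # A SYN was let through: the peer may answer with a challenge ACK.
            conn.flags |= EXPECT_CHALLENGE_ACK
        elif packet == "ACK":
            if conn.flags & EXPECT_CHALLENGE_ACK:
                # Treat this ACK as the challenge ACK: keep the state.
                conn.flags &= ~EXPECT_CHALLENGE_ACK
            else:
                conn.state = "TIME_WAIT"
    return conn.state


conn = Conn()
track(conn, "SYN")
after_challenge = track(conn, "ACK")
after_last_ack = track(conn, "ACK")
print(after_challenge, after_last_ack)
```

In this model the first ACK after a SYN is absorbed as the expected challenge ACK, and only the following ACK moves the connection to TIME_WAIT.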
    • net: sched: fix call_rcu() race on classifier module unloads · 82532ec7
      Daniel Borkmann authored
      commit c78e1746 upstream.
      
      Vijay reported that a loop as simple as ...
      
        while true; do
          tc qdisc add dev foo root handle 1: prio
          tc filter add dev foo parent 1: u32 match u32 0 0  flowid 1
          tc qdisc del dev foo root
          rmmod cls_u32
        done
      
      ... will panic the kernel. Moreover, he bisected the change
      apparently introducing it to 78fd1d0a ("netlink: Re-add
      locking to netlink_lookup() and seq walker").
      
      The removal of synchronize_net() from the netlink socket
      triggering the qdisc to be removed seems to have uncovered
      an RCU (respectively, module reference count) race in the tc API.
      Given that the RCU conversion was done after e341694e ("netlink:
      Convert netlink_lookup() to use RCU protected hash table"),
      which added the synchronize_net() originally, hitting the bug
      was less likely (though not impossible):
      
      When qdiscs that i) support attaching classifiers and,
      ii) have at least one of them attached, get deleted, they
      invoke tcf_destroy_chain(), and thus call into ->destroy()
      handler from a classifier module.
      
      After the RCU conversion, all classifiers that have an internal
      prio list unlink them and initiate freeing via call_rcu()
      deferral.
      
      Meanwhile, tcf_destroy() already releases its reference to the
      tp->ops->owner module before the queued RCU callback handler
      has been invoked.
      
      A subsequent rmmod on the classifier module is then not prevented,
      since all module references have already been dropped.
      
      By the time the kernel invokes the RCU callback handler from
      the module, that function address is invalid.
      
      One way to fix it would be to add an rcu_barrier() to
      unregister_tcf_proto_ops() to wait for all pending call_rcu()s
      to complete.
      
      synchronize_rcu() is not appropriate as under heavy RCU
      callback load, registered call_rcu()s could be deferred
      longer than a grace period. In case we don't have any pending
      call_rcu()s, the barrier is allowed to return immediately.
      
      Since we came here via unregister_tcf_proto_ops(), there
      are no users of a given classifier anymore. Further nested
      call_rcu()s pointing into the module space are not being
      done anywhere.
      
      Only cls_bpf_delete_prog() may schedule a work item, to
      unlock pages eventually, but that is not in the range/context
      of cls_bpf anymore.
      
      Fixes: 25d8c0d5 ("net: rcu-ify tcf_proto")
      Fixes: 9888faef ("net: sched: cls_basic use RCU")
      Reported-by: Vijay Subramanian <subramanian.vijay@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Thomas Graf <tgraf@suug.ch>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
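The ordering problem behind this commit (a module is unloaded while callbacks pointing into it are still queued) can be modelled with a toy Python simulation. Nothing here is kernel API: the classes and the `scenario` helper are hypothetical, and a raised exception stands in for a jump to an invalid function address.

```python
class Module:
    def __init__(self):
        self.loaded = True

    def destroy_callback(self):
        if not self.loaded:
            raise RuntimeError("callback address is no longer valid")


class RcuQueue:
    def __init__(self):
        self.pending = []

    def call_rcu(self, callback):
        """Queue a callback for deferred execution."""
        self.pending.append(callback)

    def run_pending(self):
        """Run every queued callback, as happens some time after call_rcu()."""
        while self.pending:
            self.pending.pop(0)()

    def rcu_barrier(self):
        """Wait until all previously queued callbacks have run."""
        self.run_pending()


def scenario(use_barrier):
    module, rcu = Module(), RcuQueue()
    rcu.call_rcu(module.destroy_callback)  # classifier ->destroy() defers freeing
    if use_barrier:
        rcu.rcu_barrier()                  # drain callbacks before unregistering
    module.loaded = False                  # rmmod of the classifier module
    try:
        rcu.run_pending()                  # deferred callbacks fire later
    except RuntimeError:
        return "crash"
    return "ok"


print(scenario(use_barrier=False), scenario(use_barrier=True))
```

Draining the queue before the module goes away is what the barrier adds in this model; when nothing is pending, it returns immediately.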
    • rtnl/bond: don't send rtnl msg for unregistered iface · d042a64c
      Nicolas Dichtel authored
      commit ed2a80ab upstream.
      
      Before the patch, the command 'ip link add bond2 type bond mode 802.3ad'
      causes the kernel to send an rtnl message for the bond2 interface, with an
      ifindex of 0.
      
      'ip monitor' shows:
      0: bond2: <BROADCAST,MULTICAST,MASTER> mtu 1500 state DOWN group default
          link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
      9: bond2@NONE: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN group default
          link/ether ea:3e:1f:53:92:7b brd ff:ff:ff:ff:ff:ff
      [snip]
      
      The patch fixes the spotted bug by checking in the bonding driver whether
      the interface is registered before calling the notifier chain.
      It also adds a check in rtmsg_ifinfo() to prevent this kind of bug in the
      future.
      
      Fixes: d4261e56 ("bonding: create netlink event when bonding option is changed")
      CC: Jiri Pirko <jiri@resnulli.us>
      Reported-by: Julien Meunier <julien.meunier@6wind.com>
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • ipvs: fix memory leak in ip_vs_ctl.c · 0950b526
      Tommi Rantala authored
      commit f30bf2a5 upstream.
      
      Fix memory leak introduced in commit a0840e2e ("IPVS: netns,
      ip_vs_ctl local vars moved to ipvs struct."):
      
      unreferenced object 0xffff88005785b800 (size 2048):
        comm "(-localed)", pid 1434, jiffies 4294755650 (age 1421.089s)
        hex dump (first 32 bytes):
          bb 89 0b 83 ff ff ff ff b0 78 f0 4e 00 88 ff ff  .........x.N....
          04 00 00 00 a4 01 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff8262ea8e>] kmemleak_alloc+0x4e/0xb0
          [<ffffffff811fba74>] __kmalloc_track_caller+0x244/0x430
          [<ffffffff811b88a0>] kmemdup+0x20/0x50
          [<ffffffff823276b7>] ip_vs_control_net_init+0x1f7/0x510
          [<ffffffff8231d630>] __ip_vs_init+0x100/0x250
          [<ffffffff822363a1>] ops_init+0x41/0x190
          [<ffffffff82236583>] setup_net+0x93/0x150
          [<ffffffff82236cc2>] copy_net_ns+0x82/0x140
          [<ffffffff810ab13d>] create_new_namespaces+0xfd/0x190
          [<ffffffff810ab49a>] unshare_nsproxy_namespaces+0x5a/0xc0
          [<ffffffff810833e3>] SyS_unshare+0x173/0x310
          [<ffffffff8265cbd7>] system_call_fastpath+0x12/0x6f
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      Fixes: a0840e2e ("IPVS: netns, ip_vs_ctl local vars moved to ipvs struct.")
      Signed-off-by: Tommi Rantala <tt.rantala@gmail.com>
      Acked-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>