1. 27 Feb, 2014 5 commits
  2. 26 Feb, 2014 9 commits
    • bonding: fix a div error caused by the slave release path · ee6154e1
      Nikolay Aleksandrov authored
      A bug in the slave release function can cause a division by zero in the
      transmit functions that use bond->slave_cnt. We may have just released
      our last slave and set slave_cnt to 0 while a transmitter that has
      already passed the empty-list check goes on to fetch slave_cnt and use
      it in the slave id calculation.
      Fix it by moving the slave_cnt update after synchronize_rcu(), so that if
      this was our last slave, any new transmitter sees an empty slave list.
      That list is checked after taking the RCU lock but before calling the
      mode transmit functions that rely on bond->slave_cnt.
      
      Fixes: 278b2083 ("bonding: initial RCU conversion")
      
      CC: Veaceslav Falico <vfalico@redhat.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: Jay Vosburgh <fubar@us.ibm.com>
      CC: David S. Miller <davem@davemloft.net>
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Acked-by: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ee6154e1
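The ordering problem described in ee6154e1 can be replayed deterministically in a toy model. This is only a sketch: the class and function names are invented for illustration, a copied list stands in for an RCU reader that has already passed the empty-list check, and calling the pending readers stands in for the grace period that synchronize_rcu() waits for.

```python
class Bond:
    """Toy stand-in for a bond device; not the kernel's struct bonding."""

    def __init__(self, slaves):
        self.slaves = list(slaves)
        self.slave_cnt = len(self.slaves)


def pick_slave_id(bond, snapshot, pkt_id):
    """Transmit side: check the list the reader saw, then read slave_cnt."""
    if not snapshot:
        return None  # empty list: nothing to transmit on, no division
    return pkt_id % bond.slave_cnt


def release_last_slave_buggy(bond):
    """Old order: the count drops while a reader may still be in flight."""
    bond.slave_cnt -= 1
    bond.slaves.pop()


def release_last_slave_fixed(bond, readers_in_flight):
    """New order: unlink, let in-flight readers finish, then update the count."""
    bond.slaves.pop()
    results = [reader() for reader in readers_in_flight]  # models the grace period
    bond.slave_cnt -= 1
    return results


def demo_buggy():
    bond = Bond(["eth0"])
    snapshot = list(bond.slaves)    # reader has passed the empty-list check
    release_last_slave_buggy(bond)  # releaser runs in between
    try:
        pick_slave_id(bond, snapshot, pkt_id=7)
    except ZeroDivisionError:
        return "div by zero"
    return "ok"


def demo_fixed():
    bond = Bond(["eth0"])
    snapshot = list(bond.slaves)

    def reader():
        return pick_slave_id(bond, snapshot, pkt_id=7)

    in_flight = release_last_slave_fixed(bond, [reader])
    # A transmitter arriving after the grace period sees an empty list.
    late = pick_slave_id(bond, list(bond.slaves), pkt_id=8)
    return in_flight, late, bond.slave_cnt
```

In the buggy order the in-flight reader divides by a count of zero; in the fixed order it still sees a count of 1, and any later transmitter stops at the empty-list check.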
    • AX88179_178A: Add VID:DID for Lenovo OneLinkDock Gigabit LAN · e5fe0cd4
      Freddy Xin authored
      Add VID:DID for Lenovo OneLinkDock Gigabit LAN
      Signed-off-by: Freddy Xin <freddy@asix.com.tw>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e5fe0cd4
    • Merge branch 'bonding_rtnl' · 5d6dd5bf
      David S. Miller authored
      Ding Tianhong says:
      
      ====================
      Fix RTNL: assertion failed at net/core/rtnetlink.c
      
      Commit 1d3ee88a
      (bonding: add netlink attributes to slave link dev)
      made bond_set_active_slave() and bond_set_backup_slave()
      use rtmsg_ifinfo() to send the slave's state, so these functions
      must be called with RTNL held.
      
      But the 802.3ad and ARP monitors did not hold RTNL when calling
      these two functions, so fix them.
      
      v1->v2: Add a new macro to indicate that the notification should be sent
              later, not never.
              And add a new patch to fix the same problem for ARP mode.
      
      v2->v3: Rename bond_should_notify to should_notify_rtnl, which is more
              reasonable, and use bool for should_notify_rtnl.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5d6dd5bf
    • bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for ab arp monitor · b0929915
      dingtianhong authored
      Veaceslav reported and fixed this problem in commit f2ebd477
      (bonding: restructure locking of bond_ab_arp_probe()). In Jay's
      opinion, the current solution is not ideal, because the notification
      is meant to indicate that the interface has actually changed state in a
      meaningful way, but these calls in the ab ARP monitor are internal settings
      of the flags that allow the ARP monitor to search for a slave to become
      active when there are no active slaves. Setting the flag to active or backup
      lets the ARP monitor's response logic do the right thing when deciding
      whether the test slave (current_arp_slave) is up or not.
      
      So the best fix is to not send a notification while the slave is in the
      testing state, and to check the state at the end of the monitor; if the
      slave's state has recovered, we avoid sending a pointless notification
      twice. RTNL is also a really big lock: holding it whenever
      current_active_slave is NULL, regardless of whether the slave's state
      changed, costs performance (every 100ms), so we should hold it only when
      the slave's state changed and a notification is needed.
      
      I revert the old commit and add new modifications.
      
      Cc: Jay Vosburgh <fubar@us.ibm.com>
      Cc: Veaceslav Falico <vfalico@redhat.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b0929915
    • bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for 802.3ad mode · 5e5b0665
      dingtianhong authored
      The problem was introduced by the commit 1d3ee88a
      (bonding: add netlink attributes to slave link dev).
      The bond_set_active_slave() and bond_set_backup_slave()
      will use rtmsg_ifinfo to send slave's states, so these
      two functions should be called in RTNL.
      
      In 802.3ad mode, acquiring RTNL for the __enable_port and
      __disable_port cases is difficult, as those calls generally
      already hold the state machine lock, and cannot unconditionally
      call rtnl_lock because either they already hold RTNL (for calls
      via bond_3ad_unbind_slave) or due to the potential for deadlock
      with bond_3ad_adapter_speed_changed, bond_3ad_adapter_duplex_changed,
      bond_3ad_link_change, or bond_3ad_update_lacp_rate.  All four of
      those are called with RTNL held, and acquire the state machine lock
      second.  The calling contexts for __enable_port and __disable_port
      already hold the state machine lock, and may or may not need RTNL.
      
      Following Jay's opinion, I don't think it is a problem if the slave
      does not send the notify message synchronously when its status
      changes. Normally the state machine runs every 100 ms, so sending
      the notify message at the end of the state machine, if the slave's
      state changed, should be better.
      
      I fix the problem through these steps:
      
      1). Add a new function bond_set_slave_state() which changes
          the slave's state and calls rtmsg_ifinfo() according to the input
          parameter called notify.
      
      2). Add a new slave parameter called should_notify. If the slave's state
          changed and no notification has been sent yet, the parameter is set to 1;
          if the slave's state then changes again, the parameter is set to 0, which
          indicates that the slave's state has been restored and there is no need
          to notify anyone.
      
      3). __enable_port and __disable_port should not call rtmsg_ifinfo
          under the state machine lock. Any change in the state of a slave
          sets a flag in the slave, indicating that rtmsg_ifinfo
          should be called at the end of the state machine.
      
      Cc: Jay Vosburgh <fubar@us.ibm.com>
      Cc: Veaceslav Falico <vfalico@redhat.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5e5b0665
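The three steps listed in 5e5b0665 amount to a deferred, self-cancelling notification flag. The sketch below models that idea outside the kernel: the Slave class, the `sent` list (standing in for rtmsg_ifinfo() calls) and flush_notifications() are illustrative assumptions, not the driver's actual code.

```python
BOND_STATE_ACTIVE = 0
BOND_STATE_BACKUP = 1


class Slave:
    """Toy slave with only the fields this sketch needs."""

    def __init__(self, name, state=BOND_STATE_BACKUP):
        self.name = name
        self.state = state
        self.should_notify = False


def set_slave_state(slave, new_state, notify_now, sent):
    """Change the state; notify immediately or remember that we owe one."""
    if slave.state == new_state:
        return
    slave.state = new_state
    if notify_now:
        sent.append((slave.name, new_state))  # stands in for rtmsg_ifinfo()
        slave.should_notify = False
    else:
        # A second deferred change restores the original state, so the
        # pending notification is cancelled instead of being sent twice.
        slave.should_notify = not slave.should_notify


def flush_notifications(slaves, sent):
    """End of the state machine run: the caller is assumed to hold RTNL here."""
    for slave in slaves:
        if slave.should_notify:
            sent.append((slave.name, slave.state))
            slave.should_notify = False


eth0, eth1 = Slave("eth0"), Slave("eth1")
sent = []
set_slave_state(eth0, BOND_STATE_ACTIVE, False, sent)  # real change
set_slave_state(eth1, BOND_STATE_ACTIVE, False, sent)  # flips ...
set_slave_state(eth1, BOND_STATE_BACKUP, False, sent)  # ... and flips back
flush_notifications([eth0, eth1], sent)
```

Only eth0 produces a notification; eth1 ended the run in the state it started in, so nothing is sent for it.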
    • MAINTAINERS: Intel nic drivers · bc90d291
      Joe Perches authored
      Add a new F: line for the intel subdirectories.
      
      This allows get_maintainers to avoid using git log
      and cc'ing people that have submitted clean-up style
      patches for all first level directories under
      drivers/net/ethernet/intel/
      
      This does not make e100.c maintained.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bc90d291
    • sfc: check for NULL efx->ptp_data in efx_ptp_event · 8f355e5c
      Edward Cree authored
      If we receive a PTP event from the NIC when we haven't set up PTP state
      in the driver, we attempt to read through a NULL pointer efx->ptp_data,
      triggering a panic.
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Acked-by: Shradha Shah <sshah@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8f355e5c
    • net: tcp: use NET_INC_STATS() · 9a9bfd03
      Eric Dumazet authored
      While LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES can only be incremented
      in tcp_transmit_skb() from softirq (incoming message or timer
      activation), it is better to use NET_INC_STATS() instead of
      NET_INC_STATS_BH() as tcp_transmit_skb() can be called from process
      context.
      
      This will avoid copy/paste confusion when/if we want to add
      other SNMP counters in tcp_transmit_skb()
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9a9bfd03
    • xfrm: Fix unlink race when policies are deleted. · 3a9016f9
      Steffen Klassert authored
      When a policy is unlinked from the lists in thread context,
      the xfrm timer can fire before we can mark this policy as dead.
      So reinitialize the bydst hlist, then hlist_unhashed() will
      notice that this policy is not linked and will avoid a
      double unlink of that policy.
      Reported-by: Xianpeng Zhao <673321875@qq.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      3a9016f9
  3. 25 Feb, 2014 13 commits
    • phy: unmask link partner capabilities · a4572e0c
      Cristian Bercaru authored
      Masking the link partner's capabilities with local capabilities can be
      misleading in autonegotiation scenarios such as PAUSE frame
      autonegotiation.
      This patch calculates the intersection of the local capabilities and the
      link partner capabilities when it determines the speed and duplex
      settings, but does not mask any of the link partner capabilities when
      it calculates PAUSE frame settings.
      Signed-off-by: Cristian Bercaru <cristian.bercaru@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a4572e0c
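The split described in a4572e0c, common capabilities for speed and duplex but the partner's raw word for PAUSE, can be sketched as follows. The bit values follow the standard MII link partner ability layout; the resolve() helper and its return shape are assumptions made for illustration and do not mirror the kernel function.

```python
# Link partner ability bits, as laid out in the MII LPA register.
LPA_10HALF = 0x0020
LPA_10FULL = 0x0040
LPA_100HALF = 0x0080
LPA_100FULL = 0x0100
LPA_PAUSE_CAP = 0x0400
LPA_PAUSE_ASYM = 0x0800

# Highest-priority mode first.
MODES = (
    (LPA_100FULL, 100, "full"),
    (LPA_100HALF, 100, "half"),
    (LPA_10FULL, 10, "full"),
    (LPA_10HALF, 10, "half"),
)


def resolve(local_adv, partner_adv):
    """Speed/duplex from the common modes; PAUSE from the unmasked partner word."""
    common = local_adv & partner_adv
    speed, duplex = None, None
    for bit, mode_speed, mode_duplex in MODES:
        if common & bit:
            speed, duplex = mode_speed, mode_duplex
            break
    return {
        "speed": speed,
        "duplex": duplex,
        "pause": bool(partner_adv & LPA_PAUSE_CAP),
        "asym_pause": bool(partner_adv & LPA_PAUSE_ASYM),
    }


local = LPA_100FULL | LPA_10FULL                          # no PAUSE bits advertised
partner = LPA_100FULL | LPA_100HALF | LPA_PAUSE_CAP
result = resolve(local, partner)
masked_pause = bool(local & partner & LPA_PAUSE_CAP)      # what masking would report
```

With masking, the partner's PAUSE capability would be hidden whenever the local side does not advertise it; reading the unmasked word keeps that information available.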
    • net: Fix permission check in netlink_connect() · 46833a86
      Mike Pecovnik authored
      netlink_sendmsg() was changed to prevent non-root processes from sending
      messages with dst_pid != 0.
      netlink_connect() however still only checks if nladdr->nl_groups is set.
      This patch modifies netlink_connect() to check for the same condition.
      Signed-off-by: Mike Pecovnik <mike.pecovnik@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      46833a86
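The change in 46833a86 is a one-condition fix, which a truth-table style sketch makes concrete. The function names and the may_send_nonroot parameter (standing in for whatever capability check the kernel performs) are hypothetical; only the shape of the condition is taken from the message above.

```python
def connect_allowed_old(nl_pid, nl_groups, may_send_nonroot):
    """Old check: only a non-zero group mask required the permission."""
    if nl_groups and not may_send_nonroot:
        return False  # the kernel would refuse with a permission error
    return True


def connect_allowed_new(nl_pid, nl_groups, may_send_nonroot):
    """New check: a non-zero destination pid requires the permission too."""
    if (nl_groups or nl_pid) and not may_send_nonroot:
        return False
    return True


# An unprivileged socket connecting to a specific pid, with no groups set.
old_result = connect_allowed_old(nl_pid=1234, nl_groups=0, may_send_nonroot=False)
new_result = connect_allowed_new(nl_pid=1234, nl_groups=0, may_send_nonroot=False)
```

Connecting to pid 0 (the kernel) with no groups stays allowed under both versions.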
    • net/cxgb4: use remove handler as shutdown handler · 687d705c
      Thadeu Lima de Souza Cascardo authored
      Without a shutdown handler, T4 cards behave very badly after a kexec.
      Some firmware calls return errors indicating allocation failures, for
      example. This is probably because those resources were not released by
      a BYE message to the firmware.
      
      Using the remove handler guarantees we will use a well tested path.
      
      With this patch applied, I managed to use kexec multiple times, and
      probe and iSCSI login worked every time.
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      687d705c
    • Merge branch 'qlcnic' · e88570f8
      David S. Miller authored
      Shahed Shaikh says:
      
      ====================
      qlcnic: Bug fixes
      
      This patch series includes the following bug fixes:
      
      * Fix for return value handling of function qlcnic_enable_msi_legacy().
      * Fix for the usage of module parameters for interrupt mode.
        Driver should use flags while checking for driver's interrupt mode instead of
        module parameters.
      * Revert commit 1414abea (qlcnic: Restrict VF from configuring any VLAN mode),
        in order to save some multicast filters.
      * Fix a bug where driver was not re-setting sds ring count to 1 when
        it falls back from MSI-x mode to legacy interrupt mode.
      
      Please apply to net.
      
      Change in v2 -
      Dropped patch "qlcnic: reset firmware API lock during driver load" for further rework.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e88570f8
    • qlcnic: Fix number of rings when we fall back from msix to legacy. · 42beb3f2
      Rajesh Borundia authored
      o Driver was not re-setting sds ring count to 1 after failing
         to allocate msi-x interrupts.
      Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      42beb3f2
    • qlcnic: Allow any VLAN to be configured from VF. · 46428228
      Sucheta Chakraborty authored
      o This patch reverts commit 1414abea
        (qlcnic: Restrict VF from configuring any VLAN mode.)
        This will allow the same multicast address to be used with any VLAN
        instead of programming separate (MAC, VLAN) tuples in the adapter.
        This will help save some multicast filters.
      Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      46428228
    • qlcnic: Fix usage of use_msi and use_msi_x module parameters · b7520d2b
      Shahed Shaikh authored
      Once interrupts are enabled, instead of using module parameters,
      use flags (QLCNIC_MSI_ENABLED and QLCNIC_MSIX_ENABLED) set by driver
      to check interrupt mode.
      Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b7520d2b
    • qlcnic: Fix function return error check · fc49beae
      Shahed Shaikh authored
      Driver was treating a negative return value as success in case of
      qlcnic_enable_msi_legacy() failure
      Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fc49beae
    • qeth: postpone freeing of qdio memory · 22ae2790
      Ursula Braun authored
      To guarantee that a qdio ccw_device no longer touches the
      qdio memory shared with Linux, the qdio ccw_device should
      be offline when freeing the qdio memory. Thus this patch
      postpones freeing of qdio memory.
      Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
      Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
      Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      22ae2790
    • ipv4: ipv6: better estimate tunnel header cut for correct ufo handling · 91a48a2e
      Hannes Frederic Sowa authored
      Currently the UFO fragmentation process does not correctly handle inner
      UDP frames.
      
      (The following tcpdumps are captured on the parent interface with ufo
      disabled while tunnel has ufo enabled, 2000 bytes payload, mtu 1280,
      both sit device):
      
      IPv6:
      16:39:10.031613 IP (tos 0x0, ttl 64, id 3208, offset 0, flags [DF], proto IPv6 (41), length 1300)
          192.168.122.151 > 1.1.1.1: IP6 (hlim 64, next-header Fragment (44) payload length: 1240) 2001::1 > 2001::8: frag (0x00000001:0|1232) 44883 > distinct: UDP, length 2000
      16:39:10.031709 IP (tos 0x0, ttl 64, id 3209, offset 0, flags [DF], proto IPv6 (41), length 844)
          192.168.122.151 > 1.1.1.1: IP6 (hlim 64, next-header Fragment (44) payload length: 784) 2001::1 > 2001::8: frag (0x00000001:0|776) 58979 > 46366: UDP, length 5471
      
      We can see that fragmentation header offset is not correctly updated.
      (fragmentation id handling is corrected by 916e4cf4 ("ipv6: reuse
      ip6_frag_id from ip6_ufo_append_data")).
      
      IPv4:
      16:39:57.737761 IP (tos 0x0, ttl 64, id 3209, offset 0, flags [DF], proto IPIP (4), length 1296)
          192.168.122.151 > 1.1.1.1: IP (tos 0x0, ttl 64, id 57034, offset 0, flags [none], proto UDP (17), length 1276)
          192.168.99.1.35961 > 192.168.99.2.distinct: UDP, length 2000
      16:39:57.738028 IP (tos 0x0, ttl 64, id 3210, offset 0, flags [DF], proto IPIP (4), length 792)
          192.168.122.151 > 1.1.1.1: IP (tos 0x0, ttl 64, id 57035, offset 0, flags [none], proto UDP (17), length 772)
          192.168.99.1.13531 > 192.168.99.2.20653: UDP, length 51109
      
      In this case fragmentation id is incremented and offset is not updated.
      
      First, I aligned inet_gso_segment and ipv6_gso_segment:
      * align naming of flags
      * ipv6_gso_segment: setting skb->encapsulation is unnecessary, as we
        always ensure that the state of this flag is left untouched when
        returning from upper gso segmentation function
      * ipv6_gso_segment: move skb_reset_inner_headers below updating the
        fragmentation header data, we don't care for updating fragmentation
        header data
      * remove currently unneeded comment indicating skb->encapsulation might
        get changed by upper gso_segment callback (gre and udp-tunnel reset
        encapsulation after segmentation on each fragment)
      
      If we encounter an IPIP or SIT gso skb we now check for the protocol ==
      IPPROTO_UDP and that we at least have already traversed another ip(6)
      protocol header.
      
      The reason why we have to special case GSO_IPIP and GSO_SIT is that
      we reset skb->encapsulation to 0 while running skb_mac_gso_segment() on
      the inner protocol of GSO_UDP_TUNNEL or GSO_GRE packets.
      Reported-by: Wolfgang Walter <linux@stwm.de>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      91a48a2e
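The lengths in the captures of 91a48a2e are consistent with a 2008-byte datagram (2000 bytes of payload plus the 8-byte UDP header) split at the tunnel MTU of 1280, so the offsets a correct segmentation should carry can be worked out. A sketch of that arithmetic, assuming plain 40-byte IPv6 and 20-byte IPv4 headers and an 8-byte IPv6 fragment header; the helper name is hypothetical.

```python
def fragment(datagram_len, mtu, header_len):
    """Split a datagram into (offset, length) pieces, 8-byte aligned."""
    max_chunk = (mtu - header_len) & ~7  # fragment data must be a multiple of 8
    pieces = []
    offset = 0
    while offset < datagram_len:
        length = min(max_chunk, datagram_len - offset)
        pieces.append((offset, length))
        offset += length
    return pieces


UDP_HEADER = 8
datagram = 2000 + UDP_HEADER

# IPv6: 40-byte header plus 8-byte fragment header per packet.
ipv6_pieces = fragment(datagram, mtu=1280, header_len=40 + 8)
# IPv4: 20-byte header, fragmentation fields live in the header itself.
ipv4_pieces = fragment(datagram, mtu=1280, header_len=20)
```

The piece lengths (1232 and 776 for IPv6, 1256 and 752 for IPv4) match the captures, but the second piece should carry offset 1232 or 1256 rather than the 0 shown in both dumps.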
    • net,IB/mlx: Bump all Mellanox driver versions · 169a1d85
      Amir Vadai authored
      Bump all Mellanox driver versions.
      Signed-off-by: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      169a1d85
    • r8169: initialize rtl8169_stats seqlock · 340fea3d
      Kyle McMartin authored
      Boris reports he's seeing:
      > [    9.195943] INFO: trying to register non-static key.
      > [    9.196031] the code is fine but needs lockdep annotation.
      > [    9.196031] turning off the locking correctness validator.
      > [    9.196031] CPU: 1 PID: 933 Comm: modprobe Not tainted 3.14.0-rc4+ #1
      with the r8169 driver.
      
      These are occurring because the seqcount embedded in u64_stats_sync on
      32-bit SMP is uninitialized, which is making lockdep unhappy.
      Signed-off-by: Kyle McMartin <kyle@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      340fea3d
    • tcp: reduce the bloat caused by tcp_is_cwnd_limited() · d10473d4
      Eric Dumazet authored
      tcp_is_cwnd_limited() allows GSO/TSO enabled flows to increase
      their cwnd to allow a full size (64KB) TSO packet to be sent.
      
      Non GSO flows only allow an extra room of 3 MSS.
      
      For most flows with a BDP below 10 MSS, this bloats cwnd up to 90
      and inflates the RTT.
      
      Thanks to TSO auto sizing, we can restrict the bloat to the number
      of MSS contained in a TSO packet (tp->xmit_size_goal_segs), to keep
      original intent without performance impact.
      
      Because we keep cwnd small, it helps to keep TSO packet size to their
      optimal value.
      
      Example for a 10Mbit flow, with low TCP Small queue limits (no more than
      2 skb in qdisc/device tx ring)
      
      Before patch :
      
      lpk51:~# ./ss -i dst lpk52:44862 | grep cwnd
               cubic wscale:6,6 rto:215 rtt:15.875/2.5 mss:1448 cwnd:96
      ssthresh:96
      send 70.1Mbps unacked:14 rcv_space:29200
      
      After patch :
      
      lpk51:~# ./ss -i dst lpk52:52916 | grep cwnd
               cubic wscale:6,6 rto:206 rtt:5.206/0.036 mss:1448 cwnd:15
      ssthresh:14
      send 33.4Mbps unacked:4 rcv_space:29200
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Nandita Dukkipati <nanditad@google.com>
      Cc: Van Jacobson <vanj@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d10473d4
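The send figures in the two ss samples of d10473d4 follow from the other fields, which makes the effect of the smaller cwnd easy to check. A small sketch, assuming ss derives send as cwnd * mss * 8 / rtt; the function name is made up for illustration.

```python
def ss_send_rate_mbps(cwnd, mss, rtt_ms):
    """Window-limited rate estimate: one cwnd of MSS-sized segments per RTT."""
    bits_per_window = cwnd * mss * 8
    return bits_per_window / (rtt_ms / 1000.0) / 1e6


before = ss_send_rate_mbps(cwnd=96, mss=1448, rtt_ms=15.875)
after = ss_send_rate_mbps(cwnd=15, mss=1448, rtt_ms=5.206)
```

Both estimates exceed the 10Mbit rate of the flow, so the figure is an upper bound implied by the window rather than a measured throughput. The RTT dropping from about 15.9 ms to about 5.2 ms is the visible gain.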
  4. 24 Feb, 2014 2 commits
  5. 22 Feb, 2014 7 commits
    • ipv6: reuse ip6_frag_id from ip6_ufo_append_data · 916e4cf4
      Hannes Frederic Sowa authored
      Currently we generate a new fragmentation id on UFO segmentation. It
      is pretty hairy to identify the correct net namespace and dst there.
      Especially tunnels use IFF_XMIT_DST_RELEASE and thus have no skb_dst
      available at all.
      
      This causes unreliable or very predictable ipv6 fragmentation id
      generation during segmentation.
      
      Luckily we already have pregenerated the ip6_frag_id in
      ip6_ufo_append_data and can use it here.
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      916e4cf4
    • net: sctp: rework multihoming retransmission path selection to rfc4960 · 4c47af4d
      Daniel Borkmann authored
      Problem statement: 1) both paths (primary path1 and alternate
      path2) are up after the association has been established i.e.,
      HB packets are normally exchanged, 2) path2 gets inactive after
      path_max_retrans * max_rto timed out (i.e. path2 is down completely),
      3) now, if a transmission times out on the only surviving/active
      path1 (any ~1sec network service impact could cause this like
      a channel bonding failover), then the retransmitted packets are
      sent over the inactive path2; this happens with partial failover
      and without it.
      
      Besides not being optimal in the above scenario, a small failure
      or timeout in the only existing path has the potential to cause
      long delays in the retransmission (depending on RTO_MAX) until
      the still active path is reselected. Further, when the T3-timeout
      occurs, we have active_path == retrans_path, and even though the
      timeout occurred on the initial transmission of data, not a
      retransmit, we end up updating retransmit path.
      
      RFC4960, section 6.4. "Multi-Homed SCTP Endpoints" states under
      6.4.1. "Failover from an Inactive Destination Address" the
      following:
      
        Some of the transport addresses of a multi-homed SCTP endpoint
        may become inactive due to either the occurrence of certain
        error conditions (see Section 8.2) or adjustments from the
        SCTP user.
      
        When there is outbound data to send and the primary path
        becomes inactive (e.g., due to failures), or where the SCTP
        user explicitly requests to send data to an inactive
        destination transport address, before reporting an error to
        its ULP, the SCTP endpoint should try to send the data to an
        alternate __active__ destination transport address if one
        exists.
      
        When retransmitting data that timed out, if the endpoint is
        multihomed, it should consider each source-destination address
        pair in its retransmission selection policy. When retransmitting
        timed-out data, the endpoint should attempt to pick the most
        divergent source-destination pair from the original
        source-destination pair to which the packet was transmitted.
      
        Note: Rules for picking the most divergent source-destination
        pair are an implementation decision and are not specified
        within this document.
      
      So, if we cannot find an alternative active transport, we should
      first reconsider taking the current active retransmission
      transport. If all of that fails, we can still round robin
      through unknown, partial failover, and inactive ones in the
      hope of finding something still suitable.
      
      Commit 4141ddc0 ("sctp: retran_path update bug fix") broke
      that behaviour by selecting the next inactive transport when
      no other active transport was found besides the current assoc's
      peer.retran_path. Before commit 4141ddc0, we would have
      traversed through the list until we reach our peer.retran_path
      again, and in case that is still in state SCTP_ACTIVE, we would
      take it and return. Only if that is not the case either, we
      take the next inactive transport.
      
      Besides all that, another issue is that transports in state
      SCTP_UNKNOWN could be preferred over transports in state
      SCTP_ACTIVE in case a SCTP_ACTIVE transport appears after
      SCTP_UNKNOWN in the transport list yielding a weaker transport
      state to be used in retransmission.
      
      This patch mostly reverts 4141ddc0, but also rewrites
      this function to introduce more clarity and strictness into
      the code. A strict priority of transport states is enforced
      in this patch, hence selection is active > unknown > partial
      failover > inactive.
      
      Fixes: 4141ddc0 ("sctp: retran_path update bug fix")
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
      Acked-by: Vlad Yasevich <yasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4c47af4d
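The selection policy described in 4c47af4d (prefer an alternative active transport, else stay on the current one if it is active, else fall back by strict state priority) can be sketched as a round robin over a list of states. This is an illustrative model only: the state names, the RANK table and select_retran_path() are assumptions and not the kernel's implementation.

```python
ACTIVE, UNKNOWN, PF, INACTIVE = "active", "unknown", "pf", "inactive"

# Strict priority from the commit message: active > unknown > pf > inactive.
RANK = {ACTIVE: 3, UNKNOWN: 2, PF: 1, INACTIVE: 0}


def select_retran_path(states, current):
    """Return the index of the transport to retransmit on.

    Walks from the successor of `current` around the list and back to
    `current`, so an alternative active transport wins over the current
    one, and the current one is only reconsidered last.
    """
    n = len(states)
    order = [(current + step) % n for step in range(1, n + 1)]
    best = order[0]
    for idx in order:
        if RANK[states[idx]] > RANK[states[best]]:
            best = idx
        if states[best] == ACTIVE:
            return best  # active is good enough, stop searching
    return best


# Scenario from the problem statement: path1 active, path2 inactive.
stay = select_retran_path([ACTIVE, INACTIVE], current=0)
# An active transport listed after an unknown one must still win.
prefer_active = select_retran_path([INACTIVE, UNKNOWN, ACTIVE], current=0)
# No active transport at all: fall back by priority.
fallback = select_retran_path([INACTIVE, PF, UNKNOWN], current=0)
```

In the first scenario the selection stays on path1 instead of moving retransmissions to the inactive path2, which is the behaviour the message says was lost.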
    • neigh: fix setting of default gc_* values · b194c1f1
      Jiri Pirko authored
      This patch fixes a bug introduced by:
      commit 1d4c8c29
      "neigh: restore old behaviour of default parms values"
      
      The thing is that in neigh_sysctl_register, extra1 and extra2 which were
      previously set for NEIGH_VAR_GC_* are overwritten. That leads to
      nonsense int limits for gc_* variables. So fix this by not touching
      extra* fields for gc_* variables.
      Signed-off-by: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b194c1f1
    • net-tcp: fastopen: fix high order allocations · f5ddcbbb
      Eric Dumazet authored
      This patch fixes two bugs in fastopen :
      
      1) The tcp_sendmsg(...,  @size) argument was ignored.
      
         Code was relying on user not fooling the kernel with iovec mismatches
      
      2) When MTU is about 64KB, tcp_send_syn_data() attempts order-5
      allocations, which are likely to fail when memory gets fragmented.
      
      Fixes: 783237e8 ("net-tcp: Fast Open client - sending SYN-data")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Tested-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f5ddcbbb
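The second bug in f5ddcbbb mentions order-5 allocations for an MTU of about 64KB. The page order arithmetic behind that can be sketched as below, assuming 4 KiB pages; the 320-byte overhead is a made-up illustrative figure for per-buffer metadata, not a value taken from the kernel.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def get_order(size):
    """Smallest order such that (PAGE_SIZE << order) >= size."""
    pages = -(-size // PAGE_SIZE)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


exact_64k = get_order(64 * 1024)
with_overhead = get_order(64 * 1024 + 320)  # hypothetical metadata overhead
```

A buffer of exactly 64 KiB fits in 16 contiguous pages (order 4), but anything slightly larger rounds up to 32 contiguous pages (order 5), which is much harder to find once memory is fragmented.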
    • Merge branch 'tipc' · 68ad785c
      David S. Miller authored
      Ying Xue says:
      
      ====================
      tipc: clean up components initialization code
      
      In this series, we fix a regression introduced by commit
      6e967adf (tipc: relocate common functions from media to bearer).
      But before the issue is fixed, we first adjust the component
      initialization process so as to remove all enabled flags from the
      relevant tipc components. Otherwise, without the change, we would also
      have to add an extra enabled flag to the bearer layer indicating
      whether bearer setup is finished or not.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      68ad785c
    • tipc: make bearer set up in module insertion stage · 970122fd
      Ying Xue authored
      Commit 6e967adf (tipc: relocate common functions from media to
      bearer) accidentally introduced a side effect. The tipc stack
      handler for receiving packets from netdevices and the netdevice
      notification handler are now registered when a bearer is enabled rather
      than at the tipc module initialization stage, but the two handlers are
      both unregistered in the tipc module exit phase. If the tipc module is
      inserted and then immediately removed, the following warning
      message will appear:
      
      "dev_remove_pack: ffffffffa0380940 not found"
      
      This is because in module insertion stage tipc stack packet handler
      is not registered at all, but in module exit phase dev_remove_pack()
      needs to remove it. Of course, dev_remove_pack() cannot find tipc
      protocol handler from the kernel protocol handler list so that the
      warning message is printed out.
      
      If registration of the two handlers is moved from the bearer enabling
      phase to the module insertion stage, the warning message is
      eliminated. Due to this change, tipc_core_start_net() and
      tipc_core_stop_net() can be deleted as well.
      Reported-by: Wang Weidong <wangweidong1@huawei.com>
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      970122fd
    • tipc: remove all enabled flags from all tipc components · 9fe7ed47
      Ying Xue authored
      When the tipc module is inserted, many tipc components are initialized
      one by one. If one of them fails during initialization,
      tipc_core_stop() is called to stop all components, whether or not
      the corresponding components were created. To avoid releasing
      uncreated ones, the relevant components had to add enabled flags
      indicating whether they were created or not.
      
      But if a component fails to be created in the initialization stage,
      we now destroy only the components successfully created before
      the failed one, instead of all components. All enabled flags
      defined in components, in turn, become redundant. Additionally, it is
      also unnecessary to check whether table.types is NULL in
      tipc_nametbl_stop(), because the name table has definitely been created
      successfully by the time tipc_nametbl_stop() is called.
      
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9fe7ed47
  6. 20 Feb, 2014 4 commits
    • Matija Glavinic Pecotic's avatar
      net: sctp: Potentially-Failed state should not be reached from unconfirmed state · 7cce3b75
      Matija Glavinic Pecotic authored
      In the current implementation it is possible to reach the PF state from the
      unconfirmed state. We can interpret sctp-failover-02 to mean that the PF state
      is meant to be reached only from the active state; after all, that is when
      entering the PF state makes sense. Here are a few quotes from sctp-failover-02,
      but regardless of these, the same understanding can be reached from the whole of section 5:
      
      Section 5.1, quickfailover guide:
          "The PF state is an intermediate state between Active and Failed states."
      
          "Each time the T3-rtx timer expires on an active or idle
          destination, the error counter of that destination address will
          be incremented.  When the value in the error counter exceeds
          PFMR, the endpoint should mark the destination transport address as PF."
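
      The rule quoted above, restricted to the active state, can be sketched
      as below. This is a simplified illustration under stated assumptions:
      the enum, the struct and path_strike() are hypothetical names, not the
      kernel's data structures, and the transition to the failed state on
      Path.Max.Retrans is left out.

```c
/* Hypothetical, simplified path states. */
enum path_state { PATH_UNCONFIRMED, PATH_ACTIVE, PATH_PF, PATH_INACTIVE };

struct path {
	enum path_state state;
	unsigned int error_count;
};

/*
 * Called for each T3-rtx expiry or missed heartbeat on the path.
 * pf_retrans plays the role of PFMR. Only an active path may enter PF;
 * an unconfirmed path keeps its state while it is being probed.
 */
void path_strike(struct path *p, unsigned int pf_retrans)
{
	p->error_count++;
	if (p->state == PATH_ACTIVE && p->error_count > pf_retrans)
		p->state = PATH_PF;
}
```

      With pf_retrans set to 0, a single strike moves an active path to PF
      and leaves an unconfirmed path unconfirmed.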
      
      There are several concrete reasons for this interpretation. To start with, RFC 4960
      does not take the quickfailover algorithm into account. Therefore, quickfailover
      must comply with RFC 4960. The point where this compliance can be questioned is the
      following behavior:
      When PF is entered, the association overall error counter is incremented for each
      missed HB. This contradicts RFC 4960: an address in the unconfirmed state is
      subject to probing, and while it is being probed, it should not increment the
      association overall error counter. As a consequence, we might end up
      dropping the association due to path failure on an unconfirmed
      address if we have a misconfiguration of the form:
      Association.Max.Retrans == Path.Max.Retrans.
      
      Another reason is that entering PF from unconfirmed causes the loss of the address
      confirmed event when (if) the address is eventually confirmed. This is fine from the
      failover guide's point of view, but it is not consistent with the behavior preceding
      the failover implementation or with the recommendation from RFC 4960:
      
      5.4.  Path Verification
         Whenever a path is confirmed, an indication MAY be given to the upper
         layer.
      Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7cce3b75
    • Duan Fugang-B38611's avatar
      net: fec: fix potential issue to avoid fec interrupt lost and crc error · fb8ef788
      Duan Fugang-B38611 authored
      The current flow: set TX BD ready, and then set the "INT" and "PINS" bits to
      enable tx interrupt generation and crc checksum.
      
      This has a potential issue, for example:
      CPU			fec uDMA
      Set tx ready bit
      			uDMA start the BD transmission
      Set "INT" bit
      Set "PINS" bit
      ...
      
      In the above situation the fec tx interrupt is lost and the fec MAC does not
      do the CRC checksum. This patch fixes the potential issue.
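
      The ordering that avoids this race can be sketched as below: write the
      option bits first and hand the descriptor to the hardware last. This is
      an illustration only. The struct layout, the bit values and
      tx_bd_submit() are hypothetical, and the C11 fence stands in for
      whatever write barrier a real driver would use.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit values, not the real FEC descriptor layout. */
#define BD_READY 0x8000u
#define BD_INT   0x4000u
#define BD_PINS  0x2000u

struct tx_bd {
	volatile uint16_t status; /* polled by the DMA engine */
	volatile uint16_t ctrl;   /* per-frame options */
};

void tx_bd_submit(struct tx_bd *bd)
{
	/* Set the options while the CPU still owns the descriptor. */
	bd->ctrl |= BD_INT | BD_PINS;

	/* Order the option write before the ownership handover. */
	atomic_thread_fence(memory_order_release);

	/* Only now let the DMA engine start on this descriptor. */
	bd->status |= BD_READY;
}
```

      Because the ready bit is written last, the DMA engine cannot start the
      transmission before the "INT" and "PINS" bits are in place.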
      Signed-off-by: Fugang Duan <B38611@freescale.com>
      Acked-by: Frank Li <Frank.li@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fb8ef788
    • Nicolas Dichtel's avatar
      sit: fix panic with route cache in ip tunnels · cf71d2bc
      Nicolas Dichtel authored
      Bug introduced by commit 7d442fab ("ipv4: Cache dst in tunnels").
      
      Because sit code does not call ip_tunnel_init(), the dst_cache was not
      initialized.
      
      CC: Tom Herbert <therbert@google.com>
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cf71d2bc
    • Steffen Klassert's avatar
      xfrm: Clone states properly on migration · ee5c2317
      Steffen Klassert authored
      We lose a lot of information from the original state if we
      clone it with xfrm_state_clone(). In particular, there is
      no crypto algorithm attached if the original state uses
      an aead algorithm. This patch adds the missing information
      to the cloned state.
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      ee5c2317