1. 08 Feb, 2015 39 commits
    • Merge branch 'tcp_ack_loops' · f06535c5
      David S. Miller authored
      Neal Cardwell says:
      
      ====================
      tcp: mitigate TCP ACK loops due to out-of-window validation dupacks
      
      This patch series mitigates "ack loop" DoS scenarios by rate-limiting
      outgoing duplicate ACKs sent in response to incoming "out of window"
      segments.
      
      Background
      -----------
      
      There are several cases in which the TCP RFCs specify that a TCP
      endpoint should send a pure duplicate ACK in response to a pure
      duplicate ACK that appears to be invalid due to being "out of window":
      
      (1) RFC 793 (section 3.9, page 69) specifies that endpoints should
          send a duplicate ACK in response to an ACK when the incoming
          sequence number is invalid due to being outside the receive
          window: "If an incoming segment is not acceptable, an
          acknowledgment should be sent in reply".
      
      (2) RFC 793 (section 3.9, page 72) says: "If the ACK acknowledges
          something not yet sent (SEG.ACK > SND.NXT) then send an ACK".
      
      (3) RFC 1323 (section 4.2.1, page 18) specifies that endpoints should
          send a duplicate ACK in response to an ACK when the PAWS check for
          the incoming timestamp value fails: "If .... SEG.TSval < TS.Recent
          and if TS.Recent is valid ... Send an acknowledgement in reply"
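
      As a rough illustration of the three checks above, here is a minimal
      userspace C sketch. The function names are hypothetical and are not
      taken from the kernel, and case (1) is simplified to a zero-length
      segment arriving at a nonzero receive window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a < b" for 32-bit sequence numbers and timestamps.
 * Relies on the usual two's-complement conversion to int32_t. */
static bool seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Case (1), simplified: a zero-length segment is acceptable when
 * RCV.NXT <= SEG.SEQ < RCV.NXT + RCV.WND (window assumed nonzero). */
static bool seq_in_window(uint32_t seg_seq, uint32_t rcv_nxt, uint32_t rcv_wnd)
{
    assert(rcv_wnd > 0);
    return !seq_before(seg_seq, rcv_nxt) &&
           seq_before(seg_seq, rcv_nxt + rcv_wnd);
}

/* Case (2): the ACK covers something not yet sent (SEG.ACK > SND.NXT). */
static bool ack_beyond_snd_nxt(uint32_t seg_ack, uint32_t snd_nxt)
{
    return seq_before(snd_nxt, seg_ack);
}

/* Case (3): PAWS failure, SEG.TSval < TS.Recent while TS.Recent is valid. */
static bool paws_fails(uint32_t seg_tsval, uint32_t ts_recent, bool ts_recent_valid)
{
    return ts_recent_valid && seq_before(seg_tsval, ts_recent);
}
```

      Each predicate that reports an invalid segment corresponds to one of
      the situations in which the RFCs ask for a duplicate ACK in reply.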
      
      The problem
      ------------
      
      Normally, this is not a problem. However, a buggy middlebox or
      malicious man-in-the-middle can inject a few packets into the
      conversation that advance each endpoint's notion of the current window
      (sequence, ACK, or timestamp), without either side noticing. In this
      case, from then on each side can think the other is sending invalid
      segments. Thus an infinite feedback loop of duplicate ACKs can ensue,
      as each endpoint receives a duplicate ACK, decides that it is invalid
      (due to sequence number, ACK number, or timestamp), and then sends a
      dupack in reply, which the other side decides is invalid, responding
      with a dupack... ad infinitum. This ping-pong feedback loop can happen
      at a very high rate.
      
      This phenomenon can and does happen in practice. It has been seen in
      datacenter and Internet contexts at Google, and has been documented by
      Anil Agarwal in the Nov 2013 tcpm thread "TCP mismatched sequence
      numbers issue", and Avery Fay in the Feb 2015 Linux netdev thread
      "Invalid timestamp? causing tight ack loop (hundreds of thousands of
      packets / sec)".
      
      This patch series
      ------------------
      
      This patch series mitigates such ack loops by rate-limiting outgoing
      duplicate ACKs sent in response to incoming TCP packets that are for
      an existing connection but that are invalid due to any of the reasons
      mentioned above: sequence number (1), ACK field (2), or timestamp
      value (3). The rate limit for such duplicate ACKs is specified by a
      new sysctl, tcp_invalid_ratelimit, which specifies the minimal space
      between such outbound duplicate ACKs, in milliseconds. The default is
      500 (500ms), and 0 disables the mechanism.
      
      We rate-limit these duplicate ACK responses rather than blocking them
      entirely or resetting the connection, because legitimate connections
      can rely on dupacks in response to some out-of-window segments. For
      example, zero window probes are typically sent with a sequence number
      that is below the current window, and ZWPs thus expect to elicit a
      dupack in response.
      
      Testing: this approach has been in use at Google for a while.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: mitigate ACK loops for connections as tcp_timewait_sock · 4fb17a60
      Neal Cardwell authored
      Ensure that in state FIN_WAIT2 or TIME_WAIT, where the connection is
      represented by a tcp_timewait_sock, we rate limit dupacks in response
      to incoming packets (a) with TCP timestamps that fail PAWS checks, or
      (b) with sequence numbers that are out of the acceptable window.
      
      We do not send a dupack in response to out-of-window packets if it has
      been less than sysctl_tcp_invalid_ratelimit (default 500ms) since we
      last sent a dupack in response to an out-of-window packet.
      Reported-by: Avery Fay <avery@mixpanel.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: mitigate ACK loops for connections as tcp_sock · f2b2c582
      Neal Cardwell authored
      Ensure that in state ESTABLISHED, where the connection is represented
      by a tcp_sock, we rate limit dupacks in response to incoming packets
      (a) with TCP timestamps that fail PAWS checks, or (b) with sequence
      numbers or ACK numbers that are out of the acceptable window.
      
      We do not send a dupack in response to out-of-window packets if it has
      been less than sysctl_tcp_invalid_ratelimit (default 500ms) since we
      last sent a dupack in response to an out-of-window packet.
      
      There is already a similar (although global) rate-limiting mechanism
      for "challenge ACKs". When deciding whether to send a challenge ACK,
      we first consult the new per-connection rate limit, and then the
      global rate limit.
      Reported-by: Avery Fay <avery@mixpanel.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: mitigate ACK loops for connections as tcp_request_sock · a9b2c06d
      Neal Cardwell authored
      In the SYN_RECV state, where the TCP connection is represented by
      tcp_request_sock, we now rate-limit SYNACKs in response to a client's
      retransmitted SYNs: we do not send a SYNACK in response to client SYN
      if it has been less than sysctl_tcp_invalid_ratelimit (default 500ms)
      since we last sent a SYNACK in response to a client's retransmitted
      SYN.
      
      This allows the vast majority of legitimate client connections to
      proceed unimpeded, even for the most aggressive platforms, iOS and
      MacOS, which retransmit SYNs at 1-second intervals several times in
      a row. They use SYN RTO timeouts following the progression:
      1,1,1,1,1,2,4,8,16,32.
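
      To check the claim against the progression above, here is a small C
      sketch that counts how many retransmitted SYNs would still be answered
      under a given limit. The function name is hypothetical and the model
      is simplified: it ignores network delay and treats the first
      retransmitted SYN as always answered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* gaps_s[i] is the delay in seconds before the i-th retransmitted SYN.
 * Returns how many of those retransmissions get a SYNACK when replies to
 * retransmitted SYNs must be at least ratelimit_ms apart (0 = no limit). */
static size_t synacks_for_retransmits(const uint32_t *gaps_s, size_t n,
                                      uint32_t ratelimit_ms)
{
    uint64_t now_ms = 0, last_ms = 0;
    int replied = 0;
    size_t sent = 0;

    assert(gaps_s != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        now_ms += (uint64_t)gaps_s[i] * 1000;
        if (replied && ratelimit_ms != 0 && now_ms - last_ms < ratelimit_ms)
            continue;               /* too soon: this SYN gets no SYNACK */
        replied = 1;
        last_ms = now_ms;
        sent++;
    }
    return sent;
}
```

      With the 500 ms default, every one of the ten retransmissions in the
      progression is at least 1 second after the previous one, so none is
      suppressed in this model.
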
      Reported-by: Avery Fay <avery@mixpanel.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: helpers to mitigate ACK loops by rate-limiting out-of-window dupacks · 032ee423
      Neal Cardwell authored
      Helpers for mitigating ACK loops by rate-limiting dupacks sent in
      response to incoming out-of-window packets.
      
      This patch includes:
      
      - rate-limiting logic
      - sysctl to control how often we allow dupacks to out-of-window packets
      - SNMP counter for cases where we rate-limited our dupack sending
      
      The rate-limiting logic in this patch decides to not send dupacks in
      response to out-of-window segments if (a) they are SYNs or pure ACKs
      and (b) the remote endpoint is sending them faster than the configured
      rate limit.
      
      We rate-limit our responses rather than blocking them entirely or
      resetting the connection, because legitimate connections can rely on
      dupacks in response to some out-of-window segments. For example, zero
      window probes are typically sent with a sequence number that is below
      the current window, and ZWPs thus expect to elicit a dupack in
      response.
      
      We allow dupacks in response to TCP segments with data, because these
      may be spurious retransmissions for which the remote endpoint wants to
      receive DSACKs. This is safe because segments with data can't
      realistically be part of ACK loops, which by their nature consist of
      each side sending pure/data-less ACKs to each other.
      
      The dupack interval is controlled by a new sysctl knob,
      tcp_invalid_ratelimit, given in milliseconds, in case an administrator
      needs to dial this upward in the face of a high-rate DoS attack. The
      name and units are chosen to be analogous to the existing knob for
      ICMP, icmp_ratelimit.
      
      The default value for tcp_invalid_ratelimit is 500ms, which allows at
      most one such dupack per 500ms. This is chosen to be 2x faster than
      the 1-second minimum RTO interval allowed by RFC 6298 (section 2, rule
      2.4). We allow the extra 2x factor because network delay variations
      can cause packets sent at 1 second intervals to be compressed and
      arrive much closer.
      Reported-by: Avery Fay <avery@mixpanel.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • openvswitch: Initialize unmasked key and uid len · ca539345
      Pravin B Shelar authored
      Flow alloc needs to initialize the unmasked key pointer. Otherwise
      it can crash the kernel trying to free a random unmasked-key pointer.
      
      general protection fault: 0000 [#1] SMP
      3.19.0-rc6-net-next+ #457
      Hardware name: Supermicro X7DWU/X7DWU, BIOS  1.1 04/30/2008
      RIP: 0010:[<ffffffff8111df0e>] [<ffffffff8111df0e>] kfree+0xac/0x196
      Call Trace:
       [<ffffffffa060bd87>] flow_free+0x21/0x59 [openvswitch]
       [<ffffffffa060bde0>] ovs_flow_free+0x21/0x23 [openvswitch]
       [<ffffffffa0605b4a>] ovs_packet_cmd_execute+0x2f3/0x35f [openvswitch]
       [<ffffffffa0605995>] ? ovs_packet_cmd_execute+0x13e/0x35f [openvswitch]
       [<ffffffff811fe6fb>] ? nla_parse+0x4f/0xec
       [<ffffffff8139a2fc>] genl_family_rcv_msg+0x26d/0x2c9
       [<ffffffff8107620f>] ? __lock_acquire+0x90e/0x9aa
       [<ffffffff8139a3be>] genl_rcv_msg+0x66/0x89
       [<ffffffff8139a358>] ? genl_family_rcv_msg+0x2c9/0x2c9
       [<ffffffff81399591>] netlink_rcv_skb+0x3e/0x95
       [<ffffffff81399898>] ? genl_rcv+0x18/0x37
       [<ffffffff813998a7>] genl_rcv+0x27/0x37
       [<ffffffff81399033>] netlink_unicast+0x103/0x191
       [<ffffffff81399382>] netlink_sendmsg+0x2c1/0x310
       [<ffffffff811007ad>] ? might_fault+0x50/0xa0
       [<ffffffff8135c773>] do_sock_sendmsg+0x5f/0x7a
       [<ffffffff8135c799>] sock_sendmsg+0xb/0xd
       [<ffffffff8135cacf>] ___sys_sendmsg+0x1a3/0x218
       [<ffffffff8113e54b>] ? get_close_on_exec+0x86/0x86
       [<ffffffff8115a9d0>] ? fsnotify+0x32c/0x348
       [<ffffffff8115a720>] ? fsnotify+0x7c/0x348
       [<ffffffff8113e5f5>] ? __fget+0xaa/0xbf
       [<ffffffff8113e54b>] ? get_close_on_exec+0x86/0x86
       [<ffffffff8135cccd>] __sys_sendmsg+0x3d/0x5e
       [<ffffffff8135cd02>] SyS_sendmsg+0x14/0x16
       [<ffffffff81411852>] system_call_fastpath+0x12/0x17
      
      Fixes: 74ed7ab9("openvswitch: Add support for unique flow IDs.")
      CC: Joe Stringer <joestringer@nicira.com>
      Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'cxgb4' · 34afb4eb
      David S. Miller authored
      Hariprasad Shenai says:
      
      ====================
      Add support to dump some hw debug info
      
      This patch series adds support to dump sensor info, dump Transport Processor
      event trace, dump Upper Layer Protocol RX module command trace, dump mailbox
      contents and dump Transport Processor congestion control configuration.
      
      Will send a separate patch series for all the hw stats patches, by moving them
      to ethtool.
      
      The patch series is created against the 'net-next' tree
      and includes patches to the cxgb4 driver.
      
      We have included all the maintainers of respective drivers. Kindly review the
      change and let us know in case of any review comments.
      
      V2: Dropped all hw stats related patches. Added a new patch which adds support to
      dump the congestion control table.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: Add support in debugfs to dump the congestion control table · bad43792
      Hariprasad Shenai authored
      Dump the Transport Processor module's congestion control configuration.
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: Add support to dump mailbox content in debugfs · bf7c781d
      Hariprasad Shenai authored
      Adds support to dump the current contents of the mailbox and the driver
      which owns it.
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: Add support for ULP RX logic analyzer output in debugfs · 797ff0f5
      Hariprasad Shenai authored
      Dump Upper Layer Protocol RX module command trace.
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: Added support in debugfs to display TP logic analyzer output · 2d277b3b
      Hariprasad Shenai authored
      Dump Transport Processor event trace.
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: Add support in debugfs to display sensor information · 70a5f3bb
      Hariprasad Shenai authored
      Dump out various chip sensor information. Currently this covers chip
      temperature and core voltage.
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'be2net' · bdb27482
      David S. Miller authored
      Sathya Perla says:
      
      ====================
      be2net: patch set
      
      Hi Dave, pls consider applying the following patch-set to the
      net-next tree. It has 5 code/style cleanup patches and 4 patches that
      add functionality to the driver.
      
      Patch 1 moves routines that were not needed to be in be.h to the respective
      src files, to avoid unnecessary compilation.
      
      Patch 2 replaces (1 << x) with BIT(x) macro
      
      Patch 3 refactors code that checks if a FW flash file is compatible
      with the adapter. The code is now refactored into 2 routines, the first one
      gets the file type from the image file and the 2nd routine checks if the
      file type is compatible with the adapter.
      
      Patch 4 adds compatibility checks for flashing a FW image on the new
      Skyhawk P2 HW revision.
      
      Patch 5 adds support for a new "offset based" flashing scheme, wherein
      the driver informs the FW of the offset at which each component in the flash
      file is to be flashed at. This helps flashing components that were
      previously not recognized by the running FW.
      
      Patch 6 simplifies the be_cmd_rx_filter() routine, by passing to it the
      filter flags already used in the FW cmd, instead of the netdev flags that
      were converted to the FW-cmd flags.
      
      Patch 7 introduces helper routines in be_set_rx_mode() and be_vid_config()
      to improve code readability.
      
      Patch 8 adds processing of port-misconfig async event sent by the FW.
      
      Patch 9 removes unnecessary swapping of a field in the TX desc.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: avoid unncessary swapping of fields in eth_tx_wrb · f986afcb
      Sathya Perla authored
      The 32-bit fields of a tx-wrb are little endian. The driver is currently
      using be_dws_le_to_cpu() routine to swap (cpu to le) all the fields of
      a tx-wrb. So, the rsvd field is also unnecessarily swapped.
      
      This patch fixes this by individually swapping the required fields.
      Also, the type of the fields in eth_tx_wrb{} is now changed to __le32
      from u32 to avoid sparse warnings.
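
      The idea of converting only the fields that are actually used can be
      sketched in portable userspace C as follows. The struct layout and
      the helper names (tx_wrb, to_le32, wrb_fill) are hypothetical and are
      not the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-word TX descriptor; every field is stored little endian. */
struct tx_wrb {
    uint32_t frag_pa_hi;
    uint32_t frag_pa_lo;
    uint32_t rsvd0;
    uint32_t frag_len;
};

/* Returns a value whose in-memory bytes are v in little-endian order,
 * regardless of the host's own byte order. */
static uint32_t to_le32(uint32_t v)
{
    const uint8_t bytes[4] = {
        (uint8_t)v, (uint8_t)(v >> 8), (uint8_t)(v >> 16), (uint8_t)(v >> 24)
    };
    uint32_t out;

    memcpy(&out, bytes, sizeof(out));
    return out;
}

/* Convert each meaningful field individually; the reserved word is
 * simply zeroed and needs no conversion. */
static void wrb_fill(struct tx_wrb *wrb, uint64_t dma_addr, uint32_t len)
{
    assert(wrb != NULL);
    wrb->frag_pa_hi = to_le32((uint32_t)(dma_addr >> 32));
    wrb->frag_pa_lo = to_le32((uint32_t)dma_addr);
    wrb->rsvd0 = 0;
    wrb->frag_len = to_le32(len);
}
```
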
      Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: process port misconfig async event · 21252377
      Vasundhara Volam authored
      This patch adds support for processing the port misconfigure async
      event generated by the FW. This event is generated typically when an
      optical module is incorrectly installed or is faulty.
      
      This patch also moves the port_name field to the adapter struct for
      logging the event. As the be_cmd_query_port_name() call is now moved
      to be_get_config(), it is modified to use the mailbox instead of MCCQ.
      Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: refactor be_set_rx_mode() and be_vid_config() for readability · f66b7cfd
      Sathya Perla authored
      This patch re-factors the filter setting (uc-list, mc-list, promisc, vlan)
      code in be_set_rx_mode() and be_vid_config() to make it more readable
      and reduce code duplication.
      This patch adds a separate field to track the state/mode of filtering,
      along with moving all the filtering related fields to one place in the
      be_adapter structure.
      Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: remove duplicate code in be_cmd_rx_filter() · ac34b743
      Sathya Perla authored
      This patch passes BE_IF_FLAGS_XXX flags to be_cmd_rx_filter() routine
      instead of the IFF_XXX flags. Doing this gets rid of the code to convert
      the IFF_XXX flags to the BE_IF_FLAGS_XXX used by the FW cmd. The patch
      also removes code for setting if_flags_mask that was duplicated for each
      filter mode.
      Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
      Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: use offset based FW flashing for Skyhawk chip · 70a7b525
      Vasundhara Volam authored
      While sending FW update cmds to the FW, the driver specifies the "type"
      of each component that needs to be flashed. The FW then picks the offset
      in the flash area at which the component is to be flashed. This doesn't work
      when new components that the current FW doesn't recognize need to be
      flashed. Recent FWs (10.2 and above) support a scheme of FW-update wherein
      the "offset" of the component in the flash area can be specified instead
      of the "type". This patch uses the "offset" based FW-update mechanism and
      falls back to the old "type" based update only when it fails.
      Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
      Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: avoid flashing SH-B0 UFI image on SH-P2 chip · 81a9e226
      Vasundhara Volam authored
      The Skyhawk-B0 FW UFI cannot be flashed on the Skyhawk-P2 ASIC, but
      the Skyhawk-P2 FW UFI is compatible with both B0 and P2 chips.
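
      The compatibility rule stated above amounts to a small truth table.
      A minimal C sketch, with hypothetical enum and function names that
      are not the driver's own:

```c
#include <assert.h>
#include <stdbool.h>

enum chip_rev { CHIP_SH_B0, CHIP_SH_P2 };  /* hypothetical names */
enum ufi_type { UFI_SH_B0, UFI_SH_P2 };    /* hypothetical names */

/* A P2 image may be flashed on either chip; a B0 image only on B0. */
static bool ufi_compatible(enum ufi_type ufi, enum chip_rev chip)
{
    assert(chip == CHIP_SH_B0 || chip == CHIP_SH_P2);
    if (ufi == UFI_SH_P2)
        return true;
    return chip == CHIP_SH_B0;
}
```
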
      Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
      Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: refactor code that checks flash file compatibility · 5d3acd0d
      Vasundhara Volam authored
      This patch re-factors the code that checks for flash file compatibility with
      the chip type, for better readability, as follows:
      	- be_get_ufi_type() returns the UFI type from the flash file
      	- be_check_ufi_compatibility() checks if the UFI type is compatible
      	  with the adapter/chip that is being flashed
      Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
      Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: replace (1 << x) with BIT(x) · 83b06116
      Vasundhara Volam authored
      BIT(x) is the preferred usage.
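
      For readers unfamiliar with the macro, a minimal userspace sketch of
      the substitution follows. The BIT() definition mirrors the common
      kernel form; the DEMO_F_* flag names are purely illustrative.

```c
#include <assert.h>
#include <stdbool.h>

#define BIT(nr) (1UL << (nr))

/* Hypothetical flag definitions: BIT(5) reads more clearly than (1 << 5). */
#define DEMO_F_PROMISC   BIT(0)
#define DEMO_F_MULTICAST BIT(1)
#define DEMO_F_VLAN      BIT(5)

/* Returns true when the given single-bit flag is set in flags. */
static bool flag_set(unsigned long flags, unsigned long flag)
{
    assert(flag != 0);
    return (flags & flag) != 0;
}
```
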
      Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
      Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: move un-exported routines from be.h to respective src files · f7062ee5
      Sathya Perla authored
      Routines that are called only inside one src file must remain in that
      file itself. Including them in a header file that is used for exporting
      routine/struct definitions, causes unnecessary compilation of other
      src files, when such a routine is modified.
      Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: add missing bridge port check for offloads · 1fd0bddb
      Roopa Prabhu authored
      This patch fixes a missing bridge port check caught by smatch.
      
      setlink/dellink of attributes like vlans can come for a bridge device
      and there is no need to offload those today. So, this patch adds a bridge
      port check. (In these cases however, the BRIDGE_SELF flags will always be set
      and we may not hit a problem with the current code).
      
      smatch complaint:
      
      The patch 68e331c7: "bridge: offload bridge port attributes to
      switch asic if feature flag set" from Jan 29, 2015, leads to the
      following Smatch complaint:
      
      net/bridge/br_netlink.c:552 br_setlink()
      	 error: we previously assumed 'p' could be null (see line 518)
      
      net/bridge/br_netlink.c
         517
         518		if (p && protinfo) {
                          ^
      Check for NULL.
      Reported-By: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · d78f802f
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2015-02-05
      
      This series contains updates to fm10k, ixgbe and ixgbevf.
      
      Matthew fixes an issue where fm10k does not properly drop the upper-most four
      bits of the VLAN ID due to type promotion. The fix is not to mask off the
      bits, but to throw an error if the VLAN ID is out-of-bounds. He then cleans
      up two cases where variables were being set but never used, by removing the
      unused variables.
      
      Don cleans up sparse errors in the x550 family file for ixgbe.  Fixed up
      a redundant setting of the default value for set_rxpba, which was done
      twice accidentally.  Cleaned up the probe routine to remove a redundant
      attempt to identify the PHY, which could lead to a panic on x550.  Added
      support for VXLAN receive checksum offload in x550 hardware.  Added the
      Ethertype Anti-spoofing feature for affected devices.
      
      Emil enables ixgbe and ixgbevf to allow multiple queues in SRIOV mode.
      Adds RSS support for x550 per VF.  Fixed up a couple of issues introduced
      in commit 2b509c0c ("ixgbe: cleanup ixgbe_ndo_set_vf_vlan"), fixed
      setting of the VLAN inside ixgbe_enable_port_vlan() and disable the
      "hide VLAN" bit in PFQDE when port VLAN is disabled.  Cleaned up the
      setting of vlan_features by enabling all features at once.  Fixed the
      ordering of the shutdown patch so that we attempt to shutdown the rings
      more gracefully.  We shutdown the main Rx filter in the case of Rx and we
      set the carrier_off state in the case of Tx so that packets stop being
      delivered from outside the driver.  Then we shutdown interrupts and NAPI,
      then finally stop the rings from performing DMA and clean them.  Added
      code to allow for Tx hang checking to provide more robust debug info in
      the event of a transmit unit hang in ixgbevf.  Cleaned up ixgbevf logic
      dealing with link up/down by breaking down the link detection and up/down
      events into separate functions, similar to how these events are handled
      in other drivers.  Combined the ixgbevf reset and watchdog tasks into a
      single task so that we can avoid multiple schedules of the reset task when
      we have a reset event needed due to either the mailbox going down or
      transmit packets being present on a link down.
      
      v2: Fixed up patch #03 of the series to remove the variable type change
          based on feedback from David Laight
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'r8152' · 57ee062e
      David S. Miller authored
      Hayes Wang says:
      
      ====================
      r8152: adjust the code
      
      V2:
      Correct the subject of patch #5. Replace "link feed" with "line feed".
      
      v1:
      Code adjustment.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: use BIT macro · f5aaaa6d
      hayeswang authored
      Use BIT macro to replace (1 << bits).
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: replace get_protocol with vlan_get_protocol · 6e74d174
      hayeswang authored
      vlan_get_protocol() has already been defined, so use it to replace
      get_protocol().
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: adjust the line feed for hw_features · ccc39faf
      hayeswang authored
      Keep NETIF_F_HW_VLAN_CTAG_RX and NETIF_F_HW_VLAN_CTAG_TX on the
      same line.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: check RTL8152_UNPLUG for rtl8152_close · 53543db5
      hayeswang authored
      It is unnecessary to access the hw register if the device is unplugged.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: check linking status with netif_carrier_ok · 51d979fa
      hayeswang authored
      Replace (tp->speed & LINK_STATUS) with netif_carrier_ok().
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: adjust lpm timer · 34203e25
      hayeswang authored
      Set LPM timer to 500us, except for RTL_VER_04 which doesn't link at
      USB 3.0.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: adjust rx_bottom · e1a2ca92
      hayeswang authored
      If an error occurs when submitting rx, skip the remaining submissions
      and try to submit them again next time.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rds: Make rds_message_copy_from_user() return 0 on success. · d0a47d32
      Sowmini Varadhan authored
      Commit 083735f4 ("rds: switch rds_message_copy_from_user() to iov_iter")
      breaks rds_message_copy_from_user() semantics on success, and causes it
      to return nbytes copied, when it should return 0.  This commit fixes that bug.
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rds: Remove repeated function names from debug output · 11ac1199
      Rasmus Villemoes authored
      The macro rdsdebug is defined as
      
        pr_debug("%s(): " fmt, __func__ , ##args)
      
      Hence it doesn't make sense to include the name of the calling
      function explicitly in the format string passed to rdsdebug.
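
      The duplication is easy to reproduce in userspace. In the sketch
      below, demo_debug is a hypothetical stand-in for rdsdebug that writes
      into a buffer instead of the kernel log so the result can be
      inspected; it requires at least one argument after the format string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Prefixes every message with the calling function's name, like rdsdebug. */
#define demo_debug(buf, len, fmt, ...) \
    snprintf((buf), (len), "%s(): " fmt, __func__, __VA_ARGS__)

/* Before: the format string repeats the function name by hand. */
static int demo_send_redundant(char *buf, size_t len, int conn)
{
    assert(buf != NULL);
    return demo_debug(buf, len, "demo_send_redundant(): conn %d\n", conn);
}

/* After: the macro's prefix alone identifies the caller. */
static int demo_send_clean(char *buf, size_t len, int conn)
{
    assert(buf != NULL);
    return demo_debug(buf, len, "conn %d\n", conn);
}
```

      The first variant prints the function name twice, which is the
      redundancy this patch removes from the rdsdebug call sites.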
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: Support masked set actions. · 83d2b9ba
      Jarno Rajahalme authored
      OVS userspace already probes the openvswitch kernel module for
      OVS_ACTION_ATTR_SET_MASKED support.  This patch adds the kernel module
      implementation of masked set actions.
      
      The existing set action sets many fields at once.  When only a subset
      of the IP header fields, for example, should be modified, all the IP
      fields need to be exact matched so that the other field values can be
      copied to the set action.  A masked set action allows modification of
      an arbitrary subset of the supported header bits without requiring the
      rest to be matched.
      
      Masked set action is now supported for all writeable key types, except
      for the tunnel key.  The set tunnel action is an exception as any
      input tunnel info is cleared before action processing starts, so there
      is no tunnel info to mask.
      
      The kernel module converts all (non-tunnel) set actions to masked set
      actions.  This makes action processing more uniform, and results in
      less branching and less duplication of the action processing code.  When
      returning actions to userspace, the fully masked set actions are
      converted back to normal set actions.  We use a kernel internal action
      code to be able to tell the userspace provided and converted masked
      set actions apart.
      Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      83d2b9ba
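The idea behind a masked set, as described in the commit above, is that only the bits selected by the mask are overwritten and all other bits keep their current value. The sketch below shows that bit arithmetic on a single byte; it is an illustration under that assumption, not the openvswitch implementation, and `masked_set_byte()` is a hypothetical helper name.

```c
#include <stdint.h>

/* Overwrite only the bits of 'old' that are selected by 'mask' with the
 * corresponding bits of 'value'; unselected bits are left untouched. */
static uint8_t masked_set_byte(uint8_t old, uint8_t value, uint8_t mask)
{
	return (uint8_t)((old & (uint8_t)~mask) | (value & mask));
}
```

A fully set mask (0xFF) behaves like a plain set, which is consistent with the commit's note that fully masked set actions can be converted back to normal set actions.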
    • David S. Miller's avatar
      Merge branch 'dsa-next' · 2150f984
      David S. Miller authored
      Florian Fainelli says:
      
      ====================
      net: dsa: bcm_sf2: GPHY power down
      
      This patch series implements GPHY power up and down in the SF2 switch
      driver in order to conserve power whenever possible (e.g. when a port is
      brought down or unused during Wake-on-LAN).
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2150f984
    • Florian Fainelli's avatar
      net: dsa: bcm_sf2: implement GPHY power down · 9af197a8
      Florian Fainelli authored
      Implement the power on/off recommended procedure for the Single GPHY we
      have on our Starfighter 2 switch. In order to make sure we get proper
      LED link/activity signaling during suspend, switch the link indication
      from the Switch/MAC to the PHY.
      
      Finally, since the GPHY needs to be reset to be put in low power mode,
      we will lose any context applied to it (workarounds, EEE, etc.), so we
      need to call phy_init_hw() to get our fixups re-applied successfully.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9af197a8
    • Florian Fainelli's avatar
      net: dsa: bcm_sf2: move GPHY enabling to its own function · b083668c
      Florian Fainelli authored
      Move the code that touches the single GPHY register from
      bcm_sf2_sw_resume() to a separate function since we will have to
      enable/disable the GPHY from different locations, and we want the code
      to be self-contained.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b083668c
    • David S. Miller's avatar
      Merge tag 'nfc-next-3.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next · 3c09e92f
      David S. Miller authored
      NFC: 3.20 second pull request
      
      This is the second NFC pull request for 3.20.
      
      It brings:
      
      - NCI NFCEE (NFC Execution Environment, typically an embedded or
        external secure element) discovery and enabling/disabling support.
        In order to communicate with an NFCEE, we also added NCI's logical
        connections support to the NCI stack.
      
      - HCI over NCI protocol support. Some secure elements only understand
        HCI and thus we need to send them HCI frames when they're part of
        an NCI chipset.
      
      - NFC_EVT_TRANSACTION userspace API addition. Whenever an application
        running on a secure element needs to notify its host counterpart,
        we send an NFC_EVENT_SE_TRANSACTION event to userspace through the
        NFC netlink socket.
      
      - Secure element and HCI transaction event support for the st21nfcb
        chipset.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3c09e92f
  2. 06 Feb, 2015 1 commit
    • Thomas Graf's avatar
      rhashtable: Fix remove logic to avoid cross references between buckets · 020219a6
      Thomas Graf authored
      The remove logic properly searched the remaining chain for a matching
      entry with an identical hash but it did this while searching from both
      the old and new table. Instead in order to not leave stale references
      behind we need to:
      
       1. When growing and searching from the new table:
          Search remaining chain for entry with same hash to avoid having
          the new table directly point to an entry with a different hash.
      
       2. When shrinking and searching from the old table:
          Check if the element after the removed one would create a cross
          reference and avoid it if so.
      
      These bugs were present from the beginning in nft_hash.
      
      Also, both insert functions calculated the hash based on the mask of
      the new table. This worked while growing. While shrinking, the mask
      of the new table is smaller than the mask of the old table. This led
      to a bit not being taken into account when selecting the bucket lock
      and thus caused the wrong bucket to be locked eventually.
      
      Fixes: 7e1e7763 ("lib: Resizable, Scalable, Concurrent Hash Table")
      Fixes: 97defe1e ("rhashtable: Per bucket locks & deferred expansion/shrinking")
      Reported-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      020219a6
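The mask problem in the commit above can be illustrated with plain index arithmetic. This is a simplified sketch, not the rhashtable code: it assumes power-of-two table sizes where the bucket index is `hash & (size - 1)`, and `bucket_index()` is a hypothetical helper.

```c
#include <stdint.h>

/* Bucket index for a power-of-two table: keep the low bits of the hash.
 * The mask is (table_size - 1), so a smaller table keeps fewer bits. */
static uint32_t bucket_index(uint32_t hash, uint32_t table_size)
{
	return hash & (table_size - 1);
}
```

For example, when shrinking from 8 to 4 buckets, a hash of 6 maps to bucket 6 under the old mask but to bucket 2 under the new one. An index computed with the smaller mask ignores the top bit, so it can select a different bucket, and a different lock, than the one the entry actually occupies in the old table.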