1. 01 Jul, 2016 18 commits
  2. 30 Jun, 2016 22 commits
    • Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 435c556c
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2016-06-29
      
      This series contains updates and fixes to e1000e, igb, ixgbe and fm10k.  A
      true smorgasbord of changes.
      
      Jake cleans up some obscurity by not using the BIT() macro on bitshift
      operations and fixes the calculated index when looping through the
      indir array.  Jake also fixes igb's workqueue item for the overflow
      check so that it no longer causes a surprise remove event.  The
      ptp_flags variable is added to simplify several complex MAC type
      checks in the PTP code while fixing the workqueue.
      
      Alex Duyck fixes the receive buffer alignment, which should be 512
      bytes rather than L1 cache aligned.
      
      Denys Vlasenko prevents a division by zero which was reported under
      VMware for e1000e.
      
      Amritha fixes an issue in ixgbe where filters in a child hash table
      must be cleared from the hardware before deleting the filter links.
      
      Bhaktipriya Shridhar simply replaces the deprecated create_workqueue()
      with alloc_workqueue() for fm10k.
      
      Tony corrects ixgbe ethtool reporting to show x550 supports hardware
      timestamping of all packets.
      
      Emil fixes an issue where MAC-VLANs on the VF fail to pass traffic due
      to spoofed packets.
      
      Andrew Lunn increases performance on some systems where syncing a buffer
      for DMA is expensive.  So rather than sync the whole 2K receive buffer,
      only synchronize the length of the frame.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'nfp-next' · c435e6e0
      David S. Miller authored
      Jakub Kicinski says:
      
      ====================
      nfp: few code improvements
      
      Three small patches for net-next.  The first and second patches
      improve the code quality by spelling things correctly and
      removing unused parameters.  The third patch hooks in the standard
      kernel implementation of .get_link() in ethtool ops.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: implement ethtool .get_link() callback · 2370def2
      Jakub Kicinski authored
      Point the ethtool .get_link() callback to the standard
      ethtool_op_get_link() implementation.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: remove unused parameter from nfp_net_write_mac_addr() · f642963b
      Jakub Kicinski authored
      nfp_net_write_mac_addr() always writes the current device
      address, taken from the netdev struct, to the BAR.  The address
      given as a parameter is actually ignored.  Since all callers pass
      netdev->dev_addr, simply remove the parameter.
      
      While at it improve the function's kdoc a bit.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: correct name of control BAR define · 796312cd
      Jakub Kicinski authored
      Spell abbreviation of control as ctrl not crtl.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: signedness bug in be_msix_enable() · 6fde0e63
      Dan Carpenter authored
      "num_vec" needs to be signed for the error handling to work.
      
      Fixes: e261768e ('be2net: support asymmetric rx/tx queue counts')
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Sathya Perla <sathya.perla@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
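The class of bug described in the be2net commit above can be sketched in plain C. This is a minimal illustration, not the driver code: fake_enable_vectors() and both wrappers are hypothetical names, and the sketch assumes the count comes from a call that returns a negative errno on failure.

```c
#include <errno.h>

/* Hypothetical stand-in for an allocator that returns the number of
 * vectors granted, or a negative errno on failure. */
static int fake_enable_vectors(int want, int available)
{
	if (available <= 0)
		return -ENOSPC;
	return want < available ? want : available;
}

/* Buggy variant: the result is held in an unsigned variable, so a
 * negative errno becomes a huge positive count and looks like success. */
static int looks_like_success_unsigned(int want, int available)
{
	unsigned int num_vec = (unsigned int)fake_enable_vectors(want, available);

	return num_vec >= 1;
}

/* Fixed variant: a signed variable keeps the error visible. */
static int looks_like_success_signed(int want, int available)
{
	int num_vec = fake_enable_vectors(want, available);

	return num_vec >= 1;
}
```

With no vectors available, the unsigned variant reports success while the signed variant reports failure, which is why error handling only works with a signed variable.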
    • net: netcp: Fix a typo in keystone-netcp.txt · 9b9a553c
      Masanari Iida authored
      This patch fixes a spelling typo in keystone-netcp.txt.
      Signed-off-by: Masanari Iida <standby24x7@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mediatek-next' · 833ba3d5
      David S. Miller authored
      John Crispin says:
      
      ====================
      net-next: mediatek: IRQ cleanups, fixes and grouping
      
      This series contains 2 small code cleanups that are leftovers from the
      MIPS support. There is also a small fix that adds proper locking to the
      code accessing the IRQ registers. Without this fix we saw deadlocks caused
      by the last patch of the series, which adds IRQ grouping. The grouping
      feature allows us to use different IRQs for TX and RX. By doing so we can
      use affinity to let the SoC handle the IRQs on different cores.
      
      This series depends on a previous series currently sitting in net.git
      starting with
      	commit 562c5a70 ("net: mediatek: only wake the queue if it is stopped")
      up to
      	commit 82c6544d ("net: mediatek: remove superfluous queue wake up call")
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-next: mediatek: add support for IRQ grouping · 80673029
      John Crispin authored
      The ethernet core has 3 IRQs. Using the IRQ grouping registers we are able
      to separate TX and RX IRQs, which allows us to service them on separate
      cores. This patch splits the IRQ handler into 2 separate functions, one for
      TX and another for RX. The TX housekeeping is split out into its own NAPI
      handler.
      Signed-off-by: John Crispin <john@phrozen.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-next: mediatek: add IRQ locking · 7bc9ccec
      John Crispin authored
      The code that enables and disables IRQs is missing proper locking. After
      adding the IRQ grouping patch and routing the RX and TX IRQs to different
      cores we experienced IRQ stalls. Fix this by adding proper locking.
      We use a dedicated lock to reduce the latency of the IRQ code.
      Signed-off-by: John Crispin <john@phrozen.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-next: mediatek: don't use intermediate variables to store IRQ masks · eece71e8
      John Crispin authored
      The code currently uses variables to store and never modify the bit masks
      of interrupts. This is legacy code from an early version of the driver
      that supported MIPS based SoCs where the IRQ bits depended on the actual
      SoC. As the bits are the same for all ARM based SoCs using this driver we
      can remove the intermediate variables.
      Signed-off-by: John Crispin <john@phrozen.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-next: mediatek: remove superfluous register reads · 6e6edd8b
      John Crispin authored
      The driver was originally written for MIPS based SoCs. These required the
      IRQ mask register to be read after writing it to ensure that the content
      was actually applied. As this version only works on ARM based SoCs, we can
      safely remove the 2 reads as they are no longer required.
      Signed-off-by: John Crispin <john@phrozen.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_rules: Added NLM_F_EXCL support to fib_nl_newrule · 153380ec
      Mateusz Bajorski authored
      When adding a rule with the NLM_F_EXCL flag, check whether the same
      rule already exists. If it does, exit with -EEXIST.
      
      This is already implemented in iproute2:
              if (cmd == RTM_NEWRULE) {
                      req.n.nlmsg_flags |= NLM_F_CREATE|NLM_F_EXCL;
                      req.r.rtm_type = RTN_UNICAST;
              }
      
      Tested ipv4 and ipv6 with net-next linux on qemu x86
      
      expected behavior after patch:
      localhost ~ # ip rule
      0:    from all lookup local
      32766:    from all lookup main
      32767:    from all lookup default
      localhost ~ # ip rule add from 10.46.177.97 lookup 104 pref 1005
      localhost ~ # ip rule add from 10.46.177.97 lookup 104 pref 1005
      RTNETLINK answers: File exists
      localhost ~ # ip rule
      0:    from all lookup local
      1005:    from 10.46.177.97 lookup 104
      32766:    from all lookup main
      32767:    from all lookup default
      
      There was already a thread about this, but I don't see any changes
      merged and the problem still occurs.
      https://lkml.kernel.org/r/1135778809.5944.7.camel+%28%29+localhost+%21+localdomain
      Signed-off-by: Mateusz Bajorski <mateusz.bajorski@nokia.com>
      Acked-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
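The exclusive-create behavior described in the fib_rules commit above can be sketched in userspace C. This is only an illustration under stated assumptions: struct rule, struct rule_table, add_rule() and SKETCH_F_EXCL are hypothetical names, the rule is reduced to three fields, and the real kernel comparison covers more attributes.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_F_EXCL 0x1u	/* hypothetical stand-in for NLM_F_EXCL */
#define MAX_RULES 8

/* Heavily reduced rule: only the fields compared in this sketch. */
struct rule {
	unsigned int pref;
	unsigned int table;
	unsigned int src;	/* IPv4 source address, host byte order */
};

struct rule_table {
	struct rule rules[MAX_RULES];
	size_t count;
};

static int rule_equal(const struct rule *a, const struct rule *b)
{
	return a->pref == b->pref && a->table == b->table && a->src == b->src;
}

/* Add a rule; with SKETCH_F_EXCL set, refuse an exact duplicate. */
static int add_rule(struct rule_table *t, const struct rule *r,
		    unsigned int flags)
{
	size_t i;

	if (flags & SKETCH_F_EXCL) {
		for (i = 0; i < t->count; i++)
			if (rule_equal(&t->rules[i], r))
				return -EEXIST;
	}
	if (t->count == MAX_RULES)
		return -ENOMEM;
	t->rules[t->count++] = *r;
	return 0;
}
```

Adding the same rule twice with the flag set succeeds once and then fails with -EEXIST, matching the "File exists" answer in the transcript above; without the flag the duplicate is still accepted.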
    • tcp: increase size at which tcp_bound_to_half_wnd bounds to > TCP_MSS_DEFAULT · 2631b79f
      Seymour, Shane M authored
      In previous commit 01f83d69
      the following comments were added:
      
      "When peer uses tiny windows, there is no use in packetizing to sub-MSS
      pieces for the sake of SWS or making sure there are enough packets in
      the pipe for fast recovery."
      
      The test should be > TCP_MSS_DEFAULT not >= 512. This allows low end
      devices that send an MSS of 536 (TCP_MSS_DEFAULT) to see better network
      performance by sending it 536 bytes of data at a time instead of bounding
      to half window size (268). Other network stacks work this way, e.g. HP-UX.
      Signed-off-by: Shane Seymour <shane.seymour@hpe.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
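The effect of the threshold change in the tcp commit above can be sketched as two small functions. This is a simplified illustration, not the kernel function: bound_old(), bound_new() and SKETCH_TCP_MSS_DEFAULT are hypothetical names, the peer's maximum window is passed in directly, and any lower bound the kernel applies to the result is left out.

```c
#define SKETCH_TCP_MSS_DEFAULT 536	/* value given in the commit message */

/* Old test: halve the limit once the peer's window reaches 512 bytes. */
static int bound_old(int max_window, int pktsize)
{
	int cutoff = max_window >= 512 ? max_window >> 1 : max_window;

	return (cutoff && pktsize > cutoff) ? cutoff : pktsize;
}

/* New test: halve the limit only above the default MSS. */
static int bound_new(int max_window, int pktsize)
{
	int cutoff = max_window > SKETCH_TCP_MSS_DEFAULT ?
		     max_window >> 1 : max_window;

	return (cutoff && pktsize > cutoff) ? cutoff : pktsize;
}
```

For a peer whose window is 536 bytes, the old test bounds a 536-byte segment to 268 bytes, while the new test lets the full 536 bytes through.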
    • tcp: add an ability to dump and restore window parameters · b1ed4c4f
      Andrey Vagin authored
      We found that sometimes a restored tcp socket doesn't work.
      
      The cause of this bug is incorrect window parameters: in this case
      tcp_acceptable_seq() returns tcp_wnd_end(tp) instead of tp->snd_nxt. The
      other side drops packets with this seq, because seq is less than
      tp->rcv_nxt (see tcp_sequence()).
      
      Data from a send queue is sent only if there is enough space in a
      window, so when we restore unacked data, we need to expand a window to
      fit this data.
      
      This was in a first version of this patch:
      "tcp: extend window to fit all restored unacked data in a send queue"
      
      Then Alexey recommended that I restore the window parameters instead of
      adjusting them according to the data in the send queue. This sounds
      reasonable.
      
      rcv_wnd has to be restored, because it was reported to the other side
      and the offered window is never shrunk.
      One of the reasons why we need to restore snd_wnd was described above.
      
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: James Morris <jmorris@namei.org>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bridge-igmp-stats' · 641f7e40
      David S. Miller authored
      Nikolay Aleksandrov says:
      
      ====================
      net: bridge: add support for IGMP/MLD stats
      
      This patchset adds support for the new IFLA_STATS_LINK_XSTATS_SLAVE
      attribute which can be used with RTM_GETSTATS in order to export per-slave
      statistics. It works by passing the attribute to the linkxstats callback
      and if the callback user supports it - it should dump that slave's stats.
      This is much more scalable and permits us to request only a single port's
      statistics instead of dumping everything every time.
      The second patch adds support for per-port IGMP/MLD statistics and uses
      the new API to export them for the bridge and its ports. The stats are
      made in a very lightweight manner, the normal fast-path is not affected
      at all and the flood paths (br_flood/br_multicast_flood) are only affected
      if the packet is IGMP and the IGMP stats have been enabled using cache-hot
      data for the check.
      
      v2: Patch 01 is new, patch 02 has been reworked to use the new API, also
      in addition counters for IGMP/MLD parse errors have been added and members
      are added for per-port multicast traffic stats. The multicast counting has
      been slightly optimized (moved the br_multicast_count inside the IPv4/6
      IGMP functions after the checks for IGMP traffic) to avoid one conditional
      that was on all of the multicast traffic path (both IGMP and other).
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: add support for IGMP/MLD stats and export them via netlink · 1080ab95
      Nikolay Aleksandrov authored
      This patch adds stats support for the currently used IGMP/MLD types by the
      bridge. The stats are per-port (plus one stat per-bridge) and per-direction
      (RX/TX). The stats are exported via netlink via the new linkxstats API
      (RTM_GETSTATS). In order to minimize the performance impact, a new option
      is used to enable/disable the stats - multicast_stats_enabled, similar to
      the recent vlan stats. Also in order to avoid multiple IGMP/MLD type
      lookups and checks, we make use of the current "igmp" member of the bridge
      private skb->cb region to record the type on Rx (both host-generated and
      external packets pass by multicast_rcv()). We can do that since the igmp
      member was used as a boolean and all the valid IGMP/MLD types are positive
      values. The normal bridge fast-path is not affected at all, the only
      affected paths are the flooding ones and since we make use of the IGMP/MLD
      type, we can quickly determine if the packet should be counted using
      cache-hot data (cb's igmp member). We add counters for:
      * IGMP Queries
      * IGMP Leaves
      * IGMP v1/v2/v3 reports
      
      * MLD Queries
      * MLD Leaves
      * MLD v1/v2 reports
      
      These are invaluable when monitoring or debugging complex multicast setups
      with bridges.
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rtnetlink: add support for the IFLA_STATS_LINK_XSTATS_SLAVE attribute · 80e73cc5
      Nikolay Aleksandrov authored
      This patch adds support for the IFLA_STATS_LINK_XSTATS_SLAVE attribute,
      which allows exporting per-slave statistics if the master device supports
      the linkxstats callback. The attribute is passed down to the linkxstats
      callback and it is up to the callback user to use it (an example has been
      added to the only current user - the bridge). This allows us to query only
      specific slaves of master devices like bridge ports and export only what
      we're interested in instead of having to dump all ports and searching only
      for a single one. This will be used to export per-port IGMP/MLD stats and
      also per-port vlan stats in the future, possibly other statistics as well.
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bpf-helper-improvements' · 545c321b
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      BPF helper improvements
      
      This set adds various BPF helper improvements, that is, cleaning
      up and adding BPF_F_CURRENT_CPU flag for tracing helper, allowing
      for preemption checks on bpf_get_smp_processor_id() helper, and
      adding two new helpers bpf_skb_change_{proto, type} for tc related
      programs. For further details please see individual patches.
      
      Note, this set requires -net to be merged into -net-next tree first.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: add bpf_skb_change_type helper · d2485c42
      Daniel Borkmann authored
      This work adds a helper for changing skb->pkt_type in a controlled way.
      We only allow a subset of possible values and can extend that in future
      should other use cases come up. Doing this as a helper has the advantage
      that errors can be handled gracefully and the helper is thus kept
      extensible.
      
      It's a write counterpart to pkt_type member we can already read from
      struct __sk_buff context. Major use case is to change incoming skbs to
      PACKET_HOST in a programmatic way instead of having to recirculate via
      redirect(..., BPF_F_INGRESS), for example.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
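The idea of a controlled write that accepts only a subset of values, as described in the bpf_skb_change_type commit above, can be sketched in plain C. The commit message does not say which values are allowed, so the subset chosen here (everything up to "other host") is an assumption, and pkt_type_ok(), change_type() and the SKETCH_PKT_* constants are hypothetical names for illustration.

```c
#include <errno.h>

/* Local constants for illustration; the numbering is an assumption. */
#define SKETCH_PKT_HOST      0u
#define SKETCH_PKT_BROADCAST 1u
#define SKETCH_PKT_MULTICAST 2u
#define SKETCH_PKT_OTHERHOST 3u
#define SKETCH_PKT_OUTGOING  4u
#define SKETCH_PKT_LOOPBACK  5u

/* Assumed subset: only types up to "other host" may be read or written. */
static int pkt_type_ok(unsigned int type)
{
	return type <= SKETCH_PKT_OTHERHOST;
}

/* Change the type in place, or fail without touching it. */
static int change_type(unsigned int *cur, unsigned int new_type)
{
	if (!pkt_type_ok(*cur) || !pkt_type_ok(new_type))
		return -EINVAL;
	*cur = new_type;
	return 0;
}
```

Returning an error code instead of writing the field directly is what lets the caller handle a rejected value gracefully, and the accepted subset can grow later without changing the call signature.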
    • bpf: add bpf_skb_change_proto helper · 6578171a
      Daniel Borkmann authored
      This patch adds a minimal helper for doing the groundwork of changing
      the skb->protocol in a controlled way. Currently supported are v4 to
      v6 and vice versa transitions, which allows, for example, a minimal, static
      nat64 implementation where applications in containers that still
      require IPv4 can be transparently operated in an IPv6-only environment.
      For example, host facing veth of the container can transparently do
      the transitions in a programmatic way with the help of clsact qdisc
      and cls_bpf.
      
      Idea is to separate concerns for keeping complexity of the helper
      lower, which means that the programs utilize bpf_skb_change_proto(),
      bpf_skb_store_bytes() and bpf_lX_csum_replace() to get the job done,
      instead of doing everything in a single helper (and thus partially
      duplicating helper functionality). Also, bpf_skb_change_proto()
      shouldn't need to deal with raw packet data as this is done by other
      helpers.
      
      bpf_skb_proto_6_to_4() and bpf_skb_proto_4_to_6() unclone the skb to
      operate on a private one, push or pop additionally required header
      space and migrate the gso/gro meta data from the shared info. We do
      mark the gso type as dodgy so that headers are checked and segs
      recalculated by the gso/gro engine. The gso_size target is adapted
      as well. The flags argument added is currently reserved and can be
      used for future extensions.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
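The header-space and gso_size bookkeeping mentioned in the bpf_skb_change_proto commit above can be sketched as simple arithmetic. This is an illustration only: struct pkt_meta and both functions are hypothetical names, the header sizes assume an IPv4 header without options and the fixed IPv6 header, and the direction in which gso_size is adapted is an assumption, since the commit message only says that it is adapted.

```c
#define SKETCH_IPV4_HDR_LEN 20u	/* IPv4 header without options */
#define SKETCH_IPV6_HDR_LEN 40u	/* fixed IPv6 header */

/* Reduced packet metadata: total length and per-segment payload target. */
struct pkt_meta {
	unsigned int len;
	unsigned int gso_size;
};

/* v4 to v6: the header grows, so push space and shrink the segment target. */
static void meta_4_to_6(struct pkt_meta *m)
{
	const unsigned int diff = SKETCH_IPV6_HDR_LEN - SKETCH_IPV4_HDR_LEN;

	m->len += diff;
	if (m->gso_size > diff)
		m->gso_size -= diff;
}

/* v6 to v4: the header shrinks, so pop space and grow the segment target. */
static void meta_6_to_4(struct pkt_meta *m)
{
	const unsigned int diff = SKETCH_IPV6_HDR_LEN - SKETCH_IPV4_HDR_LEN;

	m->len -= diff;
	if (m->gso_size)
		m->gso_size += diff;
}
```

Under these assumptions the two transitions are inverses of each other: a 20-byte push followed by a 20-byte pop restores both fields.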
    • bpf: don't use raw processor id in generic helper · 80b48c44
      Daniel Borkmann authored
      Use smp_processor_id() for the generic helper bpf_get_smp_processor_id()
      instead of the raw variant. This allows for preemption checks when we
      have DEBUG_PREEMPT, and otherwise uses the raw variant anyway. We only
      need to keep the raw variant for socket filters, but we can reuse the
      helper that is already there from cBPF side.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>