1. 13 Jan, 2016 1 commit
  2. 12 Jan, 2016 14 commits
  3. 11 Jan, 2016 25 commits
    • bonding: Prevent IPv6 link local address on enslaved devices · 03d84a5f
      Karl Heiss authored
      Commit 1f718f0f ("bonding: populate neighbour's private on enslave")
      undoes the fix provided by commit c2edacf8 ("bonding / ipv6: no addrconf
      for slaves separately from master") by effectively setting the slave flag
      after the slave has been opened.  If the slave comes up quickly enough, it
      will go through the IPv6 addrconf before the slave flag has been set and
      will get a link local IPv6 address.
      
      In order to ensure that addrconf knows to ignore the slave devices on state
      change, set IFF_SLAVE before dev_open() during bonding enslavement.
      
      Fixes: 1f718f0f ("bonding: populate neighbour's private on enslave")
      Signed-off-by: Karl Heiss <kheiss@gmail.com>
      Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Reviewed-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mlx5-enhanced-flow-steering' · 7937963a
      David S. Miller authored
      Or Gerlitz says:
      
      ====================
      net/mlx5_core: Enhance flow steering support
      
      v0 --> v1 changes:
        - fixed improperly formatted comments.
        - compare value of ib_spec->eth.mask.ether_type in network byte order
           in ('IB/mlx5: Add flow steering utilities').
      
      v1 --> v2 changes:
        - made sure that service functions added in the IB driver are only static-fied
          on the last commit, to make sure bisection with -Werror works fine.
      
      v2 --> v3 changes:
         - squashed patches 11 and 12 into one patch, so that Dave's comment
           about gcc complaining of unused static functions during bisection
           is correctly addressed.
      
      v3 has been generated against net-next commit c9c99311 "Merge tag
      'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge"
      
      The series is signed by Matan, who was recently assigned as a maintainer for
      the mlx5_core and IB drivers (this is a 4.5-rc1 change to the maintainers file coming
      from the rdma tree) -- as such I didn't see a need to add my signature (Or).
      
      This series adds three new functionalities to the driver flow-steering
      infrastructure: auto-grouped flow tables, chaining of flow tables and
      updates for the root flow table.
      
      1. Auto-grouped flow tables - Flow table with auto grouping management.
      When a flow table is created, hints regarding the number of rule types
      and the number of rules are given in advance. Thus, a flow table is
      divided into #NUM_TYPES+1 groups, each of which contains
      (#NUM_RULES)/(#NUM_TYPES+1) rules. The first #NUM_TYPES parts are groups
      which are filled if the added rule matches the group specification or
      the group is empty. The last part is filled by rules that can't fit
      any of the former groups.
      
      2. Chaining flow tables - Flow tables from different priorities are chained
      together: if there is no match in the flow table of priority i, we continue
      searching for a match in priority i+1. This holds whether or not priorities
      i and i+1 belong to the same namespace.
      
      3. Updating the root flow table - the root flow table is the flow table
      with the lowest level. The hardware starts searching for a match in the
      root flow table and continues according to the matches it finds along
      the way.
      
      The first usage for the new functionality is flow steering for user-space
      ConnectX-4 offloaded HW Eth RX queues done through the mlx5 IB driver.
      
      When the mlx5 core driver is loaded, it opens three flow namespaces:
      1. By-pass namespace (used by mlx5 IB driver).
      2. Kernel namespace (used in order to get packets to the networking stack
      through mlx5 EN driver).
      3. Leftovers namespace (used by mlx5 IB and future sniffer)
      
      The series is built as follows:
      
      Patch #1 introduces auto-grouped flow tables support.
      
      Patch #2 adds utility functions for finding the next and the previous
      flow tables in different priorities. This is used in order to chain
      the flow tables in a downstream patch.
      
      Patch #3 introduces a firmware command for updating the root flow table.
      
      Patch #4 introduces the modify flow table firmware command. This command is used
      when we want to change the next flow table of an existing flow table.
      This is used for chaining flow tables as well.
      
      Patch #5 connects/disconnects flow tables. This is the actual chaining
      process, used when we want to link flow tables. This means that if we couldn't
      find a match in the first flow table, we'll continue in the chained
      flow table.
      
      Patch #6 updates the priority attributes that are required for flow table
      level allocation. We update both the max_fts (the number of allowed FTs
      in the sub-tree of this priority) and the start_level (which is the first
      level we'll assign to the flow-tables created inside the priority).
      
      Patch #7 adds checking of required device capabilities. Some namespaces
      can only be created if the hardware supports certain attributes.
      This is especially true for the Bypass and leftovers namespaces. This
      adds a generic mechanism to check these required attributes.
      
      Patch #8 creates two additional namespaces:
      	a. Bypass flow rules (has nine priorities)
      	b. Leftovers packets (has one priority) - for unmatched packets.
      
      Patch #9 re-factors ipv4/ipv6 match fields in the mlx5 firmware interface
      header to be more clear.
      
      Patch #10 exports the flow steering API for mlx5_ib usage
      
      Patch #11 implements the required support in mlx5_ib in order
      to support the RDMA flow steering verbs.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • IB/mlx5: Add flow steering support · 038d2ef8
      Maor Gottlieb authored
      Add flow steering support by creating a flow table per
      priority (if rules exist in the priority). mlx5_ib uses
      autogrouping and thus only creates the required destinations.
      
      Also add these flow steering utilities:
      
      1. Parse verbs flow attributes into hardware steering specs.
      
      2. Check if a flow is multicast - this is required in order to decide
      to which flow table we will add the steering rule.
      
      3. Set outer headers in flow match criteria to zeros.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5_core: Export flow steering API · b217ea25
      Maor Gottlieb authored
      Add exports to flow steering API for mlx5_ib usage.
      The following functions are exported:
      
      1. mlx5_create_auto_grouped_flow_table - used to create a flow
      table with auto flow grouping management (create and destroy
      flow groups). In auto-grouped flow tables, we create groups
      automatically if needed (if we don't find an existing
      flow group with the same match criteria when we add a new rule).
      
      2. mlx5_destroy_flow_table - used to destroy a flow table.
      
      3. mlx5_add_flow_rule - used to add flow rule into a flow table.
      
      4. mlx5_del_flow_rule - used to delete flow rule from its flow table.
      
      5. mlx5_get_flow_namespace - used to get a handle to the required
      namespace sub-tree.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5_core: Make ipv4/ipv6 location more clear · b4d1f032
      Maor Gottlieb authored
      Change the mlx5 firmware interface header to make it
      more clear which bytes should be used by IPv4 or
      IPv6 addresses.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5_core: Enable flow steering support for the IB driver · 4cbdd30e
      Maor Gottlieb authored
      When the driver is loaded, we create a flow steering namespace
      for kernel bypass with nine priorities and another namespace
      for leftovers (in order to catch packets that weren't matched).
      Verbs applications will use these priorities.
      We found nine to be a number that balances the requirements of the
      user and retains performance.
      
      The bypass namespace is used by verbs applications that want to bypass
      the kernel networking stack. The leftovers namespace is used by verbs
      applications and the sniffer in order to catch packets that weren't
      handled by any preceding rules.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5_core: Initialize namespaces only when supported by device · 8d40d162
      Maor Gottlieb authored
      Before we create the sub-tree of a steering namespace (kernel, bypass,
      leftovers), we check that the device has the capabilities required
      to create this sub-tree.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5_core: Set priority attributes · 655227ed
      Maor Gottlieb authored
      Each priority has two attributes:
      1. max_ft - maximum allowed flow tables under this priority.
      2. start_level - start of the level range of the flow tables
      in the priority.
      
      These attributes are set by traversing the tree nodes by
      DFS and setting the start level and max flow tables of each priority.
      Start level depends on the max flow tables of the prior priorities
      in the tree.
      
      The leaves of the tree have max_ft set in them. Each node accumulates
      the max_ft of its children and sets its own max_ft accordingly.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
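The DFS described in the "Set priority attributes" commit can be sketched as a small user-space model. This is not the mlx5_core code: the node layout, the `MAX_CHILDREN` bound and the function name `set_prio_attrs` are assumptions made for illustration, and only the max_ft accumulation and start_level assignment follow the commit message.

```c
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical tree node; the real driver's node layout is not shown here. */
struct prio_node {
    unsigned int max_ft;       /* preset in leaves, accumulated in inner nodes */
    unsigned int start_level;  /* first level usable by tables under this node */
    size_t num_children;
    struct prio_node *children[MAX_CHILDREN];
};

/*
 * Depth-first walk. A node's start_level is the number of flow tables
 * reserved by everything visited before it. A leaf keeps its preset
 * max_ft; an inner node sums the max_ft of its children.
 * Returns the node's max_ft.
 */
static unsigned int set_prio_attrs(struct prio_node *n, unsigned int acc_level)
{
    n->start_level = acc_level;
    if (n->num_children == 0)
        return n->max_ft;

    unsigned int sum = 0;
    for (size_t i = 0; i < n->num_children; i++) {
        unsigned int child_ft = set_prio_attrs(n->children[i], acc_level);

        sum += child_ft;
        acc_level += child_ft;  /* the next sibling starts after this one */
    }
    n->max_ft = sum;
    return sum;
}
```

Calling `set_prio_attrs(&root, 0)` on a root with three leaf priorities gives each leaf a start level equal to the summed max_ft of the leaves before it, which is the dependency on prior priorities that the commit message mentions.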
    • net/mlx5_core: Connect flow tables · f90edfd2
      Maor Gottlieb authored
      Flow tables from different priorities should be chained together.
      When a packet arrives we search for a match in the
      by-pass flow tables (first we search for a match in priority 0
      and if we don't find a match we move to the next priority).
      If we can't find a match in any of the bypass flow-tables, we continue
      searching in the flow-tables of the next priority, which are the
      kernel's flow tables.
      
      Setting the miss flow table in a new flow table to be the next one in
      the list is performed via create flow table API. If we want to change an
      existing flow table, for example in order to point from an
      existing flow table to the new next-in-list flow table, we use the
      modify flow table API.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
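The chaining behaviour from the "Connect flow tables" commit can be modelled in a few lines of C. This is a sketch under stated assumptions, not the driver implementation: integer keys stand in for packet match criteria, and `struct flow_table`, `chain_lookup` and `chain_insert_after` are hypothetical names. Only the idea of a per-table miss target, set at creation and later changed through a modify step, comes from the commit message.

```c
#include <stddef.h>

/* Hypothetical flow table: a few exact-match keys plus a miss pointer. */
struct flow_table {
    const int *keys;
    size_t num_keys;
    struct flow_table *next_on_miss;  /* the "miss flow table" */
};

/* Walk the chain; return the table that matched, or NULL if none did. */
static struct flow_table *chain_lookup(struct flow_table *ft, int key)
{
    for (; ft; ft = ft->next_on_miss)
        for (size_t i = 0; i < ft->num_keys; i++)
            if (ft->keys[i] == key)
                return ft;
    return NULL;
}

/*
 * Link new_ft into the chain right after prev.
 * Step 1 models "create": the new table gets its miss target up front.
 * Step 2 models "modify": the existing table is repointed at the new one.
 */
static void chain_insert_after(struct flow_table *prev,
                               struct flow_table *new_ft)
{
    new_ft->next_on_miss = prev->next_on_miss;
    prev->next_on_miss = new_ft;
}
```

Setting the new table's miss target before repointing the existing table means a lookup never walks into a table that leads nowhere, which is one plausible reason for the create-then-modify order the commit describes.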
    • net/mlx5_core: Introduce modify flow table command · 34a40e68
      Maor Gottlieb authored
      Introduce the modify flow table command. This command is used when
      we want to change the next flow table of an existing flow table.
      The next flow table is defined as the table we search (in order
      to find a match), if we couldn't find a match in any of the flow table
      entries in the current flow table.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5_core: Managing root flow table · 2cc43b49
      Maor Gottlieb authored
      The root Flow Table for each Flow Table Type is defined,
      by default, as the Flow Table with level 0.
      
      In order not to use empty flow tables and introduce new hops,
      but still preserve space for flow tables that have a priority
      greater (lower number) than the current flow table, we introduce this
      new set root flow table command.
      This command tells the HW to start matching packets from the
      assigned root flow table.
      This command is used when we create a new flow table whose level is lower than
      that of the current lowest flow table, or when it is the first flow table.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
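The condition under which the set root command is issued, per the "Managing root flow table" commit, can be written as a short model. The types and the `ft_created` helper are hypothetical; the counter merely stands in for sending the firmware command.

```c
#include <stddef.h>

/* Hypothetical flow table, reduced to its level. */
struct flow_table {
    unsigned int level;
};

/* Hypothetical per-type steering state. */
struct steering_root {
    const struct flow_table *root;  /* table the HW starts matching from */
    unsigned int set_root_cmds;     /* stand-in for firmware commands sent */
};

/*
 * Called after a table is created: update the root only when this is the
 * first table, or when its level is lower than the current root's level.
 */
static void ft_created(struct steering_root *s, const struct flow_table *ft)
{
    if (!s->root || ft->level < s->root->level) {
        s->root = ft;
        s->set_root_cmds++;
    }
}
```

With tables created at levels 8, 12 and 3 in that order, the root is set twice: once for the first table and once when the level 3 table arrives.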
    • net/mlx5_core: Add utilities to find next and prev flow-tables · fdb6896f
      Maor Gottlieb authored
      Add two utility functions for finding the next and prev flow tables.
      The find next flow table function gets a priority and returns the
      first flow table of the next priority in the tree.
      The find prev flow table function returns the last flow table of
      the previous priority in the tree.
      
      These utility functions are used for chaining flow tables from different
      priorities.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5_core: Introduce flow steering autogrouped flow table · f0d22d18
      Maor Gottlieb authored
      When a user adds a rule to an autogrouped flow table, we search
      for a flow group with the same match criteria. If we don't
      find such a group, we create a new flow group with the
      required match criteria and insert the rule into this group.
      
      We divide the flow table into required_groups + 1 parts,
      in order to reserve a part of the flow table for rules
      which don't match any existing group.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
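The autogrouping split described in the commit above can be illustrated with a simplified user-space model. It is a sketch, not the driver code: `struct ag_table`, `group_size` and `pick_group` are hypothetical names, an integer stands in for the match criteria, and per-group capacity tracking is left out.

```c
#include <stddef.h>

#define MAX_GROUPS 16

/* Hypothetical model of an auto-grouped flow table. */
struct ag_table {
    unsigned int max_rules;        /* total rule slots in the table */
    unsigned int required_groups;  /* hint: expected rule types, <= MAX_GROUPS */
    unsigned int used_groups;      /* typed groups created so far */
    int criteria[MAX_GROUPS];      /* stand-in for each group's match criteria */
};

/* Each of the required_groups + 1 parts gets an equal share of the slots. */
static unsigned int group_size(const struct ag_table *t)
{
    return t->max_rules / (t->required_groups + 1);
}

/*
 * Return the index of the part a rule with the given criteria lands in:
 * an existing group with the same criteria, else a newly claimed typed
 * group, else the reserved last part (index == required_groups).
 */
static unsigned int pick_group(struct ag_table *t, int crit)
{
    for (unsigned int i = 0; i < t->used_groups; i++)
        if (t->criteria[i] == crit)
            return i;

    if (t->used_groups < t->required_groups) {
        t->criteria[t->used_groups] = crit;
        return t->used_groups++;
    }
    return t->required_groups;
}
```

For a table of 1024 rules with a hint of 3 rule types, `group_size()` yields 256 slots per part, and a fourth distinct criteria value falls into the reserved part at index 3.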
    • udp: disallow UFO for sockets with SO_NO_CHECK option · 40ba3302
      Michal Kubeček authored
      Commit acf8dd0a ("udp: only allow UFO for packets from SOCK_DGRAM
      sockets") disallows UFO for packets sent from raw sockets. We need to do
      the same for SOCK_DGRAM sockets with the SO_NO_CHECK option, albeit
      for a slightly different reason: while such a socket would override the
      CHECKSUM_PARTIAL set by ip_ufo_append_data(), gso_size is still set and
      the bad offloading flags warning is triggered in __skb_gso_segment().
      
      In the IPv6 case, SO_NO_CHECK option is ignored but we need to disallow
      UFO for packets sent by sockets with UDP_NO_CHECK6_TX option.
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Tested-by: Shannon Nelson <shannon.nelson@intel.com>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: pktgen: fix null ptr deref in skb allocation · 3de03596
      John Fastabend authored
      Fix possible null pointer dereference that may occur when calling
      skb_reserve() on a null skb.
      
      Fixes: 879c7220 ("net: pktgen: Observe needed_headroom of the device")
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
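The class of bug fixed by the pktgen commit, reserving headroom on a buffer whose allocation may have failed, looks roughly like this in a user-space analogue. `struct buf` and every function here are hypothetical stand-ins and not the kernel skb API. The allocator is passed in so that the failure path can be exercised.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb; not the kernel's struct sk_buff. */
struct buf {
    unsigned char *head;  /* start of the allocation */
    unsigned char *data;  /* start of the payload area */
};

typedef struct buf *(*buf_alloc_fn)(size_t size);

static struct buf *buf_alloc(size_t size)
{
    struct buf *b = malloc(sizeof(*b));

    if (!b)
        return NULL;
    b->head = malloc(size);
    if (!b->head) {
        free(b);
        return NULL;
    }
    b->data = b->head;
    return b;
}

/* Simulates an out-of-memory condition. */
static struct buf *buf_alloc_fail(size_t size)
{
    (void)size;
    return NULL;
}

/* Dereferences b, so it must never be called with NULL. */
static void buf_reserve(struct buf *b, size_t headroom)
{
    b->data += headroom;
}

static struct buf *buf_alloc_with_headroom(buf_alloc_fn alloc,
                                           size_t size, size_t headroom)
{
    struct buf *b = alloc(size + headroom);

    if (b)  /* the fix: reserve only after the allocation is known good */
        buf_reserve(b, headroom);
    return b;
}
```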
    • Merge branch 'bpf-next' · 23c09c26
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      BPF update
      
      This set adds IPv6 support for bpf_skb_{set,get}_tunnel_key() helper.
      It also exports flags to user space that are being used in helpers and
      weren't exported thus far. For more details, please see the individual
      patches.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: support ipv6 for bpf_skb_{set,get}_tunnel_key · c6c33454
      Daniel Borkmann authored
      After IPv6 support has recently been added to metadata dst and related
      encaps, add support for populating/reading it from an eBPF program.
      
      Commit d3aa45ce ("bpf: add helpers to access tunnel metadata") started
      with initial IPv4-only support back then (due to IPv6 metadata support
      not being available yet).
      
      To stay compatible with older programs, we need to test for the passed
      structure size. Also TOS and TTL support from the ip_tunnel_info key has
      been added. Tested with vxlan devs in collect meta data mode with IPv4,
      IPv6 and in compat mode over different network namespaces.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
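The size test mentioned in the commit above is a common way to extend a structure without breaking older callers. The following is a minimal sketch of that pattern only. The struct layout, the field names and `KEY_COMPAT_SIZE` are assumptions made for illustration and deliberately do not reproduce the real `bpf_tunnel_key` definition.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key: the old layout ended after remote_ipv4. */
struct tunnel_key {
    uint32_t tunnel_id;
    uint32_t remote_ipv4;
    /* fields below were appended later in this made-up layout */
    uint32_t remote_ipv6[4];
    uint8_t tos;
    uint8_t ttl;
};

/* Size that a program built against the old layout would pass. */
#define KEY_COMPAT_SIZE offsetof(struct tunnel_key, remote_ipv6)

/*
 * Copy the key out to the caller. Accept exactly two sizes: the current
 * one and the old one. Anything else is rejected.
 */
static int get_tunnel_key(const struct tunnel_key *src, void *dst, size_t size)
{
    if (size != sizeof(struct tunnel_key) && size != KEY_COMPAT_SIZE)
        return -EINVAL;

    memcpy(dst, src, size);  /* an old caller only receives the old prefix */
    return 0;
}
```

Because new fields are only ever appended, the old layout is a prefix of the new one, so a single `memcpy` bounded by the caller's size serves both kinds of caller.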
    • bpf: export helper function flags and reject invalid ones · 781c53bc
      Daniel Borkmann authored
      Export flags used by eBPF helper functions through UAPI, so they can be
      used by programs (instead of them redefining all flags each time or just
      using the hard-coded values). It also gives a better overview of which flags
      are used where, and lets us get rid of the extra macros defined in
      filter.c. Moreover, reject invalid flags.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
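Rejecting invalid flags usually comes down to masking the argument against the set of known bits. A minimal sketch of that check follows. The flag names and `helper_check_flags` are hypothetical and are not the identifiers exported by the commit.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flags, defined once in a header shared with user programs. */
#define HELPER_F_RECOMPUTE (1ULL << 0)
#define HELPER_F_INGRESS   (1ULL << 1)
#define HELPER_F_ALL       (HELPER_F_RECOMPUTE | HELPER_F_INGRESS)

/* Any bit outside the known set makes the call fail. */
static int helper_check_flags(uint64_t flags)
{
    if (flags & ~HELPER_F_ALL)
        return -EINVAL;
    return 0;
}
```

Failing on unknown bits, instead of silently ignoring them, keeps those bits free to be given a meaning later without changing the behaviour of existing programs.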
    • Merge branch 'renesas-eth-fixes' · 366f2931
      David S. Miller authored
      Sergei Shtylyov says:
      
      ====================
      Fix some dubious code in the Renesas Ethernet drivers
      
         Here's a set of 2 patches against DaveM's 'net.git' repo. While initializing
      EMAC the code tries to respect the duplex mode both programmed into ECMR and
      stored in its own private data -- this just can't be right.
      
      [1/2] ravb: stop reading ECMR in ravb_emac_init()
      [2/2] sh_eth: stop reading ECMR in sh_eth_dev_init()
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sh_eth: stop reading ECMR in sh_eth_dev_init() · bffa731f
      Sergei Shtylyov authored
      The code in sh_eth_dev_init() twiddling the ECMR bits always looked a bit
      strange to me: if one intends to respect 'mdp->duplex', why save the old value
      of the ECMR.DM bit? As all the other bits are zeroed anyway, we don't really
      need to read ECMR before writing to it.
      Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: stop reading ECMR in ravb_emac_init() · 1c1fa821
      Sergei Shtylyov authored
      The code in ravb_emac_init() twiddling the ECMR bits always looked a bit
      strange to me: if one intends to respect 'priv->duplex', why save the old value
      of the ECMR.DM bit? As all the other bits are zeroed anyway, we don't
      really need to read ECMR before writing to it.
      Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
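The point made by both Renesas commits can be seen by computing the value written to ECMR with and without the initial read. The sketch below is a model of the logic as the commit messages describe it, not the driver source: the bit positions and the helper names are assumptions, and register access is replaced by plain function arguments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions; consult the hardware manual for the real ones. */
#define ECMR_DM (1u << 1)  /* duplex mode */
#define ECMR_TE (1u << 5)  /* transmit enable */
#define ECMR_RE (1u << 6)  /* receive enable */

/*
 * Old pattern: keep DM from the current register value, zero everything
 * else, then OR in DM from the driver's private duplex state.
 */
static uint32_t ecmr_value_old(uint32_t ecmr_now, bool duplex)
{
    return (ecmr_now & ECMR_DM) | (duplex ? ECMR_DM : 0) | ECMR_TE | ECMR_RE;
}

/* New pattern: the private duplex state alone decides DM; no read needed. */
static uint32_t ecmr_value_new(bool duplex)
{
    return (duplex ? ECMR_DM : 0) | ECMR_TE | ECMR_RE;
}
```

In the old pattern a DM bit left over in the register survives even when the private state says half duplex, so the two sources of truth can disagree. The new pattern has a single source.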
    • sched,cls_flower: set key address type when present · 66530bdf
      Jamal Hadi Salim authored
      Only when user space passes the addresses should we consider their
      presence.
      Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Acked-by: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp_yeah: don't set ssthresh below 2 · 83d15e70
      Neal Cardwell authored
      For tcp_yeah, use an ssthresh floor of 2, the same floor used by Reno
      and CUBIC, per RFC 5681 (equation 4).
      
      tcp_yeah_ssthresh() was sometimes returning a 0 or negative ssthresh
      value if the intended reduction was as big as or bigger than the current
      cwnd. Congestion control modules should never return a zero or
      negative ssthresh. A zero ssthresh generally results in a zero cwnd,
      causing the connection to stall. A negative ssthresh value will be
      interpreted as a u32 and will set a target cwnd for PRR near 4
      billion.
      
      Oleksandr Natalenko reported that a system using tcp_yeah with ECN
      could see a warning about a prior_cwnd of 0 in
      tcp_cwnd_reduction(). Testing verified that this was due to
      tcp_yeah_ssthresh() misbehaving in this way.
      Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
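The arithmetic behind the tcp_yeah fix is easy to reproduce outside the kernel. The two functions below are a sketch and not the tcp_yeah code: how YeAH computes its reduction is not shown, so `reduction` is simply an input, and the function names are hypothetical.

```c
#include <stdint.h>

/* With the fix: never return less than 2. */
static uint32_t ssthresh_with_floor(uint32_t cwnd, uint32_t reduction)
{
    int64_t v = (int64_t)cwnd - (int64_t)reduction;

    return v > 2 ? (uint32_t)v : 2u;
}

/*
 * Without a floor: when reduction >= cwnd the result is zero or, viewed
 * as a u32, wraps around to a huge value.
 */
static uint32_t ssthresh_unfloored(uint32_t cwnd, uint32_t reduction)
{
    return cwnd - reduction;  /* unsigned wraparound is well defined in C */
}
```

With `cwnd` 10 and `reduction` 12, the unfloored variant returns 4294967294, which matches the "near 4 billion" target mentioned in the commit message, while the floored variant returns 2.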
    • bonding: make mii_status sysfs node consistent · c8086f6d
      Jarod Wilson authored
      The spew in /proc/net/bonding/bond0 uses netif_carrier_ok() to determine
      mii_status, while /sys/class/net/bond0/bonding/mii_status looks at
      curr_active_slave, which doesn't actually seem to be set sometimes when
      the bond actually is up. A mode 4 bond configured via ifcfg-foo files on a
      Red Hat Enterprise Linux system, after boot, comes up clean and
      functional, but the sysfs node shows mii_status of down, while proc shows
      up. A simple enough fix here seems to be to use the same method for
      determining up or down in both places, and I'd opt for the one that seems
      to match reality.
      
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <gospo@cumulusnetworks.com>
      CC: netdev@vger.kernel.org
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: fix use-after-free in pr_debug statement · 649621e3
      Marcelo Ricardo Leitner authored
      Dmitry Vyukov reported a use-after-free in the code expanded by the
      macro debug_post_sfx, which is caused by the use of the asoc pointer
      after it was freed within sctp_side_effect() scope.
      
      This patch fixes it by allowing sctp_side_effect to clear that asoc
      pointer when the TCB is freed.
      
      As Vlad explained, we also have to cover the SCTP_DISPOSITION_ABORT case
      because it will trigger DELETE_TCB too on that same loop.
      
      Also, there were places issuing SCTP_CMD_INIT_FAILED and ASSOC_FAILED
      but returning SCTP_DISPOSITION_CONSUME, which would fool the scheme
      above. Fix it by returning SCTP_DISPOSITION_ABORT instead.
      
      The macro is already prepared to handle such NULL pointer.
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>