  1. 11 Nov, 2019 2 commits
  2. 09 Nov, 2019 2 commits
  3. 07 Nov, 2019 14 commits
  4. 06 Nov, 2019 2 commits
  5. 04 Nov, 2019 20 commits
    • Merge branch 'bpf-libbpf-bitfield-size-relo' · f23c7ce3
      Daniel Borkmann authored
      Andrii Nakryiko says:
      
      ====================
      This patch set adds support for reading bitfields in a relocatable manner.
      It relies on a set of relocations emitted by Clang, adds the corresponding
      libbpf support for those relocations, and abstracts the details behind the
      BPF_CORE_READ_BITFIELD and BPF_CORE_READ_BITFIELD_PROBED macros.
      
      We also add support for capturing a relocatable field size, so that BPF
      program code can adjust its logic to the actual amount of data it needs to
      operate on, even if that amount changes between kernels. A new convenience
      macro, bpf_core_field_size(), is added to bpf_core_read.h, in the same family
      of macros as bpf_core_read() and bpf_core_field_exists(). A corresponding set
      of selftests is added to exercise this logic and validate correctness in a
      variety of scenarios.
      
      Some of the overly strict logic of matching fields is relaxed to support a
      wider variety of scenarios. See patch #1 for that.
      
      Patch #1 removes a few overly strict test cases.
      Patch #2 adds support for bitfield-related relocations.
      Patch #3 adds some further adjustments to support generic field size
      relocations and introduces bpf_core_field_size() macro.
      Patch #4 tests bitfield reading.
      Patch #5 tests field size relocations.
      
      v1 -> v2:
        - added direct memory read-based macro and tests for bitfield reads.
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • selftests/bpf: Add field size relocation tests · 0b163565
      Andrii Nakryiko authored
      Add test verifying correctness and logic of field size relocation support in
      libbpf.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20191101222810.1246166-6-andriin@fb.com
    • selftest/bpf: Add relocatable bitfield reading tests · 8b1cb1c9
      Andrii Nakryiko authored
      Add a bunch of selftests verifying correctness of relocatable bitfield reading
      support in libbpf. Both bpf_probe_read()-based and direct read-based bitfield
      macros are tested. The core_reloc.c "test_harness" is extended to support raw
      tracepoints and the new typed raw tracepoints as test BPF program types.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20191101222810.1246166-5-andriin@fb.com
    • libbpf: Add support for field size relocations · 94f060e9
      Andrii Nakryiko authored
      Add bpf_core_field_size() macro, capturing a relocation against field size.
      Adjust bits of internal libbpf relocation logic to allow capturing size
      relocations of various field types: arrays, structs/unions, enums, etc.
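The idea of a relocatable field size can be modelled outside of BPF. The Python sketch below is illustrative only and is not libbpf code: the two layout tables, the field names, and the helper functions are hypothetical. In the real mechanism the size is resolved by libbpf at load time, whereas here a dictionary lookup stands in for that step. The point it shows is that the reader asks for the field's size on the running "kernel" instead of hard-coding it.

```python
# Hypothetical layouts of the same struct on two kernels: field -> (offset, size).
LAYOUT_OLD = {"flags": (0, 2), "len": (2, 2)}
LAYOUT_NEW = {"flags": (0, 4), "len": (4, 4)}


def field_size(layout, field):
    """Stand-in for a size relocation: report the field size for this layout."""
    return layout[field][1]


def read_field(buf, layout, field):
    """Read exactly as many bytes as the field occupies in this layout."""
    offset, size = layout[field]
    return int.from_bytes(buf[offset:offset + size], "little")


# The same logical record encoded for each layout (flags = 1, len = 7).
old_record = (1).to_bytes(2, "little") + (7).to_bytes(2, "little")
new_record = (1).to_bytes(4, "little") + (7).to_bytes(4, "little")
```

With these definitions, `read_field(old_record, LAYOUT_OLD, "len")` and `read_field(new_record, LAYOUT_NEW, "len")` both return 7, even though the field is 2 bytes wide in one layout and 4 bytes wide in the other.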
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20191101222810.1246166-4-andriin@fb.com
    • libbpf: Add support for relocatable bitfields · ee26dade
      Andrii Nakryiko authored
      Add support for the new field relocation kinds necessary to support
      relocatable bitfield reads. Provide macros abstracting the code needed to
      extract a relocatable bitfield in full into a u64 value. Two separate macros
      are provided:
      - BPF_CORE_READ_BITFIELD macro for direct memory read-enabled BPF programs
        (e.g., typed raw tracepoints). It uses a direct memory dereference to
        extract the bitfield's backing integer value.
      - BPF_CORE_READ_BITFIELD_PROBED macro for cases where bpf_probe_read() needs
        to be used to extract the same backing integer value.
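The extraction that such macros abstract can be sketched outside of BPF. The Python below is an illustrative model only, not libbpf code: the function name and its parameters (`byte_off`, `byte_size`, `bit_off`, `bit_width`) are hypothetical stand-ins for values that relocations would supply, and a little-endian layout with `bit_off` counted from the least significant bit is assumed. It reads the backing integer, shifts the field up to the top of a 64-bit value, and shifts it back down, sign-extending when the field is signed.

```python
MASK64 = (1 << 64) - 1


def extract_bitfield(buf, byte_off, byte_size, bit_off, bit_width, signed=False):
    """Extract a bitfield from its backing integer into a 64-bit-style value.

    All parameters are hypothetical stand-ins for relocated values.
    """
    # Read the backing integer that contains the bitfield.
    backing = int.from_bytes(buf[byte_off:byte_off + byte_size], "little")

    # Shift left so the field's top bit lands on bit 63; this drops higher bits.
    lshift = 64 - (bit_off + bit_width)
    val = (backing << lshift) & MASK64

    # For signed fields, reinterpret the 64-bit pattern as two's complement so
    # that the right shift below is arithmetic and sign-extends.
    if signed and (val >> 63):
        val -= 1 << 64

    # Shift right so the field's lowest bit lands on bit 0; this drops lower bits.
    rshift = 64 - bit_width
    return val >> rshift
```

For the single backing byte `0xB6` (binary `10110110`), bits 2 to 4 are `101`, so `extract_bitfield(bytes([0xB6]), 0, 1, 2, 3)` returns 5, and the same call with `signed=True` returns -3.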
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20191101222810.1246166-3-andriin@fb.com
    • selftests/bpf: Remove too strict field offset relo test cases · 42765ede
      Andrii Nakryiko authored
      As libbpf is going to gain support for more field relocations, including field
      size, some restrictions about exact size match are going to be lifted. Remove
      test cases that explicitly test such failures.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20191101222810.1246166-2-andriin@fb.com
    • Merge tag 'mlx5-updates-2019-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 1574cf83
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      mlx5-updates-2019-11-01
      
      Misc updates for mlx5 netdev and core driver
      
      1) Steering Core: Replace CRC32 internal implementation with standard
         kernel lib.
      2) Steering Core: Support IPv4 and IPv6 mixed matcher.
      3) Steering Core: Lockless FTE read lookups
      4) TC: Bit sized fields rewrite support.
      5) FPGA: Standalone FPGA support.
      6) SRIOV: Reset VF parameters configurations on SRIOV disable.
      7) netdev: Dump WQs wqe descriptors on CQE with error events.
      8) MISC Cleanups.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mISDN: remove unused variable 'faxmodulation_s' · a37ac8ae
      YueHaibing authored
      drivers/isdn/hardware/mISDN/mISDNisar.c:30:17:
       warning: faxmodulation_s defined but not used [-Wunused-const-variable=]
      
      It is never used, so it can be removed.
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ptp: Add a ptp clock driver for IDT ClockMatrix. · 3a6ba7dc
      Vincent Cheng authored
      The IDT ClockMatrix (TM) family includes integrated devices that provide
      eight PLL channels.  Each PLL channel can be independently configured as a
      frequency synthesizer, jitter attenuator, digitally controlled
      oscillator (DCO), or a digital phase lock loop (DPLL).  Typically
      these devices are used as timing references and clock sources for PTP
      applications.  This patch adds support for the device.
      Co-developed-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-bindings: ptp: Add device tree binding for IDT ClockMatrix based PTP clock · 5c5e7aac
      Vincent Cheng authored
      Add device tree binding doc for the IDT ClockMatrix PTP clock.
      Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: icmp6: provide input address for traceroute6 · fac6fce9
      Francesco Ruggeri authored
      traceroute6 output can be confusing, in that it shows the address
      that a router would use to reach the sender, rather than the address
      the packet used to reach the router.
      Consider this case:
      
              ------------------------ N2
               |                    |
             ------              ------  N3  ----
             | R1 |              | R2 |------|H2|
             ------              ------      ----
               |                    |
              ------------------------ N1
                        |
                       ----
                       |H1|
                       ----
      
      where H1's default route is through R1, and R1's default route is
      through R2 over N2.
      traceroute6 from H1 to H2 shows R2's address on N1 rather than on N2.
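The difference between the two behaviours can be modelled in a few lines. The Python below is a toy model, not kernel code: the function name and the `use_input_address` flag are hypothetical, and the addresses are R2's addresses from this example (host part 2 on N1, N2, and N3). A probe from H1 reaches R2 over N2, while R2's reply to H1 leaves over N1, so the two selection rules yield different source addresses.

```python
import ipaddress

# R2's interface addresses in the example, keyed by the attached network.
R2_ADDRS = {
    "N1": ipaddress.ip_address("2000:101::2"),
    "N2": ipaddress.ip_address("2000:102::2"),
    "N3": ipaddress.ip_address("2000:103::2"),
}


def icmp_source(addrs, ingress, egress, use_input_address):
    """Pick the source address of an ICMPv6 error sent by a router.

    ingress: network the offending packet arrived on.
    egress:  network the error leaves on, toward the original sender.
    """
    # Old behaviour: the address the router would use to reach the sender.
    # New behaviour: the address on the interface the packet arrived on.
    return addrs[ingress] if use_input_address else addrs[egress]


before = icmp_source(R2_ADDRS, ingress="N2", egress="N1", use_input_address=False)
after = icmp_source(R2_ADDRS, ingress="N2", egress="N1", use_input_address=True)
```

Here `before` is `2000:101::2` and `after` is `2000:102::2`, matching hop 2 in the two traceroute6 outputs in this message.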
      
      The script below can be used to reproduce this scenario.
      
      traceroute6 output without this patch:
      
      traceroute to 2000:103::4 (2000:103::4), 30 hops max, 80 byte packets
       1  2000:101::1 (2000:101::1)  0.036 ms  0.008 ms  0.006 ms
       2  2000:101::2 (2000:101::2)  0.011 ms  0.008 ms  0.007 ms
       3  2000:103::4 (2000:103::4)  0.013 ms  0.010 ms  0.009 ms
      
      traceroute6 output with this patch:
      
      traceroute to 2000:103::4 (2000:103::4), 30 hops max, 80 byte packets
       1  2000:101::1 (2000:101::1)  0.056 ms  0.019 ms  0.006 ms
       2  2000:102::2 (2000:102::2)  0.013 ms  0.008 ms  0.008 ms
       3  2000:103::4 (2000:103::4)  0.013 ms  0.009 ms  0.009 ms
      
      #!/bin/bash
      #
      #        ------------------------ N2
      #         |                    |
      #       ------              ------  N3  ----
      #       | R1 |              | R2 |------|H2|
      #       ------              ------      ----
      #         |                    |
      #        ------------------------ N1
      #                  |
      #                 ----
      #                 |H1|
      #                 ----
      #
      # N1: 2000:101::/64
      # N2: 2000:102::/64
      # N3: 2000:103::/64
      #
      # R1's host part of address: 1
      # R2's host part of address: 2
      # H1's host part of address: 3
      # H2's host part of address: 4
      #
      # For example:
      # the IPv6 address of R1's interface on N2 is 2000:102::1/64
      #
      # Nets are implemented by macvlan interfaces (bridge mode) over
      # dummy interfaces.
      #
      
      # Create net namespaces
      ip netns add host1
      ip netns add host2
      ip netns add rtr1
      ip netns add rtr2
      
      # Create nets
      ip link add net1 type dummy; ip link set net1 up
      ip link add net2 type dummy; ip link set net2 up
      ip link add net3 type dummy; ip link set net3 up
      
      # Add interfaces to net1, move them to their namespaces
      ip link add link net1 dev host1net1 type macvlan mode bridge
      ip link set host1net1 netns host1
      ip link add link net1 dev rtr1net1 type macvlan mode bridge
      ip link set rtr1net1 netns rtr1
      ip link add link net1 dev rtr2net1 type macvlan mode bridge
      ip link set rtr2net1 netns rtr2
      
      # Add interfaces to net2, move them to their namespaces
      ip link add link net2 dev rtr1net2 type macvlan mode bridge
      ip link set rtr1net2 netns rtr1
      ip link add link net2 dev rtr2net2 type macvlan mode bridge
      ip link set rtr2net2 netns rtr2
      
      # Add interfaces to net3, move them to their namespaces
      ip link add link net3 dev rtr2net3 type macvlan mode bridge
      ip link set rtr2net3 netns rtr2
      ip link add link net3 dev host2net3 type macvlan mode bridge
      ip link set host2net3 netns host2
      
      # Configure interfaces and routes in host1
      ip netns exec host1 ip link set lo up
      ip netns exec host1 ip link set host1net1 up
      ip netns exec host1 ip -6 addr add 2000:101::3/64 dev host1net1
      ip netns exec host1 ip -6 route add default via 2000:101::1
      
      # Configure interfaces and routes in rtr1
      ip netns exec rtr1 ip link set lo up
      ip netns exec rtr1 ip link set rtr1net1 up
      ip netns exec rtr1 ip -6 addr add 2000:101::1/64 dev rtr1net1
      ip netns exec rtr1 ip link set rtr1net2 up
      ip netns exec rtr1 ip -6 addr add 2000:102::1/64 dev rtr1net2
      ip netns exec rtr1 ip -6 route add default via 2000:102::2
      ip netns exec rtr1 sysctl net.ipv6.conf.all.forwarding=1
      
      # Configure interfaces and routes in rtr2
      ip netns exec rtr2 ip link set lo up
      ip netns exec rtr2 ip link set rtr2net1 up
      ip netns exec rtr2 ip -6 addr add 2000:101::2/64 dev rtr2net1
      ip netns exec rtr2 ip link set rtr2net2 up
      ip netns exec rtr2 ip -6 addr add 2000:102::2/64 dev rtr2net2
      ip netns exec rtr2 ip link set rtr2net3 up
      ip netns exec rtr2 ip -6 addr add 2000:103::2/64 dev rtr2net3
      ip netns exec rtr2 sysctl net.ipv6.conf.all.forwarding=1
      
      # Configure interfaces and routes in host2
      ip netns exec host2 ip link set lo up
      ip netns exec host2 ip link set host2net3 up
      ip netns exec host2 ip -6 addr add 2000:103::4/64 dev host2net3
      ip netns exec host2 ip -6 route add default via 2000:103::2
      
      # Ping host2 from host1
      ip netns exec host1 ping6 -c5 2000:103::4
      
      # Traceroute host2 from host1
      ip netns exec host1 traceroute6 2000:103::4
      
      # Delete nets
      ip link del net3
      ip link del net2
      ip link del net1
      
      # Delete namespaces
      ip netns del rtr2
      ip netns del rtr1
      ip netns del host2
      ip netns del host1
      Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
      Original-patch-by: Honggang Xu <hxu@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: improve message bundling algorithm · 06e7c70c
      Tuong Lien authored
      As mentioned in commit e95584a8 ("tipc: fix unlimited bundling of
      small messages"), the current message bundling algorithm is inefficient:
      it can generate bundles containing only one payload message, which causes
      unnecessary overhead for both the sender and the receiver.
      
      This commit redesigns the 'tipc_msg_make_bundle()' function (now named
      'tipc_msg_try_bundle()') so that when the first message arrives, we only
      check whether it is suitable for bundling and, if so, keep a reference to
      it. The message buffer is put into the link backlog queue and processed as
      normal. Later, when another message arrives, we make a bundle with the
      first message if possible, and so on. This way a bundle, when one is really
      needed, always consists of at least two payload messages. Otherwise, we let
      the first buffer go its way without any bundling, which reduces the
      overhead to zero.
      
      Moreover, since we now have both messages in hand, we can also optimize
      the 'tipc_msg_bundle()' function to bundle a very large message
      (size ~ MSS) with a small one, which the current algorithm cannot do,
      e.g. [1400-byte message] + [10-byte message] (MTU = 1500).
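The deferred bundling described above can be sketched as a small queue model. The Python below is a simplified illustration, not the TIPC implementation: the `Link` class, the `BUNDLE_HDR` value of 40 bytes, and the size-only suitability check are assumptions made for the sketch (real suitability checks and alignment rules are ignored). Each queued buffer is a list of payload sizes; a list with a single entry is an ordinary message that never paid any bundling cost.

```python
MTU = 1500        # taken from the example in the text
BUNDLE_HDR = 40   # assumed bundle header size, for illustration only


class Link:
    def __init__(self):
        self.backlog = []      # queued buffers, each a list of payload sizes
        self.candidate = None  # most recent buffer still open for bundling

    def send(self, size):
        cand = self.candidate
        # A later message joins the remembered one only if both fit in one MTU.
        if cand is not None and BUNDLE_HDR + sum(cand) + size <= MTU:
            cand.append(size)
            return
        # Otherwise queue the message as-is and only remember it as a
        # bundling candidate; no bundle is built for a lone message.
        buf = [size]
        self.backlog.append(buf)
        self.candidate = buf if BUNDLE_HDR + size < MTU else None


link = Link()
link.send(1400)
link.send(10)
```

After these two calls `link.backlog` is `[[1400, 10]]`, the large-plus-small case from the text, while a link that only ever sees one message keeps it unbundled as `[[10]]`.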
      Acked-by: Ying Xue <ying.xue@windreiver.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: icmp: use input address in traceroute · 2adf81c0
      Francesco Ruggeri authored
      Even with icmp_errors_use_inbound_ifaddr set, traceroute returns the
      primary address of the interface the packet was received on, even if
      the path goes through a secondary address. In the example:
      
                          1.0.3.1/24
       ---- 1.0.1.3/24    1.0.1.1/24 ---- 1.0.2.1/24    1.0.2.4/24 ----
       |H1|--------------------------|R1|--------------------------|H2|
       ----            N1            ----            N2            ----
      
      where 1.0.3.1/24 is R1's primary address on N1, traceroute from
      H1 to H2 returns:
      
      traceroute to 1.0.2.4 (1.0.2.4), 30 hops max, 60 byte packets
       1  1.0.3.1 (1.0.3.1)  0.018 ms  0.006 ms  0.006 ms
       2  1.0.2.4 (1.0.2.4)  0.021 ms  0.007 ms  0.007 ms
      
      After applying this patch, it returns:
      
      traceroute to 1.0.2.4 (1.0.2.4), 30 hops max, 60 byte packets
       1  1.0.1.1 (1.0.1.1)  0.033 ms  0.007 ms  0.006 ms
       2  1.0.2.4 (1.0.2.4)  0.011 ms  0.007 ms  0.007 ms
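One selection rule that is consistent with this example is to prefer the interface address that shares a subnet with the source of the offending packet, falling back to the primary address. The Python below sketches that rule as an assumption; the commit message does not spell out the exact rule, and the function name and the `match_input` flag are hypothetical.

```python
import ipaddress

# R1's addresses on N1 from the example, primary address first.
R1_N1_ADDRS = [
    ipaddress.ip_interface("1.0.3.1/24"),
    ipaddress.ip_interface("1.0.1.1/24"),
]


def select_source(iface_addrs, packet_src, match_input):
    """Choose the source address for an ICMP error on the receiving interface.

    iface_addrs: addresses configured on the interface, primary first.
    packet_src:  source address of the packet that triggered the error.
    """
    if match_input:
        src = ipaddress.ip_address(packet_src)
        for addr in iface_addrs:
            # Prefer the address whose subnet contains the packet's source.
            if src in addr.network:
                return addr.ip
    # Fallback, and the behaviour before the patch: the primary address.
    return iface_addrs[0].ip
```

With H1 at 1.0.1.3, `select_source(R1_N1_ADDRS, "1.0.1.3", False)` gives 1.0.3.1 and `select_source(R1_N1_ADDRS, "1.0.1.3", True)` gives 1.0.1.1, the two first hops shown in the traceroute outputs.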
      Original-patch-by: Bill Fenner <fenner@arista.com>
      Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'optimize-openvswitch-flow-looking-up' · c219a166
      David S. Miller authored
      Tonghao Zhang says:
      
      ====================
      optimize openvswitch flow looking up
      
      The patches in this series either optimize openvswitch performance or
      simplify the code.
      
      Patches 1, 2, 4: Port Pravin B Shelar's patches to
      linux upstream with few changes.
      
      Patches 5, 6, 7: Optimize the flow lookup and
      simplify the flow hash.
      
      Patches 8, 9: Bug fixes.
      
      The performance test is on Intel Xeon E5-2630 v4.
      The test topology is shown below:
      
      +-----------------------------------+
      |   +---------------------------+   |
      |   | eth0   ovs-switch    eth1 |   | Host0
      |   +---------------------------+   |
      +-----------------------------------+
            ^                       |
            |                       |
            |                       |
            |                       |
            |                       v
      +-----+----+             +----+-----+
      | netperf  | Host1       | netserver| Host2
      +----------+             +----------+
      
      We use netperf to send 64B packets, and insert 255+ flow-masks:
      $ ovs-dpctl add-flow ovs-switch "in_port(1),eth(dst=00:01:00:00:00:00/ff:ff:ff:ff:ff:01),eth_type(0x0800),ipv4(frag=no)" 2
      ...
      $ ovs-dpctl add-flow ovs-switch "in_port(1),eth(dst=00:ff:00:00:00:00/ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(frag=no)" 2
      $
      $ netperf -t UDP_STREAM -H 2.2.2.200 -l 40 -- -m 18
      
      * Without the patch series, throughput is 8.28Mbps
      * With the patch series, throughput is 46.05Mbps
      
      v6:
      some coding style fixes
      
      v5:
      rewrite patch 8, release flow-mask when freeing flow
      
      v4:
      access ma->count with the READ_ONCE/WRITE_ONCE API. For more information,
      see the patch 5 comments.
      
      v3:
      update the ma pointer when reallocating mask_array in patch 5
      
      v2:
      simplify the code, e.g. use kfree_rcu instead of call_rcu
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: simplify the ovs_dp_cmd_new · eec62ead
      Tonghao Zhang authored
      Use the specified functions to initialize the resources.
      Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
      Tested-by: Greg Rose <gvrose8192@gmail.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: don't unlock mutex when changing the user_features fails · 4c76bf69
      Tonghao Zhang authored
      Unlocking a mutex that is not locked is not allowed.
      Another kernel thread may be in the critical section when
      we unlock it because setting user_features failed.
      
      Fixes: 95a7233c ("net: openvswitch: Set OvS recirc_id from tc chain index")
      Cc: Paul Blakey <paulb@mellanox.com>
      Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
      Tested-by: Greg Rose <gvrose8192@gmail.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: fix possible memleak on destroy flow-table · 50b0e61b
      Tonghao Zhang authored
      When we destroy the flow tables, they may still contain flow masks,
      so release the flow mask structs as well.
      Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
      Tested-by: Greg Rose <gvrose8192@gmail.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: add likely in flow_lookup · 0a3e0137
      Tonghao Zhang authored
      In most cases *index < ma->max and the flow-mask is not NULL.
      Add likely/unlikely annotations for performance.
      Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
      Tested-by: Greg Rose <gvrose8192@gmail.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: simplify the flow_hash · 515b65a4
      Tonghao Zhang authored
      Simplify the code and remove the unnecessary BUILD_BUG_ON.
      Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
      Tested-by: Greg Rose <gvrose8192@gmail.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: optimize flow-mask looking up · 57f7d7b9
      Tonghao Zhang authored
      A full lookup on the flow table traverses the whole mask array.
      If the mask array is too large and the number of invalid flow-masks
      increases, performance drops.
      
      One bad case, for example (M means the flow-mask is valid, NULL
      means the flow-mask was deleted):
      
      +-------------------------------------------+
      | M | NULL | ...                  | NULL | M|
      +-------------------------------------------+
      
      In that case, without this patch, openvswitch traverses the whole
      mask array, because there is one flow-mask at the tail. This
      patch changes the way flow-masks are inserted and deleted so that the
      mask array is kept as shown below, with no NULL holes. In the
      fast path, we can "break" out of the "for" loop (rather than "continue")
      in flow_lookup when we reach a NULL flow-mask.
      
               "break"
                  v
      +-------------------------------------------+
      | M | M |  NULL |...           | NULL | NULL|
      +-------------------------------------------+
      
      This patch doesn't optimize the slow or control path, which still uses
      ma->max to traverse. Slow path:
      * tbl_mask_array_realloc
      * ovs_flow_tbl_lookup_exact
      * flow_mask_find
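The hole-free invariant and the early exit it enables can be sketched as follows. The Python below is an illustrative model, not the openvswitch code: the `MaskArray` class and its method names are hypothetical, `None` plays the role of NULL, and filling a deleted slot with the last valid mask is one way to keep the array compact. The `visited` counter is only there to show how many slots a lookup touches.

```python
class MaskArray:
    def __init__(self, capacity):
        self.masks = [None] * capacity  # valid masks first, None entries after
        self.count = 0                  # number of valid masks

    def insert(self, mask):
        if self.count == len(self.masks):
            raise OverflowError("mask array full")  # real code would reallocate
        self.masks[self.count] = mask   # always append right after the last valid mask
        self.count += 1

    def delete(self, mask):
        i = self.masks.index(mask, 0, self.count)  # ValueError if not present
        self.count -= 1
        self.masks[i] = self.masks[self.count]     # move last valid mask into the hole
        self.masks[self.count] = None

    def lookup(self, match):
        """Fast-path lookup: stop at the first None instead of skipping it."""
        visited = 0
        for mask in self.masks:
            if mask is None:
                break                   # no holes, so nothing valid follows
            visited += 1
            if match(mask):
                return mask, visited
        return None, visited


ma = MaskArray(8)
for name in ("a", "b", "c"):
    ma.insert(name)
ma.delete("a")
```

After the deletion the array starts with `"c"`, `"b"`, `None`, and a lookup that matches nothing visits only the 2 valid slots rather than all 8.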
      Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
      Tested-by: Greg Rose <gvrose8192@gmail.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>