1. 16 May, 2014 11 commits
    • openvswitch: Fix output of SCTP mask. · d92ab135
      Jarno Rajahalme authored
      The 'output' argument of ovs_nla_put_flow() is the one whose bits
      are written to the netlink attributes.  For SCTP we accidentally
      used the bits from 'swkey' instead, so the mask attributes contained
      bits from the actual flow key rather than from the mask.
      Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
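The bug pattern described in this commit can be sketched in plain C. This is a simplified userspace illustration, not the kernel code: `struct sketch_key`, `struct sketch_attr`, and `put_sctp_ports()` are hypothetical stand-ins for the flow key, the netlink attribute payload, and the serializer. The point is only that, when the mask is being serialized, `output` points at the mask, so reading from `swkey` would leak the key's bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the SCTP part of a flow key. */
struct sketch_key {
    uint16_t sctp_src;
    uint16_t sctp_dst;
};

/* Hypothetical stand-in for the netlink attribute payload. */
struct sketch_attr {
    uint16_t sctp_src;
    uint16_t sctp_dst;
};

/*
 * Serialize the SCTP ports.  'swkey' is always the flow key; 'output' is
 * the structure whose bits should be written (the key itself or its mask).
 */
static void put_sctp_ports(const struct sketch_key *swkey,
                           const struct sketch_key *output,
                           struct sketch_attr *attr)
{
    assert(swkey && output && attr);
    /* The buggy form read swkey->sctp_src and swkey->sctp_dst here. */
    attr->sctp_src = output->sctp_src;
    attr->sctp_dst = output->sctp_dst;
}
```

Calling the function with the mask as `output` yields the mask's bits regardless of what the key holds.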
    • openvswitch: Per NUMA node flow stats. · 63e7959c
      Jarno Rajahalme authored
      Keep kernel flow stats for each NUMA node rather than each (logical)
      CPU.  This avoids using the per-CPU allocator and removes most of the
      kernel-side OVS locking overhead that otherwise appears at the top of
      perf reports, allowing OVS to scale better with a higher number of
      threads.
      
      With 9 handlers and 4 revalidators, the netperf TCP_CRR test flow
      setup rate doubles on a server with two hyper-threaded physical CPUs
      (16 logical cores each) compared to the current OVS master.  Tested
      with a non-trivial flow table with a TCP port match rule forcing all
      new connections with unique port numbers to OVS userspace.  The IP
      addresses are still wildcarded, so the kernel flows are not
      considered exact-match 5-tuple flows.  Flows of this type can be
      expected to appear in large numbers as a result of the more effective
      wildcarding made possible by improvements in the OVS userspace flow
      classifier.
      
      Perf results for this test (master):
      
      Events: 305K cycles
      +   8.43%     ovs-vswitchd  [kernel.kallsyms]   [k] mutex_spin_on_owner
      +   5.64%     ovs-vswitchd  [kernel.kallsyms]   [k] __ticket_spin_lock
      +   4.75%     ovs-vswitchd  ovs-vswitchd        [.] find_match_wc
      +   3.32%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_lock
      +   2.61%     ovs-vswitchd  [kernel.kallsyms]   [k] pcpu_alloc_area
      +   2.19%     ovs-vswitchd  ovs-vswitchd        [.] flow_hash_in_minimask_range
      +   2.03%          swapper  [kernel.kallsyms]   [k] intel_idle
      +   1.84%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_unlock
      +   1.64%     ovs-vswitchd  ovs-vswitchd        [.] classifier_lookup
      +   1.58%     ovs-vswitchd  libc-2.15.so        [.] 0x7f4e6
      +   1.07%     ovs-vswitchd  [kernel.kallsyms]   [k] memset
      +   1.03%          netperf  [kernel.kallsyms]   [k] __ticket_spin_lock
      +   0.92%          swapper  [kernel.kallsyms]   [k] __ticket_spin_lock
      ...
      
      And after this patch:
      
      Events: 356K cycles
      +   6.85%     ovs-vswitchd  ovs-vswitchd        [.] find_match_wc
      +   4.63%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_lock
      +   3.06%     ovs-vswitchd  [kernel.kallsyms]   [k] __ticket_spin_lock
      +   2.81%     ovs-vswitchd  ovs-vswitchd        [.] flow_hash_in_minimask_range
      +   2.51%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_unlock
      +   2.27%     ovs-vswitchd  ovs-vswitchd        [.] classifier_lookup
      +   1.84%     ovs-vswitchd  libc-2.15.so        [.] 0x15d30f
      +   1.74%     ovs-vswitchd  [kernel.kallsyms]   [k] mutex_spin_on_owner
      +   1.47%          swapper  [kernel.kallsyms]   [k] intel_idle
      +   1.34%     ovs-vswitchd  ovs-vswitchd        [.] flow_hash_in_minimask
      +   1.33%     ovs-vswitchd  ovs-vswitchd        [.] rule_actions_unref
      +   1.16%     ovs-vswitchd  ovs-vswitchd        [.] hindex_node_with_hash
      +   1.16%     ovs-vswitchd  ovs-vswitchd        [.] do_xlate_actions
      +   1.09%     ovs-vswitchd  ovs-vswitchd        [.] ofproto_rule_ref
      +   1.01%          netperf  [kernel.kallsyms]   [k] __ticket_spin_lock
      ...
      
      There is a small increase in kernel spinlock overhead because the
      same spinlock is shared between multiple cores of the same physical
      CPU.  The effect is barely visible in netperf TCP_CRR performance when
      testing kernel module throughput (no userspace activity, a handful of
      kernel flows): maybe a ~1% drop, which is hard to tell exactly due to
      variance in the test results.
      
      On flow setup, a single stats instance is allocated (for NUMA node
      0).  As CPUs from multiple NUMA nodes start updating stats, new
      NUMA-node-specific stats instances are allocated.  This allocation on
      the packet processing code path is made to never block or look for
      emergency memory pools, minimizing the allocation latency.  If the
      allocation fails, the existing preallocated stats instance is used.
      Also, if only CPUs from one NUMA-node are updating the preallocated
      stats instance, no additional stats instances are allocated.  This
      eliminates the need to pre-allocate stats instances that will not be
      used, also relieving the stats reader from the burden of reading stats
      that are never used.
      Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
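The allocation policy described in the last paragraph of this commit message can be sketched as follows. This is a single-threaded userspace illustration under stated assumptions, not the kernel implementation: all `sketch_*` names, the fixed node count, the `last_writer` field, and the `can_alloc` flag (which simulates whether the non-blocking allocation succeeds) are hypothetical, and the locking the real code needs is omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_NODES 4          /* assumed node count for the sketch */

struct sketch_stats {
    uint64_t packets;
    uint64_t bytes;
};

struct sketch_flow {
    /* stats[0] is preallocated at flow setup; others appear on demand. */
    struct sketch_stats *stats[SKETCH_MAX_NODES];
    int last_writer;                /* last node that wrote stats[0], or -1 */
};

static int sketch_flow_init(struct sketch_flow *flow)
{
    for (int i = 0; i < SKETCH_MAX_NODES; i++)
        flow->stats[i] = NULL;
    flow->last_writer = -1;
    flow->stats[0] = calloc(1, sizeof(*flow->stats[0]));
    return flow->stats[0] ? 0 : -1;
}

/* Account one packet of 'len' bytes seen by a CPU on NUMA node 'node'. */
static void sketch_stats_update(struct sketch_flow *flow, int node,
                                uint64_t len, int can_alloc)
{
    struct sketch_stats *s;

    assert(node >= 0 && node < SKETCH_MAX_NODES);
    s = flow->stats[node];
    if (node != 0 && !s) {
        if (flow->last_writer == -1 || flow->last_writer == node) {
            /* Sole writer so far: keep using the preallocated instance. */
            s = flow->stats[0];
        } else {
            /* Another node writes stats[0]: try to get our own instance. */
            s = can_alloc ? calloc(1, sizeof(*s)) : NULL;
            if (s)
                flow->stats[node] = s;
            else
                s = flow->stats[0];     /* allocation failed: fall back */
        }
    }
    if (s == flow->stats[0])
        flow->last_writer = node;
    s->packets++;
    s->bytes += len;
}

/* The reader only visits instances that were actually allocated. */
static struct sketch_stats sketch_stats_get(const struct sketch_flow *flow)
{
    struct sketch_stats total = { 0, 0 };

    for (int i = 0; i < SKETCH_MAX_NODES; i++) {
        if (flow->stats[i]) {
            total.packets += flow->stats[i]->packets;
            total.bytes += flow->stats[i]->bytes;
        }
    }
    return total;
}

static void sketch_flow_destroy(struct sketch_flow *flow)
{
    for (int i = 0; i < SKETCH_MAX_NODES; i++) {
        free(flow->stats[i]);
        flow->stats[i] = NULL;
    }
}
```

In this sketch a flow that is only ever touched from one node never allocates a second instance, and a failed allocation silently shares the preallocated one, which mirrors the behaviour the commit message describes.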
    • openvswitch: Remove 5-tuple optimization. · 23dabf88
      Jarno Rajahalme authored
      The 5-tuple optimization becomes unnecessary with a later per-NUMA
      node stats patch.  Remove it first to make the changes easier to
      grasp.
      Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
    • openvswitch: Use ether_addr_copy · 8c63ff09
      Joe Perches authored
      It's slightly smaller/faster for some architectures.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
    • openvswitch: flow_netlink: Use pr_fmt to OVS_NLERR output · 2235ad1c
      Joe Perches authored
      Add "openvswitch: " prefix to OVS_NLERR output
      to match the other OVS_NLERR output of datapath.c
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
    • openvswitch: Use net_ratelimit in OVS_NLERR · 1815a883
      Joe Perches authored
      Each use of pr_<level>_once has a per-site flag.
      
      Some of the OVS_NLERR messages look as if seeing them
      multiple times could be useful, so use net_ratelimit()
      instead of pr_info_once.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
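The difference between a per-site "once" flag and rate limiting can be illustrated with a small userspace sketch. Both helpers below are hypothetical and are not the kernel's `pr_info_once` or `net_ratelimit()` implementations; the interval and burst values are arbitrary, and time is passed in explicitly to keep the behaviour deterministic.

```c
#include <assert.h>

/* Hypothetical "print once" gate: one flag per call site. */
static int sketch_once_ok(int *site_flag)
{
    assert(site_flag);
    if (*site_flag)
        return 0;           /* this site has already printed */
    *site_flag = 1;
    return 1;
}

/* Hypothetical rate limiter: at most 'burst' messages per 'interval'. */
struct sketch_ratelimit {
    long interval;          /* window length, in arbitrary time units */
    int burst;              /* messages allowed per window */
    long begin;             /* start of the current window */
    int printed;            /* messages emitted in the current window */
};

static int sketch_ratelimit_ok(struct sketch_ratelimit *rs, long now)
{
    assert(rs);
    if (now - rs->begin >= rs->interval) {
        /* A new window starts: forget what was printed before. */
        rs->begin = now;
        rs->printed = 0;
    }
    if (rs->printed >= rs->burst)
        return 0;           /* suppressed for the rest of this window */
    rs->printed++;
    return 1;
}
```

With the once gate, a message is never seen again after its first occurrence; with the rate limiter, it can reappear in every new window, which is the property this commit is after.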
    • openvswitch: Added (unsigned long long) cast in printf · cc23ebf3
      Daniele Di Proietto authored
      This is necessary, since u64 is not unsigned long long
      on all architectures: u64 could also be uint64_t.
      Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
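A minimal userspace sketch of the cast this commit adds is shown below. `format_u64()` is a hypothetical helper, and `uint64_t` stands in for the kernel's `u64`; the cast makes the argument type match `%llu` regardless of how the 64-bit type is defined on a given platform.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a 64-bit value portably.  Without the cast, "%llu" is only
 * correct on platforms where the 64-bit type is unsigned long long.
 */
static int format_u64(char *buf, size_t len, uint64_t value)
{
    assert(buf && len > 0);
    return snprintf(buf, len, "%llu", (unsigned long long)value);
}
```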
    • openvswitch: avoid cast-qual warning in vport_priv · 07dc0602
      Daniele Di Proietto authored
      This function must cast a const value to a non-const value.
      Adding a uintptr_t cast suppresses the warning.
      Avoiding the cast altogether (the proper solution) would require
      changing several function signatures.
      Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
    • openvswitch: avoid warnings in vport_from_priv · d0b4da13
      Daniele Di Proietto authored
      First, this change avoids declaring the formal parameter const,
      since it is treated as non-const (to avoid -Wcast-qual).
      Second, it casts the pointer from void * to u8 *, since it is used
      in arithmetic (to avoid -Wpointer-arith).
      Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
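The two vport helper changes (the uintptr_t cast in vport_priv and the u8 * cast in vport_from_priv) can be illustrated with a userspace sketch of the same layout trick: a private area placed directly after an aligned header. Everything here is an assumption for illustration: `struct sketch_vport`, the `SKETCH_ALIGN` value, and the helper names are hypothetical, and `uint8_t` stands in for the kernel's `u8`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_ALIGN 8      /* assumed alignment of the private area */
#define SKETCH_ALIGN_UP(x) \
    (((x) + SKETCH_ALIGN - 1) & ~(size_t)(SKETCH_ALIGN - 1))

struct sketch_vport {
    int port_no;
};

/* Allocate a header followed by 'priv_size' bytes of private data. */
static struct sketch_vport *sketch_vport_alloc(size_t priv_size)
{
    return calloc(1, SKETCH_ALIGN_UP(sizeof(struct sketch_vport)) + priv_size);
}

/*
 * Header -> private area.  Going through uintptr_t drops the const
 * qualifier without tripping -Wcast-qual.
 */
static void *sketch_vport_priv(const struct sketch_vport *vport)
{
    assert(vport);
    return (uint8_t *)(uintptr_t)vport +
           SKETCH_ALIGN_UP(sizeof(struct sketch_vport));
}

/*
 * Private area -> header.  The parameter is not const because the result
 * is not const, and the byte-pointer cast avoids arithmetic on void *
 * (which -Wpointer-arith flags).
 */
static struct sketch_vport *sketch_vport_from_priv(void *priv)
{
    assert(priv);
    return (struct sketch_vport *)((uint8_t *)priv -
           SKETCH_ALIGN_UP(sizeof(struct sketch_vport)));
}
```

A round trip through both helpers should return the original header pointer.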
    • openvswitch: use const in some local vars and casts · 7085130b
      Daniele Di Proietto authored
      In a few functions, const formal parameters are assigned or cast to
      non-const.
      These changes suppress the warnings produced when compiling with
      -Wcast-qual.
      Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
    • macvlan: simplify the structure port · a188a54d
      dingtianhong authored
      port->count was used to count the macvlan devices on the same
      port, but the vlans list can serve the same purpose.  Free the
      port when the vlans list is empty, so the count member is no
      longer needed.
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
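The idea behind the macvlan change, using list emptiness instead of a separate counter to decide when to free the port, can be sketched in userspace C. This is an illustration only: the `sketch_*` types and the small circular list are hypothetical stand-ins for the macvlan structures and the kernel's list helpers.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, standing in for the kernel's. */
struct sketch_node {
    struct sketch_node *prev, *next;
};

static void sketch_list_init(struct sketch_node *head)
{
    head->prev = head->next = head;
}

static int sketch_list_empty(const struct sketch_node *head)
{
    return head->next == head;
}

static void sketch_list_add(struct sketch_node *n, struct sketch_node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static void sketch_list_del(struct sketch_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

struct sketch_port {
    struct sketch_node vlans;       /* list head; no separate count */
};

struct sketch_vlan {
    struct sketch_node link;
    struct sketch_port *port;
};

static struct sketch_port *sketch_port_create(void)
{
    struct sketch_port *port = malloc(sizeof(*port));

    if (port)
        sketch_list_init(&port->vlans);
    return port;
}

static void sketch_vlan_add(struct sketch_vlan *vlan, struct sketch_port *port)
{
    assert(vlan && port);
    vlan->port = port;
    sketch_list_add(&vlan->link, &port->vlans);
}

/* Unlink a vlan; free the port once its list is empty.  Returns 1 if freed. */
static int sketch_vlan_del(struct sketch_vlan *vlan)
{
    struct sketch_port *port = vlan->port;

    sketch_list_del(&vlan->link);
    vlan->port = NULL;
    if (sketch_list_empty(&port->vlans)) {
        free(port);
        return 1;
    }
    return 0;
}
```

Because membership in the list is the only source of truth, there is no counter that could drift out of sync with it.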
  2. 15 May, 2014 24 commits
  3. 14 May, 2014 5 commits