1. 21 Apr, 2016 29 commits
  2. 20 Apr, 2016 11 commits
    • rtnetlink: add new RTM_GETSTATS message to dump link stats · 10c9ead9
      Roopa Prabhu authored
      This patch adds a new RTM_GETSTATS message to query link stats from the
      kernel via netlink. RTM_NEWLINK also dumps stats today, but it returns a
      lot more than just stats, which makes it expensive when userspace polls
      for stats frequently.
      
      RTM_GETSTATS is an attempt to provide a lightweight netlink message
      to explicitly query only link stats from the kernel on an interface.
      The idea is to also keep it extensible so that new kinds of stats can be
      added to it in the future.
      
      This patch adds the following attribute for NETDEV stats:
      struct nla_policy ifla_stats_policy[IFLA_STATS_MAX + 1] = {
              [IFLA_STATS_LINK_64]  = { .len = sizeof(struct rtnl_link_stats64) },
      };
      
      Like any other rtnetlink message, RTM_GETSTATS can be used to get stats of
      a single interface or all interfaces with NLM_F_DUMP.
      
      Future possible new types of stat attributes:
      link af stats:
          - IFLA_STATS_LINK_IPV6  (nested. for ipv6 stats)
          - IFLA_STATS_LINK_MPLS  (nested. for mpls/mdev stats)
      extended stats:
          - IFLA_STATS_LINK_EXTENDED (nested. extended software netdev stats like bridge,
            vlan, vxlan etc)
          - IFLA_STATS_LINK_HW_EXTENDED (nested. extended hardware stats which are
            available via ethtool today)
      
      This patch also declares a filter mask for all stat attributes.
      The user must provide a mask of the stats attributes to query. The filter
      mask is specified in the new header 'struct if_stats_msg' for stats messages.
      The other important field in the header is the ifindex.
      
      This API can also include attributes for global stats (e.g. TCP) in the future.
      When global stats are included in a stats message, the ifindex in the header
      must be zero. A single stats message cannot contain both global and
      netdev-specific stats. To easily distinguish them, netdev-specific stat
      attribute names are prefixed with IFLA_STATS_LINK_.
      
      Without any attributes in the filter_mask, no stats will be returned.
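
      A minimal sketch of how userspace might fill in such a request, assuming
      uapi headers that already carry this patch (RTM_GETSTATS, struct
      if_stats_msg, IFLA_STATS_FILTER_BIT). The struct getstats_req wrapper, the
      build_getstats_req() helper and the "ifindex 0 means dump" convention are
      illustrative assumptions, not part of the patch; sending the request over
      a NETLINK_ROUTE socket is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_link.h>

/* Request layout: netlink header followed by the new stats header. */
struct getstats_req {
	struct nlmsghdr nlh;
	struct if_stats_msg ifsm;
};

/*
 * Fill in an RTM_GETSTATS request that asks only for IFLA_STATS_LINK_64.
 * Convention of this sketch (an assumption): ifindex == 0 requests a dump
 * of all interfaces, any other value queries that single interface.
 */
static void build_getstats_req(struct getstats_req *req, uint32_t ifindex,
			       uint32_t seq)
{
	assert(req != NULL);
	memset(req, 0, sizeof(*req));

	req->nlh.nlmsg_len = NLMSG_LENGTH(sizeof(req->ifsm));
	req->nlh.nlmsg_type = RTM_GETSTATS;
	req->nlh.nlmsg_flags = NLM_F_REQUEST;
	if (ifindex == 0)
		req->nlh.nlmsg_flags |= NLM_F_DUMP;
	req->nlh.nlmsg_seq = seq;

	req->ifsm.family = AF_UNSPEC;
	req->ifsm.ifindex = ifindex;
	/* An empty filter_mask would return no stats at all. */
	req->ifsm.filter_mask = IFLA_STATS_FILTER_BIT(IFLA_STATS_LINK_64);
}
```

      The filter mask is the only part that selects which stats come back, so
      a caller interested in future attribute types would OR further
      IFLA_STATS_FILTER_BIT() values into it.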
      
      This patch has been tested with a modified iproute2 ifstat.
      Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: nla_align_64bit() needs to test the right pointer. · e6f268ef
      David S. Miller authored
      Netlink messages are appended, one object at a time, to the end of
      the SKB.  Therefore we need to test skb_tail_pointer() not skb->data
      for alignment.
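
      The difference can be sketched with a toy buffer. This assumes the usual
      netlink layout, where a 4-byte attribute header precedes the payload, so
      a pad is needed exactly when the append position is already 8-byte
      aligned. struct ex_buf and the ex_* helpers are hypothetical stand-ins,
      not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A netlink attribute header is 4 bytes (u16 length + u16 type). */
#define EX_NLA_HDRLEN 4u

/* Toy stand-in for an skb: message starts at `data`, `len` bytes written. */
struct ex_buf {
	uintptr_t data;
	size_t len;
};

/* Address where the next attribute will be appended (the "tail"). */
static uintptr_t ex_tail(const struct ex_buf *b)
{
	return b->data + b->len;
}

/*
 * Bytes of padding to emit before the next attribute so that its payload,
 * which starts EX_NLA_HDRLEN bytes after the attribute header, ends up
 * 8-byte aligned. The decision depends on the tail, not on `data`.
 */
static size_t ex_pad_for_64bit(const struct ex_buf *b)
{
	uintptr_t tail = ex_tail(b);

	assert(tail % 4 == 0); /* attributes are kept 4-byte aligned */
	return (tail % 8 == 0) ? EX_NLA_HDRLEN : 0;
}
```

      With data at 0x1000 and 20 bytes already written, the start of the buffer
      is 8-byte aligned but the tail is not, so testing the start of the buffer
      would request a pad that misaligns the payload.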
      
      Fixes: 35c58459 ("net: Add helpers for 64-bit aligning netlink attributes.")
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fix HAVE_EFFICIENT_UNALIGNED_ACCESS typos · cca1d815
      Eric Dumazet authored
      HAVE_EFFICIENT_UNALIGNED_ACCESS needs the CONFIG_ prefix.
      
      Also add a comment in nla_align_64bit() explaining why padding has to
      be added when the current skb->data is aligned, as this can certainly
      be confusing.
      
      Fixes: 35c58459 ("net: Add helpers for 64-bit aligning netlink attributes.")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/hsr: Fixed version field in ENUM · b84e9307
      Peter Heise authored
      The new field (IFLA_HSR_VERSION) was added in the middle of an existing
      enum, which would break the kernel ABI; it is therefore moved to the end.
      Reported by Stephen Hemminger.
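
      To see why the position inside the enum matters, consider the following
      sketch. The EX_* enums are hypothetical and deliberately do not reproduce
      the real IFLA_HSR_* list; they only show how a mid-enum insertion
      renumbers every later entry, while appending keeps existing values stable.

```c
#include <assert.h>

/* Hypothetical attribute enum as shipped in an already released header. */
enum ex_attr_v1 {
	EX_ATTR_UNSPEC,
	EX_ATTR_PORT,
	EX_ATTR_SEQ_NR,
};

/* ABI-breaking variant: the new entry is inserted in the middle. */
enum ex_attr_bad {
	EX_BAD_UNSPEC,
	EX_BAD_PORT,
	EX_BAD_VERSION,	/* new */
	EX_BAD_SEQ_NR,	/* silently renumbered from 2 to 3 */
};

/* ABI-safe variant: the new entry is appended at the end. */
enum ex_attr_good {
	EX_GOOD_UNSPEC,
	EX_GOOD_PORT,
	EX_GOOD_SEQ_NR,
	EX_GOOD_VERSION,	/* new */
};

/* Existing binaries keep working only if old values are unchanged. */
static_assert((int)EX_GOOD_SEQ_NR == (int)EX_ATTR_SEQ_NR,
	      "appending must not renumber existing attributes");

/* Returns 1 if a binary built against v1 still uses the right SEQ_NR value. */
static int ex_seq_nr_compatible(int new_seq_nr_value)
{
	return new_seq_nr_value == EX_ATTR_SEQ_NR;
}
```

      An old binary sending attribute type 2 would be understood as SEQ_NR by
      the appended layout, but as VERSION by the mid-enum layout.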
      Signed-off-by: Peter Heise <peter.heise@airbus.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: kill circular reference with slave priv · 46e7b8d8
      Vivien Didelot authored
      The dsa_slave_priv structure does not need a pointer to its net_device.
      Kill it.
      Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bpf_event_output' · 9f4ab6ec
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      BPF updates
      
      This minor set adds a new helper bpf_event_output() for eBPF cls/act
      program types that allows passing events to user space applications.
      For details, please see the individual patches.
      
      v1 -> v2:
        - Address kbuild bot found compile issue in patch 2
        - Rest as is
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: add event output helper for notifications/sampling/logging · bd570ff9
      Daniel Borkmann authored
      This patch adds a new helper for cls/act programs that can push events
      to user space applications. In networking, this can be used e.g. for
      sampling, debugging, or logging, or for pushing arbitrary wake-up events.
      The idea is similar to a43eec30 ("bpf: introduce bpf_perf_event_output()
      helper") and 39111695 ("samples: bpf: add bpf_perf_event_output example").
      
      The eBPF program uses a perf event array map that user space populates
      with fds from perf_event_open(). The eBPF program calls into the helper,
      e.g. as skb_event_output(skb, &my_map, BPF_F_CURRENT_CPU, raw, sizeof(raw)),
      so that the raw data is pushed into the fd, e.g. at the map index of the
      current CPU.
      
      User space can poll/mmap/etc on this and has a data channel for receiving
      events that can be post-processed. The nice thing is that since the eBPF
      program and user space application making use of it are tightly coupled,
      they can define their own arbitrary raw data format and what/when they
      want to push.
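
      On the user space side, each fd stored in the map would come from a perf
      event configured for raw BPF output. The sketch below only prepares the
      attributes, following the pattern used by the kernel's
      bpf_perf_event_output sample as far as I recall it; ex_init_bpf_output_attr()
      is a hypothetical helper name, and the sampling and wakeup values are
      assumptions rather than requirements stated in this patch.

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/*
 * Prepare the attributes for one event whose fd would be placed into a
 * slot of the perf event array map (typically one slot per CPU).
 */
static void ex_init_bpf_output_attr(struct perf_event_attr *attr)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr));

	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_BPF_OUTPUT; /* events emitted by eBPF */
	attr->sample_type = PERF_SAMPLE_RAW;     /* deliver the raw payload */
	attr->sample_period = 1;                 /* assumption: every event */
	attr->wakeup_events = 1;                 /* assumption: wake per event */
}
```

      The prepared structure would then be passed to the perf_event_open()
      system call once per CPU, and the returned fds stored in the map and
      mmap()ed or polled by the application.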
      
      While e.g. packet headers could be one part of the metadata being
      pushed, this is not a substitute for things like packet sockets, as the
      whole packet is not pushed and the push is done in a single direction only.
      The intention is more of a generically usable, efficient event pipe to
      applications. The workflow is that tc can pin the map and applications can
      attach themselves, e.g. after cls/act setup, to one or multiple map slots;
      demuxing is done by the eBPF program.
      
      Adding this facility takes minimal effort: it reuses the helper
      introduced in a43eec30 ("bpf: introduce bpf_perf_event_output() helper")
      and we get its functionality for free by overloading its BPF_FUNC_ identifier
      for cls/act programs. ctx is currently unused, but will be made use of in
      the future. An example will be added to iproute2's BPF example files.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf, trace: add BPF_F_CURRENT_CPU flag for bpf_perf_event_output · 1e33759c
      Daniel Borkmann authored
      Add a BPF_F_CURRENT_CPU flag to optimize the use-case where user space has
      per-CPU ring buffers and the eBPF program pushes the data into the current
      CPU's ring buffer which saves us an extra helper function call in eBPF.
      Also, make sure to properly reserve the remaining flags which are not used.
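
      A rough sketch of how such a flags argument could be interpreted is
      shown below. It assumes that the low 32 bits carry the map index, that
      an all-ones index stands for "current CPU", and that all upper bits are
      reserved. The EX_* names and ex_resolve_index() are local stand-ins for
      illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed layout: low 32 bits select the index, all-ones = current CPU. */
#define EX_F_INDEX_MASK  0xffffffffULL
#define EX_F_CURRENT_CPU EX_F_INDEX_MASK

static_assert((EX_F_CURRENT_CPU & ~EX_F_INDEX_MASK) == 0,
	      "current-CPU marker must fit into the index bits");

/*
 * Resolve a flags value to a map index.
 * Returns the index, or -EINVAL if any reserved upper bit is set.
 */
static int64_t ex_resolve_index(uint64_t flags, uint32_t current_cpu)
{
	uint64_t index = flags & EX_F_INDEX_MASK;

	if (flags & ~EX_F_INDEX_MASK)
		return -EINVAL;		/* reserved bits must stay zero */
	if (index == EX_F_CURRENT_CPU)
		return current_cpu;	/* no separate CPU-id lookup needed */
	return (int64_t)index;
}
```

      Rejecting unknown bits up front is what keeps the remaining flag space
      usable for later extensions.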
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • arcnet: com90xx: add __init attribute · 553bc087
      Julia Lawall authored
      Add __init attribute on a function that is only called from other __init
      functions and that is not inlined, at least with gcc version 4.8.4 on an
      x86 machine with allyesconfig.  Currently, the function is put in the
      .text.unlikely segment.  Declaring it as __init will cause it to be put in
      the .init.text and to disappear after initialization.
      
      The result of objdump -x on the function before the change is as follows:
      
      0000000000000000 l     F .text.unlikely 00000000000000bf check_mirror
      
      And after the change it is as follows:
      
      0000000000000000 l     F .init.text	00000000000000ba check_mirror
      
      Done with the help of Coccinelle.  The semantic patch checks for local
      static non-init functions that are called from an __init function and are
      not called from any other function.
      Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
      Acked-by: Michael Grzeschik <mgr@pengutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ipv6/addrconf: fix sysctl table indentation · 5df1f77f
      Konstantin Khlebnikov authored
      Separated from previous patch for readability.
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ipv6/addrconf: simplify sysctl registration · 607ea7cd
      Konstantin Khlebnikov authored
      struct ctl_table_header holds a pointer to the sysctl table, which can be
      used to free the table after unregistration. IPv4 sysctls already do this.
      Remove the redundant NULL assignment: ndev is allocated using kzalloc.
      
      This also saves some bytes: the sysctl table can be shorter than
      DEVCONF_MAX+1 if some options are disabled in the config.
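
      The ownership pattern can be sketched in plain C as follows. All ex_*
      types and functions are hypothetical userspace stand-ins for the sysctl
      structures; the point is only that the header returned at registration
      remembers the table, so the owner needs no second pointer to free it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one sysctl table entry. */
struct ex_ctl_table {
	const char *procname;
};

/* Hypothetical stand-in for the registration header. */
struct ex_table_header {
	struct ex_ctl_table *table_arg;	/* table passed in at registration */
};

/* Register a heap-allocated table; the header keeps a pointer to it. */
static struct ex_table_header *ex_register(struct ex_ctl_table *table)
{
	struct ex_table_header *hdr = malloc(sizeof(*hdr));

	if (hdr)
		hdr->table_arg = table;
	return hdr;
}

/* Unregister, then free the table recovered from the header itself. */
static void ex_unregister_and_free(struct ex_table_header **hdrp)
{
	struct ex_ctl_table *table;

	assert(hdrp != NULL);
	if (!*hdrp)
		return;			/* nothing registered */

	table = (*hdrp)->table_arg;	/* fetch before the header goes away */
	free(*hdrp);
	*hdrp = NULL;
	free(table);
}
```

      Because the table is only reachable through the header, it can also be
      sized to the entries actually present instead of a fixed maximum.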
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>