1. 11 Jul, 2016 8 commits
    • netfilter: nft_ct: make byte/packet expr more friendly · 3f8b61b7
      Liping Zhang authored
      If we want to use the ct packets expression and add a rule like this:
        # nft add rule filter input ct packets gt 1 counter
      
      we will find that no packets hit it, because nf_conntrack_acct is
      disabled by default. The rule will not work until we enable accounting
      manually via "echo 1 > /proc/sys/net/netfilter/nf_conntrack_acct".
      
      This is not user-friendly, so, as xt_connbytes does, enable
      nf_conntrack_acct automatically when the user wants to use the ct
      byte/packet expression.
      Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: physdev: physdev-is-out should not work with OUTPUT chain · 47c74456
      Hangbin Liu authored
      physdev_mt() first checks skb->nf_bridge, which is allocated in
      br_nf_pre_routing. So if we want to use --physdev-out and physdev-is-out,
      we need to match in the FORWARD or POSTROUTING chain. physdev_mt_check()
      only checked physdev-out and missed physdev-is-out. Fix it and update the
      debug message to make it clearer.
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Reviewed-by: Marcelo R Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nat: convert nat bysrc hash to rhashtable · 870190a9
      Florian Westphal authored
      The bysrc hash used a fixed-size bucket list plus a single lock to protect add/del.
      
      Unlike the main conntrack table we only need to add and remove keys.
      Convert it to rhashtable to get table autosizing and per-bucket locking.
      
      The maximum number of entries is -- as before -- tied to the number of
      conntracks so we do not need another upper limit.
      
      The change does not handle rhashtable_remove_fast errors: the only possible
      "error" is -ENOENT, and that is something that can happen legitimately,
      e.g. because the nat module was inserted at a later time and no src manip
      took place yet.
      
      Tested with http-client-benchmark + httpterm with DNAT and SNAT rules
      in place.
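
A minimal sketch of the two properties this message relies on: the table grows with the number of entries, and removing a key that was never inserted yields -ENOENT, which the caller may ignore. `ToyResizableTable` is a hypothetical stand-in written for illustration, not the kernel's rhashtable API, and it models neither per-bucket locking nor RCU.

```python
import errno


class ToyResizableTable:
    """Chained hash table that doubles its bucket count when load factor exceeds 1."""

    def __init__(self, nbuckets=4):
        self.buckets = [[] for _ in range(nbuckets)]
        self.count = 0

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key):
        self._bucket(key).append(key)
        self.count += 1
        if self.count > len(self.buckets):
            self._resize(2 * len(self.buckets))

    def remove(self, key):
        """Return 0 on success or -ENOENT if the key was never inserted."""
        bucket = self._bucket(key)
        if key not in bucket:
            return -errno.ENOENT
        bucket.remove(key)
        self.count -= 1
        return 0

    def _resize(self, nbuckets):
        entries = [k for b in self.buckets for k in b]
        self.buckets = [[] for _ in range(nbuckets)]
        for k in entries:
            self._bucket(k).append(k)


table = ToyResizableTable()
for port in range(1000, 1020):
    table.insert(("10.0.0.1", port))

# A conntrack that never had a source manipulation is simply not in the table;
# the -ENOENT result is expected and can be dropped by the caller.
missing = table.remove(("10.0.0.9", 1))
```

Ignoring the return value in the destroy path mirrors the reasoning above: absence from the table is a legitimate state, not a failure.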
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: move nat hlist_head to nf_conn · 7c966435
      Florian Westphal authored
      The nat extension structure is 32 bytes in size on x86_64:
      
      struct nf_conn_nat {
              struct hlist_node          bysource;             /*     0    16 */
              struct nf_conn *           ct;                   /*    16     8 */
              union nf_conntrack_nat_help help;                /*    24     4 */
              int                        masq_index;           /*    28     4 */
              /* size: 32, cachelines: 1, members: 4 */
              /* last cacheline: 32 bytes */
      };
      
      The hlist is needed to quickly check for possible tuple collisions
      when installing a new nat binding. Storing this in the extension
      area has two drawbacks:
      
      1. We need ct backpointer to get the conntrack struct from the extension.
      2. When reallocation of extension area occurs we need to fixup the bysource
         hash head via hlist_replace_rcu.
      
      We can avoid both by placing the hlist_head in nf_conn and placing nf_conn in
      the bysource hash rather than the extension.
      
      We can also remove the ->move support; no other extension needs it.
      
      Moving the entire nat extension into nf_conn would be possible as well but
      then we have to add yet another callback for deletion from the bysource
      hash table rather than just using nat extension ->destroy hook for this.
      
      nf_conn size doesn't increase due to alignment; a follow-up patch replaces
      hlist_node with a single pointer.
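
The layout arithmetic in the pahole dump above can be reproduced with `ctypes`. This is only a sketch: `NatHelp` is a 4-byte placeholder for `union nf_conntrack_nat_help`, and `NfConnNatSlim` assumes that both `bysource` and `ct` leave the extension, which the message implies but does not show.

```python
import ctypes


class HlistNode(ctypes.Structure):
    # struct hlist_node holds two pointers: next and pprev
    _fields_ = [("next", ctypes.c_void_p), ("pprev", ctypes.c_void_p)]


class NatHelp(ctypes.Union):
    # placeholder for union nf_conntrack_nat_help (4 bytes in the dump)
    _fields_ = [("raw", ctypes.c_uint32)]


class NfConnNat(ctypes.Structure):
    # layout before the patch, as listed in the commit message
    _fields_ = [
        ("bysource", HlistNode),
        ("ct", ctypes.c_void_p),
        ("help", NatHelp),
        ("masq_index", ctypes.c_int),
    ]


class NfConnNatSlim(ctypes.Structure):
    # assumed layout once the hash node and the back-pointer are gone
    _fields_ = [("help", NatHelp), ("masq_index", ctypes.c_int)]


PTR = ctypes.sizeof(ctypes.c_void_p)
before = ctypes.sizeof(NfConnNat)
after = ctypes.sizeof(NfConnNatSlim)
print(f"pointer size {PTR}: extension shrinks from {before} to {after} bytes")
```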
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: conntrack: simplify early_drop · 242922a0
      Florian Westphal authored
      We don't need to acquire the bucket lock during early drop; we can
      use lockless traversal just like ____nf_conntrack_find.
      
      The timer deletion serves as the synchronization point: if another CPU
      attempts to evict the same entry, only one will succeed with timer deletion.
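
The "only one deleter wins" property can be modelled in a few lines. This is a sketch, not kernel code: `ToyTimer.delete()` is a hypothetical stand-in for a timer deletion that reports whether the timer was still pending, and Python threads stand in for CPUs.

```python
import threading


class ToyTimer:
    def __init__(self):
        self._pending = True
        self._lock = threading.Lock()  # stands in for the timer's internal locking

    def delete(self):
        """Return True only for the caller that deactivates a pending timer."""
        with self._lock:
            was_pending = self._pending
            self._pending = False
            return was_pending


def race_to_evict(timer, ncpus=8):
    """Let several 'CPUs' try to evict the same entry; return the winners."""
    winners = []
    barrier = threading.Barrier(ncpus)

    def evict(cpu):
        barrier.wait()           # start all attempts at roughly the same time
        if timer.delete():       # only the winner goes on to free the entry
            winners.append(cpu)

    threads = [threading.Thread(target=evict, args=(cpu,)) for cpu in range(ncpus)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners


winners = race_to_evict(ToyTimer())
```

Whichever thread wins, `winners` ends up with exactly one element, so the entry is torn down once even though no bucket lock was held during the walk.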
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_ct_helper: unlink helper again when hash resize happen · 8786a971
      Liping Zhang authored
      From: Liping Zhang <liping.zhang@spreadtrum.com>
      
      Similar to ctnl_untimeout, when a hash resize has happened, we should
      retry the unhelp, starting from bucket 0 again.
      Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: cttimeout: unlink timeout obj again when hash resize happen · 474803d3
      Liping Zhang authored
      Imagine the following situation: nf_conntrack_htable_size is 4096, we are
      running ctnl_untimeout, and we are iterating at bucket 3000.
      
      Meanwhile, another user reduces the hash size to 2048, so all nf_conn
      entries are moved to the new hash table. When this resize finishes, we
      still try to continue iterating from bucket 3000, find nothing to do and
      just return.
      
      We may therefore miss unlinking some timeout objects, and later end up
      with invalid references to timeout objects that are already gone.
      
      So when we find that a hash resize has happened, try to unlink timeout
      objects again, starting from bucket 0.
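
The scenario can be replayed with a small single-threaded simulation that uses the same numbers (4096 buckets, walker at bucket 3000, shrink to 2048). Everything here is hypothetical scaffolding: entries are plain integers that hash to themselves, and the resize is injected at a fixed point instead of running concurrently.

```python
class ToyConntrackTable:
    def __init__(self, size, entries):
        self.size = size
        self.buckets = [[] for _ in range(size)]
        for entry in entries:
            self.buckets[entry % size].append(entry)

    def resize(self, new_size):
        """Move every entry into a freshly sized bucket array."""
        entries = [e for bucket in self.buckets for e in bucket]
        self.size = new_size
        self.buckets = [[] for _ in range(new_size)]
        for entry in entries:
            self.buckets[entry % new_size].append(entry)


def unlink_timeouts(table, resize_at, new_size, restart_on_resize):
    """Walk all buckets and 'unlink' each entry; a resize hits at bucket resize_at."""
    unlinked = set()
    bucket = 0
    resized = False
    while bucket < table.size:
        if bucket == resize_at and not resized:
            table.resize(new_size)   # the other user shrinks the table
            resized = True
            if restart_on_resize:
                bucket = 0           # the fix: start over from bucket 0
            continue                 # re-check the loop bound against the new size
        unlinked.update(table.buckets[bucket])
        bucket += 1
    return unlinked


# Entry 10 sits before the walker, entry 3500 sits after it.
buggy = unlink_timeouts(ToyConntrackTable(4096, [10, 3500]), 3000, 2048, False)
fixed = unlink_timeouts(ToyConntrackTable(4096, [10, 3500]), 3000, 2048, True)
```

Without the restart, entry 3500 lands in bucket 1452 of the new table, behind the walker, and is never visited. With the restart it is picked up on the second pass; visiting entry 10 twice is harmless because unlinking is idempotent.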
      Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: conntrack: fix race between nf_conntrack proc read and hash resize · 64b87639
      Liping Zhang authored
      When we do "cat /proc/net/nf_conntrack" and meanwhile resize the conntrack
      hash table via /sys/module/nf_conntrack/parameters/hashsize, a race
      occurs, because the reader can observe a newly allocated hash but the old
      size (or vice versa). This leads to an oops like the following:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000017
        IP: [<ffffffffa0418e21>] seq_print_acct+0x11/0x50 [nf_conntrack]
        Call Trace:
        [<ffffffffa0412f4e>] ? ct_seq_show+0x14e/0x340 [nf_conntrack]
        [<ffffffff81261a1c>] seq_read+0x2cc/0x390
        [<ffffffff812a8d62>] proc_reg_read+0x42/0x70
        [<ffffffff8123bee7>] __vfs_read+0x37/0x130
        [<ffffffff81347980>] ? security_file_permission+0xa0/0xc0
        [<ffffffff8123cf75>] vfs_read+0x95/0x140
        [<ffffffff8123e475>] SyS_read+0x55/0xc0
        [<ffffffff817c2572>] entry_SYSCALL_64_fastpath+0x1a/0xa4
      
      It is very easy to reproduce this kernel crash:
      1. Open one shell and run the following commands:
        while : ; do
          echo $RANDOM > /sys/module/nf_conntrack/parameters/hashsize
        done
      2. Open more shells and run the following commands:
        while : ; do
          cat /proc/net/nf_conntrack
        done
      3. Wait a moment; the oops will happen soon.
      
      The solution in this patch is based on Florian's commit 5e3c61f9
      ("netfilter: conntrack: fix lookup race during hash resize"). It also
      adds a wrapper function, nf_conntrack_get_ht, to get the hash and hsize,
      as suggested by Florian Westphal.
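
The message does not show how the wrapper obtains a matching pair. The sketch below assumes a sequence-counter style read-and-retry loop, which is one common way to read two related values consistently. `ToyHashState`, `get_ht` and the `interleave` hook are hypothetical names, and the resize is injected between the two reads rather than running on another CPU.

```python
class ToyHashState:
    """Hash table and size published together under a generation counter."""

    def __init__(self, size):
        self.generation = 0                      # even: stable, odd: resize running
        self.table = [[] for _ in range(size)]
        self.size = size

    def resize(self, new_size):
        self.generation += 1                     # writer enters
        self.table = [[] for _ in range(new_size)]
        self.size = new_size
        self.generation += 1                     # writer leaves


def get_ht(state, interleave=None):
    """Return (table, size, attempts) where table and size belong together."""
    attempts = 0
    while True:
        attempts += 1
        seq = state.generation
        size = state.size
        if interleave is not None:
            hook, interleave = interleave, None
            hook()                               # a resize lands between the two reads
        table = state.table
        if seq % 2 == 0 and seq == state.generation:
            return table, size, attempts         # nothing changed while we were reading


state = ToyHashState(4096)
table, size, attempts = get_ht(state, interleave=lambda: state.resize(16))
```

The first attempt pairs the old size (4096) with the new 16-bucket table, which is exactly the mismatch that would walk off the end of the array; the generation check rejects it and the second attempt returns a consistent pair.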
      Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  2. 07 Jul, 2016 2 commits
  3. 06 Jul, 2016 14 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 30d0844b
      David S. Miller authored
      Conflicts:
      	drivers/net/ethernet/mellanox/mlx5/core/en.h
      	drivers/net/ethernet/mellanox/mlx5/core/en_main.c
      	drivers/net/usb/r8152.c
      
      All three conflicts were overlapping changes.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · bc867651
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) All users of AF_PACKET's fanout feature want a symmetric packet
          header hash for load balancing purposes, so give it to them.
      
       2) Fix vlan state synchronization in e1000e, from Jarod Wilson.
      
       3) Use correct socket pointer in ip_skb_dst_mtu(), from Shmulik
          Ladkani.
      
       4) mlx5 bug fixes from Mohamad Haj Yahia, Daniel Jurgens, Matthew
          Finlay, Rana Shahout, and Shaker Daibes.  Mostly to do with
          operation timeouts and PCI error handling.
      
       5) Fix checksum handling in mirred packet action, from WANG Cong.
      
       6) Set skb->dev correctly when transmitting in !protect_frames case of
          macsec driver, from Daniel Borkmann.
      
       7) Fix MTU calculation in geneve driver, from Haishuang Yan.
      
       8) Missing netif_napi_del() in unregister path of qeth driver, from
          Ursula Braun.
      
        9) Handle malformed route netlink messages in decnet properly, from
           Vegard Nossum.
      
      10) Memory leak of percpu data in ipv6 routing code, from Martin KaFai
          Lau.
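
To illustrate item 1: a symmetric hash gives the same value for both directions of a flow, so a fanout group delivers request and reply packets to the same socket. This sketch sorts the two endpoints before hashing; it is an illustration of the property only, not the kernel's flow hash, and `symmetric_flow_hash` and `fanout_pick` are hypothetical helpers.

```python
import hashlib


def symmetric_flow_hash(src, sport, dst, dport):
    """Hash a flow so that swapping source and destination gives the same value."""
    first, second = sorted([(src, sport), (dst, dport)])
    data = repr((first, second)).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def fanout_pick(flow_hash, nsockets):
    """Choose a socket index in the fanout group from the flow hash."""
    return flow_hash % nsockets


forward = symmetric_flow_hash("10.0.0.1", 40000, "10.0.0.2", 80)
reverse = symmetric_flow_hash("10.0.0.2", 80, "10.0.0.1", 40000)
same_socket = fanout_pick(forward, 4) == fanout_pick(reverse, 4)
```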
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
        ipv6: Fix mem leak in rt6i_pcpu
        net: fix decnet rtnexthop parsing
        cxgb4: update latest firmware version supported
        net/mlx5: Avoid setting unused var when modifying vport node GUID
        bonding: fix enslavement slave link notifications
        r8152: fix runtime function for RTL8152
        qeth: delete napi struct when removing a qeth device
        Revert "fsl/fman: fix error handling"
        fsl/fman: fix error handling
        cdc_ncm: workaround for EM7455 "silent" data interface
        RDS: fix rds_tcp_init() error path
        geneve: fix max_mtu setting
        net: phy: dp83867: Fix initialization of PHYCR register
        enc28j60: Fix race condition in enc28j60 driver
        net: stmmac: Fix null-function call in ISR on stmmac1000
        tipc: fix nl compat regression for link statistics
        net: bcmsysport: Device stats are unsigned long
        macsec: set actual real device for xmit when !protect_frames
        net_sched: fix mirrored packets checksum
        packet: Use symmetric hash for PACKET_FANOUT_HASH.
        ...
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · ae3e4562
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter updates for net-next
      
      The following patchset contains Netfilter updates for net-next,
      they are:
      
      1) Don't use userspace datatypes in bridge netfilter code, from
         Tobin Harding.
      
      2) Iterate only once over the expectation table when removing the
         helper module, instead of once per-netns, from Florian Westphal.
      
      3) Extra sanitization in xt_hook_ops_alloc() to return an error in case
         we ever pass zero hooks.
      
      4) Handle NFPROTO_INET from the logging core infrastructure, from
         Liping Zhang.
      
      5) Autoload loggers when TRACE target is used from rules, this doesn't
         change the behaviour in case the user already selected nfnetlink_log
         as preferred way to print tracing logs, also from Liping Zhang.
      
      6) Use SLAB_HWCACHE_ALIGN for conntrack slabs to allow rearranging fields
         by cache line; this increases the size of each entry by 11%.
         From Florian Westphal.
      
      7) Skip zone comparison if CONFIG_NF_CONNTRACK_ZONES=n, from Florian.
      
      8) Remove useless defensive check in nf_logger_find_get() from Shivani
         Bhardwaj.
      
      9) Remove the zone extension and place the zone in the conntrack object;
         it is always included in the hashing, and we expect more intensive use
         of zones now that containers are in place. Also from Florian Westphal.
      
      10) Owner match now works from any namespace, from Eric Biederman.
      
      11) Make sure we only reply with TCP reset to TCP traffic from
          nf_reject_ipv4, patch from Liping Zhang.
      
      12) Introduce --nflog-size to indicate the number of network packet bytes
          that are copied to userspace via log message, from Vishwanath Pai.
          This obsoletes --nflog-range, which was designed to achieve this
          but has never worked.
      
      13) Introduce generic macros for nf_tables object generation masks.
      
      14) Use generation mask in table, chain and set objects in nf_tables.
          This fixes interference between the ongoing preparation phase of
          the commit protocol and object listings going on at the same time.
          This update is introduced in three patches, one per object.
      
      15) Check if the object is active in the next generation for element
          deactivation in the rbtree implementation, given that deactivation
          happens from the commit phase path we have to observe the future
          status of the object.
      
      16) Support for deletion of just added elements in the hash set type.
      
      17) Allow to resize hashtable from /proc entry, not only from the
          obscure /sys entry that maps to the module parameter, from Florian
          Westphal.
      
      18) Get rid of NFT_BASECHAIN_DISABLED, this code is not exercised
          anymore since we tear down the ruleset whenever the netdevice
          goes away.
      
      19) Support for matching inverted set lookups, from Arturo Borrero.
      
      20) Simplify the iptables_mangle_hook() by removing a superfluous
          extra branch.
      
      21) Introduce ether_addr_equal_masked() and use it from the netfilter
          codebase, from Joe Perches.
      
      22) Remove references to "Use netfilter MARK value as routing key"
          from the Netfilter Kconfig description given that this toggle
          has not existed for 10 years, from Moritz Sichert.
      
      23) Introduce generic NF_INVF() and use it from the xtables codebase,
          from Joe Perches.
      
      24) Setting logger to NONE via /proc was not working unless explicit
          nul-termination was included in the string. This fix seems to
          leave the former behaviour in place, so we don't break backward
          compatibility.
      ====================
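
Item 21 refers to a masked MAC address comparison. A minimal model of what such a helper computes, assuming the usual definition (two addresses are equal under a mask when they agree on every bit the mask selects), could look like this; the Python function only borrows the kernel helper's name for readability.

```python
def ether_addr_equal_masked(addr1: bytes, addr2: bytes, mask: bytes) -> bool:
    """Compare two 6-byte MAC addresses, looking only at bits set in mask."""
    if not (len(addr1) == len(addr2) == len(mask) == 6):
        raise ValueError("MAC addresses and mask must be 6 bytes long")
    return all(((a ^ b) & m) == 0 for a, b, m in zip(addr1, addr2, mask))


nic_a = bytes.fromhex("001122334455")
nic_b = bytes.fromhex("001122aabbcc")
oui_mask = bytes.fromhex("ffffff000000")    # match on the vendor prefix only
full_mask = bytes.fromhex("ffffffffffff")

same_vendor = ether_addr_equal_masked(nic_a, nic_b, oui_mask)
same_host = ether_addr_equal_masked(nic_a, nic_b, full_mask)
```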
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'sound-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 4cdbbbd1
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Here are a collection of small fixes: at this time, we've got a
        slightly high amount, but all small and trivial fixes, and nothing
        scary can be seen there"
      
      * tag 'sound-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (21 commits)
        ALSA: hda/realtek: Add Lenovo L460 to docking unit fixup
        ALSA: timer: Fix negative queue usage by racy accesses
        ASoC: rt5645: fix reg-2f default value.
        ASoC: fsl_ssi: Fix number of words per frame for I2S-slave mode
        ALSA: au88x0: Fix calculation in vortex_wtdma_bufshift()
        ALSA: hda - Add PCI ID for Kabylake-H
        ALSA: echoaudio: Fix memory allocation
        ASoC: Intel: atom: fix missing breaks that would cause the wrong operation to execute
        ALSA: hda - fix read before array start
        ASoC: cx20442: set tty->receiver_room in v253_open
        ASoC: ak4613: Enable cache usage to fix crashes on resume
        ASoC: wm8940: Enable cache usage to fix crashes on resume
        ASoC: Intel: Skylake: Initialize module list for Broxton
        ASoC: wm5102: Correct supported channels on trace compressed DAI
        ASoC: wm5110: Add missing route from OUT3R to SYSCLK
        ASoC: rt5670: fix HP Playback Volume control
        ASoC: hdmi-codec: select CONFIG_HDMI
        ASoC: davinci-mcasp: Fix dra7 DMA offset when using CFG port
        ASoC: hdac_hdmi: Fix potential NULL dereference
        ASoC: ak4613: Remove owner assignment from platform_driver
        ...
    • Merge tag 'chrome-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform · 4d0a279c
      Linus Torvalds authored
      Pull chrome platform fix from Olof Johansson:
       "A single fix this time, closing a window where ioctl args are fetched
        twice"
      
      * tag 'chrome-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform:
        platform/chrome: cros_ec_dev - double fetch bug in ioctl
    • cfg80211: Add mesh peer AID setting API · 7d27a0ba
      Masashi Honma authored
      Previously, mesh power management functionality worked only with the
      kernel MPM. Because the user space MPM did not report the mesh peer AID
      to the kernel, the kernel could not identify the bit in the TIM element.
      So this patch adds a mesh peer AID setting API.
      Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: parse wide bandwidth channel switch IE with workaround · 92b3a28a
      Johannes Berg authored
      Continuing the workaround implemented in commit 23665aaf
      ("mac80211: Interoperability workaround for 80+80 and 160 MHz channels")
      use the same code to parse the Wide Bandwidth Channel Switch element
      by converting to VHT Operation element since the spec also just refers
      to that for parsing semantics, particularly with the workaround.
      
      While at it, remove some dead code - the IEEE80211_STA_DISABLE_40MHZ
      flag can never be set at this point since it's checked earlier and the
      wide_bw_chansw_ie pointer is set to NULL if it's set.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: report failure to start (partial) scan as scan abort · 7d10f6b1
      Johannes Berg authored
      Rather than reporting the scan as having completed, report it as
      being aborted.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: Add support for beacon report radio measurement · 7947d3e0
      Avraham Stern authored
      Add the following to support beacon report radio measurement
      with the measurement mode field set to passive or active:
      1. Propagate the required scan duration to the device
      2. Report the scan start time (in terms of TSF)
      3. Report each BSS's detection time (also in terms of TSF)
      
      TSF times refer to the BSS that the interface that requested the
      scan is connected to.
      Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
      Signed-off-by: Avraham Stern <avraham.stern@intel.com>
      [changed ath9k/10k, at76c59x-usb, iwlegacy, wl1251 and wlcore to match
      the new API]
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • nl80211: support beacon report scanning · 1d76250b
      Avraham Stern authored
      Beacon report radio measurement requires reporting observed BSSs
      on the channels specified in the beacon request. If the measurement
      mode is set to passive or active, it requires actually performing a
      scan (passive or active, accordingly), and reporting the time that
      the scan was started and the time each beacon/probe was received
      (both in terms of TSF of the BSS of the requesting AP). If the
      request mode is table, this information is optional.
      In addition, the radio measurement request specifies the channel
      dwell time for the measurement.
      
      In order to use scan for beacon report when the mode is active or
      passive, add a parameter to scan request that specifies the
      channel dwell time, and add scan start time and beacon received time
      to scan results information.
      
      Supporting beacon report is required for Multi Band Operation (MBO).
      Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
      Signed-off-by: David Spinadel <david.spinadel@intel.com>
      Signed-off-by: Avraham Stern <avraham.stern@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211_hwsim: use signed net namespace ID · f1724b02
      Johannes Berg authored
      The API expects a pointer to a signed int so we should not use an
      unsigned int for it.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211_hwsim: Add radar bandwidths to the P2P Device combination · f7736f50
      Ilan Peer authored
      Add radar_detect_widths to the interface combination that allows
      concurrent P2P Device dedicated interface and AP interfaces, to enable
      testing of radar detection when P2P Device interface is used.
      
      Clear the radar_detect_widths in case of multi channel contexts
      as this is not currently supported.
      
      As radar_detect_widths are now supported in all combinations,
      remove the hwsim_if_dfs_limits definition since it is no longer
      needed.
      Signed-off-by: Ilan Peer <ilan.peer@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • nl80211: Add API to support VHT MU-MIMO air sniffer · c6e6a0c8
      Aviya Erenfeld authored
      Add an API to support a VHT MU-MIMO air sniffer.
      In MU-MIMO there are parallel frames on the air while the HW
      has only one RX.
      Add the capability to sniff one of the MU-MIMO parallel frames by
      giving the sniffer additional information so it knows which
      of the parallel frames it should follow.
      
      Add attribute - NL80211_ATTR_MU_MIMO_GROUP_DATA - for getting
      a MU-MIMO groupID in order to monitor packets from that group
      using VHT MU-MIMO.
      And add attribute - NL80211_ATTR_MU_MIMO_FOLLOW_ADDR - for passing
      a MAC address to monitor mode.
      That option will be used by the VHT MU-MIMO air sniffer to follow a
      station according to its MAC address using VHT MU-MIMO.
      Signed-off-by: Aviya Erenfeld <aviya.erenfeld@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: agg-rx: refuse ADDBA Request with timeout update · f89e07d4
      Johannes Berg authored
      The current implementation of handling ADDBA Request while a session
      is already active with the peer is wrong - in case the peer is using
      the existing session's dialog token this should be treated as update
      to the session, which can update the timeout value.
      
      We don't really have a good way of supporting that, so reject, but
      implement the required behaviour in the spec of "Even if the updated
      ADDBA Request frame is not accepted, the original Block ACK setup
      remains active." (802.11-2012 10.5.4)
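
The decision described above can be sketched as follows. This is a model under assumptions, not the mac80211 code: `RxSession` and `handle_addba_request` are hypothetical names, the status values are plain strings, and the branch for a request with a different dialog token (treated here as a fresh session replacing the old one) is an assumption the message does not spell out.

```python
from dataclasses import dataclass


@dataclass
class RxSession:
    dialog_token: int
    timeout: int


def handle_addba_request(sessions, tid, dialog_token, timeout):
    """Return 'accepted' or 'declined' for an ADDBA Request on the given TID."""
    existing = sessions.get(tid)
    if existing is not None and existing.dialog_token == dialog_token:
        # Same dialog token: the peer is trying to update the running session.
        # Updates are not supported, so decline, but leave the original
        # Block ACK setup untouched as the spec requires.
        return "declined"
    # Assumption: no session yet, or a different token, starts a new session.
    sessions[tid] = RxSession(dialog_token, timeout)
    return "accepted"


sessions = {}
first = handle_addba_request(sessions, tid=0, dialog_token=5, timeout=100)
update = handle_addba_request(sessions, tid=0, dialog_token=5, timeout=200)
```

After the declined update, `sessions[0]` still carries the original timeout of 100, which is the behaviour quoted from 802.11-2012 10.5.4.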
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  4. 05 Jul, 2016 16 commits