1. 09 Sep, 2016 4 commits
    • Felix Fietkau's avatar
      ath9k: improve powersave filter handling · 315c457f
      Felix Fietkau authored
      For non-aggregated frames, ath9k was leaving handling of powersave
      filtered packets to mac80211. This can be too slow if the intermediate
      queue is already filled with packets and mac80211 does not immediately
      send a new packet via drv_tx().
      
      Improve response time with filtered frames by triggering clearing the
      powersave filter internally.
      Signed-off-by: default avatarFelix Fietkau <nbd@nbd.name>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      315c457f
    • Felix Fietkau's avatar
      ath9k: use ieee80211_tx_status_noskb where possible · d94a461d
      Felix Fietkau authored
      It removes the need for undoing the padding changes to skb->data and it
      improves performance by eliminating one tx status lookup per MPDU in the
      status path. It is also useful for preparing a follow-up fix to better
      handle powersave filtering.
      
      A side effect is that these counters, available via debugfs, become now invalid:
      
      * dot11TransmittedFragmentCount
      * dot11FrameDuplicateCount,
      * dot11ReceivedFragmentCount
      * dot11MulticastReceivedFrameCount
      Signed-off-by: default avatarFelix Fietkau <nbd@nbd.name>
      [kvalo@qca.qualcomm.com: add a note about counters, thanks to Zefir Kurtisi]
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      d94a461d
    • Rajkumar Manoharan's avatar
      ath10k: fix throughput regression in multi client mode · 18f53fe0
      Rajkumar Manoharan authored
      commit 7a0adc83 ("ath10k: improve tx scheduling") is causing
      severe throughput drop in multi client mode. This issue is originally
      reported in veriwave setup with 50 clients with TCP downlink traffic.
      While increasing number of clients, the average throughput drops
      gradually. With 50 clients, the combined peak throughput is decreased
      to 98 Mbps whereas reverting given commit restored it to 550 Mbps.
      
      Processing txqs for every tx completion is causing overhead. Ideally for
      management frame tx completion, pending txqs processing can be avoided.
      The change partly reverts the commit "ath10k: improve tx scheduling".
      Processing pending txqs after all skbs tx completion will yeild enough
      room to burst tx frames.
      
      Fixes: 7a0adc83 ("ath10k: improve tx scheduling")
      Signed-off-by: default avatarRajkumar Manoharan <rmanohar@qti.qualcomm.com>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      18f53fe0
    • Rajkumar Manoharan's avatar
      ath10k: implement NAPI support · 3c97f5de
      Rajkumar Manoharan authored
      Add NAPI support for rx and tx completion. NAPI poll is scheduled
      from interrupt handler. The design is as below
      
       - on interrupt
           - schedule napi and mask interrupts
       - on poll
         - process all pipes (no actual Tx/Rx)
         - process Rx within budget
         - if quota exceeds budget reschedule napi poll by returning budget
         - process Tx completions and update budget if necessary
         - process Tx fetch indications (pull-push)
         - push any other pending Tx (if possible)
         - before resched or napi completion replenish htt rx ring buffer
         - if work done < budget, complete napi poll and unmask interrupts
      
      This change also get rid of two tasklets (intr_tq and txrx_compl_task).
      
      Measured peak throughput with NAPI on IPQ4019 platform in controlled
      environment. No noticeable reduction in throughput is seen and also
      observed improvements in CPU usage. Approx. 15% CPU usage got reduced
      in UDP uplink case.
      
      DL: AP DUT Tx
      UL: AP DUT Rx
      
      IPQ4019 (avg. cpu usage %)
      
      ========
                      TOT              +NAPI
                    ===========      =============
      TCP DL       644 Mbps (42%)    645 Mbps (36%)
      TCP UL       673 Mbps (30%)    675 Mbps (26%)
      UDP DL       682 Mbps (49%)    680 Mbps (49%)
      UDP UL       720 Mbps (28%)    717 Mbps (11%)
      Signed-off-by: default avatarRajkumar Manoharan <rmanohar@qti.qualcomm.com>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      3c97f5de
  2. 02 Sep, 2016 12 commits
    • Baoyou Xie's avatar
      ath9k: mark ath_fill_led_pin() static · c39265f7
      Baoyou Xie authored
      We get 1 warning about global functions without a declaration
      in the ath9k gpio driver when building with W=1:
      drivers/net/wireless/ath/ath9k/gpio.c:25:6: warning: no previous prototype for 'ath_fill_led_pin' [-Wmissing-prototypes]
      
      In fact, this function is only used in the file in which it is declared
      and don't need a declaration, but can be made static.
      so this patch marks it 'static'.
      Signed-off-by: default avatarBaoyou Xie <baoyou.xie@linaro.org>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      c39265f7
    • Colin Ian King's avatar
      ath10k: fix spelling mistake "montior" -> "monitor" · 7f03d306
      Colin Ian King authored
      Trivial fix to spelling mistake in ath10k_warn message.
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Reviewed-by: default avatarJulian Calaby <julian.calaby@gmail.com>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      7f03d306
    • Mohammed Shafi Shajakhan's avatar
      ath10k: Fix broken NULL func data frame status for 10.4 · 2cdce425
      Mohammed Shafi Shajakhan authored
      Older firmware with HTT delivers incorrect tx status for null func
      frames to driver, but this fixed in 10.2 and 10.4 firmware versions.
      Also this workaround results in reporting of incorrect null func status
      for 10.4. Fix this is by introducing a firmware feature flag for 10.4
      so that this workaround is skipped and proper tx status for null func
      frames are reported
      Signed-off-by: default avatarTamizh chelvam <c_traja@qti.qualcomm.com>
      Signed-off-by: default avatarMohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      2cdce425
    • Masahiro Yamada's avatar
      ath10k: replace config_enabled() with IS_REACHABLE() · 749bc03a
      Masahiro Yamada authored
      Commit 97f2645f ("tree-wide: replace config_enabled() with
      IS_ENABLED()") mostly did away with config_enabled().
      
      This is one of the postponed TODO items as config_enabled() is used
      for a tristate option here.  Theoretically, config_enabled() is
      equivalent to IS_BUILTIN(), but I guess IS_REACHABLE() is the best
      fit for this case because both CONFIG_HWMON and CONFIG_ATH10K are
      tristate.
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      749bc03a
    • Maharaja Kennadyrajan's avatar
      ath10k: Added support for extended dbglog module id for 10.4 · afcbc82c
      Maharaja Kennadyrajan authored
      For 10.4 fw versions, dbglog module id has been extended from u32
      to u64, hence this patch fixes the same in the ath10k driver side.
      
      This patch doesn't break the older 10.4 releases. The FW change
      is already present in the older FWs.
      Signed-off-by: default avatarMaharaja Kennadyrajan <c_mkenna@qti.qualcomm.com>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      afcbc82c
    • Daniel Wagner's avatar
      ath10k: use complete() instead complete_all() · 881ed54e
      Daniel Wagner authored
      There is only one waiter for the completion, therefore there
      is no need to use complete_all(). Let's make that clear by
      using complete() instead of complete_all().
      
      The usage pattern of the completion is:
      
      waiter context                          waker context
      
      scan.started
      ------------
      
      ath10k_start_scan()
        lockdep_assert_held(conf_mutex)
        auth10k_wmi_start_scan()
        wait_for_completion_timeout(scan.started)
      
      					ath10k_wmi_event_scan_start_failed()
      					  complete(scan.started)
      
      					ath10k_wmi_event_scan_started()
      					  complete(scan.started)
      
      scan.completed
      --------------
      
      ath10k_scan_stop()
        lockdep_assert_held(conf_mutex)
        ath10k_wmi_stop_scan()
        wait_for_completion_timeout(scan.completed)
      
      					__ath10k_scan_finish()
      					  complete(scan.completed)
      
      scan.on_channel
      ---------------
      
      ath10k_remain_on_channel()
        mutex_lock(conf_mutex)
        ath10k_start_scan()
        wait_for_completion_timeout(scan.on_channel)
      
      					ath10k_wmi_event_scan_foreign_chan()
      					  complete(scan.on_channel)
      
      offchan_tx_completed
      --------------------
      
      ath10k_offchan_tx_work()
        mutex_lock(conf_mutex)
        reinit_completion(offchan_tx_completed)
        wait_for_completion_timeout(offchan_tx_completed)
      
      					ath10k_report_offchain_tx()
      					  complete(offchan_tx_completed)
      
      install_key_done
      ----------------
      ath10k_install_key()
        lockep_assert_held(conf_mutex)
        reinit_completion(install_key_done)
        wait_for_completion_timeout(install_key_done)
      
      				        ath10k_htt_t2h_msg_handler()
      					  complete(install_key_done)
      
      vdev_setup_done
      ---------------
      
      ath10k_monitor_vdev_start()
        lockdep_assert_held(conf_mutex)
         reinit_completion(vdev_setup_done)
        ath10k_vdev_setup_sync()
          wait_for_completion_timeout(vdev_setup_done)
      
      					ath10k_wmi_event_vdev_start_resp()
      					  complete(vdev_setup_done)
      
      ath10k_monitor_vdev_stop()
        lockdep_assert_held(conf_mutex)
        reinit_completion(vdev_setup_done()
        ath10k_vdev_setup_sync()
          wait_for_completion_timeout(vdev_setup_done)
      
      					ath10k_wmi_event_vdev_stopped()
      					 complete(vdev_setup_done)
      
      thermal.wmi_sync
      ----------------
      ath10k_thermal_show_temp()
        mutex_lock(conf_mutex)
        reinit_completion(thermal.wmi_sync)
        wait_for_completion_timeout(thermal.wmi_sync)
      
      					ath10k_thermal_event_temperature()
      					  complete(thermal.wmi_sync)
      
      bss_survey_done
      ---------------
      ath10k_mac_update_bss_chan_survey
        lockdep_assert_held(conf_mutex)
        reinit_completion(bss_survey_done)
        wait_for_completion_timeout(bss_survey_done)
      
      					ath10k_wmi_event_pdev_bss_chan_info()
      					  complete(bss_survey_done)
      
      All complete() calls happen while the conf_mutex is taken. That means
      at max one waiter is possible.
      Signed-off-by: default avatarDaniel Wagner <daniel.wagner@bmw-carit.de>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      881ed54e
    • Ashok Raj Nagarajan's avatar
      ath10k: fix sending frame in management path in push txq logic · e4fd726f
      Ashok Raj Nagarajan authored
      In the wake tx queue path, we are not checking if the frame to be sent
      takes management path or not. For eg. QOS null func frame coming here will
      take the management path. Since we are not incrementing the descriptor
      counter (num_pending_mgmt_tx) w.r.t tx management, on tx completion it is
      possible to see negative values.
      
      When the above counter reaches a negative value, we will not be sending a
      probe response out.
      
          if (is_presp &&
      	ar->hw_params.max_probe_resp_desc_thres < htt->num_pending_mgmt_tx)
      
      For IPQ4019, max_probe_resp_desc_thres (u32) is 24 is compared against
      num_pending_mgmt_tx (int) and the above condtions comes true if the counter
      is negative and we drop the probe response.
      
      To avoid this, check on the wake tx queue path as well for the tx path of
      the frame and increment the appropriate counters
      
      Fixes: cac08552 "ath10k: move mgmt descriptor limit handle under mgmt_tx"
      Signed-off-by: default avatarAshok Raj Nagarajan <arnagara@qti.qualcomm.com>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      e4fd726f
    • Rajkumar Manoharan's avatar
      ath10k: improve wake_tx_queue ops performance · 83e164b7
      Rajkumar Manoharan authored
      txqs_lock is interfering with wake_tx_queue submitting more frames.
      so queues don't get filled in and don't keep firmware/hardware busy
      enough. This change helps to reduce the txqs_lock contention and
      wake_tx_queue() blockage to being possible in txrx_unref().
      
      To reduce turn around time of wake_tx_queue ops and to maintain fairness
      among all txqs, the callback is updated to push first txq alone from
      pending list for every wake_tx_queue call. Remaining txqs will be
      processed later upon tx completion.
      
      Below improvements are observed in push-only mode and validated on
      IPQ4019 platform. With this change, in AP mode ~10Mbps increase is
      observed in downlink (AP -> STA) traffic and approx. 5-10% of CPU
      usage is reduced.
      
      Major improvement is observed in 1-hop Mesh mode topology in 11ACVHT80.
      Compared to Infra mode, CPU overhead is higher in Mesh mode due to path
      lookup and no fast-xmit support. So reducing spin lock contention is
      helping in Mesh.
      
                   TOT       +change
                 --------    --------
      TCP DL     545 Mbps    595 Mbps
      TCP UL     555 Mbps    585 Mbps
      Signed-off-by: default avatarRajkumar Manoharan <rmanohar@qti.qualcomm.com>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      83e164b7
    • Mohammed Shafi Shajakhan's avatar
      ath10k: suppress warnings when getting wmi WDS peer event id · 03c41cc1
      Mohammed Shafi Shajakhan authored
      'WMI_10_4_WDS_PEER_EVENTID' is not yet handled/implemented for WDS mode,
      as of now suppress the warning message "Unknown eventid: 36903"
      Signed-off-by: default avatarMohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      03c41cc1
    • Rajkumar Manoharan's avatar
      ath10k: fix group privacy action frame decryption for qca4019 · 7d42298e
      Rajkumar Manoharan authored
      Recent commit 46f6b060 ("mac80211: Encrypt "Group addressed privacy" action
      frames") encrypts group privacy action frames. But qca99x0 family chipset
      delivers broadcast/multicast management frames as encrypted and it should be
      decrypted by mac80211. Setting RX_FLAG_DECRYPTED stats for those frames is
      breaking mesh connection establishment.
      Signed-off-by: default avatarRajkumar Manoharan <rmanohar@qti.qualcomm.com>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      7d42298e
    • Maharaja Kennadyrajan's avatar
      ath10k: hide kernel addresses from logs using %pK format specifier · 75b34800
      Maharaja Kennadyrajan authored
      With the %pK format specifier we hide the kernel addresses
      with the help of kptr_restrict sysctl.
      In this patch, %p is changed to %pK in the driver code.
      
      The sysctl is documented in Documentation/sysctl/kernel.txt.
      Signed-off-by: default avatarMaharaja Kennadyrajan <c_mkenna@qti.qualcomm.com>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      75b34800
    • Tamizh chelvam's avatar
      ath10k: Add WMI_SERVICE_PERIODIC_CHAN_STAT_SUPPORT wmi service · 64ed5771
      Tamizh chelvam authored
      WMI_SERVICE_PERIODIC_CHAN_STAT_SUPPORT service has missed in
      the commit 7e247a9e ("ath10k: add dynamic tx mode switch
      config support for qca4019"). This patch adds the service to
      avoid mismatch between host and target.
      
      Fixes: 7e247a9e ("ath10k: add dynamic tx mode switch config support for qca4019")
      Signed-off-by: default avatarTamizh chelvam <c_traja@qti.qualcomm.com>
      Signed-off-by: default avatarKalle Valo <kvalo@qca.qualcomm.com>
      64ed5771
  3. 31 Aug, 2016 11 commits
  4. 19 Aug, 2016 10 commits
  5. 18 Aug, 2016 2 commits
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 60747ef4
      David S. Miller authored
      Minor overlapping changes for both merge conflicts.
      
      Resolution work done by Stephen Rothwell was used
      as a reference.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      60747ef4
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 184ca823
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Buffers powersave frame test is reversed in cfg80211, fix from Felix
          Fietkau.
      
       2) Remove bogus WARN_ON in openvswitch, from Jarno Rajahalme.
      
       3) Fix some tg3 ethtool logic bugs, and one that would cause no
          interrupts to be generated when rx-coalescing is set to 0.  From
          Satish Baddipadige and Siva Reddy Kallam.
      
       4) QLCNIC mailbox corruption and napi budget handling fix from Manish
          Chopra.
      
       5) Fix fib_trie logic when walking the trie during /proc/net/route
          output than can access a stale node pointer.  From David Forster.
      
       6) Several sctp_diag fixes from Phil Sutter.
      
       7) PAUSE frame handling fixes in mlxsw driver from Ido Schimmel.
      
       8) Checksum fixup fixes in bpf from Daniel Borkmann.
      
       9) Memork leaks in nfnetlink, from Liping Zhang.
      
      10) Use after free in rxrpc, from David Howells.
      
      11) Use after free in new skb_array code of macvtap driver, from Jason
          Wang.
      
      12) Calipso resource leak, from Colin Ian King.
      
      13) mediatek bug fixes (missing stats sync init, etc.) from Sean Wang.
      
      14) Fix bpf non-linear packet write helpers, from Daniel Borkmann.
      
      15) Fix lockdep splats in macsec, from Sabrina Dubroca.
      
      16) hv_netvsc bug fixes from Vitaly Kuznetsov, mostly to do with VF
          handling.
      
      17) Various tc-action bug fixes, from CONG Wang.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
        net_sched: allow flushing tc police actions
        net_sched: unify the init logic for act_police
        net_sched: convert tcf_exts from list to pointer array
        net_sched: move tc offload macros to pkt_cls.h
        net_sched: fix a typo in tc_for_each_action()
        net_sched: remove an unnecessary list_del()
        net_sched: remove the leftover cleanup_a()
        mlxsw: spectrum: Allow packets to be trapped from any PG
        mlxsw: spectrum: Unmap 802.1Q FID before destroying it
        mlxsw: spectrum: Add missing rollbacks in error path
        mlxsw: reg: Fix missing op field fill-up
        mlxsw: spectrum: Trap loop-backed packets
        mlxsw: spectrum: Add missing packet traps
        mlxsw: spectrum: Mark port as active before registering it
        mlxsw: spectrum: Create PVID vPort before registering netdevice
        mlxsw: spectrum: Remove redundant errors from the code
        mlxsw: spectrum: Don't return upon error in removal path
        i40e: check for and deal with non-contiguous TCs
        ixgbe: Re-enable ability to toggle VLAN filtering
        ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths
        ...
      184ca823
  6. 17 Aug, 2016 1 commit
    • David S. Miller's avatar
      Merge branch 'strparser' · 48433419
      David S. Miller authored
      Tom Herbert says:
      
      ====================
      strp: Stream parser for messages
      
      This patch set introduces a utility for parsing application layer
      protocol messages in a TCP stream. This is a generalization of the
      mechanism implemented of Kernel Connection Multiplexor.
      
      This patch set adapts KCM to use the strparser. We expect that kTLS
      can use this mechanism also. RDS would probably be another candidate
      to use a common stream parsing mechanism.
      
      The API includes a context structure, a set of callbacks, utility
      functions, and a data ready function. The callbacks include
      a parse_msg function that is called to perform parsing (e.g.
      BPF parsing in case of KCM), and a rcv_msg function that is called
      when a full message has been completed.
      
      For strparser we specify the return codes from the parser to allow
      the backend to indicate that control of the socket should be
      transferred back to userspace to handle some exceptions in the
      stream: The return values are:
      
            >0 : indicates length of successfully parsed message
             0  : indicates more data must be received to parse the message
             -ESTRPIPE : current message should not be processed by the
                kernel, return control of the socket to userspace which
                can proceed to read the messages itself
             other < 0 : Error is parsing, give control back to userspace
                assuming that synchronization is lost and the stream
                is unrecoverable (application expected to close TCP socket)
      
      There is one issue I haven't been able to fully resolve. If parse_msg
      returns ESTRPIPE (wants control back to userspace) the parser may
      already have consumed some bytes of the message. There is no way to
      put bytes back into the TCP receive queue and tcp_read_sock does not
      allow an easy way to peek messages. In lieu of a better solution, we
      return ENODATA on the socket to indicate that the data stream is
      unrecoverable (application needs to close socket). This condition
      should only happen if an application layer message header is split
      across two skbuffs and parsing just the first skbuff wasn't sufficient
      to determine the that transfer to userspace is needed.
      
      This patch set contains:
      
        - strparser implementation
        - changes to kcm to use strparser
        - strparser.txt documentation
      
      v2:
        - Add copyright notice to C files
        - Remove GPL module license from strparser.c
        - Add report of rxpause
      
      v3:
        - Restore GPL module license
        - Use EXPORT_SYMBOL_GPL
      
      v4:
        - Removed unused function, changed another to be static as suggested
          by davem
        - Rewoked data_ready to be called from upper layer, no longer requires
          taking over socket data_ready callback as suggested by Lance Chao
      
      Tested:
        - Ran a KCM thrash test for 24 hours. No behavioral or performance
          differences observed.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      48433419