1. 11 Aug, 2015 5 commits
    • net: fs_enet: explicitly remove I flag on TX partial frames · 8961822c
      LEROY Christophe authored
      We are not interested in interrupts for partially transmitted frames.
      We have to clear BD_ENET_TX_INTR explicitly, otherwise it may remain
      set from a previously used descriptor.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
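      A minimal sketch of the idea, not the driver's actual code. The SKETCH_TX_* names and bit values are hypothetical stand-ins for the driver's BD_ENET_TX_* definitions, and sketch_tx_status() is an illustrative helper. It shows why the interrupt bit must be cleared explicitly when a descriptor is reused for a non-final fragment.

```c
#include <stdint.h>

/* Hypothetical flag values, chosen for illustration only; the driver's
 * real BD_ENET_TX_* definitions live in its own headers. */
#define SKETCH_TX_READY 0x8000u
#define SKETCH_TX_WRAP  0x2000u
#define SKETCH_TX_INTR  0x1000u
#define SKETCH_TX_LAST  0x0800u

/* Compute the status word for one buffer descriptor of a frame.
 * old_status is whatever the descriptor held from its previous use. */
uint16_t sketch_tx_status(uint16_t old_status, int is_last_fragment)
{
    uint16_t sc = old_status;

    /* Clear the interrupt and last-fragment bits explicitly, so nothing
     * stale survives from a previously used descriptor. */
    sc &= (uint16_t)~(SKETCH_TX_INTR | SKETCH_TX_LAST);
    sc |= SKETCH_TX_READY;

    /* Only the final fragment of a frame should raise an interrupt. */
    if (is_last_fragment)
        sc |= SKETCH_TX_INTR | SKETCH_TX_LAST;

    return sc;
}
```

      Without the explicit clear, a descriptor that previously carried the last fragment of a frame would keep its interrupt bit when reused for a partial frame.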
    • inet: fix possible request socket leak · 3257d8b1
      Eric Dumazet authored
      In commit b357a364 ("inet: fix possible panic in
      reqsk_queue_unlink()"), I missed the fact that tcp_check_req()
      can return the listener socket in one case, and that we must
      release the request socket refcount or we leak it.
      
      Tested:
      
      The following packetdrill test template shows the issue:
      
      0     socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +0    setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      +0    bind(3, ..., ...) = 0
      +0    listen(3, 1) = 0
      
      +0    < S 0:0(0) win 2920 <mss 1460,sackOK,nop,nop>
      +0    > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK>
      +.002 < . 1:1(0) ack 21 win 2920
      +0    > R 21:21(0)
      
      Fixes: b357a364 ("inet: fix possible panic in reqsk_queue_unlink()")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet: fix races with reqsk timers · 2235f2ac
      Eric Dumazet authored
      reqsk_queue_destroy() and reqsk_queue_unlink() should use
      del_timer_sync() instead of del_timer() before calling reqsk_put(),
      otherwise we could free a req still used by another cpu.
      
      But before doing so, reqsk_queue_destroy() must release the syn_wait_lock
      spinlock or risk a deadlock, as reqsk_timer_handler() might
      need to take this same spinlock from reqsk_queue_unlink() (called from
      inet_csk_reqsk_queue_drop()).
      
      Fixes: fa76ce73 ("inet: get rid of central tcp/dccp listener timer")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mkiss: Fix error handling in mkiss_open() · 9d332d92
      Fabio Estevam authored
      If register_netdev() fails we are not propagating the error and
      we return success because ax_open() succeeded previously.
      
      Fix this by checking the return values of ax_open() and
      register_netdev() and propagating the error in case of failure.
      Reported-by: RUC_Soft_Sec <zy900702@163.com>
      Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
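      The error-propagation pattern described above can be sketched in user-space C. Everything here is hypothetical: struct sketch_dev and the sketch_* helpers stand in for the real device, ax_open() and register_netdev(), and the fail_* parameters exist only to simulate failures.

```c
#include <errno.h>

/* Hypothetical device state used only for this illustration. */
struct sketch_dev {
    int opened;
    int registered;
};

/* Stand-in for the "open" step; returns a negative errno on failure. */
int sketch_open(struct sketch_dev *d, int fail)
{
    if (fail)
        return -ENOMEM;
    d->opened = 1;
    return 0;
}

/* Stand-in for the "register" step; returns a negative errno on failure. */
int sketch_register(struct sketch_dev *d, int fail)
{
    if (fail)
        return -ENODEV;
    d->registered = 1;
    return 0;
}

/* Undo the "open" step. */
void sketch_close(struct sketch_dev *d)
{
    d->opened = 0;
}

/* Check each step and propagate its error instead of returning success. */
int sketch_setup(struct sketch_dev *d, int fail_open, int fail_register)
{
    int err;

    err = sketch_open(d, fail_open);
    if (err)
        return err;

    err = sketch_register(d, fail_register);
    if (err) {
        sketch_close(d);   /* unwind the step that did succeed */
        return err;
    }

    return 0;
}
```

      The point of the sketch is that the result of the second step overrides the earlier success, and that the caller sees the specific error code.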
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 18255457
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains five Netfilter fixes for your net tree,
      they are:
      
      1) Silence a warning on falling back to vmalloc(). Since 88eab472, we can
         easily hit this warning message, which confuses users. So let's get rid
         of it.
      
      2) Recently, when porting the template object allocation on top of kmalloc to
         fix the netns dependencies between x_tables and conntrack, the error
         checks were left unchanged. Remove IS_ERR() and check for NULL instead.
         Patch from Dan Carpenter.
      
      3) Don't ignore gfp_flags in the new nf_ct_tmpl_alloc() function, from
         Joe Stringer.
      
      4) Fix a crash due to NULL pointer dereference in ip6t_SYNPROXY, patch from
         Phil Sutter.
      
      5) The sequence number of the SYN+ACK that is sent from SYNPROXY to clients is
         not adjusted through our NAT infrastructure. As a result, the client may
         ignore this TCP packet and the TCP flow hangs until the client probes us. Also
         from Phil Sutter.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 10 Aug, 2015 14 commits
    • Merge branch 'bnx2x-fixes' · 875a74b6
      David S. Miller authored
      Yuval Mintz says:
      
      ====================
      bnx2x: small fixes
      
      This adds 2 small fixes, one to error flows during memory release
      and the other to flash writes via ethtool API.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: Free NVRAM lock at end of each page · 0ea853df
      Yuval Mintz authored
      Writing each 4Kb page into flash might take up to ~100 milliseconds,
      during which time management firmware cannot access the nvram for its
      own uses.

      Firmware upgrade utilities use the ethtool API to burn new flash images
      for the device, doing so by writing several pages' worth
      of data on each command. Such action might create problems for the
      management firmware, as the nvram might not be accessible for a long time.
      
      This patch changes the write implementation, releasing the nvram lock on
      the completion of each page, allowing the management firmware time to
      claim it and perform its own required actions.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
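      A rough sketch of the per-page locking idea, under stated assumptions: struct sketch_nvram, the sketch_* helpers and the 4096-byte SKETCH_PAGE_BYTES value are all hypothetical and do not reflect the driver's real data structures or constants. The lock is taken and dropped once per page, so another party gets a chance to claim it between pages.

```c
#include <stddef.h>

/* Assumed page size for this illustration only. */
#define SKETCH_PAGE_BYTES 4096u

/* Hypothetical model of the flash and its lock. */
struct sketch_nvram {
    int locked;               /* 1 while the writer holds the lock */
    unsigned int lock_cycles; /* how many times the lock was taken */
    size_t written;           /* total bytes "written" so far */
};

void sketch_lock(struct sketch_nvram *nv)
{
    nv->locked = 1;
    nv->lock_cycles++;
}

void sketch_unlock(struct sketch_nvram *nv)
{
    nv->locked = 0;
}

/* Write len bytes starting at offset, releasing the lock at the end of
 * every page instead of holding it for the whole buffer. */
int sketch_write(struct sketch_nvram *nv, size_t offset, size_t len)
{
    size_t end = offset + len;

    while (offset < end) {
        /* First byte of the next page after the current offset. */
        size_t page_end = (offset / SKETCH_PAGE_BYTES + 1) * SKETCH_PAGE_BYTES;
        size_t chunk_end = page_end < end ? page_end : end;

        sketch_lock(nv);
        nv->written += chunk_end - offset;  /* stands in for the flash write */
        sketch_unlock(nv);                  /* others may claim the lock here */

        offset = chunk_end;
    }

    return 0;
}
```

      Holding the lock only for one page bounds the time the other party has to wait to roughly the duration of a single page write.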
    • bnx2x: Prevent null pointer dereference on SKB release · e1615903
      Yuval Mintz authored
      On error flows it's possible to free an SKB even if it was not allocated.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: missing curly braces in t4_setup_debugfs() · 21a44763
      Dan Carpenter authored
      The curly braces were missing, so add_debugfs_mem() was being called
      unintentionally.
      
      Fixes: 3ccc6cf7 ('cxgb4: Adds support for T6 adapter')
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
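      The class of bug described above can be illustrated with a small stand-alone sketch. The names are hypothetical (struct sketch_dbg and sketch_add_mem() stand in for the debugfs state and add_debugfs_mem()), and the buggy function is laid out the way the compiler parses it, not the way the original code was indented.

```c
/* Hypothetical record of what was registered. */
struct sketch_dbg {
    int mem_entries;
    int last_size;
};

/* Stand-in for the call that should only happen for enabled regions. */
void sketch_add_mem(struct sketch_dbg *d, int size)
{
    d->mem_entries++;
    d->last_size = size;
}

/* Buggy shape: without braces, only the assignment belongs to the if,
 * so the registration call runs unconditionally. */
void sketch_setup_buggy(struct sketch_dbg *d, int region_enabled)
{
    int size = 0;

    if (region_enabled)
        size = 64;
    sketch_add_mem(d, size);   /* runs even when the region is disabled */
}

/* Fixed shape: braces make both statements conditional. */
void sketch_setup_fixed(struct sketch_dbg *d, int region_enabled)
{
    if (region_enabled) {
        int size = 64;

        sketch_add_mem(d, size);
    }
}
```

      With the region disabled, the buggy variant still registers an entry (with a meaningless size), while the fixed variant registers nothing.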
    • net-timestamp: Update skb_complete_tx_timestamp comment · 7a76a021
      Benjamin Poirier authored
      After "62bccb8c net-timestamp: Make the clone operation stand-alone from phy
      timestamping" the hwtstamps parameter of skb_complete_tx_timestamp() may no
      longer be NULL.
      Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
      Cc: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: don't reject link-local nexthop on other interface · 330567b7
      Florian Westphal authored
      48ed7b26 ("ipv6: reject locally assigned nexthop addresses") is too
      strict; it rejects the following corner case:
      
      ip -6 route add default via fe80::1:2:3 dev eth1
      
      [ where fe80::1:2:3 is assigned to a local interface, but not eth1 ]
      
      Fix this by restricting the search to the given device if the nexthop is link-local.
      
      Joint work with Hannes Frederic Sowa.
      
      Fixes: 48ed7b26 ("ipv6: reject locally assigned nexthop addresses")
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netlink: make sure -EBUSY won't escape from netlink_insert · 4e7c1330
      Daniel Borkmann authored
      Linus reports the following deadlock on rtnl_mutex; triggered only
      once so far (extract):
      
      [12236.694209] NetworkManager  D 0000000000013b80     0  1047      1 0x00000000
      [12236.694218]  ffff88003f902640 0000000000000000 ffffffff815d15a9 0000000000000018
      [12236.694224]  ffff880119538000 ffff88003f902640 ffffffff81a8ff84 00000000ffffffff
      [12236.694230]  ffffffff81a8ff88 ffff880119c47f00 ffffffff815d133a ffffffff81a8ff80
      [12236.694235] Call Trace:
      [12236.694250]  [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
      [12236.694257]  [<ffffffff815d133a>] ? schedule+0x2a/0x70
      [12236.694263]  [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
      [12236.694271]  [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
      [12236.694280]  [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
      [12236.694291]  [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
      [12236.694299]  [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
      [12236.694309]  [<ffffffff814f5ad3>] ? rtnl_getlink+0x113/0x190
      [12236.694319]  [<ffffffff814f202a>] ? rtnetlink_rcv_msg+0x7a/0x210
      [12236.694331]  [<ffffffff8124565c>] ? sock_has_perm+0x5c/0x70
      [12236.694339]  [<ffffffff814f1fb0>] ? rtnetlink_rcv+0x30/0x30
      [12236.694346]  [<ffffffff8150d62c>] ? netlink_rcv_skb+0x9c/0xc0
      [12236.694354]  [<ffffffff814f1f9f>] ? rtnetlink_rcv+0x1f/0x30
      [12236.694360]  [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
      [12236.694367]  [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
      [12236.694376]  [<ffffffff810a236f>] ? __wake_up+0x2f/0x50
      [12236.694387]  [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
      [12236.694396]  [<ffffffff814cb05e>] ? ___sys_sendmsg+0x22e/0x240
      [12236.694405]  [<ffffffff814cab75>] ? ___sys_recvmsg+0x135/0x1a0
      [12236.694415]  [<ffffffff811a9d12>] ? eventfd_write+0x82/0x210
      [12236.694423]  [<ffffffff811a0f9e>] ? fsnotify+0x32e/0x4c0
      [12236.694429]  [<ffffffff8108cb70>] ? wake_up_q+0x60/0x60
      [12236.694434]  [<ffffffff814cba09>] ? __sys_sendmsg+0x39/0x70
      [12236.694440]  [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
      
      So far it seems plausible that the recursive call into rtnetlink_rcv()
      is the suspicious part. One way this could trigger is that the sender's
      NETLINK_CB(skb).portid was wrongly 0 (which is the rtnetlink socket), so
      the rtnl_getlink() request's answer would be sent to the kernel instead
      of to the actual user process, thus grabbing rtnl_mutex twice.
      
      One theory would be that netlink_autobind() triggered via netlink_sendmsg()
      internally overwrites the -EBUSY error to 0, but where it is wrongly
      originating from __netlink_insert() instead. That would reset the
      socket's portid to 0, which is then filled into NETLINK_CB(skb).portid
      later on. As commit d470e3b4 ("[NETLINK]: Fix two socket hashing bugs.")
      also puts it, -EBUSY should not be propagated from netlink_insert().
      
      It looks like it's very unlikely to reproduce. We need to trigger the
      rhashtable_insert_rehash() handler under a situation where rehashing
      currently occurs (one /rare/ way would be to hit ht->elasticity limits
      while not filled enough to expand the hashtable, but that would rather
      require a specifically crafted bind() sequence with knowledge about
      destination slots, seems unlikely). It probably makes sense to guard
      __netlink_insert() in any case and remap that error. It was suggested
      that EOVERFLOW might be better than an already overloaded ENOMEM.
      
      Reference: http://thread.gmane.org/gmane.linux.network/372676
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
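      The remapping suggested in the last paragraph of the message can be sketched as a tiny helper. sketch_remap_insert_error() is a hypothetical name, not a kernel function; it only shows the idea that -EBUSY coming from the hash table backend is translated before it reaches callers that give -EBUSY a different meaning.

```c
#include <errno.h>

/* Translate an error from a (hypothetical) hash table insert so that
 * -EBUSY never escapes to the caller; all other values pass through. */
int sketch_remap_insert_error(int err)
{
    if (err == -EBUSY)
        return -EOVERFLOW;
    return err;
}
```

      A caller that treats -EBUSY as "already bound, carry on" would otherwise mistake a transient backend condition for success.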
    • bna: fix interrupts storm caused by erroneous packets · ade4dc3e
      Ivan Vecera authored
      The commit "e29aa339 bna: Enable Multi Buffer RX" moved the packet counter
      increment from the beginning of the NAPI processing loop to after the check
      for erroneous packets, so erroneous packets are never accounted. This counter
      is used to inform firmware about the number of processed completions (packets).
      As these packets are never acked, the firmware fires IRQs for them again
      and again.
      
      Fixes: e29aa339 ("bna: Enable Multi Buffer RX")
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Acked-by: Rasesh Mody <rasesh.mody@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
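      A simplified sketch of the accounting problem, with hypothetical names throughout (struct sketch_cqe and the sketch_poll_* functions are not taken from the driver). Each function returns the number of completions it would report back to the firmware as processed.

```c
#include <stddef.h>

/* Hypothetical completion entry: only the error flag matters here. */
struct sketch_cqe {
    int has_error;
};

/* Buggy ordering: the counter is incremented after the error check,
 * so erroneous completions are skipped without being counted. */
unsigned int sketch_poll_buggy(const struct sketch_cqe *cq, size_t n)
{
    unsigned int acked = 0;

    for (size_t i = 0; i < n; i++) {
        if (cq[i].has_error)
            continue;       /* dropped before being counted */
        acked++;
    }
    return acked;
}

/* Fixed ordering: every completion is counted first, then erroneous
 * ones are dropped. */
unsigned int sketch_poll_fixed(const struct sketch_cqe *cq, size_t n)
{
    unsigned int acked = 0;

    for (size_t i = 0; i < n; i++) {
        acked++;            /* counted regardless of the error flag */
        if (cq[i].has_error)
            continue;       /* dropped, but already accounted */
        /* a real driver would hand the packet to the stack here */
    }
    return acked;
}
```

      With one erroneous entry among three completions, the buggy variant reports two and leaves one unacknowledged, which is the situation the message describes as causing repeated IRQs.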
    • Merge branch 'mvpp2-fixes' · ea708584
      David S. Miller authored
      Marcin Wojtas says:
      
      ====================
      Fixes for the network driver of Marvell Armada 375 SoC
      
      This is a set of three patches that fix long-standing problems introduced with
      the initial support for the Armada 375 network controller.

      Due to an inappropriate approach to per-CPU processing of sent packets
      on the TX path, the driver suffered from numerous problems, such as RCU
      stalls. Those have been fixed; details can be found in the commit
      logs. The patches were intensively tested on top of v4.2-rc5.
      
      I'm looking forward to any comments or remarks.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvpp2: replace TX coalescing interrupts with hrtimer · edc660fa
      Marcin Wojtas authored
      The PP2 controller is capable of per-CPU TX processing, which means there are
      per-CPU banked register sets and queues. The current version of the driver supports
      TX packet coalescing: once the number of packets sent on a given CPU reaches a
      threshold value, an IRQ occurs. However, there is a single interrupt line responsible for
      CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not
      support RSS).
      
      When the top half executes, the interrupt cause is not known. This is why in
      the NAPI poll function, along with RX processing, the IRQ cause register on both
      CPUs is accessed in order to determine on which of them the TX coalescing
      threshold might have been reached. Thus the egress processing and releasing of the
      buffers is able to take place on the corresponding CPU. This approach led
      to an illegal usage of the on_each_cpu function in softirq context.
      
      The problem is solved by abandoning TX coalescing interrupts and separating
      egress finalization from NAPI processing. For that purpose a method of using
      an hrtimer is introduced. In the main transmit function (mvpp2_tx) buffers are released
      once a software coalescing threshold is reached. In case not all the data is
      processed, a timer is set on this CPU; in its interrupt context a tasklet is
      scheduled in which all queues are processed. Only one timer per CPU can
      be running at a time, which is controlled by a dedicated flag.
      
      This commit removes TX processing from NAPI polling function, disables hardware
      coalescing and enables hrtimer with tasklet, using new per-CPU port structure
      (mvpp2_port_pcpu).
      Signed-off-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvpp2: enable proper per-CPU TX buffers unmapping · 71ce391d
      Marcin Wojtas authored
      The mvpp2 driver allows usage of per-CPU TX processing. Once the packets are
      prepared independently on each CPU, the hardware enqueues the descriptors in
      a common TX queue. After they are sent, the buffers and associated sk_buffs
      should be released on the corresponding CPU.
      
      This is why a special index is maintained in order to point to the right data to
      be released after transmission takes place. Each per-CPU TX queue comprises an
      array of sent sk_buffs, freed in the mvpp2_txq_bufs_free function. However, the
      index was also used there for obtaining a descriptor (and therefore a buffer to
      be DMA-unmapped) from the common TX queue, which was wrong, because it was not
      referring to the current CPU.
      
      This commit enables proper unmapping of sent data buffers by indexing them in
      per-CPU queues using a dedicated array for keeping their physical addresses.
      Signed-off-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvpp2: remove excessive spinlocks from driver initialization · d53793c5
      Marcin Wojtas authored
      Using spinlock protection during one-time driver initialization is not
      necessary. Moreover, it resulted in an invalid GFP_KERNEL allocation under the lock.

      This commit removes redundant spinlocks from the buffer manager part of mvpp2
      initialization.
      Signed-off-by: Marcin Wojtas <mw@semihalf.com>
      Reported-by: Alexandre Fournier <alexandre.fournier@wisp-e.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netfilter: SYNPROXY: fix sending window update to client · 3c16241c
      Phil Sutter authored
      Upon receipt of SYNACK from the server, ipt_SYNPROXY first sends back an ACK to
      finish the server handshake, then calls nf_ct_seqadj_init() to initiate
      sequence number adjustment of forwarded packets to the client and finally sends
      a window update to the client to unblock its TX queue.
      
      Since synproxy_send_client_ack() does not set synproxy_send_tcp()'s nfct
      parameter, no sequence number adjustment happens and the client receives the
      window update with an incorrect sequence number. Depending on the client's TCP
      implementation, this leads to a significant delay (until a window probe is
      sent).
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: ip6t_SYNPROXY: fix NULL pointer dereference · 96fffb4f
      Phil Sutter authored
      This happens when networking namespaces are enabled.
      Suggested-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Acked-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  3. 07 Aug, 2015 18 commits
  4. 05 Aug, 2015 1 commit
  5. 04 Aug, 2015 2 commits