1. 26 May, 2015 29 commits
  2. 08 May, 2015 1 commit
  3. 05 May, 2015 10 commits
    • David S. Miller's avatar
      ipv4: Missing sk_nulls_node_init() in ping_unhash(). · d2084ea8
      David S. Miller authored
      [ Upstream commit a134f083 ]
      
      If we don't do that, then the poison value is left in the ->pprev
      backlink.
      
      This can cause crashes if we do a disconnect, followed by a connect().
      Tested-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Reported-by: default avatarWen Xu <hotdog3645@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      d2084ea8
    • Eric Dumazet's avatar
      net: fix crash in build_skb() · f50b24ae
      Eric Dumazet authored
      [ Upstream commit 2ea2f62c ]
      
      When I added pfmemalloc support in build_skb(), I forgot netlink
      was using build_skb() with a vmalloc() area.
      
      In this patch I introduce __build_skb() for netlink use,
      and build_skb() is a wrapper handling both skb->head_frag and
      skb->pfmemalloc
      
      This means netlink no longer has to hack skb->head_frag
      
      [ 1567.700067] kernel BUG at arch/x86/mm/physaddr.c:26!
      [ 1567.700067] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
      [ 1567.700067] Dumping ftrace buffer:
      [ 1567.700067]    (ftrace buffer empty)
      [ 1567.700067] Modules linked in:
      [ 1567.700067] CPU: 9 PID: 16186 Comm: trinity-c182 Not tainted 4.0.0-next-20150424-sasha-00037-g4796e21 #2167
      [ 1567.700067] task: ffff880127efb000 ti: ffff880246770000 task.ti: ffff880246770000
      [ 1567.700067] RIP: __phys_addr (arch/x86/mm/physaddr.c:26 (discriminator 3))
      [ 1567.700067] RSP: 0018:ffff8802467779d8  EFLAGS: 00010202
      [ 1567.700067] RAX: 000041000ed8e000 RBX: ffffc9008ed8e000 RCX: 000000000000002c
      [ 1567.700067] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffffb3fd6049
      [ 1567.700067] RBP: ffff8802467779f8 R08: 0000000000000019 R09: ffff8801d0168000
      [ 1567.700067] R10: ffff8801d01680c7 R11: ffffed003a02d019 R12: ffffc9000ed8e000
      [ 1567.700067] R13: 0000000000000f40 R14: 0000000000001180 R15: ffffc9000ed8e000
      [ 1567.700067] FS:  00007f2a7da3f700(0000) GS:ffff8801d1000000(0000) knlGS:0000000000000000
      [ 1567.700067] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1567.700067] CR2: 0000000000738308 CR3: 000000022e329000 CR4: 00000000000007e0
      [ 1567.700067] Stack:
      [ 1567.700067]  ffffc9000ed8e000 ffff8801d0168000 ffffc9000ed8e000 ffff8801d0168000
      [ 1567.700067]  ffff880246777a28 ffffffffad7c0a21 0000000000001080 ffff880246777c08
      [ 1567.700067]  ffff88060d302e68 ffff880246777b58 ffff880246777b88 ffffffffad9a6821
      [ 1567.700067] Call Trace:
      [ 1567.700067] build_skb (include/linux/mm.h:508 net/core/skbuff.c:316)
      [ 1567.700067] netlink_sendmsg (net/netlink/af_netlink.c:1633 net/netlink/af_netlink.c:2329)
      [ 1567.774369] ? sched_clock_cpu (kernel/sched/clock.c:311)
      [ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273)
      [ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273)
      [ 1567.774369] sock_sendmsg (net/socket.c:614 net/socket.c:623)
      [ 1567.774369] sock_write_iter (net/socket.c:823)
      [ 1567.774369] ? sock_sendmsg (net/socket.c:806)
      [ 1567.774369] __vfs_write (fs/read_write.c:479 fs/read_write.c:491)
      [ 1567.774369] ? get_lock_stats (kernel/locking/lockdep.c:249)
      [ 1567.774369] ? default_llseek (fs/read_write.c:487)
      [ 1567.774369] ? vtime_account_user (kernel/sched/cputime.c:701)
      [ 1567.774369] ? rw_verify_area (fs/read_write.c:406 (discriminator 4))
      [ 1567.774369] vfs_write (fs/read_write.c:539)
      [ 1567.774369] SyS_write (fs/read_write.c:586 fs/read_write.c:577)
      [ 1567.774369] ? SyS_read (fs/read_write.c:577)
      [ 1567.774369] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
      [ 1567.774369] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2594 kernel/locking/lockdep.c:2636)
      [ 1567.774369] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42)
      [ 1567.774369] system_call_fastpath (arch/x86/kernel/entry_64.S:261)
      
      Fixes: 79930f58 ("net: do not deplete pfmemalloc reserve")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      f50b24ae
    • Eric Dumazet's avatar
      net: do not deplete pfmemalloc reserve · 9e9c24cd
      Eric Dumazet authored
      [ Upstream commit 79930f58 ]
      
      build_skb() should look at the page pfmemalloc status.
      If set, this means page allocator allocated this page in the
      expectation it would help to free other pages. Networking
      stack can do that only if skb->pfmemalloc is also set.
      
      Also, we must refrain using high order pages from the pfmemalloc
      reserve, so __page_frag_refill() must also use __GFP_NOMEMALLOC for
      them. Under memory pressure, using order-0 pages is probably the best
      strategy.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      9e9c24cd
    • Eric Dumazet's avatar
      tcp: avoid looping in tcp_send_fin() · b73fa167
      Eric Dumazet authored
      [ Upstream commit 845704a5 ]
      
      Presence of an unbound loop in tcp_send_fin() had always been hard
      to explain when analyzing crash dumps involving gigantic dying processes
      with millions of sockets.
      
      Lets try a different strategy :
      
      In case of memory pressure, try to add the FIN flag to last packet
      in write queue, even if packet was already sent. TCP stack will
      be able to deliver this FIN after a timeout event. Note that this
      FIN being delivered by a retransmit, it also carries a Push flag
      given our current implementation.
      
      By checking sk_under_memory_pressure(), we anticipate that cooking
      many FIN packets might deplete tcp memory.
      
      In the case we could not allocate a packet, even with __GFP_WAIT
      allocation, then not sending a FIN seems quite reasonable if it allows
      to get rid of this socket, free memory, and not block the process from
      eventually doing other useful work.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      b73fa167
    • Eric Dumazet's avatar
      tcp: fix possible deadlock in tcp_send_fin() · ae806cd1
      Eric Dumazet authored
      [ Upstream commit d83769a5 ]
      
      Using sk_stream_alloc_skb() in tcp_send_fin() is dangerous in
      case a huge process is killed by OOM, and tcp_mem[2] is hit.
      
      To be able to free memory we need to make progress, so this
      patch allows FIN packets to not care about tcp_mem[2], if
      skb allocation succeeded.
      
      In a follow-up patch, we might abort tcp_send_fin() infinite loop
      in case TIF_MEMDIE is set on this thread, as memory allocator
      did its best getting extra memory already.
      
      This patch reverts d22e1537 ("tcp: fix tcp fin memory accounting")
      
      Fixes: d22e1537 ("tcp: fix tcp fin memory accounting")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      ae806cd1
    • Sebastian Pöhn's avatar
      ip_forward: Drop frames with attached skb->sk · 74c9bf2b
      Sebastian Pöhn authored
      [ Upstream commit 2ab95749 ]
      
      Initial discussion was:
      [FYI] xfrm: Don't lookup sk_policy for timewait sockets
      
      Forwarded frames should not have a socket attached. Especially
      tw sockets will lead to panics later-on in the stack.
      
      This was observed with TPROXY assigning a tw socket and broken
      policy routing (misconfigured). As a result frame enters
      forwarding path instead of input. We cannot solve this in
      TPROXY as it cannot know that policy routing is broken.
      
      v2:
      Remove useless comment
      Signed-off-by: default avatarSebastian Poehn <sebastian.poehn@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      74c9bf2b
    • Eric W. Biederman's avatar
      gianfar: Carefully free skbs in functions called by netpoll. · 06b1d865
      Eric W. Biederman authored
      commit c9974ad4 upstream.
      
      netpoll can call functions in hard irq context that are ordinarily
      called in lesser contexts.  For those functions use dev_kfree_skb_any
      and dev_consume_skb_any so skbs are freed safely from hard irq
      context.
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      06b1d865
    • Eric W. Biederman's avatar
      benet: Call dev_kfree_skby_any instead of kfree_skb. · e52968f3
      Eric W. Biederman authored
      commit d8ec2c02 upstream.
      
      Replace free_skb with dev_kfree_skb_any in be_tx_compl_process as
      which can be called in hard irq by netpoll, softirq context
      by normal napi polling, and in normal sleepable context
      by the network device close method.
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      e52968f3
    • Eric W. Biederman's avatar
      ixgb: Call dev_kfree_skby_any instead of dev_kfree_skb. · ccef2f4b
      Eric W. Biederman authored
      commit f7e79913 upstream.
      
      Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
      be called in hard irq and other contexts.
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      ccef2f4b
    • Eric W. Biederman's avatar
      tg3: Call dev_kfree_skby_any instead of dev_kfree_skb. · ef1b2664
      Eric W. Biederman authored
      commit 497a27b9 upstream.
      
      Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
      be called in hard irq and other contexts.
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      ef1b2664