1. 26 Apr, 2012 1 commit
  2. 25 Apr, 2012 2 commits
    • Julian Anastasov's avatar
      ipvs: fix crash in ip_vs_control_net_cleanup on unload · 8f9b9a2f
      Julian Anastasov authored
      	commit 14e40546 (2.6.39)
      ("Add __ip_vs_control_{init,cleanup}_sysctl()")
      introduced regression due to wrong __net_init for
      __ip_vs_control_cleanup_sysctl. This leads to crash when
      the ip_vs module is unloaded.
      
      	Fix it by changing __net_init to __net_exit for
      the function that is already renamed to ip_vs_control_net_cleanup_sysctl.
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarHans Schillstrom <hans@schillstrom.com>
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      8f9b9a2f
    • Sasha Levin's avatar
      ipvs: Verify that IP_VS protocol has been registered · 7118c07a
      Sasha Levin authored
      The registration of a protocol might fail, there were no checks
      and all registrations were assumed to be correct. This lead to
      NULL ptr dereferences when apps tried registering.
      
      For example:
      
      [ 1293.226051] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
      [ 1293.227038] IP: [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0
      [ 1293.227038] PGD 391de067 PUD 6c20b067 PMD 0
      [ 1293.227038] Oops: 0000 [#1] PREEMPT SMP
      [ 1293.227038] CPU 1
      [ 1293.227038] Pid: 19609, comm: trinity Tainted: G        W    3.4.0-rc1-next-20120405-sasha-dirty #57
      [ 1293.227038] RIP: 0010:[<ffffffff822aacb0>]  [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0
      [ 1293.227038] RSP: 0018:ffff880038c1dd18  EFLAGS: 00010286
      [ 1293.227038] RAX: ffffffffffffffc0 RBX: 0000000000001500 RCX: 0000000000010000
      [ 1293.227038] RDX: 0000000000000000 RSI: ffff88003a2d5888 RDI: 0000000000000282
      [ 1293.227038] RBP: ffff880038c1dd48 R08: 0000000000000000 R09: 0000000000000000
      [ 1293.227038] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003a2d5668
      [ 1293.227038] R13: ffff88003a2d5988 R14: ffff8800696a8ff8 R15: 0000000000000000
      [ 1293.227038] FS:  00007f01930d9700(0000) GS:ffff88007ce00000(0000) knlGS:0000000000000000
      [ 1293.227038] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 1293.227038] CR2: 0000000000000018 CR3: 0000000065dfc000 CR4: 00000000000406e0
      [ 1293.227038] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1293.227038] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [ 1293.227038] Process trinity (pid: 19609, threadinfo ffff880038c1c000, task ffff88002dc73000)
      [ 1293.227038] Stack:
      [ 1293.227038]  ffff880038c1dd48 00000000fffffff4 ffff8800696aada0 ffff8800694f5580
      [ 1293.227038]  ffffffff8369f1e0 0000000000001500 ffff880038c1dd98 ffffffff822a716b
      [ 1293.227038]  0000000000000000 ffff8800696a8ff8 0000000000000015 ffff8800694f5580
      [ 1293.227038] Call Trace:
      [ 1293.227038]  [<ffffffff822a716b>] ip_vs_app_inc_new+0xdb/0x180
      [ 1293.227038]  [<ffffffff822a7258>] register_ip_vs_app_inc+0x48/0x70
      [ 1293.227038]  [<ffffffff822b2fea>] __ip_vs_ftp_init+0xba/0x140
      [ 1293.227038]  [<ffffffff821c9060>] ops_init+0x80/0x90
      [ 1293.227038]  [<ffffffff821c90cb>] setup_net+0x5b/0xe0
      [ 1293.227038]  [<ffffffff821c9416>] copy_net_ns+0x76/0x100
      [ 1293.227038]  [<ffffffff810dc92b>] create_new_namespaces+0xfb/0x190
      [ 1293.227038]  [<ffffffff810dca21>] unshare_nsproxy_namespaces+0x61/0x80
      [ 1293.227038]  [<ffffffff810afd1f>] sys_unshare+0xff/0x290
      [ 1293.227038]  [<ffffffff8187622e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [ 1293.227038]  [<ffffffff82665539>] system_call_fastpath+0x16/0x1b
      [ 1293.227038] Code: 89 c7 e8 34 91 3b 00 89 de 66 c1 ee 04 31 de 83 e6 0f 48 83 c6 22 48 c1 e6 04 4a 8b 14 26 49 8d 34 34 48 8d 42 c0 48 39 d6 74 13 <66> 39 58 58 74 22 48 8b 48 40 48 8d 41 c0 48 39 ce 75 ed 49 8d
      [ 1293.227038] RIP  [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0
      [ 1293.227038]  RSP <ffff880038c1dd18>
      [ 1293.227038] CR2: 0000000000000018
      [ 1293.379284] ---[ end trace 364ab40c7011a009 ]---
      [ 1293.381182] Kernel panic - not syncing: Fatal exception in interrupt
      Signed-off-by: default avatarSasha Levin <levinsasha928@gmail.com>
      Acked-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      7118c07a
  3. 10 Apr, 2012 2 commits
    • Gao feng's avatar
      netfilter: nf_conntrack: fix incorrect logic in nf_conntrack_init_net · 6ba90067
      Gao feng authored
      in function nf_conntrack_init_net,when nf_conntrack_timeout_init falied,
      we should call nf_conntrack_ecache_fini to do rollback.
      but the current code calls nf_conntrack_timeout_fini.
      Signed-off-by: default avatarGao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      6ba90067
    • Jozsef Kadlecsik's avatar
      netfilter: nf_ct_ipv4: packets with wrong ihl are invalid · 07153c6e
      Jozsef Kadlecsik authored
      It was reported that the Linux kernel sometimes logs:
      
      klogd: [2629147.402413] kernel BUG at net / netfilter /
      nf_conntrack_proto_tcp.c: 447!
      klogd: [1072212.887368] kernel BUG at net / netfilter /
      nf_conntrack_proto_tcp.c: 392
      
      ipv4_get_l4proto() in nf_conntrack_l3proto_ipv4.c and tcp_error() in
      nf_conntrack_proto_tcp.c should catch malformed packets, so the errors
      at the indicated lines - TCP options parsing - should not happen.
      However, tcp_error() relies on the "dataoff" offset to the TCP header,
      calculated by ipv4_get_l4proto().  But ipv4_get_l4proto() does not check
      bogus ihl values in IPv4 packets, which then can slip through tcp_error()
      and get caught at the TCP options parsing routines.
      
      The patch fixes ipv4_get_l4proto() by invalidating packets with bogus
      ihl value.
      
      The patch closes netfilter bugzilla id 771.
      Signed-off-by: default avatarJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      07153c6e
  4. 09 Apr, 2012 3 commits
  5. 06 Apr, 2012 8 commits
  6. 05 Apr, 2012 6 commits
    • stephen hemminger's avatar
      MAINTAINERS: update for Marvell Ethernet drivers · 44c14c1d
      stephen hemminger authored
      Marvell has agreed to do maintenance on the sky2 driver.
         * Add the developer to the maintainers file
         * Remove the old reference to the long gone (sk98lin) driver
         * Rearrange to fit current topic organization
      Signed-off-by: default avatarStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      44c14c1d
    • Veaceslav Falico's avatar
      bonding: properly unset current_arp_slave on slave link up · 5a430974
      Veaceslav Falico authored
      When a slave comes up, we're unsetting the current_arp_slave without
      removing active flags from it, which can lead to situations where we have
      more than one slave with active flags in active-backup mode.
      
      To avoid this situation we must remove the active flags from a slave before
      removing it as a current_arp_slave.
      Signed-off-by: default avatarVeaceslav Falico <vfalico@redhat.com>
      Signed-off-by: default avatarJay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: default avatarAndy Gospodarek <andy@greyhouse.net>
      Signed-off-by: default avatarMarcelo Ricardo Leitner <mleitner@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5a430974
    • Sasha Levin's avatar
      phonet: Check input from user before allocating · bcf1b70a
      Sasha Levin authored
      A phonet packet is limited to USHRT_MAX bytes, this is never checked during
      tx which means that the user can specify any size he wishes, and the kernel
      will attempt to allocate that size.
      
      In the good case, it'll lead to the following warning, but it may also cause
      the kernel to kick in the OOM and kill a random task on the server.
      
      [ 8921.744094] WARNING: at mm/page_alloc.c:2255 __alloc_pages_slowpath+0x65/0x730()
      [ 8921.749770] Pid: 5081, comm: trinity Tainted: G        W    3.4.0-rc1-next-20120402-sasha #46
      [ 8921.756672] Call Trace:
      [ 8921.758185]  [<ffffffff810b2ba7>] warn_slowpath_common+0x87/0xb0
      [ 8921.762868]  [<ffffffff810b2be5>] warn_slowpath_null+0x15/0x20
      [ 8921.765399]  [<ffffffff8117eae5>] __alloc_pages_slowpath+0x65/0x730
      [ 8921.769226]  [<ffffffff81179c8a>] ? zone_watermark_ok+0x1a/0x20
      [ 8921.771686]  [<ffffffff8117d045>] ? get_page_from_freelist+0x625/0x660
      [ 8921.773919]  [<ffffffff8117f3a8>] __alloc_pages_nodemask+0x1f8/0x240
      [ 8921.776248]  [<ffffffff811c03e0>] kmalloc_large_node+0x70/0xc0
      [ 8921.778294]  [<ffffffff811c4bd4>] __kmalloc_node_track_caller+0x34/0x1c0
      [ 8921.780847]  [<ffffffff821b0e3c>] ? sock_alloc_send_pskb+0xbc/0x260
      [ 8921.783179]  [<ffffffff821b3c65>] __alloc_skb+0x75/0x170
      [ 8921.784971]  [<ffffffff821b0e3c>] sock_alloc_send_pskb+0xbc/0x260
      [ 8921.787111]  [<ffffffff821b002e>] ? release_sock+0x7e/0x90
      [ 8921.788973]  [<ffffffff821b0ff0>] sock_alloc_send_skb+0x10/0x20
      [ 8921.791052]  [<ffffffff824cfc20>] pep_sendmsg+0x60/0x380
      [ 8921.792931]  [<ffffffff824cb4a6>] ? pn_socket_bind+0x156/0x180
      [ 8921.794917]  [<ffffffff824cb50f>] ? pn_socket_autobind+0x3f/0x90
      [ 8921.797053]  [<ffffffff824cb63f>] pn_socket_sendmsg+0x4f/0x70
      [ 8921.798992]  [<ffffffff821ab8e7>] sock_aio_write+0x187/0x1b0
      [ 8921.801395]  [<ffffffff810e325e>] ? sub_preempt_count+0xae/0xf0
      [ 8921.803501]  [<ffffffff8111842c>] ? __lock_acquire+0x42c/0x4b0
      [ 8921.805505]  [<ffffffff821ab760>] ? __sock_recv_ts_and_drops+0x140/0x140
      [ 8921.807860]  [<ffffffff811e07cc>] do_sync_readv_writev+0xbc/0x110
      [ 8921.809986]  [<ffffffff811958e7>] ? might_fault+0x97/0xa0
      [ 8921.811998]  [<ffffffff817bd99e>] ? security_file_permission+0x1e/0x90
      [ 8921.814595]  [<ffffffff811e17e2>] do_readv_writev+0xe2/0x1e0
      [ 8921.816702]  [<ffffffff810b8dac>] ? do_setitimer+0x1ac/0x200
      [ 8921.818819]  [<ffffffff810e2ec1>] ? get_parent_ip+0x11/0x50
      [ 8921.820863]  [<ffffffff810e325e>] ? sub_preempt_count+0xae/0xf0
      [ 8921.823318]  [<ffffffff811e1926>] vfs_writev+0x46/0x60
      [ 8921.825219]  [<ffffffff811e1a3f>] sys_writev+0x4f/0xb0
      [ 8921.827127]  [<ffffffff82658039>] system_call_fastpath+0x16/0x1b
      [ 8921.829384] ---[ end trace dffe390f30db9eb7 ]---
      Signed-off-by: default avatarSasha Levin <levinsasha928@gmail.com>
      Acked-by: default avatarRémi Denis-Courmont <remi.denis-courmont@nokia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bcf1b70a
    • Eric Dumazet's avatar
      tcp: tcp_sendpages() should call tcp_push() once · 35f9c09f
      Eric Dumazet authored
      commit 2f533844 (tcp: allow splice() to build full TSO packets) added
      a regression for splice() calls using SPLICE_F_MORE.
      
      We need to call tcp_flush() at the end of the last page processed in
      tcp_sendpages(), or else transmits can be deferred and future sends
      stall.
      
      Add a new internal flag, MSG_SENDPAGE_NOTLAST, acting like MSG_MORE, but
      with different semantic.
      
      For all sendpage() providers, its a transparent change. Only
      sock_sendpage() and tcp_sendpages() can differentiate the two different
      flags provided by pipe_to_sendpage()
      Reported-by: default avatarTom Herbert <therbert@google.com>
      Cc: Nandita Dukkipati <nanditad@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: H.K. Jerry Chu <hkchu@google.com>
      Cc: Maciej Żenczykowski <maze@google.com>
      Cc: Mahesh Bandewar <maheshb@google.com>
      Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail&gt;com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      35f9c09f
    • RongQing.Li's avatar
      ipv6: fix array index in ip6_mc_add_src() · 78d50217
      RongQing.Li authored
      Convert array index from the loop bound to the loop index.
      
      And remove the void type conversion to ip6_mc_del1_src() return
      code, seem it is unnecessary, since ip6_mc_del1_src() does not
      use __must_check similar attribute, no compiler will report the
      warning when it is removed.
      
      v2: enrich the commit header
      Signed-off-by: default avatarRongQing.Li <roy.qing.li@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      78d50217
    • Thadeu Lima de Souza Cascardo's avatar
      mlx4: allocate just enough pages instead of always 4 pages · 117980c4
      Thadeu Lima de Souza Cascardo authored
      The driver uses a 2-order allocation, which is too much on architectures
      like ppc64, which has a 64KiB page. This particular allocation is used
      for large packet fragments that may have a size of 512, 1024, 4096 or
      fill the whole allocation. So, a minimum size of 16384 is good enough
      and will be the same size that is used in architectures of 4KiB sized
      pages.
      
      This will avoid allocation failures that we see when the system is under
      stress, but still has plenty of memory, like the one below.
      
      This will also allow us to set the interface MTU to higher values like
      9000, which was not possible on ppc64 without this patch.
      
      Node 1 DMA: 737*64kB 37*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 51904kB
      83137 total pagecache pages
      0 pages in swap cache
      Swap cache stats: add 0, delete 0, find 0/0
      Free swap  = 10420096kB
      Total swap = 10420096kB
      107776 pages RAM
      1184 pages reserved
      147343 pages shared
      28152 pages non-shared
      netstat: page allocation failure. order:2, mode:0x4020
      Call Trace:
      [c0000001a4fa3770] [c000000000012f04] .show_stack+0x74/0x1c0 (unreliable)
      [c0000001a4fa3820] [c00000000016af38] .__alloc_pages_nodemask+0x618/0x930
      [c0000001a4fa39a0] [c0000000001a71a0] .alloc_pages_current+0xb0/0x170
      [c0000001a4fa3a40] [d00000000dcc3e00] .mlx4_en_alloc_frag+0x200/0x240 [mlx4_en]
      [c0000001a4fa3b10] [d00000000dcc3f8c] .mlx4_en_complete_rx_desc+0x14c/0x250 [mlx4_en]
      [c0000001a4fa3be0] [d00000000dcc4eec] .mlx4_en_process_rx_cq+0x62c/0x850 [mlx4_en]
      [c0000001a4fa3d20] [d00000000dcc5150] .mlx4_en_poll_rx_cq+0x40/0x90 [mlx4_en]
      [c0000001a4fa3dc0] [c0000000004e2bb8] .net_rx_action+0x178/0x450
      [c0000001a4fa3eb0] [c00000000009c9b8] .__do_softirq+0x118/0x290
      [c0000001a4fa3f90] [c000000000031df8] .call_do_softirq+0x14/0x24
      [c000000184c3b520] [c00000000000e700] .do_softirq+0xf0/0x110
      [c000000184c3b5c0] [c00000000009c6d4] .irq_exit+0xb4/0xc0
      [c000000184c3b640] [c00000000000e964] .do_IRQ+0x144/0x230
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
      Signed-off-by: default avatarKleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
      Tested-by: default avatarKleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      117980c4
  7. 04 Apr, 2012 15 commits
  8. 03 Apr, 2012 3 commits