1. 04 Nov, 2013 22 commits
    • Daniel Borkmann's avatar
      net: unix: inherit SOCK_PASS{CRED, SEC} flags from socket to fix race · 4dde1cb0
      Daniel Borkmann authored
      [ Upstream commit 90c6bd34 ]
      
      In the case of credentials passing in unix stream sockets (dgram
      sockets seem not affected), we get a rather sparse race after
      commit 16e57262 ("af_unix: dont send SCM_CREDENTIALS by default").
      
      We have a stream server on receiver side that requests credential
      passing from senders (e.g. nc -U). Since we need to set SO_PASSCRED
      on each spawned/accepted socket on server side to 1 first (as it's
      not inherited), it can happen that in the time between accept() and
      setsockopt() we get interrupted, the sender is being scheduled and
      continues with passing data to our receiver. At that time SO_PASSCRED
      is neither set on sender nor receiver side, hence in cmsg's
      SCM_CREDENTIALS we get eventually pid:0, uid:65534, gid:65534
      (== overflow{u,g}id) instead of what we actually would like to see.
      
      On the sender side, here nc -U, the tests in maybe_add_creds()
      invoked through unix_stream_sendmsg() would fail, as at that exact
      time, as mentioned, the sender has neither SO_PASSCRED on his side
      nor sees it on the server side, and we have a valid 'other' socket
      in place. Thus, sender believes it would just look like a normal
      connection, not needing/requesting SO_PASSCRED at that time.
      
      As reverting 16e57262 would not be an option due to the significant
      performance regression reported when having creds always passed,
      one way/trade-off to prevent that would be to set SO_PASSCRED on
      the listener socket and allow inheriting these flags to the spawned
      socket on server side in accept(). It seems also logical to do so
      if we'd tell the listener socket to pass those flags onwards, and
      would fix the race.
      
      Before, strace:
      
      recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}],
              msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET,
              cmsg_type=SCM_CREDENTIALS{pid=0, uid=65534, gid=65534}},
              msg_flags=0}, 0) = 5
      
      After, strace:
      
      recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}],
              msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET,
              cmsg_type=SCM_CREDENTIALS{pid=11580, uid=1000, gid=1000}},
              msg_flags=0}, 0) = 5
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4dde1cb0
    • Salva Peiró's avatar
      wanxl: fix info leak in ioctl · 79379b96
      Salva Peiró authored
      [ Upstream commit 2b13d06c ]
      
      The wanxl_ioctl() code fails to initialize the two padding bytes of
      struct sync_serial_settings after the ->loopback member. Add an explicit
      memset(0) before filling the structure to avoid the info leak.
      Signed-off-by: default avatarSalva Peiró <speiro@ai2.upv.es>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      79379b96
    • Vlad Yasevich's avatar
      sctp: Perform software checksum if packet has to be fragmented. · f3ce845c
      Vlad Yasevich authored
      [ Upstream commit d2dbbba7 ]
      
      IP/IPv6 fragmentation knows how to compute only TCP/UDP checksum.
      This causes problems if SCTP packets has to be fragmented and
      ipsummed has been set to PARTIAL due to checksum offload support.
      This condition can happen when retransmitting after MTU discover,
      or when INIT or other control chunks are larger then MTU.
      Check for the rare fragmentation condition in SCTP and use software
      checksum calculation in this case.
      
      CC: Fan Du <fan.du@windriver.com>
      Signed-off-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f3ce845c
    • Fan Du's avatar
      sctp: Use software crc32 checksum when xfrm transform will happen. · 80d20e7b
      Fan Du authored
      [ Upstream commit 27127a82 ]
      
      igb/ixgbe have hardware sctp checksum support, when this feature is enabled
      and also IPsec is armed to protect sctp traffic, ugly things happened as
      xfrm_output checks CHECKSUM_PARTIAL to do checksum operation(sum every thing
      up and pack the 16bits result in the checksum field). The result is fail
      establishment of sctp communication.
      Signed-off-by: default avatarFan Du <fan.du@windriver.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      80d20e7b
    • Vlad Yasevich's avatar
      net: dst: provide accessor function to dst->xfrm · 3e5d72cd
      Vlad Yasevich authored
      [ Upstream commit e87b3998 ]
      
      dst->xfrm is conditionally defined.  Provide accessor funtion that
      is always available.
      Signed-off-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3e5d72cd
    • Eric Dumazet's avatar
      bnx2x: record rx queue for LRO packets · 225be57b
      Eric Dumazet authored
      [ Upstream commit 60e66fee ]
      
      RPS support is kind of broken on bnx2x, because only non LRO packets
      get proper rx queue information. This triggers reorders, as it seems
      bnx2x like to generate a non LRO packet for segment including TCP PUSH
      flag : (this might be pure coincidence, but all the reorders I've
      seen involve segments with a PUSH)
      
      11:13:34.335847 IP A > B: . 415808:447136(31328) ack 1 win 457 <nop,nop,timestamp 3789336 3985797>
      11:13:34.335992 IP A > B: . 447136:448560(1424) ack 1 win 457 <nop,nop,timestamp 3789336 3985797>
      11:13:34.336391 IP A > B: . 448560:479888(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985797>
      11:13:34.336425 IP A > B: P 511216:512640(1424) ack 1 win 457 <nop,nop,timestamp 3789337 3985798>
      11:13:34.336423 IP A > B: . 479888:511216(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798>
      11:13:34.336924 IP A > B: . 512640:543968(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798>
      11:13:34.336963 IP A > B: . 543968:575296(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798>
      
      We must call skb_record_rx_queue() to properly give to RPS (and more
      generally for TX queue selection on forward path) the receive queue
      information.
      
      Similar fix is needed for skb_mark_napi_id(), but will be handled
      in a separate patch to ease stable backports.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Eilon Greenstein <eilong@broadcom.com>
      Acked-by: default avatarDmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      225be57b
    • Mathias Krause's avatar
      connector: use nlmsg_len() to check message length · c6077c90
      Mathias Krause authored
      [ Upstream commit 162b2bed ]
      
      The current code tests the length of the whole netlink message to be
      at least as long to fit a cn_msg. This is wrong as nlmsg_len includes
      the length of the netlink message header. Use nlmsg_len() instead to
      fix this "off-by-NLMSG_HDRLEN" size check.
      Signed-off-by: default avatarMathias Krause <minipli@googlemail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c6077c90
    • Mathias Krause's avatar
      unix_diag: fix info leak · f4358dfd
      Mathias Krause authored
      [ Upstream commit 6865d1e8 ]
      
      When filling the netlink message we miss to wipe the pad field,
      therefore leak one byte of heap memory to userland. Fix this by
      setting pad to 0.
      Signed-off-by: default avatarMathias Krause <minipli@googlemail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f4358dfd
    • Salva Peiró's avatar
      farsync: fix info leak in ioctl · df290b8e
      Salva Peiró authored
      [ Upstream commit 96b34040 ]
      
      The fst_get_iface() code fails to initialize the two padding bytes of
      struct sync_serial_settings after the ->loopback member. Add an explicit
      memset(0) before filling the structure to avoid the info leak.
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      df290b8e
    • Eric Dumazet's avatar
      l2tp: must disable bh before calling l2tp_xmit_skb() · 120bc4f8
      Eric Dumazet authored
      [ Upstream commit 455cc32b ]
      
      François Cachereul made a very nice bug report and suspected
      the bh_lock_sock() / bh_unlok_sock() pair used in l2tp_xmit_skb() from
      process context was not good.
      
      This problem was added by commit 6af88da1
      ("l2tp: Fix locking in l2tp_core.c").
      
      l2tp_eth_dev_xmit() runs from BH context, so we must disable BH
      from other l2tp_xmit_skb() users.
      
      [  452.060011] BUG: soft lockup - CPU#1 stuck for 23s! [accel-pppd:6662]
      [  452.061757] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox
      ppp_generic slhc ipv6 ext3 mbcache jbd virtio_balloon xfs exportfs dm_mod
      virtio_blk ata_generic virtio_net floppy ata_piix libata virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan]
      [  452.064012] CPU 1
      [  452.080015] BUG: soft lockup - CPU#2 stuck for 23s! [accel-pppd:6643]
      [  452.080015] CPU 2
      [  452.080015]
      [  452.080015] Pid: 6643, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs
      [  452.080015] RIP: 0010:[<ffffffff81059f6c>]  [<ffffffff81059f6c>] do_raw_spin_lock+0x17/0x1f
      [  452.080015] RSP: 0018:ffff88007125fc18  EFLAGS: 00000293
      [  452.080015] RAX: 000000000000aba9 RBX: ffffffff811d0703 RCX: 0000000000000000
      [  452.080015] RDX: 00000000000000ab RSI: ffff8800711f6896 RDI: ffff8800745c8110
      [  452.080015] RBP: ffff88007125fc18 R08: 0000000000000020 R09: 0000000000000000
      [  452.080015] R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000286
      [  452.080015] R13: 0000000000000020 R14: 0000000000000240 R15: 0000000000000000
      [  452.080015] FS:  00007fdc0cc24700(0000) GS:ffff8800b6f00000(0000) knlGS:0000000000000000
      [  452.080015] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  452.080015] CR2: 00007fdb054899b8 CR3: 0000000074404000 CR4: 00000000000006a0
      [  452.080015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  452.080015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  452.080015] Process accel-pppd (pid: 6643, threadinfo ffff88007125e000, task ffff8800b27e6dd0)
      [  452.080015] Stack:
      [  452.080015]  ffff88007125fc28 ffffffff81256559 ffff88007125fc98 ffffffffa01b2bd1
      [  452.080015]  ffff88007125fc58 000000000000000c 00000000029490d0 0000009c71dbe25e
      [  452.080015]  000000000000005c 000000080000000e 0000000000000000 ffff880071170600
      [  452.080015] Call Trace:
      [  452.080015]  [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
      [  452.080015]  [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core]
      [  452.080015]  [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
      [  452.080015]  [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
      [  452.080015]  [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
      [  452.080015]  [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
      [  452.080015]  [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
      [  452.080015]  [<ffffffff810bbd21>] ? fget_light+0x75/0x89
      [  452.080015]  [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
      [  452.080015]  [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
      [  452.080015]  [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
      [  452.080015] Code: 81 48 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 <8a> 07 eb f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3
      [  452.080015] Call Trace:
      [  452.080015]  [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
      [  452.080015]  [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core]
      [  452.080015]  [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
      [  452.080015]  [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
      [  452.080015]  [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
      [  452.080015]  [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
      [  452.080015]  [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
      [  452.080015]  [<ffffffff810bbd21>] ? fget_light+0x75/0x89
      [  452.080015]  [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
      [  452.080015]  [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
      [  452.080015]  [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
      [  452.064012]
      [  452.064012] Pid: 6662, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs
      [  452.064012] RIP: 0010:[<ffffffff81059f6e>]  [<ffffffff81059f6e>] do_raw_spin_lock+0x19/0x1f
      [  452.064012] RSP: 0018:ffff8800b6e83ba0  EFLAGS: 00000297
      [  452.064012] RAX: 000000000000aaa9 RBX: ffff8800b6e83b40 RCX: 0000000000000002
      [  452.064012] RDX: 00000000000000aa RSI: 000000000000000a RDI: ffff8800745c8110
      [  452.064012] RBP: ffff8800b6e83ba0 R08: 000000000000c802 R09: 000000000000001c
      [  452.064012] R10: ffff880071096c4e R11: 0000000000000006 R12: ffff8800b6e83b18
      [  452.064012] R13: ffffffff8125d51e R14: ffff8800b6e83ba0 R15: ffff880072a589c0
      [  452.064012] FS:  00007fdc0b81e700(0000) GS:ffff8800b6e80000(0000) knlGS:0000000000000000
      [  452.064012] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  452.064012] CR2: 0000000000625208 CR3: 0000000074404000 CR4: 00000000000006a0
      [  452.064012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  452.064012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  452.064012] Process accel-pppd (pid: 6662, threadinfo ffff88007129a000, task ffff8800744f7410)
      [  452.064012] Stack:
      [  452.064012]  ffff8800b6e83bb0 ffffffff81256559 ffff8800b6e83bc0 ffffffff8121c64a
      [  452.064012]  ffff8800b6e83bf0 ffffffff8121ec7a ffff880072a589c0 ffff880071096c62
      [  452.064012]  0000000000000011 ffffffff81430024 ffff8800b6e83c80 ffffffff8121f276
      [  452.064012] Call Trace:
      [  452.064012]  <IRQ>
      [  452.064012]  [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
      [  452.064012]  [<ffffffff8121c64a>] spin_lock+0x9/0xb
      [  452.064012]  [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269
      [  452.064012]  [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae
      [  452.064012]  [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0
      [  452.064012]  [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c
      [  452.064012]  [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5
      [  452.064012]  [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84
      [  452.064012]  [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3
      [  452.064012]  [<ffffffff811fe78f>] ip_rcv+0x210/0x269
      [  452.064012]  [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb
      [  452.064012]  [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7
      [  452.064012]  [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e
      [  452.064012]  [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b
      [  452.064012]  [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net]
      [  452.064012]  [<ffffffff811d9417>] net_rx_action+0x73/0x184
      [  452.064012]  [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
      [  452.064012]  [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8
      [  452.064012]  [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12
      [  452.064012]  [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10
      [  452.064012]  [<ffffffff8125e0ac>] call_softirq+0x1c/0x26
      [  452.064012]  [<ffffffff81003587>] do_softirq+0x45/0x82
      [  452.064012]  [<ffffffff81034667>] irq_exit+0x42/0x9c
      [  452.064012]  [<ffffffff8125e146>] do_IRQ+0x8e/0xa5
      [  452.064012]  [<ffffffff8125676e>] common_interrupt+0x6e/0x6e
      [  452.064012]  <EOI>
      [  452.064012]  [<ffffffff810b82a1>] ? kfree+0x8a/0xa3
      [  452.064012]  [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
      [  452.064012]  [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core]
      [  452.064012]  [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
      [  452.064012]  [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
      [  452.064012]  [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
      [  452.064012]  [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
      [  452.064012]  [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
      [  452.064012]  [<ffffffff810bbd21>] ? fget_light+0x75/0x89
      [  452.064012]  [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
      [  452.064012]  [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
      [  452.064012]  [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
      [  452.064012] Code: 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 8a 07 <eb> f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 55 48
      [  452.064012] Call Trace:
      [  452.064012]  <IRQ>  [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
      [  452.064012]  [<ffffffff8121c64a>] spin_lock+0x9/0xb
      [  452.064012]  [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269
      [  452.064012]  [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae
      [  452.064012]  [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0
      [  452.064012]  [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c
      [  452.064012]  [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5
      [  452.064012]  [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84
      [  452.064012]  [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3
      [  452.064012]  [<ffffffff811fe78f>] ip_rcv+0x210/0x269
      [  452.064012]  [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb
      [  452.064012]  [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7
      [  452.064012]  [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e
      [  452.064012]  [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b
      [  452.064012]  [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net]
      [  452.064012]  [<ffffffff811d9417>] net_rx_action+0x73/0x184
      [  452.064012]  [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
      [  452.064012]  [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8
      [  452.064012]  [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12
      [  452.064012]  [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10
      [  452.064012]  [<ffffffff8125e0ac>] call_softirq+0x1c/0x26
      [  452.064012]  [<ffffffff81003587>] do_softirq+0x45/0x82
      [  452.064012]  [<ffffffff81034667>] irq_exit+0x42/0x9c
      [  452.064012]  [<ffffffff8125e146>] do_IRQ+0x8e/0xa5
      [  452.064012]  [<ffffffff8125676e>] common_interrupt+0x6e/0x6e
      [  452.064012]  <EOI>  [<ffffffff810b82a1>] ? kfree+0x8a/0xa3
      [  452.064012]  [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
      [  452.064012]  [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core]
      [  452.064012]  [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
      [  452.064012]  [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
      [  452.064012]  [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
      [  452.064012]  [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
      [  452.064012]  [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
      [  452.064012]  [<ffffffff810bbd21>] ? fget_light+0x75/0x89
      [  452.064012]  [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
      [  452.064012]  [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
      [  452.064012]  [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
      Reported-by: default avatarFrançois Cachereul <f.cachereul@alphalink.fr>
      Tested-by: default avatarFrançois Cachereul <f.cachereul@alphalink.fr>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: James Chapman <jchapman@katalix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      120bc4f8
    • Marc Kleine-Budde's avatar
      net: vlan: fix nlmsg size calculation in vlan_get_size() · 5468ba55
      Marc Kleine-Budde authored
      [ Upstream commit c33a39c5 ]
      
      This patch fixes the calculation of the nlmsg size, by adding the missing
      nla_total_size().
      
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5468ba55
    • Vlad Yasevich's avatar
      bridge: Correctly clamp MAX forward_delay when enabling STP · 0a8a2e32
      Vlad Yasevich authored
      [ Upstream commit 4b6c7879 ]
      
      Commit be4f154d
      	bridge: Clamp forward_delay when enabling STP
      had a typo when attempting to clamp maximum forward delay.
      
      It is possible to set bridge_forward_delay to be higher then
      permitted maximum when STP is off.  When turning STP on, the
      higher then allowed delay has to be clamed down to max value.
      Signed-off-by: default avatarVlad Yasevich <vyasevic@redhat.com>
      CC: Herbert Xu <herbert@gondor.apana.org.au>
      CC: Stephen Hemminger <shemminger@vyatta.com>
      Reviewed-by: default avatarVeaceslav Falico <vfalico@redhat.com>
      Acked-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0a8a2e32
    • Marcelo Ricardo Leitner's avatar
      ipv6: restrict neighbor entry creation to output flow · 25c065e6
      Marcelo Ricardo Leitner authored
      This patch is based on 3.2.y branch, the one used by reporter. Please let me
      know if it should be different. Thanks.
      
      The patch which introduced the regression was applied on stables:
      3.0.64 3.4.31 3.7.8 3.2.39
      
      The patch which introduced the regression was for stable trees only.
      
      ---8<---
      
      Commit 0d6a7707 "ipv6: do not create
      neighbor entries for local delivery" introduced a regression on
      which routes to local delivery would not work anymore. Like this:
      
          $ ip -6 route add local 2001::/64 dev lo
          $ ping6 -c1 2001::9
          PING 2001::9(2001::9) 56 data bytes
          ping: sendmsg: Invalid argument
      
      As this is a local delivery, that commit would not allow the creation of a
      neighbor entry and thus the packet cannot be sent.
      
      But as TPROXY scenario actually needs to avoid the neighbor entry creation only
      for input flow, this patch now limits previous patch to input flow, keeping
      output as before that patch.
      Reported-by: default avatarDebabrata Banerjee <dbavatar@gmail.com>
      Signed-off-by: default avatarMarcelo Ricardo Leitner <mleitner@redhat.com>
      Signed-off-by: default avatarJiri Pirko <jiri@resnulli.us>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      25c065e6
    • Marc Kleine-Budde's avatar
      can: dev: fix nlmsg size calculation in can_get_size() · bbcb20aa
      Marc Kleine-Budde authored
      [ Upstream commit fe119a05 ]
      
      This patch fixes the calculation of the nlmsg size, by adding the missing
      nla_total_size().
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bbcb20aa
    • Jiri Benc's avatar
      ipv4: fix ineffective source address selection · ad61d4c7
      Jiri Benc authored
      [ Upstream commit 0a7e2260 ]
      
      When sending out multicast messages, the source address in inet->mc_addr is
      ignored and rewritten by an autoselected one. This is caused by a typo in
      commit 813b3b5d ("ipv4: Use caller's on-stack flowi as-is in output
      route lookups").
      Signed-off-by: default avatarJiri Benc <jbenc@redhat.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad61d4c7
    • Mathias Krause's avatar
      proc connector: fix info leaks · 23fd882b
      Mathias Krause authored
      [ Upstream commit e727ca82 ]
      
      Initialize event_data for all possible message types to prevent leaking
      kernel stack contents to userland (up to 20 bytes). Also set the flags
      member of the connector message to 0 to prevent leaking two more stack
      bytes this way.
      Signed-off-by: default avatarMathias Krause <minipli@googlemail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      23fd882b
    • Dan Carpenter's avatar
      net: heap overflow in __audit_sockaddr() · 5684fac3
      Dan Carpenter authored
      [ Upstream commit 1661bf36 ]
      
      We need to cap ->msg_namelen or it leads to a buffer overflow when we
      to the memcpy() in __audit_sockaddr().  It requires CAP_AUDIT_CONTROL to
      exploit this bug.
      
      The call tree is:
      ___sys_recvmsg()
        move_addr_to_user()
          audit_sockaddr()
            __audit_sockaddr()
      Reported-by: default avatarJüri Aedla <juri.aedla@gmail.com>
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5684fac3
    • Sebastian Hesselbarth's avatar
      net: mv643xx_eth: fix orphaned statistics timer crash · 7ee57de6
      Sebastian Hesselbarth authored
      [ Upstream commit f564412c ]
      
      The periodic statistics timer gets started at port _probe() time, but
      is stopped on _stop() only. In a modular environment, this can cause
      the timer to access already deallocated memory, if the module is unloaded
      without starting the eth device. To fix this, we add the timer right
      before the port is started, instead of at _probe() time.
      Signed-off-by: default avatarSebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
      Acked-by: default avatarJason Cooper <jason@lakedaemon.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7ee57de6
    • Sebastian Hesselbarth's avatar
      net: mv643xx_eth: update statistics timer from timer context only · 75120c13
      Sebastian Hesselbarth authored
      [ Upstream commit 041b4ddb ]
      
      Each port driver installs a periodic timer to update port statistics
      by calling mib_counters_update. As mib_counters_update is also called
      from non-timer context, we should not reschedule the timer there but
      rather move it to timer-only context.
      Signed-off-by: default avatarSebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
      Acked-by: default avatarJason Cooper <jason@lakedaemon.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      75120c13
    • Eric Dumazet's avatar
      net: do not call sock_put() on TIMEWAIT sockets · 791673fb
      Eric Dumazet authored
      [ Upstream commit 80ad1d61 ]
      
      commit 3ab5aee7 ("net: Convert TCP & DCCP hash tables to use RCU /
      hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets.
      
      We should instead use inet_twsk_put()
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      791673fb
    • Eric Dumazet's avatar
      tcp: do not forget FIN in tcp_shifted_skb() · d1e668e7
      Eric Dumazet authored
      [ Upstream commit 5e8a402f ]
      
      Yuchung found following problem :
      
       There are bugs in the SACK processing code, merging part in
       tcp_shift_skb_data(), that incorrectly resets or ignores the sacked
       skbs FIN flag. When a receiver first SACK the FIN sequence, and later
       throw away ofo queue (e.g., sack-reneging), the sender will stop
       retransmitting the FIN flag, and hangs forever.
      
      Following packetdrill test can be used to reproduce the bug.
      
      $ cat sack-merge-bug.pkt
      `sysctl -q net.ipv4.tcp_fack=0`
      
      // Establish a connection and send 10 MSS.
      0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      +.000 bind(3, ..., ...) = 0
      +.000 listen(3, 1) = 0
      
      +.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
      +.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
      +.001 < . 1:1(0) ack 1 win 1024
      +.000 accept(3, ..., ...) = 4
      
      +.100 write(4, ..., 12000) = 12000
      +.000 shutdown(4, SHUT_WR) = 0
      +.000 > . 1:10001(10000) ack 1
      +.050 < . 1:1(0) ack 2001 win 257
      +.000 > FP. 10001:12001(2000) ack 1
      +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop>
      +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop>
      // SACK reneg
      +.050 < . 1:1(0) ack 12001 win 257
      +0 %{ print "unacked: ",tcpi_unacked }%
      +5 %{ print "" }%
      
      First, a typo inverted left/right of one OR operation, then
      code forgot to advance end_seq if the merged skb carried FIN.
      
      Bug was added in 2.6.29 by commit 832d11c5
      ("tcp: Try to restore large SKBs while SACK processing")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Acked-by: default avatarIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d1e668e7
    • Eric Dumazet's avatar
      tcp: must unclone packets before mangling them · 11db1e4c
      Eric Dumazet authored
      [ Upstream commit c52e2421 ]
      
      TCP stack should make sure it owns skbs before mangling them.
      
      We had various crashes using bnx2x, and it turned out gso_size
      was cleared right before bnx2x driver was populating TC descriptor
      of the _previous_ packet send. TCP stack can sometime retransmit
      packets that are still in Qdisc.
      
      Of course we could make bnx2x driver more robust (using
      ACCESS_ONCE(shinfo->gso_size) for example), but the bug is TCP stack.
      
      We have identified two points where skb_unclone() was needed.
      
      This patch adds a WARN_ON_ONCE() to warn us if we missed another
      fix of this kind.
      
      Kudos to Neal for finding the root cause of this bug. Its visible
      using small MSS.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarNeal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      11db1e4c
  2. 22 Oct, 2013 12 commits
  3. 13 Oct, 2013 6 commits
    • Greg Kroah-Hartman's avatar
      Linux 3.4.66 · dc3a8b0c
      Greg Kroah-Hartman authored
      dc3a8b0c
    • Theodore Ts'o's avatar
      ext4: avoid hang when mounting non-journal filesystems with orphan list · 016a3592
      Theodore Ts'o authored
      commit 0e9a9a1a upstream.
      
      When trying to mount a file system which does not contain a journal,
      but which does have a orphan list containing an inode which needs to
      be truncated, the mount call with hang forever in
      ext4_orphan_cleanup() because ext4_orphan_del() will return
      immediately without removing the inode from the orphan list, leading
      to an uninterruptible loop in kernel code which will busy out one of
      the CPU's on the system.
      
      This can be trivially reproduced by trying to mount the file system
      found in tests/f_orphan_extents_inode/image.gz from the e2fsprogs
      source tree.  If a malicious user were to put this on a USB stick, and
      mount it on a Linux desktop which has automatic mounts enabled, this
      could be considered a potential denial of service attack.  (Not a big
      deal in practice, but professional paranoids worry about such things,
      and have even been known to allocate CVE numbers for such problems.)
      
      -js: This is a fix for CVE-2013-2015.
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: default avatarZheng Liu <wenqing.lz@taobao.com>
      Acked-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      016a3592
    • Josef Bacik's avatar
      Btrfs: change how we queue blocks for backref checking · 027a76bf
      Josef Bacik authored
      commit b6c60c80 upstream.
      
      Previously we only added blocks to the list to have their backrefs checked if
      the level of the block is right above the one we are searching for.  This is
      because we want to make sure we don't add the entire path up to the root to the
      lists to make sure we process things one at a time.  This assumes that if any
      blocks in the path to the root are going to be not checked (shared in other
      words) then they will be in the level right above the current block on up.  This
      isn't quite right though since we can have blocks higher up the list that are
      shared because they are attached to a reloc root.  But we won't add this block
      to be checked and then later on we will BUG_ON(!upper->checked).  So instead
      keep track of wether or not we've queued a block to be checked in this current
      search, and if we haven't go ahead and queue it to be checked.  This patch fixed
      the panic I was seeing where we BUG_ON(!upper->checked).  Thanks,
      Signed-off-by: default avatarJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: default avatarChris Mason <chris.mason@fusionio.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      027a76bf
    • Chris Metcalf's avatar
      tile: use a more conservative __my_cpu_offset in CONFIG_PREEMPT · 5df70853
      Chris Metcalf authored
      commit f862eefe upstream.
      
      It turns out the kernel relies on barrier() to force a reload of the
      percpu offset value.  Since we can't easily modify the definition of
      barrier() to include "tp" as an output register, we instead provide a
      definition of __my_cpu_offset as extended assembly that includes a fake
      stack read to hazard against barrier(), forcing gcc to know that it
      must reread "tp" and recompute anything based on "tp" after a barrier.
      
      This fixes observed hangs in the slub allocator when we are looping
      on a percpu cmpxchg_double.
      
      A similar fix for ARMv7 was made in June in change 509eb76e.
      Signed-off-by: default avatarChris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5df70853
    • Lv Zheng's avatar
      ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler() · b55ef2ed
      Lv Zheng authored
      commit 06a8566b upstream.
      
      This patch fixes the issues indicated by the test results that
      ipmi_msg_handler() is invoked in atomic context.
      
      BUG: scheduling while atomic: kipmi0/18933/0x10000100
      Modules linked in: ipmi_si acpi_ipmi ...
      CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G       AW    3.10.0-rc7+ #2
      Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010
       ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8
       ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4
       0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8
      Call Trace:
       <IRQ>  [<ffffffff814c4a1e>] dump_stack+0x19/0x1b
       [<ffffffff814bfbab>] __schedule_bug+0x46/0x54
       [<ffffffff814c73e0>] __schedule+0x83/0x59c
       [<ffffffff81058853>] __cond_resched+0x22/0x2d
       [<ffffffff814c794b>] _cond_resched+0x14/0x1d
       [<ffffffff814c6d82>] mutex_lock+0x11/0x32
       [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58
       [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si]
       [<ffffffff812bf6e4>] deliver_response+0x55/0x5a
       [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65
       [<ffffffff81007ad1>] ? read_tsc+0x9/0x19
       [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc
       [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si]
       ...
      
      Also Tony Camuso says:
      
       We were getting occasional "Scheduling while atomic" call traces
       during boot on some systems. Problem was first seen on a Cisco C210
       but we were able to reproduce it on a Cisco c220m3. Setting
       CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around
       tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device.
      
       =================================
       [ INFO: inconsistent lock state ]
       2.6.32-415.el6.x86_64-debug-splck #1
       ---------------------------------
       inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
       ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes:
        (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126
       {SOFTIRQ-ON-W} state was registered at:
         [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570
         [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120
         [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400
         [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60
         [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234
         [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be
      
      The fix implemented by this change has been tested by Tony:
      
       Tested the patch in a boot loop with lockdep debug enabled and never
       saw the problem in over 400 reboots.
      Reported-and-tested-by: default avatarTony Camuso <tcamuso@redhat.com>
      Signed-off-by: default avatarLv Zheng <lv.zheng@intel.com>
      Reviewed-by: default avatarHuang Ying <ying.huang@intel.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Jonghwan Choi <jhbird.choi@samsung.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b55ef2ed
    • David Rientjes's avatar
      mm, show_mem: suppress page counts in non-blockable contexts · 022a41db
      David Rientjes authored
      commit 4b59e6c4 upstream.
      
      On large systems with a lot of memory, walking all RAM to determine page
      types may take a half second or even more.
      
      In non-blockable contexts, the page allocator will emit a page allocation
      failure warning unless __GFP_NOWARN is specified.  In such contexts, irqs
      are typically disabled and such a lengthy delay may even result in NMI
      watchdog timeouts.
      
      To fix this, suppress the page walk in such contexts when printing the
      page allocation failure warning.
      Signed-off-by: default avatarDavid Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Acked-by: default avatarMichal Hocko <mhocko@suse.cz>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      022a41db