1. 28 Mar, 2015 27 commits
  2. 24 Mar, 2015 13 commits
    • Eric Dumazet's avatar
      tcp: make connect() mem charging friendly · e8f117f0
      Eric Dumazet authored
      [ Upstream commit 355a901e ]
      
      While working on sk_forward_alloc problems reported by Denys
      Fedoryshchenko, we found that tcp connect() (and fastopen) do not call
      sk_wmem_schedule() for SYN packet (and/or SYN/DATA packet), so
      sk_forward_alloc is negative while connect is in progress.
      
      We can fix this by calling regular sk_stream_alloc_skb() both for the
      SYN packet (in tcp_connect()) and the syn_data packet in
      tcp_send_syn_data()
      
      Then, tcp_send_syn_data() can avoid copying syn_data as we simply
      can manipulate syn_data->cb[] to remove SYN flag (and increment seq)
      
      Instead of open coding memcpy_fromiovecend(), simply use this helper.
      
      This leaves in socket write queue clean fast clone skbs.
      
      This was tested against our fastopen packetdrill tests.
      Reported-by: default avatarDenys Fedoryshchenko <nuclearcat@nuclearcat.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      e8f117f0
    • Catalin Marinas's avatar
      net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour · 34ca18c8
      Catalin Marinas authored
      [ Upstream commit 91edd096 ]
      
      Commit db31c55a (net: clamp ->msg_namelen instead of returning an
      error) introduced the clamping of msg_namelen when the unsigned value
      was larger than sizeof(struct sockaddr_storage). This caused a
      msg_namelen of -1 to be valid. The native code was subsequently fixed by
      commit dbb490b9 (net: socket: error on a negative msg_namelen).
      
      In addition, the native code sets msg_namelen to 0 when msg_name is
      NULL. This was done in commit (6a2a2b3a net:socket: set msg_namelen
      to 0 if msg_name is passed as NULL in msghdr struct from userland) and
      subsequently updated by 08adb7da (fold verify_iovec() into
      copy_msghdr_from_user()).
      
      This patch brings the get_compat_msghdr() in line with
      copy_msghdr_from_user().
      
      Fixes: db31c55a (net: clamp ->msg_namelen instead of returning an error)
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      34ca18c8
    • Josh Hunt's avatar
      tcp: fix tcp fin memory accounting · b182ecc8
      Josh Hunt authored
      [ Upstream commit d22e1537 ]
      
      tcp_send_fin() does not account for the memory it allocates properly, so
      sk_forward_alloc can be negative in cases where we've sent a FIN:
      
      ss example output (ss -amn | grep -B1 f4294):
      tcp    FIN-WAIT-1 0      1            192.168.0.1:45520         192.0.2.1:8080
      	skmem:(r0,rb87380,t0,tb87380,f4294966016,w1280,o0,bl0)
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      b182ecc8
    • Steven Barth's avatar
      ipv6: fix backtracking for throw routes · 7f249ac5
      Steven Barth authored
      [ Upstream commit 73ba57bf ]
      
      for throw routes to trigger evaluation of other policy rules
      EAGAIN needs to be propagated up to fib_rules_lookup
      similar to how its done for IPv4
      
      A simple testcase for verification is:
      
      ip -6 rule add lookup 33333 priority 33333
      ip -6 route add throw 2001:db8::1
      ip -6 route add 2001:db8::1 via fe80::1 dev wlan0 table 33333
      ip route get 2001:db8::1
      Signed-off-by: default avatarSteven Barth <cyrus@openwrt.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      7f249ac5
    • Ondrej Zary's avatar
      Revert "net: cx82310_eth: use common match macro" · 5039b8cf
      Ondrej Zary authored
      [ Upstream commit 8d006e01 ]
      
      This reverts commit 11ad714b because
      it breaks cx82310_eth.
      
      The custom USB_DEVICE_CLASS macro matches
      bDeviceClass, bDeviceSubClass and bDeviceProtocol
      but the common USB_DEVICE_AND_INTERFACE_INFO matches
      bInterfaceClass, bInterfaceSubClass and bInterfaceProtocol instead, which are
      not specified.
      Signed-off-by: default avatarOndrej Zary <linux@rainbow-software.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      5039b8cf
    • Al Viro's avatar
      rxrpc: bogus MSG_PEEK test in rxrpc_recvmsg() · 3d1acc9e
      Al Viro authored
      [ Upstream commit 7d985ed1 ]
      
      [I would really like an ACK on that one from dhowells; it appears to be
      quite straightforward, but...]
      
      MSG_PEEK isn't passed to ->recvmsg() via msg->msg_flags; as the matter of
      fact, neither the kernel users of rxrpc, nor the syscalls ever set that bit
      in there.  It gets passed via flags; in fact, another such check in the same
      function is done correctly - as flags & MSG_PEEK.
      
      It had been that way (effectively disabled) for 8 years, though, so the patch
      needs beating up - that case had never been tested.  If it is correct, it's
      -stable fodder.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      3d1acc9e
    • Al Viro's avatar
      caif: fix MSG_OOB test in caif_seqpkt_recvmsg() · 02bfe56e
      Al Viro authored
      [ Upstream commit 3eeff778 ]
      
      It should be checking flags, not msg->msg_flags.  It's ->sendmsg()
      instances that need to look for that in ->msg_flags, ->recvmsg() ones
      (including the other ->recvmsg() instance in that file, as well as
      unix_dgram_recvmsg() this one claims to be imitating) check in flags.
      Braino had been introduced in commit dcda13 ("caif: Bugfix - use MSG_TRUNC
      in receive") back in 2010, so it goes quite a while back.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      02bfe56e
    • Eric Dumazet's avatar
      inet_diag: fix possible overflow in inet_diag_dump_one_icsk() · e1f2092a
      Eric Dumazet authored
      [ Upstream commit c8e2c80d ]
      
      inet_diag_dump_one_icsk() allocates too small skb.
      
      Add inet_sk_attr_size() helper right before inet_sk_diag_fill()
      so that it can be updated if/when new attributes are added.
      
      iproute2/ss currently does not use this dump_one() interface,
      this might explain nobody noticed this problem yet.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      e1f2092a
    • Jason Wang's avatar
      virtio-net: correctly delete napi hash · b9befa43
      Jason Wang authored
      [ Upstream commit ab3971b1 ]
      
      We don't delete napi from hash list during module exit. This will
      cause the following panic when doing module load and unload:
      
      BUG: unable to handle kernel paging request at 0000004e00000075
      IP: [<ffffffff816bd01b>] napi_hash_add+0x6b/0xf0
      PGD 3c5d5067 PUD 0
      Oops: 0000 [#1] SMP
      ...
      Call Trace:
      [<ffffffffa0a5bfb7>] init_vqs+0x107/0x490 [virtio_net]
      [<ffffffffa0a5c9f2>] virtnet_probe+0x562/0x791815639d880be [virtio_net]
      [<ffffffff8139e667>] virtio_dev_probe+0x137/0x200
      [<ffffffff814c7f2a>] driver_probe_device+0x7a/0x250
      [<ffffffff814c81d3>] __driver_attach+0x93/0xa0
      [<ffffffff814c8140>] ? __device_attach+0x40/0x40
      [<ffffffff814c6053>] bus_for_each_dev+0x63/0xa0
      [<ffffffff814c7a79>] driver_attach+0x19/0x20
      [<ffffffff814c76f0>] bus_add_driver+0x170/0x220
      [<ffffffffa0a60000>] ? 0xffffffffa0a60000
      [<ffffffff814c894f>] driver_register+0x5f/0xf0
      [<ffffffff8139e41b>] register_virtio_driver+0x1b/0x30
      [<ffffffffa0a60010>] virtio_net_driver_init+0x10/0x12 [virtio_net]
      
      This patch fixes this by doing this in virtnet_free_queues(). And also
      don't delete napi in virtnet_freeze() since it will call
      virtnet_free_queues() which has already did this.
      
      Fixes 91815639 ("virtio-net: rx busy polling support")
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Reviewed-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      b9befa43
    • Arnd Bergmann's avatar
      rds: avoid potential stack overflow · 6ba8661b
      Arnd Bergmann authored
      [ Upstream commit f862e07c ]
      
      The rds_iw_update_cm_id function stores a large 'struct rds_sock' object
      on the stack in order to pass a pair of addresses. This happens to just
      fit withint the 1024 byte stack size warning limit on x86, but just
      exceed that limit on ARM, which gives us this warning:
      
      net/rds/iw_rdma.c:200:1: warning: the frame size of 1056 bytes is larger than 1024 bytes [-Wframe-larger-than=]
      
      As the use of this large variable is basically bogus, we can rearrange
      the code to not do that. Instead of passing an rds socket into
      rds_iw_get_device, we now just pass the two addresses that we have
      available in rds_iw_update_cm_id, and we change rds_iw_get_mr accordingly,
      to create two address structures on the stack there.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      6ba8661b
    • Alexey Kodanev's avatar
      net: sysctl_net_core: check SNDBUF and RCVBUF for min length · c48cf4f2
      Alexey Kodanev authored
      [ Upstream commit b1cb59cf ]
      
      sysctl has sysctl.net.core.rmem_*/wmem_* parameters which can be
      set to incorrect values. Given that 'struct sk_buff' allocates from
      rcvbuf, incorrectly set buffer length could result to memory
      allocation failures. For example, set them as follows:
      
          # sysctl net.core.rmem_default=64
            net.core.wmem_default = 64
          # sysctl net.core.wmem_default=64
            net.core.wmem_default = 64
          # ping localhost -s 1024 -i 0 > /dev/null
      
      This could result to the following failure:
      
      skbuff: skb_over_panic: text:ffffffff81628db4 len:-32 put:-32
      head:ffff88003a1cc200 data:ffff88003a1cc200 tail:0xffffffe0 end:0xc0 dev:<NULL>
      kernel BUG at net/core/skbuff.c:102!
      invalid opcode: 0000 [#1] SMP
      ...
      task: ffff88003b7f5550 ti: ffff88003ae88000 task.ti: ffff88003ae88000
      RIP: 0010:[<ffffffff8155fbd1>]  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
      RSP: 0018:ffff88003ae8bc68  EFLAGS: 00010296
      RAX: 000000000000008d RBX: 00000000ffffffe0 RCX: 0000000000000000
      RDX: ffff88003fdcf598 RSI: ffff88003fdcd9c8 RDI: ffff88003fdcd9c8
      RBP: ffff88003ae8bc88 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000001 R11: 00000000000002b2 R12: 0000000000000000
      R13: 0000000000000000 R14: ffff88003d3f7300 R15: ffff88000012a900
      FS:  00007fa0e2b4a840(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000d0f7e0 CR3: 000000003b8fb000 CR4: 00000000000006f0
      Stack:
       ffff88003a1cc200 00000000ffffffe0 00000000000000c0 ffffffff818cab1d
       ffff88003ae8bd68 ffffffff81628db4 ffff88003ae8bd48 ffff88003b7f5550
       ffff880031a09408 ffff88003b7f5550 ffff88000012aa48 ffff88000012ab00
      Call Trace:
       [<ffffffff81628db4>] unix_stream_sendmsg+0x2c4/0x470
       [<ffffffff81556f56>] sock_write_iter+0x146/0x160
       [<ffffffff811d9612>] new_sync_write+0x92/0xd0
       [<ffffffff811d9cd6>] vfs_write+0xd6/0x180
       [<ffffffff811da499>] SyS_write+0x59/0xd0
       [<ffffffff81651532>] system_call_fastpath+0x12/0x17
      Code: 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00
            00 00 48 c7 c7 30 db 91 81 48 89 04 24 31 c0 e8 4f a8 0e 00 <0f> 0b
            eb fe 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83
      RIP  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
      RSP <ffff88003ae8bc68>
      Kernel panic - not syncing: Fatal exception
      
      Moreover, the possible minimum is 1, so we can get another kernel panic:
      ...
      BUG: unable to handle kernel paging request at ffff88013caee5c0
      IP: [<ffffffff815604cf>] __alloc_skb+0x12f/0x1f0
      ...
      Signed-off-by: default avatarAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      c48cf4f2
    • Nimrod Andy's avatar
      net: fec: fix receive VLAN CTAG HW acceleration issue · 563d9192
      Nimrod Andy authored
      [ Upstream commit af5cbc98 ]
      
      The current driver support receive VLAN CTAG HW acceleration feature
      (NETIF_F_HW_VLAN_CTAG_RX) through software simulation. There calls the
      api .skb_copy_to_linear_data_offset() to skip the VLAN tag, but there
      have overlap between the two memory data point range. The patch just fix
      the issue.
      
      V2:
      Michael Grzeschik suggest to use memmove() instead of skb_copy_to_linear_data_offset().
      Reported-by: default avatarMichael Grzeschik <m.grzeschik@pengutronix.de>
      Fixes: 1b7bde6d ("net: fec: implement rx_copybreak to improve rx performance")
      Signed-off-by: default avatarFugang Duan <B38611@freescale.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      563d9192
    • WANG Cong's avatar
      net_sched: fix struct tc_u_hnode layout in u32 · 06e0dd73
      WANG Cong authored
      [ Upstream commit 5778d39d ]
      
      We dynamically allocate divisor+1 entries for ->ht[] in tc_u_hnode:
      
        ht = kzalloc(sizeof(*ht) + divisor*sizeof(void *), GFP_KERNEL);
      
      So ->ht is supposed to be the last field of this struct, however
      this is broken, since an rcu head is appended after it.
      
      Fixes: 1ce87720 ("net: sched: make cls_u32 lockless")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      06e0dd73