1. 26 Mar, 2015 12 commits
    • virtio-net: correctly delete napi hash · cdffd074
      Jason Wang authored
      [ Upstream commit ab3971b1 ]
      
      We don't delete the napi instances from the hash list during module
      exit. This causes the following panic when the module is loaded and
      unloaded:
      
      BUG: unable to handle kernel paging request at 0000004e00000075
      IP: [<ffffffff816bd01b>] napi_hash_add+0x6b/0xf0
      PGD 3c5d5067 PUD 0
      Oops: 0000 [#1] SMP
      ...
      Call Trace:
      [<ffffffffa0a5bfb7>] init_vqs+0x107/0x490 [virtio_net]
      [<ffffffffa0a5c9f2>] virtnet_probe+0x562/0x7be [virtio_net]
      [<ffffffff8139e667>] virtio_dev_probe+0x137/0x200
      [<ffffffff814c7f2a>] driver_probe_device+0x7a/0x250
      [<ffffffff814c81d3>] __driver_attach+0x93/0xa0
      [<ffffffff814c8140>] ? __device_attach+0x40/0x40
      [<ffffffff814c6053>] bus_for_each_dev+0x63/0xa0
      [<ffffffff814c7a79>] driver_attach+0x19/0x20
      [<ffffffff814c76f0>] bus_add_driver+0x170/0x220
      [<ffffffffa0a60000>] ? 0xffffffffa0a60000
      [<ffffffff814c894f>] driver_register+0x5f/0xf0
      [<ffffffff8139e41b>] register_virtio_driver+0x1b/0x30
      [<ffffffffa0a60010>] virtio_net_driver_init+0x10/0x12 [virtio_net]
      
      This patch fixes it by deleting the napi instances in
      virtnet_free_queues(). Also stop deleting napi in virtnet_freeze(),
      since it calls virtnet_free_queues(), which already does this.
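
      For illustration, a minimal sketch of the resulting helper, assuming
      the virtnet_info layout of that era (see the upstream commit for the
      exact hunk):

          static void virtnet_free_queues(struct virtnet_info *vi)
          {
                  int i;

                  /* Remove each queue's napi from the napi hash so a later
                   * module load can re-add it safely.
                   */
                  for (i = 0; i < vi->max_queue_pairs; i++)
                          napi_hash_del(&vi->rq[i].napi);

                  kfree(vi->rq);
                  kfree(vi->sq);
          }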
      
      Fixes: 91815639 ("virtio-net: rx busy polling support")
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • rds: avoid potential stack overflow · 3bde7bbe
      Arnd Bergmann authored
      [ Upstream commit f862e07c ]
      
      The rds_iw_update_cm_id function stores a large 'struct rds_sock' object
      on the stack in order to pass a pair of addresses. This happens to just
      fit within the 1024 byte stack size warning limit on x86, but just
      exceeds that limit on ARM, which gives us this warning:
      
      net/rds/iw_rdma.c:200:1: warning: the frame size of 1056 bytes is larger than 1024 bytes [-Wframe-larger-than=]
      
      As the use of this large variable is basically bogus, we can rearrange
      the code to not do that. Instead of passing an rds socket into
      rds_iw_get_device, we now just pass the two addresses that we have
      available in rds_iw_update_cm_id, and we change rds_iw_get_mr accordingly,
      to create two address structures on the stack there.
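
      A hedged sketch of the rearrangement (parameter names approximate;
      see the upstream commit for the exact signatures):

          /* Was: rds_iw_get_device(struct rds_sock *rs, ...), which forced
           * callers to build a whole rds_sock just to carry two addresses.
           */
          static int rds_iw_get_device(struct sockaddr_in *src,
                                       struct sockaddr_in *dst,
                                       struct rds_iw_device **rds_iwdev,
                                       struct ib_device **ibdev);

          /* rds_iw_update_cm_id already has both addresses at hand: */
          src_addr = (struct sockaddr_in *)&cm_id->route.addr.src_addr;
          dst_addr = (struct sockaddr_in *)&cm_id->route.addr.dst_addr;
          rc = rds_iw_get_device(src_addr, dst_addr, &rds_iwdev_old, &pcm_id);
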
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: sysctl_net_core: check SNDBUF and RCVBUF for min length · 1e7778e3
      Alexey Kodanev authored
      [ Upstream commit b1cb59cf ]
      
      sysctl has the net.core.rmem_*/wmem_* parameters, which can be set
      to incorrect values. Given that 'struct sk_buff' allocates from
      rcvbuf, an incorrectly set buffer length can result in memory
      allocation failures. For example, set them as follows:
      
          # sysctl net.core.rmem_default=64
            net.core.rmem_default = 64
          # sysctl net.core.wmem_default=64
            net.core.wmem_default = 64
          # ping localhost -s 1024 -i 0 > /dev/null
      
      This can result in the following failure:
      
      skbuff: skb_over_panic: text:ffffffff81628db4 len:-32 put:-32
      head:ffff88003a1cc200 data:ffff88003a1cc200 tail:0xffffffe0 end:0xc0 dev:<NULL>
      kernel BUG at net/core/skbuff.c:102!
      invalid opcode: 0000 [#1] SMP
      ...
      task: ffff88003b7f5550 ti: ffff88003ae88000 task.ti: ffff88003ae88000
      RIP: 0010:[<ffffffff8155fbd1>]  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
      RSP: 0018:ffff88003ae8bc68  EFLAGS: 00010296
      RAX: 000000000000008d RBX: 00000000ffffffe0 RCX: 0000000000000000
      RDX: ffff88003fdcf598 RSI: ffff88003fdcd9c8 RDI: ffff88003fdcd9c8
      RBP: ffff88003ae8bc88 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000001 R11: 00000000000002b2 R12: 0000000000000000
      R13: 0000000000000000 R14: ffff88003d3f7300 R15: ffff88000012a900
      FS:  00007fa0e2b4a840(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000d0f7e0 CR3: 000000003b8fb000 CR4: 00000000000006f0
      Stack:
       ffff88003a1cc200 00000000ffffffe0 00000000000000c0 ffffffff818cab1d
       ffff88003ae8bd68 ffffffff81628db4 ffff88003ae8bd48 ffff88003b7f5550
       ffff880031a09408 ffff88003b7f5550 ffff88000012aa48 ffff88000012ab00
      Call Trace:
       [<ffffffff81628db4>] unix_stream_sendmsg+0x2c4/0x470
       [<ffffffff81556f56>] sock_write_iter+0x146/0x160
       [<ffffffff811d9612>] new_sync_write+0x92/0xd0
       [<ffffffff811d9cd6>] vfs_write+0xd6/0x180
       [<ffffffff811da499>] SyS_write+0x59/0xd0
       [<ffffffff81651532>] system_call_fastpath+0x12/0x17
      Code: 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00
            00 00 48 c7 c7 30 db 91 81 48 89 04 24 31 c0 e8 4f a8 0e 00 <0f> 0b
            eb fe 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83
      RIP  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
      RSP <ffff88003ae8bc68>
      Kernel panic - not syncing: Fatal exception
      
      Moreover, the possible minimum is 1, so we can get another kernel panic:
      ...
      BUG: unable to handle kernel paging request at ffff88013caee5c0
      IP: [<ffffffff815604cf>] __alloc_skb+0x12f/0x1f0
      ...
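
      The fix bounds these sysctls from below with proc_dointvec_minmax.
      A hedged sketch of one net/core/sysctl_net_core.c table entry (per
      the upstream commit):

          static int min_sndbuf = SOCK_MIN_SNDBUF;
          static int min_rcvbuf = SOCK_MIN_RCVBUF;

          {
                  .procname       = "rmem_default",
                  .data           = &sysctl_rmem_default,
                  .maxlen         = sizeof(int),
                  .mode           = 0644,
                  .proc_handler   = proc_dointvec_minmax,
                  /* reject writes below SOCK_MIN_RCVBUF */
                  .extra1         = &min_rcvbuf,
          },
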
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tcp: restore 1.5x per RTT limit to CUBIC cwnd growth in congestion avoidance · 3a485973
      Neal Cardwell authored
      [ Upstream commit d578e18c ]
      
      Commit 814d488c ("tcp: fix the timid additive increase on stretch
      ACKs") fixed a bug where tcp_cong_avoid_ai() would either credit a
      connection with an increase of snd_cwnd_cnt, or increase snd_cwnd, but
      not both, resulting in cwnd increasing by 1 packet on at most every
      alternate invocation of tcp_cong_avoid_ai().
      
      Although the commit correctly implemented the CUBIC algorithm, which
      can increase cwnd by as much as 1 packet per 1 packet ACKed (2x per
      RTT), in practice that could be too aggressive: in tests on network
      paths with small buffers, YouTube server retransmission rates nearly
      doubled.
      
      This commit restores CUBIC to a maximum cwnd growth rate of 1 packet
      per 2 packets ACKed (1.5x per RTT). In YouTube tests this restored
      retransmit rates to low levels.
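
      The change itself is a one-line clamp in CUBIC's bictcp_update();
      a hedged sketch (ca->cnt is the number of ACKed packets required
      per 1-packet cwnd increase):

          /* The maximum rate of cwnd increase CUBIC allows is 1 packet per
           * 2 packets ACKed, i.e. cwnd grows at most 1.5x per RTT.
           */
          ca->cnt = max(ca->cnt, 2U);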
      
      Testing: This patch has been tested in datacenter netperf transfers
      and live youtube.com and google.com servers.
      
      Fixes: 9cd981dc ("tcp: fix stretch ACK bugs in CUBIC")
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tcp: fix tcp_cong_avoid_ai() credit accumulation bug with decreases in w · 34081d1d
      Neal Cardwell authored
      [ Upstream commit 9949afa4 ]
      
      The recent change to tcp_cong_avoid_ai() to handle stretch ACKs
      introduced a bug where snd_cwnd_cnt could accumulate a very large
      value while w was large, and then if w was reduced snd_cwnd could be
      incremented by a large delta, leading to a large burst and high packet
      loss. This was tickled when CUBIC's bictcp_update() sets "ca->cnt =
      100 * cwnd".
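
      For reference, a sketch of the repaired helper along the lines of
      the upstream commit (net/ipv4/tcp_cong.c):

          void tcp_cong_avoid_ai(struct tcp_sock *tp, u32 w, u32 acked)
          {
                  /* If credit was accumulated while w was larger, apply it
                   * gently now rather than as one large cwnd burst.
                   */
                  if (tp->snd_cwnd_cnt >= w) {
                          tp->snd_cwnd_cnt = 0;
                          tp->snd_cwnd++;
                  }

                  tp->snd_cwnd_cnt += acked;
                  if (tp->snd_cwnd_cnt >= w) {
                          u32 delta = tp->snd_cwnd_cnt / w;

                          tp->snd_cwnd_cnt -= delta * w;
                          tp->snd_cwnd += delta;
                  }
                  tp->snd_cwnd = min(tp->snd_cwnd, tp->snd_cwnd_clamp);
          }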
      
      This bug crept in while preparing the upstream version of
      814d488c.
      
      Testing: This patch has been tested in datacenter netperf transfers
      and live youtube.com and google.com servers.
      
      Fixes: 814d488c ("tcp: fix the timid additive increase on stretch ACKs")
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: fec: fix receive VLAN CTAG HW acceleration issue · be0e858e
      Nimrod Andy authored
      [ Upstream commit af5cbc98 ]
      
      The current driver supports the receive VLAN CTAG HW acceleration
      feature (NETIF_F_HW_VLAN_CTAG_RX) through software emulation. It
      calls skb_copy_to_linear_data_offset() to skip the VLAN tag, but the
      source and destination memory ranges overlap, which that
      memcpy()-based helper does not allow. This patch fixes the issue.
      
      V2:
      Michael Grzeschik suggested using memmove() instead of
      skb_copy_to_linear_data_offset().
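
      A hedged sketch of the change in the fec receive path (both ranges
      live in the same skb buffer, so they overlap):

          /* Before: memcpy()-based helper on overlapping ranges */
          skb_copy_to_linear_data_offset(skb, VLAN_HLEN, data, 2 * ETH_ALEN);

          /* After: memmove() is defined for overlapping src/dst */
          memmove(skb->data + VLAN_HLEN, data, 2 * ETH_ALEN);
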
      Reported-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
      Fixes: 1b7bde6d ("net: fec: implement rx_copybreak to improve rx performance")
      Signed-off-by: Fugang Duan <B38611@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net_sched: fix struct tc_u_hnode layout in u32 · 9f2db938
      WANG Cong authored
      [ Upstream commit 5778d39d ]
      
      We dynamically allocate divisor+1 entries for ->ht[] in tc_u_hnode:
      
        ht = kzalloc(sizeof(*ht) + divisor*sizeof(void *), GFP_KERNEL);
      
      So ->ht is supposed to be the last field of this struct; however,
      this is broken, since an rcu head is appended after it.
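
      The fix moves the rcu head above ->ht[] so the dynamically sized
      array really is last; a sketch of the corrected layout (per the
      upstream commit):

          struct tc_u_hnode {
                  struct tc_u_hnode __rcu *next;
                  u32                     handle;
                  u32                     prefix;
                  struct tc_u_common      *tp_c;
                  int                     refcnt;
                  unsigned int            divisor;
                  struct rcu_head         rcu;
                  /* The 'ht' field MUST be the last field in structure to
                   * allow for more entries allocated at end of structure.
                   */
                  struct tc_u_knode __rcu *ht[1];
          };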
      
      Fixes: 1ce87720 ("net: sched: make cls_u32 lockless")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sparc64: Fix several bugs in memmove(). · 588caa85
      David S. Miller authored
      [ Upstream commit 2077cef4 ]
      
      Firstly, handle zero length calls properly.  Believe it or not,
      there are a few of these happening during early boot.
      
      Next, we can't just drop to a memcpy() call in the forward copy case
      where dst <= src.  The reason is that the cache initializing stores
      used in the Niagara memcpy() implementations can end up clearing out
      cache lines before we've sourced their original contents completely.
      
      For example, considering NG4memcpy, the main unrolled loop begins like
      this:
      
           load   src + 0x00
           load   src + 0x08
           load   src + 0x10
           load   src + 0x18
           load   src + 0x20
           store  dst + 0x00
      
      Assume dst is 64 byte aligned and let's say that dst is src - 8 for
      this memcpy() call.  That store at the end there is the one to the
      first word in the cache line, thus clearing the whole line, which thus
      clobbers "src + 0x28" before it even gets loaded.
      
      To avoid this, just fall through to a simple copy only mildly
      optimized for the case where src and dst are 8 byte aligned and the
      length is a multiple of 8 as well.  We could get fancy and call
      GENmemcpy() but this is good enough for how this thing is actually
      used.
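
      The real fix is sparc64 assembly; as a behavioral sketch in C, the
      safe fallback looks roughly like this:

          #include <stddef.h>

          void *memmove_sketch(void *dst, const void *src, size_t n)
          {
                  char *d = dst;
                  const char *s = src;

                  if (!n)         /* zero-length calls do happen at early boot */
                          return dst;

                  if (d <= s) {
                          /* Plain forward copy: no cache-initializing stores
                           * that could clobber not-yet-loaded source bytes.
                           */
                          while (n--)
                                  *d++ = *s++;
                  } else {
                          /* dst > src with overlap: copy backwards */
                          d += n;
                          s += n;
                          while (n--)
                                  *--d = *--s;
                  }
                  return dst;
          }
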
      Reported-by: David Ahern <david.ahern@oracle.com>
      Reported-by: Bob Picco <bpicco@meloft.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sparc: Touch NMI watchdog when walking cpus and calling printk · d19c121b
      David Ahern authored
      [ Upstream commit 31aaa98c ]
      
      With the increase in the number of CPUs, calls to functions that
      dump output to the console (e.g., arch_trigger_all_cpu_backtrace)
      can take a long time to complete. If IRQs are disabled, the NMI
      watchdog eventually kicks in and creates more havoc. Avoid this by
      telling the NMI watchdog that everything is ok.
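
      The pattern of the fix, sketched (touch_nmi_watchdog() is the stock
      kernel helper for exactly this):

          for_each_online_cpu(cpu) {
                  /* Long printk-heavy walk with IRQs off: pet the NMI
                   * watchdog on every iteration so it doesn't fire.
                   */
                  touch_nmi_watchdog();
                  /* ... dump this cpu's registers / backtrace ... */
          }
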
      Signed-off-by: David Ahern <david.ahern@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sparc: perf: Make counting mode actually work · a65225da
      David Ahern authored
      [ Upstream commit d51291cb ]
      
      Currently perf-stat (aka counting mode) does not work:
      
      $ perf stat ls
      ...
       Performance counter stats for 'ls':
      
                1.585665      task-clock (msec)         #    0.580 CPUs utilized
                      24      context-switches          #    0.015 M/sec
                       0      cpu-migrations            #    0.000 K/sec
                      86      page-faults               #    0.054 M/sec
         <not supported>      cycles
         <not supported>      stalled-cycles-frontend
         <not supported>      stalled-cycles-backend
         <not supported>      instructions
         <not supported>      branches
         <not supported>      branch-misses
      
             0.002735100 seconds time elapsed
      
      The reason is that the event state is never reset (it stays with
      PERF_HES_UPTODATE set).
      Add a call to sparc_pmu_enable_event during the added_event handling.
      Clean up the encoding since pmu_start calls sparc_pmu_enable_event which
      does the same. Passing PERF_EF_RELOAD to sparc_pmu_start means the call
      to sparc_perf_event_set_period can be removed as well.
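
      A hedged sketch of the shape of the fix in the added-event handling
      of arch/sparc/kernel/perf_event.c (see the upstream commit for the
      verbatim hunk):

          for (i = 0; i < cpuc->n_events; i++) {
                  struct perf_event *cp = cpuc->event[i];

                  if (cpuc->current_idx[i] != PIC_NO_INDEX)
                          continue;       /* already programmed */

                  cpuc->current_idx[i] = i;
                  /* Programs and enables the counter (via
                   * sparc_pmu_enable_event); PERF_EF_RELOAD also reloads
                   * the period, so the explicit
                   * sparc_perf_event_set_period() call goes away.
                   */
                  sparc_pmu_start(cp, PERF_EF_RELOAD);
          }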
      
      With this patch:
      
      $ perf stat ls
      ...
       Performance counter stats for 'ls':
      
                1.552890      task-clock (msec)         #    0.552 CPUs utilized
                      24      context-switches          #    0.015 M/sec
                       0      cpu-migrations            #    0.000 K/sec
                      86      page-faults               #    0.055 M/sec
               5,748,997      cycles                    #    3.702 GHz
         <not supported>      stalled-cycles-frontend:HG
         <not supported>      stalled-cycles-backend:HG
               1,684,362      instructions:HG           #    0.29  insns per cycle
                 295,133      branches:HG               #  190.054 M/sec
                  28,007      branch-misses:HG          #    9.49% of all branches
      
             0.002815665 seconds time elapsed
      Signed-off-by: David Ahern <david.ahern@oracle.com>
      Acked-by: Bob Picco <bob.picco@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sparc: perf: Remove redundant perf_pmu_{en|dis}able calls · 9fd7f980
      David Ahern authored
      [ Upstream commit 5b0d4b55 ]
      
      perf_pmu_disable is called by core perf code before pmu->del, and
      the enable function is called by core perf code afterwards. There is
      no need to call them again within sparc_pmu_del.
      
      Ditto for pmu->add and sparc_pmu_add.
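
      The effect on sparc_pmu_del(), sketched as a hedged outline rather
      than the verbatim hunk (the same applies to sparc_pmu_add()):

          static void sparc_pmu_del(struct perf_event *event, int _flags)
          {
                  struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
                  unsigned long flags;

                  local_irq_save(flags);
                  /* dropped: perf_pmu_disable(event->pmu);
                   * core perf already disabled the pmu around pmu->del()
                   */

                  /* ... remove the event from cpuc ... */

                  /* dropped: perf_pmu_enable(event->pmu); */
                  local_irq_restore(flags);
          }
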
      Signed-off-by: David Ahern <david.ahern@oracle.com>
      Acked-by: Bob Picco <bob.picco@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sparc: semtimedop() unreachable due to comparison error · 915a9ae8
      Rob Gardner authored
      [ Upstream commit 53eb2516 ]
      
      A bug was reported that the semtimedop() system call was always
      failing with ENOSYS.
      
      Since SEMCTL is defined as 3, and SEMTIMEDOP is defined as 4,
      the comparison "call <= SEMCTL" will always prevent SEMTIMEDOP
      from getting through to the semaphore ops switch statement.
      
      This is corrected by changing the comparison to "call <= SEMTIMEDOP".
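
      A sketch of the one-line fix in the sparc ipc syscall handler
      (arch/sparc/kernel/sys_sparc_64.c):

          /* was: if (call <= SEMCTL) -- SEMCTL is 3, so SEMTIMEDOP (4)
           * could never reach the switch below
           */
          if (call <= SEMTIMEDOP) {
                  switch (call) {
                  /* SEMOP, SEMGET, SEMCTL cases unchanged */
                  case SEMTIMEDOP:        /* now reachable */
                          /* err = sys_semtimedop(...); */
                          break;
                  }
          }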
      
      Orabug: 20633375
      Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 18 Mar, 2015 28 commits