1. 28 Mar, 2015 14 commits
  2. 24 Mar, 2015 26 commits
    • Eric Dumazet's avatar
      tcp: make connect() mem charging friendly · e8f117f0
      Eric Dumazet authored
      [ Upstream commit 355a901e ]
      
      While working on sk_forward_alloc problems reported by Denys
      Fedoryshchenko, we found that tcp connect() (and fastopen) do not call
      sk_wmem_schedule() for SYN packet (and/or SYN/DATA packet), so
      sk_forward_alloc is negative while connect is in progress.
      
      We can fix this by calling regular sk_stream_alloc_skb() both for the
      SYN packet (in tcp_connect()) and the syn_data packet in
      tcp_send_syn_data()
      
      Then, tcp_send_syn_data() can avoid copying syn_data as we simply
      can manipulate syn_data->cb[] to remove SYN flag (and increment seq)
      
      Instead of open coding memcpy_fromiovecend(), simply use this helper.
      
      This leaves in socket write queue clean fast clone skbs.
      
      This was tested against our fastopen packetdrill tests.
      Reported-by: default avatarDenys Fedoryshchenko <nuclearcat@nuclearcat.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      e8f117f0
    • Catalin Marinas's avatar
      net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour · 34ca18c8
      Catalin Marinas authored
      [ Upstream commit 91edd096 ]
      
      Commit db31c55a (net: clamp ->msg_namelen instead of returning an
      error) introduced the clamping of msg_namelen when the unsigned value
      was larger than sizeof(struct sockaddr_storage). This caused a
      msg_namelen of -1 to be valid. The native code was subsequently fixed by
      commit dbb490b9 (net: socket: error on a negative msg_namelen).
      
      In addition, the native code sets msg_namelen to 0 when msg_name is
      NULL. This was done in commit (6a2a2b3a net:socket: set msg_namelen
      to 0 if msg_name is passed as NULL in msghdr struct from userland) and
      subsequently updated by 08adb7da (fold verify_iovec() into
      copy_msghdr_from_user()).
      
      This patch brings the get_compat_msghdr() in line with
      copy_msghdr_from_user().
      
      Fixes: db31c55a (net: clamp ->msg_namelen instead of returning an error)
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      34ca18c8
    • Josh Hunt's avatar
      tcp: fix tcp fin memory accounting · b182ecc8
      Josh Hunt authored
      [ Upstream commit d22e1537 ]
      
      tcp_send_fin() does not account for the memory it allocates properly, so
      sk_forward_alloc can be negative in cases where we've sent a FIN:
      
      ss example output (ss -amn | grep -B1 f4294):
      tcp    FIN-WAIT-1 0      1            192.168.0.1:45520         192.0.2.1:8080
      	skmem:(r0,rb87380,t0,tb87380,f4294966016,w1280,o0,bl0)
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      b182ecc8
    • Steven Barth's avatar
      ipv6: fix backtracking for throw routes · 7f249ac5
      Steven Barth authored
      [ Upstream commit 73ba57bf ]
      
      for throw routes to trigger evaluation of other policy rules
      EAGAIN needs to be propagated up to fib_rules_lookup
      similar to how its done for IPv4
      
      A simple testcase for verification is:
      
      ip -6 rule add lookup 33333 priority 33333
      ip -6 route add throw 2001:db8::1
      ip -6 route add 2001:db8::1 via fe80::1 dev wlan0 table 33333
      ip route get 2001:db8::1
      Signed-off-by: default avatarSteven Barth <cyrus@openwrt.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      7f249ac5
    • Ondrej Zary's avatar
      Revert "net: cx82310_eth: use common match macro" · 5039b8cf
      Ondrej Zary authored
      [ Upstream commit 8d006e01 ]
      
      This reverts commit 11ad714b because
      it breaks cx82310_eth.
      
      The custom USB_DEVICE_CLASS macro matches
      bDeviceClass, bDeviceSubClass and bDeviceProtocol
      but the common USB_DEVICE_AND_INTERFACE_INFO matches
      bInterfaceClass, bInterfaceSubClass and bInterfaceProtocol instead, which are
      not specified.
      Signed-off-by: default avatarOndrej Zary <linux@rainbow-software.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      5039b8cf
    • Al Viro's avatar
      rxrpc: bogus MSG_PEEK test in rxrpc_recvmsg() · 3d1acc9e
      Al Viro authored
      [ Upstream commit 7d985ed1 ]
      
      [I would really like an ACK on that one from dhowells; it appears to be
      quite straightforward, but...]
      
      MSG_PEEK isn't passed to ->recvmsg() via msg->msg_flags; as the matter of
      fact, neither the kernel users of rxrpc, nor the syscalls ever set that bit
      in there.  It gets passed via flags; in fact, another such check in the same
      function is done correctly - as flags & MSG_PEEK.
      
      It had been that way (effectively disabled) for 8 years, though, so the patch
      needs beating up - that case had never been tested.  If it is correct, it's
      -stable fodder.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      3d1acc9e
    • Al Viro's avatar
      caif: fix MSG_OOB test in caif_seqpkt_recvmsg() · 02bfe56e
      Al Viro authored
      [ Upstream commit 3eeff778 ]
      
      It should be checking flags, not msg->msg_flags.  It's ->sendmsg()
      instances that need to look for that in ->msg_flags, ->recvmsg() ones
      (including the other ->recvmsg() instance in that file, as well as
      unix_dgram_recvmsg() this one claims to be imitating) check in flags.
      Braino had been introduced in commit dcda13 ("caif: Bugfix - use MSG_TRUNC
      in receive") back in 2010, so it goes quite a while back.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      02bfe56e
    • Eric Dumazet's avatar
      inet_diag: fix possible overflow in inet_diag_dump_one_icsk() · e1f2092a
      Eric Dumazet authored
      [ Upstream commit c8e2c80d ]
      
      inet_diag_dump_one_icsk() allocates too small skb.
      
      Add inet_sk_attr_size() helper right before inet_sk_diag_fill()
      so that it can be updated if/when new attributes are added.
      
      iproute2/ss currently does not use this dump_one() interface,
      this might explain nobody noticed this problem yet.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      e1f2092a
    • Jason Wang's avatar
      virtio-net: correctly delete napi hash · b9befa43
      Jason Wang authored
      [ Upstream commit ab3971b1 ]
      
      We don't delete napi from hash list during module exit. This will
      cause the following panic when doing module load and unload:
      
      BUG: unable to handle kernel paging request at 0000004e00000075
      IP: [<ffffffff816bd01b>] napi_hash_add+0x6b/0xf0
      PGD 3c5d5067 PUD 0
      Oops: 0000 [#1] SMP
      ...
      Call Trace:
      [<ffffffffa0a5bfb7>] init_vqs+0x107/0x490 [virtio_net]
      [<ffffffffa0a5c9f2>] virtnet_probe+0x562/0x791815639d880be [virtio_net]
      [<ffffffff8139e667>] virtio_dev_probe+0x137/0x200
      [<ffffffff814c7f2a>] driver_probe_device+0x7a/0x250
      [<ffffffff814c81d3>] __driver_attach+0x93/0xa0
      [<ffffffff814c8140>] ? __device_attach+0x40/0x40
      [<ffffffff814c6053>] bus_for_each_dev+0x63/0xa0
      [<ffffffff814c7a79>] driver_attach+0x19/0x20
      [<ffffffff814c76f0>] bus_add_driver+0x170/0x220
      [<ffffffffa0a60000>] ? 0xffffffffa0a60000
      [<ffffffff814c894f>] driver_register+0x5f/0xf0
      [<ffffffff8139e41b>] register_virtio_driver+0x1b/0x30
      [<ffffffffa0a60010>] virtio_net_driver_init+0x10/0x12 [virtio_net]
      
      This patch fixes this by doing this in virtnet_free_queues(). And also
      don't delete napi in virtnet_freeze() since it will call
      virtnet_free_queues() which has already did this.
      
      Fixes 91815639 ("virtio-net: rx busy polling support")
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Reviewed-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      b9befa43
    • Arnd Bergmann's avatar
      rds: avoid potential stack overflow · 6ba8661b
      Arnd Bergmann authored
      [ Upstream commit f862e07c ]
      
      The rds_iw_update_cm_id function stores a large 'struct rds_sock' object
      on the stack in order to pass a pair of addresses. This happens to just
      fit withint the 1024 byte stack size warning limit on x86, but just
      exceed that limit on ARM, which gives us this warning:
      
      net/rds/iw_rdma.c:200:1: warning: the frame size of 1056 bytes is larger than 1024 bytes [-Wframe-larger-than=]
      
      As the use of this large variable is basically bogus, we can rearrange
      the code to not do that. Instead of passing an rds socket into
      rds_iw_get_device, we now just pass the two addresses that we have
      available in rds_iw_update_cm_id, and we change rds_iw_get_mr accordingly,
      to create two address structures on the stack there.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      6ba8661b
    • Alexey Kodanev's avatar
      net: sysctl_net_core: check SNDBUF and RCVBUF for min length · c48cf4f2
      Alexey Kodanev authored
      [ Upstream commit b1cb59cf ]
      
      sysctl has sysctl.net.core.rmem_*/wmem_* parameters which can be
      set to incorrect values. Given that 'struct sk_buff' allocates from
      rcvbuf, incorrectly set buffer length could result to memory
      allocation failures. For example, set them as follows:
      
          # sysctl net.core.rmem_default=64
            net.core.wmem_default = 64
          # sysctl net.core.wmem_default=64
            net.core.wmem_default = 64
          # ping localhost -s 1024 -i 0 > /dev/null
      
      This could result to the following failure:
      
      skbuff: skb_over_panic: text:ffffffff81628db4 len:-32 put:-32
      head:ffff88003a1cc200 data:ffff88003a1cc200 tail:0xffffffe0 end:0xc0 dev:<NULL>
      kernel BUG at net/core/skbuff.c:102!
      invalid opcode: 0000 [#1] SMP
      ...
      task: ffff88003b7f5550 ti: ffff88003ae88000 task.ti: ffff88003ae88000
      RIP: 0010:[<ffffffff8155fbd1>]  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
      RSP: 0018:ffff88003ae8bc68  EFLAGS: 00010296
      RAX: 000000000000008d RBX: 00000000ffffffe0 RCX: 0000000000000000
      RDX: ffff88003fdcf598 RSI: ffff88003fdcd9c8 RDI: ffff88003fdcd9c8
      RBP: ffff88003ae8bc88 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000001 R11: 00000000000002b2 R12: 0000000000000000
      R13: 0000000000000000 R14: ffff88003d3f7300 R15: ffff88000012a900
      FS:  00007fa0e2b4a840(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000d0f7e0 CR3: 000000003b8fb000 CR4: 00000000000006f0
      Stack:
       ffff88003a1cc200 00000000ffffffe0 00000000000000c0 ffffffff818cab1d
       ffff88003ae8bd68 ffffffff81628db4 ffff88003ae8bd48 ffff88003b7f5550
       ffff880031a09408 ffff88003b7f5550 ffff88000012aa48 ffff88000012ab00
      Call Trace:
       [<ffffffff81628db4>] unix_stream_sendmsg+0x2c4/0x470
       [<ffffffff81556f56>] sock_write_iter+0x146/0x160
       [<ffffffff811d9612>] new_sync_write+0x92/0xd0
       [<ffffffff811d9cd6>] vfs_write+0xd6/0x180
       [<ffffffff811da499>] SyS_write+0x59/0xd0
       [<ffffffff81651532>] system_call_fastpath+0x12/0x17
      Code: 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00
            00 00 48 c7 c7 30 db 91 81 48 89 04 24 31 c0 e8 4f a8 0e 00 <0f> 0b
            eb fe 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83
      RIP  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
      RSP <ffff88003ae8bc68>
      Kernel panic - not syncing: Fatal exception
      
      Moreover, the possible minimum is 1, so we can get another kernel panic:
      ...
      BUG: unable to handle kernel paging request at ffff88013caee5c0
      IP: [<ffffffff815604cf>] __alloc_skb+0x12f/0x1f0
      ...
      Signed-off-by: default avatarAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      c48cf4f2
    • Nimrod Andy's avatar
      net: fec: fix receive VLAN CTAG HW acceleration issue · 563d9192
      Nimrod Andy authored
      [ Upstream commit af5cbc98 ]
      
      The current driver support receive VLAN CTAG HW acceleration feature
      (NETIF_F_HW_VLAN_CTAG_RX) through software simulation. There calls the
      api .skb_copy_to_linear_data_offset() to skip the VLAN tag, but there
      have overlap between the two memory data point range. The patch just fix
      the issue.
      
      V2:
      Michael Grzeschik suggest to use memmove() instead of skb_copy_to_linear_data_offset().
      Reported-by: default avatarMichael Grzeschik <m.grzeschik@pengutronix.de>
      Fixes: 1b7bde6d ("net: fec: implement rx_copybreak to improve rx performance")
      Signed-off-by: default avatarFugang Duan <B38611@freescale.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      563d9192
    • WANG Cong's avatar
      net_sched: fix struct tc_u_hnode layout in u32 · 06e0dd73
      WANG Cong authored
      [ Upstream commit 5778d39d ]
      
      We dynamically allocate divisor+1 entries for ->ht[] in tc_u_hnode:
      
        ht = kzalloc(sizeof(*ht) + divisor*sizeof(void *), GFP_KERNEL);
      
      So ->ht is supposed to be the last field of this struct, however
      this is broken, since an rcu head is appended after it.
      
      Fixes: 1ce87720 ("net: sched: make cls_u32 lockless")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      06e0dd73
    • David S. Miller's avatar
      sparc64: Fix several bugs in memmove(). · 62268585
      David S. Miller authored
      [ Upstream commit 2077cef4 ]
      
      Firstly, handle zero length calls properly.  Believe it or not there
      are a few of these happening during early boot.
      
      Next, we can't just drop to a memcpy() call in the forward copy case
      where dst <= src.  The reason is that the cache initializing stores
      used in the Niagara memcpy() implementations can end up clearing out
      cache lines before we've sourced their original contents completely.
      
      For example, considering NG4memcpy, the main unrolled loop begins like
      this:
      
           load   src + 0x00
           load   src + 0x08
           load   src + 0x10
           load   src + 0x18
           load   src + 0x20
           store  dst + 0x00
      
      Assume dst is 64 byte aligned and let's say that dst is src - 8 for
      this memcpy() call.  That store at the end there is the one to the
      first line in the cache line, thus clearing the whole line, which thus
      clobbers "src + 0x28" before it even gets loaded.
      
      To avoid this, just fall through to a simple copy only mildly
      optimized for the case where src and dst are 8 byte aligned and the
      length is a multiple of 8 as well.  We could get fancy and call
      GENmemcpy() but this is good enough for how this thing is actually
      used.
      Reported-by: default avatarDavid Ahern <david.ahern@oracle.com>
      Reported-by: default avatarBob Picco <bpicco@meloft.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      62268585
    • David Ahern's avatar
      sparc: Touch NMI watchdog when walking cpus and calling printk · 54762bf1
      David Ahern authored
      [ Upstream commit 31aaa98c ]
      
      With the increase in number of CPUs calls to functions that dump
      output to console (e.g., arch_trigger_all_cpu_backtrace) can take
      a long time to complete. If IRQs are disabled eventually the NMI
      watchdog kicks in and creates more havoc. Avoid by telling the NMI
      watchdog everything is ok.
      Signed-off-by: default avatarDavid Ahern <david.ahern@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      54762bf1
    • David Ahern's avatar
      sparc: perf: Make counting mode actually work · 4cd5bcca
      David Ahern authored
      [ Upstream commit d51291cb ]
      
      Currently perf-stat (aka, counting mode) does not work:
      
      $ perf stat ls
      ...
       Performance counter stats for 'ls':
      
                1.585665      task-clock (msec)         #    0.580 CPUs utilized
                      24      context-switches          #    0.015 M/sec
                       0      cpu-migrations            #    0.000 K/sec
                      86      page-faults               #    0.054 M/sec
         <not supported>      cycles
         <not supported>      stalled-cycles-frontend
         <not supported>      stalled-cycles-backend
         <not supported>      instructions
         <not supported>      branches
         <not supported>      branch-misses
      
             0.002735100 seconds time elapsed
      
      The reason is that state is never reset (stays with PERF_HES_UPTODATE set).
      Add a call to sparc_pmu_enable_event during the added_event handling.
      Clean up the encoding since pmu_start calls sparc_pmu_enable_event which
      does the same. Passing PERF_EF_RELOAD to sparc_pmu_start means the call
      to sparc_perf_event_set_period can be removed as well.
      
      With this patch:
      
      $ perf stat ls
      ...
       Performance counter stats for 'ls':
      
                1.552890      task-clock (msec)         #    0.552 CPUs utilized
                      24      context-switches          #    0.015 M/sec
                       0      cpu-migrations            #    0.000 K/sec
                      86      page-faults               #    0.055 M/sec
               5,748,997      cycles                    #    3.702 GHz
         <not supported>      stalled-cycles-frontend:HG
         <not supported>      stalled-cycles-backend:HG
               1,684,362      instructions:HG           #    0.29  insns per cycle
                 295,133      branches:HG               #  190.054 M/sec
                  28,007      branch-misses:HG          #    9.49% of all branches
      
             0.002815665 seconds time elapsed
      Signed-off-by: default avatarDavid Ahern <david.ahern@oracle.com>
      Acked-by: default avatarBob Picco <bob.picco@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      4cd5bcca
    • David Ahern's avatar
      sparc: perf: Remove redundant perf_pmu_{en|dis}able calls · 84763ada
      David Ahern authored
      [ Upstream commit 5b0d4b55 ]
      
      perf_pmu_disable is called by core perf code before pmu->del and the
      enable function is called by core perf code afterwards. No need to
      call again within sparc_pmu_del.
      
      Ditto for pmu->add and sparc_pmu_add.
      Signed-off-by: default avatarDavid Ahern <david.ahern@oracle.com>
      Acked-by: default avatarBob Picco <bob.picco@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      84763ada
    • Rob Gardner's avatar
      sparc: semtimedop() unreachable due to comparison error · 4fb82ed6
      Rob Gardner authored
      [ Upstream commit 53eb2516 ]
      
      A bug was reported that the semtimedop() system call was always
      failing eith ENOSYS.
      
      Since SEMCTL is defined as 3, and SEMTIMEDOP is defined as 4,
      the comparison "call <= SEMCTL" will always prevent SEMTIMEDOP
      from getting through to the semaphore ops switch statement.
      
      This is corrected by changing the comparison to "call <= SEMTIMEDOP".
      
      Orabug: 20633375
      Signed-off-by: default avatarRob Gardner <rob.gardner@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      4fb82ed6
    • Andreas Larsson's avatar
      sparc32: destroy_context() and switch_mm() needs to disable interrupts. · b8b07bdf
      Andreas Larsson authored
      [ Upstream commit 66d0f7ec ]
      
      Load balancing can be triggered in the critical sections protected by
      srmmu_context_spinlock in destroy_context() and switch_mm() and can hang
      the cpu waiting for the rq lock of another cpu that in turn has called
      switch_mm hangning on srmmu_context_spinlock leading to deadlock.
      
      So, disable interrupt while taking srmmu_context_spinlock in
      destroy_context() and switch_mm() so we don't deadlock.
      
      See also commit 77b838fa ("[SPARC64]: destroy_context() needs to disable
      interrupts.")
      Signed-off-by: default avatarAndreas Larsson <andreas@gaisler.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      b8b07bdf
    • Sasha Levin's avatar
      Linux 3.18.10 · 96e199f1
      Sasha Levin authored
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      96e199f1
    • Ian Munsie's avatar
      cxl: Add missing return statement after handling AFU errror · da7a0533
      Ian Munsie authored
      commit a6130ed2 upstream.
      
      We were missing a return statement in the PSL interrupt handler in the
      case of an AFU error, which would trigger an "Unhandled CXL PSL IRQ"
      warning. We do actually handle these type of errors (by notifying
      userspace), so add the missing return IRQ_HANDLED so we don't throw
      unecessary warnings.
      Signed-off-by: default avatarIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      da7a0533
    • Ryan Grimm's avatar
      cxl: Fix device_node reference counting · 3bc64c1b
      Ryan Grimm authored
      commit 6f963ec2 upstream.
      
      When unbinding and rebinding the driver on a system with a card in PHB0, this
      error condition is reached after a few attempts:
      
      ERROR: Bad of_node_put() on /pciex@3fffe40000000
      CPU: 0 PID: 3040 Comm: bash Not tainted 3.18.0-rc3-12545-g3627ffe #152
      Call Trace:
      [c000000721acb5c0] [c00000000086ef94] .dump_stack+0x84/0xb0 (unreliable)
      [c000000721acb640] [c00000000073a0a8] .of_node_release+0xd8/0xe0
      [c000000721acb6d0] [c00000000044bc44] .kobject_release+0x74/0xe0
      [c000000721acb760] [c0000000007394fc] .of_node_put+0x1c/0x30
      [c000000721acb7d0] [c000000000545cd8] .cxl_probe+0x1a98/0x1d50
      [c000000721acb900] [c0000000004845a0] .local_pci_probe+0x40/0xc0
      [c000000721acb980] [c000000000484998] .pci_device_probe+0x128/0x170
      [c000000721acba30] [c00000000052400c] .driver_probe_device+0xac/0x2a0
      [c000000721acbad0] [c000000000522468] .bind_store+0x108/0x160
      [c000000721acbb70] [c000000000521448] .drv_attr_store+0x38/0x60
      [c000000721acbbe0] [c000000000293840] .sysfs_kf_write+0x60/0xa0
      [c000000721acbc50] [c000000000292500] .kernfs_fop_write+0x140/0x1d0
      [c000000721acbcf0] [c000000000208648] .vfs_write+0xd8/0x260
      [c000000721acbd90] [c000000000208b18] .SyS_write+0x58/0x100
      [c000000721acbe30] [c000000000009258] syscall_exit+0x0/0x98
      
      We are missing a call to of_node_get(). pnv_pci_to_phb_node() should
      call of_node_get() otherwise np's reference count isn't incremented and
      it might go away. Rename pnv_pci_to_phb_node() to pnv_pci_get_phb_node()
      so it's clear it calls of_node_get().
      Signed-off-by: default avatarRyan Grimm <grimm@linux.vnet.ibm.com>
      Acked-by: default avatarIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3bc64c1b
    • Ryan Grimm's avatar
      cxl: Use image state defaults for reloading FPGA · 0153a8f0
      Ryan Grimm authored
      commit 4beb5421 upstream.
      
      Select defaults such that a PERST causes flash image reload.  Select which
      image based on what the card is set up to load.
      
      CXL_VSEC_PERST_LOADS_IMAGE selects whether PERST assertion causes flash image
      load.
      
      CXL_VSEC_PERST_SELECT_USER selects which image is loaded on the next PERST.
      
      cxl_update_image_control writes these bits into the VSEC.
      Signed-off-by: default avatarRyan Grimm <grimm@linux.vnet.ibm.com>
      Acked-by: default avatarIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0153a8f0
    • Sergei Shtylyov's avatar
      clk-gate: fix bit # check in clk_register_gate() · 22374fc8
      Sergei Shtylyov authored
      commit 2e9dcdae upstream.
      
      In case CLK_GATE_HIWORD_MASK flag is passed to clk_register_gate(), the bit #
      should be no higher than 15, however the corresponding check is obviously off-
      by-one.
      
      Fixes: 04577994 ("clk: gate: add CLK_GATE_HIWORD_MASK")
      Signed-off-by: default avatarSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: default avatarMichael Turquette <mturquette@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      22374fc8
    • Peter Zijlstra's avatar
      sched/autogroup: Fix failure to set cpu.rt_runtime_us · 9acb057e
      Peter Zijlstra authored
      commit 1fe89e1b upstream.
      
      Because task_group() uses a cache of autogroup_task_group(), whose
      output depends on sched_class, switching classes can generate
      problems.
      
      In particular, when started as fair, the cache points to the
      autogroup, so when switching to RT the tg_rt_schedulable() test fails
      for every cpu.rt_{runtime,period}_us change because now the autogroup
      has tasks and no runtime.
      
      Furthermore, going back to the previous semantics of varying
      task_group() with sched_class has the down-side that the sched_debug
      output varies as well, even though the task really is in the
      autogroup.
      
      Therefore add an autogroup exception to tg_has_rt_tasks() -- such that
      both (all) task_group() usages in sched/core now have one. And remove
      all the remnants of the variable task_group() output.
      Reported-by: default avatarZefan Li <lizefan@huawei.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Stefan Bader <stefan.bader@canonical.com>
      Fixes: 8323f26c ("sched: Fix race in task_group()")
      Link: http://lkml.kernel.org/r/20150209112237.GR5029@twins.programming.kicks-ass.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9acb057e
    • Michal Hocko's avatar
      vmstat: do not use deferrable delayed work for vmstat_update · c462adc8
      Michal Hocko authored
      commit ba4877b9 upstream.
      
      Vinayak Menon has reported that an excessive number of tasks was throttled
      in the direct reclaim inside too_many_isolated() because NR_ISOLATED_FILE
      was relatively high compared to NR_INACTIVE_FILE.  However it turned out
      that the real number of NR_ISOLATED_FILE was 0 and the per-cpu
      vm_stat_diff wasn't transferred into the global counter.
      
      vmstat_work which is responsible for the sync is defined as deferrable
      delayed work which means that the defined timeout doesn't wake up an idle
      CPU.  A CPU might stay in an idle state for a long time and general effort
      is to keep such a CPU in this state as long as possible which might lead
      to all sorts of troubles for vmstat consumers as can be seen with the
      excessive direct reclaim throttling.
      
      This patch basically reverts 39bf6270 ("VM statistics: Make timer
      deferrable") but it shouldn't cause any problems for idle CPUs because
      only CPUs with an active per-cpu drift are woken up since 7cc36bbd
      ("vmstat: on-demand vmstat workers v8") and CPUs which are idle for a
      longer time shouldn't have per-cpu drift.
      
      Fixes: 39bf6270 (VM statistics: Make timer deferrable)
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.cz>
      Reported-by: default avatarVinayak Menon <vinmenon@codeaurora.org>
      Acked-by: default avatarChristoph Lameter <cl@linux.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c462adc8