1. 24 Jun, 2016 25 commits
    • AceLan Kao's avatar
      ALSA: hda - Fix headset mic detection problem for Dell machine · 1bf80a48
      AceLan Kao authored
      commit f90d83b3 upstream.
      
      Add the pin configuration value of this machine into the pin_quirk
      table to make DELL1_MIC_NO_PRESENCE apply to this machine.
      Signed-off-by: default avatarAceLan Kao <acelan.kao@canonical.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1bf80a48
    • Vinod Koul's avatar
      ALSA: hda - Add PCI ID for Kabylake · 1f4b7507
      Vinod Koul authored
      commit 35639a0e upstream.
      
      Kabylake shows up as PCI ID 0xa171. And Kabylake-LP as 0x9d71.
      Since these are similar to Skylake add these to SKL_PLUS macro
      Signed-off-by: default avatarVinod Koul <vinod.koul@intel.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f4b7507
    • Paolo Bonzini's avatar
      KVM: irqfd: fix NULL pointer dereference in kvm_irq_map_gsi · 2cb77b0a
      Paolo Bonzini authored
      commit c622a3c2 upstream.
      
      Found by syzkaller:
      
          BUG: unable to handle kernel NULL pointer dereference at 0000000000000120
          IP: [<ffffffffa0797202>] kvm_irq_map_gsi+0x12/0x90 [kvm]
          PGD 6f80b067 PUD b6535067 PMD 0
          Oops: 0000 [#1] SMP
          CPU: 3 PID: 4988 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1
          [...]
          Call Trace:
           [<ffffffffa0795f62>] irqfd_update+0x32/0xc0 [kvm]
           [<ffffffffa0796c7c>] kvm_irqfd+0x3dc/0x5b0 [kvm]
           [<ffffffffa07943f4>] kvm_vm_ioctl+0x164/0x6f0 [kvm]
           [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480
           [<ffffffff812418a9>] SyS_ioctl+0x79/0x90
           [<ffffffff817a1062>] tracesys_phase2+0x84/0x89
          Code: b5 71 a7 e0 5b 41 5c 41 5d 5d f3 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8b 8f 10 2e 00 00 31 c0 48 89 e5 <39> 91 20 01 00 00 76 6a 48 63 d2 48 8b 94 d1 28 01 00 00 48 85
          RIP  [<ffffffffa0797202>] kvm_irq_map_gsi+0x12/0x90 [kvm]
           RSP <ffff8800926cbca8>
          CR2: 0000000000000120
      
      Testcase:
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <string.h>
          #include <stdint.h>
          #include <linux/kvm.h>
          #include <fcntl.h>
          #include <sys/ioctl.h>
      
          long r[26];
      
          int main()
          {
              memset(r, -1, sizeof(r));
              r[2] = open("/dev/kvm", 0);
              r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
      
              struct kvm_irqfd ifd;
              ifd.fd = syscall(SYS_eventfd2, 5, 0);
              ifd.gsi = 3;
              ifd.flags = 2;
              ifd.resamplefd = ifd.fd;
              r[25] = ioctl(r[3], KVM_IRQFD, &ifd);
              return 0;
          }
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2cb77b0a
    • Paolo Bonzini's avatar
      KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGS · ded4fc62
      Paolo Bonzini authored
      commit d14bdb55 upstream.
      
      MOV to DR6 or DR7 causes a #GP if an attempt is made to write a 1 to
      any of bits 63:32.  However, this is not detected at KVM_SET_DEBUGREGS
      time, and the next KVM_RUN oopses:
      
         general protection fault: 0000 [#1] SMP
         CPU: 2 PID: 14987 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1
         Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012
         [...]
         Call Trace:
          [<ffffffffa072c93d>] kvm_arch_vcpu_ioctl_run+0x141d/0x14e0 [kvm]
          [<ffffffffa071405d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm]
          [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480
          [<ffffffff812418a9>] SyS_ioctl+0x79/0x90
          [<ffffffff817a0f2e>] entry_SYSCALL_64_fastpath+0x12/0x71
         Code: 55 83 ff 07 48 89 e5 77 27 89 ff ff 24 fd 90 87 80 81 0f 23 fe 5d c3 0f 23 c6 5d c3 0f 23 ce 5d c3 0f 23 d6 5d c3 0f 23 de 5d c3 <0f> 23 f6 5d c3 0f 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
         RIP  [<ffffffff810639eb>] native_set_debugreg+0x2b/0x40
          RSP <ffff88005836bd50>
      
      Testcase (beautified/reduced from syzkaller output):
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <string.h>
          #include <stdint.h>
          #include <linux/kvm.h>
          #include <fcntl.h>
          #include <sys/ioctl.h>
      
          long r[8];
      
          int main()
          {
              struct kvm_debugregs dr = { 0 };
      
              r[2] = open("/dev/kvm", O_RDONLY);
              r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
              r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);
      
              memcpy(&dr,
                     "\x5d\x6a\x6b\xe8\x57\x3b\x4b\x7e\xcf\x0d\xa1\x72"
                     "\xa3\x4a\x29\x0c\xfc\x6d\x44\x00\xa7\x52\xc7\xd8"
                     "\x00\xdb\x89\x9d\x78\xb5\x54\x6b\x6b\x13\x1c\xe9"
                     "\x5e\xd3\x0e\x40\x6f\xb4\x66\xf7\x5b\xe3\x36\xcb",
                     48);
              r[7] = ioctl(r[4], KVM_SET_DEBUGREGS, &dr);
              r[6] = ioctl(r[4], KVM_RUN, 0);
          }
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ded4fc62
    • David Wragg's avatar
      vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices · ce9c0dba
      David Wragg authored
      [ Upstream commit 7e059158 ]
      
      Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could
      transmit vxlan packets of any size, constrained only by the ability to
      send out the resulting packets.  4.3 introduced netdevs corresponding
      to tunnel vports.  These netdevs have an MTU, which limits the size of
      a packet that can be successfully encapsulated.  The default MTU
      values are low (1500 or less), which is awkwardly small in the context
      of physical networks supporting jumbo frames, and leads to a
      conspicuous change in behaviour for userspace.
      
      Instead, set the MTU on openvswitch-created netdevs to be the relevant
      maximum (i.e. the maximum IP packet size minus any relevant overhead),
      effectively restoring the behaviour prior to 4.3.
      Signed-off-by: default avatarDavid Wragg <david@weave.works>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ce9c0dba
    • David Wragg's avatar
      geneve: Relax MTU constraints · 51d7c394
      David Wragg authored
      [ Upstream commit 55e5bfb5 ]
      
      Allow the MTU of geneve devices to be set to large values, in order to
      exploit underlying networks with larger frame sizes.
      
      GENEVE does not have a fixed encapsulation overhead (an openvswitch
      rule can add variable length options), so there is no relevant maximum
      MTU to enforce.  A maximum of IP_MAX_MTU is used instead.
      Encapsulated packets that are too big for the underlying network will
      get dropped on the floor.
      Signed-off-by: default avatarDavid Wragg <david@weave.works>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      51d7c394
    • David Wragg's avatar
      vxlan: Relax MTU constraints · 3dc44305
      David Wragg authored
      [ Upstream commit 72564b59 ]
      
      Allow the MTU of vxlan devices without an underlying device to be set
      to larger values (up to a maximum based on IP packet limits and vxlan
      overhead).
      
      Previously, their MTUs could not be set to higher than the
      conventional ethernet value of 1500.  This is a very arbitrary value
      in the context of vxlan, and prevented vxlan devices from being able
      to take advantage of jumbo frames etc.
      
      The default MTU remains 1500, for compatibility.
      Signed-off-by: default avatarDavid Wragg <david@weave.works>
      Acked-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3dc44305
    • Jakub Sitnicki's avatar
      ipv6: Skip XFRM lookup if dst_entry in socket cache is valid · 4d82f395
      Jakub Sitnicki authored
      [ Upstream commit 00bc0ef5 ]
      
      At present we perform an xfrm_lookup() for each UDPv6 message we
      send. The lookup involves querying the flow cache (flow_cache_lookup)
      and, in case of a cache miss, creating an XFRM bundle.
      
      If we miss the flow cache, we can end up creating a new bundle and
      deriving the path MTU (xfrm_init_pmtu) from on an already transformed
      dst_entry, which we pass from the socket cache (sk->sk_dst_cache) down
      to xfrm_lookup(). This can happen only if we're caching the dst_entry
      in the socket, that is when we're using a connected UDP socket.
      
      To put it another way, the path MTU shrinks each time we miss the flow
      cache, which later on leads to incorrectly fragmented payload. It can
      be observed with ESPv6 in transport mode:
      
        1) Set up a transformation and lower the MTU to trigger fragmentation
          # ip xfrm policy add dir out src ::1 dst ::1 \
            tmpl src ::1 dst ::1 proto esp spi 1
          # ip xfrm state add src ::1 dst ::1 \
            proto esp spi 1 enc 'aes' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b
          # ip link set dev lo mtu 1500
      
        2) Monitor the packet flow and set up an UDP sink
          # tcpdump -ni lo -ttt &
          # socat udp6-listen:12345,fork /dev/null &
      
        3) Send a datagram that needs fragmentation with a connected socket
          # perl -e 'print "@" x 1470 | socat - udp6:[::1]:12345
          2016/06/07 18:52:52 socat[724] E read(3, 0x555bb3d5ba00, 8192): Protocol error
          00:00:00.000000 IP6 ::1 > ::1: frag (0|1448) ESP(spi=0x00000001,seq=0x2), length 1448
          00:00:00.000014 IP6 ::1 > ::1: frag (1448|32)
          00:00:00.000050 IP6 ::1 > ::1: ESP(spi=0x00000001,seq=0x3), length 1272
          (^ ICMPv6 Parameter Problem)
          00:00:00.000022 IP6 ::1 > ::1: ESP(spi=0x00000001,seq=0x5), length 136
      
        4) Compare it to a non-connected socket
          # perl -e 'print "@" x 1500' | socat - udp6-sendto:[::1]:12345
          00:00:40.535488 IP6 ::1 > ::1: frag (0|1448) ESP(spi=0x00000001,seq=0x6), length 1448
          00:00:00.000010 IP6 ::1 > ::1: frag (1448|64)
      
      What happens in step (3) is:
      
        1) when connecting the socket in __ip6_datagram_connect(), we
           perform an XFRM lookup, miss the flow cache, create an XFRM
           bundle, and cache the destination,
      
        2) afterwards, when sending the datagram, we perform an XFRM lookup,
           again, miss the flow cache (due to mismatch of flowi6_iif and
           flowi6_oif, which is an issue of its own), and recreate an XFRM
           bundle based on the cached (and already transformed) destination.
      
      To prevent the recreation of an XFRM bundle, avoid an XFRM lookup
      altogether whenever we already have a destination entry cached in the
      socket. This prevents the path MTU shrinkage and brings us on par with
      UDPv4.
      
      The fix also benefits connected PINGv6 sockets, another user of
      ip6_sk_dst_lookup_flow(), who also suffer messages being transformed
      twice.
      
      Joint work with Hannes Frederic Sowa.
      Reported-by: default avatarJan Tluka <jtluka@redhat.com>
      Signed-off-by: default avatarJakub Sitnicki <jkbs@redhat.com>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4d82f395
    • Guillaume Nault's avatar
      l2tp: fix configuration passed to setup_udp_tunnel_sock() · 05cbd46b
      Guillaume Nault authored
      [ Upstream commit a5c5e2da ]
      
      Unused fields of udp_cfg must be all zeros. Otherwise
      setup_udp_tunnel_sock() fills ->gro_receive and ->gro_complete
      callbacks with garbage, eventually resulting in panic when used by
      udp_gro_receive().
      
      [   72.694123] BUG: unable to handle kernel paging request at ffff880033f87d78
      [   72.695518] IP: [<ffff880033f87d78>] 0xffff880033f87d78
      [   72.696530] PGD 26e2067 PUD 26e3067 PMD 342ed063 PTE 8000000033f87163
      [   72.696530] Oops: 0011 [#1] SMP KASAN
      [   72.696530] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pptp gre pppox ppp_generic slhc crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_generic hmac drbg ansi_cprng aesni_intel evdev aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper serio_raw acpi_cpufreq button proc\
      essor ext4 crc16 jbd2 mbcache virtio_blk virtio_net virtio_pci virtio_ring virtio
      [   72.696530] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.7.0-rc1 #1
      [   72.696530] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
      [   72.696530] task: ffff880035b59700 ti: ffff880035b70000 task.ti: ffff880035b70000
      [   72.696530] RIP: 0010:[<ffff880033f87d78>]  [<ffff880033f87d78>] 0xffff880033f87d78
      [   72.696530] RSP: 0018:ffff880035f87bc0  EFLAGS: 00010246
      [   72.696530] RAX: ffffed000698f996 RBX: ffff88003326b840 RCX: ffffffff814cc823
      [   72.696530] RDX: ffff88003326b840 RSI: ffff880033e48038 RDI: ffff880034c7c780
      [   72.696530] RBP: ffff880035f87c18 R08: 000000000000a506 R09: 0000000000000000
      [   72.696530] R10: ffff880035f87b38 R11: ffff880034b9344d R12: 00000000ebfea715
      [   72.696530] R13: 0000000000000000 R14: ffff880034c7c780 R15: 0000000000000000
      [   72.696530] FS:  0000000000000000(0000) GS:ffff880035f80000(0000) knlGS:0000000000000000
      [   72.696530] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   72.696530] CR2: ffff880033f87d78 CR3: 0000000033c98000 CR4: 00000000000406a0
      [   72.696530] Stack:
      [   72.696530]  ffffffff814cc834 ffff880034b93468 0000001481416818 ffff88003326b874
      [   72.696530]  ffff880034c7ccb0 ffff880033e48038 ffff88003326b840 ffff880034b93462
      [   72.696530]  ffff88003326b88a ffff88003326b88c ffff880034b93468 ffff880035f87c70
      [   72.696530] Call Trace:
      [   72.696530]  <IRQ>
      [   72.696530]  [<ffffffff814cc834>] ? udp_gro_receive+0x1c6/0x1f9
      [   72.696530]  [<ffffffff814ccb1c>] udp4_gro_receive+0x2b5/0x310
      [   72.696530]  [<ffffffff814d989b>] inet_gro_receive+0x4a3/0x4cd
      [   72.696530]  [<ffffffff81431b32>] dev_gro_receive+0x584/0x7a3
      [   72.696530]  [<ffffffff810adf7a>] ? __lock_is_held+0x29/0x64
      [   72.696530]  [<ffffffff814321f7>] napi_gro_receive+0x124/0x21d
      [   72.696530]  [<ffffffffa000b145>] virtnet_receive+0x8df/0x8f6 [virtio_net]
      [   72.696530]  [<ffffffffa000b27e>] virtnet_poll+0x1d/0x8d [virtio_net]
      [   72.696530]  [<ffffffff81431350>] net_rx_action+0x15b/0x3b9
      [   72.696530]  [<ffffffff815893d6>] __do_softirq+0x216/0x546
      [   72.696530]  [<ffffffff81062392>] irq_exit+0x49/0xb6
      [   72.696530]  [<ffffffff81588e9a>] do_IRQ+0xe2/0xfa
      [   72.696530]  [<ffffffff81587a49>] common_interrupt+0x89/0x89
      [   72.696530]  <EOI>
      [   72.696530]  [<ffffffff810b05df>] ? trace_hardirqs_on_caller+0x229/0x270
      [   72.696530]  [<ffffffff8102b3c7>] ? default_idle+0x1c/0x2d
      [   72.696530]  [<ffffffff8102b3c5>] ? default_idle+0x1a/0x2d
      [   72.696530]  [<ffffffff8102bb8c>] arch_cpu_idle+0xa/0xc
      [   72.696530]  [<ffffffff810a6c39>] default_idle_call+0x1a/0x1c
      [   72.696530]  [<ffffffff810a6d96>] cpu_startup_entry+0x15b/0x20f
      [   72.696530]  [<ffffffff81039a81>] start_secondary+0x12c/0x133
      [   72.696530] Code: ff ff ff ff ff ff ff ff ff ff 7f ff ff ff ff ff ff ff 7f 00 7e f8 33 00 88 ff ff 6d 61 58 81 ff ff ff ff 5e de 0a 81 ff ff ff ff <00> 5c e2 34 00 88 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00
      [   72.696530] RIP  [<ffff880033f87d78>] 0xffff880033f87d78
      [   72.696530]  RSP <ffff880035f87bc0>
      [   72.696530] CR2: ffff880033f87d78
      [   72.696530] ---[ end trace ad7758b9a1dccf99 ]---
      [   72.696530] Kernel panic - not syncing: Fatal exception in interrupt
      [   72.696530] Kernel Offset: disabled
      [   72.696530] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
      
      v2: use empty initialiser instead of "{ NULL }" to avoid relying on
          first field's type.
      
      Fixes: 38fd2af2 ("udp: Add socket based GRO and config")
      Signed-off-by: default avatarGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      05cbd46b
    • Toshiaki Makita's avatar
      bridge: Don't insert unnecessary local fdb entry on changing mac address · 38f56354
      Toshiaki Makita authored
      [ Upstream commit 0b148def ]
      
      The missing br_vlan_should_use() test caused creation of an unneeded
      local fdb entry on changing mac address of a bridge device when there is
      a vlan which is configured on a bridge port but not on the bridge
      device.
      
      Fixes: 2594e906 ("bridge: vlan: add per-vlan struct and move to rhashtables")
      Signed-off-by: default avatarToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      38f56354
    • Yuchung Cheng's avatar
      tcp: record TLP and ER timer stats in v6 stats · f946ceab
      Yuchung Cheng authored
      [ Upstream commit ce3cf4ec ]
      
      The v6 tcp stats scan do not provide TLP and ER timer information
      correctly like the v4 version . This patch fixes that.
      
      Fixes: 6ba8a3b1 ("tcp: Tail loss probe (TLP)")
      Fixes: eed530b6 ("tcp: early retransmit")
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f946ceab
    • Chen Haiquan's avatar
      vxlan: Accept user specified MTU value when create new vxlan link · 721976e9
      Chen Haiquan authored
      [ Upstream commit ce577668 ]
      
      When create a new vxlan link, example:
        ip link add vtap mtu 1440 type vxlan vni 1 dev eth0
      
      The argument "mtu" has no effect, because it is not set to conf->mtu. The
      default value is used in vxlan_dev_configure function.
      
      This problem was introduced by commit 0dfbdf41 (vxlan: Factor out device
      configuration).
      
      Fixes: 0dfbdf41 (vxlan: Factor out device configuration)
      Signed-off-by: default avatarChen Haiquan <oc@yunify.com>
      Acked-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      721976e9
    • Ivan Vecera's avatar
      team: don't call netdev_change_features under team->lock · 13a055d6
      Ivan Vecera authored
      [ Upstream commit f6988cb6 ]
      
      The team_device_event() notifier calls team_compute_features() to fix
      vlan_features under team->lock to protect team->port_list. The problem is
      that subsequent __team_compute_features() calls netdev_change_features()
      to propagate vlan_features to upper vlan devices while team->lock is still
      taken. This can lead to deadlock when NETIF_F_LRO is modified on lower
      devices or team device itself.
      
      Example:
      The team0 as active backup with eth0 and eth1 NICs. Both eth0 & eth1 are
      LRO capable and LRO is enabled. Thus LRO is also enabled on team0.
      
      The command 'ethtool -K team0 lro off' now hangs due to this deadlock:
      
      dev_ethtool()
      -> ethtool_set_features()
       -> __netdev_update_features(team)
        -> netdev_sync_lower_features()
         -> netdev_update_features(lower_1)
          -> __netdev_update_features(lower_1)
          -> netdev_features_change(lower_1)
           -> call_netdevice_notifiers(...)
            -> team_device_event(lower_1)
             -> team_compute_features(team) [TAKES team->lock]
              -> netdev_change_features(team)
               -> __netdev_update_features(team)
                -> netdev_sync_lower_features()
                 -> netdev_update_features(lower_2)
                  -> __netdev_update_features(lower_2)
                  -> netdev_features_change(lower_2)
                   -> call_netdevice_notifiers(...)
                    -> team_device_event(lower_2)
                     -> team_compute_features(team) [DEADLOCK]
      
      The bug is present in team from the beginning but it appeared after the commit
      fd867d51 (net/core: generic support for disabling netdev features down stack)
      that adds synchronization of features with lower devices.
      
      Fixes: fd867d51 (net/core: generic support for disabling netdev features down stack)
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarIvan Vecera <ivecera@redhat.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      13a055d6
    • Edward Cree's avatar
      sfc: on MC reset, clear PIO buffer linkage in TXQs · 450db517
      Edward Cree authored
      [ Upstream commit c0795bf6 ]
      
      Otherwise, if we fail to allocate new PIO buffers, our TXQs will try to
      use the old ones, which aren't there any more.
      
      Fixes: 183233be "sfc: Allocate and link PIO buffers; map them with write-combining"
      Signed-off-by: default avatarEdward Cree <ecree@solarflare.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      450db517
    • Daniel Borkmann's avatar
      bpf, inode: disallow userns mounts · bfe951d5
      Daniel Borkmann authored
      [ Upstream commit 612bacad ]
      
      Follow-up to commit e27f4a94 ("bpf: Use mount_nodev not mount_ns
      to mount the bpf filesystem"), which removes the FS_USERNS_MOUNT flag.
      
      The original idea was to have a per mountns instance instead of a
      single global fs instance, but that didn't work out and we had to
      switch to mount_nodev() model. The intent of that middle ground was
      that we avoid users who don't play nice to create endless instances
      of bpf fs which are difficult to control and discover from an admin
      point of view, but at the same time it would have allowed us to be
      more flexible with regard to namespaces.
      
      Therefore, since we now did the switch to mount_nodev() as a fix
      where individual instances are created, we also need to remove userns
      mount flag along with it to avoid running into mentioned situation.
      I don't expect any breakage at this early point in time with removing
      the flag and we can revisit this later should the requirement for
      this come up with future users. This and commit e27f4a94 have
      been split to facilitate tracking should any of them run into the
      unlikely case of causing a regression.
      
      Fixes: b2197755 ("bpf: add support for persistent maps/progs")
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bfe951d5
    • Nicolas Dichtel's avatar
      uapi glibc compat: fix compilation when !__USE_MISC in glibc · f5f16bf6
      Nicolas Dichtel authored
      [ Upstream commit f0a3fdca ]
      
      These structures are defined only if __USE_MISC is set in glibc net/if.h
      headers, ie when _BSD_SOURCE or _SVID_SOURCE are defined.
      
      CC: Jan Engelhardt <jengelh@inai.de>
      CC: Josh Boyer <jwboyer@fedoraproject.org>
      CC: Stephen Hemminger <shemming@brocade.com>
      CC: Waldemar Brodkorb <mail@waldemar-brodkorb.de>
      CC: Gabriel Laskar <gabriel@lse.epita.fr>
      CC: Mikko Rapeli <mikko.rapeli@iki.fi>
      Fixes: 4a91cb61 ("uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h")
      Signed-off-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f5f16bf6
    • Hannes Frederic Sowa's avatar
      udp: prevent skbs lingering in tunnel socket queues · ab1f253d
      Hannes Frederic Sowa authored
      [ Upstream commit e5aed006 ]
      
      In case we find a socket with encapsulation enabled we should call
      the encap_recv function even if just a udp header without payload is
      available. The callbacks are responsible for correctly verifying and
      dropping the packets.
      
      Also, in case the header validation fails for geneve and vxlan we
      shouldn't put the skb back into the socket queue, no one will pick
      them up there.  Instead we can simply discard them in the respective
      encap_recv functions.
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ab1f253d
    • Eric W. Biederman's avatar
      bpf: Use mount_nodev not mount_ns to mount the bpf filesystem · 5b7ea922
      Eric W. Biederman authored
      [ Upstream commit e27f4a94 ]
      
      While reviewing the filesystems that set FS_USERNS_MOUNT I spotted the
      bpf filesystem.  Looking at the code I saw a broken usage of mount_ns
      with current->nsproxy->mnt_ns. As the code does not acquire a
      reference to the mount namespace it can not possibly be correct to
      store the mount namespace on the superblock as it does.
      
      Replace mount_ns with mount_nodev so that each mount of the bpf
      filesystem returns a distinct instance, and the code is not buggy.
      
      In discussion with Hannes Frederic Sowa it was reported that the use
      of mount_ns was an attempt to have one bpf instance per mount
      namespace, in an attempt to keep resources that pin resources from
      hiding.  That intent simply does not work, the vfs is not built to
      allow that kind of behavior.  Which means that the bpf filesystem
      really is buggy both semantically and in it's implemenation as it does
      not nor can it implement the original intent.
      
      This change is userspace visible, but my experience with similar
      filesystems leads me to believe nothing will break with a model of each
      mount of the bpf filesystem is distinct from all others.
      
      Fixes: b2197755 ("bpf: add support for persistent maps/progs")
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5b7ea922
    • Jason Wang's avatar
      tuntap: correctly wake up process during uninit · bccd56fa
      Jason Wang authored
      [ Upstream commit addf8fc4 ]
      
      We used to check dev->reg_state against NETREG_REGISTERED after each
      time we are woke up. But after commit 9e641bdc ("net-tun:
      restructure tun_do_read for better sleep/wakeup efficiency"), it uses
      skb_recv_datagram() which does not check dev->reg_state. This will
      result if we delete a tun/tap device after a process is blocked in the
      reading. The device will wait for the reference count which was held
      by that process for ever.
      
      Fixes this by using RCV_SHUTDOWN which will be checked during
      sk_recv_datagram() before trying to wake up the process during uninit.
      
      Fixes: 9e641bdc ("net-tun: restructure tun_do_read for better
      sleep/wakeup efficiency")
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Xi Wang <xii@google.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bccd56fa
    • Jiri Pirko's avatar
      switchdev: pass pointer to fib_info instead of copy · 835d0122
      Jiri Pirko authored
      [ Upstream commit da4ed551 ]
      
      The problem is that fib_info->nh is [0] so the struct fib_info
      allocation size depends on number of nexthops. If we just copy fib_info,
      we do not copy the nexthops info and driver accesses memory which is not
      ours.
      
      Given the fact that fib4 does not defer operations and therefore it does
      not need copy, just pass the pointer down to drivers as it was done
      before.
      
      Fixes: 850d0cbc ("switchdev: remove pointers from switchdev objects")
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      835d0122
    • Richard Alpe's avatar
      tipc: fix nametable publication field in nl compat · 6a58f3e1
      Richard Alpe authored
      [ Upstream commit 03aaaa9b ]
      
      The publication field of the old netlink API should contain the
      publication key and not the publication reference.
      
      Fixes: 44a8ae94 (tipc: convert legacy nl name table dump to nl compat)
      Signed-off-by: default avatarRichard Alpe <richard.alpe@ericsson.com>
      Acked-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6a58f3e1
    • Herbert Xu's avatar
      netlink: Fix dump skb leak/double free · 49543942
      Herbert Xu authored
      [ Upstream commit 92964c79 ]
      
      When we free cb->skb after a dump, we do it after releasing the
      lock.  This means that a new dump could have started in the time
      being and we'll end up freeing their skb instead of ours.
      
      This patch saves the skb and module before we unlock so we free
      the right memory.
      
      Fixes: 16b304f3 ("netlink: Eliminate kmalloc in netlink dump operation.")
      Reported-by: default avatarBaozeng Ding <sploving1@gmail.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Acked-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      49543942
    • Richard Alpe's avatar
      tipc: check nl sock before parsing nested attributes · 23cdd8c3
      Richard Alpe authored
      [ Upstream commit 45e093ae ]
      
      Make sure the socket for which the user is listing publication exists
      before parsing the socket netlink attributes.
      
      Prior to this patch a call without any socket caused a NULL pointer
      dereference in tipc_nl_publ_dump().
      Tested-and-reported-by: default avatarBaozeng Ding <sploving1@gmail.com>
      Signed-off-by: default avatarRichard Alpe <richard.alpe@ericsson.com>
      Acked-by: default avatarJon Maloy <jon.maloy@ericsson.cm>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      23cdd8c3
    • Ewan D. Milne's avatar
      scsi: Add QEMU CD-ROM to VPD Inquiry Blacklist · c54c115d
      Ewan D. Milne authored
      commit fbd83006 upstream.
      
      Linux fails to boot as a guest with a QEMU CD-ROM:
      
      [    4.439488] ata2.00: ATAPI: QEMU CD-ROM, 0.8.2, max UDMA/100
      [    4.443649] ata2.00: configured for MWDMA2
      [    4.450267] scsi 1:0:0:0: CD-ROM            QEMU     QEMU CD-ROM      0.8. PQ: 0 ANSI: 5
      [    4.464317] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
      [    4.464319] ata2.00: BMDMA stat 0x5
      [    4.464339] ata2.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 0 dma 16640 in
      [    4.464339]          Inquiry 12 01 00 00 ff 00res 48/20:02:00:24:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
      [    4.464341] ata2.00: status: { DRDY DRQ }
      [    4.465864] ata2: soft resetting link
      [    4.625971] ata2.00: configured for MWDMA2
      [    4.628290] ata2: EH complete
      [    4.646670] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
      [    4.646671] ata2.00: BMDMA stat 0x5
      [    4.646683] ata2.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 0 dma 16640 in
      [    4.646683]          Inquiry 12 01 00 00 ff 00res 48/20:02:00:24:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
      [    4.646685] ata2.00: status: { DRDY DRQ }
      [    4.648193] ata2: soft resetting link
      
      ...
      
      Fix this by suppressing VPD inquiry for this device.
      Signed-off-by: default avatarEwan D. Milne <emilne@redhat.com>
      Reported-by: default avatarJan Stancek <jstancek@redhat.com>
      Tested-by: default avatarJan Stancek <jstancek@redhat.com>
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c54c115d
    • James Bottomley's avatar
      scsi_lib: correctly retry failed zero length REQ_TYPE_FS commands · 0dec8c0d
      James Bottomley authored
      commit a621bac3 upstream.
      
      When SCSI was written, all commands coming from the filesystem
      (REQ_TYPE_FS commands) had data.  This meant that our signal for needing
      to complete the command was the number of bytes completed being equal to
      the number of bytes in the request.  Unfortunately, with the advent of
      flush barriers, we can now get zero length REQ_TYPE_FS commands, which
      confuse this logic because they satisfy the condition every time.  This
      means they never get retried even for retryable conditions, like UNIT
      ATTENTION because we complete them early assuming they're done.  Fix
      this by special casing the early completion condition to recognise zero
      length commands with errors and let them drop through to the retry code.
      Reported-by: default avatarSebastian Parschauer <s.parschauer@gmx.de>
      Signed-off-by: default avatarJames E.J. Bottomley <jejb@linux.vnet.ibm.com>
      Tested-by: default avatarJack Wang <jinpu.wang@profitbricks.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0dec8c0d
  2. 08 Jun, 2016 15 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.4.13 · ba760d43
      Greg Kroah-Hartman authored
      ba760d43
    • Dave Chinner's avatar
      xfs: handle dquot buffer readahead in log recovery correctly · 55f6ddfc
      Dave Chinner authored
      commit 7d6a13f0 upstream.
      
      When we do dquot readahead in log recovery, we do not use a verifier
      as the underlying buffer may not have dquots in it. e.g. the
      allocation operation hasn't yet been replayed. Hence we do not want
      to fail recovery because we detect an operation to be replayed has
      not been run yet. This problem was addressed for inodes in commit
      d8914002 ("xfs: inode buffers may not be valid during recovery
      readahead") but the problem was not recognised to exist for dquots
      and their buffers as the dquot readahead did not have a verifier.
      
      The result of not using a verifier is that when the buffer is then
      next read to replay a dquot modification, the dquot buffer verifier
      will only be attached to the buffer if *readahead is not complete*.
      Hence we can read the buffer, replay the dquot changes and then add
      it to the delwri submission list without it having a verifier
      attached to it. This then generates warnings in xfs_buf_ioapply(),
      which catches and warns about this case.
      
      Fix this and make it handle the same readahead verifier error cases
      as for inode buffers by adding a new readahead verifier that has a
      write operation as well as a read operation that marks the buffer as
      not done if any corruption is detected.  Also make sure we don't run
      readahead if the dquot buffer has been marked as cancelled by
      recovery.
      
      This will result in readahead either succeeding and the buffer
      having a valid write verifier, or readahead failing and the buffer
      state requiring the subsequent read to resubmit the IO with the new
      verifier.  In either case, this will result in the buffer always
      ending up with a valid write verifier on it.
      
      Note: we also need to fix the inode buffer readahead error handling
      to mark the buffer with EIO. Brian noticed the code I copied from
      there wrong during review, so fix it at the same time. Add comments
      linking the two functions that handle readahead verifier errors
      together so we don't forget this behavioural link in future.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      55f6ddfc
    • Eric Sandeen's avatar
      xfs: print name of verifier if it fails · 063b0dc8
      Eric Sandeen authored
      commit 233135b7 upstream.
      
      This adds a name to each buf_ops structure, so that if
      a verifier fails we can print the type of verifier that
      failed it.  Should be a slight debugging aid, I hope.
      Signed-off-by: default avatarEric Sandeen <sandeen@redhat.com>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Cc: Holger Hoffstätte <holger@applied-asynchrony.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      063b0dc8
    • Dave Chinner's avatar
      xfs: skip stale inodes in xfs_iflush_cluster · 21cfd6cc
      Dave Chinner authored
      commit 7d3aa7fe upstream.
      
      We don't write back stale inodes so we should skip them in
      xfs_iflush_cluster, too.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      21cfd6cc
    • Dave Chinner's avatar
      xfs: fix inode validity check in xfs_iflush_cluster · baa7a74d
      Dave Chinner authored
      commit 51b07f30 upstream.
      
      Some careless idiot(*) wrote crap code in commit 1a3e8f3d ("xfs:
      convert inode cache lookups to use RCU locking") back in late 2010,
      and so xfs_iflush_cluster checks the wrong inode for whether it is
      still valid under RCU protection. Fix it to lock and check the
      correct inode.
      
      (*) Careless-idiot: Dave Chinner <dchinner@redhat.com>
      Discovered-by: default avatarBrain Foster <bfoster@redhat.com>
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      baa7a74d
    • Dave Chinner's avatar
      xfs: xfs_iflush_cluster fails to abort on error · 7dc8f21b
      Dave Chinner authored
      commit b1438f47 upstream.
      
      When a failure due to an inode buffer occurs, the error handling
      fails to abort the inode writeback correctly. This can result in the
      inode being reclaimed whilst still in the AIL, leading to
      use-after-free situations as well as filesystems that cannot be
      unmounted as the inode log items left in the AIL never get removed.
      
      Fix this by ensuring fatal errors from xfs_imap_to_bp() result in
      the inode flush being aborted correctly.
      Reported-by: default avatarShyam Kaushik <shyam@zadarastorage.com>
      Diagnosed-by: default avatarShyam Kaushik <shyam@zadarastorage.com>
      Tested-by: default avatarShyam Kaushik <shyam@zadarastorage.com>
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7dc8f21b
    • Dave Chinner's avatar
      xfs: Don't wrap growfs AGFL indexes · d7d92ca7
      Dave Chinner authored
      commit ad747e3b upstream.
      
      Commit 96f859d5 ("libxfs: pack the agfl header structure so
      XFS_AGFL_SIZE is correct") allowed the freelist to use the empty
      slot at the end of the freelist on 64 bit systems that was not
      being used due to sizeof() rounding up the structure size.
      
      This has caused versions of xfs_repair prior to 4.5.0 (which also
      has the fix) to report this as a corruption once the filesystem has
      been grown. Older kernels can also have problems (seen from a whacky
      container/vm management environment) mounting filesystems grown on a
      system with a newer kernel than the vm/container it is deployed on.
      
      To avoid this problem, change the initial free list indexes not to
      wrap across the end of the AGFL, hence avoiding the initialisation
      of agf_fllast to the last index in the AGFL.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarCarlos Maiolino <cmaiolino@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d7d92ca7
    • Eric Sandeen's avatar
      xfs: disallow rw remount on fs with unknown ro-compat features · ec86bfec
      Eric Sandeen authored
      commit d0a58e83 upstream.
      
      Today, a kernel which refuses to mount a filesystem read-write
      due to unknown ro-compat features can still transition to read-write
      via the remount path.  The old kernel is most likely none the wiser,
      because it's unaware of the new feature, and isn't using it.  However,
      writing to the filesystem may well corrupt metadata related to that
      new feature, and moving to a newer kernel which understand the feature
      will have problems.
      
      Right now the only ro-compat feature we have is the free inode btree,
      which showed up in v3.16.  It would be good to push this back to
      all the active stable kernels, I think, so that if anyone is using
      newer mkfs (which enables the finobt feature) with older kernel
      releases, they'll be protected.
      Signed-off-by: default avatarEric Sandeen <sandeen@redhat.com>
      Reviewed-by: default avatarBill O'Donnell <billodo@redhat.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ec86bfec
    • Arnd Bergmann's avatar
      gcov: disable tree-loop-im to reduce stack usage · 8edc7f04
      Arnd Bergmann authored
      commit c87bf431 upstream.
      
      Enabling CONFIG_GCOV_PROFILE_ALL produces us a lot of warnings like
      
      lib/lz4/lz4hc_compress.c: In function 'lz4_compresshcctx':
      lib/lz4/lz4hc_compress.c:514:1: warning: the frame size of 1504 bytes is larger than 1024 bytes [-Wframe-larger-than=]
      
      After some investigation, I found that this behavior started with gcc-4.9,
      and opened https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69702.
      A suggested workaround for it is to use the -fno-tree-loop-im
      flag that turns off one of the optimization stages in gcc, so the
      code runs a little slower but does not use excessive amounts
      of stack.
      
      We could make this conditional on the gcc version, but I could not
      find an easy way to do this in Kbuild and the benefit would be
      fairly small, given that most of the gcc version in production are
      affected now.
      
      I'm marking this for 'stable' backports because it addresses a bug
      with code generation in gcc that exists in all kernel versions
      with the affected gcc releases.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarPeter Oberparleiter <oberpar@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8edc7f04
    • Srinivas Pandruvada's avatar
      scripts/package/Makefile: rpmbuild add support of RPMOPTS · 4b2fb176
      Srinivas Pandruvada authored
      commit 65a9f31c upstream.
      
      After commit 21a59991 ("scripts/package/Makefile: rpmbuild is needed
      for rpm targets"), it is no longer possible to specify RPMOPTS.
      For example, we can no longer able to control _topdir using the following
      make command.
      make RPMOPTS="--define '_topdir /home/xyz/workspace/'" binrpm-pkg
      
      Fixes: 21a59991 ("scripts/package/Makefile: rpmbuild is needed for rpm targets")
      Signed-off-by: default avatarSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4b2fb176
    • Ville Syrjälä's avatar
      dma-debug: avoid spinlock recursion when disabling dma-debug · 7d0b4945
      Ville Syrjälä authored
      commit 3017cd63 upstream.
      
      With netconsole (at least) the pr_err("...  disablingn") call can
      recurse back into the dma-debug code, where it'll try to grab
      free_entries_lock again.  Avoid the problem by doing the printk after
      dropping the lock.
      
      Link: http://lkml.kernel.org/r/1463678421-18683-1-git-send-email-ville.syrjala@linux.intel.comSigned-off-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7d0b4945
    • Rafael J. Wysocki's avatar
      PM / sleep: Handle failures in device_suspend_late() consistently · 98c28450
      Rafael J. Wysocki authored
      commit 3a17fb32 upstream.
      
      Grygorii Strashko reports:
      
       The PM runtime will be left disabled for the device if its
       .suspend_late() callback fails and async suspend is not allowed
       for this device. In this case device will not be added in
       dpm_late_early_list and dpm_resume_early() will ignore this
       device, as result PM runtime will be disabled for it forever
       (side effect: after 8 subsequent failures for the same device
       the PM runtime will be reenabled due to disable_depth overflow).
      
      To fix this problem, add devices to dpm_late_early_list regardless
      of whether or not device_suspend_late() returns errors for them.
      
      That will ensure failures in there to be handled consistently for
      all devices regardless of their async suspend/resume status.
      Reported-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Tested-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      98c28450
    • Nicolai Stange's avatar
      ext4: silence UBSAN in ext4_mb_init() · 8b8de1c9
      Nicolai Stange authored
      commit 935244cd upstream.
      
      Currently, in ext4_mb_init(), there's a loop like the following:
      
        do {
          ...
          offset += 1 << (sb->s_blocksize_bits - i);
          i++;
        } while (i <= sb->s_blocksize_bits + 1);
      
      Note that the updated offset is used in the loop's next iteration only.
      
      However, at the last iteration, that is at i == sb->s_blocksize_bits + 1,
      the shift count becomes equal to (unsigned)-1 > 31 (c.f. C99 6.5.7(3))
      and UBSAN reports
      
        UBSAN: Undefined behaviour in fs/ext4/mballoc.c:2621:15
        shift exponent 4294967295 is too large for 32-bit type 'int'
        [...]
        Call Trace:
         [<ffffffff818c4d25>] dump_stack+0xbc/0x117
         [<ffffffff818c4c69>] ? _atomic_dec_and_lock+0x169/0x169
         [<ffffffff819411ab>] ubsan_epilogue+0xd/0x4e
         [<ffffffff81941cac>] __ubsan_handle_shift_out_of_bounds+0x1fb/0x254
         [<ffffffff81941ab1>] ? __ubsan_handle_load_invalid_value+0x158/0x158
         [<ffffffff814b6dc1>] ? kmem_cache_alloc+0x101/0x390
         [<ffffffff816fc13b>] ? ext4_mb_init+0x13b/0xfd0
         [<ffffffff814293c7>] ? create_cache+0x57/0x1f0
         [<ffffffff8142948a>] ? create_cache+0x11a/0x1f0
         [<ffffffff821c2168>] ? mutex_lock+0x38/0x60
         [<ffffffff821c23ab>] ? mutex_unlock+0x1b/0x50
         [<ffffffff814c26ab>] ? put_online_mems+0x5b/0xc0
         [<ffffffff81429677>] ? kmem_cache_create+0x117/0x2c0
         [<ffffffff816fcc49>] ext4_mb_init+0xc49/0xfd0
         [...]
      
      Observe that the mentioned shift exponent, 4294967295, equals (unsigned)-1.
      
      Unless compilers start to do some fancy transformations (which at least
      GCC 6.0.0 doesn't currently do), the issue is of cosmetic nature only: the
      such calculated value of offset is never used again.
      
      Silence UBSAN by introducing another variable, offset_incr, holding the
      next increment to apply to offset and adjust that one by right shifting it
      by one position per loop iteration.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=114701
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112161Signed-off-by: default avatarNicolai Stange <nicstange@gmail.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8b8de1c9
    • Nicolai Stange's avatar
      ext4: address UBSAN warning in mb_find_order_for_block() · 12aa7d95
      Nicolai Stange authored
      commit b5cb316c upstream.
      
      Currently, in mb_find_order_for_block(), there's a loop like the following:
      
        while (order <= e4b->bd_blkbits + 1) {
          ...
          bb += 1 << (e4b->bd_blkbits - order);
        }
      
      Note that the updated bb is used in the loop's next iteration only.
      
      However, at the last iteration, that is at order == e4b->bd_blkbits + 1,
      the shift count becomes negative (c.f. C99 6.5.7(3)) and UBSAN reports
      
        UBSAN: Undefined behaviour in fs/ext4/mballoc.c:1281:11
        shift exponent -1 is negative
        [...]
        Call Trace:
         [<ffffffff818c4d35>] dump_stack+0xbc/0x117
         [<ffffffff818c4c79>] ? _atomic_dec_and_lock+0x169/0x169
         [<ffffffff819411bb>] ubsan_epilogue+0xd/0x4e
         [<ffffffff81941cbc>] __ubsan_handle_shift_out_of_bounds+0x1fb/0x254
         [<ffffffff81941ac1>] ? __ubsan_handle_load_invalid_value+0x158/0x158
         [<ffffffff816e93a0>] ? ext4_mb_generate_from_pa+0x590/0x590
         [<ffffffff816502c8>] ? ext4_read_block_bitmap_nowait+0x598/0xe80
         [<ffffffff816e7b7e>] mb_find_order_for_block+0x1ce/0x240
         [...]
      
      Unless compilers start to do some fancy transformations (which at least
      GCC 6.0.0 doesn't currently do), the issue is of cosmetic nature only: the
      such calculated value of bb is never used again.
      
      Silence UBSAN by introducing another variable, bb_incr, holding the next
      increment to apply to bb and adjust that one by right shifting it by one
      position per loop iteration.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=114701
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112161Signed-off-by: default avatarNicolai Stange <nicstange@gmail.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      12aa7d95
    • Jan Kara's avatar
      ext4: fix oops on corrupted filesystem · b2601bb0
      Jan Kara authored
      commit 74177f55 upstream.
      
      When filesystem is corrupted in the right way, it can happen
      ext4_mark_iloc_dirty() in ext4_orphan_add() returns error and we
      subsequently remove inode from the in-memory orphan list. However this
      deletion is done with list_del(&EXT4_I(inode)->i_orphan) and thus we
      leave i_orphan list_head with a stale content. Later we can look at this
      content causing list corruption, oops, or other issues. The reported
      trace looked like:
      
      WARNING: CPU: 0 PID: 46 at lib/list_debug.c:53 __list_del_entry+0x6b/0x100()
      list_del corruption, 0000000061c1d6e0->next is LIST_POISON1
      0000000000100100)
      CPU: 0 PID: 46 Comm: ext4.exe Not tainted 4.1.0-rc4+ #250
      Stack:
       60462947 62219960 602ede24 62219960
       602ede24 603ca293 622198f0 602f02eb
       62219950 6002c12c 62219900 601b4d6b
      Call Trace:
       [<6005769c>] ? vprintk_emit+0x2dc/0x5c0
       [<602ede24>] ? printk+0x0/0x94
       [<600190bc>] show_stack+0xdc/0x1a0
       [<602ede24>] ? printk+0x0/0x94
       [<602ede24>] ? printk+0x0/0x94
       [<602f02eb>] dump_stack+0x2a/0x2c
       [<6002c12c>] warn_slowpath_common+0x9c/0xf0
       [<601b4d6b>] ? __list_del_entry+0x6b/0x100
       [<6002c254>] warn_slowpath_fmt+0x94/0xa0
       [<602f4d09>] ? __mutex_lock_slowpath+0x239/0x3a0
       [<6002c1c0>] ? warn_slowpath_fmt+0x0/0xa0
       [<60023ebf>] ? set_signals+0x3f/0x50
       [<600a205a>] ? kmem_cache_free+0x10a/0x180
       [<602f4e88>] ? mutex_lock+0x18/0x30
       [<601b4d6b>] __list_del_entry+0x6b/0x100
       [<601177ec>] ext4_orphan_del+0x22c/0x2f0
       [<6012f27c>] ? __ext4_journal_start_sb+0x2c/0xa0
       [<6010b973>] ? ext4_truncate+0x383/0x390
       [<6010bc8b>] ext4_write_begin+0x30b/0x4b0
       [<6001bb50>] ? copy_from_user+0x0/0xb0
       [<601aa840>] ? iov_iter_fault_in_readable+0xa0/0xc0
       [<60072c4f>] generic_perform_write+0xaf/0x1e0
       [<600c4166>] ? file_update_time+0x46/0x110
       [<60072f0f>] __generic_file_write_iter+0x18f/0x1b0
       [<6010030f>] ext4_file_write_iter+0x15f/0x470
       [<60094e10>] ? unlink_file_vma+0x0/0x70
       [<6009b180>] ? unlink_anon_vmas+0x0/0x260
       [<6008f169>] ? free_pgtables+0xb9/0x100
       [<600a6030>] __vfs_write+0xb0/0x130
       [<600a61d5>] vfs_write+0xa5/0x170
       [<600a63d6>] SyS_write+0x56/0xe0
       [<6029fcb0>] ? __libc_waitpid+0x0/0xa0
       [<6001b698>] handle_syscall+0x68/0x90
       [<6002633d>] userspace+0x4fd/0x600
       [<6002274f>] ? save_registers+0x1f/0x40
       [<60028bd7>] ? arch_prctl+0x177/0x1b0
       [<60017bd5>] fork_handler+0x85/0x90
      
      Fix the problem by using list_del_init() as we always should with
      i_orphan list.
      Reported-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2601bb0