1. 03 Mar, 2014 2 commits
  2. 01 Mar, 2014 3 commits
    • Filipe David Borba Manana's avatar
      Btrfs: fix data corruption when reading/updating compressed extents · 383728be
      Filipe David Borba Manana authored
      commit a2aa75e1 upstream.
      
      When using a mix of compressed file extents and prealloc extents, it
      is possible to fill a page of a file with random, garbage data from
      some unrelated previous use of the page, instead of a sequence of zeroes.
      
      A simple sequence of steps to get into such case, taken from the test
      case I made for xfstests, is:
      
         _scratch_mkfs
         _scratch_mount "-o compress-force=lzo"
         $XFS_IO_PROG -f -c "pwrite -S 0x06 -b 18670 266978 18670" $SCRATCH_MNT/foobar
         $XFS_IO_PROG -c "falloc 26450 665194" $SCRATCH_MNT/foobar
         $XFS_IO_PROG -c "truncate 542872" $SCRATCH_MNT/foobar
         $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar
      
      This results in the following file items in the fs tree:
      
         item 4 key (257 INODE_ITEM 0) itemoff 15879 itemsize 160
             inode generation 6 transid 6 size 542872 block group 0 mode 100600
         item 5 key (257 INODE_REF 256) itemoff 15863 itemsize 16
             inode ref index 2 namelen 6 name: foobar
         item 6 key (257 EXTENT_DATA 0) itemoff 15810 itemsize 53
             extent data disk byte 0 nr 0 gen 6
             extent data offset 0 nr 24576 ram 266240
             extent compression 0
         item 7 key (257 EXTENT_DATA 24576) itemoff 15757 itemsize 53
             prealloc data disk byte 12849152 nr 241664 gen 6
             prealloc data offset 0 nr 241664
         item 8 key (257 EXTENT_DATA 266240) itemoff 15704 itemsize 53
             extent data disk byte 12845056 nr 4096 gen 6
             extent data offset 0 nr 20480 ram 20480
             extent compression 2
         item 9 key (257 EXTENT_DATA 286720) itemoff 15651 itemsize 53
             prealloc data disk byte 13090816 nr 405504 gen 6
             prealloc data offset 0 nr 258048
      
      The on disk extent at offset 266240 (which corresponds to 1 single disk block),
      contains 5 compressed chunks of file data. Each of the first 4 compress 4096
      bytes of file data, while the last one only compresses 3024 bytes of file data.
      Therefore a read into the file region [285648 ; 286720[ (length = 4096 - 3024 =
      1072 bytes) should always return zeroes (our next extent is a prealloc one).
      
      The solution here is the compression code path to zero the remaining (untouched)
      bytes of the last page it uncompressed data into, as the information about how
      much space the file data consumes in the last page is not known in the upper layer
      fs/btrfs/extent_io.c:__do_readpage(). In __do_readpage we were correctly zeroing
      the remainder of the page but only if it corresponds to the last page of the inode
      and if the inode's size is not a multiple of the page size.
      
      This would cause not only returning random data on reads, but also permanently
      storing random data when updating parts of the region that should be zeroed.
      For the example above, it means updating a single byte in the region [285648 ; 286720[
      would store that byte correctly but also store random data on disk.
      
      A test case for xfstests follows soon.
      Signed-off-by: default avatarFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      383728be
    • Filipe David Borba Manana's avatar
      Btrfs: fix tree mod logging · 3939448a
      Filipe David Borba Manana authored
      commit 5de865ee upstream.
      
      While running the test btrfs/004 from xfstests in a loop, it failed
      about 1 time out of 20 runs in my desktop. The failure happened in
      the backref walking part of the test, and the test's error message was
      like this:
      
        btrfs/004 93s ... [failed, exit status 1] - output mismatch (see /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad)
            --- tests/btrfs/004.out	2013-11-26 18:25:29.263333714 +0000
            +++ /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad	2013-12-10 15:25:10.327518516 +0000
            @@ -1,3 +1,8 @@
             QA output created by 004
             *** test backref walking
            -*** done
            +unexpected output from
            +	/home/fdmanana/git/hub/btrfs-progs/btrfs inspect-internal logical-resolve -P 141512704 /home/fdmanana/btrfs-tests/scratch_1
            +expected inum: 405, expected address: 454656, file: /home/fdmanana/btrfs-tests/scratch_1/snap1/p0/d6/d3d/d156/fce, got:
            +
             ...
             (Run 'diff -u tests/btrfs/004.out /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad' to see the entire diff)
        Ran: btrfs/004
        Failures: btrfs/004
        Failed 1 of 1 tests
      
      But immediately after the test finished, the btrfs inspect-internal command
      returned the expected output:
      
        $ btrfs inspect-internal logical-resolve -P 141512704 /home/fdmanana/btrfs-tests/scratch_1
        inode 405 offset 454656 root 258
        inode 405 offset 454656 root 5
      
      It turned out this was because the btrfs_search_old_slot() calls performed
      during backref walking (backref.c:__resolve_indirect_ref) were not finding
      anything. The reason for this turned out to be that the tree mod logging
      code was not logging some node multi-step operations atomically, therefore
      btrfs_search_old_slot() callers iterated often over an incomplete tree that
      wasn't fully consistent with any tree state from the past. Besides missing
      items, this often (but not always) resulted in -EIO errors during old slot
      searches, reported in dmesg like this:
      
      [ 4299.933936] ------------[ cut here ]------------
      [ 4299.933949] WARNING: CPU: 0 PID: 23190 at fs/btrfs/ctree.c:1343 btrfs_search_old_slot+0x57b/0xab0 [btrfs]()
      [ 4299.933950] Modules linked in: btrfs raid6_pq xor pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) bnep rfcomm bluetooth parport_pc ppdev binfmt_misc joydev snd_hda_codec_h
      [ 4299.933977] CPU: 0 PID: 23190 Comm: btrfs Tainted: G        W  O 3.12.0-fdm-btrfs-next-16+ #70
      [ 4299.933978] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z77 Pro4, BIOS P1.50 09/04/2012
      [ 4299.933979]  000000000000053f ffff8806f3fd98f8 ffffffff8176d284 0000000000000007
      [ 4299.933982]  0000000000000000 ffff8806f3fd9938 ffffffff8104a81c ffff880659c64b70
      [ 4299.933984]  ffff880659c643d0 ffff8806599233d8 ffff880701e2e938 0000160000000000
      [ 4299.933987] Call Trace:
      [ 4299.933991]  [<ffffffff8176d284>] dump_stack+0x55/0x76
      [ 4299.933994]  [<ffffffff8104a81c>] warn_slowpath_common+0x8c/0xc0
      [ 4299.933997]  [<ffffffff8104a86a>] warn_slowpath_null+0x1a/0x20
      [ 4299.934003]  [<ffffffffa065d3bb>] btrfs_search_old_slot+0x57b/0xab0 [btrfs]
      [ 4299.934005]  [<ffffffff81775f3b>] ? _raw_read_unlock+0x2b/0x50
      [ 4299.934010]  [<ffffffffa0655001>] ? __tree_mod_log_search+0x81/0xc0 [btrfs]
      [ 4299.934019]  [<ffffffffa06dd9b0>] __resolve_indirect_refs+0x130/0x5f0 [btrfs]
      [ 4299.934027]  [<ffffffffa06a21f1>] ? free_extent_buffer+0x61/0xc0 [btrfs]
      [ 4299.934034]  [<ffffffffa06de39c>] find_parent_nodes+0x1fc/0xe40 [btrfs]
      [ 4299.934042]  [<ffffffffa06b13e0>] ? defrag_lookup_extent+0xe0/0xe0 [btrfs]
      [ 4299.934048]  [<ffffffffa06b13e0>] ? defrag_lookup_extent+0xe0/0xe0 [btrfs]
      [ 4299.934056]  [<ffffffffa06df980>] iterate_extent_inodes+0xe0/0x250 [btrfs]
      [ 4299.934058]  [<ffffffff817762db>] ? _raw_spin_unlock+0x2b/0x50
      [ 4299.934065]  [<ffffffffa06dfb82>] iterate_inodes_from_logical+0x92/0xb0 [btrfs]
      [ 4299.934071]  [<ffffffffa06b13e0>] ? defrag_lookup_extent+0xe0/0xe0 [btrfs]
      [ 4299.934078]  [<ffffffffa06b7015>] btrfs_ioctl+0xf65/0x1f60 [btrfs]
      [ 4299.934080]  [<ffffffff811658b8>] ? handle_mm_fault+0x278/0xb00
      [ 4299.934083]  [<ffffffff81075563>] ? up_read+0x23/0x40
      [ 4299.934085]  [<ffffffff8177a41c>] ? __do_page_fault+0x20c/0x5a0
      [ 4299.934088]  [<ffffffff811b2946>] do_vfs_ioctl+0x96/0x570
      [ 4299.934090]  [<ffffffff81776e23>] ? error_sti+0x5/0x6
      [ 4299.934093]  [<ffffffff810b71e8>] ? trace_hardirqs_off_caller+0x28/0xd0
      [ 4299.934096]  [<ffffffff81776a09>] ? retint_swapgs+0xe/0x13
      [ 4299.934098]  [<ffffffff811b2eb1>] SyS_ioctl+0x91/0xb0
      [ 4299.934100]  [<ffffffff813eecde>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [ 4299.934102]  [<ffffffff8177ef12>] system_call_fastpath+0x16/0x1b
      [ 4299.934102]  [<ffffffff8177ef12>] system_call_fastpath+0x16/0x1b
      [ 4299.934104] ---[ end trace 48f0cfc902491414 ]---
      [ 4299.934378] btrfs bad fsid on block 0
      
      These tree mod log operations that must be performed atomically, tree_mod_log_free_eb,
      tree_mod_log_eb_copy, tree_mod_log_insert_root and tree_mod_log_insert_move, used to
      be performed atomically before the following commit:
      
        c8cc6341
        (Btrfs: stop using GFP_ATOMIC for the tree mod log allocations)
      
      That change removed the atomicity of such operations. This patch restores the
      atomicity while still not doing the GFP_ATOMIC allocations of tree_mod_elem
      structures, so it has to do the allocations using GFP_NOFS before acquiring
      the mod log lock.
      
      This issue has been experienced by several users recently, such as for example:
      
        http://www.spinics.net/lists/linux-btrfs/msg28574.html
      
      After running the btrfs/004 test for 679 consecutive iterations with this
      patch applied, I didn't ran into the issue anymore.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3939448a
    • Filipe David Borba Manana's avatar
      Btrfs: return immediately if tree log mod is not necessary · 4e6e3bd4
      Filipe David Borba Manana authored
      commit 78357766 upstream.
      
      In ctree.c:tree_mod_log_set_node_key() we were calling
      __tree_mod_log_insert_key() even when the modification doesn't need
      to be logged. This would allocate a tree_mod_elem structure, fill it
      and pass it to  __tree_mod_log_insert(), which would just acquire
      the tree mod log write lock and then free the tree_mod_elem structure
      and return (that is, a no-op).
      
      Therefore call tree_mod_log_insert() instead of __tree_mod_log_insert()
      which just returns immediately if the modification doesn't need to be
      logged (without allocating the structure, fill it, acquire write lock,
      free structure).
      Signed-off-by: default avatarFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      4e6e3bd4
  3. 26 Feb, 2014 21 commits
    • Knut Petersen's avatar
      perf: Enforce 1 as lower limit for perf_event_max_sample_rate · 2b183ff2
      Knut Petersen authored
      commit 723478c8 upstream.
      
      /proc/sys/kernel/perf_event_max_sample_rate will accept
      negative values as well as 0.
      
      Negative values are unreasonable, and 0 causes a
      divide by zero exception in perf_proc_update_handler.
      
      This patch enforces a lower limit of 1.
      Signed-off-by: default avatarKnut Petersen <Knut_Petersen@t-online.de>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/5242DB0C.4070005@t-online.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2b183ff2
    • Eric Dumazet's avatar
      net: use __GFP_NORETRY for high order allocations · 454d0fc5
      Eric Dumazet authored
      [ Upstream commit ed98df33 ]
      
      sock_alloc_send_pskb() & sk_page_frag_refill()
      have a loop trying high order allocations to prepare
      skb with low number of fragments as this increases performance.
      
      Problem is that under memory pressure/fragmentation, this can
      trigger OOM while the intent was only to try the high order
      allocations, then fallback to order-0 allocations.
      
      We had various reports from unexpected regressions.
      
      According to David, setting __GFP_NORETRY should be fine,
      as the asynchronous compaction is still enabled, and this
      will prevent OOM from kicking as in :
      
      CFSClientEventm invoked oom-killer: gfp_mask=0x42d0, order=3, oom_adj=0,
      oom_score_adj=0, oom_score_badness=2 (enabled),memcg_scoring=disabled
      CFSClientEventm
      
      Call Trace:
       [<ffffffff8043766c>] dump_header+0xe1/0x23e
       [<ffffffff80437a02>] oom_kill_process+0x6a/0x323
       [<ffffffff80438443>] out_of_memory+0x4b3/0x50d
       [<ffffffff8043a4a6>] __alloc_pages_may_oom+0xa2/0xc7
       [<ffffffff80236f42>] __alloc_pages_nodemask+0x1002/0x17f0
       [<ffffffff8024bd23>] alloc_pages_current+0x103/0x2b0
       [<ffffffff8028567f>] sk_page_frag_refill+0x8f/0x160
       [<ffffffff80295fa0>] tcp_sendmsg+0x560/0xee0
       [<ffffffff802a5037>] inet_sendmsg+0x67/0x100
       [<ffffffff80283c9c>] __sock_sendmsg_nosec+0x6c/0x90
       [<ffffffff80283e85>] sock_sendmsg+0xc5/0xf0
       [<ffffffff802847b6>] __sys_sendmsg+0x136/0x430
       [<ffffffff80284ec8>] sys_sendmsg+0x88/0x110
       [<ffffffff80711472>] system_call_fastpath+0x16/0x1b
      Out of Memory: Kill process 2856 (bash) score 9999 or sacrifice child
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Acked-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      454d0fc5
    • Florian Westphal's avatar
      net: ip, ipv6: handle gso skbs in forwarding path · 91e97e57
      Florian Westphal authored
      commit fe6cc55f upstream.
      
      Marcelo Ricardo Leitner reported problems when the forwarding link path
      has a lower mtu than the incoming one if the inbound interface supports GRO.
      
      Given:
      Host <mtu1500> R1 <mtu1200> R2
      
      Host sends tcp stream which is routed via R1 and R2.  R1 performs GRO.
      
      In this case, the kernel will fail to send ICMP fragmentation needed
      messages (or pkt too big for ipv6), as GSO packets currently bypass dstmtu
      checks in forward path. Instead, Linux tries to send out packets exceeding
      the mtu.
      
      When locking route MTU on Host (i.e., no ipv4 DF bit set), R1 does
      not fragment the packets when forwarding, and again tries to send out
      packets exceeding R1-R2 link mtu.
      
      This alters the forwarding dstmtu checks to take the individual gso
      segment lengths into account.
      
      For ipv6, we send out pkt too big error for gso if the individual
      segments are too big.
      
      For ipv4, we either send icmp fragmentation needed, or, if the DF bit
      is not set, perform software segmentation and let the output path
      create fragments when the packet is leaving the machine.
      It is not 100% correct as the error message will contain the headers of
      the GRO skb instead of the original/segmented one, but it seems to
      work fine in my (limited) tests.
      
      Eric Dumazet suggested to simply shrink mss via ->gso_size to avoid
      sofware segmentation.
      
      However it turns out that skb_segment() assumes skb nr_frags is related
      to mss size so we would BUG there.  I don't want to mess with it considering
      Herbert and Eric disagree on what the correct behavior should be.
      
      Hannes Frederic Sowa notes that when we would shrink gso_size
      skb_segment would then also need to deal with the case where
      SKB_MAX_FRAGS would be exceeded.
      
      This uses sofware segmentation in the forward path when we hit ipv4
      non-DF packets and the outgoing link mtu is too small.  Its not perfect,
      but given the lack of bug reports wrt. GRO fwd being broken this is a
      rare case anyway.  Also its not like this could not be improved later
      once the dust settles.
      Acked-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Reported-by: default avatarMarcelo Ricardo Leitner <mleitner@redhat.com>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      91e97e57
    • Florian Westphal's avatar
      net: core: introduce netif_skb_dev_features · 0777d823
      Florian Westphal authored
      commit d2069403 upstream.
      
      Will be used by upcoming ipv4 forward path change that needs to
      determine feature mask using skb->dst->dev instead of skb->dev.
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      0777d823
    • Florian Westphal's avatar
      net: add and use skb_gso_transport_seglen() · 449c5733
      Florian Westphal authored
      commit de960aa9 upstream.
      
      This moves part of Eric Dumazets skb_gso_seglen helper from tbf sched to
      skbuff core so it may be reused by upcoming ip forwarding path patch.
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      449c5733
    • Daniel Borkmann's avatar
      net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode · 515266de
      Daniel Borkmann authored
      [ Upstream commit ffd59393 ]
      
      SCTP's sctp_connectx() abi breaks for 64bit kernels compiled with 32bit
      emulation (e.g. ia32 emulation or x86_x32). Due to internal usage of
      'struct sctp_getaddrs_old' which includes a struct sockaddr pointer,
      sizeof(param) check will always fail in kernel as the structure in
      64bit kernel space is 4bytes larger than for user binaries compiled
      in 32bit mode. Thus, applications making use of sctp_connectx() won't
      be able to run under such circumstances.
      
      Introduce a compat interface in the kernel to deal with such
      situations by using a 'struct compat_sctp_getaddrs_old' structure
      where user data is copied into it, and then sucessively transformed
      into a 'struct sctp_getaddrs_old' structure with the help of
      compat_ptr(). That fixes sctp_connectx() abi without any changes
      needed in user space, and lets the SCTP test suite pass when compiled
      in 32bit and run on 64bit kernels.
      
      Fixes: f9c67811 ("sctp: Fix regression introduced by new sctp_connectx api")
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      515266de
    • Duan Jiong's avatar
      ipv4: fix counter in_slow_tot · a97a04ff
      Duan Jiong authored
      [ Upstream commit a6254864 ]
      
      since commit 89aef892("ipv4: Delete routing cache."), the counter
      in_slow_tot can't work correctly.
      
      The counter in_slow_tot increase by one when fib_lookup() return successfully
      in ip_route_input_slow(), but actually the dst struct maybe not be created and
      cached, so we can increase in_slow_tot after the dst struct is created.
      Signed-off-by: default avatarDuan Jiong <duanj.fnst@cn.fujitsu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a97a04ff
    • Jiri Bohac's avatar
      bonding: 802.3ad: make aggregator_identifier bond-private · c92210fd
      Jiri Bohac authored
      [ Upstream commit 163c8ff3 ]
      
      aggregator_identifier is used to assign unique aggregator identifiers
      to aggregators of a bond during device enslaving.
      
      aggregator_identifier is currently a global variable that is zeroed in
      bond_3ad_initialize().
      
      This sequence will lead to duplicate aggregator identifiers for eth1 and eth3:
      
      create bond0
      change bond0 mode to 802.3ad
      enslave eth0 to bond0 		//eth0 gets agg id 1
      enslave eth1 to bond0 		//eth1 gets agg id 2
      create bond1
      change bond1 mode to 802.3ad
      enslave eth2 to bond1		//aggregator_identifier is reset to 0
      				//eth2 gets agg id 1
      enslave eth3 to bond0 		//eth3 gets agg id 2
      
      Fix this by making aggregator_identifier private to the bond.
      Signed-off-by: default avatarJiri Bohac <jbohac@suse.cz>
      Acked-by: default avatarVeaceslav Falico <vfalico@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      c92210fd
    • Emil Goode's avatar
      usbnet: remove generic hard_header_len check · e8047a7d
      Emil Goode authored
      [ Upstream commit eb85569f ]
      
      This patch removes a generic hard_header_len check from the usbnet
      module that is causing dropped packages under certain circumstances
      for devices that send rx packets that cross urb boundaries.
      
      One example is the AX88772B which occasionally send rx packets that
      cross urb boundaries where the remaining partial packet is sent with
      no hardware header. When the buffer with a partial packet is of less
      number of octets than the value of hard_header_len the buffer is
      discarded by the usbnet module.
      
      With AX88772B this can be reproduced by using ping with a packet
      size between 1965-1976.
      
      The bug has been reported here:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=29082
      
      This patch introduces the following changes:
      - Removes the generic hard_header_len check in the rx_complete
        function in the usbnet module.
      - Introduces a ETH_HLEN check for skbs that are not cloned from
        within a rx_fixup callback.
      - For safety a hard_header_len check is added to each rx_fixup
        callback function that could be affected by this change.
        These extra checks could possibly be removed by someone
        who has the hardware to test.
      - Removes a call to dev_kfree_skb_any() and instead utilizes the
        dev->done list to queue skbs for cleanup.
      
      The changes place full responsibility on the rx_fixup callback
      functions that clone skbs to only pass valid skbs to the
      usbnet_skb_return function.
      Signed-off-by: default avatarEmil Goode <emilgoode@gmail.com>
      Reported-by: default avatarIgor Gnatenko <i.gnatenko.brain@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e8047a7d
    • Emil Goode's avatar
      net: asix: add missing flag to struct driver_info · 30899235
      Emil Goode authored
      [ Upstream commit d43ff4cd ]
      
      The struct driver_info ax88178_info is assigned the function
      asix_rx_fixup_common as it's rx_fixup callback. This means that
      FLAG_MULTI_PACKET must be set as this function is cloning the
      data and calling usbnet_skb_return. Not setting this flag leads
      to usbnet_skb_return beeing called a second time from within
      the rx_process function in the usbnet module.
      Signed-off-by: default avatarEmil Goode <emilgoode@gmail.com>
      Reported-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      30899235
    • Michael S. Tsirkin's avatar
      vhost: fix ref cnt checking deadlock · 1d2e001b
      Michael S. Tsirkin authored
      [ Upstream commit 0ad8b480 ]
      
      vhost checked the counter within the refcnt before decrementing.  It
      really wanted to know that it is the one that has the last reference, as
      a way to batch freeing resources a bit more efficiently.
      
      Note: we only let refcount go to 0 on device release.
      
      This works well but we now access the ref counter twice so there's a
      race: all users might see a high count and decide to defer freeing
      resources.
      In the end no one initiates freeing resources until the last reference
      is gone (which is on VM shotdown so might happen after a looooong time).
      
      Let's do what we probably should have done straight away:
      switch from kref to plain atomic, documenting the
      semantics, return the refcount value atomically after decrement,
      then use that to avoid the deadlock.
      Reported-by: default avatarQin Chuanyu <qinchuanyu@huawei.com>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Acked-by: default avatarJason Wang <jasowang@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1d2e001b
    • Nithin Sujir's avatar
      tg3: Fix deadlock in tg3_change_mtu() · 917cc7f0
      Nithin Sujir authored
      [ Upstream commit c6993dfd ]
      
      Quoting David Vrabel -
      "5780 cards cannot have jumbo frames and TSO enabled together.  When
      jumbo frames are enabled by setting the MTU, the TSO feature must be
      cleared.  This is done indirectly by calling netdev_update_features()
      which will call tg3_fix_features() to actually clear the flags.
      
      netdev_update_features() will also trigger a new netlink message for the
      feature change event which will result in a call to tg3_get_stats64()
      which deadlocks on the tg3 lock."
      
      tg3_set_mtu() does not need to be under the tg3 lock since converting
      the flags to use set_bit(). Move it out to after tg3_netif_stop().
      Reported-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Tested-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarMichael Chan <mchan@broadcom.com>
      Signed-off-by: default avatarNithin Nayak Sujir <nsujir@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      917cc7f0
    • John Ogness's avatar
      tcp: tsq: fix nonagle handling · 6f7656e8
      John Ogness authored
      [ Upstream commit bf06200e ]
      
      Commit 46d3ceab ("tcp: TCP Small Queues") introduced a possible
      regression for applications using TCP_NODELAY.
      
      If TCP session is throttled because of tsq, we should consult
      tp->nonagle when TX completion is done and allow us to send additional
      segment, especially if this segment is not a full MSS.
      Otherwise this segment is sent after an RTO.
      
      [edumazet] : Cooked the changelog, added another fix about testing
      sk_wmem_alloc twice because TX completion can happen right before
      setting TSQ_THROTTLED bit.
      
      This problem is particularly visible with recent auto corking,
      but might also be triggered with low tcp_limit_output_bytes
      values or NIC drivers delaying TX completion by hundred of usec,
      and very low rtt.
      
      Thomas Glanzmann for example reported an iscsi regression, caused
      by tcp auto corking making this bug quite visible.
      
      Fixes: 46d3ceab ("tcp: TCP Small Queues")
      Signed-off-by: default avatarJohn Ogness <john.ogness@linutronix.de>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarThomas Glanzmann <thomas@glanzmann.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6f7656e8
    • Bjørn Mork's avatar
      net: qmi_wwan: add Netgear Aircard 340U · dce84839
      Bjørn Mork authored
      [ Upstream commit fbd3a77d ]
      
      This device was mentioned in an OpenWRT forum.  Seems to have a "standard"
      Sierra Wireless ifnumber to function layout:
       0: qcdm
       2: nmea
       3: modem
       8: qmi
       9: storage
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      dce84839
    • Sabrina Dubroca's avatar
      netpoll: fix netconsole IPv6 setup · 7b694760
      Sabrina Dubroca authored
      [ Upstream commit 00fe11b3 ]
      
      Currently, to make netconsole start over IPv6, the source address
      needs to be specified. Without a source address, netpoll_parse_options
      assumes we're setting up over IPv4 and the destination IPv6 address is
      rejected.
      
      Check if the IP version has been forced by a source address before
      checking for a version mismatch when parsing the destination address.
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Acked-by: default avatarCong Wang <cwang@twopensource.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      7b694760
    • Maciej Żenczykowski's avatar
      net: fix 'ip rule' iif/oif device rename · 7d9c1b6b
      Maciej Żenczykowski authored
      [ Upstream commit 946c032e ]
      
      ip rules with iif/oif references do not update:
      (detach/attach) across interface renames.
      Signed-off-by: default avatarMaciej Żenczykowski <maze@google.com>
      CC: Willem de Bruijn <willemb@google.com>
      CC: Eric Dumazet <edumazet@google.com>
      CC: Chris Davis <chrismd@google.com>
      CC: Carlo Contavalli <ccontavalli@google.com>
      
      Google-Bug-Id: 12936021
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      7d9c1b6b
    • Geert Uytterhoeven's avatar
      ipv4: Fix runtime WARNING in rtmsg_ifa() · 0c360776
      Geert Uytterhoeven authored
      [ Upstream commit 63b5f152 ]
      
      On m68k/ARAnyM:
      
      WARNING: CPU: 0 PID: 407 at net/ipv4/devinet.c:1599 0x316a99()
      Modules linked in:
      CPU: 0 PID: 407 Comm: ifconfig Not tainted
      3.13.0-atari-09263-g0c71d68014d1 #1378
      Stack from 10c4fdf0:
              10c4fdf0 002ffabb 000243e8 00000000 008ced6c 00024416 00316a99 0000063f
              00316a99 00000009 00000000 002501b4 00316a99 0000063f c0a86117 00000080
              c0a86117 00ad0c90 00250a5a 00000014 00ad0c90 00000000 00000000 00000001
              00b02dd0 00356594 00000000 00356594 c0a86117 eff6c9e4 008ced6c 00000002
              008ced60 0024f9b4 00250b52 00ad0c90 00000000 00000000 00252390 00ad0c90
              eff6c9e4 0000004f 00000000 00000000 eff6c9e4 8000e25c eff6c9e4 80001020
      Call Trace: [<000243e8>] warn_slowpath_common+0x52/0x6c
       [<00024416>] warn_slowpath_null+0x14/0x1a
       [<002501b4>] rtmsg_ifa+0xdc/0xf0
       [<00250a5a>] __inet_insert_ifa+0xd6/0x1c2
       [<0024f9b4>] inet_abc_len+0x0/0x42
       [<00250b52>] inet_insert_ifa+0xc/0x12
       [<00252390>] devinet_ioctl+0x2ae/0x5d6
      
      Adding some debugging code reveals that net_fill_ifaddr() fails in
      
          put_cacheinfo(skb, ifa->ifa_cstamp, ifa->ifa_tstamp,
                                    preferred, valid))
      
      nla_put complains:
      
          lib/nlattr.c:454: skb_tailroom(skb) = 12, nla_total_size(attrlen) = 20
      
      Apparently commit 5c766d64 ("ipv4:
      introduce address lifetime") forgot to take into account the addition of
      struct ifa_cacheinfo in inet_nlmsg_size(). Hence add it, like is already
      done for ipv6.
      Suggested-by: default avatarCong Wang <cwang@twopensource.com>
      Signed-off-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarCong Wang <cwang@twopensource.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      0c360776
    • Oliver Hartkopp's avatar
      can: add destructor for self generated skbs · b3097b17
      Oliver Hartkopp authored
      [ Upstream commit 0ae89beb ]
      
      Self generated skbuffs in net/can/bcm.c are setting a skb->sk reference but
      no explicit destructor which is enforced since Linux 3.11 with commit
      376c7311 (net: add a temporary sanity check in skb_orphan()).
      
      This patch adds some helper functions to make sure that a destructor is
      properly defined when a sock reference is assigned to a CAN related skb.
      To create an unshared skb owned by the original sock a common helper function
      has been introduced to replace open coded functions to create CAN echo skbs.
      Signed-off-by: default avatarOliver Hartkopp <socketcan@hartkopp.net>
      Tested-by: default avatarAndre Naujoks <nautsch2@gmail.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      b3097b17
    • Cong Wang's avatar
      bridge: fix netconsole setup over bridge · d7691e10
      Cong Wang authored
      [ Upstream commit dbe17307 ]
      
      Commit 93d8bf9f ("bridge: cleanup netpoll code") introduced
      a check in br_netpoll_enable(), but this check is incorrect for
      br_netpoll_setup(). This patch moves the code after the check
      into __br_netpoll_enable() and calls it in br_netpoll_setup().
      For br_add_if(), the check is still needed.
      
      Fixes: 93d8bf9f ("bridge: cleanup netpoll code")
      Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarCong Wang <cwang@twopensource.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: default avatarToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Tested-by: default avatarToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      d7691e10
    • Richard Yao's avatar
      9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers · ef9daf83
      Richard Yao authored
      [ Upstream commit b6f52ae2 ]
      
      The 9p-virtio transport does zero copy on things larger than 1024 bytes
      in size. It accomplishes this by returning the physical addresses of
      pages to the virtio-pci device. At present, the translation is usually a
      bit shift.
      
      That approach produces an invalid page address when we read/write to
      vmalloc buffers, such as those used for Linux kernel modules. Any
      attempt to load a Linux kernel module from 9p-virtio produces the
      following stack.
      
      [<ffffffff814878ce>] p9_virtio_zc_request+0x45e/0x510
      [<ffffffff814814ed>] p9_client_zc_rpc.constprop.16+0xfd/0x4f0
      [<ffffffff814839dd>] p9_client_read+0x15d/0x240
      [<ffffffff811c8440>] v9fs_fid_readn+0x50/0xa0
      [<ffffffff811c84a0>] v9fs_file_readn+0x10/0x20
      [<ffffffff811c84e7>] v9fs_file_read+0x37/0x70
      [<ffffffff8114e3fb>] vfs_read+0x9b/0x160
      [<ffffffff81153571>] kernel_read+0x41/0x60
      [<ffffffff810c83ab>] copy_module_from_fd.isra.34+0xfb/0x180
      
      Subsequently, QEMU will die printing:
      
      qemu-system-x86_64: virtio: trying to map MMIO memory
      
      This patch enables 9p-virtio to correctly handle this case. This not
      only enables us to load Linux kernel modules off virtfs, but also
      enables ZFS file-based vdevs on virtfs to be used without killing QEMU.
      
      Special thanks to both Avi Kivity and Alexander Graf for their
      interpretation of QEMU backtraces. Without their guidence, tracking down
      this bug would have taken much longer. Also, special thanks to Linus
      Torvalds for his insightful explanation of why this should use
      is_vmalloc_addr() instead of is_vmalloc_or_module_addr():
      
      https://lkml.org/lkml/2014/2/8/272Signed-off-by: default avatarRichard Yao <ryao@gentoo.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      ef9daf83
    • Eric Dumazet's avatar
      6lowpan: fix lockdep splats · 9399fb67
      Eric Dumazet authored
      [ Upstream commit 20e7c4e8 ]
      
      When a device ndo_start_xmit() calls again dev_queue_xmit(),
      lockdep can complain because dev_queue_xmit() is re-entered and the
      spinlocks protecting tx queues share a common lockdep class.
      
      Same issue was fixed for bonding/l2tp/ppp in commits
      
      0daa2303 ("[PATCH] bonding: lockdep annotation")
      49ee4920 ("bonding: set qdisc_tx_busylock to avoid LOCKDEP splat")
      23d3b8bf ("net: qdisc busylock needs lockdep annotations ")
      303c07db ("ppp: set qdisc_tx_busylock to avoid LOCKDEP splat ")
      Reported-by: default avatarAlexander Aring <alex.aring@gmail.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Tested-by: default avatarAlexander Aring <alex.aring@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      9399fb67
  4. 22 Feb, 2014 14 commits
    • Greg Kroah-Hartman's avatar
      Linux 3.12.13 · 1e91ad22
      Greg Kroah-Hartman authored
      1e91ad22
    • Borislav Petkov's avatar
      EDAC: Correct workqueue setup path · c7e30e0a
      Borislav Petkov authored
      commit cb6ef42e upstream.
      
      We're using edac_mc_workq_setup() both on the init path, when
      we load an edac driver and when we change the polling period
      (edac_mc_reset_delay_period) through /sys/.../edac_mc_poll_msec.
      
      On that second path we don't need to init the workqueue which has been
      initialized already.
      
      Thanks to Tejun for workqueue insights.
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Link: http://lkml.kernel.org/r/1391457913-881-1-git-send-email-prarit@redhat.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c7e30e0a
    • Borislav Petkov's avatar
      EDAC: Poll timeout cannot be zero, p2 · 0e790ebd
      Borislav Petkov authored
      commit 9da21b15 upstream.
      
      Sanitize code even more to accept unsigned longs only and to not allow
      polling intervals below 1 second as this is unnecessary and doesn't make
      much sense anyway for polling errors.
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Link: http://lkml.kernel.org/r/1391457913-881-1-git-send-email-prarit@redhat.com
      Cc: Doug Thompson <dougthompson@xmission.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0e790ebd
    • Prarit Bhargava's avatar
      drivers/edac/edac_mc_sysfs.c: poll timeout cannot be zero · 3e475641
      Prarit Bhargava authored
      commit 79040cad upstream.
      
      If you do
      
        echo 0 > /sys/module/edac_core/parameters/edac_mc_poll_msec
      
      the following stack trace is output because the edac module is not
      designed to poll with a timeout of zero.
      
        WARNING: CPU: 12 PID: 0 at lib/list_debug.c:33 __list_add+0xac/0xc0()
        list_add corruption. prev->next should be next (ffff8808291dd1b8), but was           (null). (prev=ffff8808286fe3f8).
        Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache cfg80211 rfkill x86_pkg_temp_thermal coretemp kvm_intel kvm ixgbe e1000e crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt ptp sb_edac iTCO_vendor_support pps_core mdio ipmi_devintf edac_core ioatdma microcode shpchp lpc_ich pcspkr i2c_i801 dca mfd_core ipmi_si wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt isci i2c_algo_bit drm_kms_helper ttm drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod
        CPU: 12 PID: 0 Comm: swapper/12 Not tainted 3.13.0+ #1
        Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
        Call Trace:
         <IRQ>
          __list_add+0xac/0xc0
          __internal_add_timer+0xab/0x130
          internal_add_timer+0x17/0x40
          mod_timer_pinned+0xca/0x170
          intel_pstate_timer_func+0x28a/0x380
          call_timer_fn+0x36/0x100
          run_timer_softirq+0x1ff/0x2f0
          __do_softirq+0xf5/0x2e0
          irq_exit+0x10d/0x120
          smp_apic_timer_interrupt+0x45/0x60
          apic_timer_interrupt+0x6d/0x80
         <EOI>
          cpuidle_idle_call+0xb9/0x1f0
          arch_cpu_idle+0xe/0x30
          cpu_startup_entry+0x9e/0x240
          start_secondary+0x1e4/0x290
      
        kernel BUG at kernel/timer.c:1084!
        invalid opcode: 0000 [#1] SMP
        Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache cfg80211 rfkill x86_pkg_temp_thermal coretemp kvm_intel kvm ixgbe e1000e crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt ptp sb_edac iTCO_vendor_support pps_core mdio ipmi_devintf edac_core ioatdma microcode shpchp lpc_ich pcspkr i2c_i801 dca mfd_core ipmi_si wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt isci i2c_algo_bit drm_kms_helper ttm drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod
        CPU: 12 PID: 0 Comm: swapper/12 Tainted: G        W    3.13.0+ #1
        Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
        Call Trace:
         <IRQ>
          run_timer_softirq+0x245/0x2f0
          __do_softirq+0xf5/0x2e0
          irq_exit+0x10d/0x120
          smp_apic_timer_interrupt+0x45/0x60
          apic_timer_interrupt+0x6d/0x80
         <EOI>
          cpuidle_idle_call+0xb9/0x1f0
          arch_cpu_idle+0xe/0x30
          cpu_startup_entry+0x9e/0x240
          start_secondary+0x1e4/0x290
        RIP   cascade+0x93/0xa0
      
        WARNING: CPU: 36 PID: 1154 at kernel/workqueue.c:1461 __queue_delayed_work+0xed/0x1a0()
        Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache cfg80211 rfkill x86_pkg_temp_thermal coretemp kvm_intel kvm ixgbe e1000e crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt ptp sb_edac iTCO_vendor_support pps_core mdio ipmi_devintf edac_core ioatdma microcode shpchp lpc_ich pcspkr i2c_i801 dca mfd_core ipmi_si wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt isci i2c_algo_bit drm_kms_helper ttm drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod
        CPU: 36 PID: 1154 Comm: kworker/u481:3 Tainted: G        W    3.13.0+ #1
        Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
        Workqueue: edac-poller edac_mc_workq_function [edac_core]
        Call Trace:
          dump_stack+0x45/0x56
          warn_slowpath_common+0x7d/0xa0
          warn_slowpath_null+0x1a/0x20
          __queue_delayed_work+0xed/0x1a0
          queue_delayed_work_on+0x27/0x50
          edac_mc_workq_function+0x72/0xa0 [edac_core]
          process_one_work+0x17b/0x460
          worker_thread+0x11b/0x400
          kthread+0xd2/0xf0
          ret_from_fork+0x7c/0xb0
      
      This patch adds a range check in the edac_mc_poll_msec code to check for 0.
      Signed-off-by: default avatarPrarit Bhargava <prarit@redhat.com>
      Cc: Doug Thompson <dougthompson@xmission.com>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3e475641
    • Paul Gortmaker's avatar
      genirq: Add missing irq_to_desc export for CONFIG_SPARSE_IRQ=n · d588c3d8
      Paul Gortmaker authored
      commit 2c45aada upstream.
      
      In allmodconfig builds for sparc and any other arch which does
      not set CONFIG_SPARSE_IRQ, the following will be seen at modpost:
      
        CC [M]  lib/cpu-notifier-error-inject.o
        CC [M]  lib/pm-notifier-error-inject.o
      ERROR: "irq_to_desc" [drivers/gpio/gpio-mcp23s08.ko] undefined!
      make[2]: *** [__modpost] Error 1
      
      This happens because commit 3911ff30 ("genirq: export
      handle_edge_irq() and irq_to_desc()") added one export for it, but
      there were actually two instances of it, in an if/else clause for
      CONFIG_SPARSE_IRQ.  Add the second one.
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Link: http://lkml.kernel.org/r/1392057610-11514-1-git-send-email-paul.gortmaker@windriver.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d588c3d8
    • Nicholas Bellinger's avatar
      target: Fix free-after-use regression in PR unregister · e495ac8f
      Nicholas Bellinger authored
      commit fc09149d upstream.
      
      This patch addresses a >= v3.11 free-after-use regression
      in core_scsi3_emulate_pro_register() that was introduced
      in the following commit:
      
      commit bc118fe4
      Author: Andy Grover <agrover@redhat.com>
      Date:   Thu May 16 10:41:04 2013 -0700
      
          target: Further refactoring of core_scsi3_emulate_pro_register()
      
      To avoid the free-after-use, save an type value before hand, and
      only call core_scsi3_put_pr_reg() with a valid *pr_reg.
      Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Cc: Andy Grover <agrover@redhat.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e495ac8f
    • Steven Rostedt (Red Hat)'s avatar
      ring-buffer: Fix first commit on sub-buffer having non-zero delta · fd60cc9e
      Steven Rostedt (Red Hat) authored
      commit d651aa1d upstream.
      
      Each sub-buffer (buffer page) has a full 64 bit timestamp. The events on
      that page use a 27 bit delta against that timestamp in order to save on
      bits written to the ring buffer. If the time between events is larger than
      what the 27 bits can hold, a "time extend" event is added to hold the
      entire 64 bit timestamp again and the events after that hold a delta from
      that timestamp.
      
      As a "time extend" is always paired with an event, it is logical to just
      allocate the event with the time extend, to make things a bit more efficient.
      
      Unfortunately, when the pairing code was written, it removed the "delta = 0"
      from the first commit on a page, causing the events on the page to be
      slightly skewed.
      
      Fixes: 69d1b839 "ring-buffer: Bind time extend and data events together"
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fd60cc9e
    • Krzysztof Kozlowski's avatar
      power: max17040: Fix NULL pointer dereference when there is no platform_data · c12b12ed
      Krzysztof Kozlowski authored
      commit ac323d8d upstream.
      
      Fix NULL pointer dereference of "chip->pdata" if platform_data was not
      supplied to the driver.
      
      The driver during probe stored the pointer to the platform_data:
      	chip->pdata = client->dev.platform_data;
      Later it was dereferenced in max17040_get_online() and
      max17040_get_status().
      
      If platform_data was not supplied, the NULL pointer exception would
      happen:
      
      [    6.626094] Unable to handle kernel  of a at virtual address 00000000
      [    6.628557] pgd = c0004000
      [    6.632868] [00000000] *pgd=66262564
      [    6.634636] Unable to handle kernel paging request at virtual address e6262000
      [    6.642014] pgd = de468000
      [    6.644700] [e6262000] *pgd=00000000
      [    6.648265] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
      [    6.653552] Modules linked in:
      [    6.656598] CPU: 0 PID: 31 Comm: kworker/0:1 Not tainted 3.10.14-02717-gc58b4b4 #505
      [    6.664334] Workqueue: events max17040_work
      [    6.668488] task: dfa11b80 ti: df9f6000 task.ti: df9f6000
      [    6.673873] PC is at show_pte+0x80/0xb8
      [    6.677687] LR is at show_pte+0x3c/0xb8
      [    6.681503] pc : [<c001b7b8>]    lr : [<c001b774>]    psr: 600f0113
      [    6.681503] sp : df9f7d58  ip : 600f0113  fp : 00000009
      [    6.692965] r10: 00000000  r9 : 00000000  r8 : dfa11b80
      [    6.698171] r7 : df9f7ea0  r6 : e6262000  r5 : 00000000  r4 : 00000000
      [    6.704680] r3 : 00000000  r2 : e6262000  r1 : 600f0193  r0 : c05b3750
      [    6.711194] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
      [    6.718485] Control: 10c53c7d  Table: 5e46806a  DAC: 00000015
      [    6.724218] Process kworker/0:1 (pid: 31, stack limit = 0xdf9f6238)
      [    6.730465] Stack: (0xdf9f7d58 to 0xdf9f8000)
      [    6.914325] [<c001b7b8>] (show_pte+0x80/0xb8) from [<c047107c>] (__do_kernel_fault.part.9+0x44/0x74)
      [    6.923425] [<c047107c>] (__do_kernel_fault.part.9+0x44/0x74) from [<c001bb7c>] (do_page_fault+0x2c4/0x360)
      [    6.933144] [<c001bb7c>] (do_page_fault+0x2c4/0x360) from [<c0008400>] (do_DataAbort+0x34/0x9c)
      [    6.941825] [<c0008400>] (do_DataAbort+0x34/0x9c) from [<c000e5d8>] (__dabt_svc+0x38/0x60)
      [    6.950058] Exception stack(0xdf9f7ea0 to 0xdf9f7ee8)
      [    6.955099] 7ea0: df0c1790 00000000 00000002 00000000 df0c1794 df0c1790 df0c1790 00000042
      [    6.963271] 7ec0: df0c1794 00000001 00000000 00000009 00000000 df9f7ee8 c0306268 c0306270
      [    6.971419] 7ee0: a00f0113 ffffffff
      [    6.974902] [<c000e5d8>] (__dabt_svc+0x38/0x60) from [<c0306270>] (max17040_work+0x8c/0x144)
      [    6.983317] [<c0306270>] (max17040_work+0x8c/0x144) from [<c003f364>] (process_one_work+0x138/0x440)
      [    6.992429] [<c003f364>] (process_one_work+0x138/0x440) from [<c003fa64>] (worker_thread+0x134/0x3b8)
      [    7.001628] [<c003fa64>] (worker_thread+0x134/0x3b8) from [<c00454bc>] (kthread+0xa4/0xb0)
      [    7.009875] [<c00454bc>] (kthread+0xa4/0xb0) from [<c000eb28>] (ret_from_fork+0x14/0x2c)
      [    7.017943] Code: e1a03005 e2422480 e0826104 e59f002c (e7922104)
      [    7.024017] ---[ end trace 73bc7006b9cc5c79 ]---
      Signed-off-by: default avatarKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Fixes: c6f4a42dSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c12b12ed
    • Mikulas Patocka's avatar
      time: Fix overflow when HZ is smaller than 60 · 536bfe9d
      Mikulas Patocka authored
      commit 80d767d7 upstream.
      
      When compiling for the IA-64 ski emulator, HZ is set to 32 because the
      emulation is slow and we don't want to waste too many cycles processing
      timers. Alpha also has an option to set HZ to 32.
      
      This causes integer underflow in
      kernel/time/jiffies.c:
      kernel/time/jiffies.c:66:2: warning: large integer implicitly truncated to unsigned type [-Woverflow]
        .mult  = NSEC_PER_JIFFY << JIFFIES_SHIFT, /* details above */
        ^
      
      This patch reduces the JIFFIES_SHIFT value to avoid the overflow.
      Signed-off-by: default avatarMikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1401241639100.23871@file01.intranet.prod.int.rdu2.redhat.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      536bfe9d
    • Wolfram Sang's avatar
      i2c: mv64xxx: refactor message start to ensure proper initialization · 3bf3519d
      Wolfram Sang authored
      commit 79970db2 upstream.
      
      Because the offload mechanism can fall back to a standard transfer,
      having two seperate initialization states is unfortunate. Let's just
      have one state which does things consistently. This fixes a bug where
      some preparation was missing when the fallback happened. And it makes
      the code much easier to follow. To implement this, we put the check
      if offload is possible at the top of the offload setup function.
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Tested-by: default avatarGregory CLEMENT <gregory.clement@free-electrons.com>
      Fixes: 930ab3d4 (i2c: mv64xxx: Add I2C Transaction Generator support)
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3bf3519d
    • Oleg Nesterov's avatar
      md/raid5: Fix CPU hotplug callback registration · 2d470eab
      Oleg Nesterov authored
      commit 789b5e03 upstream.
      
      Subsystems that want to register CPU hotplug callbacks, as well as perform
      initialization for the CPUs that are already online, often do it as shown
      below:
      
      	get_online_cpus();
      
      	for_each_online_cpu(cpu)
      		init_cpu(cpu);
      
      	register_cpu_notifier(&foobar_cpu_notifier);
      
      	put_online_cpus();
      
      This is wrong, since it is prone to ABBA deadlocks involving the
      cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
      with CPU hotplug operations).
      
      Interestingly, the raid5 code can actually prevent double initialization and
      hence can use the following simplified form of callback registration:
      
      	register_cpu_notifier(&foobar_cpu_notifier);
      
      	get_online_cpus();
      
      	for_each_online_cpu(cpu)
      		init_cpu(cpu);
      
      	put_online_cpus();
      
      A hotplug operation that occurs between registering the notifier and calling
      get_online_cpus(), won't disrupt anything, because the code takes care to
      perform the memory allocations only once.
      
      So reorganize the code in raid5 this way to fix the deadlock with callback
      registration.
      
      Cc: linux-raid@vger.kernel.org
      Fixes: 36d1c647Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      [Srivatsa: Fixed the unregister_cpu_notifier() deadlock, added the
      free_scratch_buffer() helper to condense code further and wrote the changelog.]
      Signed-off-by: default avatarSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2d470eab
    • NeilBrown's avatar
      md/raid1: restore ability for check and repair to fix read errors. · 93bc0551
      NeilBrown authored
      commit 1877db75 upstream.
      
      commit 30bc9b53
          md/raid1: fix bio handling problems in process_checks()
      
      Move the bio_reset() to a point before where BIO_UPTODATE is checked,
      so that check now always report that the bio is uptodate, even if it is not.
      
      This causes process_check() to sometimes treat read-errors as
      successful matches so the good data isn't written out.
      
      This patch preserves the flag until it is needed.
      
      Bug was introduced in 3.11, but backported to 3.10-stable (as it fixed
      an even worse bug).  So suitable for any -stable since 3.10.
      Reported-and-tested-by: default avatarMichael Tokarev <mjt@tls.msk.ru>
      Fixed: 30bc9b53Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      93bc0551
    • Thomas Gleixner's avatar
      tick: Clear broadcast pending bit when switching to oneshot · fb4251e4
      Thomas Gleixner authored
      commit dd5fd9b9 upstream.
      
      AMD systems which use the C1E workaround in the amd_e400_idle routine
      trigger the WARN_ON_ONCE in the broadcast code when onlining a CPU.
      
      The reason is that the idle routine of those AMD systems switches the
      cpu into forced broadcast mode early on before the newly brought up
      CPU can switch over to high resolution / NOHZ mode. The timer related
      CPU1 bringup looks like this:
      
        clockevent_register_device(local_apic);
        tick_setup(local_apic);
        ...
        idle()
      	tick_broadcast_on_off(FORCE);
      	tick_broadcast_oneshot_control(ENTER)
      	  cpumask_set(cpu, broadcast_oneshot_mask);
      	halt();
      
      Now the broadcast interrupt on CPU0 sets CPU1 in the
      broadcast_pending_mask and wakes CPU1. So CPU1 continues:
      
      	local_apic_timer_interrupt()
      	   tick_handle_periodic();
      	   softirq()
      	     tick_init_highres();
      	       cpumask_clr(cpu, broadcast_oneshot_mask);
      
      	tick_broadcast_oneshot_control(ENTER)
      	   WARN_ON(cpumask_test(cpu, broadcast_pending_mask);
      
      So while we remove CPU1 from the broadcast_oneshot_mask when we switch
      over to highres mode, we do not clear the pending bit, which then
      triggers the warning when we go back to idle.
      
      The reason why this is only visible on C1E affected AMD systems is
      that the other machines enter the deep sleep states via
      acpi_idle/intel_idle and exit the broadcast mode before executing the
      remote triggered local_apic_timer_interrupt. So the pending bit is
      already cleared when the switch over to highres mode is clearing the
      oneshot mask.
      
      The solution is simple: Clear the pending bit together with the mask
      bit when we switch over to highres mode.
      
      Stanislaw came up independently with the same patch by enforcing the
      C1E workaround and debugging the fallout. I picked mine, because mine
      has a changelog :)
      Reported-by: default avatarpoma <pomidorabelisima@gmail.com>
      Debugged-by: default avatarStanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Olaf Hering <olaf@aepfle.de>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Justin M. Forbes <jforbes@redhat.com>
      Cc: Josh Boyer <jwboyer@redhat.com>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1402111434180.21991@ionos.tec.linutronix.deSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fb4251e4
    • Dan Carpenter's avatar
      KVM: return an error code in kvm_vm_ioctl_register_coalesced_mmio() · 3f77ba33
      Dan Carpenter authored
      commit aac5c422 upstream.
      
      If kvm_io_bus_register_dev() fails then it returns success but it should
      return an error code.
      
      I also did a little cleanup like removing an impossible NULL test.
      
      Fixes: 2b3c246a ('KVM: Make coalesced mmio use a device per zone')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3f77ba33