1. 03 Sep, 2013 1 commit
    • fuse: postpone end_page_writeback() in fuse_writepage_locked() · 4a4ac4eb
      Maxim Patlasov authored
      The patch fixes a race between ftruncate(2), mmap-ed write and write(2):
      
      1) A user makes a page dirty via mmap-ed write.
      2) The user performs shrinking truncate(2) intended to purge the page.
      3) Before fuse_do_setattr calls truncate_pagecache, the page goes to
         writeback. fuse_writepage_locked fills FUSE_WRITE request and releases
         the original page by end_page_writeback.
      4) fuse_do_setattr() completes and returns successfully. From this point
         on, i_mutex is free.
      5) Ordinary write(2) extends i_size back to cover the page. Note that
         fuse_send_write_pages does wait for fuse writeback, but for another
         page->index.
      6) fuse_writepage_locked proceeds by queueing FUSE_WRITE request.
         fuse_send_writepage is supposed to crop inarg->size of the request,
         but it doesn't because i_size has already been extended back.
      
      Moving end_page_writeback to the end of fuse_writepage_locked fixes the
      race: a successful return from truncate_pagecache now implies that
      fuse_writepage_locked has already called end_page_writeback. That, in
      turn, implies that fuse_flush_writepages has already called
      fuse_send_writepage, and that the latter used the valid (shrunk) i_size.
      write(2) could not have extended it because ftruncate(2) held i_mutex.
      Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Cc: stable@vger.kernel.org
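The size cropping mentioned in step 6 can be illustrated with a small userspace sketch. The helper name `crop_write_size` and its parameters are hypothetical and are not taken from the kernel source; the sketch only shows why the value of i_size sampled at send time decides whether stale data is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return how many bytes of a write starting at
 * `offset` still fall inside a file of `i_size` bytes. A result of 0
 * means the data was truncated away completely.
 */
static size_t crop_write_size(uint64_t offset, size_t data_size,
                              uint64_t i_size)
{
    assert(data_size > 0);
    if (offset + data_size <= i_size)
        return data_size;                  /* fully inside the file */
    if (offset < i_size)
        return (size_t)(i_size - offset);  /* straddles EOF: crop it */
    return 0;                              /* beyond EOF: nothing to send */
}
```

With a shrunk i_size of 1000, a page at offset 4096 is cropped to nothing. If write(2) has already extended i_size to 8192 by the time the helper runs, the full 4096 bytes would be sent, which is the race described above.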
  2. 02 Sep, 2013 4 commits
  3. 31 Aug, 2013 3 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · a8787645
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) There was a simplification in the ipv6 ndisc packet sending
          attempted here, which avoided using memory accounting on the
          per-netns ndisc socket for sending NDISC packets.  It did fix some
          important issues, but it causes regressions so it gets reverted here
          too.  Specifically, the problem with this change is that the IPV6
          output path really depends upon there being a valid skb->sk
          attached.
      
          The reason we want to do this change in some form when we figure out
          how to do it right, is that if a device goes down the ndisc_sk
          socket send queue will fill up and block NDISC packets that we want
          to send to other devices too.  That's really bad behavior.
      
          Hopefully Thomas can come up with a better version of this change.
      
       2) Fix a severe TCP performance regression by reverting a change made
          to dev_pick_tx() quite some time ago.  From Eric Dumazet.
      
       3) TIPC returns wrongly signed error codes, fix from Erik Hugne.
      
       4) Fix OOPS when doing IPSEC over ipv4 tunnels due to orphaning the
          skb->sk too early.  Fix from Li Hongjun.
      
       5) RAW ipv4 sockets can use the wrong routing key during lookup, from
          Chris Clark.
      
       6) Similar to #1 revert an older change that tried to use plain
          alloc_skb() for SYN/ACK TCP packets, this broke the netfilter owner
          mark which needs to see the skb->sk for such frames.  From Phil
          Oester.
      
       7) BNX2x driver bug fixes from Ariel Elior and Yuval Mintz,
          specifically in the handling of virtual functions.
      
       8) IPSEC path error propagation to sockets is not done properly when
          we have v4 in v6, and v6 in v4 type rules.  Fix from Hannes Frederic
          Sowa.
      
       9) Fix missing channel context release in mac80211, from Johannes Berg.
      
      10) Fix network namespace handling wrt. SCM_RIGHTS, from Andy
          Lutomirski.
      
      11) Fix usage of bogus NAPI weight in jme, netxen, and ps3_gelic
          drivers.  From Michal Schmidt.
      
      12) Hopefully a complete and correct fix for the genetlink dump locking
          and module reference counting.  From Pravin B Shelar.
      
      13) sk_busy_loop() must do a cpu_relax(), from Eliezer Tamir.
      
      14) Fix handling of timestamp offset when restoring a snapshotted TCP
          socket.  From Andrew Vagin.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
        net: fec: fix time stamping logic after napi conversion
        net: bridge: convert MLDv2 Query MRC into msecs_to_jiffies for max_delay
        mISDN: return -EINVAL on error in dsp_control_req()
        net: revert 8728c544 ("net: dev_pick_tx() fix")
        Revert "ipv6: Don't depend on per socket memory for neighbour discovery messages"
        ipv4 tunnels: fix an oops when using ipip/sit with IPsec
        tipc: set sk_err correctly when connection fails
        tcp: tcp_make_synack() should use sock_wmalloc
        bridge: separate querier and query timer into IGMP/IPv4 and MLD/IPv6 ones
        ipv6: Don't depend on per socket memory for neighbour discovery messages
        ipv4: sendto/hdrincl: don't use destination address found in header
        tcp: don't apply tsoffset if rcv_tsecr is zero
        tcp: initialize rcv_tstamp for restored sockets
        net: xilinx: fix memleak
        net: usb: Add HP hs2434 device to ZLP exception table
        net: add cpu_relax to busy poll loop
        net: stmmac: fixed the pbl setting with DT
        genl: Hold reference on correct module while netlink-dump.
        genl: Fix genl dumpit() locking.
        xfrm: Fix potential null pointer dereference in xdst_queue_output
        ...
    • MAINTAINERS: change my DT related maintainer address · de80963e
      Ian Campbell authored
      Filtering capabilities on my work email are pretty much non-existent and this
      has turned out to be something of a firehose...
      
      Cc: Stephen Warren <swarren@wwwdotorg.org>
      Cc: Rob Herring <rob.herring@calxeda.com>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
      Acked-by: Pawel Moll <pawel.moll@arm.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 936dbcc3
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "This contains two Oops fixes (opti9xx and HD-audio) and a simple fixup
        for an Acer laptop.  All marked as stable patches"
      
      * tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: opti9xx: Fix conflicting driver object name
        ALSA: hda - Fix NULL dereference with CONFIG_SND_DYNAMIC_MINORS=n
        ALSA: hda - Add inverted digital mic fixup for Acer Aspire One
  4. 30 Aug, 2013 15 commits
  5. 29 Aug, 2013 17 commits
    • drm/vmwgfx: Split GMR2_REMAP commands if they are to large · 6e4dcff3
      Jakob Bornecrantz authored
      This fixes the piglit test texturing/max-texture-size
      causing the VM to die due to a too large SVGA command.
      Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
      Reviewed-by: Brian Paul <brianp@vmware.com>
      Reviewed-by: Zack Rusin <zackr@vmware.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Dave Airlie <airlied@gmail.com>
    • Merge tag 'drm-intel-fixes-2013-08-30' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes · 1dcff832
      Dave Airlie authored
      
      Just a one-line patch to fix a black screen issue on rare ivb machines,
      cc: stable. Normally I'd just shovel this into the -next pull request this
      late in the -rc cycle, but Linus was making noises about not getting real
      fixes which are cc: stable. So here we go ;-)
      
      * tag 'drm-intel-fixes-2013-08-30' of git://people.freedesktop.org/~danvet/drm-intel:
        drm/i915: ivb: fix edp voltage swing reg val
    • drm/i915: ivb: fix edp voltage swing reg val · 77fa4cbd
      Imre Deak authored
      Fix the typo introduced in
      
      commit 1a2eb460
      Author: Keith Packard <keithp@keithp.com>
      Date:   Wed Nov 16 16:26:07 2011 -0800
      
          drm/i915: Hook up Ivybridge eDP
      
      This fixes eDP link-training failures and cases where all voltage swing
      /pre-emphasis levels were tried and failed during clock recovery and -
      as a fallback - we go on to do channel equalization with the last voltage
      swing/pre-emphasis level which will succeed. Both issues can lead to a
      blank screen.
      
      v2:
      - improve commit message
      
      CC: stable@vger.kernel.org
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64880
      Tested-by: Jeremy Moles <cubicool@gmail.com>
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 79f9ab7e
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      This pull request fixes some issues that arise when 6in4 or 4in6 tunnels
      are used in combination with IPsec, all from Hannes Frederic Sowa and a
      null pointer dereference when queueing packets to the policy hold queue.
      
      1) We might access the local error handler of the wrong address family if
         a 6in4 or 4in6 tunnel is protected by ipsec. Fix this by adding a pointer
         to the correct local_error to xfrm_state_afinet.
      
      2) Add a helper function to always refer to the correct interpretation
         of skb->sk.
      
      3) Call skb_reset_inner_headers to record the position of the inner headers
         when adding a new one in various ipv6 tunnels. This is needed to identify
         the addresses where to send back errors in the xfrm layer.
      
      4) Dereference inner ipv6 header if encapsulated to always call the
         right error handler.
      
      5) Choose protocol family by skb protocol to not call the wrong
         xfrm{4,6}_local_error handler in case an ipv6 socket is used
         in ipv4 mode.
      
      6) Partly revert "xfrm: introduce helper for safe determination of mtu"
         because this introduced pmtu discovery problems.
      
      7) Set skb->protocol on tcp, raw and ip6_append_data generated skbs.
         We need this to get the correct mtu information in xfrm.
      
      8) Fix null pointer dereference in xdst_queue_output.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Don't depend on per socket memory for neighbour discovery messages · 1f324e38
      Thomas Graf authored
      Allocating skbs when sending out neighbour discovery messages
      currently uses sock_alloc_send_skb() based on a per net namespace
      socket and thus shares a socket wmem buffer space.
      
      If a netdevice is temporarily unable to transmit due to carrier
      loss or for other reasons, the queued up ndisc messages will consume
      all of the wmem space and will thus prevent any more skbs from
      being allocated, even for netdevices that are able to transmit packets.
      
      The number of neighbour discovery messages sent is very limited, so
      simply use alloc_skb() and don't depend on any socket wmem space any
      longer.
      
      This patch was originally posted by Eric Dumazet in a modified
      form.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: sendto/hdrincl: don't use destination address found in header · c27c9322
      Chris Clark authored
      ipv4: raw_sendmsg: don't use header's destination address
      
      A sendto() regression was bisected and found to start with commit
      f8126f1d (ipv4: Adjust semantics of rt->rt_gateway.)
      
      The problem is that it tries to ARP-lookup the constructed packet's
      destination address rather than the explicitly provided address.
      
      Fix this using FLOWI_FLAG_KNOWN_NH so that given nexthop is used.
      
      cf. commit 2ad5b9e4
      Reported-by: Chris Clark <chris.clark@alcatel-lucent.com>
      Bisected-by: Chris Clark <chris.clark@alcatel-lucent.com>
      Tested-by: Chris Clark <chris.clark@alcatel-lucent.com>
      Suggested-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Chris Clark <chris.clark@alcatel-lucent.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: don't apply tsoffset if rcv_tsecr is zero · e3e12028
      Andrew Vagin authored
      The zero value means that tsecr is not valid, so it's a special case.
      
      tsoffset is used to customize tcp_time_stamp for one socket.
      tsoffset is usually zero; it is used when a socket has been moved from
      one host to another.
      
      Currently this issue affects the logic of tcp_rcv_rtt_measure_ts. Due to
      an incorrect value of rcv_tsecr, tcp_rcv_rtt_measure_ts sets rto to
      TCP_RTO_MAX.
      
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: James Morris <jmorris@namei.org>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Reported-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
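The special case for a zero echo value can be sketched in userspace C. The function `adjust_tsecr` and its arguments are hypothetical names chosen for illustration; the sketch assumes the offset is subtracted from the echoed value on receipt and that zero means "no valid echo".

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: translate an echoed timestamp (TSecr) taken from
 * the wire into the socket's own clock. Zero is not a timestamp, so it
 * is passed through untouched.
 */
static uint32_t adjust_tsecr(uint32_t wire_tsecr, uint32_t tsoffset)
{
    if (wire_tsecr == 0)
        return 0;                   /* special case: echo is not valid */
    return wire_tsecr - tsoffset;   /* unsigned wrap-around is intended */
}

/* Usage: a socket restored with an offset of 1000 ticks. */
static void example_tsecr(void)
{
    assert(adjust_tsecr(5000, 1000) == 4000);  /* valid echo is shifted */
    assert(adjust_tsecr(0, 1000) == 0);        /* zero stays zero */
}
```

Without the zero check, an invalid echo would turn into a large bogus value after the subtraction, which is consistent with the inflated RTT measurement the message describes.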
    • tcp: initialize rcv_tstamp for restored sockets · c7781a6e
      Andrew Vagin authored
      u32 rcv_tstamp;     /* timestamp of last received ACK */
      
      Its value is used in tcp_retransmit_timer, which closes the socket
      if the last ACK was received more than TCP_RTO_MAX ago.
      
      Currently rcv_tstamp is initialized to zero, so if tcp_retransmit_timer
      is called before the first ACK is received, the connection is closed.
      
      This patch initializes rcv_tstamp to the timestamp at which the socket
      was restored.
      
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: James Morris <jmorris@namei.org>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Reported-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
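The effect of a zero-initialized rcv_tstamp can be shown with a small model of the timeout check. `ack_timed_out` and `RTO_MAX_TICKS` are hypothetical stand-ins, and the tick values are arbitrary; this is a sketch of the comparison, not the kernel's timer code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RTO_MAX_TICKS 120000u  /* hypothetical limit, in arbitrary ticks */

/* Hypothetical stand-in for the "last ACK is too old" check. */
static bool ack_timed_out(uint32_t now, uint32_t rcv_tstamp)
{
    return (uint32_t)(now - rcv_tstamp) > RTO_MAX_TICKS;
}

/* Usage: compare a zero-initialized stamp with one set at restore time. */
static void example_restore(void)
{
    uint32_t now = 500000;              /* arbitrary current tick count */
    assert(ack_timed_out(now, 0));      /* zero stamp: peer looks dead */
    assert(!ack_timed_out(now, now));   /* stamped at restore: still alive */
}
```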
    • arm: prima2: drop nr_irqs in mach as we moved to linear irqdomain · f8ab658b
      Barry Song authored
      We don't need nr_irqs in machine any more after moving to the
      linear irqdomain for the sirfsoc irqchip, so drop it.
      Signed-off-by: Barry Song <Baohua.Song@csr.com>
      Signed-off-by: Olof Johansson <olof@lixom.net>
    • irqchip: sirf: move from legacy mode to linear irqdomain · 29eb51a7
      Barry Song authored
      The series of irqdomain core patches in 3.11 broke the sirf irqchip,
      which uses legacy mapping. All users fail in the new kernel while
      setting up irqs.
      
      This patch moves to a linear irqdomain and drops the old legacy irqdomain
      code since we don't need it any more; at the same time, it fixes the
      broken sirfsoc interrupts in 3.11.
      
      On the other hand, we actually have only 64 interrupt sources on
      prima2 and atlas6, but there are 128 interrupt sources on marco,
      which uses GIC. In the legacy code, sirf gpio also used a legacy
      irqdomain, so to make the gpio interrupt mapping independent of
      prima2/atlas6/marco and use a unified macro, we enlarged the
      prima2/atlas6 interrupt number to 128. We don't need this workaround
      any more because sirf gpio has also moved to linear mode, so we move
      SIRFSOC_NUM_IRQS back to 64 too.
      Signed-off-by: Barry Song <Baohua.Song@csr.com>
      Signed-off-by: Olof Johansson <olof@lixom.net>
    • Input: i8042 - disable the driver on ARC platforms · fa46c798
      Mischa Jonker authored
      It causes crashes when enabled, and we don't have such a peripheral
      anyway on ARC platforms.
      Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    • cgroup: fix rmdir EBUSY regression in 3.11 · bb78a92f
      Hugh Dickins authored
      On 3.11-rc we are seeing cgroup directories left behind when they should
      have been removed.  Here's a trivial reproducer:
      
      cd /sys/fs/cgroup/memory
      mkdir parent parent/child; rmdir parent/child parent
      rmdir: failed to remove `parent': Device or resource busy
      
      It's because cgroup_destroy_locked() (step 1 of destruction) leaves
      cgroup on parent's children list, letting cgroup_offline_fn() (step 2 of
      destruction) remove it; but step 2 is run by work queue, which may not
      yet have removed the children when parent destruction checks the list.
      
      Fix that by checking through a non-empty list of children: if every one
      of them has already been marked CGRP_DEAD, then it's safe to proceed:
      those children are invisible to userspace, and should not obstruct rmdir.
      
      (I didn't see any reason to keep the cgrp->children checks under the
      unrelated css_set_lock, so moved them out.)
      
      tj: Flattened nested ifs a bit and updated comment so that it's
          correct on both for-3.11-fixes and for-3.12.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
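The check described in the fix, walking a non-empty children list and proceeding only if every child is already dead, can be sketched in plain C. `struct child` and `children_all_dead` are hypothetical names; the `dead` field merely models the CGRP_DEAD flag and no kernel list API is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct child {              /* hypothetical stand-in for a child cgroup */
    bool dead;              /* models the CGRP_DEAD flag */
    struct child *next;     /* next sibling, NULL at the end of the list */
};

/* Return true when no live child remains, so removal may proceed. */
static bool children_all_dead(const struct child *head)
{
    for (const struct child *c = head; c != NULL; c = c->next)
        if (!c->dead)
            return false;   /* a live child still blocks rmdir */
    return true;            /* empty list or only dead children */
}

/* Usage: one dead child alone, then a live child in front of it. */
static void example_rmdir(void)
{
    struct child dead = { .dead = true, .next = NULL };
    struct child live = { .dead = false, .next = &dead };
    assert(children_all_dead(&dead));    /* only dead children: proceed */
    assert(!children_all_dead(&live));   /* a live child: report busy */
}
```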
    • workqueue: cond_resched() after processing each work item · b22ce278
      Tejun Heo authored
      If !PREEMPT, a kworker running work items back to back can hog CPU.
      This becomes dangerous when a self-requeueing work item which is
      waiting for something to happen races against stop_machine.  Such
      self-requeueing work item would requeue itself indefinitely hogging
      the kworker and CPU it's running on while stop_machine would wait for
      that CPU to enter stop_machine while preventing anything else from
      happening on all other CPUs.  The two would deadlock.
      
      Jamie Liu reports that this deadlock scenario exists around
      scsi_requeue_run_queue() and libata port multiplier support, where one
      port may exclude command processing from other ports.  With the right
      timing, scsi_requeue_run_queue() can end up requeueing itself trying
      to execute an IO which is asked to be retried while another device has
      an exclusive access, which in turn can't make forward progress due to
      stop_machine.
      
      Fix it by invoking cond_resched() after executing each work item.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Jamie Liu <jamieliu@google.com>
      References: http://thread.gmane.org/gmane.linux.kernel/1552567
      Cc: stable@vger.kernel.org
      --
       kernel/workqueue.c |    9 +++++++++
       1 file changed, 9 insertions(+)
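A loose userspace analogue of the fix is a worker loop that offers to give up the CPU after each item. This sketch uses POSIX `sched_yield()` as a stand-in for `cond_resched()`; the two are not equivalent, and `struct work_item`, `run_work_items` and `bump` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

typedef void (*work_fn)(void *);

struct work_item {          /* hypothetical work descriptor */
    work_fn fn;
    void *arg;
};

/* Run every item in order, yielding after each; returns items processed. */
static size_t run_work_items(struct work_item *items, size_t n)
{
    size_t done = 0;

    assert(items != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        items[i].fn(items[i].arg);
        done++;
        sched_yield();      /* userspace analogue of cond_resched() */
    }
    return done;
}

/* Trivial work function used for demonstration: increments an int. */
static void bump(void *arg)
{
    (*(int *)arg)++;
}
```

The point of yielding inside the loop rather than after it is that a long or self-requeueing stream of items never holds the CPU for more than one item at a time.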
    • Merge branch 'akpm' (patches from Andrew Morton) · c95389b4
      Linus Torvalds authored
      Merge fixes from Andrew Morton:
       "Five fixes.
      
        err, make that six.  let me try again"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        fs/ocfs2/super.c: Use bigger nodestr to accomodate 32-bit node numbers
        memcg: check that kmem_cache has memcg_params before accessing it
        drivers/base/memory.c: fix show_mem_removable() to handle missing sections
        IPC: bugfix for msgrcv with msgtyp < 0
        Omnikey Cardman 4000: pull in ioctl.h in user header
        timer_list: correct the iterator for timer_list
    • fs/ocfs2/super.c: Use bigger nodestr to accomodate 32-bit node numbers · 49fa8140
      Goldwyn Rodrigues authored
      While using pacemaker/corosync, the node numbers are generated from the
      IP address as opposed to serial node number generation.  This may not fit
      in an 8-byte string.  Use a bigger string to print the complete node
      number.
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
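Why an 8-byte buffer is too small can be checked with standard `snprintf`, whose return value is the length the full string would have needed. The helper `format_node` is a hypothetical name, and the 12-byte buffer in the usage note is an assumption chosen to hold ten digits plus the terminator, not a size quoted from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a node number into buf and report whether
 * it fit. snprintf returns the untruncated length, so a result that is
 * greater than or equal to size signals truncation.
 */
static int format_node(char *buf, size_t size, uint32_t node)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size, "%u", (unsigned int)node);
    return n >= 0 && (size_t)n < size;
}
```

A node number derived from the address 192.168.1.1 is 3232235777, which has ten digits: `format_node` reports truncation for an 8-byte buffer and success for a 12-byte one.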
    • memcg: check that kmem_cache has memcg_params before accessing it · 6f6b8951
      Andrey Vagin authored
      If the system had a few memory groups and all of them were destroyed,
      memcg_limited_groups_array_size has a non-zero value, but all new caches
      are created without memcg_params, because memcg_kmem_enabled() returns
      false.
      
      We try to enumerate child caches in a few places and all of them are
      potentially dangerous.
      
      For example, my kernel is compiled with CONFIG_SLAB and it crashed when I
      tried to mount an NFS share after a few experiments with kmemcg.
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
        IP: [<ffffffff8118166a>] do_tune_cpucache+0x8a/0xd0
        PGD b942a067 PUD b999f067 PMD 0
        Oops: 0000 [#1] SMP
        Modules linked in: fscache(+) ip6table_filter ip6_tables iptable_filter ip_tables i2c_piix4 pcspkr virtio_net virtio_balloon i2c_core floppy
        CPU: 0 PID: 357 Comm: modprobe Not tainted 3.11.0-rc7+ #59
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        task: ffff8800b9f98240 ti: ffff8800ba32e000 task.ti: ffff8800ba32e000
        RIP: 0010:[<ffffffff8118166a>]  [<ffffffff8118166a>] do_tune_cpucache+0x8a/0xd0
        RSP: 0018:ffff8800ba32fb70  EFLAGS: 00010246
        RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
        RDX: 0000000000000000 RSI: ffff8800b9f98910 RDI: 0000000000000246
        RBP: ffff8800ba32fba0 R08: 0000000000000002 R09: 0000000000000004
        R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000010
        R13: 0000000000000008 R14: 00000000000000d0 R15: ffff8800375d0200
        FS:  00007f55f1378740(0000) GS:ffff8800bfa00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: 00007f24feba57a0 CR3: 0000000037b51000 CR4: 00000000000006f0
        Call Trace:
          enable_cpucache+0x49/0x100
          setup_cpu_cache+0x215/0x280
          __kmem_cache_create+0x2fa/0x450
          kmem_cache_create_memcg+0x214/0x350
          kmem_cache_create+0x2b/0x30
          fscache_init+0x19b/0x230 [fscache]
          do_one_initcall+0xfa/0x1b0
          load_module+0x1c41/0x26d0
          SyS_finit_module+0x86/0xb0
          system_call_fastpath+0x16/0x1b
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Glauber Costa <glommer@openvz.org>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers/base/memory.c: fix show_mem_removable() to handle missing sections · 21ea9f5a
      Russ Anderson authored
      "cat /sys/devices/system/memory/memory*/removable" crashed the system.
      
      The problem is that show_mem_removable() is passing a
      bad pfn to is_mem_section_removable(), which causes
      
          if (!node_online(page_to_nid(page)))
      
      to blow up.  Why is it passing in a bad pfn?
      
      The reason is that show_mem_removable() will loop sections_per_block
      times.  sections_per_block is 16, but mem->section_count is 8,
      indicating holes in this memory block.  Checking that the memory section
      is present before checking to see if the memory section is removable
      fixes the problem.
      
         harp5-sys:~ # cat /sys/devices/system/memory/memory*/removable
         0
         1
         1
         1
         1
         1
         1
         1
         1
         1
         1
         1
         1
         1
         BUG: unable to handle kernel paging request at ffffea00c3200000
         IP: [<ffffffff81117ed1>] is_pageblock_removable_nolock+0x1/0x90
         PGD 83ffd4067 PUD 37bdfce067 PMD 0
         Oops: 0000 [#1] SMP
         Modules linked in: autofs4 binfmt_misc rdma_ucm rdma_cm iw_cm ib_addr ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_cm ib_uverbs ib_umad iw_cxgb3 cxgb3 mdio mlx4_en mlx4_ib ib_sa mlx4_core ib_mthca ib_mad ib_core fuse nls_iso8859_1 nls_cp437 vfat fat joydev loop hid_generic usbhid hid hwperf(O) numatools(O) dm_mod iTCO_wdt ipv6 iTCO_vendor_support igb i2c_i801 ioatdma i2c_algo_bit ehci_pci pcspkr lpc_ich i2c_core ehci_hcd ptp sg mfd_core dca rtc_cmos pps_core mperf button xhci_hcd sd_mod crc_t10dif usbcore usb_common scsi_dh_emc scsi_dh_hp_sw scsi_dh_alua scsi_dh_rdac scsi_dh gru(O) xvma(O) xfs crc32c libcrc32c thermal sata_nv processor piix mptsas mptscsih scsi_transport_sas mptbase megaraid_sas fan thermal_sys hwmon ext3 jbd ata_piix ahci libahci libata scsi_mod
         CPU: 4 PID: 5991 Comm: cat Tainted: G           O 3.11.0-rc5-rja-uv+ #10
         Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013
         task: ffff88081f034580 ti: ffff880820022000 task.ti: ffff880820022000
         RIP: 0010:[<ffffffff81117ed1>]  [<ffffffff81117ed1>] is_pageblock_removable_nolock+0x1/0x90
         RSP: 0018:ffff880820023df8  EFLAGS: 00010287
         RAX: 0000000000040000 RBX: ffffea00c3200000 RCX: 0000000000000004
         RDX: ffffea00c30b0000 RSI: 00000000001c0000 RDI: ffffea00c3200000
         RBP: ffff880820023e38 R08: 0000000000000000 R09: 0000000000000001
         R10: 0000000000000000 R11: 0000000000000001 R12: ffffea00c33c0000
         R13: 0000160000000000 R14: 6db6db6db6db6db7 R15: 0000000000000001
         FS:  00007ffff7fb2700(0000) GS:ffff88083fc80000(0000) knlGS:0000000000000000
         CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
         CR2: ffffea00c3200000 CR3: 000000081b954000 CR4: 00000000000407e0
         Call Trace:
           show_mem_removable+0x41/0x70
           dev_attr_show+0x2a/0x60
           sysfs_read_file+0xf7/0x1c0
           vfs_read+0xc8/0x130
           SyS_read+0x5d/0xa0
           system_call_fastpath+0x16/0x1b
      Signed-off-by: Russ Anderson <rja@sgi.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
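The fix, checking that a section is present before asking whether it is removable, can be modelled with a plain array. `struct section` and `block_removable` are hypothetical; only the figure of 16 sections per block comes from the commit message, and the sketch does not touch real page frames.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SECTIONS_PER_BLOCK 16   /* value quoted in the commit message */

struct section {                /* hypothetical model of a memory section */
    bool present;               /* false models a hole in the block */
    bool removable;             /* only meaningful when present */
};

/* A block is removable when every present section is removable. */
static bool block_removable(const struct section s[SECTIONS_PER_BLOCK])
{
    bool ret = true;

    assert(s != NULL);
    for (size_t i = 0; i < SECTIONS_PER_BLOCK; i++) {
        if (!s[i].present)
            continue;           /* skip holes instead of inspecting them */
        ret = ret && s[i].removable;
    }
    return ret;
}
```

In the reported case only 8 of the 16 slots hold a section, so the loop has to skip the other 8 rather than pass them on for inspection.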