1. 26 Mar, 2016 22 commits
  2. 25 Mar, 2016 18 commits
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 606c61a0
      Linus Torvalds authored
      Merge fourth patch-bomb from Andrew Morton:
       "A lot more stuff than expected, sorry.  A bunch of ocfs2 reviewing was
        finished off.
      
         - mhocko's oom-reaper out-of-memory-handler changes
      
         - ocfs2 fixes and features
      
         - KASAN feature work
      
         - various fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (42 commits)
        thp: fix typo in khugepaged_scan_pmd()
        MAINTAINERS: fill entries for KASAN
        mm/filemap: generic_file_read_iter(): check for zero reads unconditionally
        kasan: test fix: warn if the UAF could not be detected in kmalloc_uaf2
        mm, kasan: stackdepot implementation. Enable stackdepot for SLAB
        arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections
        mm, kasan: add GFP flags to KASAN API
        mm, kasan: SLAB support
        kasan: modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right()
        include/linux/oom.h: remove undefined oom_kills_count()/note_oom_kill()
        mm/page_alloc: prevent merging between isolated and other pageblocks
        drivers/memstick/host/r592.c: avoid gcc-6 warning
        ocfs2: extend enough credits for freeing one truncate record while replaying truncate records
        ocfs2: extend transaction for ocfs2_remove_rightmost_path() and ocfs2_update_edge_lengths() before to avoid inconsistency between inode and et
        ocfs2/dlm: move lock to the tail of grant queue while doing in-place convert
        ocfs2: solve a problem of crossing the boundary in updating backups
        ocfs2: fix occurring deadlock by changing ocfs2_wq from global to local
        ocfs2/dlm: fix BUG in dlm_move_lockres_to_recovery_list
        ocfs2/dlm: fix race between convert and recovery
        ocfs2: fix a deadlock issue in ocfs2_dio_end_io_write()
        ...
      606c61a0
    • Linus Torvalds's avatar
      Merge tag 'pm+acpi-4.6-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 15dbc136
      Linus Torvalds authored
      Pull power management fixlet from Rafael Wysocki:
       "One of commits in my previous pull request changed the permissions of
        drivers/power/avs/rockchip-io-domain.c to executable by mistake"
      
      * tag 'pm+acpi-4.6-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        Fix permissions of drivers/power/avs/rockchip-io-domain.c
      15dbc136
    • Linus Torvalds's avatar
      Merge tag 'please-pull-preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux · dad44dec
      Linus Torvalds authored
      Pull ia64 update from Tony Luck:
       "Wire up new system calls p{read,write}v2 for ia64"
      
      * tag 'please-pull-preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
        [IA64] Enable preadv2 and pwritev2 syscalls for ia64
      dad44dec
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · c155c749
      Linus Torvalds authored
      Pull more input updates from Dmitry Torokhov:
       "Second round of updates for the input subsystem.
      
        The BYD PS/2 protocol driver now uses absolute reporting mode and
        should behave more like other touchpads; Synaptics driver needed to
        extend one of its quirks to a newer firmware version, and a few USB
        drivers got tightened up checks for the contents of their descriptors"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: sur40 - fix DMA on stack
        Input: ati_remote2 - fix crashes on detecting device with invalid descriptor
        Input: synaptics - handle spurious release of trackstick buttons, again
        Input: synaptics-rmi4 - remove check of Non-NULL array
        Input: byd - enable absolute mode
        Input: ims-pcu - sanity check against missing interfaces
        Input: melfas_mip4 - add hw_version sysfs attribute
      c155c749
    • Kirill A. Shutemov's avatar
      thp: fix typo in khugepaged_scan_pmd() · 0fda2788
      Kirill A. Shutemov authored
      !PageLRU should lead to SCAN_PAGE_LRU, not SCAN_SCAN_ABORT result.
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Ebru Akagunduz <ebru.akagunduz@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0fda2788
    • Andrey Ryabinin's avatar
      MAINTAINERS: fill entries for KASAN · 0ba1d91d
      Andrey Ryabinin authored
      Signed-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Acked-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0ba1d91d
    • Nicolai Stange's avatar
      mm/filemap: generic_file_read_iter(): check for zero reads unconditionally · e7080a43
      Nicolai Stange authored
      If
       - generic_file_read_iter() gets called with a zero read length,
       - the read offset is at a page boundary,
       - IOCB_DIRECT is not set
      -  and the page in question hasn't made it into the page cache yet,
      then do_generic_file_read() will trigger a readahead with a req_size hint
      of zero.
      
      Since roundup_pow_of_two(0) is undefined, UBSAN reports
      
        UBSAN: Undefined behaviour in include/linux/log2.h:63:13
        shift exponent 64 is too large for 64-bit type 'long unsigned int'
        CPU: 3 PID: 1017 Comm: sa1 Tainted: G L 4.5.0-next-20160318+ #14
        [...]
        Call Trace:
         [...]
         [<ffffffff813ef61a>] ondemand_readahead+0x3aa/0x3d0
         [<ffffffff813ef61a>] ? ondemand_readahead+0x3aa/0x3d0
         [<ffffffff813c73bd>] ? find_get_entry+0x2d/0x210
         [<ffffffff813ef9c3>] page_cache_sync_readahead+0x63/0xa0
         [<ffffffff813cc04d>] do_generic_file_read+0x80d/0xf90
         [<ffffffff813cc955>] generic_file_read_iter+0x185/0x420
         [...]
         [<ffffffff81510b06>] __vfs_read+0x256/0x3d0
         [...]
      
      when get_init_ra_size() gets called from ondemand_readahead().
      
      The net effect is that the initial readahead size is arch dependent for
      requested read lengths of zero: for example, since
      
        1UL << (sizeof(unsigned long) * 8)
      
      evaluates to 1 on x86 while its result is 0 on ARMv7, the initial readahead
      size becomes 4 on the former and 0 on the latter.
      
      What's more, whether or not the file access timestamp is updated for zero
      length reads is decided differently for the two cases of IOCB_DIRECT
      being set or cleared: in the first case, generic_file_read_iter()
      explicitly skips updating that timestamp while in the latter case, it is
      always updated through the call to do_generic_file_read().
      
      According to POSIX, zero length reads "do not modify the last data access
      timestamp" and thus, the IOCB_DIRECT behaviour is POSIXly correct.
      
      Let generic_file_read_iter() unconditionally check the requested read
      length at its entry and return immediately with success if it is zero.
      Signed-off-by: default avatarNicolai Stange <nicstange@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e7080a43
    • Alexander Potapenko's avatar
      kasan: test fix: warn if the UAF could not be detected in kmalloc_uaf2 · 9dcadd38
      Alexander Potapenko authored
      Signed-off-by: default avatarAlexander Potapenko <glider@google.com>
      Acked-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrey Konovalov <adech.fo@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9dcadd38
    • Alexander Potapenko's avatar
      mm, kasan: stackdepot implementation. Enable stackdepot for SLAB · cd11016e
      Alexander Potapenko authored
      Implement the stack depot and provide CONFIG_STACKDEPOT.  Stack depot
      will allow KASAN store allocation/deallocation stack traces for memory
      chunks.  The stack traces are stored in a hash table and referenced by
      handles which reside in the kasan_alloc_meta and kasan_free_meta
      structures in the allocated memory chunks.
      
      IRQ stack traces are cut below the IRQ entry point to avoid unnecessary
      duplication.
      
      Right now stackdepot support is only enabled in SLAB allocator.  Once
      KASAN features in SLAB are on par with those in SLUB we can switch SLUB
      to stackdepot as well, thus removing the dependency on SLUB stack
      bookkeeping, which wastes a lot of memory.
      
      This patch is based on the "mm: kasan: stack depots" patch originally
      prepared by Dmitry Chernenkov.
      
      Joonsoo has said that he plans to reuse the stackdepot code for the
      mm/page_owner.c debugging facility.
      
      [akpm@linux-foundation.org: s/depot_stack_handle/depot_stack_handle_t]
      [aryabinin@virtuozzo.com: comment style fixes]
      Signed-off-by: default avatarAlexander Potapenko <glider@google.com>
      Signed-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrey Konovalov <adech.fo@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cd11016e
    • Alexander Potapenko's avatar
      arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections · be7635e7
      Alexander Potapenko authored
      KASAN needs to know whether the allocation happens in an IRQ handler.
      This lets us strip everything below the IRQ entry point to reduce the
      number of unique stack traces needed to be stored.
      
      Move the definition of __irq_entry to <linux/interrupt.h> so that the
      users don't need to pull in <linux/ftrace.h>.  Also introduce the
      __softirq_entry macro which is similar to __irq_entry, but puts the
      corresponding functions to the .softirqentry.text section.
      Signed-off-by: default avatarAlexander Potapenko <glider@google.com>
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrey Konovalov <adech.fo@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      be7635e7
    • Alexander Potapenko's avatar
      mm, kasan: add GFP flags to KASAN API · 505f5dcb
      Alexander Potapenko authored
      Add GFP flags to KASAN hooks for future patches to use.
      
      This patch is based on the "mm: kasan: unified support for SLUB and SLAB
      allocators" patch originally prepared by Dmitry Chernenkov.
      Signed-off-by: default avatarAlexander Potapenko <glider@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrey Konovalov <adech.fo@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      505f5dcb
    • Alexander Potapenko's avatar
      mm, kasan: SLAB support · 7ed2f9e6
      Alexander Potapenko authored
      Add KASAN hooks to SLAB allocator.
      
      This patch is based on the "mm: kasan: unified support for SLUB and SLAB
      allocators" patch originally prepared by Dmitry Chernenkov.
      Signed-off-by: default avatarAlexander Potapenko <glider@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrey Konovalov <adech.fo@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7ed2f9e6
    • Alexander Potapenko's avatar
      kasan: modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right() · e6e8379c
      Alexander Potapenko authored
      This patchset implements SLAB support for KASAN
      
      Unlike SLUB, SLAB doesn't store allocation/deallocation stacks for heap
      objects, therefore we reimplement this feature in mm/kasan/stackdepot.c.
      The intention is to ultimately switch SLUB to use this implementation as
      well, which will save a lot of memory (right now SLUB bloats each object
      by 256 bytes to store the allocation/deallocation stacks).
      
      Also neither SLUB nor SLAB delay the reuse of freed memory chunks, which
      is necessary for better detection of use-after-free errors.  We
      introduce memory quarantine (mm/kasan/quarantine.c), which allows
      delayed reuse of deallocated memory.
      
      This patch (of 7):
      
      Rename kmalloc_large_oob_right() to kmalloc_pagealloc_oob_right(), as
      the test only checks the page allocator functionality.  Also reimplement
      kmalloc_large_oob_right() so that the test allocates a large enough
      chunk of memory that still does not trigger the page allocator fallback.
      Signed-off-by: default avatarAlexander Potapenko <glider@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrey Konovalov <adech.fo@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e6e8379c
    • Tetsuo Handa's avatar
      include/linux/oom.h: remove undefined oom_kills_count()/note_oom_kill() · aaf4fb71
      Tetsuo Handa authored
      A leftover from commit c32b3cbe ("oom, PM: make OOM detection in the
      freezer path raceless").
      Signed-off-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      aaf4fb71
    • Vlastimil Babka's avatar
      mm/page_alloc: prevent merging between isolated and other pageblocks · d9dddbf5
      Vlastimil Babka authored
      Hanjun Guo has reported that a CMA stress test causes broken accounting of
      CMA and free pages:
      
      > Before the test, I got:
      > -bash-4.3# cat /proc/meminfo | grep Cma
      > CmaTotal:         204800 kB
      > CmaFree:          195044 kB
      >
      >
      > After running the test:
      > -bash-4.3# cat /proc/meminfo | grep Cma
      > CmaTotal:         204800 kB
      > CmaFree:         6602584 kB
      >
      > So the freed CMA memory is more than total..
      >
      > Also the the MemFree is more than mem total:
      >
      > -bash-4.3# cat /proc/meminfo
      > MemTotal:       16342016 kB
      > MemFree:        22367268 kB
      > MemAvailable:   22370528 kB
      
      Laura Abbott has confirmed the issue and suspected the freepage accounting
      rewrite around 3.18/4.0 by Joonsoo Kim.  Joonsoo had a theory that this is
      caused by unexpected merging between MIGRATE_ISOLATE and MIGRATE_CMA
      pageblocks:
      
      > CMA isolates MAX_ORDER aligned blocks, but, during the process,
      > partialy isolated block exists. If MAX_ORDER is 11 and
      > pageblock_order is 9, two pageblocks make up MAX_ORDER
      > aligned block and I can think following scenario because pageblock
      > (un)isolation would be done one by one.
      >
      > (each character means one pageblock. 'C', 'I' means MIGRATE_CMA,
      > MIGRATE_ISOLATE, respectively.
      >
      > CC -> IC -> II (Isolation)
      > II -> CI -> CC (Un-isolation)
      >
      > If some pages are freed at this intermediate state such as IC or CI,
      > that page could be merged to the other page that is resident on
      > different type of pageblock and it will cause wrong freepage count.
      
      This was supposed to be prevented by CMA operating on MAX_ORDER blocks,
      but since it doesn't hold the zone->lock between pageblocks, a race
      window does exist.
      
      It's also likely that unexpected merging can occur between
      MIGRATE_ISOLATE and non-CMA pageblocks.  This should be prevented in
      __free_one_page() since commit 3c605096 ("mm/page_alloc: restrict
      max order of merging on isolated pageblock").  However, we only check
      the migratetype of the pageblock where buddy merging has been initiated,
      not the migratetype of the buddy pageblock (or group of pageblocks)
      which can be MIGRATE_ISOLATE.
      
      Joonsoo has suggested checking for buddy migratetype as part of
      page_is_buddy(), but that would add extra checks in allocator hotpath
      and bloat-o-meter has shown significant code bloat (the function is
      inline).
      
      This patch reduces the bloat at some expense of more complicated code.
      The buddy-merging while-loop in __free_one_page() is initially bounded
      to pageblock_border and without any migratetype checks.  The checks are
      placed outside, bumping the max_order if merging is allowed, and
      returning to the while-loop with a statement which can't be possibly
      considered harmful.
      
      This fixes the accounting bug and also removes the arguably weird state
      in the original commit 3c605096 where buddies could be left
      unmerged.
      
      Fixes: 3c605096 ("mm/page_alloc: restrict max order of merging on isolated pageblock")
      Link: https://lkml.org/lkml/2016/3/2/280Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Reported-by: default avatarHanjun Guo <guohanjun@huawei.com>
      Tested-by: default avatarHanjun Guo <guohanjun@huawei.com>
      Acked-by: default avatarJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Debugged-by: default avatarLaura Abbott <labbott@redhat.com>
      Debugged-by: default avatarJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: <stable@vger.kernel.org>	[3.18+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d9dddbf5
    • Arnd Bergmann's avatar
      drivers/memstick/host/r592.c: avoid gcc-6 warning · f419a08f
      Arnd Bergmann authored
      The r592 driver relies on behavior of the DMA mapping API that is
      normally observed but not guaranteed by the API.  Instead it uses a
      runtime check to fail transfers if the API ever behaves
      
      When CONFIG_NEED_SG_DMA_LENGTH is not set, one of the checks turns into a
      comparison of a variable with itself, which gcc-6.0 now warns about:
      
      drivers/memstick/host/r592.c: In function 'r592_transfer_fifo_dma':
      drivers/memstick/host/r592.c:302:31: error: self-comparison always evaluates to false [-Werror=tautological-compare]
          (sg_dma_len(&dev->req->sg) < dev->req->sg.length)) {
                                     ^
      
      The check itself is not a problem, so this patch just rephrases the
      condition in a way that gcc does not consider an indication of a mistake.
      We already know that dev->req->sg.length was initially R592_LFIFO_SIZE, so
      we can compare it to that constant again.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Cc: Maxim Levitsky <maximlevitsky@gmail.com>
      Cc: Quentin Lambert <lambert.quentin@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f419a08f
    • Xue jiufei's avatar
      ocfs2: extend enough credits for freeing one truncate record while replaying truncate records · 102c2595
      Xue jiufei authored
      Now function ocfs2_replay_truncate_records() first modifies tl_used,
      then calls ocfs2_extend_trans() to extend transactions for gd and alloc
      inode used for freeing clusters.  jbd2_journal_restart() may be called
      and it may happen that tl_used in truncate log is decreased but the
      clusters are not freed, which means these clusters are lost.  So we
      should avoid extending transactions in these two operations.
      Signed-off-by: default avatarjoyce.xue <xuejiufei@huawei.com>
      Reviewed-by: default avatarMark Fasheh <mfasheh@suse.de>
      Acked-by: default avatarJoseph Qi <joseph.qi@huawei.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      102c2595
    • Xue jiufei's avatar
      ocfs2: extend transaction for ocfs2_remove_rightmost_path() and... · 17215989
      Xue jiufei authored
      ocfs2: extend transaction for ocfs2_remove_rightmost_path() and ocfs2_update_edge_lengths() before to avoid inconsistency between inode and et
      
      I found that jbd2_journal_restart() is called in some places without
      keeping things consistently before.  However, jbd2_journal_restart() may
      commit the handle's transaction and restart another one.  If the first
      transaction is committed successfully while another not, it may cause
      filesystem inconsistency or read only.  This is an effort to fix this
      kind of problems.
      
      This patch (of 3):
      
      The following functions will be called while truncating an extent:
      ocfs2_remove_btree_range
        -> ocfs2_start_trans
        -> ocfs2_remove_extent
           -> ocfs2_truncate_rec
             -> ocfs2_extend_rotate_transaction
               -> jbd2_journal_restart if jbd2_journal_extend fail
             -> ocfs2_rotate_tree_left
               -> ocfs2_remove_rightmost_path
                   -> ocfs2_extend_rotate_transaction
                     -> ocfs2_unlink_subtree
                      -> ocfs2_update_edge_lengths
                        -> ocfs2_extend_trans
                          -> jbd2_journal_restart if jbd2_journal_extend fail
        -> ocfs2_et_update_clusters
        -> ocfs2_commit_trans
      
      jbd2_journal_restart() may be called and it may happened that the buffers
      dirtied in ocfs2_truncate_rec() are committed while buffers dirtied in
      ocfs2_et_update_clusters() are not, the total clusters on extent tree and
      i_clusters in ocfs2_dinode is inconsistency.  So the clusters got from
      ocfs2_dinode is incorrect, and it also cause read-only problem when call
      ocfs2_commit_truncate() with the error message: "Inode %llu has empty
      extent block at %llu".
      
      We should extend enough credits for function ocfs2_remove_rightmost_path
      and ocfs2_update_edge_lengths to avoid this inconsistency.
      Signed-off-by: default avatarjoyce.xue <xuejiufei@huawei.com>
      Acked-by: default avatarJoseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      17215989