1. 28 Jan, 2012 5 commits
    • Prasad Joshi's avatar
      logfs: Propagate page parameter to __logfs_write_inode · 0bd90387
      Prasad Joshi authored
      During GC LogFS has to rewrite each valid block to a separate segment.
      Rewrite operation reads data from an old segment and writes it to a
      newly allocated segment. Since every write operation changes data
      block pointers maintained in inode, inode should also be rewritten.
      
      In GC path to avoid AB-BA deadlock LogFS marks a page with
      PG_pre_locked in addition to locking the page (PG_locked). The page
      lock is ignored iff the page is pre-locked.
      
      LogFS uses a special file called segment file. The segment file
      maintains an 8 bytes entry for every segment. It keeps track of erase
      count, level etc. for every segment.
      
      Bad things happen with a segment belonging to the segment file is GCed
      
       ------------[ cut here ]------------
      kernel BUG at /home/prasad/logfs/readwrite.c:297!
      invalid opcode: 0000 [#1] SMP
      Modules linked in: logfs joydev usbhid hid psmouse e1000 i2c_piix4
      		serio_raw [last unloaded: logfs]
      Pid: 20161, comm: mount Not tainted 3.1.0-rc3+ #3 innotek GmbH
      		VirtualBox
      EIP: 0060:[<f809132a>] EFLAGS: 00010292 CPU: 0
      EIP is at logfs_lock_write_page+0x6a/0x70 [logfs]
      EAX: 00000027 EBX: f73f5b20 ECX: c16007c8 EDX: 00000094
      ESI: 00000000 EDI: e59be6e4 EBP: c7337b28 ESP: c7337b18
      DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      Process mount (pid: 20161, ti=c7336000 task=eb323f70 task.ti=c7336000)
      Stack:
      f8099a3d c7337b24 f73f5b20 00001002 c7337b50 f8091f6d f8099a4d f80994e4
      00000003 00000000 c7337b68 00000000 c67e4400 00001000 c7337b80 f80935e5
      00000000 00000000 00000000 00000000 e1fcf000 0000000f e59be618 c70bf900
      Call Trace:
      [<f8091f6d>] logfs_get_write_page.clone.16+0xdd/0x100 [logfs]
      [<f80935e5>] logfs_mod_segment_entry+0x55/0x110 [logfs]
      [<f809460d>] logfs_get_segment_entry+0x1d/0x20 [logfs]
      [<f8091060>] ? logfs_cleanup_journal+0x50/0x50 [logfs]
      [<f809521b>] ostore_get_erase_count+0x1b/0x40 [logfs]
      [<f80965b8>] logfs_open_area+0xc8/0x150 [logfs]
      [<c141a7ec>] ? kmemleak_alloc+0x2c/0x60
      [<f809668e>] __logfs_segment_write.clone.16+0x4e/0x1b0 [logfs]
      [<c10dd563>] ? mempool_kmalloc+0x13/0x20
      [<c10dd563>] ? mempool_kmalloc+0x13/0x20
      [<f809696f>] logfs_segment_write+0x17f/0x1d0 [logfs]
      [<f8092e8c>] logfs_write_i0+0x11c/0x180 [logfs]
      [<f8092f35>] logfs_write_direct+0x45/0x90 [logfs]
      [<f80934cd>] __logfs_write_buf+0xbd/0xf0 [logfs]
      [<c102900e>] ? kmap_atomic_prot+0x4e/0xe0
      [<f809424b>] logfs_write_buf+0x3b/0x60 [logfs]
      [<f80947a9>] __logfs_write_inode+0xa9/0x110 [logfs]
      [<f8094cb0>] logfs_rewrite_block+0xc0/0x110 [logfs]
      [<f8095300>] ? get_mapping_page+0x10/0x60 [logfs]
      [<f8095aa0>] ? logfs_load_object_aliases+0x2e0/0x2f0 [logfs]
      [<f808e57d>] logfs_gc_segment+0x2ad/0x310 [logfs]
      [<f808e62a>] __logfs_gc_once+0x4a/0x80 [logfs]
      [<f808ed43>] logfs_gc_pass+0x683/0x6a0 [logfs]
      [<f8097a89>] logfs_mount+0x5a9/0x680 [logfs]
      [<c1126b21>] mount_fs+0x21/0xd0
      [<c10f6f6f>] ? __alloc_percpu+0xf/0x20
      [<c113da41>] ? alloc_vfsmnt+0xb1/0x130
      [<c113db4b>] vfs_kern_mount+0x4b/0xa0
      [<c113e06e>] do_kern_mount+0x3e/0xe0
      [<c113f60d>] do_mount+0x34d/0x670
      [<c10f2749>] ? strndup_user+0x49/0x70
      [<c113fcab>] sys_mount+0x6b/0xa0
      [<c142d87c>] syscall_call+0x7/0xb
      Code: f8 e8 8b 93 39 c9 8b 45 f8 3e 0f ba 28 00 19 d2 85 d2 74 ca eb d0 0f 0b 8d 45 fc 89 44 24 04 c7 04 24 3d 9a 09 f8 e8 09 92 39 c9 <0f> 0b 8d 74 26 00 55 89 e5 3e 8d 74 26 00 8b 10 80 e6 01 74 09
      EIP: [<f809132a>] logfs_lock_write_page+0x6a/0x70 [logfs] SS:ESP 0068:c7337b18
      ---[ end trace 96e67d5b3aa3d6ca ]---
      
      The patch passes locked page to __logfs_write_inode. It calls function
      logfs_get_wblocks() to pre-lock the page. This ensures any further
      attempts to lock the page are ignored (esp from get_erase_count).
      Acked-by: default avatarJoern Engel <joern@logfs.org>
      Signed-off-by: default avatarPrasad Joshi <prasadjoshi.linux@gmail.com>
      0bd90387
    • Prasad Joshi's avatar
      logfs: set superblock shutdown flag after generic sb shutdown · ecfd8909
      Prasad Joshi authored
      While unmounting the file system LogFS calls generic_shutdown_super.
      The function does file system independent superblock shutdown.
      However, it might result in call file system specific inode eviction.
      
      LogFS marks FS shutting down by setting bit LOGFS_SB_FLAG_SHUTDOWN in
      super->s_flags. Since, inode eviction might call truncate on inode,
      following BUG is observed when file system is unmounted:
      
      ------------[ cut here ]------------
      kernel BUG at /home/prasad/logfs/segment.c:362!
      invalid opcode: 0000 [#1] PREEMPT SMP
      CPU 3
      Modules linked in: logfs binfmt_misc ppdev virtio_blk parport_pc lp
      	parport psmouse floppy virtio_pci serio_raw virtio_ring virtio
      
      Pid: 1933, comm: umount Not tainted 3.0.0+ #4 Bochs Bochs
      RIP: 0010:[<ffffffffa008c841>]  [<ffffffffa008c841>]
      		logfs_segment_write+0x211/0x230 [logfs]
      RSP: 0018:ffff880062d7b9e8  EFLAGS: 00010202
      RAX: 000000000000000e RBX: ffff88006eca9000 RCX: 0000000000000000
      RDX: ffff88006fd87c40 RSI: ffffea00014ff468 RDI: ffff88007b68e000
      RBP: ffff880062d7ba48 R08: 8000000020451430 R09: 0000000000000000
      R10: dead000000100100 R11: 0000000000000000 R12: ffff88006fd87c40
      R13: ffffea00014ff468 R14: ffff88005ad0a460 R15: 0000000000000000
      FS:  00007f25d50ea760(0000) GS:ffff88007fd80000(0000)
      	knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000d05e48 CR3: 0000000062c72000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process umount (pid: 1933, threadinfo ffff880062d7a000,
      	task ffff880070b44500)
      Stack:
      ffff880062d7ba38 ffff88005ad0a508 0000000000001000 0000000000000000
      8000000020451430 ffffea00014ff468 ffff880062d7ba48 ffff88005ad0a460
      ffff880062d7bad8 ffffea00014ff468 ffff88006fd87c40 0000000000000000
      Call Trace:
      [<ffffffffa0088fee>] logfs_write_i0+0x12e/0x190 [logfs]
      [<ffffffffa0089360>] __logfs_write_rec+0x140/0x220 [logfs]
      [<ffffffffa0089312>] __logfs_write_rec+0xf2/0x220 [logfs]
      [<ffffffffa00894a4>] logfs_write_rec+0x64/0xd0 [logfs]
      [<ffffffffa0089616>] __logfs_write_buf+0x106/0x110 [logfs]
      [<ffffffffa008a19e>] logfs_write_buf+0x4e/0x80 [logfs]
      [<ffffffffa008a6b8>] __logfs_write_inode+0x98/0x110 [logfs]
      [<ffffffffa008a7c4>] logfs_truncate+0x54/0x290 [logfs]
      [<ffffffffa008abfc>] logfs_evict_inode+0xdc/0x190 [logfs]
      [<ffffffff8115eef5>] evict+0x85/0x170
      [<ffffffff8115f126>] iput+0xe6/0x1b0
      [<ffffffff8115b4a8>] shrink_dcache_for_umount_subtree+0x218/0x280
      [<ffffffff8115ce91>] shrink_dcache_for_umount+0x51/0x90
      [<ffffffff8114796c>] generic_shutdown_super+0x2c/0x100
      [<ffffffffa008cc47>] logfs_kill_sb+0x57/0xf0 [logfs]
      [<ffffffff81147de5>] deactivate_locked_super+0x45/0x70
      [<ffffffff811487ea>] deactivate_super+0x4a/0x70
      [<ffffffff81163934>] mntput_no_expire+0xa4/0xf0
      [<ffffffff8116469f>] sys_umount+0x6f/0x380
      [<ffffffff814dd46b>] system_call_fastpath+0x16/0x1b
      Code: 55 c8 49 8d b6 a8 00 00 00 45 89 f9 45 89 e8 4c 89 e1 4c 89 55
      b8 c7 04 24 00 00 00 00 e8 68 fc ff ff 4c 8b 55 b8 e9 3c ff ff ff <0f>
      0b 0f 0b c7 45 c0 00 00 00 00 e9 44 fe ff ff 66 66 66 66 66
      RIP  [<ffffffffa008c841>] logfs_segment_write+0x211/0x230 [logfs]
      RSP <ffff880062d7b9e8>
      ---[ end trace fe6b040cea952290 ]---
      
      Therefore, move super->s_flags setting after the fs-indenpendent work
      has been finished.
      Reviewed-by: default avatarJoern Engel <joern@logfs.org>
      Signed-off-by: default avatarPrasad Joshi <prasadjoshi.linux@gmail.com>
      ecfd8909
    • Prasad Joshi's avatar
      logfs: take write mutex lock during fsync and sync · 13ced29c
      Prasad Joshi authored
      LogFS uses super->s_write_mutex while writing data to disk. Taking the
      same mutex lock in sync and fsync code path solves the following BUG:
      
      ------------[ cut here ]------------
      kernel BUG at /home/prasad/logfs/dev_bdev.c:134!
      
      Pid: 2387, comm: flush-253:16 Not tainted 3.0.0+ #4 Bochs Bochs
      RIP: 0010:[<ffffffffa007deed>]  [<ffffffffa007deed>]
                      bdev_writeseg+0x25d/0x270 [logfs]
      Call Trace:
      [<ffffffffa007c381>] logfs_open_area+0x91/0x150 [logfs]
      [<ffffffff8128dcb2>] ? find_level.clone.9+0x62/0x100
      [<ffffffffa007c49c>] __logfs_segment_write.clone.20+0x5c/0x190 [logfs]
      [<ffffffff810ef005>] ? mempool_kmalloc+0x15/0x20
      [<ffffffff810ef383>] ? mempool_alloc+0x53/0x130
      [<ffffffffa007c7a4>] logfs_segment_write+0x1d4/0x230 [logfs]
      [<ffffffffa0078f8e>] logfs_write_i0+0x12e/0x190 [logfs]
      [<ffffffffa0079300>] __logfs_write_rec+0x140/0x220 [logfs]
      [<ffffffffa0079444>] logfs_write_rec+0x64/0xd0 [logfs]
      [<ffffffffa00795b6>] __logfs_write_buf+0x106/0x110 [logfs]
      [<ffffffffa007a13e>] logfs_write_buf+0x4e/0x80 [logfs]
      [<ffffffffa0073e33>] __logfs_writepage+0x23/0x80 [logfs]
      [<ffffffffa007410c>] logfs_writepage+0xdc/0x110 [logfs]
      [<ffffffff810f5ba7>] __writepage+0x17/0x40
      [<ffffffff810f6208>] write_cache_pages+0x208/0x4f0
      [<ffffffff810f5b90>] ? set_page_dirty+0x70/0x70
      [<ffffffff810f653a>] generic_writepages+0x4a/0x70
      [<ffffffff810f75d1>] do_writepages+0x21/0x40
      [<ffffffff8116b9d1>] writeback_single_inode+0x101/0x250
      [<ffffffff8116bdbd>] writeback_sb_inodes+0xed/0x1c0
      [<ffffffff8116c5fb>] writeback_inodes_wb+0x7b/0x1e0
      [<ffffffff8116cc23>] wb_writeback+0x4c3/0x530
      [<ffffffff814d984d>] ? sub_preempt_count+0x9d/0xd0
      [<ffffffff8116cd6b>] wb_do_writeback+0xdb/0x290
      [<ffffffff814d984d>] ? sub_preempt_count+0x9d/0xd0
      [<ffffffff814d6208>] ? _raw_spin_unlock_irqrestore+0x18/0x40
      [<ffffffff8105aa5a>] ? del_timer+0x8a/0x120
      [<ffffffff8116cfac>] bdi_writeback_thread+0x8c/0x2e0
      [<ffffffff8116cf20>] ? wb_do_writeback+0x290/0x290
      [<ffffffff8106d2e6>] kthread+0x96/0xa0
      [<ffffffff814de514>] kernel_thread_helper+0x4/0x10
      [<ffffffff8106d250>] ? kthread_worker_fn+0x190/0x190
      [<ffffffff814de510>] ? gs_change+0xb/0xb
      RIP  [<ffffffffa007deed>] bdev_writeseg+0x25d/0x270 [logfs]
      ---[ end trace 0211ad60a57657c4 ]---
      Reviewed-by: default avatarJoern Engel <joern@logfs.org>
      Signed-off-by: default avatarPrasad Joshi <prasadjoshi.linux@gmail.com>
      13ced29c
    • Joern Engel's avatar
      logfs: Prevent memory corruption · 934eed39
      Joern Engel authored
      This is a bad one.  I wonder whether we were so far protected by
      no_free_segments(sb) usually being smaller than LOGFS_NO_AREAS.
      
      Found by Dan Carpenter <dan.carpenter@oracle.com> using smatch.
      Signed-off-by: default avatarJoern Engel <joern@logfs.org>
      Signed-off-by: default avatarPrasad Joshi <prasadjoshi.linux@gmail.com>
      934eed39
    • Prasad Joshi's avatar
      logfs: update page reference count for pined pages · 96150606
      Prasad Joshi authored
      LogFS sets PG_private flag to indicate a pined page. We assumed that
      marking a page as private is enough to ensure its existence. But
      instead it is necessary to hold a reference count to the page.
      
      The change resolves the following BUG
      
      BUG: Bad page state in process flush-253:16  pfn:6a6d0
      page flags: 0x100000000000808(uptodate|private)
      Suggested-and-Acked-by: default avatarJoern Engel <joern@logfs.org>
      Signed-off-by: default avatarPrasad Joshi <prasadjoshi.linux@gmail.com>
      96150606
  2. 04 Jan, 2012 4 commits
    • Linus Torvalds's avatar
      Revert "rtc: Expire alarms after the time is set." · f423fc62
      Linus Torvalds authored
      This reverts commit 93b2ec01.
      
      The call to "schedule_work()" in rtc_initialize_alarm() happens too
      early, and can cause oopses at bootup
      
      Neil Brown explains why we do it:
      
        "If you set an alarm in the future, then shutdown and boot again after
         that time, then you will end up with a timer_queue node which is in
         the past.
      
         When this happens the queue gets stuck.  That entry-in-the-past won't
         get removed until and interrupt happens and an interrupt won't happen
         because the RTC only triggers an interrupt when the alarm is "now".
      
         So you'll find that e.g.  "hwclock" will always tell you that
         'select' timed out.
      
         So we force the interrupt work to happen at the start just in case."
      
      and has a patch that convert it to do things in-process rather than with
      the worker thread, but right now it's too late to play around with this,
      so we just revert the patch that caused problems for now.
      Reported-by: default avatarSander Eikelenboom <linux@eikelenboom.it>
      Requested-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Requested-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f423fc62
    • Linus Torvalds's avatar
      Revert "rtc: Disable the alarm in the hardware" · 157e8bf8
      Linus Torvalds authored
      This reverts commit c0afabd3.
      
      It causes failures on Toshiba laptops - instead of disabling the alarm,
      it actually seems to enable it on the affected laptops, resulting in
      (for example) the laptop powering on automatically five minutes after
      shutdown.
      
      There's a patch for it that appears to work for at least some people,
      but it's too late to play around with this, so revert for now and try
      again in the next merge window.
      
      See for example
      
      	http://bugs.debian.org/652869
      
      Reported-and-bisected-by: Andreas Friedrich <afrie@gmx.net> (Toshiba Tecra)
      Reported-by: Antonio-M. Corbi Bellot <antonio.corbi@ua.es> (Toshiba Portege R500)
      Reported-by: Marco Santos <marco.santos@waynext.com> (Toshiba Portege Z830)
      Reported-by: Christophe Vu-Brugier <cvubrugier@yahoo.fr>  (Toshiba Portege R830)
      Cc: Jonathan Nieder <jrnieder@gmail.com>
      Requested-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Cc: stable@kernel.org  # for the versions that applied this
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      157e8bf8
    • Mandeep Singh Baines's avatar
      hung_task: fix false positive during vfork · f9fab10b
      Mandeep Singh Baines authored
      vfork parent uninterruptibly and unkillably waits for its child to
      exec/exit. This wait is of unbounded length. Ignore such waits
      in the hung_task detector.
      Signed-off-by: default avatarMandeep Singh Baines <msb@chromium.org>
      Reported-by: default avatarSasha Levin <levinsasha928@gmail.com>
      LKML-Reference: <1325344394.28904.43.camel@lappy>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f9fab10b
    • Jan Kara's avatar
      security: Fix security_old_inode_init_security() when CONFIG_SECURITY is not set · 30e05324
      Jan Kara authored
      Commit 1e39f384 ("evm: fix build problems") makes the stub version
      of security_old_inode_init_security() return 0 when CONFIG_SECURITY is
      not set.
      
      But that makes callers such as reiserfs_security_init() assume that
      security_old_inode_init_security() has set name, value, and len
      arguments properly - but security_old_inode_init_security() left them
      uninitialized which then results in interesting failures.
      
      Revert security_old_inode_init_security() to the old behavior of
      returning EOPNOTSUPP since both callers (reiserfs and ocfs2) handle this
      just fine.
      
      [ Also fixed the S_PRIVATE(inode) case of the actual non-stub
        security_old_inode_init_security() function to return EOPNOTSUPP
        for the same reason, as pointed out by Mimi Zohar.
      
        It got incorrectly changed to match the new function in commit
        fb88c2b6: "evm: fix security/security_old_init_security return
        code".   - Linus ]
      Reported-by: default avatarJorge Bastos <mysql.jorge@decimal.pt>
      Acked-by: default avatarJames Morris <jmorris@namei.org>
      Acked-by: default avatarMimi Zohar <zohar@us.ibm.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      30e05324
  3. 03 Jan, 2012 1 commit
  4. 02 Jan, 2012 2 commits
  5. 31 Dec, 2011 6 commits
  6. 30 Dec, 2011 17 commits
  7. 29 Dec, 2011 3 commits
  8. 28 Dec, 2011 2 commits