1. 17 Mar, 2015 1 commit
    • livepatch: Fix subtle race with coming and going modules · 8cb2c2dc
      Petr Mladek authored
      There is a notifier that handles live patches for coming and going modules.
      It takes klp_mutex lock to avoid races with coming and going patches but
      it does not keep the lock all the time. Therefore the following races are
      possible:
      
        1. The notifier is called sometime in MODULE_STATE_COMING. The module
           is visible to find_module() in this state all the time. It means that
           a new patch can be registered and enabled even before the notifier is
           called. It might create a wrong order of stacked patches, see below
           for an example.
      
        2. A new patch could still see the module in the GOING state even after
           the notifier has been called. It will try to initialize the related
           object structures but the module could disappear at any time. This
           would leave stale data in the structures and might even cause an
           invalid memory access.
      
      This patch solves the problem by adding a boolean variable into struct module.
      The value is true after the coming and before the going handler is called.
      New patches need to be applied when the value is true and they need to ignore
      the module when the value is false.
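
      The rule above can be modelled in a few lines of user-space C. This is
      only a sketch under stated assumptions: struct module_model, the field
      name patchable and the three helpers are hypothetical stand-ins, not the
      kernel's actual definitions, and locking (klp_mutex in the kernel) is
      left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct module (illustration only). */
enum module_state {
	MODULE_STATE_LIVE,
	MODULE_STATE_COMING,
	MODULE_STATE_GOING,
};

struct module_model {
	enum module_state state;
	bool patchable;	/* hypothetical name for the new boolean */
};

/* Coming handler ran: from now on new patches must be applied to the module. */
static void notify_coming(struct module_model *mod)
{
	assert(mod != NULL);
	mod->patchable = true;
}

/* Going handler ran: from now on new patches must ignore the module. */
static void notify_going(struct module_model *mod)
{
	assert(mod != NULL);
	mod->patchable = false;
}

/*
 * Decision taken when a new patch is registered. It looks only at the
 * boolean, never at mod->state, so a module that is already GOING but
 * whose handler has not run yet still gets patched.
 */
static bool new_patch_must_handle(const struct module_model *mod)
{
	return mod != NULL && mod->patchable;
}
```

      The point of the design is that the boolean changes at exactly the
      moment the handler runs, whereas the module state changes earlier, which
      is what opened the race windows.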
      
      Note that we need to know state of all modules on the system. The races are
      related to new patches. Therefore we do not know what modules will get
      patched.
      
      Also note that we could not simply ignore going modules. The code from the
      module could be called even in the GOING state until mod->exit() finishes.
      If we start supporting patches with semantic changes between function
      calls, we need to apply new patches to any still usable code.
      See below for an example.
      
      Finally note that the patch solves only the situation when a new patch is
      registered. There are no such problems when the patch is being removed.
      It does not matter which one disables the patch first, whether the normal
      disable_patch() or the module notifier. There is nothing to do
      once the patch is disabled.
      
      Alternative solutions:
      ======================
      
      + reject new patches when a patched module is coming or going; this is ugly
      
      + wait with adding new patch until the module leaves the COMING and GOING
        states; this might be dangerous and complicated; we would need to release
        kgr_lock in the middle of the patch registration to avoid a deadlock
        with the coming and going handlers; also we might need a waitqueue for
        each module which seems to be even bigger overhead than the boolean
      
      + stop modules from entering COMING and GOING states; wait until modules
        leave these states when they are already there; looks complicated; we would
        need to ignore the module that asked to stop the others to avoid a deadlock;
        also it is unclear what to do when two modules asked to stop others and
        both are in COMING state (situation when two new patches are applied)
      
      + always register/enable new patches and fix up the potential mess (registered
        patches order) in klp_module_init(); this is nasty and prone to regressions
        in the future development
      
      + add another MODULE_STATE where the kallsyms are visible but the module is not
        used yet; this looks too complex; the module states are checked on "many"
        locations
      
      Example of patch stacking breakage:
      ===================================
      
      The notifier could _not_ _simply_ ignore already initialized module objects.
      For example, let's have three patches (P1, P2, P3) for functions a() and b()
      where a() is from vmcore and b() is from a module M. Something like:
      
      	a()	b()
      P1	a1()	b1()
      P2	a2()	b2()
      P3	a3()	b3()
      
      If you load the module M after all patches are registered and enabled,
      the ftrace ops for functions a() and b() list the functions in this
      order:
      
      	ops_a->func_stack -> list(a3,a2,a1)
      	ops_b->func_stack -> list(b3,b2,b1)
      
      so the pointer to b3() is the first and will be used.
      
      Then you might have the following scenario. Let's start with the state where
      patches P1 and P2 are registered and enabled but the module M is not loaded.
      The ftrace ops for b() do not exist yet. Then we get into the following race:
      
      CPU0					CPU1
      
      load_module(M)
      
        complete_formation()
      
        mod->state = MODULE_STATE_COMING;
        mutex_unlock(&module_mutex);
      
      					klp_register_patch(P3);
      					klp_enable_patch(P3);
      
      					# STATE 1
      
        klp_module_notify(M)
          klp_module_notify_coming(P1);
          klp_module_notify_coming(P2);
          klp_module_notify_coming(P3);
      
      					# STATE 2
      
      The ftrace ops for a() and b() then look like this:
      
        STATE1:
      
      	ops_a->func_stack -> list(a3,a2,a1);
      	ops_b->func_stack -> list(b3);
      
        STATE2:
      	ops_a->func_stack -> list(a3,a2,a1);
      	ops_b->func_stack -> list(b2,b1,b3);
      
      Therefore, b2() is used for the module but a3() is used for vmcore
      because they were the last added.
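
      The ordering problem can be reproduced with a toy model of the function
      stack. The sketch below is a user-space illustration only: struct
      func_stack, stack_push() and the two apply_*() helpers are hypothetical
      names, and the real func_stack is a kernel list, not an array.

```c
#include <assert.h>
#include <string.h>

#define MAX_FUNCS 8

/* Toy model of ops->func_stack: index 0 is the function that gets called. */
struct func_stack {
	const char *funcs[MAX_FUNCS];
	int count;
};

/* Enabling a patched function puts it on top of the stack. */
static void stack_push(struct func_stack *s, const char *name)
{
	assert(s->count < MAX_FUNCS);
	memmove(&s->funcs[1], &s->funcs[0], s->count * sizeof(s->funcs[0]));
	s->funcs[0] = name;
	s->count++;
}

/* Function that would be used by callers of b(), or NULL if unpatched. */
static const char *stack_top(const struct func_stack *s)
{
	return s->count ? s->funcs[0] : NULL;
}

/* Module loaded after P1, P2, P3 were enabled: handled in patch order. */
static void apply_expected_order(struct func_stack *ops_b)
{
	stack_push(ops_b, "b1");
	stack_push(ops_b, "b2");
	stack_push(ops_b, "b3");
}

/* Race: P3 sees the COMING module first, the notifier then adds P1 and P2. */
static void apply_racy_order(struct func_stack *ops_b)
{
	stack_push(ops_b, "b3");	/* STATE 1: list(b3) */
	stack_push(ops_b, "b1");
	stack_push(ops_b, "b2");	/* STATE 2: list(b2,b1,b3) */
}
```

      With the expected order, stack_top() yields b3; with the racy order it
      yields b2, which no longer matches a3() on the core side.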
      
      Example of the race with going modules:
      =======================================
      
      CPU0					CPU1
      
      delete_module()  #SYSCALL
      
         try_stop_module()
           mod->state = MODULE_STATE_GOING;
      
         mutex_unlock(&module_mutex);
      
      					klp_register_patch()
      					klp_enable_patch()
      
      					# safe place to switch universe
      
      					b()     # from module that is going
      					  a()   # from core (patched)
      
         mod->exit();
      
      Note that the function b() can be called until we call mod->exit().
      
      If we do not apply the patch against b() because it is in MODULE_STATE_GOING,
      it will call the patched a() with modified semantics and things might go wrong.
      
      [jpoimboe@redhat.com: use one boolean instead of two]
      Signed-off-by: Petr Mladek <pmladek@suse.cz>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
  2. 02 Mar, 2015 1 commit
  3. 22 Feb, 2015 1 commit
  4. 16 Feb, 2015 1 commit
  5. 11 Feb, 2015 6 commits
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching · 1d9c5d79
      Linus Torvalds authored
      Pull live patching infrastructure from Jiri Kosina:
       "Let me provide a bit of history first, before describing what is in
        this pile.
      
        Originally, there was kSplice as a standalone project that implemented
        stop_machine()-based patching for the Linux kernel.  This project was
        later acquired, and the current owner is providing live patching as a
        proprietary service, without any intentions to have their
        implementation merged.
      
        Then, due to rising user/customer demand, both Red Hat and SUSE
        started working on their own implementation (not knowing about each
        other), and announced first versions roughly at the same time [1] [2].
      
        The principal difference between the two solutions is how they are
        making sure that the patching is performed in a consistent way when it
        comes to different execution threads with respect to the semantic
        nature of the change that is being introduced.
      
        In a nutshell, kPatch is issuing stop_machine(), then looking at
        stacks of all existing processes, and if it decides that the system
        is in a state that can be patched safely, it proceeds to insert code
        redirection machinery to the patched functions.
      
        On the other hand, kGraft provides a per-thread consistency during one
        single pass of a process through the kernel and performs a lazy
        continuous migration of threads from "unpatched" universe to the
        "patched" one at safe checkpoints.
      
        If interested in a more detailed discussion about the consistency
        models and their possible combinations, please see the thread that
        evolved around [3].
      
        It pretty quickly became obvious to the interested parties that it's
        absolutely impractical in this case to have several isolated solutions
        for one task to co-exist in the kernel.  During a dedicated Live
        Kernel Patching track at LPC in Dusseldorf, all the interested parties
        sat together and came up with a joint approach that would work for both
        distro vendors.  Steven Rostedt took notes [4] from this meeting.
      
        And the foundation for that approach is what's present in this pull
        request.
      
        It provides a basic infrastructure for function "live patching" (i.e.
        code redirection), including API for kernel modules containing the
        actual patches, and API/ABI for userspace to be able to operate on the
        patches (look up what patches are applied, enable/disable them, etc).
      
        It's relatively simple and minimalistic, as it's making use of
        existing kernel infrastructure (namely ftrace) as much as possible.
        It's also self-contained, in a sense that it doesn't hook itself in
        any other kernel subsystem (it doesn't even touch any other code).
        It's now implemented for x86 only as a reference architecture, but
        support for powerpc, s390 and arm is already in the works (adding
        arch-specific support basically boils down to teaching ftrace about
        regs-saving).
      
        Once this common infrastructure gets merged, both Red Hat and SUSE
        have agreed to immediately start porting their current solutions on
        top of this, abandoning their out-of-tree code.  The plan basically is
        that each patch will be marked by flag(s) that would indicate which
        consistency model it is willing to use (again, the details have been
        sketched out already in the thread at [3]).
      
        Before this happens, the current codebase can be used to patch a large
        group of security/stability problems whose patches are not too
        complex (in a sense that they don't introduce non-trivial change of
        function's return value semantics, they don't change layout of data
        structures, etc) -- this corresponds to LEAVE_FUNCTION &&
        SWITCH_FUNCTION semantics described at [3].
      
        This tree has been in linux-next since December.
      
          [1] https://lkml.org/lkml/2014/4/30/477
          [2] https://lkml.org/lkml/2014/7/14/857
          [3] https://lkml.org/lkml/2014/11/7/354
          [4] http://linuxplumbersconf.org/2014/wp-content/uploads/2014/10/LPC2014_LivePatching.txt
      
        [ The core code is introduced by the three commits authored by Seth
          Jennings, which got a lot of changes incorporated during numerous
          respins and reviews of the initial implementation.  All the followup
          commits have materialized only after public tree has been created,
          so they were not folded into initial three commits so that the
          public tree doesn't get rebased ]"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
        livepatch: add missing newline to error message
        livepatch: rename config to CONFIG_LIVEPATCH
        livepatch: fix uninitialized return value
        livepatch: support for repatching a function
        livepatch: enforce patch stacking semantics
        livepatch: change ARCH_HAVE_LIVE_PATCHING to HAVE_LIVE_PATCHING
        livepatch: fix deferred module patching order
        livepatch: handle ancient compilers with more grace
        livepatch: kconfig: use bool instead of boolean
        livepatch: samples: fix usage example comments
        livepatch: MAINTAINERS: add git tree location
        livepatch: use FTRACE_OPS_FL_IPMODIFY
        livepatch: move x86 specific ftrace handler code to arch/x86
        livepatch: samples: add sample live patching module
        livepatch: kernel: add support for live patching
        livepatch: kernel: add TAINT_LIVEPATCH
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · 870fd0f5
      Linus Torvalds authored
      Pull HID updates from Jiri Kosina:
       "Updates for HID code
      
         - improvements of Logitech HID++ protocol implementation, from
           Benjamin Tissoires
      
         - support for composite RMI devices, from Andrew Duggan
      
         - new driver for BETOP controller, from Huang Bo
      
         - fixup for conflicting mapping in HID core between PC-101/103/104
           and PC-102/105 keyboards from David Herrmann
      
         - new hardware support and fixes in Wacom driver, from Ping Cheng
      
         - assorted small fixes and device ID additions all over the place"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (33 commits)
        HID: wacom: add support for Cintiq 27QHD and 27QHD touch
        HID: wacom: consolidate input capability settings for pen and touch
        HID: wacom: make sure touch arbitration is applied consistently
        HID: pidff: Fix initialisation forMicrosoft Sidewinder FF Pro 2
        HID: hyperv: match wait_for_completion_timeout return type
        HID: wacom: Report ABS_MISC event for Cintiq Companion Hybrid
        HID: Use Kbuild idiom in Makefiles
        HID: do not bind to Microchip Pick16F1454
        HID: hid-lg4ff: use DEVICE_ATTR_RW macro
        HID: hid-lg4ff: fix sysfs attribute permission
        HID: wacom: peport In Range event according to the spec
        HID: wacom: process invalid Cintiq and Intuos data in wacom_intuos_inout()
        HID: rmi: Add support for the touchpad in the Razer Blade 14 laptop
        HID: rmi: Support touchpads with external buttons
        HID: rmi: Use hid_report_len to compute the size of reports
        HID: logitech-hidpp: store the name of the device in struct hidpp
        HID: microsoft: add support for Japanese Surface Type Cover 3
        HID: fixup the conflicting keyboard mappings quirk
        HID: apple: fix battery support for the 2009 ANSI wireless keyboard
        HID: fix Kconfig text
        ...
    • sata_dwc_460ex: disable COMPILE_TEST again · 06cc01a0
      Linus Torvalds authored
      Commit 84683a7e ("sata_dwc_460ex: enable COMPILE_TEST for the
      driver") enabled this driver for non-ppc460-ex platforms, but it was
      then disabled for ARM and ARM64 by commit 2de5a9c0 ("sata_dwc_460ex:
      disable compilation on ARM and ARM64") because it's too noisy and
      broken.
      
      This disables it entirely, because it's too noisy on x86-64 too, and
      there's no point in disabling architectures one by one.  At a minimum,
      the code isn't 64-bit clean, and even on 32-bit it is questionable
      whether it makes sense.
      
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'akpm' (patches from Andrew) · 992de5a8
      Linus Torvalds authored
      Merge misc updates from Andrew Morton:
       "Bite-sized chunks this time, to avoid the MTA ratelimiting woes.
      
         - fs/notify updates
      
         - ocfs2
      
         - some of MM"
      
      That laconic "some MM" is mainly the removal of remap_file_pages(),
      which is a big simplification of the VM, and which gets rid of a *lot*
      of random cruft and special cases because we no longer support the
      non-linear mappings that it used.
      
      From a user interface perspective, nothing has changed, because the
      remap_file_pages() syscall still exists, it's just done by emulating the
      old behavior by creating a lot of individual small mappings instead of
      one non-linear one.
      
      The emulation is slower than the old "native" non-linear mappings, but
      nobody really uses or cares about remap_file_pages(), and simplifying
      the VM is a big advantage.
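
      The idea of replacing one non-linear mapping with several small linear
      ones can be illustrated from user space. The sketch below is an analogy
      only, not the kernel's emulation code: map_two_pages_swapped() and
      demo_swapped() are hypothetical helpers, and it assumes a Linux system
      with a writable /tmp. It uses plain mmap() with MAP_FIXED to make file
      page 1 appear before file page 0.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map the first two pages of fd in swapped order: file page 1 appears at
 * the start of the returned window, file page 0 right after it.
 * Returns MAP_FAILED on error.
 */
static void *map_two_pages_swapped(int fd, size_t page)
{
	char *win;

	assert(page > 0);

	/* Reserve a contiguous window of address space first. */
	win = mmap(NULL, 2 * page, PROT_NONE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (win == MAP_FAILED)
		return MAP_FAILED;

	/* One small linear mapping per page instead of one non-linear one. */
	if (mmap(win, page, PROT_READ, MAP_SHARED | MAP_FIXED,
		 fd, (off_t)page) == MAP_FAILED ||
	    mmap(win + page, page, PROT_READ, MAP_SHARED | MAP_FIXED,
		 fd, 0) == MAP_FAILED) {
		munmap(win, 2 * page);
		return MAP_FAILED;
	}
	return win;
}

/*
 * Build a two-page temporary file (a page of 'A', then a page of 'B'),
 * map it swapped and store the first byte of each window page in out[].
 * Returns 0 on success, -1 on any failure.
 */
static int demo_swapped(char out[2])
{
	char path[] = "/tmp/remap-demo-XXXXXX";
	long psz = sysconf(_SC_PAGESIZE);
	size_t page;
	char *buf, *win;
	ssize_t written;
	int fd;

	if (psz <= 0)
		return -1;
	page = (size_t)psz;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the file lives on until fd is closed */

	buf = malloc(2 * page);
	if (!buf) {
		close(fd);
		return -1;
	}
	memset(buf, 'A', page);
	memset(buf + page, 'B', page);
	written = write(fd, buf, 2 * page);
	free(buf);
	if (written != (ssize_t)(2 * page)) {
		close(fd);
		return -1;
	}

	win = map_two_pages_swapped(fd, page);
	close(fd);	/* existing mappings stay valid after close */
	if (win == MAP_FAILED)
		return -1;

	out[0] = win[0];
	out[1] = win[page];
	munmap(win, 2 * page);
	return 0;
}
```

      Each page costs its own mapping here, which hints at why an emulation
      built this way is slower than a single native non-linear mapping.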
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (78 commits)
        memcg: zap memcg_slab_caches and memcg_slab_mutex
        memcg: zap memcg_name argument of memcg_create_kmem_cache
        memcg: zap __memcg_{charge,uncharge}_slab
        mm/page_alloc.c: place zone_id check before VM_BUG_ON_PAGE check
        mm: hugetlb: fix type of hugetlb_treat_as_movable variable
        mm, hugetlb: remove unnecessary lower bound on sysctl handlers"?
        mm: memory: merge shared-writable dirtying branches in do_wp_page()
        mm: memory: remove ->vm_file check on shared writable vmas
        xtensa: drop _PAGE_FILE and pte_file()-related helpers
        x86: drop _PAGE_FILE and pte_file()-related helpers
        unicore32: drop pte_file()-related helpers
        um: drop _PAGE_FILE and pte_file()-related helpers
        tile: drop pte_file()-related helpers
        sparc: drop pte_file()-related helpers
        sh: drop _PAGE_FILE and pte_file()-related helpers
        score: drop _PAGE_FILE and pte_file()-related helpers
        s390: drop pte_file()-related helpers
        parisc: drop _PAGE_FILE and pte_file()-related helpers
        openrisc: drop _PAGE_FILE and pte_file()-related helpers
        nios2: drop _PAGE_FILE and pte_file()-related helpers
        ...
    • Merge tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw · b2718bff
      Linus Torvalds authored
      Pull gfs2 updates from Steven Whitehouse:
       "This time we have mostly clean ups.  There is a bug fix for a NULL
        dereference relating to ACLs, and another which improves (but does not
        fix entirely) an allocation fall-back code path.  The other three
        patches are small clean ups"
      
      * tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw:
        GFS2: Fix crash during ACL deletion in acl max entry check in gfs2_set_acl()
        GFS2: use __vmalloc GFP_NOFS for fs-related allocations.
        GFS2: Eliminate a nonsense goto
        GFS2: fix sprintf format specifier
        GFS2: Eliminate __gfs2_glock_remove_from_lru
    • Merge tag 'xfs-for-linus-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs · ae90fb14
      Linus Torvalds authored
      Pull xfs update from Dave Chinner:
       "This update contains:
      
         - RENAME_EXCHANGE support
      
         - Rework of the superblock logging infrastructure
      
         - Rework of the XFS_IOCTL_SETXATTR implementation
             * enables use inside user namespaces
             * fixes inconsistencies setting extent size hints
      
         - fixes for missing buffer type annotations used in log recovery
      
         - more consolidation of libxfs headers
      
         - preparation patches for block based PNFS support
      
         - miscellaneous bug fixes and cleanups"
      
      * tag 'xfs-for-linus-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (37 commits)
        xfs: only trace buffer items if they exist
        xfs: report proper f_files in statfs if we overshoot imaxpct
        xfs: fix panic_mask documentation
        xfs: xfs_ioctl_setattr_check_projid can be static
        xfs: growfs should use synchronous transactions
        xfs: fix behaviour of XFS_IOC_FSSETXATTR on directories
        xfs: factor projid hint checking out of xfs_ioctl_setattr
        xfs: factor extsize hint checking out of xfs_ioctl_setattr
        xfs: XFS_IOCTL_SETXATTR can run in user namespaces
        xfs: kill xfs_ioctl_setattr behaviour mask
        xfs: disaggregate xfs_ioctl_setattr
        xfs: factor out xfs_ioctl_setattr transaciton preamble
        xfs: separate xflags from xfs_ioctl_setattr
        xfs: FSX_NONBLOCK is not used
        xfs: don't allocate an ioend for direct I/O completions
        xfs: change kmem_free to use generic kvfree()
        xfs: factor out a xfs_update_prealloc_flags() helper
        xfs: remove incorrect error negation in attr_multi ioctl
        xfs: set superblock buffer type correctly
        xfs: set buf types when converting extent formats
        ...
  6. 10 Feb, 2015 30 commits