    Fix fixpoint divide exception in acct_update_integrals · 6d5b5acc
    Heiko Carstens authored
    Frans Pop reported the crash below when running an s390 kernel under Hercules:
    
      Kernel BUG at 000738b4  verbose debug info unavailable!
      fixpoint divide exception: 0009 [#1] SMP
      Modules linked in: nfs lockd nfs_acl sunrpc ctcm fsm tape_34xx
         cu3088 tape ccwgroup tape_class ext3 jbd mbcache dm_mirror dm_log dm_snapshot
         dm_mod dasd_eckd_mod dasd_mod
      CPU: 0 Not tainted 2.6.27.19 #13
      Process awk (pid: 2069, task: 0f9ed9b8, ksp: 0f4f7d18)
      Krnl PSW : 070c1000 800738b4 (acct_update_integrals+0x4c/0x118)
                 R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0
      Krnl GPRS: 00000000 000007d0 7fffffff fffff830
                 00000000 ffffffff 00000002 0f9ed9b8
                 00000000 00008ca0 00000000 0f9ed9b8
                 0f9edda4 8007386e 0f4f7ec8 0f4f7e98
      Krnl Code: 800738aa: a71807d0         lhi     %r1,2000
                 800738ae: 8c200001         srdl    %r2,1
                 800738b2: 1d21             dr      %r2,%r1
                >800738b4: 5810d10e         l       %r1,270(%r13)
                 800738b8: 1823             lr      %r2,%r3
                 800738ba: 4130f060         la      %r3,96(%r15)
                 800738be: 0de1             basr    %r14,%r1
                 800738c0: 5800f060         l       %r0,96(%r15)
      Call Trace:
      ([<000000000004fdea>] blocking_notifier_call_chain+0x1e/0x2c)
       [<0000000000038502>] do_exit+0x106/0x7c0
       [<0000000000038c36>] do_group_exit+0x7a/0xb4
       [<0000000000038c8e>] SyS_exit_group+0x1e/0x30
       [<0000000000021c28>] sysc_do_restart+0x12/0x16
       [<0000000077e7e924>] 0x77e7e924
    
    The reason is that CPU time accounting usually happens only from
    interrupt context, but acct_update_integrals is also called from
    process context with interrupts enabled.
    
    So in acct_update_integrals we may end up with the following scenario:
    
    Between reading tsk->stime/tsk->utime and tsk->acct_timexpd an interrupt
    happens which updates the accounting values.  This causes acct_timexpd to
    be greater than the previously read stime + utime.  The subsequent
    calculation of
    
    	dtime = cputime_sub(time, tsk->acct_timexpd);
    
    will be negative and the division performed by
    
    	cputime_to_jiffies(dtime)
    
    will generate a fixpoint divide exception, since the quotient won't fit
    into a 32-bit register.
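The overflow can be sketched in user space. The following is only a rough model, assuming the instruction sequence shown in the trace (srdl %r2,1 followed by dr %r2,%r1 with %r1 = 2000); the function names dr_quotient_fits and cputime_to_jiffies_model are hypothetical and do not exist in the kernel. The register pair in the dump (7fffffff fffff830) is consistent with dtime = -4000 after the logical shift.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the s390 "dr" instruction: a signed 64-bit dividend
 * is divided by a signed 32-bit divisor, and the quotient has to fit into a
 * signed 32-bit register.  Returns false where the hardware would raise a
 * fixpoint divide exception.
 */
static bool dr_quotient_fits(int64_t dividend, int32_t divisor, int32_t *quotient)
{
    assert(divisor > 0);    /* this sketch only handles positive divisors */

    int64_t q = dividend / divisor;
    if (q < INT32_MIN || q > INT32_MAX)
        return false;

    *quotient = (int32_t)q;
    return true;
}

/*
 * Hypothetical model of the division seen in the trace:
 *   srdl %r2,1   logical right shift of the 64-bit value by one bit
 *   dr   %r2,%r1 divide by 2000
 */
static bool cputime_to_jiffies_model(int64_t dtime, int32_t *jiffies)
{
    /* The shift is logical, so a negative dtime becomes a huge positive value. */
    uint64_t shifted = (uint64_t)dtime >> 1;

    return dr_quotient_fits((int64_t)shifted, 2000, jiffies);
}
```

With dtime = 4000 the model yields a quotient of 1; with dtime = -4000 the shifted dividend is 0x7ffffffffffff830 and the quotient no longer fits into 32 bits.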
    
    To fix this, always disable interrupts while accessing any of the
    accounting values.
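The race and the effect of masking interrupts can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the actual patch: struct task_model, timer_tick, maybe_interrupt and the irqs_enabled flag are all hypothetical stand-ins, and the flag only imitates what local_irq_save()/local_irq_restore() would do in kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the accounting fields of a task. */
struct task_model {
    int64_t utime;          /* accumulated user time */
    int64_t stime;          /* accumulated system time */
    int64_t acct_timexpd;   /* time already accounted for */
};

/* Hypothetical flag modelling whether interrupts are enabled. */
static bool irqs_enabled = true;

/* Models the interrupt path: charge a tick and update acct_timexpd. */
static void timer_tick(struct task_model *t, int64_t tick)
{
    t->utime += tick;
    t->acct_timexpd = t->utime + t->stime;
}

/* A point at which an interrupt may arrive if interrupts are enabled. */
static void maybe_interrupt(struct task_model *t)
{
    if (irqs_enabled)
        timer_tick(t, 4000);
}

/* Racy variant: the interrupt lands between the two reads. */
static int64_t delta_unprotected(struct task_model *t)
{
    int64_t time = t->stime + t->utime;
    maybe_interrupt(t);
    int64_t dtime = time - t->acct_timexpd;    /* may be negative */
    t->acct_timexpd = time;
    return dtime;
}

/* Fixed variant: interrupts stay masked for the whole access. */
static int64_t delta_protected(struct task_model *t)
{
    bool saved = irqs_enabled;  /* models local_irq_save() */
    irqs_enabled = false;
    assert(!irqs_enabled);

    int64_t time = t->stime + t->utime;
    maybe_interrupt(t);         /* masked, so nothing happens */
    int64_t dtime = time - t->acct_timexpd;
    t->acct_timexpd = time;

    irqs_enabled = saved;       /* models local_irq_restore() */
    return dtime;
}
```

Starting from utime = 1000, stime = 0 and acct_timexpd = 0, the unprotected variant returns -4000, while the protected variant returns 1000.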
    
    Reported-by: Frans Pop <elendil@planet.nl>
    Tested-by: Frans Pop <elendil@planet.nl>
    Cc: stable@kernel.org
    Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
    Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
tsacct.c 4.25 KB