1. 31 Mar, 2018 27 commits
  2. 28 Mar, 2018 13 commits
    • Linux 4.9.91 · c44cfe06
      Greg Kroah-Hartman authored
    • bpf, x64: increase number of passes · c9e30719
      Daniel Borkmann authored
      commit 6007b080 upstream.
      
      In Cilium some of the main programs we run today are hitting 9 passes
      on x64's JIT compiler, and we've had cases already where we surpassed
      the limit where the JIT then punts the program to the interpreter
      instead, leading to insertion failures due to CONFIG_BPF_JIT_ALWAYS_ON
      or insertion failures due to the prog array owner being JITed but the
      program to insert not (both must have the same JITed/non-JITed property).
      
      In one concrete case the program image shrank from 12,767 bytes down to
      10,288 bytes, and the image converged after 16 passes. I've measured
      that the JIT took 340us to converge on my i7-6600U. Thus, increase the
      original limit, which dates from day one when the JIT covered only
      cBPF, before we run into the case (as with the complexity limit) where
      we trip over it and hit program rejections. Also add a cond_resched()
      to the compilation loop; the JIT process runs without any locks and
      may sleep anyway.
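
      The convergence behaviour described above can be illustrated with a toy
      model. This is only a sketch, not the kernel's x64 JIT: the struct, the
      function name and the 2-byte/5-byte jump encodings are assumptions made
      for illustration. Every jump starts in its long form; each pass lays out
      the image with the current sizes and shortens any jump whose displacement
      now fits in a signed byte, until a pass changes nothing or the pass limit
      is hit.

```c
#include <stddef.h>

#define TOY_MAX_INSNS 64

/* Hypothetical toy instruction: target < 0 means "not a jump". */
struct toy_insn {
	int target;	/* index of the jump target in [0, n], or -1 */
	int size;	/* current encoded size in bytes */
};

/*
 * Re-encode until no instruction changes size, or until max_passes is
 * reached. Returns the number of passes used, or -1 if the input is
 * invalid or the image did not converge within the limit (a JIT would
 * then give up on the program).
 */
int toy_jit(struct toy_insn *insns, size_t n, int max_passes, int *image_len)
{
	int addr[TOY_MAX_INSNS + 1];

	if (n > TOY_MAX_INSNS)
		return -1;

	for (int pass = 1; pass <= max_passes; pass++) {
		int changed = 0;

		/* Lay out the image using the sizes from the previous pass. */
		addr[0] = 0;
		for (size_t i = 0; i < n; i++)
			addr[i + 1] = addr[i] + insns[i].size;

		/* Pick the shortest jump encoding that still reaches. */
		for (size_t i = 0; i < n; i++) {
			if (insns[i].target < 0)
				continue;
			if ((size_t)insns[i].target > n)
				return -1;

			int disp = addr[insns[i].target] - addr[i + 1];
			int size = (disp >= -128 && disp <= 127) ? 2 : 5;

			if (size != insns[i].size) {
				insns[i].size = size;
				changed = 1;
			}
		}

		if (!changed) {
			*image_len = addr[n];
			return pass;
		}
	}
	return -1;
}
```

      Shortening one jump can bring another jump's target into short range,
      which is why the number of passes grows with program size and why a
      fixed pass limit can reject programs that would otherwise converge.
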
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bpf: skip unnecessary capability check · 3eb88807
      Chenbo Feng authored
      commit 0fa4fe85 upstream.
      
      The current check in the BPF syscall performs a capability check for
      CAP_SYS_ADMIN before checking sysctl_unprivileged_bpf_disabled. This
      code path triggers unnecessary security hooks on capability checking
      and causes false alarms about unprivileged processes trying to get
      CAP_SYS_ADMIN access. This can be resolved by simply switching the
      order of the two checks; CAP_SYS_ADMIN is not required anyway if the
      unprivileged bpf syscall is allowed.
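
      The effect of the reordering can be sketched in userspace. The variables
      and helpers below are hypothetical stand-ins for the kernel symbols named
      above, and the call counter stands in for the security hooks; the point
      is only that && short-circuits, so the capability check is skipped when
      the sysctl is not set.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the kernel state. */
static int sysctl_unprivileged_bpf_disabled;
static bool caller_is_admin;
static int capable_calls;	/* how often the "security hook" ran */

static bool capable_sys_admin(void)
{
	capable_calls++;	/* in the kernel, security hooks fire here */
	return caller_is_admin;
}

/* Old order: the capability check always runs. */
int bpf_check_old(void)
{
	if (!capable_sys_admin() && sysctl_unprivileged_bpf_disabled)
		return -EPERM;
	return 0;
}

/* New order: the capability check runs only when the sysctl is set. */
int bpf_check_new(void)
{
	if (sysctl_unprivileged_bpf_disabled && !capable_sys_admin())
		return -EPERM;
	return 0;
}
```

      Both orders return the same result; only the number of capability
      checks differs.
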
      Signed-off-by: Chenbo Feng <fengc@google.com>
      Acked-by: Lorenzo Colitti <lorenzo@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kbuild: disable clang's default use of -fmerge-all-constants · 733a4e1a
      Daniel Borkmann authored
      commit 87e0d4f0 upstream.
      
      Prasad reported that he has seen crashes in BPF subsystem with netd
      on Android with arm64 in the form of (note, the taint is unrelated):
      
        [ 4134.721483] Unable to handle kernel paging request at virtual address 800000001
        [ 4134.820925] Mem abort info:
        [ 4134.901283]   Exception class = DABT (current EL), IL = 32 bits
        [ 4135.016736]   SET = 0, FnV = 0
        [ 4135.119820]   EA = 0, S1PTW = 0
        [ 4135.201431] Data abort info:
        [ 4135.301388]   ISV = 0, ISS = 0x00000021
        [ 4135.359599]   CM = 0, WnR = 0
        [ 4135.470873] user pgtable: 4k pages, 39-bit VAs, pgd = ffffffe39b946000
        [ 4135.499757] [0000000800000001] *pgd=0000000000000000, *pud=0000000000000000
        [ 4135.660725] Internal error: Oops: 96000021 [#1] PREEMPT SMP
        [ 4135.674610] Modules linked in:
        [ 4135.682883] CPU: 5 PID: 1260 Comm: netd Tainted: G S      W       4.14.19+ #1
        [ 4135.716188] task: ffffffe39f4aa380 task.stack: ffffff801d4e0000
        [ 4135.731599] PC is at bpf_prog_add+0x20/0x68
        [ 4135.741746] LR is at bpf_prog_inc+0x20/0x2c
        [ 4135.751788] pc : [<ffffff94ab7ad584>] lr : [<ffffff94ab7ad638>] pstate: 60400145
        [ 4135.769062] sp : ffffff801d4e3ce0
        [...]
        [ 4136.258315] Process netd (pid: 1260, stack limit = 0xffffff801d4e0000)
        [ 4136.273746] Call trace:
        [...]
        [ 4136.442494] 3ca0: ffffff94ab7ad584 0000000060400145 ffffffe3a01bf8f8 0000000000000006
        [ 4136.460936] 3cc0: 0000008000000000 ffffff94ab844204 ffffff801d4e3cf0 ffffff94ab7ad584
        [ 4136.479241] [<ffffff94ab7ad584>] bpf_prog_add+0x20/0x68
        [ 4136.491767] [<ffffff94ab7ad638>] bpf_prog_inc+0x20/0x2c
        [ 4136.504536] [<ffffff94ab7b5d08>] bpf_obj_get_user+0x204/0x22c
        [ 4136.518746] [<ffffff94ab7ade68>] SyS_bpf+0x5a8/0x1a88
      
      Android's netd was basically pinning the uid cookie BPF map in BPF
      fs (/sys/fs/bpf/traffic_cookie_uid_map) and later on retrieving it
      again resulting in above panic. Issue is that the map was wrongly
      identified as a prog! Above kernel was compiled with clang 4.0,
      and it turns out that clang decided to merge the bpf_prog_iops and
      bpf_map_iops into a single memory location, such that the two i_ops
      could then not be distinguished anymore.
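
      The failure mode can be sketched with a reduced example. The types and
      names below are hypothetical stand-ins, not the kernel's definitions:
      two constant objects with identical contents are told apart only by
      their addresses. A conforming compiler gives them distinct addresses; if
      the two objects are merged into one location, the second comparison can
      never be reached and a map is reported as a prog.

```c
#include <stddef.h>

/* Hypothetical stand-in for an inode operations table. */
struct toy_iops {
	void *lookup;
};

/* Two distinct objects with identical contents, as in the report above. */
static const struct toy_iops toy_prog_iops = { NULL };
static const struct toy_iops toy_map_iops = { NULL };

enum toy_type { TOY_TYPE_UNSPEC, TOY_TYPE_PROG, TOY_TYPE_MAP };

/* Classify an object purely by which ops table it points at. */
enum toy_type toy_inode_type(const struct toy_iops *ops)
{
	if (ops == &toy_prog_iops)
		return TOY_TYPE_PROG;
	if (ops == &toy_map_iops)
		return TOY_TYPE_MAP;
	return TOY_TYPE_UNSPEC;
}
```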
      
      Reason for this miscompilation is that clang has the more aggressive
      -fmerge-all-constants enabled by default. In fact, clang source code
      has a comment about it in lib/AST/ExprConstant.cpp on why it is okay
      to do so:
      
        Pointers with different bases cannot represent the same object.
        (Note that clang defaults to -fmerge-all-constants, which can
        lead to inconsistent results for comparisons involving the address
        of a constant; this generally doesn't matter in practice.)
      
      The issue never appeared with gcc however, since gcc does not enable
      -fmerge-all-constants by default and even *explicitly* states in
      its option description that using this flag results in non-conforming
      behavior, quote from man gcc:
      
        Languages like C or C++ require each variable, including multiple
        instances of the same variable in recursive calls, to have distinct
        locations, so using this option results in non-conforming behavior.
      
      There are also various clang bug reports open on that matter [1],
      where clang developers acknowledge the non-conforming behavior,
      and refer to disabling it with -fno-merge-all-constants. But even
      if this gets fixed in clang today, there are already users out there
      that triggered this. Thus, fix this issue by explicitly adding
      -fno-merge-all-constants to the kernel's Makefile to generically
      disable this optimization, since potentially other places in the
      kernel could subtly break as well.
      
      Note, there is also a flag called -fmerge-constants (not supported
      by clang), which is more conservative and only applies to strings
      and it's enabled in gcc's -O/-O2/-O3/-Os optimization levels. In
      gcc's code, the two flags -fmerge-{all-,}constants share the same
      variable internally, so when disabling it via -fno-merge-all-constants,
      then we really don't merge any const data (e.g. strings), and text
      size increases with gcc (14,927,214 -> 14,942,646 for vmlinux.o).
      
        $ gcc -fverbose-asm -O2 foo.c -S -o foo.S
          -> foo.S lists -fmerge-constants under options enabled
        $ gcc -fverbose-asm -O2 -fno-merge-all-constants foo.c -S -o foo.S
          -> foo.S doesn't list -fmerge-constants under options enabled
        $ gcc -fverbose-asm -O2 -fno-merge-all-constants -fmerge-constants foo.c -S -o foo.S
          -> foo.S lists -fmerge-constants under options enabled
      
      Thus, as a workaround we need to set both -fno-merge-all-constants
      *and* -fmerge-constants in the Makefile in order for text size to
      stay as is.
      
        [1] https://bugs.llvm.org/show_bug.cgi?id=18538

      Reported-by: Prasad Sodagudi <psodagud@codeaurora.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Chenbo Feng <fengc@google.com>
      Cc: Richard Smith <richard-llvm@metafoo.co.uk>
      Cc: Chandler Carruth <chandlerc@gmail.com>
      Cc: linux-kernel@vger.kernel.org
      Tested-by: Prasad Sodagudi <psodagud@codeaurora.org>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • selftests: x86: sysret_ss_attrs doesn't build on a PIE build · 353f71fe
      Shuah Khan authored
      commit 3346a6a4 upstream.
      
      sysret_ss_attrs fails to compile, causing the x86 test run to fail on
      systems configured to build using PIE by default. Add -no-pie to fix it.
      
      Relocation might still fail if relocated above 4G. For now this change
      fixes the build and runs x86 tests.
      
      tools/testing/selftests/x86$ make
      gcc -m64 -o .../tools/testing/selftests/x86/single_step_syscall_64 -O2
      -g -std=gnu99 -pthread -Wall  single_step_syscall.c -lrt -ldl
      gcc -m64 -o .../tools/testing/selftests/x86/sysret_ss_attrs_64 -O2 -g
      -std=gnu99 -pthread -Wall  sysret_ss_attrs.c thunks.S -lrt -ldl
      /usr/bin/ld: /tmp/ccS6pvIh.o: relocation R_X86_64_32S against `.text'
      can not be used when making a shared object; recompile with -fPIC
      /usr/bin/ld: final link failed: Nonrepresentable section on output
      collect2: error: ld returned 1 exit status
      Makefile:49: recipe for target
      '.../tools/testing/selftests/x86/sysret_ss_attrs_64' failed
      make: *** [.../tools/testing/selftests/x86/sysret_ss_attrs_64] Error 1
      Suggested-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/pkeys/selftests: Rename 'si_pkey' to 'siginfo_pkey' · 1443abc9
      Dave Hansen authored
      commit 91c49c2d upstream.
      
      'si_pkey' is now #defined to be the name of the new siginfo field that
      protection keys use.  Rename the selftest's 'si_pkey' to 'siginfo_pkey'
      so that it does not conflict.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20171111001231.DFFC8285@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • signal/testing: Don't look for __SI_FAULT in userspace · f41f8156
      Eric W. Biederman authored
      commit d12fe87e upstream.
      
      Fix the debug print statements in these tests where they reference
      si_codes and in particular __SI_FAULT.  __SI_FAULT is a kernel
      internal value and should never be seen by userspace.
      
      While I am in there also fix si_code_str.  si_codes are an enumeration,
      not a bitmap, so == and not & is the appropriate operation to
      test for an si_code.
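
      A small sketch of why & is the wrong test. The values are the SIGSEGV
      si_code numbers used on Linux; the TOY_ names and both helper functions
      are assumptions of this sketch, not the selftest's code. Because the
      codes are consecutive integers, the bitwise version reports code 3 as
      SEGV_MAPERR (3 & 1 is non-zero), while the equality version matches each
      code exactly.

```c
/* SIGSEGV si_code values as used on Linux; TOY_ names are local here. */
enum {
	TOY_SEGV_MAPERR = 1,
	TOY_SEGV_ACCERR = 2,
	TOY_SEGV_BNDERR = 3,
	TOY_SEGV_PKUERR = 4,
};

/* Wrong: treats the enumeration as a bitmap. */
const char *si_code_str_bitwise(int si_code)
{
	if (si_code & TOY_SEGV_MAPERR)
		return "SEGV_MAPERR";
	if (si_code & TOY_SEGV_ACCERR)
		return "SEGV_ACCERR";
	if (si_code & TOY_SEGV_BNDERR)
		return "SEGV_BNDERR";
	if (si_code & TOY_SEGV_PKUERR)
		return "SEGV_PKUERR";
	return "UNKNOWN";
}

/* Right: each si_code is a distinct value and is matched exactly. */
const char *si_code_str_equal(int si_code)
{
	switch (si_code) {
	case TOY_SEGV_MAPERR:
		return "SEGV_MAPERR";
	case TOY_SEGV_ACCERR:
		return "SEGV_ACCERR";
	case TOY_SEGV_BNDERR:
		return "SEGV_BNDERR";
	case TOY_SEGV_PKUERR:
		return "SEGV_PKUERR";
	default:
		return "UNKNOWN";
	}
}
```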
      
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Fixes: 5f23f6d0 ("x86/pkeys: Add self-tests")
      Fixes: e754aedc ("x86/mpx, selftests: Add MPX self test")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • selftests/x86/protection_keys: Fix syscall NR redefinition warnings · 93b48392
      Andy Lutomirski authored
      commit 693cb558 upstream.
      
      On new enough glibc, the pkey syscall numbers are available.  Check
      first before defining them to avoid warnings like:
      
      protection_keys.c:198:0: warning: "SYS_pkey_alloc" redefined
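
      The guard pattern described above might look like the following sketch.
      The fallback value is the x86-64 syscall number and is an assumption of
      this sketch (other architectures use different numbers); the helper
      function is hypothetical and exists only so the macro can be inspected.

```c
#include <sys/syscall.h>

/*
 * Only supply a fallback when the libc headers did not already define
 * the number; this is what avoids the "redefined" warning.
 * Assumption: 330 is the x86-64 number and is wrong elsewhere.
 */
#ifndef SYS_pkey_alloc
# define SYS_pkey_alloc 330
#endif

/* Hypothetical helper: report whichever definition ended up in effect. */
long toy_pkey_alloc_nr(void)
{
	return SYS_pkey_alloc;
}
```
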
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1fbef53a9e6befb7165ff855fc1a7d4788a191d6.1509794321.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • selftests, x86, protection_keys: fix wrong offset in siginfo · 26e9852f
      Dave Hansen authored
      commit 2195bff0 upstream.
      
      The siginfo contains a bunch of information about the fault.
      For protection keys, it tells us which protection key's
      permissions were violated.
      
      The wrong offset in here leads to reading garbage and thus
      failures in the tests.
      
      We should probably eventually move this over to using the
      kernel's headers defining the siginfo instead of a hard-coded
      offset.  But, for now, just do the simplest fix.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • staging: lustre: ptlrpc: kfree used instead of kvfree · 1e0fc7db
      Nadav Amit authored
      commit c3eec596 upstream.
      
      rq_reqbuf is allocated using kvmalloc() but released in one place
      using kfree() instead of kvfree().
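
      Why the free routine has to match can be sketched with a userspace
      analogy. Everything below is hypothetical: the toy allocator chooses its
      backend by size (the kernel's kvmalloc() instead falls back to vmalloc
      when kmalloc fails), and a static arena stands in for vmalloc space.
      Memory from the arena must never reach free(), so the matching free
      routine has to check where the pointer came from.

```c
#include <stdint.h>
#include <stdlib.h>

#define TOY_VM_ARENA_SIZE 4096
#define TOY_KMALLOC_MAX 256	/* hypothetical size threshold */

static unsigned char toy_vm_arena[TOY_VM_ARENA_SIZE];
static size_t toy_vm_used;

/* True if p points into the arena that stands in for vmalloc space. */
static int toy_is_vmalloc_addr(const void *p)
{
	uintptr_t a = (uintptr_t)p;
	uintptr_t base = (uintptr_t)toy_vm_arena;

	return a >= base && a < base + TOY_VM_ARENA_SIZE;
}

/* Small requests use malloc(); large ones use a simple bump allocator. */
void *toy_kvmalloc(size_t size)
{
	if (size <= TOY_KMALLOC_MAX)
		return malloc(size);
	if (size > TOY_VM_ARENA_SIZE - toy_vm_used)
		return NULL;

	void *p = toy_vm_arena + toy_vm_used;
	toy_vm_used += size;
	return p;
}

/* Matching free: dispatch on where the pointer came from. */
void toy_kvfree(void *p)
{
	if (toy_is_vmalloc_addr(p))
		return;	/* the arena is never reclaimed in this sketch */
	free(p);
}
```

      Passing an arena pointer straight to free() here would be undefined
      behaviour, which mirrors calling kfree() on memory that needs kvfree().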
      
      The issue was found using grep based on a similar bug.
      
      Fixes: d7e09d03 ("add Lustre file system client support")
      Fixes: ee0ec194 ("lustre: ptlrpc: Replace uses of OBD_{ALLOC,FREE}_LARGE")
      
      Cc: Peng Tao <bergwolf@gmail.com>
      Cc: Oleg Drokin <oleg.drokin@intel.com>
      Cc: James Simmons <jsimmons@infradead.org>
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • iio: ABI: Fix name of timestamp sysfs file · 162daa27
      Linus Walleij authored
      commit b9a35893 upstream.
      
      The name of the file is "current_timestamp_clock" not
      "timestamp_clock".
      
      Fixes: bc2b7dab ("iio:core: timestamping clock selection support")
      Cc: Gregor Boirie <gregor.boirie@parrot.com>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • perf/x86/intel/uncore: Fix multi-domain PCI CHA enumeration bug on Skylake servers · 9c0d0a0c
      Kan Liang authored
      commit 320b0651 upstream.
      
      The number of CHAs is miscalculated on multi-domain PCI Skylake server systems,
      resulting in an uncore driver initialization error.
      
      Gary Kroening explains:
      
       "For systems with a single PCI segment, it is sufficient to look for the
        bus number to change in order to determine that all of the CHa's have
        been counted for a single socket.
      
        However, for multi PCI segment systems, each socket is given a new
        segment and the bus number does NOT change.  So looking only for the
        bus number to change ends up counting all of the CHa's on all sockets
        in the system.  This leads to writing CPU MSRs beyond a valid range and
        causes an error in ivbep_uncore_msr_init_box()."
      
      To fix this bug, query the number of CHAs from the CAPID6 register:
      read bits 27:0 of the CAPID6 register located at
      Device 30, Function 3, Offset 0x9C. These 28 bits form a bit vector
      of available LLC slices and the CHAs that manage those slices.
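
      Turning that bit vector into a CHA count amounts to masking bits 27:0
      and counting the set bits. The sketch below assumes the raw 32-bit
      CAPID6 value has already been read; the macro and function names are
      hypothetical, and the PCI config-space read itself is left out.

```c
#include <stdint.h>

/* Bits 27:0 of CAPID6, per the description above. */
#define TOY_CHA_BIT_MASK 0x0FFFFFFFu

/* Count available CHAs from a raw CAPID6 value: one set bit per CHA. */
int toy_count_chabox(uint32_t capid6)
{
	uint32_t v = capid6 & TOY_CHA_BIT_MASK;
	int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}
```
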
      Reported-by: Kroening, Gary <gary.kroening@hpe.com>
      Tested-by: Kroening, Gary <gary.kroening@hpe.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: abanman@hpe.com
      Cc: dimitri.sivanich@hpe.com
      Cc: hpa@zytor.com
      Cc: mike.travis@hpe.com
      Cc: russ.anderson@hpe.com
      Fixes: cd34cd97 ("perf/x86/intel/uncore: Add Skylake server uncore support")
      Link: http://lkml.kernel.org/r/1520967094-13219-1-git-send-email-kan.liang@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • perf/x86/intel: Don't accidentally clear high bits in bdw_limit_period() · e91ec349
      Dan Carpenter authored
      commit e5ea9b54 upstream.
      
      We intended to clear the lowest 6 bits but because of a type bug we
      clear the high 32 bits as well.  Andi says that periods are rarely more
      than U32_MAX so this bug probably doesn't have a huge runtime impact.
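
      The kind of type bug described above can be reproduced in a few lines.
      This is a sketch with hypothetical function names, assuming a 32-bit
      unsigned int: the mask ~0x3fu is computed in 32 bits (0xffffffc0) and
      then zero-extended, so ANDing it into a 64-bit value also clears bits
      63:32. Using a 64-bit constant keeps the upper half of the mask set.

```c
#include <stdint.h>

/* Buggy: the 32-bit mask is zero-extended and wipes the high 32 bits. */
uint64_t toy_clear_low6_buggy(uint64_t left)
{
	left &= ~0x3fu;
	return left;
}

/* Fixed: a 64-bit mask clears only the lowest 6 bits. */
uint64_t toy_clear_low6_fixed(uint64_t left)
{
	left &= ~0x3fULL;
	return left;
}
```

      The two versions agree for any value that fits in 32 bits, which matches
      the note that the runtime impact is probably small.
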
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: 294fe0f5 ("perf/x86/intel: Add INST_RETIRED.ALL workarounds")
      Link: http://lkml.kernel.org/r/20180317115216.GB4035@mwanda
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>