- 19 Jul, 2018 5 commits
-
-
Bharat Bhushan authored
Update the comment to account for the spurious interrupt number. The code was already accounting for it, but that was unclear because it was achieved by mpic_setup_error_int() knowing that the number it was passed was the last used vector, rather than the first free vector. So change the meaning of the argument to the first free vector and update the caller to pass 13, instead of 12, to achieve the same result. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@nxp.com> [mpe: Rewrite change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
David Gibson authored
The HUGEPD_*_SHIFT macros are always defined to be PGDIR_SHIFT and PUD_SHIFT, and have to have those values to work properly. They once used to have different values, but that was really only because they were used to mean different things in different contexts. 6fa50483 "powerpc/mm/hugetlb: initialize the pagetable cache correctly for hugetlb" removed that double meaning, but left the now useless constants. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Randy Dunlap authored
Add MODULE_LICENSE() to the chrp nvram.c driver to fix the build warning message: WARNING: modpost: missing MODULE_LICENSE() in arch/powerpc/platforms/chrp/nvram.o Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Christophe Leroy authored
NULL pointers are pointers to user memory space. So user pagetable has to be set in order to avoid random behaviour in case of NULL pointer dereference, otherwise we may encounter random memory access hence Machine Check Exception from TLB Miss handlers. Set user pagetable as early as possible in order to properly catch early kernel NULL pointer dereference. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
Merge in some commits we're sharing with the KVM tree. I manually propagated the change from commit d3d4ffaa ("powerpc/powernv/ioda2: Reduce upper limit for DMA window size") into pci-ioda-tce.c. Conflicts: arch/powerpc/include/asm/cputable.h arch/powerpc/platforms/powernv/pci-ioda.c arch/powerpc/platforms/powernv/pci.h
-
- 16 Jul, 2018 7 commits
-
-
Alexey Kardashevskiy authored
At the moment we allocate the entire TCE table, twice (hardware part and userspace translation cache). This normally works as we normally have contigous memory and the guest will map entire RAM for 64bit DMA. However if we have sparse RAM (one example is a memory device), then we will allocate TCEs which will never be used as the guest only maps actual memory for DMA. If it is a single level TCE table, there is nothing we can really do but if it a multilevel table, we can skip allocating TCEs we know we won't need. This adds ability to allocate only first level, saving memory. This changes iommu_table::free() to avoid allocating of an extra level; iommu_table::set() will do this when needed. This adds @alloc parameter to iommu_table::exchange() to tell the callback if it can allocate an extra level; the flag is set to "false" for the realmode KVM handlers of H_PUT_TCE hcalls and the callback returns H_TOO_HARD. This still requires the entire table to be counted in mm::locked_vm. To be conservative, this only does on-demand allocation when the usespace cache table is requested which is the case of VFIO. The example math for a system replicating a powernv setup with NVLink2 in a guest: 16GB RAM mapped at 0x0 128GB GPU RAM window (16GB of actual RAM) mapped at 0x244000000000 the table to cover that all with 64K pages takes: (((0x244000000000 + 0x2000000000) >> 16)*8)>>20 = 4556MB If we allocate only necessary TCE levels, we will only need: (((0x400000000 + 0x400000000) >> 16)*8)>>20 = 4MB (plus some for indirect levels). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
This moves actual pages allocation to a separate function which is going to be reused later in on-demand TCE allocation. While we are at it, remove unnecessary level size round up as the caller does this already. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
We want to support sparse memory and therefore huge chunks of DMA windows do not need to be mapped. If a DMA window big enough to require 2 or more indirect levels, and a DMA window is used to map all RAM (which is a default case for 64bit window), we can actually save some memory by not allocation TCE for regions which we are not going to map anyway. The hardware tables alreary support indirect levels but we also keep host-physical-to-userspace translation array which is allocated by vmalloc() and is a flat array which might use quite some memory. This converts it_userspace from vmalloc'ed array to a multi level table. As the format becomes platform dependend, this replaces the direct access to it_usespace with a iommu_table_ops::useraddrptr hook which returns a pointer to the userspace copy of a TCE; future extension will return NULL if the level was not allocated. This should not change non-KVM handling of TCE tables and it_userspace will not be allocated for non-KVM tables. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
We are going to reuse multilevel TCE code for the userspace copy of the TCE table and since it is big endian, let's make the copy big endian too. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
Right now we have allocation code in pci-ioda.c and traversing code in pci.c, let's keep them toghether. However both files are big enough already so let's move this business to a new file. While we at it, move the code which links IOMMU table groups to IOMMU tables as it is not specific to any PNV PHB model. These puts exported symbols from the new file together. This fixes several warnings from checkpatch.pl like this: "WARNING: Prefer 'unsigned int' to bare use of 'unsigned'". As this is almost cut-n-paste, there should be no behavioral change. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
This gets rid of a useless wrapper around pnv_pci_ioda2_table_free_pages(). Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Nicholas Piggin authored
POWER9 DD1 was never a product. It is no longer supported by upstream firmware, and it is not effectively supported in Linux due to lack of testing. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> [mpe: Remove arch_make_huge_pte() entirely] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 12 Jul, 2018 3 commits
-
-
Daniel Klamt authored
Replace msleep(x) with with msleep(OPAL_BUSY_DELAY_MS) to document these sleeps are to wait for opal (firmware). Signed-off-by: Daniel Klamt <eleon@ele0n.de> Signed-off-by: Bjoern Noetel <bjoern@br3ak3r.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
When we take an SLB multi-hit on bare metal, we see both the multi-hit and parity error bits set in DSISR. The user manuals indicates this is expected to always happen on Power8, whereas on Power9 it says a multi-hit will "usually" also cause a parity error. We decide what to do based on the various error tables in mce_power.c, and because we process them in order and only report the first, we currently always report a parity error but not the multi-hit, eg: Severe Machine check interrupt [Recovered] Initiator: CPU Error type: SLB [Parity] Effective address: c000000ffffd4300 Although this is correct, it leaves the user wondering why they got a parity error. It would be clearer instead if we reported the multi-hit because that is more likely to be simply a software bug, whereas a true parity error is possibly an indication of a bad core. We can do that simply by reordering the error tables so that multi-hit appears before parity. That doesn't affect the error recovery at all, because we flush the SLB either way. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Joel Stanley authored
This was added to support an early version of Power8 that did not have working doorbells. These machines were not publicly available, and all of the internal users have long since upgraded. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 10 Jul, 2018 3 commits
-
-
Bartosz Golaszewski authored
Using 'at24' as fallback is now deprecated - use the full 'atmel,<model>' string. Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Bartosz Golaszewski authored
Using compatible strings without the <manufacturer> part for at24 is now deprecated. Use a correct 'atmel,<model>' value. Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Bartosz Golaszewski authored
Using 'at' as the <manufacturer> part of the compatible string is now deprecated. Use a correct string: 'atmel,<model>'. Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 09 Jul, 2018 1 commit
-
-
Shilpasri G Bhat authored
POWER9 does not support global pstate requests for the chip. So remove the timer logic which slowly ramps down the global pstate in P9 platforms. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [mpe: Drop NULL check before kfree(policy->driver_data)] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 04 Jul, 2018 4 commits
-
-
Aaro Koskinen authored
Enable kernel XZ compression option on BOOK3S_32. Tested on G4 PowerBook. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> [mpe: Use one select under the PPC symbol guarded by if PPC_BOOK3S] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Kees Cook authored
In the quest to remove all stack VLA usage from the kernel[1], this switches from an unchanging variable to a constant expression to eliminate the VLA generation. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.comSigned-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
The sketchy bypass uses 256M pages so add this page size as well. This should cause no behavioral change but will be used later. Fixes: 477afd6ea6 "powerpc/ioda: Use ibm,supported-tce-sizes for IOMMU page size mask" Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Hari Bathini authored
Memory reservation for crashkernel could fail if there are holes around kdump kernel offset (128M). Fail gracefully in such cases and print an error message. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Tested-by: David Gibson <dgibson@redhat.com> Reviewed-by: Dave Young <dyoung@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 03 Jul, 2018 1 commit
-
-
Kees Cook authored
In the quest to remove all stack VLA usage from the kernel[1], this switches to using a stack size large enough for the saved routine and adds a sanity check making sure the routine doesn't overflow into the 0x600 exception handler. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.comSigned-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 02 Jul, 2018 16 commits
-
-
Frederic Barrat authored
If a process exits without doing proper cleanup, there's a window where an opencapi device can try to access the memory of the dying process and may trigger a page fault. That's an expected scenario and the ocxl driver holds a reference on the mm_struct of the process until the opencapi device is notified of the process exiting. However, if mm_users is already at 0, i.e. the address space of the process has already been destroyed, the driver shouldn't try resolving the page fault, as it will fail, but it can also try accessing already freed data. It is fixed by only calling the bottom half of the page fault handler if mm_users is greater than 0 and get a reference on mm_users instead of mm_count. Otherwise, we can safely return a translation fault to the device, as its associated memory context is being removed. The opencapi device will be properly cleaned up shortly after when closing the file descriptors. Fixes: 5ef3166e ("ocxl: Driver code for 'generic' opencapi devices") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-By: Alastair D'Silva <alastair@d-silva.org> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Breno Leitao authored
Fix two typos in the file header. Replacing the word 'priviledged' by 'privileged' and 'exuecuted' by 'executed'. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Breno Leitao authored
There is a buffer overflow in dscr_inherit_test.c test. In main(), strncpy()'s third argument is the length of the source, not the size of the destination buffer, which makes strncpy() behaves like strcpy(), causing a buffer overflow if argv[0] is bigger than LEN_MAX (100). This patch maps 'prog' to the argv[0] memory region, removing the static allocation and the LEN_MAX size restriction. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Frederic Barrat authored
Remove a few XSL/CX4 oddities which are no longer needed. A simple revert of the initial commits was not possible (or not worth it) due to the history of the code. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Frederic Barrat authored
Remove abandonned capi support for the Mellanox CX4. This reverts commit a19bd79e. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Frederic Barrat authored
Remove abandonned capi support for the Mellanox CX4. This reverts commit 4e56f858. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
Remove abandonned capi support for the Mellanox CX4. This reverts commit 4361b034. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
Remove abandonned capi support for the Mellanox CX4. This reverts commit 317f5ef1. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
Remove abandonned capi support for the Mellanox CX4. This reverts commit b0b5e591. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
Remove abandonned capi support for the Mellanox CX4. This reverts commit 79384e4b. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
Remove abandonned capi support for the Mellanox CX4. This reverts commit cbce0917. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
Remove abandonned capi support for the Mellanox CX4. This reverts commit a2f67d5e. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
Remove abandonned capi support for the Mellanox CX4. The symbol 'cxl_set_translation_mode' is never called, so ctx->real_mode is always false. This reverts commit 7a0d85d3. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Neuling authored
debugfs doesn't support mmap(), so this code is never used. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
We use PHB in mode1 which uses bit 59 to select a correct DMA window. However there is mode2 which uses bits 59:55 and allows up to 32 DMA windows per a PE. Even though documentation does not clearly specify that, it seems that the actual hardware does not support bits 59:55 even in mode1, in other words we can create a window as big as 1<<58 but DMA simply won't work. This reduces the upper limit from 59 to 55 bits to let the userspace know about the hardware limits. Fixes: 7aafac11 "powerpc/powernv/ioda2: Gracefully fail if too many TCE levels requested" Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Neuling authored
Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Stewart Smith <stewart@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-