- 09 Oct, 2018 25 commits
-
-
Vasily Gorbik authored
Introduce sclp_early_get_hsa_size function to be used during early memory detection. This function allows to find a memory limit imposed during zfcpdump. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
Print mem_detect info source when memblock=debug is specified. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
In a situation when other memory detection methods are not available (no SCLP and no z/VM diag260), continuous online memory is assumed. Replacing tprot loop with faster binary search, as only online memory end has to be found. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
When neither SCLP storage info, nor z/VM diag260 "storage configuration" are available assume a continuous online memory of size specified by SCLP info. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
In the case when z/VM memory is defined with "define storage config" command, SCLP storage info is not available. Utilize diag260 "storage configuration" call, to get information about z/VM specific guest memory definitions with potential memory holes. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
SCLP storage info allows to detect continuous and non-continuous online memory under LPAR, z/VM and KVM, when standby memory is defined. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
Make sure that .boot.data sections of vmlinux and arch/s390/compressed/vmlinux match before producing the compressed kernel image. Symbols presence, order and sizes are cross-checked. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
Move memory detection to early boot phase. To store online memory regions "struct mem_detect_info" has been introduced together with for_each_mem_detect_block iterator. mem_detect_info is later converted to memblock. Also introduces sclp_early_get_meminfo function to get maximum physical memory and maximum increment number. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
To enable early online memory detection sclp_early_read_info has been moved to sclp_early_core.c. sclp_info_sccb has been made a part of .boot.data, which allows to reuse it later during early kernel startup and make sclp_early_read_info call just once. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
Introduce .boot.data section which is "shared" between the decompressor code and the decompressed kernel. The decompressor will store values in it, and copy over to the decompressed image before starting it. This method allows to avoid using pre-defined addresses and other hacks to pass values between those boot phases. .boot.data section is a part of init data, and will be freed after kernel initialization is complete. For uncompressed kernel image, .boot.data section is basically the same as .init.data Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
Since compressed/misc.c is conditionally compiled move error reporting code to boot/main.c. With that being done compressed/misc.c has no "miscellaneous" functions left and is all about plain decompression now. Rename it accordingly. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
To avoid multi-stage initrd rescue operation and to simplify assumptions during early memory allocations move initrd at some final safe destination as early as possible. This would also allow us to drop .bss usage restrictions for some files. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
Architecture documentation suggests that hsa_size has been available in the read info since the list-directed ipl dump has been introduced. By using this value few early sclp calls could be avoided. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
Using .bss in early code should be avoided. It might overlay initrd image or not yet be initialized. Clean up the last couple of places in the decompressor's code where .bss is used and enfore no .bss usage check on boot/compressed/misc.c. In particular: - initializing free_mem_ptr and free_mem_end_ptr with values guarantee that these variables won't end up in the .bss section. - define STATIC_RW_DATA to go into .data section. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
The kernel decompressor has to know several bits of information about uncompressed image. Currently this info is collected by running "nm" on uncompressed vmlinux + "sed" and producing sizes.h file. This method worked well, but it has several disadvantages. Obscure symbols name pattern matching is fragile. Adding new values makes pattern even longer. Logic is spread across code and make file. Limited ability to adjust symbols values (currently magic lma value of 0x100000 is always subtracted). Apart from that same pieces of information (and more) would be needed for early memory detection and features like KASLR outside of boot/compressed/ folder where sizes.h is generated. To overcome limitations new "struct vmlinux_info" has been introduced to include values needed for the decompressor and the rest of the boot code. The only static instance of vmlinux_info is produced during vmlinux link step by filling in struct fields by the linker (like it is done with input_data in boot/compressed/vmlinux.scr.lds.S). This way individual values could be adjusted with all the knowledge linker has and arithmetic it supports. Later .vmlinux.info section (which contains struct vmlinux_info) is transplanted into the decompressor image and dropped from uncompressed image altogether. While doing that replace "compressed/vmlinux.scr.lds.S" linker script (whose purpose is to rename .data section in piggy.o to .rodata.compressed) with plain objcopy command. And simplify decompressor's linker script. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
Decompressor's head.S provided "data mover" sole purpose of which has been to safely move uncompressed kernel at 0x100000 and jump to it. With current bzImage layout entire decompressor's code guaranteed to be in a safe location under 0x100000, and hence could not be overwritten during kernel move. For that reason head.S could be replaced with simple memmove function. To do so introduce early boot code phase which is executed from arch/s390/boot/head.S after "verify_facilities" and takes care of optional kernel image decompression and transition to it. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
Remove STACK_ORDER and STACK_SIZE in favour of identical THREAD_SIZE_ORDER and THREAD_SIZE definitions. THREAD_SIZE and THREAD_SIZE_ORDER naming is misleading since it is used as general kernel stack size information. But both those definitions are used in the common code and throughout architectures specific code, so changing the naming is problematic. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Martin Schwidefsky authored
With virtually mapped kernel stacks the kernel stack overflow detection is now fault based, every stack has a guard page in the vmalloc space. The panic_stack is renamed to nodat_stack and is used for all function that need to run without DAT, e.g. memcpy_real or do_start_kdump. The main effect is a reduction in the kernel image size as with vmap stacks the old style overflow checking that adds two instructions per function is not needed anymore. Result from bloat-o-meter: add/remove: 20/1 grow/shrink: 13/26854 up/down: 2198/-216240 (-214042) In regard to performance the micro-benchmark for fork has a hit of a few microseconds, allocating 4 pages in vmalloc space is more expensive compare to an order-2 page allocation. But with real workload I could not find a noticeable difference. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Martin Schwidefsky authored
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Martin Schwidefsky authored
With CONFIG_VMAP_STACK=y the kernel stack of all tasks should be allocated in the vmalloc space. The initial stack used for all the early init code is in the init_thread_union. To be able to switch from this early stack to a properly allocated stack from vmalloc the architecture needs a switch-over point. Introduce the arch_call_rest_init() function with a weak definition in init/main.c with the only purpose to call rest_init() from the end of start_kernel(). The architecture override can then do the necessary magic to switch to the new vmalloc'ed stack. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Martin Schwidefsky authored
With CONFIG_VMAP_STACK=y the stack is allocated from the vmalloc space. Data structures passed to a hardware or a hypervisor interface that requires V=R can not be allocated on the stack anymore. Make the init and fini pfault parameter blocks static variables. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Martin Schwidefsky authored
With CONFIG_VMAP_STACK=y the stack is allocated from the vmalloc space. Data structures passed to a hardware or a hypervisor interface that requires V=R can not be allocated on the stack anymore. Use kmalloc to get memory for the appldata_parameter_list and appldata_product_id structures. Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Martin Schwidefsky authored
With CONFIG_VMAP_STACK=y the stack is allocated from the vmalloc space. Data structures passed to a hardware or a hypervisor interface that requires V=R can not be allocated on the stack anymore. Use kmalloc to get memory for the hypsfs_diag304 structure. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Martin Schwidefsky authored
With CONFIG_VMAP_STACK=y the stack is allocated from the vmalloc space. Data structures passed to a hardware or a hypervisor interface that requires V=R can not be allocated on the stack anymore. Use kmalloc to get memory for the appldata_product_id and the appldata_parameter_list structures. Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Martin Schwidefsky authored
In preparation for CONFIG_VMAP_STACK=y move the allocation of the struct appldata_parameter_list to the caller of appldata_asm(). Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 08 Oct, 2018 2 commits
-
-
Julian Wiedmann authored
Provide function to find a ccwgroup device by its busid. Acked-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Harald Freudenberger authored
This patch is an extension to the zcrypt device driver to provide, support and maintain multiple zcrypt device nodes. The individual zcrypt device nodes can be restricted in terms of crypto cards, domains and available ioctls. Such a device node can be used as a base for container solutions like docker to control and restrict the access to crypto resources. The handling is done with a new sysfs subdir /sys/class/zcrypt. Echoing a name (or an empty sting) into the attribute "create" creates a new zcrypt device node. In /sys/class/zcrypt a new link will appear which points to the sysfs device tree of this new device. The attribute files "ioctlmask", "apmask" and "aqmask" in this directory are used to customize this new zcrypt device node instance. Finally the zcrypt device node can be destroyed by echoing the name into /sys/class/zcrypt/destroy. The internal structs holding the device info are reference counted - so a destroy will not hard remove a device but only marks it as removable when the reference counter drops to zero. The mask values are bitmaps in big endian order starting with bit 0. So adapter number 0 is the leftmost bit, mask is 0x8000... The sysfs attributes accept 2 different formats: * Absolute hex string starting with 0x like "0x12345678" does set the mask starting from left to right. If the given string is shorter than the mask it is padded with 0s on the right. If the string is longer than the mask an error comes back (EINVAL). * Relative format - a concatenation (done with ',') of the terms +<bitnr>[-<bitnr>] or -<bitnr>[-<bitnr>]. <bitnr> may be any valid number (hex, decimal or octal) in the range 0...255. Here are some examples: "+0-15,+32,-128,-0xFF" "-0-255,+1-16,+0x128" "+1,+2,+3,+4,-5,-7-10" A simple usage examples: # create new zcrypt device 'my_zcrypt': echo "my_zcrypt" >/sys/class/zcrypt/create # go into the device dir of this new device echo "my_zcrypt" >create cd my_zcrypt/ ls -l total 0 -rw-r--r-- 1 root root 4096 Jul 20 15:23 apmask -rw-r--r-- 1 root root 4096 Jul 20 15:23 aqmask -r--r--r-- 1 root root 4096 Jul 20 15:23 dev -rw-r--r-- 1 root root 4096 Jul 20 15:23 ioctlmask lrwxrwxrwx 1 root root 0 Jul 20 15:23 subsystem -> ../../../../class/zcrypt ... # customize this zcrypt node clone # enable only adapter 0 and 2 echo "0xa0" >apmask # enable only domain 6 echo "+6" >aqmask # enable all 256 ioctls echo "+0-255" >ioctls # now the /dev/my_zcrypt may be used # finally destroy it echo "my_zcrypt" >/sys/class/zcrypt/destroy Please note that a very similar 'filtering behavior' also applies to the parent z90crypt device. The two mask attributes apmask and aqmask in /sys/bus/ap act the very same for the z90crypt device node. However the implementation here is totally different as the ap bus acts on bind/unbind of queue devices and associated drivers but the effect is still the same. So there are two filters active for each additional zcrypt device node: The adapter/domain needs to be enabled on the ap bus level and it needs to be active on the zcrypt device node level. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 20 Sep, 2018 13 commits
-
-
Julian Wiedmann authored
I've stumbled over this too many times now... AOBs are only ever used on Output Queues. So in qdio_kick_handler(), move the call to their handler into the Output-only path, and get rid of the convoluted contains_aobs() helper. No functional change. While at it, also remove 1. the unused sbal_state->aob field. For processing an async completion, upper-layer drivers get their AOB pointer from the CQ buffer. 2. an unused EXPORT for qdio_allocate_aob(). External users would have no way of passing an allocated AOB back into qdio.ko anyways... Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
Replace hard coded stack frame overhead values with STACK_FRAME_OVERHEAD definition. Avoid unnecessary arithmetic instructions. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
Correct stack frame overhead for 31-bit vdso, which should be 96 rather then 160. This is done by reusing STACK_FRAME_OVERHEAD definition which contains correct value based on build flags. This fixes stack unwinding within vdso code for 31-bit processes. While at it replace all hard coded stack frame overhead values with the same definition in vdso64 as well. Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
vdso_fault used is_compat_task function (on s390 it tests "current" thread_info flags) to distinguish compat tasks and map 31-bit vdso pages. But "current" task might not correspond to mm context. When 31-bit compat inferior is executed under gdb, gdb does PTRACE_PEEKTEXT on vdso page, causing vdso_fault with "current" being 64-bit gdb process. So, 31-bit inferior ends up with 64-bit vdso mapped. To avoid this problem a new compat_mm flag has been introduced into mm context. This flag is used in vdso_fault and vdso_mremap instead of is_compat_task. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Halil Pasic authored
The AP bus scan is aborted before doing anything worth mentioning if ap_select_domain() fails, e.g. if the ap_rights.aqm mask is all zeros. As the result of this the ap bus fails to manage (e.g. create and register) devices like it is supposed to. Let us make ap_scan_bus() work even if ap_select_domain() can't select a default domain. Let's also make ap_select_domain() return void, as there are no more callers interested in its return value. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reported-by: Michael Mueller <mimu@linux.ibm.com> Fixes: 7e0bdbe5 "s390/zcrypt: AP bus support for alternate driver(s)" [freude@linux.ibm.com: title and patch header slightly modified] Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Martin Schwidefsky authored
To be able to start a kernel image loaded into memory with a PSW restart, place a 64-bit restart PSW at 0x1a0 in absolute lowcore. Suggested-by: Dominik Klein <dominik.klein@linux.ibm.com> Tested-by: Dominik Klein <dominik.klein@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Colin Ian King authored
Trivial fix to spelling mistake in message text Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
zhong jiang authored
Use the common code ARRAY_SIZE macro instead of a private implementation. Reviewed-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
zhong jiang authored
kmemdup has implemented the function that kmalloc() + memcpy() will do. We prefer to use the kmemdup function rather than an open coded implementation. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Jan Höppner authored
The SCLP event 24 "Adapter Error Notification" supports three different action qualifier of which 'adapter reset' is currently not enabled in the sysfs interface. However, userspace tools might want to be able to use the reset functionality as well. Enable the 'adapter reset' qualifier. Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com> Reviewed-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Chengguang Xu authored
kmem_cache_destroy() can handle NULL pointer correctly, so there is no need to check NULL pointer before calling kmem_cache_destroy(). Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Gerald Schaefer authored
The resume code checks if the resume cpu is the same as the suspend cpu. If not, and if it is also not possible to switch to the suspend cpu, an error message should be printed and the resume process should be stopped by loading a disabled wait psw. The current logic is broken in multiple ways, the message is never printed, and the disabled wait psw never loaded because the kernel panics before that: - sam31 and SIGP_SET_ARCHITECTURE to ESA mode is wrong, this will break on the first 64bit instruction in sclp_early_printk(). - The init stack should be used, but the stack pointer is not set up correctly (missing aghi %r15,-STACK_FRAME_OVERHEAD). - __sclp_early_printk() checks the sclp_init_state. If it is not sclp_init_state_uninitialized, it simply returns w/o printing anything. In the resumed kernel however, sclp_init_state will never be uninitialized. This patch fixes those issues by removing the sam31/ESA logic, adding a correct init stack pointer, and also introducing sclp_early_printk_force() to allow using sclp_early_printk() even when sclp_init_state is not uninitialized. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
git://git.infradead.org/linux-mtdGreg Kroah-Hartman authored
Boris writes: "- Fixes a bug in the ->read/write_reg() implementation of the m25p80 driver - Make sure of_node_get/put() calls are balanced in the partition parsing code - Fix a race in the denali NAND controller driver - Fix false positive WARN_ON() in the marvell NAND controller driver" * tag 'mtd/fixes-for-4.19-rc5' of git://git.infradead.org/linux-mtd: mtd: devices: m25p80: Make sure the buffer passed in op is DMA-able mtd: partitions: fix unbalanced of_node_get/put() mtd: rawnand: denali: fix a race condition when DMA is kicked mtd: rawnand: marvell: prevent harmless warnings
-