- 08 Oct, 2016 5 commits
-
-
Nathan Sullivan authored
To ensure the dev->phydev pointer is not used after becoming invalid in mdiobus_unregister, set it to NULL. This happens when removing the macb driver without first taking its interface down, since unregister_netdev will end up calling macb_close. Signed-off-by: Xander Huff <xander.huff@ni.com> Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com> Signed-off-by: Brad Mouring <brad.mouring@ni.com> Reviewed-by: Moritz Fischer <moritz.fischer@ettus.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paul Durrant authored
In the case when a frontend only negotiates a single queue with xen- netback it is possible for a skbuff with a s/w hash to result in a hash extra_info segment being sent to the frontend even when no hash algorithm has been configured. (The ndo_select_queue() entry point makes sure the hash is not set if no algorithm is configured, but this entry point is not called when there is only a single queue). This can result in a frontend that is unable to handle extra_info segments being given such a segment, causing it to crash. This patch fixes the problem by clearing the hash in ndo_start_xmit() instead, which is clearly guaranteed to be called irrespective of the number of queues. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Sidorenko authored
Roundrobin runner of team driver uses 'unsigned int' variable to count the number of sent_packets. Later it is passed to a subroutine team_num_to_port_index(struct team *team, int num) as 'num' and when we reach MAXINT (2**31-1), 'num' becomes negative. This leads to using incorrect hash-bucket for port lookup and as a result, packets are dropped. The fix consists of changing 'int num' to 'unsigned int num'. Testing of a fixed kernel shows that there is no packet drop anymore. Signed-off-by: Alex Sidorenko <alexandre.sidorenko@hpe.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paul Durrant authored
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Maciej Żenczykowski authored
This disallows setting /proc/sys/net/ipv6/conf/*/router_solicitations to values below -1. -1 continues to mean an unlimited number of retransmits. Note: this depends on 'ipv6 addrconf: remove addrconf_sysctl_hop_limit()' Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 07 Oct, 2016 26 commits
-
-
Laura Abbott authored
An extra entry for MDIO_XGENE got added during merging. Delete it. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Laura Abbott <labbott@redhat.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geert Uytterhoeven authored
If NO_DMA=y: drivers/built-in.o: In function `emac_probe': emac.c:(.text+0x3780b8): undefined reference to `bad_dma_ops' emac.c:(.text+0x3780e2): undefined reference to `bad_dma_ops' emac.c:(.text+0x378112): undefined reference to `bad_dma_ops' emac.c:(.text+0x378146): undefined reference to `bad_dma_ops' emac.c:(.text+0x37816e): undefined reference to `bad_dma_ops' drivers/built-in.o:emac.c:(.text+0x37819a): more undefined references to `bad_dma_ops' follow If NO_IOMEM=y: drivers/net/ethernet/qualcomm/emac/emac.c: In function ‘emac_remove’: drivers/net/ethernet/qualcomm/emac/emac.c:736:3: error: implicit declaration of function ‘iounmap’ [-Werror=implicit-function-declaration] iounmap(adpt->phy.digital); ^ Add dependencies on HAS_DMA and HAS_IOMEM to fix this. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Nelson Chang says: ==================== net: ethernet: mediatek: check the hw lro capability by the chip id instead of the dtsi The series modify to check if hw lro is supported by the chip id. changes since v3: - Refine mtk_is_hwlro_supported() function changes since v2: - Refine mtk_get_chip_id() function changes since v1: - Because hw lro started to be supported from MT7623, the proper way to check if the feature is capable is to judge by the chip id instead of by the dtsi. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nelson Chang authored
Since the proper way to check the hw lro capability is by the chip id, hwlro property in the device tree should be removed. Signed-off-by: Nelson Chang <nelson.chang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nelson Chang authored
Because hw lro started to be supported from MT7623, the proper way to check if the feature is capable is to judge by the chip id instead of by the dtsi. Signed-off-by: Nelson Chang <nelson.chang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nelson Chang authored
The driver gets the chip id by ETHSYS_CHIPID0_3/ETHSYS_CHIPID4_7 registers in mtk_probe(). Signed-off-by: Nelson Chang <nelson.chang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge tag 'rxrpc-rewrite-20161004' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Fixes This set of patches contains a bunch of fixes: (1) Fix an oops on incoming call to a local endpoint without a bound service. (2) Only ping for a lost reply in a client call (this is inapplicable to service calls). (3) Fix maybe uninitialised variable warnings in the ACK/ABORT sending function by splitting it. (4) Fix loss of PING RESPONSE ACKs due to them being subsumed by PING ACK generation. (5) OpenAFS improperly terminates calls it makes as a client under some circumstances by not fully hard-ACK'ing the last DATA packets. This is alleviated by a new call appearing on the same channel implicitly completing the previous call on that channel. Handle this implicit completion. (6) Properly handle expiry of service calls due to the aforementioned improper termination with no follow up call to implicitly complete it: (a) The call's background processor needs to be queued to complete the call, send an abort and notify the socket. (b) The call's background processor needs to notify the socket (or the kernel service) when it has completed the call. (c) A negative error code must thence be returned to the kernel service so that it knows the call died. (d) The AFS filesystem must detect the fatal error and end the call. (7) Must produce a DELAY ACK when the actual service operation takes a while to process and must cancel the ACK when the reply is ready. (8) Don't request an ACK on the last DATA packet of the Tx phase as this confuses OpenAFS. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Mason authored
During the conversion to the feature flags, a check against ci->id != BCMA_CHIP_ID_BCM47162 became bgmac->feature_flags & BGMAC_FEAT_CLKCTLS instead of !(bgmac->feature_flags & BGMAC_FEAT_CLKCTLS) Reported-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: Jon Mason <jon.mason@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Since linux-3.15, netlink_dump() can use up to 16384 bytes skb allocations. Due to struct skb_shared_info ~320 bytes overhead, we end up using order-3 (on x86) page allocations, that might trigger direct reclaim and add stress. The intent was really to attempt a large allocation but immediately fallback to a smaller one (order-1 on x86) in case of memory stress. On recent kernels (linux-4.4), we can remove __GFP_DIRECT_RECLAIM to meet the goal. Old kernels would need to remove __GFP_WAIT While we are at it, since we do an order-3 allocation, allow to use all the allocated bytes instead of 16384 to reduce syscalls during large dumps. iproute2 already uses 32KB recvmsg() buffer sizes. Alexei provided an initial patch downsizing to SKB_WITH_OVERHEAD(16384) Fixes: 9063e21f ("netlink: autosize skb lengthes") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Alexei Starovoitov <ast@kernel.org> Cc: Greg Thelen <gthelen@google.com> Reviewed-by: Greg Rose <grose@lightfleet.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Anoob Soman authored
If a socket has FANOUT sockopt set, a new proto_hook is registered as part of fanout_add(). When processing a NETDEV_UNREGISTER event in af_packet, __fanout_unlink is called for all sockets, but prot_hook which was registered as part of fanout_add is not removed. Call fanout_release, on a NETDEV_UNREGISTER, which removes prot_hook and removes fanout from the fanout_list. This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo() Signed-off-by: Anoob Soman <anoob.soman@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mike Looijmans authored
The KSZ9031 skew registers contain an offset, the chip's default value is "neutral" which does not add any skew. Programming a 0 into a skew property will actually set it the maximal negative adjustment and not to a neutral position as one would expect. Explain this situation in the devicetree binding documentation and list the settings that the chip considers neutral. Changing the implementation to accept negative values would have been a better solution, but would break existing configurations. Signed-off-by: Mike Looijmans <mike.looijmans@topic.nl> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Raju Lakkaraju authored
Wake-on-LAN (WoL) is an Ethernet networking standard that allows a computer/device to be turned on or awakened by a network message. VSC8531 PHY can support this feature configure by driver set function. WoL status get by driver get function. Tested on Beaglebone Black with VSC 8531 PHY. Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Laurent Pinchart authored
Add a new compatible string for the R8A7796 (M3-W) RAVB. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mugunthan V N authored
Add support to enable CPSW RGMII internal delay (id mode) bits when rgmii internal delay is configured in phy. Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Colin Ian King authored
Trival fix, dev_err messages are missing a \n, so add it. Also fix grammer, spelling mistake and add white spaces to various error messages. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Colin Ian King authored
Trival fix, dev_dbg message is missing a \n, so add it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Colin Ian King authored
Trival fix, dev_err messages are missing a \n, so add it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Paul Durrant says: ==================== xen-netback: guest rx side refactor This series refactors the guest rx side of xen-netback: - The code is moved into its own source module. - The prefix variant of GSO handling is retired (since it is no longer in common use, and alternatives exist). - The code is then simplified and modifications made to improve performance. v2: - Rebased onto refreshed net-next ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ross Lagerwall authored
This allows full 64K skbuffs (with 1500 mtu ethernet, composed of 45 fragments) to be handled by netback for to-guest rx. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> [re-based] Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Vrabel authored
Instead of flushing the copy ops when an packet is complete, complete packets when their copy ops are done. This improves performance by reducing the number of grant copy hypercalls. Latency is still limited by the relatively small size of the copy batch. Signed-off-by: David Vrabel <david.vrabel@citrix.com> [re-based] Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Vrabel authored
Instead of only placing one skb on the guest rx ring at a time, process a batch of up-to 64. This improves performance by ~10% in some tests. Signed-off-by: David Vrabel <david.vrabel@citrix.com> [re-based] Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Vrabel authored
When an skb is removed from the guest rx queue, immediately wake the tx queue, instead of after processing them. Signed-off-by: David Vrabel <david.vrabel@citrix.com> [re-based] Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Vrabel authored
Refactor the to-guest (rx) path to: 1. Push responses for completed skbs earlier, reducing latency. 2. Reduce the per-queue memory overhead by greatly reducing the maximum number of grant copy ops in each hypercall (from 4352 to 64). Each struct xenvif_queue is now only 44 kB instead of 220 kB. 3. Make the code more maintainable. Signed-off-by: David Vrabel <david.vrabel@citrix.com> [re-based] Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paul Durrant authored
As far as I am aware only very old Windows network frontends make use of this style of passing GSO packets from backend to frontend. These frontends can easily be replaced by the freely available Xen Project Windows PV network frontend, which uses the 'default' mechanism for passing GSO packets, which is also used by all Linux frontends. NOTE: Removal of this feature will not cause breakage in old Windows frontends. They simply will no longer receive GSO packets - the packets instead being fragmented in the backend. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paul Durrant authored
The netback source module has become very large and somewhat confusing. This patch simply moves all code related to the backend to frontend (i.e guest side rx) data-path into a separate rx source module. This patch contains no functional change, it is code movement and minimal changes to avoid patch style-check issues. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.freescale.com/ppc/upstream/linuxDavid S. Miller authored
Madalin Bucur says: ==================== fsl/fman: cleanup and small fixes This series contains fixes for the DPAA FMan driver. Adding myself as maintainer of the driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 06 Oct, 2016 9 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fsLinus Torvalds authored
Pull f2fs updates from Jaegeuk Kim: "In this round, we've investigated how f2fs deals with errors given by our fault injection facility. With this, we could fix several corner cases. And, in order to improve the performance, we set inline_dentry by default and enhance the exisiting discard issue flow. In addition, we added f2fs_migrate_page for better memory management. Enhancements: - set inline_dentry by default - improve discard issue flow - add more fault injection cases in f2fs - allow block preallocation for encrypted files - introduce migrate_page callback function - avoid truncating the next direct node block at every checkpoint Bug fixes: - set page flag correctly between write_begin and write_end - missing error handling cases detected by fault injection - preallocate blocks regarding to 4KB alignement correctly - dentry and filename handling of encryption - lost xattrs of directories" * tag 'for-f2fs-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (69 commits) f2fs: introduce update_ckpt_flags to clean up f2fs: don't submit irrelevant page f2fs: fix to commit bio cache after flushing node pages f2fs: introduce get_checkpoint_version for cleanup f2fs: remove dead variable f2fs: remove redundant io plug f2fs: support checkpoint error injection f2fs: fix to recover old fault injection config in ->remount_fs f2fs: do fault injection initialization in default_options f2fs: remove redundant value definition f2fs: support configuring fault injection per superblock f2fs: adjust display format of segment bit f2fs: remove dirty inode pages in error path f2fs: do not unnecessarily null-terminate encrypted symlink data f2fs: handle errors during recover_orphan_inodes f2fs: avoid gc in cp_error case f2fs: should put_page for summary page f2fs: assign return value in f2fs_gc f2fs: add customized migrate_page callback f2fs: introduce cp_lock to protect updating of ckpt_flags ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linuxLinus Torvalds authored
Pull pstore updates from Kees Cook: - Fix bug in module unloading - Switch to always using spinlock over cmpxchg - Explicitly define pstore backend's supported modes - Remove bounce buffer from pmsg - Switch to using memcpy_to/fromio() - Error checking improvements * tag 'pstore-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: ramoops: move spin_lock_init after kmalloc error checking pstore/ram: Use memcpy_fromio() to save old buffer pstore/ram: Use memcpy_toio instead of memcpy pstore/pmsg: drop bounce buffer pstore/ram: Set pstore flags dynamically pstore: Split pstore fragile flags pstore/core: drop cmpxchg based updates pstore/ramoops: fixup driver removal
-
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linuxLinus Torvalds authored
Pull orangefs updates from Mike Marshall: "Miscellaneous improvements: - clean up debugfs globals - remove dead code in sysfs - reorganize duplicated sysfs attribute structs - consolidate sysfs show and store functions - remove duplicated sysfs_ops structures - describe organization of sysfs - make devreq_mutex static - g_orangefs_stats -> orangefs_stats for consistency - rename most remaining global variables Feature negotiation: - enable Orangefs userspace and kernel module to negotiate mutually supported features" * tag 'for-linus-4.9-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: Revert "orangefs: bump minimum userspace version" orangefs: bump minimum userspace version orangefs: rename most remaining global variables orangefs: g_orangefs_stats -> orangefs_stats for consistency orangefs: make devreq_mutex static orangefs: describe organization of sysfs orangefs: remove duplicated sysfs_ops structures orangefs: consolidate sysfs show and store functions orangefs: reorganize duplicated sysfs attribute structs orangefs: remove dead code in sysfs orangefs: clean up debugfs globals orangefs: do not allow client readahead cache without feature bit orangefs: add features op orangefs: record userspace version for feature compatbility orangefs: add readahead count and size to sysfs orangefs: re-add flush_racache from out-of-tree orangefs: turn param response value into union orangefs: add missing param request ops orangefs: rename remaining bits of mmap readahead cache
-
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds authored
Pull tracing updates from Steven Rostedt: "This release cycle is rather small. Just a few fixes to tracing. The big change is the addition of the hwlat tracer. It not only detects SMIs, but also other latency that's caused by the hardware. I have detected some latency from large boxes having bus contention" * tag 'trace-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Call traceoff trigger after event is recorded ftrace/scripts: Add helper script to bisect function tracing problem functions tracing: Have max_latency be defined for HWLAT_TRACER as well tracing: Add NMI tracing in hwlat detector tracing: Have hwlat trace migrate across tracing_cpumask CPUs tracing: Add documentation for hwlat_detector tracer tracing: Added hardware latency tracer ftrace: Access ret_stack->subtime only in the function profiler function_graph: Handle TRACE_BPUTS in print_graph_comment tracing/uprobe: Drop isdigit() check in create_trace_uprobe
-
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tipLinus Torvalds authored
Pull xen updates from David Vrabel: "xen features and fixes for 4.9: - switch to new CPU hotplug mechanism - support driver_override in pciback - require vector callback for HVM guests (the alternate mechanism via the platform device has been broken for ages)" * tag 'for-linus-4.9-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/x86: Update topology map for PV VCPUs xen/x86: Initialize per_cpu(xen_vcpu, 0) a little earlier xen/pciback: support driver_override xen/pciback: avoid multiple entries in slot list xen/pciback: simplify pcistub device handling xen: Remove event channel notification through Xen PCI platform device xen/events: Convert to hotplug state machine xen/x86: Convert to hotplug state machine x86/xen: add missing \n at end of printk warning message xen/grant-table: Use kmalloc_array() in arch_gnttab_valloc() xen: Make VPMU init message look less scary xen: rename xen_pmu_init() in sys-hypervisor.c hotplug: Prevent alloc/free of irq descriptors during cpu up/down (again) xen/x86: Move irq allocation from Xen smp_op.cpu_up()
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull KVM updates from Radim Krčmář: "All architectures: - move `make kvmconfig` stubs from x86 - use 64 bits for debugfs stats ARM: - Important fixes for not using an in-kernel irqchip - handle SError exceptions and present them to guests if appropriate - proxying of GICV access at EL2 if guest mappings are unsafe - GICv3 on AArch32 on ARMv8 - preparations for GICv3 save/restore, including ABI docs - cleanups and a bit of optimizations MIPS: - A couple of fixes in preparation for supporting MIPS EVA host kernels - MIPS SMP host & TLB invalidation fixes PPC: - Fix the bug which caused guests to falsely report lockups - other minor fixes - a small optimization s390: - Lazy enablement of runtime instrumentation - up to 255 CPUs for nested guests - rework of machine check deliver - cleanups and fixes x86: - IOMMU part of AMD's AVIC for vmexit-less interrupt delivery - Hyper-V TSC page - per-vcpu tsc_offset in debugfs - accelerated INS/OUTS in nVMX - cleanups and fixes" * tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (140 commits) KVM: MIPS: Drop dubious EntryHi optimisation KVM: MIPS: Invalidate TLB by regenerating ASIDs KVM: MIPS: Split kernel/user ASID regeneration KVM: MIPS: Drop other CPU ASIDs on guest MMU changes KVM: arm/arm64: vgic: Don't flush/sync without a working vgic KVM: arm64: Require in-kernel irqchip for PMU support KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register KVM: PPC: Book3S PR: Support 64kB page size on POWER8E and POWER8NVL KVM: PPC: Book3S: Remove duplicate setting of the B field in tlbie KVM: PPC: BookE: Fix a sanity check KVM: PPC: Book3S HV: Take out virtual core piggybacking code KVM: PPC: Book3S: Treat VTB as a per-subcore register, not per-thread ARM: gic-v3: Work around definition of gic_write_bpr1 KVM: nVMX: Fix the NMI IDT-vectoring handling KVM: VMX: Enable MSR-BASED TPR shadow even if APICv is inactive KVM: nVMX: Fix reload apic access page warning kvmconfig: add virtio-gpu to config fragment config: move x86 kvm_guest.config to a common location arm64: KVM: Remove duplicating init code for setting VMID ARM: KVM: Support vgic-v3 ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespaceLinus Torvalds authored
Pull namespace updates from Eric Biederman: "This set of changes is a number of smaller things that have been overlooked in other development cycles focused on more fundamental change. The devpts changes are small things that were a distraction until we managed to kill off DEVPTS_MULTPLE_INSTANCES. There is an trivial regression fix to autofs for the unprivileged mount changes that went in last cycle. A pair of ioctls has been added by Andrey Vagin making it is possible to discover the relationships between namespaces when referring to them through file descriptors. The big user visible change is starting to add simple resource limits to catch programs that misbehave. With namespaces in general and user namespaces in particular allowing users to use more kinds of resources, it has become important to have something to limit errant programs. Because the purpose of these limits is to catch errant programs the code needs to be inexpensive to use as it always on, and the default limits need to be high enough that well behaved programs on well behaved systems don't encounter them. To this end, after some review I have implemented per user per user namespace limits, and use them to limit the number of namespaces. The limits being per user mean that one user can not exhause the limits of another user. The limits being per user namespace allow contexts where the limit is 0 and security conscious folks can remove from their threat anlysis the code used to manage namespaces (as they have historically done as it root only). At the same time the limits being per user namespace allow other parts of the system to use namespaces. Namespaces are increasingly being used in application sand boxing scenarios so an all or nothing disable for the entire system for the security conscious folks makes increasing use of these sandboxes impossible. There is also added a limit on the maximum number of mounts present in a single mount namespace. It is nontrivial to guess what a reasonable system wide limit on the number of mount structure in the kernel would be, especially as it various based on how a system is using containers. A limit on the number of mounts in a mount namespace however is much easier to understand and set. In most cases in practice only about 1000 mounts are used. Given that some autofs scenarious have the potential to be 30,000 to 50,000 mounts I have set the default limit for the number of mounts at 100,000 which is well above every known set of users but low enough that the mount hash tables don't degrade unreaonsably. These limits are a start. I expect this estabilishes a pattern that other limits for resources that namespaces use will follow. There has been interest in making inotify event limits per user per user namespace as well as interest expressed in making details about what is going on in the kernel more visible" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (28 commits) autofs: Fix automounts by using current_real_cred()->uid mnt: Add a per mount namespace limit on the number of mounts netns: move {inc,dec}_net_namespaces into #ifdef nsfs: Simplify __ns_get_path tools/testing: add a test to check nsfs ioctl-s nsfs: add ioctl to get a parent namespace nsfs: add ioctl to get an owning user namespace for ns file descriptor kernel: add a helper to get an owning user namespace for a namespace devpts: Change the owner of /dev/pts/ptmx to the mounter of /dev/pts devpts: Remove sync_filesystems devpts: Make devpts_kill_sb safe if fsi is NULL devpts: Simplify devpts_mount by using mount_nodev devpts: Move the creation of /dev/pts/ptmx into fill_super devpts: Move parse_mount_options into fill_super userns: When the per user per user namespace limit is reached return ENOSPC userns; Document per user per user namespace limits. mntns: Add a limit on the number of mount namespaces. netns: Add a limit on the number of net namespaces cgroupns: Add a limit on the number of cgroup namespaces ipcns: Add a limit on the number of ipc namespaces ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfsLinus Torvalds authored
Pull xfs and iomap updates from Dave Chinner: "The main things in this update are the iomap-based DAX infrastructure, an XFS delalloc rework, and a chunk of fixes to how log recovery schedules writeback to prevent spurious corruption detections when recovery of certain items was not required. The other main chunk of code is some preparation for the upcoming reflink functionality. Most of it is generic and cleanups that stand alone, but they were ready and reviewed so are in this pull request. Speaking of reflink, I'm currently planning to send you another pull request next week containing all the new reflink functionality. I'm working through a similar process to the last cycle, where I sent the reverse mapping code in a separate request because of how large it was. The reflink code merge is even bigger than reverse mapping, so I'll be doing the same thing again.... Summary for this update: - change of XFS mailing list to linux-xfs@vger.kernel.org - iomap-based DAX infrastructure w/ XFS and ext2 support - small iomap fixes and additions - more efficient XFS delayed allocation infrastructure based on iomap - a rework of log recovery writeback scheduling to ensure we don't fail recovery when trying to replay items that are already on disk - some preparation patches for upcoming reflink support - configurable error handling fixes and documentation - aio access time update race fixes for XFS and generic_file_read_iter" * tag 'xfs-for-linus-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (40 commits) fs: update atime before I/O in generic_file_read_iter xfs: update atime before I/O in xfs_file_dio_aio_read ext2: fix possible integer truncation in ext2_iomap_begin xfs: log recovery tracepoints to track current lsn and buffer submission xfs: update metadata LSN in buffers during log recovery xfs: don't warn on buffers not being recovered due to LSN xfs: pass current lsn to log recovery buffer validation xfs: rework log recovery to submit buffers on LSN boundaries xfs: quiesce the filesystem after recovery on readonly mount xfs: remote attribute blocks aren't really userdata ext2: use iomap to implement DAX ext2: stop passing buffer_head to ext2_get_blocks xfs: use iomap to implement DAX xfs: refactor xfs_setfilesize xfs: take the ilock shared if possible in xfs_file_iomap_begin xfs: fix locking for DAX writes dax: provide an iomap based fault handler dax: provide an iomap based dax read/write path dax: don't pass buffer_head to copy_user_dax dax: don't pass buffer_head to dax_insert_mapping ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixups from David Miller: "Here are the build and merge fixups for the networking stuff" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: phy: micrel.c: Enable ksz9031 energy-detect power-down mode netfilter: merge fixup for "nf_tables_netdev: remove redundant ip_hdr assignment" netfilter: nft_limit: fix divided by zero panic netfilter: fix namespace handling in nf_log_proc_dostring netfilter: xt_hashlimit: Fix link error in 32bit arch because of 64bit division netfilter: accommodate different kconfig in nf_set_hooks_head netfilter: Fix potential null pointer dereference
-