Commits · fa6114d4bde70152765ba1c35fed4fcd8481faf6 · Kirill Smelkov / linux

08 Oct, 2016 5 commits

net: macb: NULL out phydev after removing mdio bus · fa6114d4

Nathan Sullivan authored Oct 07, 2016

To ensure the dev->phydev pointer is not used after becoming invalid in
mdiobus_unregister, set it to NULL. This happens when removing the macb
driver without first taking its interface down, since unregister_netdev
will end up calling macb_close.
Signed-off-by: Xander Huff <xander.huff@ni.com>
Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com>
Signed-off-by: Brad Mouring <brad.mouring@ni.com>
Reviewed-by: Moritz Fischer <moritz.fischer@ettus.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fa6114d4

xen-netback: make sure that hashes are not send to unaware frontends · 912e27e8

Paul Durrant authored Oct 07, 2016

In the case when a frontend only negotiates a single queue with xen-
netback it is possible for a skbuff with a s/w hash to result in a
hash extra_info segment being sent to the frontend even when no hash
algorithm has been configured. (The ndo_select_queue() entry point makes
sure the hash is not set if no algorithm is configured, but this entry
point is not called when there is only a single queue). This can result
in a frontend that is unable to handle extra_info segments being given
such a segment, causing it to crash.

This patch fixes the problem by clearing the hash in ndo_start_xmit()
instead, which is clearly guaranteed to be called irrespective of the
number of queues.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

912e27e8

Fixing a bug in team driver due to incorrect 'unsigned int' to 'int' conversion · 21d9629a

Alex Sidorenko authored Oct 07, 2016

Roundrobin runner of team driver uses 'unsigned int' variable to count
the number of sent_packets. Later it is passed to a subroutine
team_num_to_port_index(struct team *team, int num) as 'num' and when
we reach MAXINT (2**31-1), 'num' becomes negative.

This leads to using incorrect hash-bucket for port lookup
and as a result, packets are dropped. The fix consists of changing
'int num' to 'unsigned int num'. Testing of a fixed kernel shows that
there is no packet drop anymore.
Signed-off-by: Alex Sidorenko <alexandre.sidorenko@hpe.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21d9629a

MAINTAINERS: add myself as a maintainer of xen-netback · 7d3cfc36

Paul Durrant authored Oct 07, 2016

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7d3cfc36

ipv6 addrconf: disallow rtr_solicits < -1 · cb4a4c69

Maciej Żenczykowski authored Oct 07, 2016

This disallows setting /proc/sys/net/ipv6/conf/*/router_solicitations
to values below -1.

-1 continues to mean an unlimited number of retransmits.

Note: this depends on 'ipv6 addrconf: remove addrconf_sysctl_hop_limit()'
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cb4a4c69

07 Oct, 2016 26 commits

drivers: net: phy: Correct duplicate MDIO_XGENE entry · 7aa6ec22

Laura Abbott authored Oct 06, 2016

An extra entry for MDIO_XGENE got added during merging.
Delete it.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7aa6ec22

ethernet: qualcomm: QCOM_EMAC should depend on HAS_DMA and HAS_IOMEM · e5e0fbfc

Geert Uytterhoeven authored Oct 06, 2016

If NO_DMA=y:

    drivers/built-in.o: In function `emac_probe':
    emac.c:(.text+0x3780b8): undefined reference to `bad_dma_ops'
    emac.c:(.text+0x3780e2): undefined reference to `bad_dma_ops'
    emac.c:(.text+0x378112): undefined reference to `bad_dma_ops'
    emac.c:(.text+0x378146): undefined reference to `bad_dma_ops'
    emac.c:(.text+0x37816e): undefined reference to `bad_dma_ops'
    drivers/built-in.o:emac.c:(.text+0x37819a): more undefined references to `bad_dma_ops' follow

If NO_IOMEM=y:

    drivers/net/ethernet/qualcomm/emac/emac.c: In function ‘emac_remove’:
    drivers/net/ethernet/qualcomm/emac/emac.c:736:3: error: implicit declaration of function ‘iounmap’ [-Werror=implicit-function-declaration]
       iounmap(adpt->phy.digital);
       ^

Add dependencies on HAS_DMA and HAS_IOMEM to fix this.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e5e0fbfc

Merge branch 'mediatek-hw-lro-chip-id-check' · a0ec9319

David S. Miller authored Oct 06, 2016

Nelson Chang says:

====================
net: ethernet: mediatek: check the hw lro capability by the chip id instead of the dtsi

The series modify to check if hw lro is supported by the chip id.

changes since v3:
- Refine mtk_is_hwlro_supported() function

changes since v2:
- Refine mtk_get_chip_id() function

changes since v1:
- Because hw lro started to be supported from MT7623, the proper way to check if the feature is capable is to judge by the chip id instead of by the dtsi.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

a0ec9319

net: ethernet: mediatek: remove hwlro property in the device tree · 3a09f18e

Nelson Chang authored Oct 06, 2016

Since the proper way to check the hw lro capability is by the chip id,
hwlro property in the device tree should be removed.
Signed-off-by: Nelson Chang <nelson.chang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3a09f18e

net: ethernet: mediatek: get hw lro capability by the chip id instead of by the dtsi · 983e1a6c

Nelson Chang authored Oct 06, 2016

Because hw lro started to be supported from MT7623, the proper way to check if
the feature is capable is to judge by the chip id instead of by the dtsi.
Signed-off-by: Nelson Chang <nelson.chang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

983e1a6c

net: ethernet: mediatek: get the chip id by ETHDMASYS registers · b95b6d99

Nelson Chang authored Oct 06, 2016

The driver gets the chip id by ETHSYS_CHIPID0_3/ETHSYS_CHIPID4_7 registers
in mtk_probe().
Signed-off-by: Nelson Chang <nelson.chang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b95b6d99

Merge tag 'rxrpc-rewrite-20161004' of... · 0d818c28

David S. Miller authored Oct 06, 2016

Merge tag 'rxrpc-rewrite-20161004' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Fixes

This set of patches contains a bunch of fixes:

 (1) Fix an oops on incoming call to a local endpoint without a bound
     service.

 (2) Only ping for a lost reply in a client call (this is inapplicable to
     service calls).

 (3) Fix maybe uninitialised variable warnings in the ACK/ABORT sending
     function by splitting it.

 (4) Fix loss of PING RESPONSE ACKs due to them being subsumed by PING ACK
     generation.

 (5) OpenAFS improperly terminates calls it makes as a client under some
     circumstances by not fully hard-ACK'ing the last DATA packets.  This
     is alleviated by a new call appearing on the same channel implicitly
     completing the previous call on that channel.  Handle this implicit
     completion.

 (6) Properly handle expiry of service calls due to the aforementioned
     improper termination with no follow up call to implicitly complete it:

     (a) The call's background processor needs to be queued to complete the
     	 call, send an abort and notify the socket.

     (b) The call's background processor needs to notify the socket (or the
     	 kernel service) when it has completed the call.

     (c) A negative error code must thence be returned to the kernel
     	 service so that it knows the call died.

     (d) The AFS filesystem must detect the fatal error and end the call.

 (7) Must produce a DELAY ACK when the actual service operation takes a
     while to process and must cancel the ACK when the reply is ready.

 (8) Don't request an ACK on the last DATA packet of the Tx phase as this
     confuses OpenAFS.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

0d818c28

net: bgmac: Fix errant feature flag check · 4af1474e

Jon Mason authored Oct 05, 2016

During the conversion to the feature flags, a check against
ci->id != BCMA_CHIP_ID_BCM47162
became
bgmac->feature_flags & BGMAC_FEAT_CLKCTLS
instead of
!(bgmac->feature_flags & BGMAC_FEAT_CLKCTLS)
Reported-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4af1474e

netlink: do not enter direct reclaim from netlink_dump() · d35c99ff

Eric Dumazet authored Oct 06, 2016

Since linux-3.15, netlink_dump() can use up to 16384 bytes skb
allocations.

Due to struct skb_shared_info ~320 bytes overhead, we end up using
order-3 (on x86) page allocations, that might trigger direct reclaim and
add stress.

The intent was really to attempt a large allocation but immediately
fallback to a smaller one (order-1 on x86) in case of memory stress.

On recent kernels (linux-4.4), we can remove __GFP_DIRECT_RECLAIM to
meet the goal. Old kernels would need to remove __GFP_WAIT

While we are at it, since we do an order-3 allocation, allow to use
all the allocated bytes instead of 16384 to reduce syscalls during
large dumps.

iproute2 already uses 32KB recvmsg() buffer sizes.

Alexei provided an initial patch downsizing to SKB_WITH_OVERHEAD(16384)

Fixes: 9063e21f ("netlink: autosize skb lengthes")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Reviewed-by: Greg Rose <grose@lightfleet.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

d35c99ff

packet: call fanout_release, while UNREGISTERING a netdev · 66644982

Anoob Soman authored Oct 05, 2016

If a socket has FANOUT sockopt set, a new proto_hook is registered
as part of fanout_add(). When processing a NETDEV_UNREGISTER event in
af_packet, __fanout_unlink is called for all sockets, but prot_hook which was
registered as part of fanout_add is not removed. Call fanout_release, on a
NETDEV_UNREGISTER, which removes prot_hook and removes fanout from the
fanout_list.

This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo()
Signed-off-by: Anoob Soman <anoob.soman@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

66644982

devicetree: net: micrel-ksz90x1.txt: Properly explain skew settings · 3d9e133f

Mike Looijmans authored Oct 05, 2016

The KSZ9031 skew registers contain an offset, the chip's default value
is "neutral" which does not add any skew. Programming a 0 into a skew
property will actually set it the maximal negative adjustment and not
to a neutral position as one would expect.

Explain this situation in the devicetree binding documentation and list
the settings that the chip considers neutral.

Changing the implementation to accept negative values would have been
a better solution, but would break existing configurations.
Signed-off-by: Mike Looijmans <mike.looijmans@topic.nl>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3d9e133f

net: phy: Add Wake-on-LAN driver for Microsemi PHYs. · 0a55c12f

Raju Lakkaraju authored Oct 05, 2016

Wake-on-LAN (WoL) is an Ethernet networking standard that allows
a computer/device to be turned on or awakened by a network message.

VSC8531 PHY can support this feature configure by driver set function.
WoL status get by driver get function.

Tested on Beaglebone Black with VSC 8531 PHY.
Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0a55c12f

dt-bindings: net: renesas-ravb: Add support for R8A7796 RAVB · ed2eb0fb

Laurent Pinchart authored Oct 04, 2016

Add a new compatible string for the R8A7796 (M3-W) RAVB.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>

ed2eb0fb

drivers: net: cpsw-phy-sel: add support to configure rgmii internal delay · 0fb26c30

Mugunthan V N authored Oct 04, 2016

Add support to enable CPSW RGMII internal delay (id mode) bits
when rgmii internal delay is configured in phy.
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0fb26c30

net: hns: Add missing \n to end of dev_err messages, tidy up text · 451e856e

Colin Ian King authored Oct 04, 2016

Trival fix, dev_err messages are missing a \n, so add it. Also
fix grammer, spelling mistake and add white spaces to various
error messages.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

451e856e

net: ps3_gelic: Add missing \n to end of deb_dbg message · 87089dd7

Colin Ian King authored Oct 04, 2016

Trival fix, dev_dbg message is missing a \n, so add it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

87089dd7

net: axienet: Add missing \n to end of dev_err messages · 68c8182b

Colin Ian King authored Oct 04, 2016

Trival fix, dev_err messages are missing a \n, so add it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

68c8182b

Merge branch 'xen-netback-rx-refactor' · eb76a9f1

David S. Miller authored Oct 06, 2016

Paul Durrant says:

====================
xen-netback: guest rx side refactor

This series refactors the guest rx side of xen-netback:

- The code is moved into its own source module.

- The prefix variant of GSO handling is retired (since it is no longer
  in common use, and alternatives exist).

- The code is then simplified and modifications made to improve
  performance.

v2:
- Rebased onto refreshed net-next
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

eb76a9f1

xen/netback: add fraglist support for to-guest rx · 2167ca02

Ross Lagerwall authored Oct 04, 2016

This allows full 64K skbuffs (with 1500 mtu ethernet, composed of 45
fragments) to be handled by netback for to-guest rx.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2167ca02

xen-netback: batch copies for multiple to-guest rx packets · a37f1229

David Vrabel authored Oct 04, 2016

Instead of flushing the copy ops when an packet is complete, complete
packets when their copy ops are done.  This improves performance by
reducing the number of grant copy hypercalls.

Latency is still limited by the relatively small size of the copy
batch.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a37f1229

xen-netback: process guest rx packets in batches · 98f6d57c

David Vrabel authored Oct 04, 2016

Instead of only placing one skb on the guest rx ring at a time, process
a batch of up-to 64.  This improves performance by ~10% in some tests.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

98f6d57c

xen-netback: immediately wake tx queue when guest rx queue has space · 7c0b1a23

David Vrabel authored Oct 04, 2016

When an skb is removed from the guest rx queue, immediately wake the
tx queue, instead of after processing them.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7c0b1a23

xen-netback: refactor guest rx · eb1723a2

David Vrabel authored Oct 04, 2016

Refactor the to-guest (rx) path to:

1. Push responses for completed skbs earlier, reducing latency.

2. Reduce the per-queue memory overhead by greatly reducing the
   maximum number of grant copy ops in each hypercall (from 4352 to
   64).  Each struct xenvif_queue is now only 44 kB instead of 220 kB.

3. Make the code more maintainable.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

eb1723a2

xen-netback: retire guest rx side prefix GSO feature · fedbc8c1

Paul Durrant authored Oct 04, 2016

As far as I am aware only very old Windows network frontends make use of
this style of passing GSO packets from backend to frontend. These
frontends can easily be replaced by the freely available Xen Project
Windows PV network frontend, which uses the 'default' mechanism for
passing GSO packets, which is also used by all Linux frontends.

NOTE: Removal of this feature will not cause breakage in old Windows
frontends. They simply will no longer receive GSO packets - the
packets instead being fragmented in the backend.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fedbc8c1

xen-netback: separate guest side rx code into separate module · 3254f836

Paul Durrant authored Oct 04, 2016

The netback source module has become very large and somewhat confusing.
This patch simply moves all code related to the backend to frontend (i.e
guest side rx) data-path into a separate rx source module.

This patch contains no functional change, it is code movement and
minimal changes to avoid patch style-check issues.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3254f836

Merge branch 'fman-next' of git://git.freescale.com/ppc/upstream/linux · 00c06ed7

David S. Miller authored Oct 06, 2016

Madalin Bucur says:

====================
fsl/fman: cleanup and small fixes

This series contains fixes for the DPAA FMan driver.
Adding myself as maintainer of the driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

00c06ed7

06 Oct, 2016 9 commits

Merge tag 'for-f2fs-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs · 4c1fad64

Linus Torvalds authored Oct 06, 2016

Pull f2fs updates from Jaegeuk Kim:
 "In this round, we've investigated how f2fs deals with errors given by
  our fault injection facility. With this, we could fix several corner
  cases. And, in order to improve the performance, we set inline_dentry
  by default and enhance the exisiting discard issue flow. In addition,
  we added f2fs_migrate_page for better memory management.

  Enhancements:
   - set inline_dentry by default
   - improve discard issue flow
   - add more fault injection cases in f2fs
   - allow block preallocation for encrypted files
   - introduce migrate_page callback function
   - avoid truncating the next direct node block at every checkpoint

  Bug fixes:
   - set page flag correctly between write_begin and write_end
   - missing error handling cases detected by fault injection
   - preallocate blocks regarding to 4KB alignement correctly
   - dentry and filename handling of encryption
   - lost xattrs of directories"

* tag 'for-f2fs-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (69 commits)
  f2fs: introduce update_ckpt_flags to clean up
  f2fs: don't submit irrelevant page
  f2fs: fix to commit bio cache after flushing node pages
  f2fs: introduce get_checkpoint_version for cleanup
  f2fs: remove dead variable
  f2fs: remove redundant io plug
  f2fs: support checkpoint error injection
  f2fs: fix to recover old fault injection config in ->remount_fs
  f2fs: do fault injection initialization in default_options
  f2fs: remove redundant value definition
  f2fs: support configuring fault injection per superblock
  f2fs: adjust display format of segment bit
  f2fs: remove dirty inode pages in error path
  f2fs: do not unnecessarily null-terminate encrypted symlink data
  f2fs: handle errors during recover_orphan_inodes
  f2fs: avoid gc in cp_error case
  f2fs: should put_page for summary page
  f2fs: assign return value in f2fs_gc
  f2fs: add customized migrate_page callback
  f2fs: introduce cp_lock to protect updating of ckpt_flags
  ...

4c1fad64

Merge tag 'pstore-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 0fb3ca44

Linus Torvalds authored Oct 06, 2016

Pull pstore updates from Kees Cook:

 - Fix bug in module unloading

 - Switch to always using spinlock over cmpxchg

 - Explicitly define pstore backend's supported modes

 - Remove bounce buffer from pmsg

 - Switch to using memcpy_to/fromio()

 - Error checking improvements

* tag 'pstore-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  ramoops: move spin_lock_init after kmalloc error checking
  pstore/ram: Use memcpy_fromio() to save old buffer
  pstore/ram: Use memcpy_toio instead of memcpy
  pstore/pmsg: drop bounce buffer
  pstore/ram: Set pstore flags dynamically
  pstore: Split pstore fragile flags
  pstore/core: drop cmpxchg based updates
  pstore/ramoops: fixup driver removal

0fb3ca44

Merge tag 'for-linus-4.9-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux · 3940ee36

Linus Torvalds authored Oct 06, 2016

Pull orangefs updates from Mike Marshall:
 "Miscellaneous improvements:
   - clean up debugfs globals
   - remove dead code in sysfs
   - reorganize duplicated sysfs attribute structs
   - consolidate sysfs show and store functions
   - remove duplicated sysfs_ops structures
   - describe organization of sysfs
   - make devreq_mutex static
   - g_orangefs_stats -> orangefs_stats for consistency
   - rename most remaining global variables

  Feature negotiation:
   - enable Orangefs userspace and kernel module to negotiate mutually
     supported features"

* tag 'for-linus-4.9-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
  Revert "orangefs: bump minimum userspace version"
  orangefs: bump minimum userspace version
  orangefs: rename most remaining global variables
  orangefs: g_orangefs_stats -> orangefs_stats for consistency
  orangefs: make devreq_mutex static
  orangefs: describe organization of sysfs
  orangefs: remove duplicated sysfs_ops structures
  orangefs: consolidate sysfs show and store functions
  orangefs: reorganize duplicated sysfs attribute structs
  orangefs: remove dead code in sysfs
  orangefs: clean up debugfs globals
  orangefs: do not allow client readahead cache without feature bit
  orangefs: add features op
  orangefs: record userspace version for feature compatbility
  orangefs: add readahead count and size to sysfs
  orangefs: re-add flush_racache from out-of-tree
  orangefs: turn param response value into union
  orangefs: add missing param request ops
  orangefs: rename remaining bits of mmap readahead cache

3940ee36

Merge tag 'trace-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 95107b30

Linus Torvalds authored Oct 06, 2016

Pull tracing updates from Steven Rostedt:
 "This release cycle is rather small.  Just a few fixes to tracing.

  The big change is the addition of the hwlat tracer. It not only
  detects SMIs, but also other latency that's caused by the hardware. I
  have detected some latency from large boxes having bus contention"

* tag 'trace-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Call traceoff trigger after event is recorded
  ftrace/scripts: Add helper script to bisect function tracing problem functions
  tracing: Have max_latency be defined for HWLAT_TRACER as well
  tracing: Add NMI tracing in hwlat detector
  tracing: Have hwlat trace migrate across tracing_cpumask CPUs
  tracing: Add documentation for hwlat_detector tracer
  tracing: Added hardware latency tracer
  ftrace: Access ret_stack->subtime only in the function profiler
  function_graph: Handle TRACE_BPUTS in print_graph_comment
  tracing/uprobe: Drop isdigit() check in create_trace_uprobe

95107b30

Merge tag 'for-linus-4.9-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 541efb76

Linus Torvalds authored Oct 06, 2016

Pull xen updates from David Vrabel:
 "xen features and fixes for 4.9:

   - switch to new CPU hotplug mechanism

   - support driver_override in pciback

   - require vector callback for HVM guests (the alternate mechanism via
     the platform device has been broken for ages)"

* tag 'for-linus-4.9-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/x86: Update topology map for PV VCPUs
  xen/x86: Initialize per_cpu(xen_vcpu, 0) a little earlier
  xen/pciback: support driver_override
  xen/pciback: avoid multiple entries in slot list
  xen/pciback: simplify pcistub device handling
  xen: Remove event channel notification through Xen PCI platform device
  xen/events: Convert to hotplug state machine
  xen/x86: Convert to hotplug state machine
  x86/xen: add missing \n at end of printk warning message
  xen/grant-table: Use kmalloc_array() in arch_gnttab_valloc()
  xen: Make VPMU init message look less scary
  xen: rename xen_pmu_init() in sys-hypervisor.c
  hotplug: Prevent alloc/free of irq descriptors during cpu up/down (again)
  xen/x86: Move irq allocation from Xen smp_op.cpu_up()

541efb76

Merge tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 6218590b

Linus Torvalds authored Oct 06, 2016

Pull KVM updates from Radim Krčmář:
 "All architectures:
   - move `make kvmconfig` stubs from x86
   - use 64 bits for debugfs stats

  ARM:
   - Important fixes for not using an in-kernel irqchip
   - handle SError exceptions and present them to guests if appropriate
   - proxying of GICV access at EL2 if guest mappings are unsafe
   - GICv3 on AArch32 on ARMv8
   - preparations for GICv3 save/restore, including ABI docs
   - cleanups and a bit of optimizations

  MIPS:
   - A couple of fixes in preparation for supporting MIPS EVA host
     kernels
   - MIPS SMP host & TLB invalidation fixes

  PPC:
   - Fix the bug which caused guests to falsely report lockups
   - other minor fixes
   - a small optimization

  s390:
   - Lazy enablement of runtime instrumentation
   - up to 255 CPUs for nested guests
   - rework of machine check deliver
   - cleanups and fixes

  x86:
   - IOMMU part of AMD's AVIC for vmexit-less interrupt delivery
   - Hyper-V TSC page
   - per-vcpu tsc_offset in debugfs
   - accelerated INS/OUTS in nVMX
   - cleanups and fixes"

* tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (140 commits)
  KVM: MIPS: Drop dubious EntryHi optimisation
  KVM: MIPS: Invalidate TLB by regenerating ASIDs
  KVM: MIPS: Split kernel/user ASID regeneration
  KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
  KVM: arm/arm64: vgic: Don't flush/sync without a working vgic
  KVM: arm64: Require in-kernel irqchip for PMU support
  KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register
  KVM: PPC: Book3S PR: Support 64kB page size on POWER8E and POWER8NVL
  KVM: PPC: Book3S: Remove duplicate setting of the B field in tlbie
  KVM: PPC: BookE: Fix a sanity check
  KVM: PPC: Book3S HV: Take out virtual core piggybacking code
  KVM: PPC: Book3S: Treat VTB as a per-subcore register, not per-thread
  ARM: gic-v3: Work around definition of gic_write_bpr1
  KVM: nVMX: Fix the NMI IDT-vectoring handling
  KVM: VMX: Enable MSR-BASED TPR shadow even if APICv is inactive
  KVM: nVMX: Fix reload apic access page warning
  kvmconfig: add virtio-gpu to config fragment
  config: move x86 kvm_guest.config to a common location
  arm64: KVM: Remove duplicating init code for setting VMID
  ARM: KVM: Support vgic-v3
  ...

6218590b

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace · 14986a34

Linus Torvalds authored Oct 06, 2016

Pull namespace updates from Eric Biederman:
"This set of changes is a number of smaller things that have been
overlooked in other development cycles focused on more fundamental
change. The devpts changes are small things that were a distraction
until we managed to kill off DEVPTS_MULTPLE_INSTANCES. There is an
trivial regression fix to autofs for the unprivileged mount changes
that went in last cycle. A pair of ioctls has been added by Andrey
Vagin making it is possible to discover the relationships between
namespaces when referring to them through file descriptors.

The big user visible change is starting to add simple resource limits
to catch programs that misbehave. With namespaces in general and user
namespaces in particular allowing users to use more kinds of
resources, it has become important to have something to limit errant
programs. Because the purpose of these limits is to catch errant
programs the code needs to be inexpensive to use as it always on, and
the default limits need to be high enough that well behaved programs
on well behaved systems don't encounter them.

To this end, after some review I have implemented per user per user
namespace limits, and use them to limit the number of namespaces. The
limits being per user mean that one user can not exhause the limits of
another user. The limits being per user namespace allow contexts where
the limit is 0 and security conscious folks can remove from their
threat anlysis the code used to manage namespaces (as they have
historically done as it root only). At the same time the limits being
per user namespace allow other parts of the system to use namespaces.

Namespaces are increasingly being used in application sand boxing
scenarios so an all or nothing disable for the entire system for the
security conscious folks makes increasing use of these sandboxes
impossible.

There is also added a limit on the maximum number of mounts present in
a single mount namespace. It is nontrivial to guess what a reasonable
system wide limit on the number of mount structure in the kernel would
be, especially as it various based on how a system is using
containers. A limit on the number of mounts in a mount namespace
however is much easier to understand and set. In most cases in
practice only about 1000 mounts are used. Given that some autofs
scenarious have the potential to be 30,000 to 50,000 mounts I have set
the default limit for the number of mounts at 100,000 which is well
above every known set of users but low enough that the mount hash
tables don't degrade unreaonsably.

These limits are a start. I expect this estabilishes a pattern that
other limits for resources that namespaces use will follow. There has
been interest in making inotify event limits per user per user
namespace as well as interest expressed in making details about what
is going on in the kernel more visible"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (28 commits)
autofs: Fix automounts by using current_real_cred()->uid
mnt: Add a per mount namespace limit on the number of mounts
netns: move {inc,dec}_net_namespaces into #ifdef
nsfs: Simplify __ns_get_path
tools/testing: add a test to check nsfs ioctl-s
nsfs: add ioctl to get a parent namespace
nsfs: add ioctl to get an owning user namespace for ns file descriptor
kernel: add a helper to get an owning user namespace for a namespace
devpts: Change the owner of /dev/pts/ptmx to the mounter of /dev/pts
devpts: Remove sync_filesystems
devpts: Make devpts_kill_sb safe if fsi is NULL
devpts: Simplify devpts_mount by using mount_nodev
devpts: Move the creation of /dev/pts/ptmx into fill_super
devpts: Move parse_mount_options into fill_super
userns: When the per user per user namespace limit is reached return ENOSPC
userns; Document per user per user namespace limits.
mntns: Add a limit on the number of mount namespaces.
netns: Add a limit on the number of net namespaces
cgroupns: Add a limit on the number of cgroup namespaces
ipcns: Add a limit on the number of ipc namespaces
...

14986a34

Merge tag 'xfs-for-linus-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs · 8d370595

Linus Torvalds authored Oct 06, 2016

Pull xfs and iomap updates from Dave Chinner:
 "The main things in this update are the iomap-based DAX infrastructure,
  an XFS delalloc rework, and a chunk of fixes to how log recovery
  schedules writeback to prevent spurious corruption detections when
  recovery of certain items was not required.

  The other main chunk of code is some preparation for the upcoming
  reflink functionality. Most of it is generic and cleanups that stand
  alone, but they were ready and reviewed so are in this pull request.

  Speaking of reflink, I'm currently planning to send you another pull
  request next week containing all the new reflink functionality. I'm
  working through a similar process to the last cycle, where I sent the
  reverse mapping code in a separate request because of how large it
  was. The reflink code merge is even bigger than reverse mapping, so
  I'll be doing the same thing again....

  Summary for this update:

   - change of XFS mailing list to linux-xfs@vger.kernel.org

   - iomap-based DAX infrastructure w/ XFS and ext2 support

   - small iomap fixes and additions

   - more efficient XFS delayed allocation infrastructure based on iomap

   - a rework of log recovery writeback scheduling to ensure we don't
     fail recovery when trying to replay items that are already on disk

   - some preparation patches for upcoming reflink support

   - configurable error handling fixes and documentation

   - aio access time update race fixes for XFS and
     generic_file_read_iter"

* tag 'xfs-for-linus-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (40 commits)
  fs: update atime before I/O in generic_file_read_iter
  xfs: update atime before I/O in xfs_file_dio_aio_read
  ext2: fix possible integer truncation in ext2_iomap_begin
  xfs: log recovery tracepoints to track current lsn and buffer submission
  xfs: update metadata LSN in buffers during log recovery
  xfs: don't warn on buffers not being recovered due to LSN
  xfs: pass current lsn to log recovery buffer validation
  xfs: rework log recovery to submit buffers on LSN boundaries
  xfs: quiesce the filesystem after recovery on readonly mount
  xfs: remote attribute blocks aren't really userdata
  ext2: use iomap to implement DAX
  ext2: stop passing buffer_head to ext2_get_blocks
  xfs: use iomap to implement DAX
  xfs: refactor xfs_setfilesize
  xfs: take the ilock shared if possible in xfs_file_iomap_begin
  xfs: fix locking for DAX writes
  dax: provide an iomap based fault handler
  dax: provide an iomap based dax read/write path
  dax: don't pass buffer_head to copy_user_dax
  dax: don't pass buffer_head to dax_insert_mapping
  ...

8d370595

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · d230ec72

Linus Torvalds authored Oct 06, 2016

Pull networking fixups from David Miller:
 "Here are the build and merge fixups for the networking stuff"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  phy: micrel.c: Enable ksz9031 energy-detect power-down mode
  netfilter: merge fixup for "nf_tables_netdev: remove redundant ip_hdr assignment"
  netfilter: nft_limit: fix divided by zero panic
  netfilter: fix namespace handling in nf_log_proc_dostring
  netfilter: xt_hashlimit: Fix link error in 32bit arch because of 64bit division
  netfilter: accommodate different kconfig in nf_set_hooks_head
  netfilter: Fix potential null pointer dereference

d230ec72