- 23 May, 2019 34 commits
-
-
Daniel Borkmann authored
Alexei Starovoitov says: ==================== Patch 1 - jmp sequence limit Patch 2 - improve existing tests Patch 3 - add pyperf-based realistic bpf program that takes advantage of higher limit and use it as a stress test v1->v2: fixed nit in patch 3. added Andrii's acks ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Alexei Starovoitov authored
Add a snippet of pyperf bpf program used to collect python stack traces as a scale test for the verifier. At 189 loop iterations llvm 9.0 starts ignoring '#pragma unroll' and generates partially unrolled loop instead. Hence use 50, 100, and 180 loop iterations to stress test. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Alexei Starovoitov authored
Adjust scale tests to check for new jmp sequence limit. BPF_JGT had to be changed to BPF_JEQ because the verifier was too smart. It tracked the known safe range of R0 values and pruned the search earlier before hitting exact 8192 limit. bpf_semi_rand_get() was too (un)?lucky. k = 0; was missing in bpf_fill_scale2. It was testing a bit shorter sequence of jumps than intended. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Alexei Starovoitov authored
The limit of 1024 subsequent jumps was causing otherwise valid programs to be rejected. Bump it to 8192 and make the error more verbose. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Andrii Nakryiko authored
It's easy to have a mismatch of "intended to be public" vs really exposed API functions. While Makefile does check for this mismatch, if it actually occurs it's not trivial to determine which functions are accidentally exposed. This patch dumps out a diff showing what's not supposed to be exposed facilitating easier fixing. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Sunil Muthuswamy authored
Currently, the hv_sock send() iterates once over the buffer, puts data into the VMBUS channel and returns. It doesn't maximize on the case when there is a simultaneous reader draining data from the channel. In such a case, the send() can maximize the bandwidth (and consequently minimize the cpu cycles) by iterating until the channel is found to be full. Perf data: Total Data Transfer: 10GB/iteration Single threaded reader/writer, Linux hvsocket writer with Windows hvsocket reader Packet size: 64KB CPU sys time was captured using the 'time' command for the writer to send 10GB of data. 'Send Buffer Loop' is with the patch applied. The values below are over 10 iterations. |--------------------------------------------------------| | | Current | Send Buffer Loop | |--------------------------------------------------------| | | Throughput | CPU sys | Throughput | CPU sys | | | (MB/s) | time (s) | (MB/s) | time (s) | |--------------------------------------------------------| | Min | 407 | 7.048 | 401 | 5.958 | |--------------------------------------------------------| | Max | 455 | 7.563 | 542 | 6.993 | |--------------------------------------------------------| | Avg | 440 | 7.411 | 451 | 6.639 | |--------------------------------------------------------| | Median | 446 | 7.417 | 447 | 6.761 | |--------------------------------------------------------| Observation: 1. The avg throughput doesn't really change much with this change for this scenario. This is most probably because the bottleneck on throughput is somewhere else. 2. The average system (or kernel) cpu time goes down by 10%+ with this change, for the same amount of data transfer. Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sunil Muthuswamy authored
Currently, the hv_sock buffer size is static and can't scale to the bandwidth requirements of the application. This change allows the applications to influence the socket buffer sizes using the SO_SNDBUF and the SO_RCVBUF socket options. Few interesting points to note: 1. Since the VMBUS does not allow a resize operation of the ring size, the socket buffer size option should be set prior to establishing the connection for it to take effect. 2. Setting the socket option comes with the cost of that much memory being reserved/allocated by the kernel, for the lifetime of the connection. Perf data: Total Data Transfer: 1GB Single threaded reader/writer Results below are summarized over 10 iterations. Linux hvsocket writer + Windows hvsocket reader: |---------------------------------------------------------------------------------------------| |Packet size -> | 128B | 1KB | 4KB | 64KB | |---------------------------------------------------------------------------------------------| |SO_SNDBUF size | | Throughput in MB/s (min/max/avg/median): | | v | | |---------------------------------------------------------------------------------------------| | Default | 109/118/114/116 | 636/774/701/700 | 435/507/480/476 | 410/491/462/470 | | 16KB | 110/116/112/111 | 575/705/662/671 | 749/900/854/869 | 592/824/692/676 | | 32KB | 108/120/115/115 | 703/823/767/772 | 718/878/850/866 | 1593/2124/2000/2085 | | 64KB | 108/119/114/114 | 592/732/683/688 | 805/934/903/911 | 1784/1943/1862/1843 | |---------------------------------------------------------------------------------------------| Windows hvsocket writer + Linux hvsocket reader: |---------------------------------------------------------------------------------------------| |Packet size -> | 128B | 1KB | 4KB | 64KB | |---------------------------------------------------------------------------------------------| |SO_RCVBUF size | | Throughput in MB/s (min/max/avg/median): | | v | | |---------------------------------------------------------------------------------------------| | Default | 69/82/75/73 | 313/343/333/336 | 418/477/446/445 | 659/701/676/678 | | 16KB | 69/83/76/77 | 350/401/375/382 | 506/548/517/516 | 602/624/615/615 | | 32KB | 62/83/73/73 | 471/529/496/494 | 830/1046/935/939 | 944/1180/1070/1100 | | 64KB | 64/70/68/69 | 467/533/501/497 | 1260/1590/1430/1431 | 1605/1819/1670/1660 | |---------------------------------------------------------------------------------------------| Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Removing two 4 bytes holes allows to use kmalloc-32 kmem cache instead of kmalloc-64 on 64bit kernels. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Add tracepoint to __neigh_create to enable debugging of new entries. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
The point of the pause-on-fail argument is to leave the setup as is after a test fails to allow a user to debug why it failed. Move the cleanup after posting the result to the user to make it so. Random names for the namespaces are not user friendly when trying to debug a failure. Make them simpler and more direct for the tests. Run cleanup at the beginning to ensure they are cleaned up if they already exist. Remove cleanup_done. There is no harm in doing cleanup twice; just ignore any errors related to not existing - which is already done. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Add VERBOSE argument to fib-onlink-tests.sh and make output quiet by default. Add getopt parsing of inputs and support for -v (verbose) and -p (pause on fail). Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
New userspace on an older kernel can send unknown and unsupported attributes resulting in an incompelete config which is almost always wrong for routing (few exceptions are passthrough settings like the protocol that installed the route). Set strict_start_type in the policies for IPv4 and IPv6 routes and rules to detect new, unsupported attributes and fail the route add. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
David Ahern says: ==================== net: Export functions for nexthop code This set exports ipv4 and ipv6 fib functions for use by the nexthop code. It also adds new ones to send route notifications if a nexthop configuration changes. v2 - repost of patches dropped at the end of the last dev window added patch 8 which exports nh_update_mtu since it is inline with the other patches ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Rename nh_update_mtu to fib_nhc_update_mtu and export for use by the nexthop code. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Add scope as input argument versus relying on fib_info reference in fib_nh, and export fib_info_update_nh_saddr. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
As nexthops are deleted, fib entries referencing it are marked dead. Export fib_flush so those entries can be removed in a timely manner. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Change fib_check_nh to take net, table and scope as input arguments over struct fib_config and export for use by nexthop code. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Add fib_info_notify_update to walk the fib and send RTM_NEWROUTE notifications with NLM_F_REPLACE set for entries linked to a fib_info that have nh_updated flag set. This helper will be used by the nexthop code to notify userspace of routes that are impacted when a nexthop config is updated via replace. The new function and its helper are similar to how fib_flush and fib_table_flush work for address delete and link down events. This notification is needed for legacy apps that do not understand the new nexthop object. Apps that are nexthop aware can use the RTA_NH_ID attribute in the route notification to just ignore it. In the future this should be wrapped in a sysctl to allow OS'es that are fully updated to avoid the notificaton storm. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Add fib6_rt_update to send RTM_NEWROUTE with NLM_F_REPLACE set. This helper will be used by the nexthop code to notify userspace of routes that are impacted when a nexthop config is updated via replace. This notification is needed for legacy apps that do not understand the new nexthop object. Apps that are nexthop aware can use the RTA_NH_ID attribute in the route notification to just ignore it. In the future this should be wrapped in a sysctl to allow OS'es that are fully updated to avoid the notificaton storm. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Add hook to ipv6 stub to bump the sernum up to the root node for a route. This is needed by the nexthop code when a nexthop config changes. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Add ip6_del_rt to the IPv6 stub. The hook is needed by the nexthop code to remove entries linked to a nexthop that is getting deleted. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Andrew Lunn says: ==================== net: phy: T1 support T1 PHYs make use of a single twisted pair, rather than the traditional 2 pair for 100BaseT or 4 pair for 1000BaseT. This patchset adds link modes for 100BaseT1 and 1000BaseT1, and them makes use of 100BaseT1 in the list of PHY features used by current T1 drivers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Lunn authored
Now that there is a link mode for 100BaseT1, use it in phy_basic_t1_features so T1 PHY drivers will indicate this mode via the Ethtool API. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Lunn authored
Add link modes for 100Mbps and 1Gbps over a single pair. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Trent Piepho authored
This was being done in config the first time the phy was configured. Should be in the probe method. Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Trent Piepho <tpiepho@impinj.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Trent Piepho authored
Insure property is in valid range and fail when reading DT if it is not. Also add error message for existing failure if required property is not present. Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Trent Piepho <tpiepho@impinj.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Trent Piepho authored
The driver would only set the IO impedance value when RGMII internal delays were enabled. There is no reason for this. Move the IO impedance block out of the RGMII delay block. Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Trent Piepho <tpiepho@impinj.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Trent Piepho authored
The variables used to store u32 DT properties were signed ints. This doesn't work properly if the value of the property were to overflow. Use unsigned variables so this doesn't happen. Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Trent Piepho <tpiepho@impinj.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Trent Piepho authored
The code was assuming the reset default of the delay control register was to have delay disabled. This is what the datasheet shows as the register's initial value. However, that's not actually true: the default is controlled by the PHY's pin strapping. If the interface mode is selected as RX or TX delay only, insure the other direction's delay is disabled. If the interface mode is just "rgmii", with neither TX or RX internal delay, one might expect that the driver should disable both delays. But this is not what the driver does. It leaves the setting at the PHY's strapping's default. And that default, for no pins with strapping resistors, is to have delay enabled and 2.00 ns. Rather than change this behavior, I've kept it the same and documented it. No delay will most likely not work and will break ethernet on any board using "rgmii" mode. If the board is strapped to have a delay and is configured to use "rgmii" mode a warning is generated that "rgmii-id" should have been used. Also validate the delay values and fail if they are not in range. Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Trent Piepho <tpiepho@impinj.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Trent Piepho authored
Generally, the output clock pin is only used for testing and only serves as a source of RF noise after this. It could be used to daisy-chain PHYs, but this is uncommon. Since the PHY can disable the output, make doing so an option. I do this by adding another enumeration to the allowed values of ti,clk-output-sel. The code was not using the value DP83867_CLK_O_SEL_REF_CLK as one might expect: to select the REF_CLK as the output. Rather it meant "keep clock output setting as is", which, depending on PHY strapping, might not be outputting REF_CLK. Change this so DP83867_CLK_O_SEL_REF_CLK means enable REF_CLK output. Omitting the property will leave the setting as is (which was the previous behavior in this case). Out of range values were silently converted into DP83867_CLK_O_SEL_REF_CLK. Change this so they generate an error. Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Trent Piepho <tpiepho@impinj.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Trent Piepho authored
The clock output is generally only used for testing and development and not used to daisy-chain PHYs. It's just a source of RF noise afterward. Add a mux value for "off". I've added it as another enumeration to the output property. In the actual PHY, the mux and the output enable are independently controllable. However, it doesn't seem useful to be able to describe the mux setting when the output is disabled. Document that PHY's default setting will be left as is if the property is omitted. Cc: Rob Herring <robh+dt@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Trent Piepho <tpiepho@impinj.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Trent Piepho authored
Add a note to make it more clear how the driver behaves when "rgmii" vs "rgmii-id", "rgmii-idrx", or "rgmii-idtx" interface modes are selected. Cc: Rob Herring <robh+dt@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Trent Piepho <tpiepho@impinj.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vishal Kulkarni authored
Hash (exact-match) filters used for offloading flows share the same active region resources on the chip with upper layer drivers, like iw_cxgb4, chcr, etc. Currently, only either Hash filters or ULDs can use the active region resources, but not both. Hence, use the new firmware configuration parameters (when available) to allow both the Hash filters and ULDs to share the active region simultaneously. Signed-off-by: Vishal Kulkarni <vishal@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Baruch Siach authored
Don't disable the ipg clock in the regulator error path. The clock is disable unconditionally two lines below the failed_regulator label. Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 22 May, 2019 6 commits
-
-
Felipe Gasper authored
This adds the ability for Netlink to report a socket's UID along with the other UNIX diagnostic information that is already available. This will allow diagnostic tools greater insight into which users control which socket. To test this, do the following as a non-root user: unshare -U -r bash nc -l -U user.socket.$$ & .. and verify from within that same session that Netlink UNIX socket diagnostics report the socket's UID as 0. Also verify that Netlink UNIX socket diagnostics report the socket's UID as the user's UID from an unprivileged process in a different session. Verify the same from a root process. Signed-off-by: Felipe Gasper <felipe@felipegasper.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds authored
Pull arm64 fixes from Will Deacon: - Fix SPE probe failure when backing auxbuf with high-order pages - Fix handling of DMA allocations from outside of the vmalloc area - Fix generation of build-id ELF section for vDSO object - Disable huge I/O mappings if kernel page table dumping is enabled - A few other minor fixes (comments, kconfig etc) * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: vdso: Explicitly add build-id option arm64/mm: Inhibit huge-vmap with ptdump arm64: Print physical address of page table base in show_pte() arm64: don't trash config with compat symbol if COMPAT is disabled arm64: assembler: Update comment above cond_yield_neon() macro drivers/perf: arm_spe: Don't error on high-order pages for aux buf arm64/iommu: handle non-remapped addresses in ->mmap and ->get_sgtable
-
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2Linus Torvalds authored
Pull gfs2 fix from Andreas Gruenbacher: "Fix a gfs2 sign extension bug introduced in v4.3" * tag 'gfs2-5.1.fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fix sign extension bug in gfs2_update_stats
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Clear up some recent tipc regressions because of registration ordering. Fix from Junwei Hu. 2) tipc's TLV_SET() can read past the end of the supplied buffer during the copy. From Chris Packham. 3) ptp example program doesn't match the kernel, from Richard Cochran. 4) Outgoing message type fix in qrtr, from Bjorn Andersson. 5) Flow control regression in stmmac, from Tan Tee Min. 6) Fix inband autonegotiation in phylink, from Russell King. 7) Fix sk_bound_dev_if handling in rawv6_bind(), from Mike Manning. 8) Fix usbnet crash after disconnect, from Kloetzke Jan. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits) usbnet: fix kernel crash after disconnect selftests: fib_rule_tests: use pre-defined DEV_ADDR net-next: net: Fix typos in ip-sysctl.txt ipv6: Consider sk_bound_dev_if when binding a raw socket to an address net: phylink: ensure inband AN works correctly usbnet: ipheth: fix racing condition net: stmmac: dma channel control register need to be init first net: stmmac: fix ethtool flow control not able to get/set net: qrtr: Fix message type of outgoing packets networking: : fix typos in code comments ptp: Fix example program to match kernel. fddi: fix typos in code comments selftests: fib_rule_tests: enable forwarding before ipv4 from/iif test selftests: fib_rule_tests: fix local IPv4 address typo tipc: Avoid copying bytes beyond the supplied data 2/2] net: xilinx_emaclite: use readx_poll_timeout() in mdio wait function 1/2] net: axienet: use readx_poll_timeout() in mdio wait function vlan: Mark expected switch fall-through macvlan: Mark expected switch fall-through net/mlx4_en: ethtool, Remove unsupported SFP EEPROM high pages query ...
-
Linus Torvalds authored
Merge tag 'for-5.2/dm-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fix from Mike Snitzer: "Fix a particularly glaring oversight in a DM core commit from 5.1 that doesn't properly trim special IOs (e.g. discards) relative to corresponding target's max_io_len_target_boundary()" * tag 'for-5.2/dm-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: make sure to obey max_io_len_target_boundary
-
Andreas Gruenbacher authored
Commit 4d207133 changed the types of the statistic values in struct gfs2_lkstats from s64 to u64. Because of that, what should be a signed value in gfs2_update_stats turned into an unsigned value. When shifted right, we end up with a large positive value instead of a small negative value, which results in an incorrect variance estimate. Fixes: 4d207133 ("gfs2: Make statistics unsigned, suitable for use with do_div()") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: stable@vger.kernel.org # v4.4+
-