1. 12 Jul, 2019 8 commits
    • nfp: flower: ensure ip protocol is specified for L4 matches · 103b7c25
      John Hurley authored
      Flower rules on the NFP firmware are able to match on an IP protocol
      field. When parsing rules in the driver, unknown IP protocols are only
      rejected when further matches are to be carried out on layer 4 fields, as
      the firmware will not be able to extract such fields from packets.
      
      L4 protocol dissectors such as FLOW_DISSECTOR_KEY_PORTS are only parsed if
      an IP protocol is specified. This leaves a loophole whereby a rule that
      attempts to match on transport layer information such as port numbers but
      does not explicitly give an IP protocol type can be incorrectly offloaded
      (in this case with wildcarded port number matches).
      
      Fix this by rejecting the offload of flows that attempt to match on L4
      information, not only when matching on an unknown IP protocol type, but
      also when the protocol is wildcarded.
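
The rejection rule described above can be sketched in user-space C. This is a minimal model, not the driver's code: `struct rule_match`, `ip_proto_known()` and `check_l4_offload()` are hypothetical names, and the list of protocols treated as known is an assumption made for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a parsed rule; not the driver's real structures. */
struct rule_match {
	bool    has_ip_proto;  /* false means the IP protocol is wildcarded */
	uint8_t ip_proto;      /* valid only when has_ip_proto is true */
	bool    has_l4_match;  /* rule matches on ports or other L4 fields */
};

/* Protocols assumed to be understood by the firmware (illustrative list). */
static bool ip_proto_known(uint8_t proto)
{
	return proto == 6 /* TCP */ || proto == 17 /* UDP */;
}

/* Returns 0 if the rule may be offloaded, -EOPNOTSUPP otherwise. */
static int check_l4_offload(const struct rule_match *m)
{
	/* Without L4 matches, the IP protocol does not affect the decision. */
	if (!m->has_l4_match)
		return 0;

	/* L4 matches need a specified protocol that is also a known one. */
	if (!m->has_ip_proto || !ip_proto_known(m->ip_proto))
		return -EOPNOTSUPP;

	return 0;
}
```

The second condition covers both cases named in the commit message: an unknown protocol and a wildcarded one.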
      
      Fixes: 2a047845 ("nfp: flower: check L4 matches on unknown IP protocols")
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: flower: fix ethernet check on match fields · fd262a6d
      John Hurley authored
      NFP firmware does not explicitly match on an ethernet type field. Rather,
      each rule has a bitmask of match fields that can be used to infer the
      ethernet type.
      
      Currently, if a flower rule contains an unknown ethernet type, a check is
      carried out for matches on other fields of the packet. If matches on
      layer 3 or 4 are found, then the offload is rejected as firmware will not
      be able to extract these fields from a packet with an ethernet type it
      does not currently understand.
      
      However, if a rule contains an unknown ethernet type without any L3 (or
      above) matches then this will effectively be offloaded as a rule with a
      wildcarded ethertype. This can lead to misclassifications on the firmware.
      
      Fix this issue by rejecting all flower rules that specify a match on an
      unknown ethernet type.
      
      Further ensure correct offloads by applying the 'L3 and above' check to any
      rule that does not specify an ethernet type, and by rejecting such rules if
      they contain further matches. This means that we can still offload rules with
      a wildcarded ethertype if they only match on L2 fields, but we reject rules
      that match on further fields, since we cannot be sure the firmware will be
      able to extract them.
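
The two checks described above can be sketched as a small decision function. This is an illustrative model only: `struct eth_rule`, `eth_type_known()` and `check_eth_offload()` are hypothetical names, and the set of ethertypes treated as known is an assumption.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the relevant parts of a flower rule. */
struct eth_rule {
	bool     has_eth_type;     /* false means the ethertype is wildcarded */
	uint16_t eth_type;         /* valid only when has_eth_type is true */
	bool     has_l3_or_above;  /* any match on L3 fields or higher */
};

/* Ethertypes assumed to be understood by the firmware (illustrative list). */
static bool eth_type_known(uint16_t type)
{
	return type == 0x0800 /* IPv4 */ || type == 0x86DD /* IPv6 */;
}

/* Returns 0 if the rule may be offloaded, -EOPNOTSUPP otherwise. */
static int check_eth_offload(const struct eth_rule *r)
{
	/* A specified ethertype must be a known one, whatever else is matched. */
	if (r->has_eth_type)
		return eth_type_known(r->eth_type) ? 0 : -EOPNOTSUPP;

	/* A wildcarded ethertype is acceptable only for L2-only rules. */
	return r->has_l3_or_above ? -EOPNOTSUPP : 0;
}
```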
      
      Fixes: af9d842c ("nfp: extend flower add flow offload")
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Provide cb_list pointer when setting up tc block on rep · 3929502b
      Vlad Buslov authored
      A recent refactoring of the tc block offload infrastructure introduced the
      flow_block_cb_setup_simple() helper, intended as a unified way for all
      drivers to register offload callbacks. However, the commit that extended
      all users (drivers) with a block cb list and passed it to the flow_block
      infrastructure missed mlx5 en_rep. This leads to the following NULL-pointer
      dereference when creating a Qdisc:
      
      [  278.385175] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [  278.393233] #PF: supervisor read access in kernel mode
      [  278.399446] #PF: error_code(0x0000) - not-present page
      [  278.405847] PGD 8000000850e73067 P4D 8000000850e73067 PUD 8620cd067 PMD 0
      [  278.414141] Oops: 0000 [#1] SMP PTI
      [  278.419019] CPU: 7 PID: 3369 Comm: tc Not tainted 5.2.0-rc6+ #492
      [  278.426580] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [  278.435853] RIP: 0010:flow_block_cb_setup_simple+0xc4/0x190
      [  278.442953] Code: 10 48 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 49 89 00 48 05 00 01 00 00 49 89 40 08 31 c0 c3 b8 a1 ff ff ff c3 f3 c3 <48> 8b 06 48 39 c6 75 0a eb 1a 48 8b 00 48 39 c6 74 12 48 3b 50 28
      [  278.464829] RSP: 0018:ffffaf07c3f97990 EFLAGS: 00010246
      [  278.471648] RAX: 0000000000000000 RBX: ffff9b43ed4c7680 RCX: ffff9b43d5f80840
      [  278.480408] RDX: ffffffffc0491650 RSI: 0000000000000000 RDI: ffffaf07c3f97998
      [  278.489110] RBP: ffff9b43ddff9000 R08: ffff9b43d5f80840 R09: 0000000000000001
      [  278.497838] R10: 0000000000000009 R11: 00000000000003ad R12: ffffaf07c3f97c08
      [  278.506595] R13: ffff9b43d5f80000 R14: ffff9b43ed4c7680 R15: ffff9b43dfa20b40
      [  278.515374] FS:  00007f796be1b400(0000) GS:ffff9b43ef840000(0000) knlGS:0000000000000000
      [  278.525099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  278.532453] CR2: 0000000000000000 CR3: 0000000840398002 CR4: 00000000001606e0
      [  278.541197] Call Trace:
      [  278.545252]  tcf_block_offload_cmd.isra.52+0x7e/0xb0
      [  278.551871]  tcf_block_get_ext+0x365/0x3e0
      [  278.557569]  qdisc_create+0x15c/0x4e0
      [  278.562859]  ? kmem_cache_alloc_trace+0x1a2/0x1c0
      [  278.569235]  tc_modify_qdisc+0x1c8/0x780
      [  278.574761]  rtnetlink_rcv_msg+0x291/0x340
      [  278.580518]  ? _cond_resched+0x15/0x40
      [  278.585856]  ? rtnl_calcit.isra.29+0x120/0x120
      [  278.591868]  netlink_rcv_skb+0x4a/0x110
      [  278.597198]  netlink_unicast+0x1a0/0x250
      [  278.602601]  netlink_sendmsg+0x2c1/0x3c0
      [  278.608022]  sock_sendmsg+0x5b/0x60
      [  278.612969]  ___sys_sendmsg+0x289/0x310
      [  278.618231]  ? do_wp_page+0x99/0x730
      [  278.623216]  ? page_add_new_anon_rmap+0xbe/0x140
      [  278.629298]  ? __handle_mm_fault+0xc84/0x1360
      [  278.635113]  ? __sys_sendmsg+0x5e/0xa0
      [  278.640285]  __sys_sendmsg+0x5e/0xa0
      [  278.645239]  do_syscall_64+0x5b/0x1b0
      [  278.650274]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  278.656697] RIP: 0033:0x7f796abdeb87
      [  278.661628] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 80 00 00 00 00 8b 05 6a 2b 2c 00 48 63 d2 48 63 ff 85 c0 75 18 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 f3 c3 0f 1f 80 00 00 00 00 53 48 89 f3 48
      [  278.683248] RSP: 002b:00007ffde213ba48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [  278.692245] RAX: ffffffffffffffda RBX: 000000005d261e6f RCX: 00007f796abdeb87
      [  278.700862] RDX: 0000000000000000 RSI: 00007ffde213bab0 RDI: 0000000000000003
      [  278.709527] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000006
      [  278.718167] R10: 000000000000000c R11: 0000000000000246 R12: 0000000000000001
      [  278.726743] R13: 000000000067b580 R14: 0000000000000000 R15: 0000000000000000
      [  278.735302] Modules linked in: dummy vxlan ip6_udp_tunnel udp_tunnel sch_ingress nfsv3 nfs_acl nfs lockd grace fscache bridge stp llc sunrpc mlx5_ib ib_uverbs intel_rapl ib_core sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mlx5_core irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel igb ghash_clmulni_intel ses mei_me enclosure mlxfw ipmi_ssif intel_cstate iTCO_wdt ptp mei pps_core iTCO_vendor_support pcspkr joydev intel_uncore i2c_i801 ipmi_si lpc_ich intel_rapl_perf ioatdma wmi dca pcc_cpufreq ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad ast i2c_algo_bit drm_kms_helper ttm drm mpt3sas raid_class scsi_transport_sas
      [  278.802263] CR2: 0000000000000000
      [  278.807170] ---[ end trace b1f0a442a279e66f ]---
      
      Extend en_rep with a new static mlx5e_rep_block_cb_list list and pass it to
      flow_block_cb_setup_simple() instead of a hardcoded NULL pointer.
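
Why a NULL list pointer faults can be shown with a small user-space model of a circular list head. This is only a sketch: `struct list_node`, `rep_block_cb_list`, `cb_list_contains()` and `cb_list_add()` are hypothetical stand-ins, not the kernel's `list_head` API or the real setup helper.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular list node, modelled on the idea of a kernel list head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Statically initialised empty list: the head points at itself. */
static struct list_node rep_block_cb_list = {
	&rep_block_cb_list, &rep_block_cb_list
};

/*
 * Walk the driver's callback list. The very first step reads head->next,
 * so passing a NULL head faults immediately, as in the trace above.
 */
static bool cb_list_contains(const struct list_node *head,
			     const struct list_node *cb)
{
	const struct list_node *pos;

	for (pos = head->next; pos != head; pos = pos->next)
		if (pos == cb)
			return true;
	return false;
}

/* Insert cb right after the head. */
static void cb_list_add(struct list_node *head, struct list_node *cb)
{
	cb->next = head->next;
	cb->prev = head;
	head->next->prev = cb;
	head->next = cb;
}
```

Passing `&rep_block_cb_list` rather than NULL gives the walk a valid, initially empty list to iterate.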
      
      Fixes: 955bcb6e ("drivers: net: use flow block API")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: make exported variables non-static · 54638c6e
      Denis Efremov authored
      The variables phy_basic_ports_array, phy_fibre_port_array and
      phy_all_ports_features_array are declared static and marked
      EXPORT_SYMBOL_GPL(), which is at best an odd combination.
      Because the variables are intended to be part of the API, this commit
      removes the static attributes and adds the declarations to the header.
      
      Fixes: 3c1bcc86 ("net: ethernet: Convert phydev advertize and supported from u32 to link mode")
      Signed-off-by: Denis Efremov <efremov@linux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: Fix NULL-pointer dereference in tc_indr_block_ing_cmd() · c1a970d0
      Vlad Buslov authored
      After the recent refactoring of the block offloads infrastructure, the
      indr_dev->block pointer is dereferenced before it is verified to be non-NULL.
      Example stack trace where this behavior leads to a NULL-pointer dereference
      when creating a vxlan device on a system with an mlx5 NIC with offloads enabled:
      
      [ 1157.852938] ==================================================================
      [ 1157.866877] BUG: KASAN: null-ptr-deref in tc_indr_block_ing_cmd.isra.41+0x9c/0x160
      [ 1157.880877] Read of size 4 at addr 0000000000000090 by task ip/3829
      [ 1157.901637] CPU: 22 PID: 3829 Comm: ip Not tainted 5.2.0-rc6+ #488
      [ 1157.914438] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [ 1157.929031] Call Trace:
      [ 1157.938318]  dump_stack+0x9a/0xeb
      [ 1157.948362]  ? tc_indr_block_ing_cmd.isra.41+0x9c/0x160
      [ 1157.960262]  ? tc_indr_block_ing_cmd.isra.41+0x9c/0x160
      [ 1157.972082]  __kasan_report+0x176/0x192
      [ 1157.982513]  ? tc_indr_block_ing_cmd.isra.41+0x9c/0x160
      [ 1157.994348]  kasan_report+0xe/0x20
      [ 1158.004324]  tc_indr_block_ing_cmd.isra.41+0x9c/0x160
      [ 1158.015950]  ? tcf_block_setup+0x430/0x430
      [ 1158.026558]  ? kasan_unpoison_shadow+0x30/0x40
      [ 1158.037464]  __tc_indr_block_cb_register+0x5f5/0xf20
      [ 1158.049288]  ? mlx5e_rep_indr_tc_block_unbind+0xa0/0xa0 [mlx5_core]
      [ 1158.062344]  ? tc_indr_block_dev_put.part.47+0x5c0/0x5c0
      [ 1158.074498]  ? rdma_roce_rescan_device+0x20/0x20 [ib_core]
      [ 1158.086580]  ? br_device_event+0x98/0x480 [bridge]
      [ 1158.097870]  ? strcmp+0x30/0x50
      [ 1158.107578]  mlx5e_nic_rep_netdevice_event+0xdd/0x180 [mlx5_core]
      [ 1158.120212]  notifier_call_chain+0x6d/0xa0
      [ 1158.130753]  register_netdevice+0x6fc/0x7e0
      [ 1158.141322]  ? netdev_change_features+0xa0/0xa0
      [ 1158.152218]  ? vxlan_config_apply+0x210/0x310 [vxlan]
      [ 1158.163593]  __vxlan_dev_create+0x2ad/0x520 [vxlan]
      [ 1158.174770]  ? vxlan_changelink+0x490/0x490 [vxlan]
      [ 1158.185870]  ? rcu_read_unlock+0x60/0x60 [vxlan]
      [ 1158.196798]  vxlan_newlink+0x99/0xf0 [vxlan]
      [ 1158.207303]  ? __vxlan_dev_create+0x520/0x520 [vxlan]
      [ 1158.218601]  ? rtnl_create_link+0x3d0/0x450
      [ 1158.228900]  __rtnl_newlink+0x8a7/0xb00
      [ 1158.238701]  ? stack_access_ok+0x35/0x80
      [ 1158.248450]  ? rtnl_link_unregister+0x1a0/0x1a0
      [ 1158.258735]  ? find_held_lock+0x6d/0xd0
      [ 1158.268379]  ? is_bpf_text_address+0x67/0xf0
      [ 1158.278330]  ? lock_acquire+0xc1/0x1f0
      [ 1158.287686]  ? is_bpf_text_address+0x5/0xf0
      [ 1158.297449]  ? is_bpf_text_address+0x86/0xf0
      [ 1158.307310]  ? kernel_text_address+0xec/0x100
      [ 1158.317155]  ? arch_stack_walk+0x92/0xe0
      [ 1158.326497]  ? __kernel_text_address+0xe/0x30
      [ 1158.336213]  ? unwind_get_return_address+0x2f/0x50
      [ 1158.346267]  ? create_prof_cpu_mask+0x20/0x20
      [ 1158.355936]  ? arch_stack_walk+0x92/0xe0
      [ 1158.365117]  ? stack_trace_save+0x8a/0xb0
      [ 1158.374272]  ? stack_trace_consume_entry+0x80/0x80
      [ 1158.384226]  ? match_held_lock+0x33/0x210
      [ 1158.393216]  ? kasan_unpoison_shadow+0x30/0x40
      [ 1158.402593]  rtnl_newlink+0x53/0x80
      [ 1158.410925]  rtnetlink_rcv_msg+0x3a5/0x600
      [ 1158.419777]  ? validate_linkmsg+0x400/0x400
      [ 1158.428620]  ? find_held_lock+0x6d/0xd0
      [ 1158.437117]  ? match_held_lock+0x1b/0x210
      [ 1158.445760]  ? validate_linkmsg+0x400/0x400
      [ 1158.454642]  netlink_rcv_skb+0xc7/0x1f0
      [ 1158.463150]  ? netlink_ack+0x470/0x470
      [ 1158.471538]  ? netlink_deliver_tap+0x1f3/0x5a0
      [ 1158.480607]  netlink_unicast+0x2ae/0x350
      [ 1158.489099]  ? netlink_attachskb+0x340/0x340
      [ 1158.497935]  ? _copy_from_iter_full+0xde/0x3b0
      [ 1158.506945]  ? __virt_addr_valid+0xb6/0xf0
      [ 1158.515578]  ? __check_object_size+0x159/0x240
      [ 1158.524515]  netlink_sendmsg+0x4d3/0x630
      [ 1158.532879]  ? netlink_unicast+0x350/0x350
      [ 1158.541400]  ? netlink_unicast+0x350/0x350
      [ 1158.549805]  sock_sendmsg+0x94/0xa0
      [ 1158.557561]  ___sys_sendmsg+0x49d/0x570
      [ 1158.565625]  ? copy_msghdr_from_user+0x210/0x210
      [ 1158.574457]  ? __fput+0x1e2/0x330
      [ 1158.581948]  ? __kasan_slab_free+0x130/0x180
      [ 1158.590407]  ? kmem_cache_free+0xb6/0x2d0
      [ 1158.598574]  ? mark_lock+0xc7/0x790
      [ 1158.606177]  ? task_work_run+0xcf/0x100
      [ 1158.614165]  ? exit_to_usermode_loop+0x102/0x110
      [ 1158.622954]  ? __lock_acquire+0x963/0x1ee0
      [ 1158.631199]  ? lockdep_hardirqs_on+0x260/0x260
      [ 1158.639777]  ? match_held_lock+0x1b/0x210
      [ 1158.647918]  ? lockdep_hardirqs_on+0x260/0x260
      [ 1158.656501]  ? match_held_lock+0x1b/0x210
      [ 1158.664643]  ? __fget_light+0xa6/0xe0
      [ 1158.672423]  ? __sys_sendmsg+0xd2/0x150
      [ 1158.680334]  __sys_sendmsg+0xd2/0x150
      [ 1158.688063]  ? __ia32_sys_shutdown+0x30/0x30
      [ 1158.696435]  ? lock_downgrade+0x2e0/0x2e0
      [ 1158.704541]  ? mark_held_locks+0x1a/0x90
      [ 1158.712611]  ? mark_held_locks+0x1a/0x90
      [ 1158.720619]  ? do_syscall_64+0x1e/0x2c0
      [ 1158.728530]  do_syscall_64+0x78/0x2c0
      [ 1158.736254]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 1158.745414] RIP: 0033:0x7f62d505cb87
      [ 1158.753070] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 80 00 00 00 00 8b 05 6a 2b 2c 00 48 63 d2 48 63 ff 85 c0 75 18 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 f3 c3 0f 1f 80 00 00[87/1817]
       48 89 f3 48
      [ 1158.780924] RSP: 002b:00007fffd9832268 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [ 1158.793204] RAX: ffffffffffffffda RBX: 000000005d26048f RCX: 00007f62d505cb87
      [ 1158.805111] RDX: 0000000000000000 RSI: 00007fffd98322d0 RDI: 0000000000000003
      [ 1158.817055] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000006
      [ 1158.828987] R10: 00007f62d50ce260 R11: 0000000000000246 R12: 0000000000000001
      [ 1158.840909] R13: 000000000067e540 R14: 0000000000000000 R15: 000000000067ed20
      [ 1158.852873] ==================================================================
      
      Introduce a new function, tcf_block_non_null_shared(), that verifies the block
      pointer before dereferencing it to obtain the index. Use the function in
      tc_indr_block_ing_cmd() to prevent the NULL-pointer dereference.
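
A NULL-safe check of this kind can be sketched as follows. The struct and function names are hypothetical user-space stand-ins (`struct block_model` is not the kernel's `struct tcf_block`), and the assumption that a non-zero index marks a shared block is made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for a block object; only the index matters here. */
struct block_model {
	uint32_t index;  /* assumed non-zero for a shared block */
};

/* Unsafe on NULL: dereferences the pointer unconditionally. */
static bool block_shared(const struct block_model *block)
{
	return block->index != 0;
}

/* NULL-safe variant: the pointer test short-circuits the dereference. */
static bool block_non_null_shared(const struct block_model *block)
{
	return block && block->index != 0;
}
```

Because `&&` evaluates left to right and stops at the first false operand, the index is never read through a NULL pointer.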
      
      Fixes: 955bcb6e ("drivers: net: use flow block API")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • davinci_cpdma: don't cast dma_addr_t to pointer · c653f61a
      Arnd Bergmann authored
      dma_addr_t may be 64 bits wide on 32-bit architectures, so it is not
      valid to cast between it and a pointer:
      
      drivers/net/ethernet/ti/davinci_cpdma.c: In function 'cpdma_chan_submit_si':
      drivers/net/ethernet/ti/davinci_cpdma.c:1047:12: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
      drivers/net/ethernet/ti/davinci_cpdma.c: In function 'cpdma_chan_idle_submit_mapped':
      drivers/net/ethernet/ti/davinci_cpdma.c:1114:12: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
      drivers/net/ethernet/ti/davinci_cpdma.c: In function 'cpdma_chan_submit_mapped':
      drivers/net/ethernet/ti/davinci_cpdma.c:1164:12: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
      
      Solve this by using two separate members in 'struct submit_info'.
      Since this avoids the use of the 'flag' member, the structure does
      not even grow in typical configurations.
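
The idea of keeping the CPU pointer and the DMA address in separate members can be sketched like this. It is a reduced, hypothetical model: `dma_addr_model_t`, `struct submit_info_model` and `submit_is_mapped()` are illustrative names, and using a non-zero DMA address to mean "already mapped" is an assumption about how the flag can be avoided.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: a DMA address type that is 64 bits wide even on 32-bit CPUs. */
typedef uint64_t dma_addr_model_t;

/* Hypothetical, reduced version of a submit descriptor. */
struct submit_info_model {
	void             *data_virt;  /* CPU pointer, for buffers not yet mapped */
	dma_addr_model_t  data_dma;   /* bus address, for pre-mapped buffers */
	int               len;
};

/*
 * With separate members no cast between pointer and integer is needed,
 * and a set data_dma can stand in for a dedicated "mapped" flag.
 */
static int submit_is_mapped(const struct submit_info_model *si)
{
	return si->data_dma != 0;
}
```

A 64-bit address such as `0x100000000` survives intact in `data_dma`, whereas squeezing it through a 32-bit pointer would truncate it.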
      
      Fixes: 6670acac ("net: ethernet: ti: davinci_cpdma: add dma mapped submit")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: do not update max_headroom if new headroom is equal to old headroom · 6b660c41
      Taehee Yoo authored
      When a vport is deleted, the maximum headroom size may change.
      If the vport with the largest headroom is deleted, a new max_headroom
      is computed and set. However, if the new headroom size is equal to the
      old headroom size, running the update routine is unnecessary.
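
The logic described above can be sketched in a small model. All names here (`struct dp_model`, `update_headroom()`, `vport_deleted()`) are hypothetical, and the `updates` counter exists only to make the skipped update observable.

```c
#include <stddef.h>

/* Hypothetical datapath model. */
struct dp_model {
	unsigned int max_headroom;
	unsigned int updates;  /* counts how often the update routine ran */
};

/* Largest headroom among the remaining ports (0 if there are none). */
static unsigned int max_of(const unsigned int *headrooms, size_t n)
{
	unsigned int max = 0;

	for (size_t i = 0; i < n; i++)
		if (headrooms[i] > max)
			max = headrooms[i];
	return max;
}

/* Stand-in for the (potentially expensive) per-port update routine. */
static void update_headroom(struct dp_model *dp, unsigned int new_headroom)
{
	dp->max_headroom = new_headroom;
	dp->updates++;
}

/* Called after a vport is removed; remaining[] holds the surviving ports. */
static void vport_deleted(struct dp_model *dp, unsigned int deleted_headroom,
			  const unsigned int *remaining, size_t n)
{
	unsigned int new_headroom;

	/* Only the removal of a largest-headroom port can lower the maximum. */
	if (deleted_headroom != dp->max_headroom)
		return;

	new_headroom = max_of(remaining, n);

	/* Skip the update when another port still has the same headroom. */
	if (new_headroom < dp->max_headroom)
		update_headroom(dp, new_headroom);
}
```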
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Tested-by: Greg Rose <gvrose8192@gmail.com>
      Reviewed-by: Greg Rose <gvrose8192@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Convert single case statement switch statements into if statements · 9db7e618
      Nathan Chancellor authored
      During the review of commit 1ff2f0fa ("net/mlx5e: Return in default
      case statement in tx_post_resync_params"), Leon and Nick pointed out
      that the switch statements can be converted to single if statements
      that return early so that the code is easier to follow.
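
The refactoring pattern can be illustrated with a generic before-and-after pair. The enum and function names are hypothetical and unrelated to the driver's actual code; the point is only that the two forms behave identically.

```c
#include <errno.h>

/* Hypothetical state values, for illustration only. */
enum resync_state { STATE_IDLE, STATE_PENDING };

/* Before: a switch with one real case and a default that bails out. */
static int handle_with_switch(enum resync_state s)
{
	switch (s) {
	case STATE_PENDING:
		break;
	default:
		return -EINVAL;
	}
	return 0;
}

/* After: one if statement that returns early for everything else. */
static int handle_with_if(enum resync_state s)
{
	if (s != STATE_PENDING)
		return -EINVAL;
	return 0;
}
```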
      Suggested-by: Leon Romanovsky <leon@kernel.org>
      Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 11 Jul, 2019 32 commits
    • Merge branch 'net/rds-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux · 3194d6ad
      David S. Miller authored
      Santosh Shilimkar says:
      
      ====================
      rds fixes
      
      A few rds fixes which make the rds rdma transport work reliably on mainline.
      
      The first two fixes are applicable to v4.11+ stable versions, and the last
      three patches apply only to v5.1 stable and current mainline.
      
      Patchset is re-based against 'net' and also available on below tree
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5-fixes-2019-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 114a5c32
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5 fixes 2019-07-11
      
      This series introduces some fixes to mlx5 driver.
      
      Please pull and let me know if there is any problem.
      
      For -stable v4.15
      ('net/mlx5e: IPoIB, Add error path in mlx5_rdma_setup_rn')
      
      For -stable v5.1
      ('net/mlx5e: Fix port tunnel GRE entropy control')
      ('net/mlx5e: Rx, Fix checksum calculation for new hardware')
      ('net/mlx5e: Fix return value from timeout recover function')
      ('net/mlx5e: Fix error flow in tx reporter diagnose')
      
      For -stable v5.2
      ('net/mlx5: E-Switch, Fix default encap mode')
      
      Conflict note: This pull request will produce a small conflict when
      merged with net-next.
      In drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
      Take the hunk from net and replace:
      esw_offloads_steering_init(esw, vf_nvports, total_nvports);
      with:
      esw_offloads_steering_init(esw);
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mlx5-build-fixes' · 08d14c49
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5 build fixes
      
      I know net-next is closed, but these patches fix some build errors and
      compiler warnings people have been complaining about.
      
      I hope it is not too late, but in case it is a lot of trouble for you,
      I guess they can wait.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: E-Switch, Reduce ingress acl modify metadata stack usage · 9446d17e
      Saeed Mahameed authored
      Fix the following compiler warning:
      In function ‘esw_vport_add_ingress_acl_modify_metadata’:
      the frame size of 1084 bytes is larger than 1024 bytes [-Wframe-larger-than=]
      
      Since the structure is never written to, we can statically allocate
      it to avoid the stack usage.
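
The technique can be sketched with a generic large read-only object. `struct big_spec`, `empty_spec` and `get_spec()` are hypothetical names, and the 1024-byte size is chosen only to resemble the frame size in the warning.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical large descriptor that is only ever read. */
struct big_spec {
	uint8_t bytes[1024];
};

/*
 * static const: one zero-filled copy in read-only data, instead of a
 * fresh 1 KiB object in the stack frame of every call.
 */
static const struct big_spec empty_spec = { { 0 } };

/* Callers receive a pointer to the shared object; nothing is copied. */
static const struct big_spec *get_spec(void)
{
	return &empty_spec;
}
```

This only works because the object is never written to; a structure that is modified per call would still need its own storage.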
      
      Fixes: 7445cfb1 ("net/mlx5: E-Switch, Tag packet with vport number in VF vports and uplink ingress ACLs")
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Fix unused variable warning when CONFIG_MLX5_ESWITCH is off · 2f1f5a77
      Saeed Mahameed authored
      In mlx5e_setup_tc, the "priv" variable is unused if
      CONFIG_MLX5_ESWITCH is off. One way to fix this is to actually use it.
      
      mlx5e_setup_tc_mqprio also needs the "priv" variable and it extracts it
      on its own. We can simply pass priv to mlx5e_setup_tc_mqprio instead of
      netdev and avoid extracting the priv var, which will also resolve the
      compiler warning.
      
      Fixes: 4e95bc26 ("net: flow_offload: add flow_block_cb_setup_simple()")
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      CC: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Fix compilation error in TLS code · c93dfec1
      Tariq Toukan authored
      With the patch cited below applied, the Kconfig flag combination of:
      CONFIG_MLX5_FPGA is not set
      CONFIG_MLX5_TLS=y
      CONFIG_MLX5_EN_TLS=y
      
      leads to the compilation error:
      
      ./include/linux/mlx5/device.h:61:39: error: invalid application of
      sizeof to incomplete type struct mlx5_ifc_tls_flow_bits.
      
      Fix it.
      
      Fixes: 90687e1a9a50 ("net/mlx5: Kconfig, Better organize compilation flags")
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      CC: Mao Wenan <maowenan@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: fix static key imbalance in fl_create() · d44e3fa5
      Eric Dumazet authored
      fl_create() should call static_branch_deferred_inc() only in
      case of success.
      
      Also we should not call fl_free() in error path, as this could
      cause a static key imbalance.
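
The imbalance can be modelled with a plain counter standing in for the static key. Everything here is a hypothetical user-space sketch: `exclusive_leases`, `struct label_model`, `label_create()` and `label_free()` are illustrative names, and the `fail` parameter simply simulates an error during creation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

static int exclusive_leases;  /* stand-in for the static key's count */

struct label_model {
	bool exclusive;
};

/* Normal release path: undoes the increment made on successful creation. */
static void label_free(struct label_model *fl)
{
	if (fl->exclusive)
		exclusive_leases--;
	free(fl);
}

static struct label_model *label_create(bool exclusive, bool fail, int *err)
{
	struct label_model *fl = calloc(1, sizeof(*fl));

	if (!fl) {
		*err = -ENOMEM;
		return NULL;
	}
	fl->exclusive = exclusive;

	if (fail) {
		/*
		 * Error path: plain free(), not label_free(). Nothing was
		 * incremented yet, so a decrement would go negative.
		 */
		free(fl);
		*err = -EINVAL;
		return NULL;
	}

	/* Increment only once creation has succeeded. */
	if (exclusive)
		exclusive_leases++;
	*err = 0;
	return fl;
}
```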
      
      jump label: negative count!
      WARNING: CPU: 0 PID: 15907 at kernel/jump_label.c:221 static_key_slow_try_dec kernel/jump_label.c:221 [inline]
      WARNING: CPU: 0 PID: 15907 at kernel/jump_label.c:221 static_key_slow_try_dec+0x1ab/0x1d0 kernel/jump_label.c:206
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 0 PID: 15907 Comm: syz-executor.2 Not tainted 5.2.0-rc6+ #62
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       panic+0x2cb/0x744 kernel/panic.c:219
       __warn.cold+0x20/0x4d kernel/panic.c:576
       report_bug+0x263/0x2b0 lib/bug.c:186
       fixup_bug arch/x86/kernel/traps.c:179 [inline]
       fixup_bug arch/x86/kernel/traps.c:174 [inline]
       do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
       do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
       invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:986
      RIP: 0010:static_key_slow_try_dec kernel/jump_label.c:221 [inline]
      RIP: 0010:static_key_slow_try_dec+0x1ab/0x1d0 kernel/jump_label.c:206
      Code: c0 e8 e9 3e e5 ff 83 fb 01 0f 85 32 ff ff ff e8 5b 3d e5 ff 45 31 ff eb a0 e8 51 3d e5 ff 48 c7 c7 40 99 92 87 e8 13 75 b7 ff <0f> 0b eb 8b 4c 89 e7 e8 a9 c0 1e 00 e9 de fe ff ff e8 bf 6d b7 ff
      RSP: 0018:ffff88805f9c7450 EFLAGS: 00010286
      RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 0000000000000000
      RDX: 000000000000e3e1 RSI: ffffffff815adb06 RDI: ffffed100bf38e7c
      RBP: ffff88805f9c74e0 R08: ffff88806acf0700 R09: ffffed1015d060a9
      R10: ffffed1015d060a8 R11: ffff8880ae830547 R12: ffffffff89832ce0
      R13: ffff88805f9c74b8 R14: 1ffff1100bf38e8b R15: 00000000ffffff01
       __static_key_slow_dec_deferred+0x65/0x110 kernel/jump_label.c:272
       fl_free+0xa9/0xe0 net/ipv6/ip6_flowlabel.c:121
       fl_create+0x6af/0x9f0 net/ipv6/ip6_flowlabel.c:457
       ipv6_flowlabel_opt+0x80e/0x2730 net/ipv6/ip6_flowlabel.c:624
       do_ipv6_setsockopt.isra.0+0x2119/0x4100 net/ipv6/ipv6_sockglue.c:825
       ipv6_setsockopt+0xf6/0x170 net/ipv6/ipv6_sockglue.c:944
       tcp_setsockopt net/ipv4/tcp.c:3131 [inline]
       tcp_setsockopt+0x8f/0xe0 net/ipv4/tcp.c:3125
       sock_common_setsockopt+0x94/0xd0 net/core/sock.c:3130
       __sys_setsockopt+0x253/0x4b0 net/socket.c:2080
       __do_sys_setsockopt net/socket.c:2096 [inline]
       __se_sys_setsockopt net/socket.c:2093 [inline]
       __x64_sys_setsockopt+0xbe/0x150 net/socket.c:2093
       do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4597c9
      Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f2670556c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
      RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00000000004597c9
      RDX: 0000000000000020 RSI: 0000000000000029 RDI: 0000000000000003
      RBP: 000000000075bfc8 R08: 000000000000fdf7 R09: 0000000000000000
      R10: 0000000020000000 R11: 0000000000000246 R12: 00007f26705576d4
      R13: 00000000004cec00 R14: 00000000004dd520 R15: 00000000ffffffff
      Kernel Offset: disabled
      Rebooting in 86400 seconds..
      
      Fixes: 59c820b2 ("ipv6: elide flowlabel check if no exclusive leases exist")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: fix potential crash in ip6_datagram_dst_update() · 8975a3ab
      Eric Dumazet authored
      Willem forgot to change one of the calls to fl6_sock_lookup(),
      which can now return an error or NULL.
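
Handling a lookup that can return a valid pointer, NULL, or an encoded error can be sketched in user space as below. The `model_*` helpers imitate the kernel's error-pointer convention but are hypothetical re-implementations, and `struct flowlabel_model` and `use_lookup_result()` are illustrative names; which errno a real caller returns is not shown here.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
static void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer lies in the top MODEL_MAX_ERRNO addresses. */
static bool model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Recover the errno from an error pointer. */
static long model_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

struct flowlabel_model {
	int opt_len;
};

/* Returns 0 and fills *opt_len on success, or a negative errno. */
static int use_lookup_result(const struct flowlabel_model *fl, int *opt_len)
{
	/* Check for an error pointer before any dereference. */
	if (model_is_err(fl))
		return (int)model_ptr_err(fl);

	/* NULL is treated here as "no flow label", which is not an error. */
	*opt_len = fl ? fl->opt_len : 0;
	return 0;
}
```

A NULL-only check is not enough for such a lookup: an error pointer is non-NULL, so it would pass the check and be dereferenced.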
      
      syzbot reported:
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      CPU: 1 PID: 31763 Comm: syz-executor.0 Not tainted 5.2.0-rc6+ #63
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:ip6_datagram_dst_update+0x559/0xc30 net/ipv6/datagram.c:83
      Code: 00 00 e8 ea 29 3f fb 4d 85 f6 0f 84 96 04 00 00 e8 dc 29 3f fb 49 8d 7e 20 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 16 06 00 00 4d 8b 6e 20 e8 b4 29 3f fb 4c 89 ee
      RSP: 0018:ffff88809ba97ae0 EFLAGS: 00010207
      RAX: dffffc0000000000 RBX: ffff8880a81254b0 RCX: ffffc90008118000
      RDX: 0000000000000003 RSI: ffffffff86319a84 RDI: 000000000000001e
      RBP: ffff88809ba97c10 R08: ffff888065e9e700 R09: ffffed1015d26c80
      R10: ffffed1015d26c7f R11: ffff8880ae9363fb R12: ffff8880a8124f40
      R13: 0000000000000001 R14: fffffffffffffffe R15: ffff88809ba97b40
      FS:  00007f38e606a700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000202c0140 CR3: 00000000a026a000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       __ip6_datagram_connect+0x5e9/0x1390 net/ipv6/datagram.c:246
       ip6_datagram_connect+0x30/0x50 net/ipv6/datagram.c:269
       ip6_datagram_connect_v6_only+0x69/0x90 net/ipv6/datagram.c:281
       inet_dgram_connect+0x14a/0x2d0 net/ipv4/af_inet.c:571
       __sys_connect+0x264/0x330 net/socket.c:1824
       __do_sys_connect net/socket.c:1835 [inline]
       __se_sys_connect net/socket.c:1832 [inline]
       __x64_sys_connect+0x73/0xb0 net/socket.c:1832
       do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4597c9
      Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f38e6069c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004597c9
      RDX: 000000000000001c RSI: 0000000020000040 RDI: 0000000000000003
      RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f38e606a6d4
      R13: 00000000004bfd07 R14: 00000000004d1838 R15: 00000000ffffffff
      Modules linked in:
      RIP: 0010:ip6_datagram_dst_update+0x559/0xc30 net/ipv6/datagram.c:83
      Code: 00 00 e8 ea 29 3f fb 4d 85 f6 0f 84 96 04 00 00 e8 dc 29 3f fb 49 8d 7e 20 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 16 06 00 00 4d 8b 6e 20 e8 b4 29 3f fb 4c 89 ee
      
      Fixes: 59c820b2 ("ipv6: elide flowlabel check if no exclusive leases exist")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: tcp: fix flowlabels reflection for RST packets · 052e0690
      Eric Dumazet authored
      In 323a53c4 ("ipv6: tcp: enable flowlabel reflection in some RST packets")
      and 50a8accf ("ipv6: tcp: send consistent flowlabel in TIME_WAIT state")
      we took care of IPv6 flowlabel reflections for two cases.
      
      This patch takes care of the remaining case, when the RST packet
      is sent on behalf of a 'full' socket.
      
      In Marek's use case, this was a socket in TCP_CLOSE state.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Marek Majkowski <marek@cloudflare.com>
      Tested-by: Marek Majkowski <marek@cloudflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      052e0690
    • yangxingwu's avatar
      ipv6: Use ipv6_authlen for len · 416e8126
      yangxingwu authored
      The length of AH header is computed manually as (hp->hdrlen+2)<<2.
      However, in include/linux/ipv6.h, a macro named ipv6_authlen is
      already defined for exactly the same job. This commit replaces
      the manual computation code with the macro.
      Signed-off-by: yangxingwu <xingwu.yang@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      416e8126
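      The arithmetic in the ipv6_authlen commit above can be sketched in
      userspace as follows. This is only an illustration: the struct and
      macro names are hypothetical stand-ins, not the kernel's definitions,
      and the sketch assumes nothing beyond the (hdrlen+2)<<2 formula quoted
      in the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an AH header; only hdrlen matters here. */
struct ah_hdr_sketch {
    uint8_t nexthdr;
    uint8_t hdrlen; /* header length in 32-bit words, minus 2 */
};

/* Macro form of the length computation, modeled on the formula in the commit. */
#define AUTHLEN_SKETCH(p) ((((size_t)(p)->hdrlen) + 2) << 2)

/* The open-coded computation that the macro replaces. */
static size_t ah_len_manual(const struct ah_hdr_sketch *hp)
{
    return ((size_t)hp->hdrlen + 2) << 2;
}
```

      Both forms give the header length in bytes, so swapping one for the
      other does not change behavior.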
    • Cong Wang's avatar
      hsr: switch ->dellink() to ->ndo_uninit() · 311633b6
      Cong Wang authored
      Switching from ->priv_destructor to dellink() has an unexpected
      consequence: existing RCU readers, that is, hsr_port_get_hsr()
      callers, may still be able to read the port list.
      
      Instead of checking the return value of each hsr_port_get_hsr(),
      we can just move it to ->ndo_uninit() which is called after
      device unregister and synchronize_net(), and we still have RTNL
      lock there.
      
      Fixes: b9a1e627 ("hsr: implement dellink to clean up resources")
      Fixes: edf070a0 ("hsr: fix a NULL pointer deref in hsr_dev_xmit()")
      Reported-by: syzbot+097ef84cdc95843fbaa8@syzkaller.appspotmail.com
      Cc: Arvid Brodin <arvid.brodin@alten.se>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      311633b6
    • Joe Perches's avatar
      net: stmmac: Fix misuses of GENMASK macro · aa4c0c90
      Joe Perches authored
      Arguments are supposed to be ordered high then low.
      
      Fixes: 293e4365 ("stmmac: change descriptor layout")
      Fixes: 9f93ac8d ("net-next: stmmac: Add dwmac-sun8i")
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      aa4c0c90
    • Joe Perches's avatar
      net: ethernet: mediatek: Fix misuses of GENMASK macro · 937a9440
      Joe Perches authored
      Arguments are supposed to be ordered high then low.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      937a9440
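      Why the argument order matters in the two GENMASK fixes above can be
      seen in a small userspace sketch. The function below is a hypothetical
      stand-in modeled on the usual mask formula, not the kernel macro
      itself, and it assumes h and l are both smaller than the width of
      unsigned long.

```c
#include <limits.h>

/* Width of unsigned long on the build machine. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Returns a mask with bits l..h set; arguments are ordered high, then low. */
static unsigned long genmask_sketch(unsigned int h, unsigned int l)
{
    /* All bits from l upward are set. */
    unsigned long from_low = ~0UL - (1UL << l) + 1;
    /* All bits from h downward are set. */
    unsigned long up_to_high = ~0UL >> (SKETCH_BITS_PER_LONG - 1 - h);

    return from_low & up_to_high;
}
```

      With the arguments in the right order, genmask_sketch(7, 4) selects
      bits 4 through 7. With them swapped, the two partial masks do not
      overlap and the result is an empty mask, which is the kind of silent
      error these commits correct.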
    • Petar Penkov's avatar
      net: fib_rules: do not flow dissect local packets · 63f9ba1b
      Petar Penkov authored
      Rules matching on loopback iif do not need early flow dissection as the
      packet originates from the host. Stop counting such rules in
      fib_rule_requires_fldissect.
      Signed-off-by: Petar Penkov <ppenkov@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      63f9ba1b
    • Aya Levin's avatar
      net/mlx5e: IPoIB, Add error path in mlx5_rdma_setup_rn · ef1ce7d7
      Aya Levin authored
      Check return value from mlx5e_attach_netdev, add error path on failure.
      
      Fixes: 48935bbb ("net/mlx5e: IPoIB, Add netdevice profile skeleton")
      Signed-off-by: Aya Levin <ayal@mellanox.com>
      Reviewed-by: Feras Daoud <ferasda@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      ef1ce7d7
    • Aya Levin's avatar
      net/mlx5e: Fix error flow in tx reporter diagnose · 99d31cbd
      Aya Levin authored
      Fix tx reporter's diagnose callback. Propagate error when failing to
      gather diagnostics information or failing to print diagnostic data per
      queue.
      
      Fixes: de8650a8 ("net/mlx5e: Add tx reporter support")
      Signed-off-by: Aya Levin <ayal@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      99d31cbd
    • Aya Levin's avatar
      net/mlx5e: Fix return value from timeout recover function · 39825350
      Aya Levin authored
      Fix timeout recover function to return a meaningful return value.
      When an interrupt was not sent by the FW, return IO error instead of
      'true'.
      
      Fixes: c7981bea ("net/mlx5e: Fix return status of TX reporter timeout recover")
      Signed-off-by: Aya Levin <ayal@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      39825350
    • Saeed Mahameed's avatar
      net/mlx5e: Rx, Fix checksum calculation for new hardware · db849faa
      Saeed Mahameed authored
      CQE checksum full mode in new HW provides a full checksum of the rx
      frame, covering bytes starting from the eth protocol up to the last byte
      in the received frame (frame_size - ETH_HLEN), as expected by the stack.
      
      Fixing up skb->csum in the driver is not required in such a case. This
      fix avoids wrong checksum calculation in drivers which already support
      the new hardware with the new checksum mode.
      
      Fixes: 85327a9c ("net/mlx5: Update the list of the PCI supported devices")
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      db849faa
    • Eli Britstein's avatar
      net/mlx5e: Fix port tunnel GRE entropy control · 914adbb1
      Eli Britstein authored
      GRE entropy calculation is a single bit per card, and not per port.
      Force disable GRE entropy calculation upon the first GRE encap rule,
      and release the force at the last GRE encap rule removal. This is done
      per port.
      
      Fixes: 97417f61 ("net/mlx5e: Fix GRE key by controlling port tunnel entropy calculation")
      Signed-off-by: Eli Britstein <elibr@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      914adbb1
    • Maor Gottlieb's avatar
      net/mlx5: E-Switch, Fix default encap mode · 9a64144d
      Maor Gottlieb authored
      Encap mode is related to switchdev mode only. Move the init of
      the encap mode to eswitch_offloads. Before this change, we reported
      that eswitch supports encap, even though the device was in
      non-SRIOV mode.
      
      Fixes: 7768d197 ('net/mlx5: E-Switch, Add control for encapsulation')
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      9a64144d
    • Linus Torvalds's avatar
      Merge tag 'acpi-5.3-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · a131c2bf
      Linus Torvalds authored
      Pull ACPI fix from Rafael Wysocki:
       "Revert a recent ACPICA commit causing systems to hang at boot time"
      
      * tag 'acpi-5.3-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        Revert "ACPICA: Update table load object initialization"
      a131c2bf
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next · 237f83df
      Linus Torvalds authored
      Pull networking updates from David Miller:
       "Some highlights from this development cycle:
      
         1) Big refactoring of ipv6 route and neigh handling to support
            nexthop objects configurable as units from userspace. From David
            Ahern.
      
         2) Convert explored_states in BPF verifier into a hash table,
            significantly decreased state held for programs with bpf2bpf
            calls, from Alexei Starovoitov.
      
         3) Implement bpf_send_signal() helper, from Yonghong Song.
      
         4) Various classifier enhancements to mvpp2 driver, from Maxime
            Chevallier.
      
         5) Add aRFS support to hns3 driver, from Jian Shen.
      
         6) Fix use after free in inet frags by allocating fqdirs dynamically
            and reworking how rhashtable dismantle occurs, from Eric Dumazet.
      
         7) Add act_ctinfo packet classifier action, from Kevin
            Darbyshire-Bryant.
      
         8) Add TFO key backup infrastructure, from Jason Baron.
      
         9) Remove several old and unused ISDN drivers, from Arnd Bergmann.
      
        10) Add devlink notifications for flash update status to mlxsw driver,
            from Jiri Pirko.
      
        11) Lots of kTLS offload infrastructure fixes, from Jakub Kicinski.
      
        12) Add support for mv88e6250 DSA chips, from Rasmus Villemoes.
      
        13) Various enhancements to ipv6 flow label handling, from Eric
            Dumazet and Willem de Bruijn.
      
        14) Support TLS offload in nfp driver, from Jakub Kicinski, Dirk van
            der Merwe, and others.
      
        15) Various improvements to axienet driver including converting it to
            phylink, from Robert Hancock.
      
        16) Add PTP support to sja1105 DSA driver, from Vladimir Oltean.
      
        17) Add mqprio qdisc offload support to dpaa2-eth, from Ioana
            Radulescu.
      
        18) Add devlink health reporting to mlx5, from Moshe Shemesh.
      
        19) Convert stmmac over to phylink, from Jose Abreu.
      
        20) Add PTP PHC (Physical Hardware Clock) support to mlxsw, from
            Shalom Toledo.
      
        21) Add nftables SYNPROXY support, from Fernando Fernandez Mancera.
      
        22) Convert tcp_fastopen over to use SipHash, from Ard Biesheuvel.
      
        23) Track spill/fill of constants in BPF verifier, from Alexei
            Starovoitov.
      
        24) Support bounded loops in BPF, from Alexei Starovoitov.
      
        25) Various page_pool API fixes and improvements, from Jesper Dangaard
            Brouer.
      
        26) Just like ipv4, support ref-countless ipv6 route handling. From
            Wei Wang.
      
        27) Support VLAN offloading in aquantia driver, from Igor Russkikh.
      
        28) Add AF_XDP zero-copy support to mlx5, from Maxim Mikityanskiy.
      
        29) Add flower GRE encap/decap support to nfp driver, from Pieter
            Jansen van Vuuren.
      
        30) Protect against stack overflow when using act_mirred, from John
            Hurley.
      
        31) Allow devmap map lookups from eBPF, from Toke Høiland-Jørgensen.
      
        32) Use page_pool API in netsec driver, from Ilias Apalodimas.
      
        33) Add Google gve network driver, from Catherine Sullivan.
      
        34) More indirect call avoidance, from Paolo Abeni.
      
        35) Add kTLS TX HW offload support to mlx5, from Tariq Toukan.
      
        36) Add XDP_REDIRECT support to bnxt_en, from Andy Gospodarek.
      
        37) Add MPLS manipulation actions to TC, from John Hurley.
      
        38) Add sending a packet to connection tracking from TC actions, and
            then allow flower classifier matching on conntrack state. From
            Paul Blakey.
      
        39) Netfilter hw offload support, from Pablo Neira Ayuso"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2080 commits)
        net/mlx5e: Return in default case statement in tx_post_resync_params
        mlx5: Return -EINVAL when WARN_ON_ONCE triggers in mlx5e_tls_resync().
        net: dsa: add support for BRIDGE_MROUTER attribute
        pkt_sched: Include const.h
        net: netsec: remove static declaration for netsec_set_tx_de()
        net: netsec: remove superfluous if statement
        netfilter: nf_tables: add hardware offload support
        net: flow_offload: rename tc_cls_flower_offload to flow_cls_offload
        net: flow_offload: add flow_block_cb_is_busy() and use it
        net: sched: remove tcf block API
        drivers: net: use flow block API
        net: sched: use flow block API
        net: flow_offload: add flow_block_cb_{priv, incref, decref}()
        net: flow_offload: add list handling functions
        net: flow_offload: add flow_block_cb_alloc() and flow_block_cb_free()
        net: flow_offload: rename TCF_BLOCK_BINDER_TYPE_* to FLOW_BLOCK_BINDER_TYPE_*
        net: flow_offload: rename TC_BLOCK_{UN}BIND to FLOW_BLOCK_{UN}BIND
        net: flow_offload: add flow_block_cb_setup_simple()
        net: hisilicon: Add an tx_desc to adapt HI13X1_GMAC
        net: hisilicon: Add an rx_desc to adapt HI13X1_GMAC
        ...
      237f83df
    • Linus Torvalds's avatar
      Merge tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · 8f6ccf61
      Linus Torvalds authored
      Pull clone3 system call from Christian Brauner:
       "This adds the clone3 syscall which is an extensible successor to clone
        after we snagged the last flag with CLONE_PIDFD during the 5.2 merge
        window for clone(). It cleanly supports all of the flags from clone()
        and thus all legacy workloads.
      
        There are a few user-visible differences between clone3 and clone.
        First, CLONE_DETACHED will cause EINVAL with clone3 so we can reuse
        this flag. Second, the CSIGNAL flag is deprecated and will cause
        EINVAL to be reported. It is superseded by a dedicated "exit_signal"
        argument in struct clone_args thus freeing up even more flags. And
        third, clone3 gives CLONE_PIDFD a dedicated return argument in struct
        clone_args instead of abusing CLONE_PARENT_SETTID's parent_tidptr
        argument.
      
        The clone3 uapi is designed to be easy to handle on 32- and 64-bit:
      
          /* uapi */
          struct clone_args {
                  __aligned_u64 flags;
                  __aligned_u64 pidfd;
                  __aligned_u64 child_tid;
                  __aligned_u64 parent_tid;
                  __aligned_u64 exit_signal;
                  __aligned_u64 stack;
                  __aligned_u64 stack_size;
                  __aligned_u64 tls;
          };
      
        and a separate kernel struct is used that uses proper kernel typing:
      
          /* kernel internal */
          struct kernel_clone_args {
                  u64 flags;
                  int __user *pidfd;
                  int __user *child_tid;
                  int __user *parent_tid;
                  int exit_signal;
                  unsigned long stack;
                  unsigned long stack_size;
                  unsigned long tls;
          };
      
        The system call comes with a size argument which enables the kernel to
        detect what version of clone_args userspace is passing in. clone3
        validates that any additional bytes a given kernel does not know about
        are set to zero and that the size never exceeds a page.
      
        A nice feature is that this patchset allowed us to clean up and
        simplify various core kernel codepaths in kernel/fork.c by making the
        internal _do_fork() function take struct kernel_clone_args even for
        legacy clone().
      
        This patch also unblocks the time namespace patchset which wants to
        introduce a new CLONE_TIMENS flag.
      
        Note that clone3 has only been wired up for x86{_32,64}, arm{64}, and
        xtensa. These were the architectures that did not require special
        massaging.
      
        Other architectures treat fork-like system calls individually and
        after some back and forth neither Arnd nor I felt confident that we
        dared to add clone3 unconditionally to all architectures. We agreed to
        leave this up to individual architecture maintainers. This is why
        there's an additional patch that introduces __ARCH_WANT_SYS_CLONE3
        which any architecture can set once it has implemented support for
        clone3. The patch also adds a cond_syscall(clone3) for architectures
        such as nios2 or h8300 that generate their syscall table by simply
        including asm-generic/unistd.h. The hope is to get rid of
        __ARCH_WANT_SYS_CLONE3 and cond_syscall() rather soon"
      
      * tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        arch: handle arches who do not yet define clone3
        arch: wire-up clone3() syscall
        fork: add clone3
      8f6ccf61
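      The size rule described in the clone3 message above (unknown trailing
      bytes must be zero, and the size may not exceed a page) can be sketched
      as a plain C check. This is a hypothetical illustration, not the
      kernel's code: the function name, the 4096-byte page size, and the
      boolean return value are all assumptions, and minimum-size handling is
      left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed page size for the sketch; real systems may differ. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * buf:   the argument struct as passed by the caller
 * usize: the size the caller claims for it
 * known: the size of the struct version this side understands
 */
static bool args_size_ok(const unsigned char *buf, size_t usize, size_t known)
{
    if (usize > SKETCH_PAGE_SIZE)
        return false; /* never accept more than a page */

    /* Bytes beyond the known layout must all be zero. */
    for (size_t i = known; i < usize; i++) {
        if (buf[i] != 0)
            return false;
    }
    return true;
}
```

      A caller built against a newer, larger struct is accepted as long as it
      leaves the fields this side does not know about set to zero, which is
      what makes the interface extensible.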
    • Linus Torvalds's avatar
      Merge tag 'pidfd-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · 5450e8a3
      Linus Torvalds authored
      Pull pidfd updates from Christian Brauner:
       "This adds two main features.
      
         - First, it adds polling support for pidfds. This allows process
           managers to know when a (non-parent) process dies in a race-free
           way.
      
           The notification mechanism used follows the same logic that is
           currently used when the parent of a task is notified of a child's
           death. With this patchset it is possible to put pidfds in an
           {e}poll loop and get reliable notifications for process (i.e.
           thread-group) exit.
      
         - The second feature complements the first one by making it possible
           to retrieve pollable pidfds for processes that were not created
           using CLONE_PIDFD.
      
           A lot of processes get created with traditional PID-based calls
           such as fork() or clone() (without CLONE_PIDFD). For these
           processes a caller can currently not create a pollable pidfd. This
           is a problem for Android's low memory killer (LMK) and service
           managers such as systemd.
      
        Both patchsets are accompanied by selftests.
      
        It's perhaps worth noting that the work done so far and the work done
        in this branch for pidfd_open() and polling support do already see
        some adoption:
      
         - Android is in the process of backporting this work to all their LTS
           kernels [1]
      
         - Service managers make use of pidfd_send_signal but will need to
           wait until we enable waiting on pidfds for full adoption.
      
         - And projects I maintain make use of both pidfd_send_signal and
           CLONE_PIDFD [2] and will use polling support and pidfd_open() too"
      
      [1] https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.9+backport%22
          https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.14+backport%22
          https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.19+backport%22
      
      [2] https://github.com/lxc/lxc/blob/aab6e3eb73c343231cdde775db938994fc6f2803/src/lxc/start.c#L1753
      
      * tag 'pidfd-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        tests: add pidfd_open() tests
        arch: wire-up pidfd_open()
        pid: add pidfd_open()
        pidfd: add polling selftests
        pidfd: add polling support
      5450e8a3
    • Linus Torvalds's avatar
      Merge tag 'm68k-for-v5.3-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k · 29cd581b
      Linus Torvalds authored
      Pull m68k fix from Geert Uytterhoeven:
       "Don't select ARCH_HAS_DMA_PREP_COHERENT for nommu or coldfire.
      
        This is a fix for an issue detected in next, to avoid introducing
        build failures when merging Christoph's dma-mapping tree later"
      
      * tag 'm68k-for-v5.3-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
        m68k: Don't select ARCH_HAS_DMA_PREP_COHERENT for nommu or coldfire
      29cd581b
    • Linus Torvalds's avatar
      Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu · 398364a3
      Linus Torvalds authored
      Pull m68knommu updates from Greg Ungerer:
       "A series of cleanups for the FLAT format binary loader, binfmt_flat,
        from Christoph.
      
        The end goal is to support no-MMU on RISC-V, and the last patch
        enables that"
      
      * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
        riscv: add binfmt_flat support
        binfmt_flat: don't offset the data start
        binfmt_flat: move the MAX_SHARED_LIBS definition to binfmt_flat.c
        binfmt_flat: remove the persistent argument from flat_get_addr_from_rp
        binfmt_flat: provide an asm-generic/flat.h
        binfmt_flat: make support for old format binaries optional
        binfmt_flat: add a ARCH_HAS_BINFMT_FLAT option
        binfmt_flat: add endianess annotations
        binfmt_flat: use fixed size type for the on-disk format
        binfmt_flat: consolidate two version of flat_v2_reloc_t
        binfmt_flat: remove the unused OLD_FLAT_FLAG_RAM definition
        binfmt_flat: remove the uapi <linux/flat.h> header
        binfmt_flat: replace flat_argvp_envp_on_stack with a Kconfig variable
        binfmt_flat: remove flat_old_ram_flag
        binfmt_flat: provide a default version of flat_get_relocate_addr
        binfmt_flat: remove flat_set_persistent
        binfmt_flat: remove flat_reloc_valid
      398364a3
    • Linus Torvalds's avatar
      Merge tag 'nfsd-5.3' of git://linux-nfs.org/~bfields/linux · d2b6b4c8
      Linus Torvalds authored
      Pull nfsd updates from Bruce Fields:
       "Highlights:
      
         - Add a new /proc/fs/nfsd/clients/ directory which exposes some
           long-requested information about NFSv4 clients (like open files)
           and allows forced revocation of client state.
      
         - Replace the global duplicate reply cache by a cache per network
           namespace; previously, a request in one network namespace could
           incorrectly match an entry from another, though we haven't seen
           this in production. This is the last remaining container bug that
           I'm aware of; at this point you should be able to run separate
           nfsd's in each network namespace, each with their own set of
           exports, and everything should work.
      
         - Clean up and modify lock code to show the pid of lockd as the owner
           of NLM locks. This is the correct version of the bugfix originally
           attempted in b8eee0e9 ("lockd: Show pid of lockd for remote
           locks")"
      
      * tag 'nfsd-5.3' of git://linux-nfs.org/~bfields/linux: (34 commits)
        nfsd: Make __get_nfsdfs_client() static
        nfsd: Make two functions static
        nfsd: Fix misuse of strlcpy
        sunrpc/cache: remove the exporting of cache_seq_next
        nfsd: decode implementation id
        nfsd: create xdr_netobj_dup helper
        nfsd: allow forced expiration of NFSv4 clients
        nfsd: create get_nfsdfs_clp helper
        nfsd4: show layout stateids
        nfsd: show lock and deleg stateids
        nfsd4: add file to display list of client's opens
        nfsd: add more information to client info file
        nfsd: escape high characters in binary data
        nfsd: copy client's address including port number to cl_addr
        nfsd4: add a client info file
        nfsd: make client/ directory names small ints
        nfsd: add nfsd/clients directory
        nfsd4: use reference count to free client
        nfsd: rename cl_refcount
        nfsd: persist nfsd filesystem across mounts
        ...
      d2b6b4c8
    • Linus Torvalds's avatar
      Merge tag 'gfs2-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 · 0248a8be
      Linus Torvalds authored
      Pull gfs2 updates from Andreas Gruenbacher:
       "Some relatively minor changes for gfs2:
      
         - An initial batch of obvious cleanups and fixes from Bob's recovery
           patch queue.
      
         - Two iomap conversion patches and some cleanups from Christoph
           Hellwig.
      
         - A cosmetic cleanup from Kefeng Wang (Huawei).
      
         - Another minor fix and cleanup by me"
      
      * tag 'gfs2-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
        gfs2: Remove unused gfs2_iomap_alloc argument
        gfs2: don't use buffer_heads in gfs2_allocate_page_backing
        gfs2: use iomap_bmap instead of generic_block_bmap
        gfs2: mark stuffed_readpage static
        gfs2: merge gfs2_writepage_common into gfs2_writepage
        gfs2: merge gfs2_writeback_aops and gfs2_ordered_aops
        gfs2: remove the unused gfs2_stuffed_write_end function
        gfs2: use page_offset in gfs2_page_mkwrite
        gfs2: replace more printk with calls to fs_info and friends
        gfs2: dump fsid when dumping glock problems
        gfs2: simplify gfs2_freeze by removing case
        gfs2: Rename SDF_SHUTDOWN to SDF_WITHDRAWN
        gfs2: Warn when a journal replay overwrites a rgrp with buffers
        gfs2: log which portion of the journal is replayed
        gfs2: eliminate tr_num_revoke_rm
        gfs2: kthread and remount improvements
        gfs2: Use IS_ERR_OR_NULL
        gfs2: Clean up freeing struct gfs2_sbd
      0248a8be
    • Linus Torvalds's avatar
      Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 2e756758
      Linus Torvalds authored
      Pull ext4 updates from Ted Ts'o:
       "Many bug fixes and cleanups, and an optimization for case-insensitive
        lookups"
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: fix coverity warning on error path of filename setup
        ext4: replace ktype default_attrs with default_groups
        ext4: rename htree_inline_dir_to_tree() to ext4_inlinedir_to_tree()
        ext4: refactor initialize_dirent_tail()
        ext4: rename "dirent_csum" functions to use "dirblock"
        ext4: allow directory holes
        jbd2: drop declaration of journal_sync_buffer()
        ext4: use jbd2_inode dirty range scoping
        jbd2: introduce jbd2_inode dirty range scoping
        mm: add filemap_fdatawait_range_keep_errors()
        ext4: remove redundant assignment to node
        ext4: optimize case-insensitive lookups
        ext4: make __ext4_get_inode_loc plug
        ext4: clean up kerneldoc warnigns when building with W=1
        ext4: only set project inherit bit for directory
        ext4: enforce the immutable flag on open files
        ext4: don't allow any modifications to an immutable file
        jbd2: fix typo in comment of journal_submit_inode_data_buffers
        jbd2: fix some print format mistakes
        ext4: gracefully handle ext4_break_layouts() failure during truncate
      2e756758
    • Linus Torvalds's avatar
      Merge tag 'afs-next-20190628' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · 8dda9957
      Linus Torvalds authored
      Pull afs updates from David Howells:
       "A set of minor changes for AFS:
      
         - Remove an unnecessary check in afs_unlink()
      
         - Add a tracepoint for tracking callback management
      
         - Add a tracepoint for afs_server object usage
      
         - Use struct_size()
      
         - Add mappings for AFS UAE abort codes to Linux error codes, using
           symbolic names rather than hex numbers in the .c file"
      
      * tag 'afs-next-20190628' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        afs: Add support for the UAE error table
        fs/afs: use struct_size() in kzalloc()
        afs: Trace afs_server usage
        afs: Add some callback management tracepoints
        afs: afs_unlink() doesn't need to check dentry->d_inode
      8dda9957
    • Linus Torvalds's avatar
      Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt · 25cd6f35
      Linus Torvalds authored
      Pull fscrypt updates from Eric Biggers:
      
       - Preparations for supporting encryption on ext4 filesystems where the
         filesystem block size is smaller than PAGE_SIZE.
      
       - Don't allow setting encryption policies on dead directories.
      
       - Various cleanups.
      
      * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
        fscrypt: document testing with xfstests
        fscrypt: remove selection of CONFIG_CRYPTO_SHA256
        fscrypt: remove unnecessary includes of ratelimit.h
        fscrypt: don't set policy for a dead directory
        ext4: encrypt only up to last block in ext4_bio_write_page()
        ext4: decrypt only the needed block in __ext4_block_zero_page_range()
        ext4: decrypt only the needed blocks in ext4_block_write_begin()
        ext4: clear BH_Uptodate flag on decryption error
        fscrypt: decrypt only the needed blocks in __fscrypt_decrypt_bio()
        fscrypt: support decrypting multiple filesystem blocks per page
        fscrypt: introduce fscrypt_decrypt_block_inplace()
        fscrypt: handle blocksize < PAGE_SIZE in fscrypt_zeroout_range()
        fscrypt: support encrypting multiple filesystem blocks per page
        fscrypt: introduce fscrypt_encrypt_block_inplace()
        fscrypt: clean up some BUG_ON()s in block encryption/decryption
        fscrypt: rename fscrypt_do_page_crypto() to fscrypt_crypt_block()
        fscrypt: remove the "write" part of struct fscrypt_ctx
        fscrypt: simplify bounce page handling
      25cd6f35
    • Linus Torvalds's avatar
      Merge tag 'copy-file-range-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 40f06c79
      Linus Torvalds authored
      Pull copy_file_range updates from Darrick Wong:
       "This fixes numerous parameter checking problems and inconsistent
        behaviors in the new(ish) copy_file_range system call.
      
        Now the system call will actually check its range parameters
        correctly; refuse to copy into files for which the caller does not
        have sufficient privileges; update mtime and strip setuid like file
        writes are supposed to do; and allow copying up to the EOF of the
        source file instead of failing the call like we used to.
      
        Summary:
      
         - Create a generic copy_file_range handler and make individual
           filesystems responsible for calling it (i.e. no more assuming that
           do_splice_direct will work or is appropriate)
      
         - Refactor copy_file_range and remap_range parameter checking where
           they are the same
      
         - Install missing copy_file_range parameter checking(!)
      
         - Remove suid/sgid and update mtime like any other file write
      
         - Change the behavior so that a copy range crossing the source file's
           eof will result in a short copy to the source file's eof instead of
           EINVAL
      
         - Permit filesystems to decide if they want to handle
           cross-superblock copy_file_range in their local handlers"
      
      * tag 'copy-file-range-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        fuse: copy_file_range needs to strip setuid bits and update timestamps
        vfs: allow copy_file_range to copy across devices
        xfs: use file_modified() helper
        vfs: introduce file_modified() helper
        vfs: add missing checks to copy_file_range
        vfs: remove redundant checks from generic_remap_checks()
        vfs: introduce generic_file_rw_checks()
        vfs: no fallback for ->copy_file_range
        vfs: introduce generic_copy_file_range()
      40f06c79
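      The short-copy behavior described in the copy_file_range summary above
      can be sketched as a length clamp. This is a hypothetical illustration
      rather than the VFS code: the function name is invented, and returning
      0 when the start offset is at or past EOF is an assumption of the
      sketch.

```c
#include <stdint.h>

/*
 * Returns how many bytes a copy starting at pos may transfer when the
 * source file is src_size bytes long. A range crossing EOF is shortened
 * to end at EOF instead of being rejected.
 */
static uint64_t clamp_to_eof(uint64_t pos, uint64_t len, uint64_t src_size)
{
    if (pos >= src_size)
        return 0; /* nothing left to copy (sketch assumption) */

    uint64_t avail = src_size - pos;
    return len < avail ? len : avail;
}
```

      For a 100-byte source, a request for 20 bytes at offset 90 is shortened
      to 10 bytes, so callers should loop on short copies as they would for
      read() or write().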