    net, sched: fix panic when updating miniq {b,q}stats · 81d947e2
    Daniel Borkmann authored
    While working on fixing another bug, I ran into the following panic
    on arm64 by simply attaching a clsact qdisc, adding a filter, and running
    traffic on ingress to it:
    
      [...]
      [  178.188591] Unable to handle kernel read from unreadable memory at virtual address 810fb501f000
      [  178.197314] Mem abort info:
      [  178.200121]   ESR = 0x96000004
      [  178.203168]   Exception class = DABT (current EL), IL = 32 bits
      [  178.209095]   SET = 0, FnV = 0
      [  178.212157]   EA = 0, S1PTW = 0
      [  178.215288] Data abort info:
      [  178.218175]   ISV = 0, ISS = 0x00000004
      [  178.222019]   CM = 0, WnR = 0
      [  178.224997] user pgtable: 4k pages, 48-bit VAs, pgd = 0000000023cb3f33
      [  178.231531] [0000810fb501f000] *pgd=0000000000000000
      [  178.236508] Internal error: Oops: 96000004 [#1] SMP
      [...]
      [  178.311855] CPU: 73 PID: 2497 Comm: ping Tainted: G        W        4.15.0-rc7+ #5
      [  178.319413] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB18A 03/31/2017
      [  178.326887] pstate: 60400005 (nZCv daif +PAN -UAO)
      [  178.331685] pc : __netif_receive_skb_core+0x49c/0xac8
      [  178.336728] lr : __netif_receive_skb+0x28/0x78
      [  178.341161] sp : ffff00002344b750
      [  178.344465] x29: ffff00002344b750 x28: ffff810fbdfd0580
      [  178.349769] x27: 0000000000000000 x26: ffff000009378000
      [...]
      [  178.418715] x1 : 0000000000000054 x0 : 0000000000000000
      [  178.424020] Process ping (pid: 2497, stack limit = 0x000000009f0a3ff4)
      [  178.430537] Call trace:
      [  178.432976]  __netif_receive_skb_core+0x49c/0xac8
      [  178.437670]  __netif_receive_skb+0x28/0x78
      [  178.441757]  process_backlog+0x9c/0x160
      [  178.445584]  net_rx_action+0x2f8/0x3f0
      [...]
    
    The reason is that sch_ingress and sch_clsact call mini_qdisc_pair_init(),
    which sets the miniq pointers to the cpu_{b,q}stats of the underlying qdisc.
    This cannot work, since cpu_{b,q}stats are only set up right after the
    qdisc ->init() callback in qdisc_create(). The miniq therefore holds
    pointers to stats that were not yet allocated, so the first packet going
    into sch_handle_ingress() calls mini_qdisc_bstats_cpu_update() on them and
    we panic.
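
The ordering problem described above can be sketched in user space. This is only a toy model under stated assumptions: `toy_qdisc`, `toy_init()`, `toy_create_buggy()` and `toy_fastpath_would_fault()` are hypothetical stand-ins, not kernel code, and a plain `calloc()` takes the place of the per-CPU stats allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the per-CPU stats. */
struct toy_stats {
    unsigned long packets;
};

/* Hypothetical stand-in for a qdisc plus its miniq snapshot. */
struct toy_qdisc {
    struct toy_stats *cpu_bstats;   /* allocated by the core code */
    struct toy_stats *miniq_bstats; /* pointer value copied in ->init() */
};

/* Models mini_qdisc_pair_init(): copies the pointer as it is right now. */
static void toy_init(struct toy_qdisc *q)
{
    assert(q != NULL);
    q->miniq_bstats = q->cpu_bstats;
}

/* Models the pre-fix ordering: ->init() runs first, stats come after. */
static int toy_create_buggy(struct toy_qdisc *q)
{
    q->cpu_bstats = NULL;
    toy_init(q); /* snapshot is taken while cpu_bstats is still NULL */
    q->cpu_bstats = calloc(1, sizeof(*q->cpu_bstats));
    return q->cpu_bstats ? 0 : -1;
}

/* Returns 1 if a fast-path stats update would use an invalid pointer. */
static int toy_fastpath_would_fault(const struct toy_qdisc *q)
{
    return q->miniq_bstats == NULL;
}
```

After `toy_create_buggy()` succeeds, `cpu_bstats` is valid but `miniq_bstats` is still NULL, so `toy_fastpath_would_fault()` returns 1. The later allocation never reaches the copy taken during init.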
    
    In order to fix this, the allocation of {b,q}stats needs to happen before we
    call into ->init(). In net-next, there is already such an option through
    commit d59f5ffa ("net: sched: a dflt qdisc may be used with per cpu stats").
    However, the bug still needs to be fixed in net for 4.15. Thus, include
    these bits to reduce merge churn, reuse the static_flags field to
    set TCQ_F_CPUSTATS, and remove the allocation from qdisc_create() since
    there is no other user left. Prashant Bhole ran into the same issue on
    net-next, so he is added below as co-author as well. The same issue was
    also reported by Sandipan Das when using bcc.
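
The shape of the fix can be sketched the same way: a static flag on the ops makes the core allocate the stats before it calls `->init()`. Again a toy model under stated assumptions: `TOY_F_CPUSTATS`, `toy_ops`, `toy_sched` and the functions below are hypothetical stand-ins for TCQ_F_CPUSTATS, the qdisc ops and the qdisc itself, and `calloc()` replaces the per-CPU allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_F_CPUSTATS 0x1u /* hypothetical stand-in for TCQ_F_CPUSTATS */

struct toy_stats {
    unsigned long packets;
};

struct toy_sched;

/* Hypothetical stand-in for the qdisc ops with a static_flags field. */
struct toy_ops {
    unsigned int static_flags;
    void (*init)(struct toy_sched *sch);
};

struct toy_sched {
    const struct toy_ops *ops;
    struct toy_stats *cpu_bstats;   /* allocated by the core code */
    struct toy_stats *miniq_bstats; /* pointer value copied in ->init() */
};

/* Models the clsact ->init(): snapshots the stats pointer. */
static void toy_clsact_init(struct toy_sched *sch)
{
    assert(sch != NULL);
    sch->miniq_bstats = sch->cpu_bstats;
}

static const struct toy_ops toy_clsact_ops = {
    .static_flags = TOY_F_CPUSTATS,
    .init = toy_clsact_init,
};

/* Models the fixed ordering: allocate flagged stats, then call ->init(). */
static int toy_sched_create(struct toy_sched *sch, const struct toy_ops *ops)
{
    sch->ops = ops;
    sch->cpu_bstats = NULL;
    sch->miniq_bstats = NULL;
    if (ops->static_flags & TOY_F_CPUSTATS) {
        sch->cpu_bstats = calloc(1, sizeof(*sch->cpu_bstats));
        if (!sch->cpu_bstats)
            return -1;
    }
    ops->init(sch); /* the snapshot now sees a valid pointer */
    return 0;
}

/* Models the fast-path update through the miniq snapshot. */
static void toy_fastpath_update(struct toy_sched *sch)
{
    sch->miniq_bstats->packets++;
}
```

With this ordering, `toy_sched_create(&sch, &toy_clsact_ops)` leaves `miniq_bstats` equal to `cpu_bstats`, so `toy_fastpath_update()` increments the real counter instead of dereferencing an invalid pointer.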
    
    Fixes: 46209401 ("net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpath")
    Reference: https://lists.iovisor.org/pipermail/iovisor-dev/2018-January/001190.html
    Reported-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
    Co-authored-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
    Co-authored-by: John Fastabend <john.fastabend@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Cc: Jiri Pirko <jiri@resnulli.us>
    Signed-off-by: David S. Miller <davem@davemloft.net>
sch_api.c 46.8 KB