FCoE Test Reboot Debugging Notes (Part 1)
2023-07-23 13:40:46
Tags: FCoE

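Environment

CPU: Phytium S2500/64 C00
Kernel version: 4.19.90-25.10
NIC: Wangxun (网讯), txgbe driver
Two machines in total, directly connected by fiber

Reproduction steps

Run the following on both device A and device B to reproduce the problem: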

modprobe fcoe
systemctl start lldpad
systemctl start fcoe

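Summary

The reboot is caused by the function fcoe_ctlr_timer_work (drivers/scsi/fcoe/fcoe_ctlr.c) in libfcoe, part of the SCSI storage subsystem, accessing an invalid memory address. The bad address comes from a coding error: a whole-struct assignment was used without regard for the values of the struct's list pointer members.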

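Debugging notes

Checking the kernel log

On the Kylin 4.19.90-25.10 kernel with CONFIG_FCOE enabled, the system reboots after txgbe is loaded. The system log shows nothing abnormal, so serial console output has to be captured.

Capturing serial output

Capturing serial output failed. The serial log is normal during the BIOS stage, but nothing is printed once the kernel is running. The log below shows that ttyS1 had not yet been created when the console was enabled. (The behavior is the same on 4.19.90-25.10 and 4.19.90-51.40.)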

[root@compute ~]# dmesg | grep tty
[    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-4.19.90-51.0.v2207.fortest.ky10.aarch64 root=/dev/mapper/klas-root ro crashkernel=auto rd.lvm.lv=klas/root rd.lvm.lv=klas/swap acpi=on video=VGA-1:640x480-32@60me smmu.bypassdev=0x1000:0x17 smmu.bypassdev=0x1000:0x15 crashkernel=1024M,high video=efifb:off video=VGA-1:640x480-32@60me console=ttyS1,115200 loglevel=7
[   14.695352] 00:02: ttyS0 at MMIO 0x200002f8 (irq = 0, base_baud = 115200) is a 16550A
[   15.478192] console [ttyS0] enabled
[   15.479627] HISI0031:00: ttyS1 at MMIO 0x28001000 (irq = 7, base_baud = 3125000) is a 16550A
[   90.445804] audit: type=1300 audit(1676610550.480:126): arch=c00000b7 syscall=105 success=yes exit=0 a0=aaaaf7e3bcb0 a1=2ab7 a2=aaaae0aff1c8 a3=aaaaf7e322d0 items=0 ppid=3107 pid=5502 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="modprobe" exe="/usr/bin/kmod" key=(null)
[  182.268682] audit: type=1006 audit(1676610660.130:141): pid=5598 uid=0 old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=1 res=1
[  333.116270] audit: type=1006 audit(1676610810.970:231): pid=6326 uid=0 old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=3 res=1

Using kdump

kdump deployment commands:

yum install -y kexec-tools
systemctl restart kdump.service
kdumpctl restart

4.19.90-25.10 with kdump: system crashes
4.19.90-51.40 with kdump: test runs normally

kdump log:

[  811.607882] Unable to handle kernel paging request at virtual address fffffffffffffed8
[  811.608613] Mem abort info:
[  811.608882]   ESR = 0x96000005
[  811.609175]   Exception class = DABT (current EL), IL = 32 bits
[  811.609697]   SET = 0, FnV = 0
[  811.609989]   EA = 0, S1PTW = 0
[  811.610285] Data abort info:
[  811.610576]   ISV = 0, ISS = 0x00000005
[  811.610933]   CM = 0, WnR = 0
[  811.611219] swapper pgtable: 64k pages, 48-bit VAs, pgdp = 000000003e78cc5f
[  811.611855] [fffffffffffffed8] pgd=0000000000000000, pud=0000000000000000
[  811.612454] Internal error: Oops: 96000005 [#1] SMP
[  811.612930] Modules linked in: qedf qed crc8 fcoe libfcoe libfc scsi_transport_fc xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter ip6_tables iptable_filter tun ebtable_filter ebtable_nat ebtables iptable_raw iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi br_netfilter bridge 8021q garp mrp ipmi_ssif stp rfkill llc sunrpc vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher crct10dif_ce ghash_ce joydev ch341 ses sha2_ce usbserial sha256_arm64 enclosure txgbe ngbe sha1_ce sbsa_gwdt ipmi_si ipmi_devintf ipmi_msghandler sch_fq_codel ip_tables megaraid_sas ast dm_mirror dm_region_hash dm_log gb
[  811.618343] Process kworker/6:2 (pid: 1264, s