| Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Add basic LoongArch32 support
Note: Build infrastructures of LoongArch32 are not enabled yet,
because we need to adjust irqchip drivers and wait for GNU toolchain
be upstream first.
- Select HAVE_ARCH_BITREVERSE in Kconfig
- Fix build and boot for CONFIG_RANDSTRUCT
- Correct the calculation logic of thread_count
- Some bug fixes and other small changes
* tag 'loongarch-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (22 commits)
LoongArch: Adjust default config files for 32BIT/64BIT
LoongArch: Adjust VDSO/VSYSCALL for 32BIT/64BIT
LoongArch: Adjust misc routines for 32BIT/64BIT
LoongArch: Adjust user accessors for 32BIT/64BIT
LoongArch: Adjust system call for 32BIT/64BIT
LoongArch: Adjust module loader for 32BIT/64BIT
LoongArch: Adjust time routines for 32BIT/64BIT
LoongArch: Adjust process management for 32BIT/64BIT
LoongArch: Adjust memory management for 32BIT/64BIT
LoongArch: Adjust boot & setup for 32BIT/64BIT
LoongArch: Adjust common macro definitions for 32BIT/64BIT
LoongArch: Add adaptive CSR accessors for 32BIT/64BIT
LoongArch: Add atomic operations for 32BIT/64BIT
LoongArch: Add new PCI ID for pci_fixup_vgadev()
LoongArch: Add and use some macros for AVEC
LoongArch: Correct the calculation logic of thread_count
LoongArch: Use unsigned long for _end and _text
LoongArch: Use __pmd()/__pte() for swap entry conversions
LoongArch: Fix arch_dup_task_struct() for CONFIG_RANDSTRUCT
LoongArch: Fix build errors for CONFIG_RANDSTRUCT
...
|
|
Adjust VDSO/VSYSCALL because read_cpu_id() for 32BIT/64BIT are
different, and LoongArch32 doesn't support GENERIC_GETTIMEOFDAY now
(will be supported in future).
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Adjust misc routines for both 32BIT and 64BIT, including: bitops, bswap,
checksum, string, jump label, unaligned access emulator, suspend/wakeup
routines, etc.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Adjust system call for both 32BIT and 64BIT, including: add the uapi
unistd_{32,64}.h and syscall_table_{32,64}.h inclusion, add sys_mmap2()
definition, change the system call entry routines, etc.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Adjust module loader for both 32BIT and 64BIT, including: change the s64
type to long, change the u64 type to unsigned long, change the plt entry
definition and handling, etc.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Adjust time routines for both 32BIT and 64BIT, including: rdtime_h() /
rdtime_l() definitions for 32BIT and rdtime_d() definition for 64BIT,
get_cycles() and get_cycles64() definitions for 32BIT/64BIT, show time
frequency info ("CPU MHz" and "BogoMIPS") in /proc/cpuinfo, etc.
Use do_div() for division which works on both 32BIT and 64BIT platforms.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Adjust process management for both 32BIT and 64BIT, including: CPU
context switching, FPU loading/restoring, process dumping and process
tracing routines.
Q: Why modify switch.S?
A: LoongArch32 has no ldptr.d/stptr.d instructions, and asm offsets of
thead_struct members are too large to be filled in the 12b immediate
field of ld.w/st.w.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Adjust memory management for both 32BIT and 64BIT, including: address
space definition, DMW CSR definition, page table bits definition, boot
time detection of VA/PA bits, page table init, tlb exception handling,
copy_page/clear_page/dump_tlb libraries, etc.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Yawei Li <liyawei@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Adjust boot & setup for both 32BIT and 64BIT, including: efi header
definition, MAX_IO_PICS definition, kernel entry and environment setup
routines, etc.
Add a fallback path in fdt_cpu_clk_init() to avoid 0MHz in /proc/cpuinfo
if there is no valid clock freq from firmware.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
32BIT platforms only have 32bit CSR/IOCSR registers, 64BIT platforms
have both 32bit/64bit CSR/IOCSR registers. Now there are both 32bit and
64bit CSR accessors:
csr_read32()/csr_write32()/csr_xchg32();
csr_read64()/csr_write64()/csr_xchg64();
Some CSR registers (address and timer registers) are 32bit length on
32BIT platform and 64bit length on 64BIT platform. To avoid #ifdefs here
and there, they need adaptive accessors, so we define and use:
csr_read()/csr_write()/csr_xchg();
IOCSR doesn't have a "natural length", which means a 64bit register can
be treated as two 32bit registers, so we just use two 32bit accessors to
emulate a 64bit accessors.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
For thread_count, the current calculation method has a maximum of 255,
which may not be sufficient in the future. Therefore, we are correcting
it now.
Reference: SMBIOS Specification, 7.5 Processor Information (Type 4)[1]
[1]: https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.9.0.pdf
Cc: stable@vger.kernel.org
Signed-off-by: Qiang Ma <maqianga@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
It is better to use unsigned long rather than long for _end and _text to
calculate the kernel length.
Cc: stable@vger.kernel.org # v6.3+
Fixes: e5f02b51fa0c ("LoongArch: Add support for kernel address space layout randomization (KASLR)")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Now the optimized version of arch_dup_task_struct() for LoongArch
assumes 'thread' is the last member of 'task_struct'. But this is
not true if CONFIG_RANDSTRUCT is enabled after Linux-6.16.
So fix the arch_dup_task_struct() function for CONFIG_RANDSTRUCT by
copying the whole 'task_struct'.
Cc: stable@vger.kernel.org # 6.16+
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
When CONFIG_RANDSTRUCT enabled, members of task_struct are randomized.
There is a chance that TASK_STACK_CANARY be out of 12bit immediate's
range and causes build errors. TASK_STACK_CANARY is naturally aligned,
so fix it by replacing ld.d/st.d with ldptr.d/stptr.d which have 14bit
immediates.
Cc: stable@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202511240656.0NaPcJs1-lkp@intel.com/
Suggested-by: Rui Wang <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC driver updates from Arnd Bergmann:
"This is the first half of the driver changes:
- A treewide interface change to the "syscore" operations for power
management, as a preparation for future Tegra specific changes
- Reset controller updates with added drivers for LAN969x, eic770 and
RZ/G3S SoCs
- Protection of system controller registers on Renesas and Google
SoCs, to prevent trivially triggering a system crash from e.g.
debugfs access
- soc_device identification updates on Nvidia, Exynos and Mediatek
- debugfs support in the ST STM32 firewall driver
- Minor updates for SoC drivers on AMD/Xilinx, Renesas, Allwinner, TI
- Cleanups for memory controller support on Nvidia and Renesas"
* tag 'soc-drivers-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (114 commits)
memory: tegra186-emc: Fix missing put_bpmp
Documentation: reset: Remove reset_controller_add_lookup()
reset: fix BIT macro reference
reset: rzg2l-usbphy-ctrl: Fix a NULL vs IS_ERR() bug in probe
reset: th1520: Support reset controllers in more subsystems
reset: th1520: Prepare for supporting multiple controllers
dt-bindings: reset: thead,th1520-reset: Add controllers for more subsys
dt-bindings: reset: thead,th1520-reset: Remove non-VO-subsystem resets
reset: remove legacy reset lookup code
clk: davinci: psc: drop unused reset lookup
reset: rzg2l-usbphy-ctrl: Add support for RZ/G3S SoC
reset: rzg2l-usbphy-ctrl: Add support for USB PWRRDY
dt-bindings: reset: renesas,rzg2l-usbphy-ctrl: Document RZ/G3S support
reset: eswin: Add eic7700 reset driver
dt-bindings: reset: eswin: Documentation for eic7700 SoC
reset: sparx5: add LAN969x support
dt-bindings: reset: microchip: Add LAN969x support
soc: rockchip: grf: Add select correct PWM implementation on RK3368
soc/tegra: pmc: Add USB wake events for Tegra234
amba: tegra-ahb: Fix device leak on SMMU enable
...
|
|
If the default state of the interrupt controllers in the first kernel
don't mask any interrupts, it may cause the second kernel to potentially
receive interrupts (which were previously allocated by the first kernel)
immediately after a CPU becomes online during its boot process. These
interrupts cannot be properly routed, leading to bad IRQ issues.
This patch calls machine_kexec_mask_interrupts() to mask all interrupts
during the kexec/kdump process.
Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
On physical machine, NUMA node id comes from high bit 44:48 of physical
address. However it is not true on virt machine. With general method, it
comes from ACPI SRAT table.
Here the common function numa_memblks_init() is used to parse NUMA node
information with numa_memblks.
Cc: <stable@vger.kernel.org>
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Some processors have no IOCSR.VENDOR and IOCSR.CPUNAME, some processors
have these registers but there is no valid information.
Consolidate CPU names in /proc/cpuinfo:
1. Add "PRID" to display the PRID & Core-Name;
2. Let "Model Name" display "Unknown" if no valid name.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Several drivers can benefit from registering per-instance data along
with the syscore operations. To achieve this, move the modifiable fields
out of the syscore_ops structure and into a separate struct syscore that
can be registered with the framework. Add a void * driver data field for
drivers to store contextual data that will be passed to the syscore ops.
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
When specifying '-d' for kexec_file_load interface, loaded locations of
kernel/initrd/cmdline etc can be printed out to help debug.
Commit eb7622d908a0 ("kexec_file, riscv: print out debugging message if
required") fixes the same issue on RISC-V.
So, remove kexec_image_info() because the content has been printed out
in generic code.
And on Loongson-3A5000, the printed messages look like below:
kexec_file: kernel: 00000000d9aad283 kernel_size: 0x2e77f30
kexec_file(EFI): No LoongArch PE image header.
kexec_file: Loaded initrd at 0x80000000 bufsz=0x1637cd0 memsz=0x1638000
kexec_file(ELF): Loaded kernel at 0x9c20000 bufsz=0x27f1800 memsz=0x2950000
kexec_file: nr_segments = 2
kexec_file: segment[0]: buf=0x00000000cc3e6c33 bufsz=0x27f1800 mem=0x9c20000 memsz=0x2950000
kexec_file: segment[1]: buf=0x00000000bb75a541 bufsz=0x1637cd0 mem=0x80000000 memsz=0x1638000
kexec_file: kexec_file_load: type:0, start:0xb15d000 head:0x18db60002 flags:0x8
Signed-off-by: Qiang Ma <maqianga@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The kexec_buf structure was previously declared without initialization.
commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
added a field that is always read but not consistently populated by all
architectures. This un-initialized field will contain garbage.
This is also triggering a UBSAN warning when the uninitialized data is
accessed:
------------[ cut here ]------------
UBSAN: invalid-load in ./include/linux/kexec.h:210:10
load of value 252 is not a valid value for type '_Bool'
Zero-initializing kexec_buf at declaration ensures all fields are
cleanly set, preventing future instances of uninitialized memory being
used.
Fixes: bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
Link: https://lore.kernel.org/r/20250827-kbuf_all-v1-2-1df9882bb01a@debian.org
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
(1) Use the existing CPUCFG6_PMNUM_SHIFT macro definition instead of
the magic value 4 to get the PMU number.
(2) Detect the value of PMU bits via CPUCFG instruction according to
the ISA manual instead of hard-coded as 64, because the value may be
different for various micro-architectures.
Link: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#_cpucfg
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Now there 5 places which calculate max_pfn & max_low_pfn:
1. in fdt_setup() for FDT systems;
2. in memblock_init() for ACPI systems;
3. in init_numa_memory() for NUMA systems;
4. in arch_mem_init() to recalculate for "mem=" cmdline;
5. in paging_init() to recalculate for NUMA systems.
Since memblock_init() is called both for ACPI and FDT systems, move the
calculation out of the for_each_efi_memory_desc() loop can eliminate the
first case. The last case is very questionable (may be derived from the
MIPS/Loongson code) and breaks the "mem=" cmdline, so should be removed.
And then the NUMA version of paging_init() can be also eliminated.
After consolidation there are 3 places of calculation:
1. in memblock_init() for both ACPI and FDT systems;
2. in init_numa_memory() to recalculate for NUMA systems;
3. in arch_mem_init() to recalculate for the "mem=" cmdline.
For all cases the calculation is:
max_pfn = PFN_DOWN(memblock_end_of_DRAM());
max_low_pfn = min(PFN_DOWN(HIGHMEM_START), max_pfn);
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Now we use virtual addresses to fill CSR_MERRENTRY/CSR_TLBRENTRY, but
hardware hope physical addresses. Now it works well because the high
bits are ignored above PA_BITS (48 bits), but explicitly use physical
addresses can avoid potential bugs. So fix it.
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
LoongArch's MSG interrupt features are used across multiple subsystems.
Clarify these features to avoid misuse, existing users will be adjusted
if necessary.
MSGINT: Infrastructure, means the CPU core supports message interupts.
Indicated by CPUCFG1.MSGINT.
AVECINT: AVEC interrupt controller based on MSGINT, means the CPU chip
supports direct message interrupts. Indicated by IOCSR.FEATURES.DMSI.
REDIRECTINT: REDIRECT interrupt controller based on MSGINT and AVECINT,
means the CPU chip supports redirect message interrupts. Indicated by
IOCSR.FEATURES.RMSI.
For example:
Loongson-3A5000/3C5000 doesn't support MSGINT/AVECINT/REDIRECTINT;
Loongson-3A6000 supports MSGINT but doesn't support AVECINT/REDIRECTINT;
Loongson-3C6000 supports MSGINT/AVECINT/REDIRECTINT.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Initialize acpi_gbl_use_global_lock to false
- Allow specify SIMD width via kernel parameters
- Add kexec_file (both EFI & ELF format) support
- Add PER_VMA_LOCK for page fault handling support
- Improve BPF trampoline support
- Update the default config file
- Some bug fixes and other small changes
* tag 'loongarch-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (23 commits)
LoongArch: Update Loongson-3 default config file
LoongArch: BPF: Sign-extend struct ops return values properly
LoongArch: BPF: Make error handling robust in arch_prepare_bpf_trampoline()
LoongArch: BPF: Make trampoline size stable
LoongArch: BPF: Don't align trampoline size
LoongArch: BPF: No support of struct argument in trampoline programs
LoongArch: BPF: No text_poke() for kernel text
LoongArch: BPF: Remove duplicated bpf_flush_icache()
LoongArch: BPF: Remove duplicated flags check
LoongArch: BPF: Fix uninitialized symbol 'retval_off'
LoongArch: BPF: Optimize sign-extention mov instructions
LoongArch: Handle new atomic instructions for probes
LoongArch: Try VMA lock-based page fault handling first
LoongArch: Automatically disable kaslr if boot from kexec_file
LoongArch: Add crash dump support for kexec_file
LoongArch: Add ELF binary support for kexec_file
LoongArch: Add EFI binary support for kexec_file
LoongArch: Add preparatory infrastructure for kexec_file
LoongArch: Add struct loongarch_image_header for kernel
LoongArch: Allow specify SIMD width via kernel parameters
...
|
|
The atomic instructions sc.q, llacq.{w/d}, screl.{w/d} were newly added
in the LoongArch Reference Manual v1.10, it is necessary to handle them
in insns_not_supported() to avoid putting a breakpoint in the middle of
a ll/sc atomic sequence, otherwise it will loop forever for kprobes and
uprobes.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Automatically disable kaslr when the kernel loads from kexec_file.
kexec_file loads the secondary kernel image to a non-linked address,
inherently providing KASLR-like randomization.
However, on LoongArch where System RAM may be non-contiguous, enabling
KASLR for the second kernel may relocate it to an invalid memory region
and cause a boot failure. Thus, we disable KASLR when "kexec_file" is
detected in the command line.
To ensure compatibility with older kernels loaded via kexec_file, this
patch should be backported to stable branches.
Cc: stable@vger.kernel.org
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Enabling crash dump (kdump) includes:
- Prepare contents of ELF header of a core dump file, /proc/vmcore,
using crash_prepare_elf64_headers().
- Add "mem=size@start" parameter to the command line and pass it to the
capture kernel. Limit the runtime memory area of the captured kernel
to avoid disrupting the production kernel's runtime state.
- Add "elfcorehdr=size@start" parameter to the cmdline.
The basic usage for kdump (add the cmdline parameter crashkernel=512M
to grub.cfg for production kernel):
1) Load capture kernel image (vmlinux.efi or vmlinux can both be used):
# kexec -s -p vmlinuz.efi --initrd=initrd.img --reuse-cmdline
2) Do something to crash, like:
# echo c > /proc/sysrq-trigger
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
This patch creates kexec_elf_ops to load ELF binary file for
kexec_file_load() syscall.
However, for `kbuf->memsz` and `kbuf->buf_min`, special handling is
required, and the generic `kexec_elf_load()` cannot be used directly.
$ readelf -l vmlinux
...
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000010000 0x9000000000200000 0x9000000000200000
0x0000000002747a00 0x000000000287a0d8 RWE 0x10000
NOTE 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 R 0x8
phdr->p_paddr should have been a physical address, but it is a virtual
address on the current LoongArch. This will cause kexec_file to fail
when loading the kernel and need to be converted to a physical address.
From the above MemSiz, it can be seen that 0x287a0d8 isn't page aligned.
Although kexec_add_buffer() will perform PAGE_SIZE alignment on kbuf->
memsz, there is still a stampeding in the loaded kernel space and initrd
space. The initrd resolution failed when starting the second kernel.
It can be known from the link script vmlinux.lds.S that,
BSS_SECTION(0, SZ_64K, 8)
. = ALIGN(PECOFF_SEGMENT_ALIGN);
It needs to be aligned according to SZ_64K, so that after alignment, its
size is consistent with _kernel_asize.
The basic usage (vmlinux):
1) Load second kernel image:
# kexec -s -l vmlinux --initrd=initrd.img --reuse-cmdline
2) Startup second kernel:
# kexec -e
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
This patch creates kexec_efi_ops to load EFI binary file for
kexec_file_load() syscall.
The efi_kexec_load() as two parts:
- the first part loads the kernel image (vmlinuz.efi or vmlinux.efi)
- the second part loads other segments (e.g: initrd, cmdline, etc)
Currently, pez (vmlinuz.efi) and pei (vmlinux.efi) format images are
supported.
The basic usage (vmlinuz.efi or vmlinux.efi):
1) Load second kernel image:
# kexec -s -l vmlinuz.efi --initrd=initrd.img --reuse-cmdline
2) Startup second kernel:
# kexec -e
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Add some preparatory infrastructure:
- Add command line processing.
- Add support for loading other segments.
- Other minor modifications.
This initrd will be passed to the second kernel via the command line
'initrd=start,size'.
The 'kexec_file' command line parameter indicates that the kernel is
loaded via kexec_file.
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
For power saving or debugging purpose, we usually want to limit the SIMD
(LSX/LASX) usage on a rich feature platform. So allow specify SIMD width
via kernel parameters "simd=".
Allowed values of "simd=" are any integers, and recommended values are:
0: Disable all SIMD features;
128: Enable at most 128bit SIMD features;
256: Enable at most 256bit SIMD features;
-1(default): Enable as many as possible SIMD features automatically.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Init acpi_gbl_use_global_lock to false, in order to void error messages
during boot phase:
ACPI Error: Could not enable GlobalLock event (20240827/evxfevnt-182)
ACPI Error: No response from Global Lock hardware, disabling lock (20240827/evglock-59)
Fixes: 628c3bb40e9a8cefc0a6 ("LoongArch: Add boot and setup routines")
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer core updates from Thomas Gleixner:
- Address the inconsistent shutdown sequence of per CPU clockevents on
CPU hotplug, which only removed it from the core but failed to invoke
the actual device driver shutdown callback. This kept the timer
active, which prevented power savings and caused pointless noise in
virtualization.
- Encapsulate the open coded access to the hrtimer clock base, which is
a private implementation detail, so that the implementation can be
changed without breaking a lot of usage sites.
- Enhance the debug output of the clocksource watchdog to provide
better information for analysis.
- The usual set of cleanups and enhancements all over the place
* tag 'timers-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Fix spelling mistakes in comments
clocksource: Print durations for sync check unconditionally
LoongArch: Remove clockevents shutdown call on offlining
tick: Do not set device to detached state in tick_shutdown()
hrtimer: Reorder branches in hrtimer_clockid_to_base()
hrtimer: Remove hrtimer_clock_base:: Get_time
hrtimer: Use hrtimer_cb_get_time() helper
media: pwm-ir-tx: Avoid direct access to hrtimer clockbase
ALSA: hrtimer: Avoid direct access to hrtimer clockbase
lib: test_objpool: Avoid direct access to hrtimer clockbase
sched/core: Avoid direct access to hrtimer clockbase
timers/itimer: Avoid direct access to hrtimer clockbase
posix-timers: Avoid direct access to hrtimer clockbase
jiffies: Remove obsolete SHIFTED_HZ comment
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Core scheduler changes:
- Make migrate_{en,dis}able() inline, to improve performance
(Menglong Dong)
- Move STDL_INIT() functions out-of-line (Peter Zijlstra)
- Unify the SCHED_{SMT,CLUSTER,MC} Kconfig (Peter Zijlstra)
Fair scheduling:
- Defer throttling to when tasks exit to user-space, to reduce the
chance & impact of throttle-preemption with held locks and other
resources (Aaron Lu, Valentin Schneider)
- Get rid of sched_domains_curr_level hack for tl->cpumask(), as the
warning was getting triggered on certain topologies (Peter
Zijlstra)
Misc cleanups & fixes:
- Header cleanups (Menglong Dong)
- Fix race in push_dl_task() (Harshit Agarwal)"
* tag 'sched-core-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Fix some typos in include/linux/preempt.h
sched: Make migrate_{en,dis}able() inline
rcu: Replace preempt.h with sched.h in include/linux/rcupdate.h
arch: Add the macro COMPILE_OFFSETS to all the asm-offsets.c
sched/fair: Do not balance task to a throttled cfs_rq
sched/fair: Do not special case tasks in throttled hierarchy
sched/fair: update_cfs_group() for throttled cfs_rqs
sched/fair: Propagate load for throttled cfs_rq
sched/fair: Get rid of throttled_lb_pair()
sched/fair: Task based throttle time accounting
sched/fair: Switch to task based throttle model
sched/fair: Implement throttle task work and related helpers
sched/fair: Add related data structure for task based throttle
sched: Unify the SCHED_{SMT,CLUSTER,MC} Kconfig
sched: Move STDL_INIT() functions out-of-line
sched/fair: Get rid of sched_domains_curr_level hack for tl->cpumask()
sched/deadline: Fix race in push_dl_task()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull copy_process updates from Christian Brauner:
"This contains the changes to enable support for clone3() on nios2
which apparently is still a thing.
The more exciting part of this is that it cleans up the inconsistency
in how the 64-bit flag argument is passed from copy_process() into the
various other copy_*() helpers"
[ Fixed up rv ltl_monitor 32-bit support as per Sasha Levin in the merge ]
* tag 'kernel-6.18-rc1.clone3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
nios2: implement architecture-specific portion of sys_clone3
arch: copy_thread: pass clone_flags as u64
copy_process: pass clone_flags as u64 across calltree
copy_sighand: Handle architectures where sizeof(unsigned long) < sizeof(u64)
|
|
The include/generated/asm-offsets.h is generated in Kbuild during
compiling from arch/SRCARCH/kernel/asm-offsets.c. When we want to
generate another similar offset header file, circular dependency can
happen.
For example, we want to generate a offset file include/generated/test.h,
which is included in include/sched/sched.h. If we generate asm-offsets.h
first, it will fail, as include/sched/sched.h is included in asm-offsets.c
and include/generated/test.h doesn't exist; If we generate test.h first,
it can't success neither, as include/generated/asm-offsets.h is included
by it.
In x86_64, the macro COMPILE_OFFSETS is used to avoid such circular
dependency. We can generate asm-offsets.h first, and if the
COMPILE_OFFSETS is defined, we don't include the "generated/test.h".
And we define the macro COMPILE_OFFSETS for all the asm-offsets.c for this
purpose.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
Add a NULL-pointer check after the kcalloc() call in init_vdso(). If
allocation fails, return -ENOMEM to prevent a possible dereference of
vdso_info.code_mapping.pages when it is NULL.
Cc: stable@vger.kernel.org
Fixes: 2ed119aef60d ("LoongArch: Set correct size for vDSO code mapping")
Signed-off-by: Guangshuo Li <202321181@mail.sdu.edu.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
When testing the kernel live patching with "modprobe livepatch-sample",
there is a timeout over 15 seconds from "starting patching transition"
to "patching complete". The dmesg command shows "unreliable stack" for
user tasks in debug mode, here is one of the messages:
livepatch: klp_try_switch_task: bash:1193 has an unreliable stack
The "unreliable stack" is because it can not unwind from do_syscall()
to its previous frame handle_syscall(). It should use fp to find the
original stack top due to secondary stack in do_syscall(), but fp is
not used for some other functions, then fp can not be restored by the
next frame of do_syscall(), so it is necessary to save fp if task is
not current, in order to get the stack top of do_syscall().
Here are the call chains:
klp_enable_patch()
klp_try_complete_transition()
klp_try_switch_task()
klp_check_and_switch_task()
klp_check_stack()
stack_trace_save_tsk_reliable()
arch_stack_walk_reliable()
When executing "rmmod livepatch-sample", there exists a similar issue.
With this patch, it takes a short time for patching and unpatching.
Before:
# modprobe livepatch-sample
# dmesg -T | tail -3
[Sat Sep 6 11:00:20 2025] livepatch: 'livepatch_sample': starting patching transition
[Sat Sep 6 11:00:35 2025] livepatch: signaling remaining tasks
[Sat Sep 6 11:00:36 2025] livepatch: 'livepatch_sample': patching complete
# echo 0 > /sys/kernel/livepatch/livepatch_sample/enabled
# rmmod livepatch_sample
rmmod: ERROR: Module livepatch_sample is in use
# rmmod livepatch_sample
# dmesg -T | tail -3
[Sat Sep 6 11:06:05 2025] livepatch: 'livepatch_sample': starting unpatching transition
[Sat Sep 6 11:06:20 2025] livepatch: signaling remaining tasks
[Sat Sep 6 11:06:21 2025] livepatch: 'livepatch_sample': unpatching complete
After:
# modprobe livepatch-sample
# dmesg -T | tail -2
[Tue Sep 16 16:19:30 2025] livepatch: 'livepatch_sample': starting patching transition
[Tue Sep 16 16:19:31 2025] livepatch: 'livepatch_sample': patching complete
# echo 0 > /sys/kernel/livepatch/livepatch_sample/enabled
# rmmod livepatch_sample
# dmesg -T | tail -2
[Tue Sep 16 16:19:36 2025] livepatch: 'livepatch_sample': starting unpatching transition
[Tue Sep 16 16:19:37 2025] livepatch: 'livepatch_sample': unpatching complete
Cc: stable@vger.kernel.org # v6.9+
Fixes: 199cc14cb4f1 ("LoongArch: Add kernel livepatching support")
Reported-by: Xi Zhang <zhangxi@kylinos.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
As Documentation/filesystems/sysfs.rst suggested, show() should only use
sysfs_emit() or sysfs_emit_at() when formatting the value to be returned
to user space.
No functional change intended.
Cc: stable@vger.kernel.org
Signed-off-by: Tao Cui <cuitao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Add a check for the return value of kobject_create_and_add(), to ensure
that the kobj allocation succeeds for later use.
Cc: stable@vger.kernel.org
Signed-off-by: Tao Cui <cuitao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The clockevents core already detached and unregistered it at this stage.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250906064952.3749122-3-maobibo@loongson.cn
|
|
With the introduction of clone3 in commit 7f192e3cd316 ("fork: add
clone3") the effective bit width of clone_flags on all architectures was
increased from 32-bit to 64-bit, with a new type of u64 for the flags.
However, for most consumers of clone_flags the interface was not
changed from the previous type of unsigned long.
While this works fine as long as none of the new 64-bit flag bits
(CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still
undesirable in terms of the principle of least surprise.
Thus, this commit fixes all relevant interfaces of the copy_thread
function that is called from copy_process to consistently pass
clone_flags as u64, so that no truncation to 32-bit integers occurs on
32-bit architectures.
Signed-off-by: Simon Schuster <schuster.simon@siemens-energy.com>
Link: https://lore.kernel.org/20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com
Fixes: c5febea0956fd387 ("fork: Pass struct kernel_clone_args into copy_thread")
Acked-by: Guo Ren (Alibaba Damo Academy) <guoren@kernel.org>
Acked-by: Andreas Larsson <andreas@gaisler.com> # sparc
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Now if preemption happens between protected_save_fpu_context() and
protected_save_lbt_context(), FTOP context is lost. Because FTOP is
saved by protected_save_lbt_context() but protected_save_fpu_context()
disables TM before that. So save LBT before FPU in setup_sigcontext()
to avoid this potential risk.
Signed-off-by: Hanlu Li <lihanlu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
When enabling CONFIG_KASAN, CONFIG_PREEMPT_VOLUNTARY_BUILD and
CONFIG_PREEMPT_VOLUNTARY at the same time, there will be soft deadlock,
the relevant logs are as follows:
rcu: INFO: rcu_sched self-detected stall on CPU
...
Call Trace:
[<900000000024f9e4>] show_stack+0x5c/0x180
[<90000000002482f4>] dump_stack_lvl+0x94/0xbc
[<9000000000224544>] rcu_dump_cpu_stacks+0x1fc/0x280
[<900000000037ac80>] rcu_sched_clock_irq+0x720/0xf88
[<9000000000396c34>] update_process_times+0xb4/0x150
[<90000000003b2474>] tick_nohz_handler+0xf4/0x250
[<9000000000397e28>] __hrtimer_run_queues+0x1d0/0x428
[<9000000000399b2c>] hrtimer_interrupt+0x214/0x538
[<9000000000253634>] constant_timer_interrupt+0x64/0x80
[<9000000000349938>] __handle_irq_event_percpu+0x78/0x1a0
[<9000000000349a78>] handle_irq_event_percpu+0x18/0x88
[<9000000000354c00>] handle_percpu_irq+0x90/0xf0
[<9000000000348c74>] handle_irq_desc+0x94/0xb8
[<9000000001012b28>] handle_cpu_irq+0x68/0xa0
[<9000000001def8c0>] handle_loongarch_irq+0x30/0x48
[<9000000001def958>] do_vint+0x80/0xd0
[<9000000000268a0c>] kasan_mem_to_shadow.part.0+0x2c/0x2a0
[<90000000006344f4>] __asan_load8+0x4c/0x120
[<900000000025c0d0>] module_frob_arch_sections+0x5c8/0x6b8
[<90000000003895f0>] load_module+0x9e0/0x2958
[<900000000038b770>] __do_sys_init_module+0x208/0x2d0
[<9000000001df0c34>] do_syscall+0x94/0x190
[<900000000024d6fc>] handle_syscall+0xbc/0x158
After analysis, this is because the slow speed of loading the amdgpu
module leads to the long time occupation of the cpu and then the soft
deadlock.
When loading a module, module_frob_arch_sections() tries to figure out
the number of PLTs/GOTs that will be needed to handle all the RELAs. It
will call the count_max_entries() to find in an out-of-order date which
counting algorithm has O(n^2) complexity.
To make it faster, we sort the relocation list by info and addend. That
way, to check for a duplicate relocation, it just needs to compare with
the previous entry. This reduces the complexity of the algorithm to O(n
log n), as done in commit d4e0340919fb ("arm64/module: Optimize module
load time by optimizing PLT counting"). This gives sinificant reduction
in module load time for modules with large number of relocations.
After applying this patch, the soft deadlock problem has been solved,
and the kernel starts normally without "Call Trace".
Using the default configuration to test some modules, the results are as
follows:
Module Size
ip_tables 36K
fat 143K
radeon 2.5MB
amdgpu 16MB
Without this patch:
Module Module load time (ms) Count(PLTs/GOTs)
ip_tables 18 59/6
fat 0 162/14
radeon 54 1221/84
amdgpu 1411 4525/1098
With this patch:
Module Module load time (ms) Count(PLTs/GOTs)
ip_tables 18 59/6
fat 0 162/14
radeon 22 1221/84
amdgpu 45 4525/1098
Fixes: fcdfe9d22bed ("LoongArch: Add ELF and module support")
Signed-off-by: Kanglong Wang <wangkanglong@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
When the CPU is offline, the timer of LoongArch is not correctly closed.
This is harmless for real machines, but resulting in an excessively high
cpu usage rate of the offline vCPU thread in the virtual machines.
To correctly close the timer, we have made the following modifications:
Register the cpu hotplug event (CPUHP_AP_LOONGARCH_ARCH_TIMER_STARTING)
for LoongArch. This event's hooks will be called to close the timer when
the CPU is offline.
Clear the timer interrupt when the timer is turned off. Since before the
timer is turned off, there may be a timer interrupt that has already been
in the pending state due to the interruption of the disabled, which also
affects the halt state of the offline vCPU.
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Complete KSave registers definition
- Support the mem=<size> kernel parameter
- Support BPF dynamic modification & trampoline
- Add MMC/SDIO controller nodes in dts
- Some bug fixes and other small changes
* tag 'loongarch-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: vDSO: Remove -nostdlib complier flag
LoongArch: dts: Add eMMC/SDIO controller support to Loongson-2K2000
LoongArch: dts: Add SDIO controller support to Loongson-2K1000
LoongArch: dts: Add SDIO controller support to Loongson-2K0500
LoongArch: BPF: Set bpf_jit_bypass_spec_v1/v4()
LoongArch: BPF: Fix the tailcall hierarchy
LoongArch: BPF: Fix jump offset calculation in tailcall
LoongArch: BPF: Add struct ops support for trampoline
LoongArch: BPF: Add basic bpf trampoline support
LoongArch: BPF: Add dynamic code modification support
LoongArch: BPF: Rename and refactor validate_code()
LoongArch: Add larch_insn_gen_{beq,bne} helpers
LoongArch: Don't use %pK through printk() in unwinder
LoongArch: Avoid in-place string operation on FDT content
LoongArch: Support mem=<size> kernel parameter
LoongArch: Make relocate_new_kernel_size be a .quad value
LoongArch: Complete KSave registers definition
|
|
This commit adds support for BPF dynamic code modification on the
LoongArch architecture:
1. Add bpf_arch_text_copy() for instruction block copying.
2. Add bpf_arch_text_poke() for runtime instruction patching.
3. Add bpf_arch_text_invalidate() for code invalidation.
On LoongArch, since symbol addresses in the direct mapping region can't
be reached via relative jump instructions from the paged mapping region,
we use the move_imm+jirl instruction pair as absolute jump instructions.
These require 2-5 instructions, so we reserve 5 NOP instructions in the
program as placeholders for function jumps.
The larch_insn_text_copy() function is solely used for BPF. And the use
of larch_insn_text_copy() requires PAGE_SIZE alignment. Currently, only
the size of the BPF trampoline is page-aligned.
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
"Significant patch series in this pull request:
- "squashfs: Remove page->mapping references" (Matthew Wilcox) gets
us closer to being able to remove page->mapping
- "relayfs: misc changes" (Jason Xing) does some maintenance and
minor feature addition work in relayfs
- "kdump: crashkernel reservation from CMA" (Jiri Bohac) switches
us from static preallocation of the kdump crashkernel's working
memory over to dynamic allocation. So the difficulty of a-priori
estimation of the second kernel's needs is removed and the first
kernel obtains extra memory
- "generalize panic_print's dump function to be used by other
kernel parts" (Feng Tang) implements some consolidation and
rationalization of the various ways in which a failing kernel
splats information at the operator
* tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (80 commits)
tools/getdelays: add backward compatibility for taskstats version
kho: add test for kexec handover
delaytop: enhance error logging and add PSI feature description
samples: Kconfig: fix spelling mistake "instancess" -> "instances"
fat: fix too many log in fat_chain_add()
scripts/spelling.txt: add notifer||notifier to spelling.txt
xen/xenbus: fix typo "notifer"
net: mvneta: fix typo "notifer"
drm/xe: fix typo "notifer"
cxl: mce: fix typo "notifer"
KVM: x86: fix typo "notifer"
MAINTAINERS: add maintainers for delaytop
ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below()
ucount: fix atomic_long_inc_below() argument type
kexec: enable CMA based contiguous allocation
stackdepot: make max number of pools boot-time configurable
lib/xxhash: remove unused functions
init/Kconfig: restore CONFIG_BROKEN help text
lib/raid6: update recov_rvv.c zero page usage
docs: update docs after introducing delaytop
...
|