| Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull more SoC driver updates from Arnd Bergmann:
"These updates came a little late, or were based on a later 6.18-rc tag
than the others:
- A new driver for cache management on cxl devices with memory shared
in a coherent cluster. This is part of the drivers/cache/ tree, but
unlike the other drivers that back the dma-mapping interfaces, this
one is needed only during CPU hotplug.
- A shared branch for reset controllers using swnode infrastructure
- Added support for new SoC variants in the Amlogic soc_device
identification
- Minor updates in Freescale, Microchip, Samsung, and Apple SoC
drivers"
* tag 'soc-drivers-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (24 commits)
soc: samsung: exynos-pmu: fix device leak on regmap lookup
soc: samsung: exynos-pmu: Fix structure initialization
soc: fsl: qbman: use kmalloc_array() instead of kmalloc()
soc: fsl: qbman: add WQ_PERCPU to alloc_workqueue users
MAINTAINERS: Update email address for Christophe Leroy
MAINTAINERS: refer to intended file in STANDALONE CACHE CONTROLLER DRIVERS
cache: Support cache maintenance for HiSilicon SoC Hydra Home Agent
cache: Make top level Kconfig menu a boolean dependent on RISCV
MAINTAINERS: Add Jonathan Cameron to drivers/cache and add lib/cache_maint.c + header
arm64: Select GENERIC_CPU_CACHE_MAINTENANCE
lib: Support ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION
soc: amlogic: meson-gx-socinfo: add new SoCs id
dt-bindings: arm: amlogic: meson-gx-ao-secure: support more SoCs
memregion: Support fine grained invalidate by cpu_cache_invalidate_memregion()
memregion: Drop unused IORES_DESC_* parameter from cpu_cache_invalidate_memregion()
dt-bindings: cache: sifive,ccache0: add a pic64gx compatible
MAINTAINERS: rename Microchip RISC-V entry
MAINTAINERS: add new soc drivers to Microchip RISC-V entry
soc: microchip: add mfd drivers for two syscon regions on PolarFire SoC
dt-bindings: soc: microchip: document the simple-mfd syscon on PolarFire SoC
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull compute express link (CXL) updates from Dave Jiang:
"The additions of note are adding CXL region remove support for locked
CXL decoders, adding unit testing support for XOR address translation,
and adding unit testing support for extended linear cache.
Misc:
- Remove incorrect page-allocator quirk section in documentation
- Remove unused devm_cxl_port_enumerate_dports() function
- Fix typo in cdat.c code comment
- Replace use of system_wq with system_percpu_wq
- Add locked CXL decoder support for region removal
- Return when generic target updated
- Rename region_res_match_cxl_range() to spa_maps_hpa()
- Clarify comment in spa_maps_hpa()
Enable unit testing for XOR address translation of SPA to DPA and vice versa:
- Refactor address translation funcs for testing in cxl_region
- Make the XOR calculations available for testing
- Add cxl_translate module for address translation testing in
cxl_test
Extended Linear Cache changes:
- Add extended linear cache size sysfs attribute
- Adjust failure emission of extended linear cache detection in
cxl_acpi
- Added extended linear cache unit testing support in cxl_test
Preparation refactor patches for PRM translation support:
- Simplify cxl_rd_ops allocation and handling
- Group xor arithmetric setup code in a single block
- Remove local variable @inc in cxl_port_setup_targets()"
* tag 'cxl-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (22 commits)
cxl/test: Assign overflow_err_count from log->nr_overflow
cxl/test: Remove ret_limit race condition in mock_get_event()
cxl/test: remove unused mock function for cxl_rcd_component_reg_phys()
cxl/test: Add support for acpi extended linear cache
cxl/test: Add cxl_test CFMWS support for extended linear cache
cxl/test: Standardize CXL auto region size
cxl/region: Remove local variable @inc in cxl_port_setup_targets()
cxl/acpi: Group xor arithmetric setup code in a single block
cxl: Simplify cxl_rd_ops allocation and handling
cxl: Clarify comment in spa_maps_hpa()
cxl: Rename region_res_match_cxl_range() to spa_maps_hpa()
acpi/hmat: Return when generic target is updated
cxl: Add handling of locked CXL decoder
cxl/region: Add support to indicate region has extended linear cache
cxl: Adjust extended linear cache failure emission in cxl_acpi
cxl/test: Add cxl_translate module for address translation testing
cxl/acpi: Make the XOR calculations available for testing
cxl/region: Refactor address translation funcs for testing
cxl/pci: replace use of system_wq with system_percpu_wq
cxl: fix typos in cdat.c comments
...
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/drivers-late
standalone cache drivers for v6.19
ccache:
Add a compatible for the pic64gx SoC. No driver change needed, as it
falls back to the PolarFire SoC.
hisi hha/generic cpu cache maintenance:
Add support for a non-architectural mechanism for invalidating memory
regions, needed for some cxl implementations on arm64 (and probably
elsewhere in the future). The HiSilicon Hydra Home Agent is the first
driver to provide this support.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
* tag 'cache-for-v6.19' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
MAINTAINERS: refer to intended file in STANDALONE CACHE CONTROLLER DRIVERS
cache: Support cache maintenance for HiSilicon SoC Hydra Home Agent
cache: Make top level Kconfig menu a boolean dependent on RISCV
MAINTAINERS: Add Jonathan Cameron to drivers/cache and add lib/cache_maint.c + header
arm64: Select GENERIC_CPU_CACHE_MAINTENANCE
lib: Support ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION
memregion: Support fine grained invalidate by cpu_cache_invalidate_memregion()
memregion: Drop unused IORES_DESC_* parameter from cpu_cache_invalidate_memregion()
dt-bindings: cache: sifive,ccache0: add a pic64gx compatible
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Extend cpu_cache_invalidate_memregion() to support invalidating a
particular range of memory by introducing start and length parameters.
Control of types of invalidation is left for when use cases turn up. For
now everything is Clean and Invalidate.
Where the range is unknown, use the provided cpu_cache_invalidate_all()
helper to act as documentation of intent in a fashion that is clearer than
passing (0, -1) to cpu_cache_invalidate_memregion().
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
|
|
cpu_cache_invalidate_memregion()
The res_desc parameter was originally introduced for documentation purposes
and with the idea that with HDM-DB CXL invalidation could be triggered from
the device. That has not come to pass and the continued existence of the
option is confusing when we add a range in the following patch which might
not be a strict subset of the res_desc. So avoid that confusion by dropping
the parameter.
Link: https://lore.kernel.org/linux-mm/686eedb25ed02_24471002e@dwillia2-xfh.jf.intel.com.notmuch/
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
|
|
- Simplify cxl_rd_ops allocation
- Group xor arithmetric setup code
- Remove local variable @inc in cxl_port_setup_targets()
|
|
Simplify the code by removing local variable @inc. The variable is not
used elsewhere, remove it and directly increment the target number.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Link: https://patch.msgid.link/20251114075844.1315805-4-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Simplify the xor arithmetric setup code by grouping it in a single
block. No need to split the block for QoS setup.
It is safe to reorder the call of cxl_setup_extended_linear_cache()
because there are no dependencies.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Tested-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20251114075844.1315805-3-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
A root decoder's callback handlers are collected in struct cxl_rd_ops.
The structure is dynamically allocated, though it contains only a few
pointers in it. This also requires to check two pointes to check for
the existence of a callback.
Simplify the allocation, release and handler check by embedding the
ops statically in struct cxl_root_decoder.
Implementation is equivalent to how struct cxl_root_ops handles the
callbacks.
[ dj: Fix spelling error in commit log. ]
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Link: https://patch.msgid.link/20251114075844.1315805-2-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
- Add extended linear cache size sysfs attribute.
- Adjust failure emission of extended linear cache detection in cxl_acpi.
|
|
Enable unit testing for XOR address translation of SPA to DPA and vice versa.
|
|
Update the comment in spa_maps_hpa() to clearly convey the construction
of extended linear cache.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/linux-cxl/68eea19c7e67e_2f899100a8@dwillia2-mobl4.notmuch/
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20251106170108.1468304-3-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
The function name region_res_match_cxl_range() does not accurately
convey the operation of address comparison with cache size. Rename
to spa_maps_hpa() to provide a better function name.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/linux-cxl/68eea19c7e67e_2f899100a8@dwillia2-mobl4.notmuch/
Reviewed-by: Jonathan Cameron <jonathan.cameron@huwei.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20251106170108.1468304-2-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
When a decoder is locked, it means that its configuration cannot be
changed. CXL spec r3.2 8.2.4.20.13 discusses the details regarding
locked decoders. Locking happens when bit 8 of the decoder control
register is set and then the decoder is committed afterwards (CXL
spec r3.2 8.2.4.20.7).
Given that the driver creates a virtual decoder for each CFMWS, the
Fixed Device Configuration (bit 4) of the Window Restriction field is
considered as locking for the virtual decoder by the driver.
The current driver code disregards the locked status and a region can
be destroyed regardless of the locking state.
Add a region flag to indicate the region is in a locked configuration.
The driver will considered a region locked if the CFMWS or any decoder
is configured as locked. The consideration is all or nothing regarding
the locked state. It is reasonable to determine the region "locked"
status while the region is being assembled based on the decoders.
Add a check in region commit_store() to intercept when a 0 is written
to the commit sysfs attribute in order to prevent the destruction of a
region when in locked state. This should be the only entry point from user
space to destroy a region.
Add a check is added to cxl_decoder_reset() to prevent resetting a locked
decoder within the kernel driver.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251105201826.2901915-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
The HPA to DPA translation for poison injection assumes that the
base address starts from where the CXL region begins. When the
extended linear cache is active, the offset can be within the DRAM
region. Adjust the offset so that it correctly reflects the offset
within the CXL region.
[ dj: Add fixes tag from Alison ]
Fixes: c3dd67681c70 ("cxl/region: Add inject and clear poison by region offset")
Link: https://patch.msgid.link/20251031173224.3537030-5-dave.jiang@intel.com
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add a region sysfs attribute to show the size of the extended linear
cache if there is any. The attribute is invisible when the cache
size is 0, which indicates it does not exist.
Moved the cxl_region_visible() location in order to pick up the
new sysfs attribute definition.
[ dj: Fixed spelling errors noted by Benjamin ]
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251022203052.4078527-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
The cxl_acpi module spams "Extended linear cache calculation failed"
when the hmat memory target is not found for a node. This is normal
when the memory target does not contain extended linear cache
attributes. Adjust cxl_acpi_set_cache_size() to just return 0 if error
is returned from hmat_get_extended_linear_cache_size(). That is the
only error returned from hmat_get_extended_linear_cache_size() as
-ENOENT.
Also remove the check for -EOPNOTSUPP in cxl_setup_extended_linear_cache()
since that errno is never returned by cxl_acpi_set_cache_size().
[dj: Flipped minor return logic suggested by Jonathan ]
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251003185509.3215900-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
In preparation for adding a test module that can exercise the address
translation functions performed by the CXL Driver, refactor the XOR
implementation like this:
- Extract the core calculation into a standalone helper function,
- Export the new function for use by test module cxl_translate only,
- Enhance the parameter validation since this new function will be
called from a test module with no guarantee of valid parameters,
- Move the define of struct cxl_cxims_data to cxl.h so the test module
can build xormaps.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
In preparation for adding a test module that exercises the address
translation calculations, extract the core calculations into stand-
alone functions that operate on base parameters without dependencies
on struct cxl_region.
Perform additional parameter validation to protect against a test
module sending bad parameters. Export the validation function, as
well as the three core translation functions for use by test module
cxl_translate only.
This refactoring enables unit testing of the address translation logic
with controlled inputs, while preserving identical functionality in
the existing code paths.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
system_wq should be the per-cpu workqueue, yet in this name nothing makes
that clear, so replace system_wq with system_percpu_wq.
The old wq (system_wq) will be kept for a few release cycles.
See 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
for cause of changes.
[ dj: Add reference to commit that initiated the change. ]
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>> ---
Link: https://patch.msgid.link/20251030163839.307752-1-marco.crivellari@suse.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
- Corrected spelling of "bandwdith" -> "bandwidth"
- Fixed "wht" -> "with"
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
devm_cxl_port_enumerate_dports() is not longer used after below commit
commit 4f06d81e7c6a ("cxl: Defer dport allocation for switch ports")
Delete it and the relevant interface implemented in cxl_test.
Signed-off-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Traces of cxl_poison events include an hpa_alias0 field if the poison
address is in a region configured with an ELC, Extended Linear Cache.
Since the ELC always comes first in the region, the calculation needs
to subtract the ELC size from the calculated HPA address.
Fixes: 8c520c5f1e76 ("cxl: Add extended linear cache address alias emission for cxl events")
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
KASAN reports a stack-out-of-bounds access in validate_region_offset()
while running the cxl-poison.sh unit test because the printk format
specifier, %pr format, is not a match for the resource_size_t type of
the variables. %pr expects struct resource pointers and attempts to
dereference the structure fields, reading beyond the bounds of the
stack variables.
Since these messages emit an 'A exceeds B' type of message, keep
the resource_size_t's and use the %pa specifier to be architecture
safe.
BUG: KASAN: stack-out-of-bounds in resource_string.isra.0+0xe9a/0x1690
[] Read of size 8 at addr ffff88800a7afb40 by task bash/1397
...
[] The buggy address belongs to stack of task bash/1397
[] and is located at offset 56 in frame:
[] validate_region_offset+0x0/0x1c0 [cxl_core]
Fixes: c3dd67681c70 ("cxl/region: Add inject and clear poison by region offset")
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
match_region_by_range() is not using the helper function that also takes
extended linear cache size into account when comparing regions. This
causes a x2 region to show up as 2 partial incomplete regions rather
than a single CXL region with extended linear cache support. Replace
the open coded compare logic with the proper helper function for
comparison. User visible impact is that when 'cxl list' is issued,
no activa CXL region(s) are shown. There may be multiple idle regions
present. No actual active CXL region is present in the kernel.
[dj: Fix stable address]
Fixes: 0ec9849b6333 ("acpi/hmat / cxl: Add extended linear cache support for CXL")
Cc: stable@vger.kernel.org
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
The function takes two parameters and compares them. The second parameter
should be const since no modification should be done to it.
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
In order to compare the resource against the HMAT memory target,
the resource needs to be memory type. Change the DEFINE_RES()
macro to DEFINE_RES_MEM() in order to set the correct resource type.
hmat_get_extended_linear_cache_size() uses resource_contains()
internally. This causes a regression for platforms with the
extended linear cache enabled as the comparison always fails and the
cache size is not set. User visible impact is that when 'cxl list' is
issued, a CXL region with extended linear cache support will only
report half the size of the actual size. And this also breaks MCE
reporting of the memory region due to incorrect offset calculation
for the memory.
[dj: Fixup commit log suggested by djbw]
[dj: Fixup stable address for cc]
Fixes: 12b3d697c812 ("cxl: Remove core/acpi.c and cxl core dependency on ACPI")
Cc: stable@vger.kernel.org
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
cxl EDAC calls cxl_feature_info() to get the feature information and
if the hardware has no Features support, cxlfs may be passed in as
NULL.
[ 51.957498] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 51.965571] #PF: supervisor read access in kernel mode
[ 51.971559] #PF: error_code(0x0000) - not-present page
[ 51.977542] PGD 17e4f6067 P4D 0
[ 51.981384] Oops: Oops: 0000 [#1] SMP NOPTI
[ 51.986300] CPU: 49 UID: 0 PID: 3782 Comm: systemd-udevd Not tainted 6.17.0dj
test+ #64 PREEMPT(voluntary)
[ 51.997355] Hardware name: <removed>
[ 52.009790] RIP: 0010:cxl_feature_info+0xa/0x80 [cxl_core]
Add a check for cxlfs before dereferencing it and return -EOPNOTSUPP if
there is no cxlfs created due to no hardware support.
Fixes: eb5dfcb9e36d ("cxl: Add support to handle user feature commands for set feature")
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
port->nr_dports is used to represent how many dports added to the cxl
port, it will increase in add_dport() when a new dport is being added to
the cxl port, but it will not be reduced when a dport is removed from
the cxl port.
Currently, when the first dport is added to a cxl port, it will trigger
component registers setup on the cxl port, the implementation is using
port->nr_dports to confirm if the dport is the first dport.
A corner case here is that adding dport could fail after port->nr_dports
updating and before checking port->nr_dports for component registers
setup. If the failure happens during the first dport attaching, it will
cause that CXL subsystem has not chance to execute component registers
setup for the cxl port. the failure flow like below:
port->nr_dports = 0
dport 1 adding to the port:
add_dport() # port->nr_dports: 1
failed on devm_add_action_or_reset() or sysfs_create_link()
return error # port->nr_dports: 1
dport 2 adding to the port:
add_dport() # port->nr_dports: 2
no failure
skip component registers setup because of port->nr_dports is 2
The solution here is that moving component registers setup closer to
add_dport(), so if add_dport() is executed correctly for the first
dport, component registers setup on the port will be executed
immediately after that.
Fixes: f6ee24913de2 ("cxl: Move port register setup to when first dport appear")
Signed-off-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add changes to delay the allocation and setup of dports until when the
endpoint device is being probed. At this point, the CXL link is
established from endpoint to host bridge. Addresses issues seen on
some platforms when dports are probed earlier.
Link: https://lore.kernel.org/linux-cxl/20250829180928.842707-1-dave.jiang@intel.com/
|
|
This patch moves the port register setup to when the first dport appears
via the memdev probe path. At this point, the CXL link should be
established and the register access is expected to succeed. This change
addresses an error message observed when PCIe hotplug is enabled on
an Intel platform. The error messages "cxl portN: Couldn't locate the
CXL.cache and CXL.mem capability array header" is observed for the
host bridge (CHBCR) during cxl_acpi driver probe. If the cxl_acpi module
probe is running before the CXL link between the endpoint device and the
RP is established, then the platform may not have exposed DVSEC ID 3
and/or DVSEC ID 7 blocks which will trigger the error message. This
behavior is defined by the CXL spec r3.2 9.12.3 for RPs and DSPs, however
the Intel platform also added this behavior to the host bridge.
This change also needs the dport enumeration to be moved to the memdev
probe path in order to address the issue. This change is not a wholly
contained solution by itself.
[dj: Add missing var init during port alloc]
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Tested-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
While cxl_switch_parse_cdat() is harmless to be run multiple times, it is
not efficient in the current scheme where one dport is being updated at
a time by the memdev probe path. Change the input parameter to the
specific dport being updated to pick up the SSLBIS information for just
that dport.
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
With devm_cxl_switch_port_decoders_setup() being called within cxl_core
instead of by the port driver probe, adjustments are needed to deal with
circular symbol dependency when this function is being mock'd. Add the
appropriate changes to get around the circular dependency.
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
devm_cxl_add_dport_by_dev() outside of cxl_test is done through PCI
hierarchy. However with cxl_test, it needs to be done through the
platform device hierarchy. Add the mock function for
devm_cxl_add_dport_by_dev().
When cxl_core calls a cxl_core exported function and that function is
mocked by cxl_test, the call chain causes a circular dependency issue. Dan
provided a workaround to avoid this issue. Apply the method to changes from
the late dport allocation changes in order to enable cxl-test.
In cxl_core they are defined with "__" added in front of the function. A
macro is used to define the original function names for when non-test
version of the kernel is built. A bit of macros and typedefs are used to
allow mocking of those functions in cxl_test.
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Tested-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
The current implementation enumerates the dports during the cxl_port
driver probe. Without an endpoint connected, the dport may not be
active during port probe. This scheme may prevent a valid hardware
dport id to be retrieved and MMIO registers to be read when an endpoint
is hot-plugged. Move the dport allocation and setup to behind memdev
probe so the endpoint is guaranteed to be connected.
In the original enumeration behavior, there are 3 phases (or 2 if no CXL
switches) for port creation. cxl_acpi() creates a Root Port (RP) from the
ACPI0017.N device. Through that it enumerates downstream ports composed
of ACPI0016.N devices through add_host_bridge_dport(). Once done, it
uses add_host_bridge_uport() to create the ports that enumerate the PCI
RPs as the dports of these ports. Every time a port is created, the port
driver is attached, cxl_switch_porbe_probe() is called and
devm_cxl_port_enumerate_dports() is invoked to enumerate and probe
the dports.
The second phase is if there are any CXL switches. When the pci endpoint
device driver (cxl_pci) calls probe, it will add a mem device and triggers
the cxl_mem_probe(). cxl_mem_probe() calls devm_cxl_enumerate_ports()
and attempts to discovery and create all the ports represent CXL switches.
During this phase, a port is created per switch and the attached dports
are also enumerated and probed.
The last phase is creating endpoint port which happens for all endpoint
devices.
The new sequence is instead of creating all possible dports at initial
port creation, defer port instantiation until a memdev beneath that
dport arrives. Introduce devm_cxl_create_or_extend_port() to centralize
the creation and extension of ports with new dports as memory devices
arrive. As part of this rework, switch decoder target list is amended
at runtime as dports show up.
While the decoders are allocated during the port driver probe,
The decoders must also be updated since previously they were setup when
all the dports are setup. Now every time a dport is setup per endpoint,
the switch target listing need to be updated with new dport. A
guard(rwsem_write) is used to update decoder targets. This is similar to
when decoder_populate_target() is called and the decoder programming
must be protected.
Also the port registers are probed the first time when the first dport
shows up. This ensures that the CXL link is established when the port
registers are probed.
[dj] Use ERR_CAST() (Jonathan)
Link: https://lore.kernel.org/linux-cxl/20250305100123.3077031-1-rrichter@amd.com/
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Group the decoder setup code in switch and endpoint port probe into a
single function for each to reduce the number of functions to be mocked
in cxl_test. Introduce devm_cxl_switch_port_decoders_setup() and
devm_cxl_endpoint_decoders_setup(). These two functions will be mocked
instead with some functions optimized out since the mock version does
not do anything. Remove devm_cxl_setup_hdm(),
devm_cxl_add_passthrough_decoder(), and devm_cxl_enumerate_decoders() in
cxl_test mock code. In turn, mock_cxl_add_passthrough_decoder() can be
removed since cxl_test does not setup passthrough decoders.
__wrap_cxl_hdm_decode_init() and __wrap_cxl_dvsec_rr_decode() can be
removed as well since they only return 0 when called.
[dj: drop 'struct cxl_port' forward declaration (Robert)]
Suggested-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add a cached copy of the hardware port-id list that is available at init
before all @dport objects have been instantiated. Change is in preparation
of delayed dport instantiation.
Reviewed-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Tested-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Refactor the code in reap_dports() out to provide a helper function that
reaps a single dport. This will be used later in the cleanup path for
allocating a dport. Renaming to del_port() and del_dports() to mirror
devm_cxl_add_dport().
[dj] Fixed up subject per Robert
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add a helper to replace the open code detection of CXL device hierarchy
root, or the host bridge. The helper will be used for delayed downstream
port (dport) creation.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Tested-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
ACPICA commit 710745713ad3a2543dbfb70e84764f31f0e46bdc
This has been renamed in more recent CXL specs, as
type3 (memory expanders) can also use HDM-DB for
device coherent memory.
Link: https://github.com/acpica/acpica/commit/710745713ad3a2543dbfb70e84764f31f0e46bdc
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250908160034.86471-1-dave@stgolabs.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Update the CXL memory hotplug notifier to update the NUMA node access
coordinates directly rather than go through the HMAT memory hotplug
notifier.
|
|
Remove deadcode since CXL no longer calls hmat_update_target_coordinates().
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20250829222907.1290912-5-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
The current implementation of CXL memory hotplug notifier gets called
before the HMAT memory hotplug notifier. The CXL driver calculates the
access coordinates (bandwidth and latency values) for the CXL end to
end path (i.e. CPU to endpoint). When the CXL region is onlined, the CXL
memory hotplug notifier writes the access coordinates to the HMAT target
structs. Then the HMAT memory hotplug notifier is called and it creates
the access coordinates for the node sysfs attributes.
During testing on an Intel platform, it was found that although the
newly calculated coordinates were pushed to sysfs, the sysfs attributes for
the access coordinates showed up with the wrong initiator. The system has
4 nodes (0, 1, 2, 3) where node 0 and 1 are CPU nodes and node 2 and 3 are
CXL nodes. The expectation is that node 2 would show up as a target to node
0:
/sys/devices/system/node/node2/access0/initiators/node0
However it was observed that node 2 showed up as a target under node 1:
/sys/devices/system/node/node2/access0/initiators/node1
The original intent of the 'ext_updated' flag in HMAT handling code was to
stop HMAT memory hotplug callback from clobbering the access coordinates
after CXL has injected its calculated coordinates and replaced the generic
target access coordinates provided by the HMAT table in the HMAT target
structs. However the flag is hacky at best and blocks the updates from
other CXL regions that are onlined in the same node later on. Remove the
'ext_updated' flag usage and just update the access coordinates for the
nodes directly without touching HMAT target data.
The hotplug memory callback ordering is changed. Instead of changing CXL,
move HMAT back so there's room for the levels rather than have CXL share
the same level as SLAB_CALLBACK_PRI. The change will resulting in the CXL
callback to be executed after the HMAT callback.
With the change, the CXL hotplug memory notifier runs after the HMAT
callback. The HMAT callback will create the node sysfs attributes for
access coordinates. The CXL callback will write the access coordinates to
the now created node sysfs attributes directly and will not pollute the
HMAT target values.
A nodemask is introduced to keep track if a node has been updated and
prevents further updates.
Fixes: 067353a46d8c ("cxl/region: Add memory hotplug notifier for cxl region")
Cc: stable@vger.kernel.org
Tested-by: Marc Herbert <marc.herbert@linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20250829222907.1290912-4-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
0day reported warnings of:
drivers/cxl/core/region.c:3664:25: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'resource_size_t' {aka 'unsigned int'} [-Wformat=]
drivers/cxl/core/region.c:3671:37: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'resource_size_t' {aka 'unsigned int'} [-Wformat=]
Replace %#llx with %pr to emit resource_size_t arguments.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202508160513.NAZ9i9rQ-lkp@intel.com/
Cc: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20250818153953.3658952-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add support to allow expert users to inject and clear poison for the CXL
subsystem by writing a System Physical Address (SPA) to a debugfs file.
|
|
Add CXL region debugfs attributes to inject and clear poison based
on an offset into the region. These new interfaces allow users to
operate on poison at the region level without needing to resolve
Device Physical Addresses (DPA) or target individual memdevs.
The implementation uses a new helper, region_offset_to_dpa_result()
that applies decoder interleave logic, including XOR-based address
decoding when applicable. Note that XOR decodes rely on driver
internal xormaps which are not exposed to userspace. So, this support
is not only a simplification of poison operations that could be done
using existing per memdev operations, but also it enables this
functionality for XOR interleaved regions for the first time.
New debugfs attributes are added in /sys/kernel/debug/cxl/regionX/:
inject_poison and clear_poison. These are only exposed if all memdevs
participating in the region support both inject and clear commands,
ensuring consistent and reliable behavior across multi-device regions.
If tracing is enabled, these operations are logged as cxl_poison
events in /sys/kernel/tracing/trace.
The ABI documentation warns users of the significant risks that
come with using these capabilities.
A CXL Maturity Map update shows this user flow is now supported.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/f3fd8628ab57ea79704fb2d645902cd499c066af.1754290144.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
The core functions that validate and send inject and clear commands
to the memdev devices require holding both the dpa_rwsem and the
region_rwsem.
In preparation for another caller of these functions that must hold
the locks upon entry, split the work into a locked and unlocked pair.
Consideration was given to moving the locking to both callers,
however, the existing caller is not in the core (mem.c) and cannot
access the locks.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/1d601f586975195733984ca63d1b5789bbe8690f.1754290144.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add infrastructure to translate System Physical Addresses (SPA) to
Device Physical Addresses (DPA) within CXL regions. This capability
will be used by follow-on patches that add poison inject and clear
operations at the region level.
The SPA-to-DPA translation process follows these steps:
1. Apply root decoder transformations (SPA to HPA) if configured.
2. Extract the position in region interleave from the HPA offset.
3. Extract the DPA offset from the HPA offset.
4. Use position to find endpoint decoder.
5. Use endpoint decoder to find memdev and calculate DPA from offset.
6. Return the result - a memdev and a DPA.
It is Step 1 above that makes this a driver level operation and not
work we can push to user space. Rather than exporting the XOR maps for
root decoders configured with XOR interleave, the driver performs this
complex calculation for the user.
Steps 2 and 3 follow the CXL Spec 3.2 Section 8.2.4.20.13
Implementation Note: Device Decode Logic.
These calculations mirror much of the logic introduced earlier in DPA
to SPA translation, see cxl_dpa_to_hpa(), where the driver needed to
reverse the spec defined 'Device Decode Logic'.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/422f0e27742c6ca9a11f7cd83e6ba9fa1a8d0c74.1754290144.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
When DPA->SPA translation was introduced, it included a helper that
applied the XOR maps to do the CXL HPA -> SPA translation for XOR
region interleaves. In preparation for adding SPA->DPA address
translation, introduce the reverse callback.
The root decoder callback is defined generically and not all usages
may be self inverting like this XOR function. Add another root decoder
callback that is the spa_to_hpa function.
Update the existing cxl_xor_hpa_to_spa() with a name that reflects
what it does without directionality: cxl_apply_xor_maps(), a generic
parameter: addr replaces hpa, and code comments stating that the
function supports the translation in either direction.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/79d9d72230c599cae94d7221781ead6392ae6d3f.1754290144.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
The root decoder's HPA to SPA translation logic was implemented using
a single function pointer. In preparation for additional per-decoder
callbacks, convert this into a struct cxl_rd_ops and move the
hpa_to_spa pointer into it.
To avoid maintaining a static ops instance populated with mostly NULL
pointers, allocate the ops structure dynamically only when a platform
requires overrides (e.g. XOR interleave decoding).
The setup can be extended as additional callbacks are added.
Co-developed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/818530c82c351a9c0d3a204f593068dd2126a5a9.1754290144.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|