|
The DWC core clears the L1 Substates Supported bits unless the driver sets
the "dw_pcie.l1ss_support" flag.
The tegra194 init_host_aspm() sets "dw_pcie.l1ss_support" if the platform
has the "supports-clkreq" DT property. If "supports-clkreq" is absent,
"dw_pcie.l1ss_support" is not set, and the DWC core will clear the L1
Substates Supported bits.
The tegra194 code to clear the L1 Substates Supported bits is unnecessary,
so remove it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251118214312.2598220-3-helgaas@kernel.org
|
|
L1 PM Substates require the CLKREQ# signal and may also require
device-specific support. If CLKREQ# is not supported or driver support is
lacking, enabling L1.1 or L1.2 may cause errors when accessing devices,
e.g.,
nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
If the kernel is built with CONFIG_PCIEASPM_POWER_SUPERSAVE=y or users
enable L1.x via sysfs, users may trip over these errors even if L1
Substates haven't been enabled by firmware or the driver.
To prevent such errors, disable advertising the L1 PM Substates unless the
driver sets "dw_pcie.l1ss_support" to indicate that it knows CLKREQ# is
present and any device-specific configuration has been done.
Set "dw_pcie.l1ss_support" in tegra194 (if DT includes the
"supports-clkreq' property) and qcom (for cfg_2_7_0, cfg_1_9_0, cfg_1_34_0,
and cfg_sc8280xp controllers) so they can continue to use L1 Substates.
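As a minimal sketch (assuming the new dw_pcie.l1ss_support flag; the helper
name and DT lookup below mirror what tegra194 does but are illustrative
only), a glue driver opts back in like this:

  static void example_init_host_aspm(struct dw_pcie *pci)
  {
          struct device_node *np = pci->dev->of_node;

          /* Advertise L1 Substates only if CLKREQ# is known to be wired up */
          pci->l1ss_support = of_property_read_bool(np, "supports-clkreq");
  }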
Based on Niklas's patch:
https://patch.msgid.link/20251017163252.598812-2-cassel@kernel.org
[bhelgaas: drop hiding for endpoints]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251118214312.2598220-2-helgaas@kernel.org
|
|
As per DesignWare Cores PCI Express Controller Databook, section 5.50,
SII: Debug Signals, cxpl_debug_info[63:0]:
[5:0] smlh_ltssm_state: LTSSM current state. Encoding is same as the
dedicated smlh_ltssm_state output.
The mask should be 6 bits, from 0 to 5. Hence, fix the mask definition.
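For reference, a minimal sketch of the corrected check (macro names are
illustrative; the L0 encoding of 0x11 is per the DWC databook):

  #define LTSSM_STATE_MASK   GENMASK(5, 0)   /* cxpl_debug_info[5:0] */
  #define LTSSM_STATE_L0     0x11

  static bool example_link_up(u32 debug0)
  {
          return (debug0 & LTSSM_STATE_MASK) == LTSSM_STATE_L0;
  }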
Fixes: 23fe5bd4be90 ("PCI: keystone: Cleanup ks_pcie_link_up()")
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
[mani: reworded description]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/1763122140-203068-1-git-send-email-shawn.lin@rock-chips.com
|
|
TC9563 is a PCIe switch that has one upstream and three downstream ports.
One of the downstream ports is connected to an integrated ethernet MAC
endpoint. The other two downstream ports are available to connect to
external devices. A host connects to the TC9563 through its upstream port.
The TC9563 switch needs to be configured after power-on and before the PCIe
link is up.
The PCIe controller driver already enables link training on the host side
even before this driver probes. Because of this, when the driver enables
power to the switch, the switch participates in link training and the PCIe
link may come up before the switch is configured through I2C. Once the link
is up, configuration done through I2C has no effect. To prevent this,
disable link training on the host side so the link does not come up before
the switch is configured via I2C.
The TC9563 is configured through I2C based on DT properties and the type of
each port.
Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
[bhelgaas: squash fixes from
https://lore.kernel.org/r/20251120065116.13647-2-mani@kernel.org
https://lore.kernel.org/r/20251120065116.13647-3-mani@kernel.org]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20251101-tc9563-v9-6-de3429f7787a@oss.qualcomm.com
|
|
In this code:
used_buses = max_t(unsigned int, available_buses,
pci_hotplug_bus_size - 1);
max_t() casts the 'unsigned long' pci_hotplug_bus_size (either 32 or 64
bits) to 'unsigned int' (32 bits) result type, so there's a potential of
discarding significant bits.
Instead, use max(a, b), which casts 'unsigned int' to 'unsigned long' and
cannot discard significant bits.
In this case, pci_hotplug_bus_size is constrained to <= 0xff by pci_setup()
so this doesn't fix a bug, but it makes static analysis easier.
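A minimal userspace sketch of the difference (these macros only approximate
the kernel's max_t()/max() semantics; values are hypothetical):

  #include <stdio.h>

  #define max_t(type, a, b) ((type)(a) > (type)(b) ? (type)(a) : (type)(b))
  #define max(a, b)         ((a) > (b) ? (a) : (b))

  int main(void)
  {
          unsigned long big = 0x100000001UL;   /* wider than 32 bits (64-bit system) */
          unsigned int small = 8;

          /* The cast to 'unsigned int' discards the high bits of 'big'... */
          printf("max_t: %#lx\n", (unsigned long)max_t(unsigned int, small, big));
          /* ...while the usual conversions promote 'small' instead. */
          printf("max:   %#lx\n", max(small, big));
          return 0;
  }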
Signed-off-by: David Laight <david.laight.linux@gmail.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251119224140.8616-26-david.laight.linux@gmail.com
|
|
The KMSG_COMPONENT macro is a leftover of the s390 specific "kernel
message catalog" which never made it upstream.
Remove the macro in order to get rid of a pointless indirection. Replace
all users with the string it defines. In almost all cases this leads to a
simple replacement like this:
- #define KMSG_COMPONENT "appldata"
- #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
+ #define pr_fmt(fmt) "appldata: " fmt
Except for some special cases this is just mechanical/scripted work.
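For illustration, the resulting pattern in a typical user looks like this
(pr_fmt() must be defined before the printk headers are included; the
function is just an example):

  #define pr_fmt(fmt) "appldata: " fmt

  #include <linux/printk.h>

  static void example(void)
  {
          pr_info("sampling started\n");   /* logs "appldata: sampling started" */
  }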
Acked-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
The functionality implemented in the iproc driver in order to detect an
OF MSI controller node is now fully implemented in of_msi_xlate().
Replace the current msi-map/msi-parent parsing code with of_msi_xlate().
Since of_msi_xlate() is also a deviceID mapping API, pass in a fictitious
deviceID of 0 - the driver only needs to detect the OF MSI controller
node, not perform the deviceID mapping per se (the of_msi_xlate() return
value is ignored for the same reason).
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251021124103.198419-5-lpieralisi@kernel.org
|
|
Pick up OF changes to resolve dependencies
|
|
Provide access to the pci_p2pdma_map_type() function to allow subsystems
to determine the appropriate mapping type for P2PDMA transfers between
a provider and a target device.
The pci_p2pdma_map_type() function is the core P2P layer version of
the existing public, but struct page focused, pci_p2pdma_state()
function. It returns the same result. It is needed in order to use the
P2P subsystem from drivers that don't use the struct page layer.
Like __pci_p2pdma_update_state(), it is not an exported function. The
idea is that only subsystem code will implement mapping helpers that
take in phys_addr_t lists; this is deliberately not made accessible
to every driver, to prevent abuse.
Following patches will use this function to implement a shared DMA
mapping helper for DMABUF.
Tested-by: Alex Mastro <amastro@fb.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Ankit Agrawal <ankita@nvidia.com>
Link: https://lore.kernel.org/r/20251120-dmabuf-vfio-v9-4-d7f71607f371@nvidia.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
|
|
Refactor the PCI P2PDMA subsystem to separate the core peer-to-peer DMA
functionality from the optional memory allocation layer. This creates a
two-tier architecture:
The core layer provides P2P mapping functionality for physical addresses
based on PCI device MMIO BARs and integrates with the DMA API for
mapping operations. This layer is required for all P2PDMA users.
The optional upper layer provides memory allocation capabilities
including gen_pool allocator, struct page support, and sysfs interface
for user space access.
This separation allows subsystems like DMABUF to use only the core P2P
mapping functionality without the overhead of memory allocation features
they don't need. The core functionality is now available through the
new pcim_p2pdma_provider() function that returns a p2pdma_provider
structure.
Tested-by: Alex Mastro <amastro@fb.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Ankit Agrawal <ankita@nvidia.com>
Link: https://lore.kernel.org/r/20251120-dmabuf-vfio-v9-3-d7f71607f371@nvidia.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
|
|
Currently the P2PDMA code requires a pgmap and a struct page to
function. These were serving three important purposes:
- DMA API compatibility, where scatterlist required a struct page as
input
- Life cycle management, the percpu_ref is used to prevent UAF during
device hot unplug
- A way to get the P2P provider data through the pci_p2pdma_pagemap
The DMA API now has a new flow, and has gained phys_addr_t support, so
it no longer needs struct pages to perform P2P mapping.
Lifecycle management can be delegated to the user, DMABUF for instance
has a suitable invalidation protocol that does not require struct page.
Finding the P2P provider data can also be managed by the caller,
without needing to look it up from the phys_addr.
Split the P2PDMA code into two layers. The optional upper layer,
effectively, provides a way to mmap() P2P memory into a VMA by
providing struct page, pgmap, a genalloc and sysfs.
The lower layer provides the actual P2P infrastructure and is wrapped
up in a new struct p2pdma_provider. Rework the mmap layer to use new
p2pdma_provider based APIs.
Drivers that do not want to put P2P memory into VMAs can allocate a
struct p2pdma_provider after probe() starts and free it before
remove() completes. When DMA mapping the driver must convey the struct
p2pdma_provider to the DMA mapping code along with a phys_addr of the
MMIO BAR slice to map. The driver must ensure that no DMA mapping
outlives the lifetime of the struct p2pdma_provider.
The intended target of this new API layer is DMABUF. There is usually
only a single p2pdma_provider for a DMABUF exporter. Most drivers can
establish the p2pdma_provider during probe, access the single instance
during DMABUF attach and use that to drive the DMA mapping.
DMABUF provides an invalidation mechanism that can guarantee all DMA
is halted and the DMA mappings are undone prior to destroying the
struct p2pdma_provider. This ensures there is no UAF through DMABUFs
that are lingering past driver removal.
The new p2pdma_provider layer cannot be used to create P2P memory that
can be mapped into VMAs, used with pin_user_pages(), O_DIRECT, and
so on. These use cases must still use the mmap() layer. The
p2pdma_provider layer is principally for DMABUF-like use cases where
DMABUF natively manages the life cycle and access instead of
VMAs/pin_user_pages()/struct page.
In addition, remove the bus_off field from pci_p2pdma_map_state since
it duplicates information already available in the pgmap structure.
The bus_offset is only used in one location (pci_p2pdma_bus_addr_map)
and is always identical to pgmap->bus_offset.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Alex Mastro <amastro@fb.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Ankit Agrawal <ankita@nvidia.com>
Link: https://lore.kernel.org/r/20251120-dmabuf-vfio-v9-1-d7f71607f371@nvidia.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.18-rc7).
No conflicts, adjacent changes:
tools/testing/selftests/net/af_unix/Makefile
e1bb28bf13f4 ("selftest: af_unix: Add test for SO_PEEK_OFF.")
45a1cd8346ca ("selftests: af_unix: Add tests for ECONNRESET and EOF semantics")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use %ptSp instead of open coded variants to print content of
struct timespec64 in human readable format.
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20251113150217.3030010-16-andriy.shevchenko@linux.intel.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
Add support for assert_perst() for switches like TC9563, which require
configuration before the PCIe link is established. Such devices use this
function op to assert PERST# before configuring the device; once the
configuration is done, they deassert PERST#.
Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20251101-tc9563-v9-5-de3429f7787a@oss.qualcomm.com
|
|
Add .assert_perst() hook for dwc glue drivers to register with
assert_perst() of pci ops, allowing for better control over the link
initialization and shutdown process.
Implement assert_perst() function op for dwc drivers.
Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
[bhelgaas: squash dwc host support]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20251101-tc9563-v9-3-de3429f7787a@oss.qualcomm.com
Link: https://patch.msgid.link/20251101-tc9563-v9-4-de3429f7787a@oss.qualcomm.com
|
|
Update header inclusions to follow IWYU (Include What You Use)
principle.
In particular, replace of_gpio.h, which is subject to removal by the
GPIOLIB subsystem, with the respective headers that are being used by the
driver.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251114185534.3287497-1-andriy.shevchenko@linux.intel.com
|
|
pci_epc_mem_alloc_addr() allocates a CPU address from the ATU window phys
base and a page number. Set ep->page_size so the resulting CPU address
meets the alignment required by the ATU.
Fixes: 151f3d29baf4 ("PCI: stm32-ep: Add PCIe Endpoint support for STM32MP25")
Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
[mani: added fixes tag]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251114-atu_align_ep-v1-1-88da5366fa04@foss.st.com
|
|
If the host has deasserted PERST# and started link training before the link
is started on the EP side, enabling the LTSSM before the endpoint registers
are initialized in the perst_irq handler results in probing incorrect values.
Thus, wait for the PERST# level-triggered interrupt to start link training
at the end of initialization, and clean up the stm32_pcie_[start stop]_link
functions.
Fixes: 151f3d29baf4 ("PCI: stm32-ep: Add PCIe Endpoint support for STM32MP25")
Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
[mani: added fixes tag]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: wrap line]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251114-perst_ep-v1-1-e7976317a890@foss.st.com
|
|
Introduce a driver for the PCIe host controller found in the SpacemiT K1
SoC. The hardware is derived from the Synopsys DesignWare PCIe IP. The
driver supports up to three PCIe ports operating at link speeds of up to
5 GT/s. The first port uses a combo PHY, which may instead be configured
for USB3 use.
Signed-off-by: Alex Elder <elder@riscstar.com>
[mani: added FIXME to the comment on disabling ASPM L1]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Tested-by: Jason Montleon <jmontleo@redhat.com>
Tested-by: Johannes Erdfelt <johannes@erdfelt.com>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Link: https://patch.msgid.link/20251113214540.2623070-6-elder@riscstar.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Tariq Toukan says:
====================
mlx5-next updates 2025-11-13
The following pull-request contains common mlx5 updates
* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
net/mlx5: Expose definition for 1600Gbps link mode
net/mlx5: fs, set non default device per namespace
net/mlx5: fs, Add other_eswitch support for steering tables
net/mlx5: Add OTHER_ESWITCH HW capabilities
net/mlx5: Add direct ST mode support for RDMA
PCI/TPH: Expose pcie_tph_get_st_table_loc()
{rdma,net}/mlx5: Query vports mac address from device
====================
Link: https://patch.msgid.link/1763027252-1168760-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
- Cache the ASPM L0s/L1 Supported bits early so quirks can override
them if necessary (Bjorn Helgaas)
- Add quirks for PA Semi and Freescale Root Ports and a HiSilicon Wi-Fi
device that are reported to have broken L0s and L1 (Shawn Lin, Bjorn
Helgaas)
* tag 'pci-v6.18-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI/ASPM: Avoid L0s and L1 on Hi1105 [19e5:1105] Wi-Fi
PCI/ASPM: Avoid L0s and L1 on PA Semi [1959:a002] Root Ports
PCI/ASPM: Avoid L0s and L1 on Freescale [1957:0451] Root Ports
PCI/ASPM: Convert quirks to override advertised link states
PCI/ASPM: Add pcie_aspm_remove_cap() to override advertised link states
PCI/ASPM: Cache L0s/L1 Supported so advertised link states can be overridden
|
|
PCI/TSM sysfs for physical function 0 devices, i.e. the "DSM" (Device
Security Manager), contains the 'connect' and 'disconnect' attributes.
After a successful 'connect' operation, the DSM and its dependent functions
(SR-IOV virtual functions, non-zero multi-functions, or downstream
endpoints of a switch DSM) are candidates for being transitioned into a
TDISP (TEE Device Interface Security Protocol) operational state via
pci_tsm_bind(). At present sysfs is blind to which devices are capable of
TDISP operation and it is ambiguous which functions are serviced by which
DSMs.
Add a 'dsm' attribute to identify a function's DSM device, and add a
'bound' attribute to identify when a function has entered a TDISP
operational state.
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Samuel Ortiz <sameo@rivosinc.com>
Cc: Alexey Kardashevskiy <aik@amd.com>
Cc: Xu Yilun <yilun.xu@linux.intel.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251113021446.436830-9-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
A PCIe device function interface assigned to a TVM is a TEE Device
Interface (TDI). A TDI instantiated by pci_tsm_bind() needs additional
steps taken by the TVM to be accepted into the TVM's Trusted Compute
Boundary (TCB) and transitioned to the RUN state.
pci_tsm_guest_req() is a channel for the guest to request TDISP collateral,
like Device Interface Reports, and effect TDISP state changes, like
LOCKED->RUN transitions. Similar to IDE establishment and pci_tsm_bind(),
these are long running operations involving SPDM message passing via the
DOE mailbox.
The path for a TVM to invoke pci_tsm_guest_req() is:
* TSM triggers exit via guest-to-host-interface ABI (implementation specific)
* VMM invokes handler (KVM handle_exit() -> userspace io)
* handler issues request (userspace io handler -> ioctl() ->
pci_tsm_guest_req())
* handler supplies response
* VMM posts response, notifies/re-enters TVM
This path is purely a transport for messages from TVM to platform TSM. By
design the host kernel does not and must not care about the content of
these messages. I.e. the host kernel is not in the TCB of the TVM.
As this is an opaque passthrough interface, similar to fwctl, the kernel
requires that implementations stay within the bounds defined by 'enum
pci_tsm_req_scope'. Violation of those expectations likely has market and
regulatory consequences. Out of scope requests are blocked by default.
Co-developed-by: Xu Yilun <yilun.xu@linux.intel.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251113021446.436830-8-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
After a PCIe device has established a secure link and session between a TEE
Security Manager (TSM) and its local Device Security Manager (DSM), the
device or its subfunctions are candidates to be bound to a private memory
context, a TVM. A PCIe device function interface assigned to a TVM is a TEE
Device Interface (TDI).
pci_tsm_bind() requests the low-level TSM driver to associate the
device with private MMIO and private IOMMU context resources of a given TVM
represented by a @kvm argument. A device in the bound state corresponds to
the TDISP protocol LOCKED state and awaits validation by the TVM. It is a
'struct pci_tsm_link_ops' operation because, similar to IDE establishment,
it involves host side resource establishment and context setup on behalf of
the guest. It is also expected to be performed lazily to allow for
operation of the device in non-confidential "shared" context for pre-lock
configuration.
Co-developed-by: Xu Yilun <yilun.xu@linux.intel.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251113021446.436830-7-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The PCIe spec defines two types of streams - selective and link. Each
stream has an ID from the same bucket, so a stream ID does not tell the
type. The spec defines an "enable" bit for every stream and requires
stream IDs to be unique among all enabled streams, but there is no such
requirement for disabled streams.
However, when IDE_KM is programming keys, an IDE-capable device needs
to know the type of the stream being programmed so it can write the keys
directly to the hardware: keys are relatively large, there may be many of
them, and devices often struggle to keep around big data that is not
being used.
Walk through all streams on a device and initialise the IDs to some
unique number, both link and selective.
The weakest part of this proposal is the host bridge ide_stream_ids_ida.
Technically, a Stream ID only needs to be unique within a given partner
pair. However, with "anonymous" / unassigned streams there is no convenient
place to track the available ids. Proceed with an ida in the host bridge
for now, but consider moving this tracking to be an ide_stream_ids_ida per
device.
Co-developed-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251113021446.436830-6-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The address ranges for downstream Address Association Registers need to
cover memory addresses for all functions (PFs/VFs/downstream devices)
managed by a Device Security Manager (DSM). The proposed solution is to get
the memory (32-bit only) range and prefetchable-memory (64-bit capable)
range from the immediate ancestor downstream port (either the direct-attach
RP or the deepest switch port when switch-attached).
Similar to RID association, address associations will be set by default if
hardware sets 'Number of Address Association Register Blocks' in the
'Selective IDE Stream Capability Register' to a non-zero value. TSM drivers
can opt out of the settings by zeroing out unwanted / unsupported address
ranges. E.g. TDX Connect only supports prefetchable (64-bit capable)
memory ranges for the Address Association setting.
If the immediate downstream port provides both a memory range and a
prefetchable-memory range, but the IDE partner port only provides one Address
Association Register block, then the TSM driver can pick which range to
associate, or let the PCI core prioritize memory.
Note that the Address Association Register setup for upstream requests is
still uncertain, so it is not included.
Co-developed-by: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
Co-developed-by: Arto Merilainen <amerilainen@nvidia.com>
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251114010227.567693-1-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Use new PM_RUNTIME_ACQUIRE() and PM_RUNTIME_ACQUIRE_ERR() wrapper macros
to make the code look more straightforward.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
[ rjw: Typo fix in the changelog ]
Link: https://patch.msgid.link/3932581.kQq0lBPeGt@rafael.j.wysocki
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
PCIe r7.0, sec 7.8.6, defines resizable BAR sizes beyond the currently
supported maximum of 128 TB, which will require more than a u32 to store the
entire bitmask.
Convert Resizable BAR related functions to use u64 bitmask for BAR sizes to
make the typing more future-proof.
The support for the larger BAR sizes themselves is not added at this point.
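A minimal sketch of why the wider type is needed, assuming the usual shifted
encoding where bit n set in the possible-sizes bitmask means a BAR size of
(1 MB << n) is supported (the mask value is hypothetical):

  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
          /* Bit 27 is 128 TB; bit 35 (32 PB) no longer fits a 32-bit mask */
          uint64_t sizes = (1ULL << 8) | (1ULL << 27) | (1ULL << 35);

          for (unsigned int bit = 0; bit < 64; bit++)
                  if (sizes & (1ULL << bit))
                          printf("supported BAR size: %llu MB\n",
                                 (unsigned long long)1 << bit);
          return 0;
  }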
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patch.msgid.link/20251113180053.27944-12-ilpo.jarvinen@linux.intel.com
|
|
Add pci_rebar_get_max_size() to allow simplifying code that wants to know
the maximum possible size for a Resizable BAR.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patch.msgid.link/20251113180053.27944-9-ilpo.jarvinen@linux.intel.com
|
|
Many callers of pci_rebar_get_possible_sizes() are interested in finding
out if a particular encoded BAR Size (PCIe r7.0, sec 7.8.6.3) is supported
by the particular BAR.
Add pci_rebar_size_supported() into PCI core to make it easy for the
drivers to determine if the BAR size is supported or not.
Use the new function in pci_resize_resource() and in
pci_iov_vf_bar_set_size().
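The check itself is expected to boil down to a single bit test against the
possible-sizes bitmask; a minimal sketch (the function name is illustrative,
pci_rebar_get_possible_sizes() is the existing accessor):

  static bool example_rebar_size_supported(struct pci_dev *pdev, int bar, int size)
  {
          return pci_rebar_get_possible_sizes(pdev, bar) & BIT_ULL(size);
  }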
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patch.msgid.link/20251113180053.27944-6-ilpo.jarvinen@linux.intel.com
|
|
Fix the copy-paste errors in the kernel-doc of the Resizable BAR handling
functions and generally improve the wording.
Fix the formatting errors of the Return: line.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patch.msgid.link/20251113180053.27944-5-ilpo.jarvinen@linux.intel.com
|
|
pci_rebar_size_to_bytes() is in drivers/pci/pci.h but would be useful for
endpoint drivers as well.
Move the function to rebar.c and export it.
In addition, convert the open-coded literal to the define the number comes
from (PCI_REBAR_MIN_SIZE).
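The conversion itself is a simple shift (encoded size 0 is 1 MB and each
increment doubles the size); a minimal sketch, where the open-coded 20 is
the literal the patch replaces with PCI_REBAR_MIN_SIZE:

  static inline u64 example_rebar_size_to_bytes(int size)
  {
          return 1ULL << (size + 20);   /* 1 MB << size */
  }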
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patch.msgid.link/20251113180053.27944-4-ilpo.jarvinen@linux.intel.com
|
|
Move pci_rebar_bytes_to_size() from include/linux/pci.h to rebar.c, as it
is not entirely trivial and is not expected to be performance critical.
Convert literals to use a newly added PCI_REBAR_MIN_SIZE define.
Also add kernel doc for the function as the function is exported.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michael J. Ruhl <mjruhl@habana.ai>
Link: https://patch.msgid.link/20251113180053.27944-3-ilpo.jarvinen@linux.intel.com
|
|
For lack of a better place to put it, the Resizable BAR code has been placed
inside pci.c and setup-res.c, which do not otherwise use it. Upcoming
changes are going to add more Resizable BAR related functions, increasing
the code size.
As pci.c is huge as is, move the Resizable BAR related code and the BAR
resize code from setup-res.c to rebar.c.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patch.msgid.link/20251113180053.27944-2-ilpo.jarvinen@linux.intel.com
|
|
restore_dev_resource() copies saved addresses and flags from the struct
pci_dev_resource back to the struct resource, typically during rollback
from a failure or in preparation for a retry attempt.
If the resource is within the resource tree, it must not be
modified, as that could corrupt the resource tree. Thus, it is a bug to
call restore_dev_resource() for assigned resources (which did happen
due to logic flaws in the BAR resize rollback).
Add WARN_ON_ONCE() into restore_dev_resource() to detect such bugs easily
and return without altering the resource to prevent corruption.
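A minimal sketch of the guard (field and helper names follow the existing
setup-bus code; the exact 'assigned' check may differ):

  static void restore_dev_resource(struct pci_dev_resource *dev_res)
  {
          struct resource *res = dev_res->res;

          /* An assigned resource is in the resource tree; rewriting it would corrupt the tree */
          if (WARN_ON_ONCE(res->parent))
                  return;

          res->start = dev_res->start;
          res->end = dev_res->end;
          res->flags = dev_res->flags;
  }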
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU
Link: https://patch.msgid.link/20251113162628.5946-12-ilpo.jarvinen@linux.intel.com
|
|
As pci_resize_resource() is meant to be used also outside of PCI core,
document the interface with kerneldoc.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU
Link: https://patch.msgid.link/20251113162628.5946-8-ilpo.jarvinen@linux.intel.com
|
|
BAR resize operation is implemented in the pci_resize_resource() and
pbus_reassign_bridge_resources() functions. pci_resize_resource() can be
called either from __resource_resize_store() from sysfs or directly by the
driver for the Endpoint Device.
pci_resize_resource() requires that the caller has released the device
resources that share the bridge window with the BAR to be resized, as
otherwise the bridge window is pinned in place and cannot be changed.
pbus_reassign_bridge_resources() rolls back resources if the resize
operation fails, but the rollback is performed only for the bridge windows.
Because releasing the device resources is done by the caller of the BAR
resize interface, the functions performing the BAR resize do not have
access to the device resources as they were before the resize.
pbus_reassign_bridge_resources() could try __pci_bridge_assign_resources()
after rolling back the bridge windows to what they were; however, that would
not guarantee the resources are assigned, due to differences in how FW and
the kernel assign resources (alignment of the start address and tail).
To perform the rollback robustly, the BAR resize interface has to be altered
to also release the device resources that share the bridge window with the
BAR to be resized.
Also, remove restoring from the failed-entries list as the saved list should
now contain both the bridge windows and the device resources, so the extra
restore is duplicated work.
Some drivers (currently only amdgpu) want to prevent releasing some
resources. Add an exclude_bars parameter to pci_resize_resource() and make
amdgpu pass its register BAR (BAR 2 or 5), which should never be released
during the resize operation. Normally 64-bit prefetchable resources do not
share a
bridge window with the 32-bit only register BAR, but there are various
fallbacks in the resource assignment logic which may make the resources
share the bridge window in rare cases.
This change (together with the driver side changes) is to counter the
resource releases that had to be done to prevent resource tree corruption
in the ("PCI: Release assigned resource before restoring them") change. As
such, it likely restores functionality in cases where device resources were
released to avoid resource tree conflicts which appeared to be "working"
when such conflicts were not correctly detected by the kernel.
Reported-by: Simon Richter <Simon.Richter@hogyros.de>
Link: https://lore.kernel.org/linux-pci/f9a8c975-f5d3-4dd2-988e-4371a1433a60@hogyros.de/
Reported-by: Alex Bennée <alex.bennee@linaro.org>
Link: https://lore.kernel.org/linux-pci/874irqop6b.fsf@draig.linaro.org/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: squash amdgpu BAR selection from
https://lore.kernel.org/r/20251114103053.13778-1-ilpo.jarvinen@linux.intel.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patch.msgid.link/20251113162628.5946-7-ilpo.jarvinen@linux.intel.com
|
|
Freeing the saved list does not require holding pci_bus_sem, so the
critical section can be made shorter.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU
Link: https://patch.msgid.link/20251113162628.5946-6-ilpo.jarvinen@linux.intel.com
|
|
Usually, resizing BARs requires releasing bridge windows in order to
resize them so that a larger BAR fits into the window. This is not always
the case, however: FW could have made the window large enough to
accommodate the larger BAR as is, or the user might prefer to shrink a BAR
to make more space for another Resizable BAR.
Thus, replace the check that requires at least one bridge window to have
been released with a check that simply ensures the bridge is not NULL.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU
Link: https://patch.msgid.link/20251113162628.5946-5-ilpo.jarvinen@linux.intel.com
|
|
An upcoming fix to BAR resize will also store device BAR resources in the
saved list. Change the pci_dev variable in the loop from 'bridge' to
'dev', as the former would be misleading with non-bridges in the list.
This is in a separate change to reduce churn in the upcoming BAR resize
fix.
While it appears that the logic in the loop doing pci_setup_bridge() is
altered because the 'bridge' variable is no longer updated, a bridge should
never appear more than once in the saved list, so the check can only match
the first entry. As such, the code with two distinct pci_dev variables
better represents the intention of the check than the old code, where the
bridge variable was reused for a different purpose.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Link: https://patch.msgid.link/20251113162628.5946-4-ilpo.jarvinen@linux.intel.com
|
|
pci_rebar_set_size() adjusts BAR size for both normal and IOV BARs. The
struct pci_sriov keeps a cached copy of BAR size in ->barsz[] which is not
adjusted by pci_rebar_set_size() but by pci_iov_resource_set_size().
pci_iov_resource_set_size() is also called from
pci_resize_resource_set_size().
The current arrangement is problematic once the BAR resize algorithm starts
to roll back changes properly in case of a failure. The normal resource
fitting algorithm easily rolls back the resource size using the struct
pci_dev_resource, but also calling pci_resize_resource_set_size() or
pci_iov_resource_set_size() to roll back the BAR size would be an extra
burden, whereas combining the ->barsz[] update with pci_rebar_set_size()
naturally rolls it back when the old BAR size is restored on a different
layer of the BAR resize operation.
Thus, rework pci_rebar_set_size() to also update ->barsz[].
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU
Link: https://patch.msgid.link/20251113162628.5946-3-ilpo.jarvinen@linux.intel.com
|
|
pbus_reassign_bridge_resources() saves bridge windows into the saved
list before attempting to adjust resource assignments to perform a BAR
resize operation. If the resource adjustments cannot be completed fully,
rollback is attempted by restoring the resources from the saved list.
The rollback, however, does not check whether the resources it restores were
assigned by the partial resize attempt. If the restore changes the addresses
of such a resource, it can corrupt the resource tree.
An example of a corrupted resource tree with overlapping addresses:
6200000000000-6203fbfffffff : pciex@620c3c0000000
  6200000000000-6203fbff0ffff : PCI Bus 0030:01
    6200020000000-62000207fffff : 0030:01:00.0
  6200000000000-6203fbff0ffff : PCI Bus 0030:02
A resource that is assigned in the resource tree must remain
unchanged. Thus, release such a resource before attempting to restore it,
and then claim it back.
For simplicity, always do the release and claim back for the resource
even in the cases where it is restored to the same address range.
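A minimal sketch of that restore step (helper names as in setup-bus.c,
error handling omitted):

  static void example_restore_and_claim(struct pci_dev_resource *dev_res)
  {
          struct resource *res = dev_res->res;
          int idx = res - dev_res->dev->resource;   /* BAR/window index */

          if (res->parent)
                  release_resource(res);            /* detach from the resource tree first */
          restore_dev_resource(dev_res);            /* copy back the saved start/end/flags */
          pci_claim_resource(dev_res->dev, idx);    /* re-insert at the restored address */
  }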
Note: this fix may "break" some cases where devices "worked" because
the resource tree corruption allowed address space double counting to
fit more resources than can now be assigned without double
counting. The upcoming changes to BAR resizing should address those
scenarios (to the extent possible).
Fixes: 8bb705e3e79d ("PCI: Add pci_resize_resource() for resizing BARs")
Reported-by: Simon Richter <Simon.Richter@hogyros.de>
Link: https://lore.kernel.org/linux-pci/67840a16-99b4-4d8c-9b5c-4721ab0970a2@hogyros.de/
Reported-by: Alex Bennée <alex.bennee@linaro.org>
Link: https://lore.kernel.org/linux-pci/874irqop6b.fsf@draig.linaro.org/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU
Link: https://patch.msgid.link/20251113162628.5946-2-ilpo.jarvinen@linux.intel.com
|
|
Move the common Cadence PCIe RP controller functions into a separate file,
splitting the common library functions out of the legacy PCIe RP controller
code.
Signed-off-by: Manikandan K Pillai <mpillai@cadence.com>
[mani: removed the unused variable]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20251108140305.1120117-4-hans.zhang@cixtech.com
|
|
Split the Cadence PCIe header file by moving the Legacy (LGA) controller
register definitions to a separate header file, in order to support the
next-generation PCIe controller architecture.
Signed-off-by: Manikandan K Pillai <mpillai@cadence.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20251108140305.1120117-3-hans.zhang@cixtech.com
|
|
Add support for building the Cadence PCIe platform driver as a module.
Signed-off-by: Manikandan K Pillai <mpillai@cadence.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20251108140305.1120117-2-hans.zhang@cixtech.com
|
|
Assign the result of devm_gpiod_get_optional() directly to
pcie->reset_gpio, thereby removing the local variable.
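The resulting pattern, sketched below (the 'reset' con_id and the error
message are illustrative):

  pcie->reset_gpio = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_LOW);
  if (IS_ERR(pcie->reset_gpio))
          return dev_err_probe(dev, PTR_ERR(pcie->reset_gpio),
                               "Failed to get reset GPIO\n");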
Signed-off-by: Anand Moon <linux.amoon@gmail.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Link: https://patch.msgid.link/20251028154229.6774-3-linux.amoon@gmail.com
|
|
Use devm_clk_get_optional_enabled() helper instead of calling
devm_clk_get_optional() and then clk_prepare_enable().
Assign the result of devm_clk_get_optional_enabled() directly to
pcie->refclk to avoid using a local 'clk' variable.
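Sketch of the simplified probe path (the clock name is illustrative):

  pcie->refclk = devm_clk_get_optional_enabled(dev, "ref");
  if (IS_ERR(pcie->refclk))
          return dev_err_probe(dev, PTR_ERR(pcie->refclk),
                               "Failed to get and enable reference clock\n");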
Signed-off-by: Anand Moon <linux.amoon@gmail.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Link: https://patch.msgid.link/20251028154229.6774-2-linux.amoon@gmail.com
|
|
The 'pci-keystone.c' driver is the application/glue/wrapper driver for the
Designware PCIe Controllers on TI SoCs. Now that all of the helper APIs
that the 'pci-keystone.c' driver depends upon have been exported for use,
enable support to build the driver as a loadable module.
When building the driver as a module, the functions marked with the '__init'
keyword may be invoked after the init memory has been freed by the kernel.
This results in an exception of the form:
Unable to handle kernel paging request at virtual address ...
Mem abort info:
...
pc : ks_pcie_host_init+0x0/0x540
lr : dw_pcie_host_init+0x170/0x498
...
ks_pcie_host_init+0x0/0x540 (P)
ks_pcie_probe+0x728/0x84c
platform_probe+0x5c/0x98
really_probe+0xbc/0x29c
__driver_probe_device+0x78/0x12c
driver_probe_device+0xd8/0x15c
To address this, introduce a new function, namely 'ks_pcie_init()', to
register the 'fault handler', while removing the '__init' keyword from the
existing functions.
Note that hook_fault_code() is defined as an '__init' function. Since init
functions should never be called at runtime (after the init memory has been
freed), the driver is made built-in when CONFIG_ARM (where
hook_fault_code() is used) is selected.
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
[mani: added a note about hook_fault_code()]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251029080547.1253757-5-s-vadapalli@ti.com
|
|
The pci-keystone.c driver uses the functions 'dw_pcie_allocate_domains()'
and 'dw_pcie_ep_raise_msix_irq()'. Export them in preparation for enabling
the pci-keystone.c driver to be built as a loadable module.
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251029080547.1253757-3-s-vadapalli@ti.com
|
|
The pci-keystone.c driver uses the 'pci_get_host_bridge_device()' helper.
Export it in preparation for enabling the pci-keystone.c driver to be built
as a loadable module.
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251029080547.1253757-2-s-vadapalli@ti.com
|