Introduce a new command for RDMA event monitoring.
This patch adds a new attribute "event_type" which describes
the event recieved. Add a new NETLINK_RDMA multicast group
and processes listening to this multicast group receive RDMA
events.
The event types supported are IB device registration/unregistration
and net device attachment/detachment.
Example output of rdma monitor and the commands which trigger
the events:
$ rdma monitor
$ rmmod mlx5_ib
[UNREGISTER] dev 3 rocep8s0f1
[UNREGISTER] dev 2 rocep8s0f0
$modprobe mlx5_ib
[REGISTER] dev 4 mlx5_0
[NETDEV_ATTACH] dev 4 mlx5_0 port 1 netdev 4 eth2
[REGISTER] dev 5 mlx5_1
[NETDEV_ATTACH] dev 5 mlx5_1 port 1 netdev 5 eth3
$ devlink dev eswitch set pci/0000:08:00.0 mode switchdev
[UNREGISTER] dev 4 rocep8s0f0
[REGISTER] dev 6 mlx5_0
[NETDEV_ATTACH] dev 6 mlx5_0 port 30 netdev 4 eth2
$ echo 4 > /sys/class/net/eth2/device/sriov_numvfs
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 2 netdev 7 eth4
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 3 netdev 8 eth5
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 4 netdev 9 eth6
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 5 netdev 10 eth7
[REGISTER] dev 7 mlx5_0
[NETDEV_ATTACH] dev 7 mlx5_0 port 1 netdev 11 eth8
[REGISTER] dev 8 mlx5_0
[NETDEV_ATTACH] dev 8 mlx5_0 port 1 netdev 12 eth9
[REGISTER] dev 9 mlx5_0
[NETDEV_ATTACH] dev 9 mlx5_0 port 1 netdev 13 eth10
[REGISTER] dev 10 mlx5_0
[NETDEV_ATTACH] dev 10 mlx5_0 port 1 netdev 14 eth11
$ echo 0 > /sys/class/net/eth2/device/sriov_numvfs
[UNREGISTER] dev 7 rocep8s0f0v0
[UNREGISTER] dev 8 rocep8s0f0v1
[UNREGISTER] dev 9 rocep8s0f0v2
[UNREGISTER] dev 10 rocep8s0f0v3
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 2
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 3
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 4
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 5
Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
The version field in mnlu was being passed in but never set.
This meant that all places mnlu_gen_socket was used, the version would
be uninitialized data from malloc().
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Don't trust the kernel to send payload of certain size. Sanitize that by
checking the payload length in mnlu_cb_stop() and mnlu_cb_error() and
only access the payload if it is of required size.
Note that for mnlu_cb_stop(), this is happening already for example
with devlink resource. Kernel sends NLMSG_DONE with zero size payload.
Fixes: 049c58539f5d ("devlink: mnlg: Add support for extended ack")
Fixes: c934da8aaacb ("devlink: mnlg: Catch returned error value of dumpit commands")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Benefit from GET_POLICY command of ctrl netlink and introduce a helper
that dumps policies and finds out, if there is a separate policy
specified for dump op of specified command.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
There is no rule to have an error code after NLMSG_DONE msg. The only reason
we has this offset is that kernel function netlink_dump_done() has an error
code followed by the netlink message header.
Making nl_dump_ext_ack_done() has an offset parameter. So we can adjust
this for NLMSG_DONE message without error code.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
mnlu_gen_socket_open opens a socket and configures it for use with a
generic netlink family. As part of this process it sends a
CTRL_CMD_GETFAMILY to get the ID for the family name requested.
In addition to the family id, this command reports a few other useful
values including the maximum attribute. The maximum attribute is useful in
order to know whether a given attribute is supported and for knowing the
necessary size to allocate for other operations such as policy dumping.
Since we already have to issue a CTRL_CMD_GETFAMILY to get the id, we can
also store the maximum attribute as well. Modify the callback functions to
parse the maximum attribute NLA and store it in the mnlu_gen_socket
structure.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Introduce helper for generic socket receive helper and introduce helper
to build command with custom family and version.
Use API in subsequent devlink patch.
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Subsequent patch needs to
(a) query and use socket family
(b) send/receive messages using this family
Hence add helper routines to open, close, query family and to perform
send receive operations.
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Receiving a message in libmnl is a somewhat involved operation. Devlink's
mnlg library has an implementation that is going to be handy for other
tools as well. Extract it into a new helper.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Allocation of a new netlink message with the two usual headers is reusable
with other netlink netlink message types. Extract it into a helper,
mnlu_msg_prepare(). Take the second header as an argument, instead of
passing in parameters to initialize it, and copy it in.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
This little dance of mnl_socket_open(), option setting, and bind, is the
same regardless of tool. Extract into a new module that should hold helpers
for working with libmnl, mnl_util.c.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>