
Known Limitations

The following is a list of current limitations of the Fabric, which we are working hard to address:

Deleting a VPC and creating a new one right away can cause the agent to fail

The issue is due to limitations in SONiC's gNMI interface. In this particular case, deleting a VPC and creating a new one back-to-back (e.g. using a script or the Kubernetes API) can lead to the reuse of the deleted VPC's VNI before the deletion has taken effect on the switch.
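For example, a back-to-back sequence like the following (the VPC name and manifest file are placeholders) can trigger the race, because the newly created VPC may be assigned the VNI of the deleted one before the switch has finished removing it:

# Delete a VPC and immediately recreate one; the second command can race the first
kubectl delete vpc vpc-02
kubectl apply -f vpc-02.yaml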

Diagnosing this issue

The applied generation of the affected agent reported by kubectl will not converge to the last desired generation. Additionally, the agent logs on the switch (accessible at /var/log/agent.log) will contain an error similar to the following one:

level=ERROR msg=Failed err="failed to run agent: failed to process agent config from k8s: failed to process agent config loaded from k8s: failed to apply actions: GNMI set request failed: gnmi set request failed: rpc error: code = InvalidArgument desc = VNI is already used in VRF VrfVvpc-02"
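For example, the mismatch can be checked with kubectl. This is a minimal sketch, assuming the Agent resource is named after the affected switch (here s5248-01); the exact status field names depend on the Agent CRD version in use:

# Compare the desired generation with the applied generation reported in the status
kubectl get agent s5248-01 -o yaml | grep -iE 'generation|applied'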

Known workarounds

Deleting the pending VPCs will allow the agent to reconverge. After that, the desired VPCs can be safely created.
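A minimal sketch of the workaround with kubectl, assuming the stuck VPC is named vpc-02 and its definition is kept in vpc-02.yaml (both names are placeholders):

# Remove the pending VPC so the agent can reconverge
kubectl delete vpc vpc-02
# Once the agent has converged, recreate the desired VPC
kubectl apply -f vpc-02.yaml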

Configuration not allowed when port is member of PortChannel

This is another issue related to limitations in SONiC's gNMI interface. When a port is a member of a PortChannel, the switch rejects certain interface-level configuration, and the agent can find itself unable to apply the desired state.

Diagnosing this issue

The applied generation of the affected agent reported by kubectl will not converge to the last desired generation. Additionally, the agent logs on the switch (accessible at /var/log/agent.log) will contain logs similar to the following ones:

level=DEBUG msg=Action idx=4 weight=13 summary="Update Interface Ethernet1 Base" command=update path="/interfaces/interface[name=Ethernet1]"

level=ERROR msg=Failed err="failed to run agent: failed to process agent config from file: failed to apply actions: GNMI set request failed: gnmi set request failed: rpc error: code = InvalidArgument desc = Configuration not allowed when port is member of Portchannel."

Known workarounds

Manually set the interface mentioned immediately before the error log to admin-up, e.g. using the sonic-cli on the affected switch. For example, given the logs above, one would ssh into the switch (let's call it s5248-01) and run:

admin@s5248-01:~$ sonic-cli
s5248-01# configure
s5248-01(config)# interface Ethernet 1
s5248-01(config-if-Ethernet1)# no shutdown

External peering over a connection originating from an MCLAG switch can fail

When importing routes via External Peering over a connection originating from an MCLAG leaf switch, traffic from the peered VPC towards that prefix can be blackholed. This is due to a routing mismatch between the two MCLAG leaves, where only one switch learns the imported route. Packets hitting the "wrong" leaf will be dropped with a Destination Unreachable error.

Diagnosing this issue

There is no connectivity from the workload server(s) in the VPC towards the prefix routed via the external.
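One way to confirm the routing mismatch is to compare the routing table of the peered VPC's VRF on both MCLAG leaves; only one of them will show the imported prefix. A rough sketch, assuming the leaves are named s5248-01 and s5248-02 and the VRF backing the VPC is VrfVvpc-01 (all names are placeholders; use the VRF name reported by the switch):

admin@s5248-01:~$ sonic-cli
s5248-01# show ip route vrf VrfVvpc-01

admin@s5248-02:~$ sonic-cli
s5248-02# show ip route vrf VrfVvpc-01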

Known workarounds

Connect your externals to non-MCLAG switches instead.

Changing a StaticExternal's withinVPC field makes the agent fail

Once a StaticExternal connection is created, changing its withinVPC field (e.g. from vpc-01 to an empty string) will cause the agent to repeatedly fail due to a gNMI ordering issue.
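For reference, the current value of the field can be inspected on the Connection object with kubectl; the connection name static-ext-01 below is hypothetical:

# Show the current withinVPC value of the StaticExternal connection
kubectl get connection static-ext-01 -o yaml | grep withinVPC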

Diagnosing this issue

The applied generation of the affected agent reported by kubectl will not converge to the last desired generation. Additionally, the agent logs on the switch (accessible at /var/log/agent.log) will contain an error similar to the following one:

level=ERROR msg=Failed err="failed to run agent: failed to process agent config from k8s: failed to process agent config loaded from k8s: failed to apply actions: GNMI set request failed: gnmi set request failed: rpc error: code = InvalidArgument desc = L3 Configuration exists for Interface: Ethernet0"

Known workarounds

Deleting the StaticExternal connection and creating it from scratch with the desired withinVPC parameter will solve the issue.
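A rough sketch of the workaround with kubectl, assuming the affected connection is named static-ext-01 and its desired definition (with the correct withinVPC value) is stored in static-ext-01.yaml; both names are placeholders:

# Delete the existing StaticExternal connection
kubectl delete connection static-ext-01
# Recreate it with the desired withinVPC parameter
kubectl apply -f static-ext-01.yaml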