16 July 2020

Dmitriy Vitvitskii

Software Developer

Juniper: how to grow a juniper at home. How to write a mock server for Juniper switches


My name is Dmitriy, and I am a software developer on DCImanager, the equipment management panel by ISPsystem. I spent a long time on the team developing switch management software. Together we have been through ups and downs: from developing hardware management services to failures of the office network and hours-long “dates” in the server room, hoping not to lose our loved ones.

Eventually, the time came for testing. We were able to cover some of the switch handlers with ready-made testing solutions. However, that was not the case with Juniper. The research and implementation we went through inspired this article.

DCImanager supports different equipment types: switches, PDUs, and servers. The panel currently supports four switch handlers: two work via SNMP (Cisco Catalyst and snmp common) and two via NETCONF (Juniper with and without ELS support).

All work with equipment is heavily covered by tests. Real equipment is not suitable for automated testing: tests are launched on every push to the repository and run in parallel. Therefore, we try to use emulators.

We were able to cover the SNMP protocol handlers with tests by using the SNMP Agent Simulator library. But in the case of Juniper, we ran into problems. After looking for ready-made solutions, we picked a couple of libraries, but one of them did not start, and the other did not do what we needed. In fact, I spent even more time trying to bring that little wonder to life.

So, the question was: how do we emulate Juniper switches? Juniper is managed via NETCONF, which runs over SSH. The idea of writing a small service that would work over SSH and emulate the switch came to mind. Accordingly, we needed the service itself as well as a Juniper "snapshot" for data emulation.

In snmpsim, a snapshot refers to a complete copy of the switch status, with all its supported OIDs and their current values.

However, with Juniper things are slightly more complicated: no such snapshot can be created. In this case, a snapshot refers to a set of query-response templates.
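
One way to picture such a snapshot is a mapping from an RPC name to a response template. The sketch below is only an illustration under assumptions, not the format used in DCImanager: the `SNAPSHOT` dictionary, the `render_reply` helper, and the XML bodies are all hypothetical.

```python
from string import Template

# Hypothetical snapshot: RPC name -> response template.
# The XML bodies are illustrative, not captured from a real switch.
SNAPSHOT = {
    "get-interface-information": Template(
        "<interface-information>"
        "<physical-interface><name>$name</name>"
        "<oper-status>$status</oper-status></physical-interface>"
        "</interface-information>"
    ),
    "get-software-information": Template(
        "<software-information><host-name>$hostname</host-name>"
        "</software-information>"
    ),
}


def render_reply(rpc_name, **params):
    """Fill the template for rpc_name, or return an <rpc-error> stub."""
    template = SNAPSHOT.get(rpc_name)
    if template is None:
        return "<rpc-error><error-tag>operation-not-supported</error-tag></rpc-error>"
    return template.substitute(**params)


print(render_reply("get-interface-information", name="ge-0/0/1", status="up"))
```

Keeping the templates as plain data means a new query can be supported by adding an entry, without touching the server code.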

Part one: the architecture of planting

We are now actively developing a whole “zoo” of handlers for different switches. Soon we will have new switch handlers, but not all of them will be covered with ready-made testing solutions. However, we can try to write a base architecture of the service that will simulate different devices on different protocols.

In the simplest case, this is a factory which, depending on the protocol and the handler (some switches can run on several protocols), returns a switch object that already implements all of its behavior logic. In the case of Juniper, that object is a small query parser: depending on the incoming rpc query and its parameters, it performs the necessary actions.
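
A minimal sketch of such a factory might look as follows. The registry layout, the placeholder classes, and the `get(handler, protocol)` signature are assumptions made for illustration; the real switch objects would carry the emulation logic.

```python
class JuniperMock:
    """Placeholder for a NETCONF-based Juniper emulator."""
    protocol = "netconf"


class CatalystMock:
    """Placeholder for an SNMP-based Cisco Catalyst emulator."""
    protocol = "snmp"


class SwitchFactory:
    """Returns a mock switch object for a (handler, protocol) pair."""

    # Hypothetical registry; keys are (handler name, protocol).
    _registry = {
        ("juniper", "netconf"): JuniperMock,
        ("catalyst", "snmp"): CatalystMock,
    }

    def get(self, handler, protocol=None):
        # With protocol=None the first registered protocol of the handler wins
        for (name, proto), cls in self._registry.items():
            if name == handler and protocol in (None, proto):
                return cls()
        raise ValueError(f"no mock for handler={handler!r}, protocol={protocol!r}")


switch = SwitchFactory().get("juniper")
print(type(switch).__name__, switch.protocol)
```

Keying the registry by both handler and protocol leaves room for a switch that can be reached over several protocols.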

Important restriction: we will not be able to fully simulate the operation of the switch. Describing all the logic would take a long time, and whenever we add new functionality to the actual switch handler, we will also have to adjust the switch's mock server.

Part two: choosing the right soil for planting

Our eye fell on the paramiko library, which provides a convenient interface for working over SSH. To begin with, we wanted to check the basics, such as a connection and a simple query, rather than work out the architecture in detail. We are doing research here, after all. So, we did not concern ourselves with authorization: a combination of a simple ServerInterface and a socket server gave us something like a working option:

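import socket
import paramiko

# Placeholder credentials for the experiment
SSH_USER_NAME = "admin"
SSH_USER_PASSWORD = "admin"

class SshServer(paramiko.ServerInterface):
    def check_auth_password(self, user, password):
        if user == SSH_USER_NAME and password == SSH_USER_PASSWORD:
            return paramiko.AUTH_SUCCESSFUL
        return paramiko.AUTH_FAILED

    def check_channel_request(self, kind, chanid):
        # The default implementation rejects every channel
        if kind == "session":
            return paramiko.OPEN_SUCCEEDED
        return paramiko.OPEN_FAILED_ADMINISTRATIVELY_PROHIBITED

# Named "sock" so that the socket module is not shadowed
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("127.0.0.1", 8300))
sock.listen(10)

client, address = sock.accept()
session = paramiko.Transport(client)
# A server transport needs a host key; generated on the fly here
session.add_server_key(paramiko.RSAKey.generate(2048))

server = SshServer()
session.start_server(server=server)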

An approximate implementation of something you would like to see, but it looks scary

When the client connects to the server, the latter should respond with a list of its capabilities. For example:

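reply = """
    <hello>
     <capabilities>
      <capability>urn:ietf:params:xml:ns:netconf:base:1.0</capability>
      <capability>xml.juniper.net/netconf/junos/1.0</capability>
      <capability>xml.juniper.net/dmi/system/1.0</capability>
     </capabilities>
     <session-id>1</session-id>
    </hello>
    ]]>]]>
"""
# Wait for the client to open a channel and send the hello over it:
# data cannot be sent through the listening socket itself
channel = session.accept(timeout=20)
channel.send(reply)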

Yes, this is XML ]]>]]>

In case you wondered, the code is unstable: this implementation has a problem with socket closure. I was able to find a couple of reported issues in paramiko describing this problem. We put this option aside and decided to try the remaining one.

Part three: planting

The ace up our sleeve was Twisted. This is a network application development framework, which supports a large number of protocols. It has extensive documentation and the fantastic Cred module that would help us.

Cred is an authentication mechanism that allows different network protocols to connect to the system depending on your requirements.

To organize the entire logic, Realm was used — the part of the application responsible for business logic and access to its objects. However, first things first.

The core of the login system is Portal. If we want to write a service on top of a network protocol, we need to define a standard Portal. It already includes two methods:

  • login (provides client access to the subsystem)
  • registerChecker (verification of credentials).

A Realm object is used to connect the business logic to the authentication system. Since the client is already authorized by this point, the logic of our service over SSH starts here. This interface has a single method, requestAvatar, which is called upon successful authorization in Portal and returns the main object, SwitchProtocolAvatar:

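from twisted.cred import portal
from zope.interface import implementer


@implementer(portal.IRealm)
class SwitchRealm(object):
    def __init__(self, switch_obj):
        self.switch_obj = switch_obj

    def requestAvatar(self, avatarId, mind, *interfaces):
        # IRealm contract: return (interface, avatar, logout callable).
        # The keyword must match SwitchProtocolAvatar.__init__ (switch_core)
        user = SwitchProtocolAvatar(avatarId, switch_core=self.switch_obj)
        return interfaces[0], user, lambda: None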
The simplest Realm object implementation returning the required Avatar

Special objects called Avatars are in charge of managing the business logic. In our case, the service over SSH starts here. When a query is sent, the data reach SwitchProtocolAvatar, which checks the requested subsystem and updates the configuration:

from twisted.conch import avatar
from twisted.conch.ssh import session


class SwitchProtocolAvatar(avatar.ConchUser):
    def __init__(self, username, switch_core):
        avatar.ConchUser.__init__(self)
        self.username = username
        self.channelLookup.update({b'session': session.SSHSession})

        # Register the NETCONF subsystem only if the switch supports it
        netconf_protocol = switch_core.get_netconf_protocol()
        if netconf_protocol:
            self.subsystemLookup.update({b'netconf': netconf_protocol})
Checking the subsystem and updating the configuration, provided that the switch processing is running on NETCONF

Speaking of protocols. Bearing in mind that we are working with NETCONF, let us move on to the implementation. To write services on top of existing protocols and to implement our own logic, we use Protocol. The interface of this class is simple:

  • dataReceived: called to process incoming data;
  • makeConnection: called to establish the connection;
  • connectionMade: called once the connection has been established. Here we can define some logic before the client starts sending queries. In our case, we need to send the list of our capabilities.

from twisted.internet.protocol import Protocol


class Netconf(Protocol):
    def __init__(self, capabilities=None):
        self.session_count = 0
        self.capabilities = capabilities or []

    def __call__(self, *args, **kwargs):
        # ConchUser calls the subsystem entry like a class; reuse this instance
        return self

    def connectionMade(self):
        self.session_count += 1
        self.send_capabilities()

    def send_capabilities(self):
        rpc_capabilities_reply = "<hello><capabilities>{capabilities}</capabilities>" \
                                 "<session-id>{session_id}</session-id></hello>]]>]]>"
        rpc_capabilities = "".join(f"<capability>{cap}</capability>" for cap in self.capabilities)
        reply = rpc_capabilities_reply.format(capabilities=rpc_capabilities,
                                              session_id=self.session_count)
        # Twisted transports accept bytes, not str
        self.transport.write(reply.encode())

    def dataReceived(self, data):
        # Process received data
        pass
Minimum implementation of a wrapper on top of the protocol. A part of logic is not shown for clarity
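
Since dataReceived works on a byte stream, a single call may carry part of a message or several messages at once. A possible way to handle this, sketched here as an assumption rather than as the logic omitted above, is to buffer the bytes and split them on the `]]>]]>` end-of-message marker. The `MessageFramer` class and its `feed` method are hypothetical names.

```python
DELIMITER = b"]]>]]>"  # NETCONF 1.0 end-of-message marker


class MessageFramer:
    """Accumulates raw bytes and returns complete NETCONF messages."""

    def __init__(self):
        self._buffer = b""

    def feed(self, data):
        """Add a chunk; return the list of messages completed by it."""
        self._buffer += data
        # Everything after the last delimiter stays buffered for the next call
        *messages, self._buffer = self._buffer.split(DELIMITER)
        return [m.strip() for m in messages if m.strip()]


framer = MessageFramer()
print(framer.feed(b"<rpc><get-config/>"))
print(framer.feed(b"</rpc>]]>]]><rpc><close-"))
print(framer.feed(b"session/></rpc>]]>]]>"))
```

Inside dataReceived, each message returned by `feed` could then be passed to the switch object for processing.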

Here we start to wrap the layers of our nesting doll. Since our service runs over SSH, we need to implement the SSH server logic. In it, we specify the keys for the server and the processing modules for SSH services. Our interest in the implementation of this class is limited, since the authorization will be password-based:

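from twisted.conch.ssh import connection, factory, keys, userauth
from twisted.conch.ssh.transport import SSHServerTransport

# SERVER_RSA_PUBLIC, SERVER_RSA_PRIVATE (key file paths) and PRIMES
# are project constants defined elsewhere


class SshServerFactory(factory.SSHFactory):
    protocol = SSHServerTransport

    publicKeys = {b'ssh-rsa': keys.Key.fromFile(SERVER_RSA_PUBLIC)}
    privateKeys = {b'ssh-rsa': keys.Key.fromFile(SERVER_RSA_PRIVATE)}

    services = {
        b'ssh-userauth': userauth.SSHUserAuthServer,
        b'ssh-connection': connection.SSHConnection
    }

    def getPrimes(self):
        return PRIMES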
Implementation of the SSH server

For the SSH server to work, we need to define the session logic, which operates regardless of the protocol we decide to use or the interface that is requested:

from twisted.conch.ssh import session
from twisted.internet import protocol


class EchoProtocol(protocol.Protocol):
    def dataReceived(self, data):
        if data == b'\r':
            data = b'\r\n'
        elif data == b'\x03':  # Ctrl+C
            self.transport.loseConnection()
            return
        self.transport.write(data)


class Session:
    def __init__(self, avatar):
        pass

    def getPty(self, term, windowSize, attrs):
        pass

    def execCommand(self, proto, cmd):
        pass

    def openShell(self, transport):
        # Local name "echo" avoids shadowing the imported protocol module
        echo = EchoProtocol()
        echo.makeConnection(transport)
        transport.makeConnection(session.wrapProtocol(echo))

    def eofReceived(self):
        pass

    def closed(self):
        pass
Session logic for all interfaces described

I have nearly forgotten about the switch handler itself. After all the checks and authorizations, control passes to the object that emulates the switch. Here you can define the query processing logic: retrieving or editing interfaces, device configuration, and so on.

class Juniper:
    def __init__(self):
        self.protocol = Netconf(capabilities=self.capabilities())

    def get_netconf_protocol(self):
        return self.protocol

    @staticmethod
    def capabilities():
        return [
            "Candidate1_0urn:ietf:params:xml:ns:netconf:capability:candidate:1.0",
            "urn:ietf:params:xml:ns:netconf:capability:confirmed-commit:1.0",
            "urn:ietf:params:xml:ns:netconf:capability:validate:1.0",
            "urn:ietf:params:xml:ns:netconf:capability:url:1.0?protocol=http,ftp,file",
            "urn:ietf:params:netconf:capability:candidate:1.0",
            "urn:ietf:params:netconf:capability:confirmed-commit:1.0",
            "urn:ietf:params:netconf:capability:validate:1.0",
            "urn:ietf:params:netconf:capability:url:1.0?scheme=http,ftp,file"
        ]
Main logic of the switch handler. I took out all the functionality and query processing leaving only receipt of the capabilities
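
To give an idea of what the omitted query processing could look like, here is a minimal sketch under assumptions: it parses one `<rpc>` message, looks up a handler by the name of the first child element, and wraps the result in an `<rpc-reply>` that echoes the `message-id`. The `handle_rpc` function and the `HANDLERS` table are hypothetical and are not taken from the DCImanager code.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def handle_rpc(raw, handlers):
    """Parse one <rpc> message and wrap the handler result in <rpc-reply>."""
    rpc = ET.fromstring(raw)
    message_id = rpc.get("message-id", "")
    operation = rpc[0]  # the first child element names the operation
    handler = handlers.get(local_name(operation.tag))
    if handler is None:
        body = "<rpc-error><error-tag>operation-not-supported</error-tag></rpc-error>"
    else:
        body = handler(operation)
    return f'<rpc-reply message-id="{message_id}">{body}</rpc-reply>]]>]]>'


# Hypothetical handler table for the emulated switch
HANDLERS = {
    "get-interface-information": lambda op: "<interface-information/>",
    "close-session": lambda op: "<ok/>",
}

print(handle_rpc('<rpc message-id="7"><close-session/></rpc>', HANDLERS))
```

Each handler receives the parsed operation element, so it can read the query parameters before producing the reply body.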

And finally we join it all together. The session adapter is registered (which describes the behavior upon connection), the connection method by username and password is defined, the Portal is configured and our service is launched:

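from twisted.conch.ssh import session
from twisted.cred import portal
from twisted.cred.checkers import InMemoryUsernamePasswordDatabaseDontUse
from twisted.internet import reactor
from twisted.python import components

components.registerAdapter(Session, SwitchProtocolAvatar, session.ISession)

switch_factory = SwitchFactory()
switch = switch_factory.get("juniper")

# A distinct name keeps the twisted.cred.portal module from being shadowed
switch_portal = portal.Portal(SwitchRealm(switch))
credential_source = InMemoryUsernamePasswordDatabaseDontUse()
credential_source.addUser(b'admin', b'admin')
switch_portal.registerChecker(credential_source)

SshServerFactory.portal = switch_portal

reactor.listenTCP(830, SshServerFactory())
reactor.run()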
Configuring and launching the server

Then we start the mock server. You can connect with the ncclient library to check that it works. A standard connection check and a printout of the server capabilities will suffice:

from ncclient import manager

connection = manager.connect(host="127.0.0.1",
                             port=830,
                             username="admin",
                             password="admin",
                             timeout=60,
                             device_params={'name': 'junos'},
                             hostkey_verify=False)

for capability in connection.server_capabilities:
    print(capability)
Connecting to the mock server via NETCONF and displaying the server capabilities

The query result is provided below. We have successfully established the connection, and the server returned the list of its capabilities:

 Candidate1_0urn:ietf:params:xml:ns:netconf:capability:candidate:1.0
urn:ietf:params:xml:ns:netconf:capability:confirmed-commit:1.0
urn:ietf:params:xml:ns:netconf:capability:validate:1.0
urn:ietf:params:xml:ns:netconf:capability:url:1.0?protocol=http,ftp,file
urn:ietf:params:netconf:capability:candidate:1.0
urn:ietf:params:netconf:capability:confirmed-commit:1.0
urn:ietf:params:netconf:capability:validate:1.0
urn:ietf:params:netconf:capability:url:1.0?scheme=http,ftp,file
Server capabilities

Summary

This solution has its pros and cons. On the one hand, we spend a lot of time implementing and describing all the query processing logic. On the other hand, we gain flexible configuration and behavior emulation. However, the key advantage is scalability. The Twisted framework has rich functionality and supports a large number of protocols, so you can easily describe the interfaces of new switch handlers. And if you think everything through well enough, this architecture can be used not only for working with switches, but also for other equipment.

Feedback from readers would be greatly appreciated. Have you done anything similar? If so, what technologies did you use, and how did you set up the testing process?
