Getting Twisted

– a Framework for writing Asynchronously Networked Applications.

– – – – – – – – – – – –
By Christopher Armstrong | June 18, 2004


Twisted is a framework for writing asynchronously networked applications. One of the greatest advantages of Twisted is that it allows developers to integrate many protocols into their applications, which it does by offering consistent APIs to these different protocols. An overview of several of Twisted’s parts, along with design ideas and code samples, is provided.

In this installment, I’ll give a high-level explanation and rationale for Twisted, an overview of the way Twisted is structured, and some examples of the implementation of simple servers and clients.

The Overview

~ Wherein our Hero explains what Twisted is and is for, sans implementation details ~

Let’s break the expression down: “a framework for writing asynchronously networked applications”. First: “a framework”. Twisted isn’t a typical library. The simplest definition of “framework” would be “a library that calls your code as well as you calling it”. When writing code that will use Twisted, you should expect to be implementing Twisted-defined interfaces or subclassing Twisted-defined classes while implementing particular methods that are expected of you.

Second: “asynchronous”. Twisted uses asynchronous interfaces wherever another library would typically block (and assume that you would use threads to multiplex). When you’re writing code that uses Twisted, it should never block. There is an event loop, otherwise known as the “reactor”. At the beginning of your program, you’ll do some things that cause the reactor to call your code, and start the reactor. When events like “reactor started”, “connection made”, or “data received” happen, your code will be called if you’ve registered handlers for these events.
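To make "the framework calls your code" concrete, here is a toy event loop in plain Python. It is only a sketch of the idea, not Twisted's reactor: the ToyReactor class and its on, fire, stop, and run methods are names I made up for illustration, and the events are queued by hand instead of arriving from the operating system.

```python
import collections


class ToyReactor:
    """A toy stand-in for an event loop; NOT Twisted's reactor (hypothetical API)."""

    def __init__(self):
        self._handlers = collections.defaultdict(list)  # event name -> callbacks
        self._queue = collections.deque()               # pending (event, payload) pairs
        self.running = False

    def on(self, event, handler):
        # Register application code to be called when `event` happens.
        self._handlers[event].append(handler)

    def fire(self, event, payload=None):
        # Queue an event; a real event loop would learn of these from the OS.
        self._queue.append((event, payload))

    def stop(self):
        self.running = False

    def run(self):
        # The loop owns the control flow: it calls the handlers.
        self.running = True
        self.fire("reactor started")
        while self.running and self._queue:
            event, payload = self._queue.popleft()
            for handler in list(self._handlers[event]):
                handler(payload)
        self.running = False


log = []
reactor = ToyReactor()
# Set things up first, then hand control to the loop.
reactor.on("reactor started", lambda _: reactor.fire("data received", b"hello"))
reactor.on("data received", log.append)
reactor.on("data received", lambda _: reactor.stop())
reactor.run()
print(log)
```

Note that none of the handlers wait for anything; each does a little work and returns, which is what lets a single loop serve many connections.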

A bit more on “asynchronous”: It’s not always the reactor that receives and triggers the events; sometimes framework code needs to expose an interface for others to call that is asynchronous. This is what Deferreds are for. A Deferred is basically an abstracted callback, supporting error handling and chaining of callbacks. More on this later.
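As a rough picture of "an abstracted callback, supporting error handling and chaining", here is a toy version in plain Python. It is a sketch under my own assumptions, not Twisted's Deferred: the ToyDeferred class and its add_callbacks and fire methods are hypothetical names, and errors are represented as plain exception objects.

```python
class ToyDeferred:
    """A toy callback chain; NOT Twisted's Deferred (hypothetical API)."""

    def __init__(self):
        self._chain = []      # (callback, errback) pairs, in the order added
        self._fired = False
        self._result = None

    def add_callbacks(self, callback, errback=None):
        self._chain.append((callback, errback))
        if self._fired:
            self._run()       # result already available: run immediately
        return self

    def fire(self, result):
        self._fired = True
        self._result = result
        self._run()

    def _run(self):
        while self._chain:
            callback, errback = self._chain.pop(0)
            is_error = isinstance(self._result, Exception)
            handler = errback if is_error else callback
            if handler is None:
                continue      # nothing handles this kind of result; pass it along
            try:
                # Each handler's return value feeds the next pair in the chain.
                self._result = handler(self._result)
            except Exception as exc:
                self._result = exc


def build_chain(sink):
    d = ToyDeferred()
    d.add_callbacks(int)                                # may raise ValueError
    d.add_callbacks(lambda n: n * 2, lambda err: -1)    # double, or recover
    d.add_callbacks(sink.append)
    return d


seen = []
build_chain(seen).fire("21")        # int -> double -> append
recovered = []
build_chain(recovered).fire("oops") # int fails -> errback recovers -> append
print(seen, recovered)
```

The point of the design is that whoever produces the result (calling fire) and whoever consumes it (adding callbacks) never need to know about each other.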

Third: “networked”. Twisted provides various levels of abstraction for writing networked applications. Protocols can be implemented independently of their transports. This means that if you have, say, an IRC client protocol and a SOCKS transport implementation, you don’t need to touch the IRC client protocol implementation to get it to run on SOCKS; you only need to change the part where they’re glued together. If you’re not working with already defined protocols, and you control both ends of the connection, there is an efficient remote object and method call system called Perspective Broker that lets you deal with APIs instead of byte-streams.
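The separation of protocol and transport can be sketched without any networking at all. The classes below are hypothetical stand-ins, not Twisted classes: Upper plays the protocol, BufferTransport and FramingTransport play two different transports, and glue is the only place where they meet.

```python
import io


class Upper:
    """A toy protocol: knows what to do with bytes, not where they go."""
    transport = None

    def data_received(self, data):
        self.transport.write(data.upper())


class BufferTransport:
    """A toy transport that just collects whatever is written to it."""

    def __init__(self):
        self._buffer = io.BytesIO()

    def write(self, data):
        self._buffer.write(data)

    def value(self):
        return self._buffer.getvalue()


class FramingTransport:
    """A toy transport that wraps another one and length-prefixes each write.

    Only handles writes shorter than 256 bytes; enough for a sketch.
    """

    def __init__(self, inner):
        self.inner = inner

    def write(self, data):
        self.inner.write(bytes([len(data)]) + data)


def glue(protocol, transport):
    # The only code that changes when the transport changes.
    protocol.transport = transport
    return protocol


plain = BufferTransport()
glue(Upper(), plain).data_received(b"hi")

inner = BufferTransport()
glue(Upper(), FramingTransport(inner)).data_received(b"hi")

print(plain.value(), inner.value())
```

The Upper class is identical in both runs; only the argument to glue differs, which is the property that lets a protocol move between transports untouched.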

The rationale of Twisted is two-fold: there is the typical reason of enabling the programmer to avoid wasting time on grunt work like implementing protocols and frameworks to build her application on. The second rationale, though, is integration. We want to ease the integration of various systems into one application. In an email server, for example, one will often want SMTP, IMAP4, POP3, and even a Web interface, in addition to allowing each of these protocols to optionally use encryption (SSL/TLS). It is our hope that Twisted makes it relatively easy to implement systems that integrate so many protocols.

The Packages

~ Wherein the gracious Author gives an explanation of the various components of the system ~

Twisted has many packages; here are the interesting or important ones.

Core

  • twisted.internet
      The deepest core of Twisted. This contains the reactor and Deferreds.
  • twisted.cred
      An authorization framework. This facilitates the separation of knowledge between protocols, credential-checkers, application back-ends, and deployment.
  • twisted.protocols
      Implementations of many common Internet protocols.
  • twisted.python
      Utilities. These are basically things that should be in the Python standard library.
  • twisted.application
      Deployment functionality. This lets you tie together various services for an application.

High Level Framework/Application

  • twisted.web
      A framework for Web applications, as well as a stand-alone Web server.
  • twisted.names
      A client and server framework for using the DNS protocol, as well as a standalone configurable DNS server.
  • twisted.conch
      A client and server for SSH.

Others

  • twisted.spread
      A remote object system featuring remote method calls and object transfer.
  • twisted.enterprise
      An asynchronous database access interface, for using your favorite DBAPI2-compliant database access modules with Twisted.

Like all properly designed systems, Twisted layers its abstractions so that the developer may use whatever is appropriate for her task. At the bottom, in twisted.internet, there are the low-level platform-specific reactor implementations that implement an event loop, as well as networking, threading, and other services. These implementations provide a common, platform-agnostic API, which is the reactor interface defined in twisted.internet.interfaces.

Protocols are implemented on top of this. They are separated from the transport level so that they can run on TCP, SSL, SOCKS, and so on, without changes in their actual implementation (in most cases). Protocols such as HTTP, FTP, DNS, IMAP4, and SMTP are included.

Atop the protocols are the frameworks that help in the writing of applications that use the protocols; for HTTP, there is twisted.web, which exposes an “object publishing” system. twisted.names exposes a DNS framework and twisted.news a netnews framework. These frameworks often contain stand-alone functionality; e.g., you can run a simple static file-serving Web server without writing any code, using twisted.web.

Protocol Implementation

~ Wherein the Reader shall be asked to grok code snippets ~

Let’s dive into some code, shall we? Now, I’m not going to take the approach that some other pedagogical articles take. They’ll often have code using lower-level APIs that I wouldn’t honestly recommend for those programs, in order to build up to higher-level concepts. Instead, I’ll show examples of actual best practices, using our various abstractions, and then, after the examples, explain how things under those abstractions fit together.

We always start out with an example of an Echo server, so let’s do that. Run the following snippet with:

   $ twistd -y echo.tac

=== echo.tac ===
from twisted.internet.protocol import Protocol, ServerFactory

class Echo(Protocol):
    def dataReceived(self, data):
        # As soon as any data is received, write it back
        self.transport.write(data)

factory = ServerFactory()
factory.protocol = Echo

from twisted.application import service, internet

application = service.Application("echoserver")
internet.TCPServer(1025, factory).setServiceParent(application)
=== echo.tac ===

This defines an Echo class which subclasses Protocol; Protocol is the base class that all protocols implemented with Twisted must use. Our Echo class defines dataReceived so that it simply writes all data it receives back to the transport via self.transport.write.

Next we instantiate a ServerFactory; for more complex protocols, developers will often want to create a custom subclass of ServerFactory that knows more about their protocol. ServerFactory instances can have a protocol attribute that should refer to a Protocol *class*; we set it to the Echo class in our example above. ServerFactory instances get their buildProtocol method called on new incoming connections; this method is expected to return a Protocol *instance* used to handle the connection. It’s important to note that there is a 1:1 mapping between Protocol instances and connections.

It’s useful to know how ServerFactory’s buildProtocol method is defined; in many cases you’ll want your ServerFactory subclass to do some variation of it.

=== ServerFactory.buildProtocol ===
def buildProtocol(self, addr):
    p = self.protocol()
    p.factory = self
    return p
=== ServerFactory.buildProtocol ===

It gives the protocol instance a reference to itself as the factory attribute; this is useful when protocols must store and/or share data that should persist longer than a single connection.
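Here is a small sketch of why that factory reference is handy. To keep it runnable without Twisted, it uses minimal stand-in classes instead of the real Protocol and ServerFactory; CountingProtocol, CountingFactory, and the connections_seen attribute are hypothetical names, and the connectionMade calls are made by hand where a reactor would normally make them.

```python
class CountingProtocol:
    """Stand-in for a Protocol subclass; one instance per connection."""
    factory = None

    def connectionMade(self):
        # Shared state lives on the factory, so it outlives this connection.
        self.factory.connections_seen += 1
        self.number = self.factory.connections_seen


class CountingFactory:
    """Stand-in for a ServerFactory subclass."""
    protocol = CountingProtocol

    def __init__(self):
        self.connections_seen = 0

    def buildProtocol(self, addr):
        # Same shape as the buildProtocol shown above.
        p = self.protocol()
        p.factory = self
        return p


factory = CountingFactory()

# Simulate two incoming connections.
first = factory.buildProtocol(("127.0.0.1", 50000))
first.connectionMade()
second = factory.buildProtocol(("127.0.0.1", 50001))
second.connectionMade()

print(first.number, second.number, factory.connections_seen)
```

Each connection gets its own protocol instance, but both see the same counter because both point at the same factory.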

After instantiating the factory, the example goes through a short dance for deploying the app, which basically tells the twistd program how to start our server. twistd is a program used to start Twisted-based daemons; it handles daemonization and logging, and has many other features that you can list by running “twistd --help”. The application and service infrastructure will be explained in later articles.

Writing protocols to be used as a client is similar; most of the difference is in deployment and what the factory is responsible for. The following example is a client for our Echo server. You can execute it like a regular Python program, once your echo server is running.

=== echoclient.py ===
from twisted.internet.protocol import ClientFactory
from twisted.protocols.basic import LineReceiver
from twisted.internet import reactor

class EchoClient(LineReceiver):
    # On Python 3, Twisted sends and receives bytes, not str
    end = b"Bye-bye!"

    def connectionMade(self):
        self.sendLine(b"Hello, world!")
        self.sendLine(b"What a fine day it is.")
        self.sendLine(self.end)

    def lineReceived(self, line):
        print("receive:", line)
        # Hang up once our last line has been echoed back
        if line == self.end:
            self.transport.loseConnection()

class EchoClientFactory(ClientFactory):
    protocol = EchoClient

    def clientConnectionFailed(self, connector, reason):
        print("connection failed:", reason.getErrorMessage())
        reactor.stop()

    def clientConnectionLost(self, connector, reason):
        print("connection lost:", reason.getErrorMessage())
        reactor.stop()

def main():
    factory = EchoClientFactory()
    reactor.connectTCP("localhost", 1025, factory)
    reactor.run()

if __name__ == "__main__":
    main()

=== echoclient.py ===

This time our protocol is a LineReceiver subclass. If you’ll recall, the dataReceived method on Protocol subclasses gets called with data of arbitrary size. Since it’s more natural to process a line at a time in this case, we need to buffer those arbitrary chunks of data until we receive a full line, and then act on it. Fortunately, LineReceiver has already been written for us. LineReceiver is a utility protocol subclass which buffers received data in its dataReceived method until a line has been received (by default, delimited with ‘\r\n’, but this is parameterized with the ‘delimiter’ class attribute). It then calls the lineReceived method, which is expected to be implemented by our subclass.
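The buffering idea itself fits in a few lines. What follows is a toy sketch of it in plain Python, not LineReceiver's actual source: ToyLineBuffer is a hypothetical class that borrows the dataReceived, lineReceived, and delimiter names from the description above, and it simply records each complete line in a list.

```python
class ToyLineBuffer:
    """A toy line buffer illustrating the idea; NOT Twisted's LineReceiver."""

    delimiter = b"\r\n"

    def __init__(self):
        self._buffer = b""   # bytes received so far that do not yet form a line
        self.lines = []      # complete lines, in order of arrival

    def dataReceived(self, data):
        # Chunks arrive in arbitrary sizes, so accumulate them first.
        self._buffer += data
        # Hand over every complete line; keep any trailing partial line.
        while self.delimiter in self._buffer:
            line, self._buffer = self._buffer.split(self.delimiter, 1)
            self.lineReceived(line)

    def lineReceived(self, line):
        # A subclass would act on the line here; the sketch just records it.
        self.lines.append(line)


buf = ToyLineBuffer()
# Feed chunks that split lines, and even the delimiter, at awkward places.
for chunk in (b"Hello, wo", b"rld!\r\nWhat a fine", b" day it is.\r", b"\nBye"):
    buf.dataReceived(chunk)

print(buf.lines)
```

The trailing b"Bye" stays in the buffer, because no delimiter has arrived for it yet.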

connectionMade is also a new method. It can be defined on any Protocol subclass, whether used as a server or client, and it will be called when a connection is made.

As you may notice, the client factory is responsible for more than the server factory; ClientFactory subclasses can implement clientConnectionLost and clientConnectionFailed methods that will be called when the respective event occurs. In our case, we just print some debugging info and stop the reactor, thus ending the program.

I’d better explain the reactor now, since we’ve just gone over code that stops it without having read the code that starts it! Well, our main function creates a client factory; that’s similar to what we did in the echo server example. Now we’ll use the reactor to initiate a connection to the server with reactor.connectTCP, and then kick off the main loop with reactor.run.

You’re probably wondering why there’s no reactor code in the echo server example (if you’re not, then it’s a good thing I pointed it out!). We didn’t have any there because twistd handled it for us, but since twistd isn’t really appropriate for our echo client (it’s not a daemon, for one), we didn’t bother using those application and service APIs that were appropriate for our echo server. Underneath the hood, those service calls use reactor APIs. TCPServer, for example, eventually calls reactor.listenTCP, an analog of reactor.connectTCP, and twistd itself calls reactor.run.

Now we’ll take it to the next fundamental step, communication between connections. A very simple chat server illustrates this nicely. Run it with ‘twistd -y chatserver.tac’.

=== chatserver.tac ===
from twisted.protocols import basic

class MyChat(basic.LineReceiver):
    def connectionMade(self):
        print("Got new client!")
        self.factory.clients.append(self)

    def connectionLost(self, reason):
        # Twisted passes the reason for the disconnect as an argument
        print("Lost a client!")
        self.factory.clients.remove(self)

    def lineReceived(self, line):
        print("received", repr(line))
        # Relay the line to every connected client
        for c in self.factory.clients:
            c.message(line)

    def message(self, message):
        self.transport.write(message + b"\n")

from twisted.internet import protocol
from twisted.application import service, internet

factory = protocol.ServerFactory()
factory.protocol = MyChat
factory.clients = []

application = service.Application("chatserver")
internet.TCPServer(1025, factory).setServiceParent(application)
=== chatserver.tac ===

Once the server is running, connect to localhost:1025 with multiple telnet clients. You should be able to type something in one client and see it in the others.

This example takes advantage of the factory for cross-connection communication. The factory has a list of protocol instances as its clients attribute; the protocols iterate through the list and tell each other to send out messages when they receive a message in lineReceived.

That’s it for this installment. Tune in next time for something quite fundamental to Twisted: Deferreds. Other higher-level packages will also be delved into with a little more depth.