Building a Scalable Web-Based Call Center CTI Solution

My project was part of our customer’s effort to replace all of its enterprise applications with web applications based on a standardized technology stack. In this strategic move, the call center integration was a crucial step. As it turned out, the technical design of the new call center telephony solution was quite challenging. Not only did we have to learn a lot about CTI (computer telephony integration); we also had to make the system scalable enough to handle more than 1000 call center agents.

The call center agents were to work mostly with the standard web applications, extended by a telephony control that allows them to accept incoming calls, disconnect calls, or make consultation calls to other agents or supervisors.

An incoming call

Let’s have a look at the most important use case first: an incoming call.

The following diagram gives an overview of the flow of events before the agent’s telephone rings:

The customer’s incoming call is handled by the PBX (Private Branch Exchange). By the time the agent takes the call, a lot of information about the customer has already been collected. In most cases, the caller will already have gone through an interactive voice response system that has collected the account number and verified the PIN (omitted from the picture above).

This is what the agent screen might look like after the agent has taken the call:

Schematic screen of the call center agent UI.

The box on the left contains the telephony controls. They are embedded in an iframe and allow the agent to disconnect the call or place consultation calls to other agents. The telephony controls send commands to the gateway (and, by extension, to the PBX) and receive asynchronous events.

CSTA as a Model for our Protocol

Before it was even determined whether the telephone system should be integrated directly through the PBX or via an integration layer (Genesys), we decided to use the CSTA Phase III communication protocol as a model for the protocol between gateway and browser. CSTA (Computer Supported Telecommunications Applications) is an ECMA standard (like JavaScript) and describes third-party call control using services and events. Third-party call control roughly means that the standard looks at an entire switch and all connected devices (telephones), and not just at a single telephone. This point of view is reflected in the naming of the services and events. For example, when a call arrives at a terminal, the event is called Delivered. A sample event flow is given in the following diagram.

Exemplary flow of CSTA services and events in our system.

Services are commands to the telephone system. An outgoing call (from any device within the domain of the switch) is initiated by a Make Call Service. But there are also services like Set Agent State.
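
To illustrate how such events can drive the state of a telephony control, here is a minimal sketch in Python (chosen only for brevity; it is not the language of our JavaScript library). The three state names and the transition table are assumptions made for this example. Only the event names Delivered, Established and Connection Cleared are taken from CSTA, and they are written here without spaces.

```python
from enum import Enum


class CallState(Enum):
    # Hypothetical, heavily simplified states of one agent telephone.
    IDLE = "idle"
    RINGING = "ringing"
    CONNECTED = "connected"


# (current state, event name) -> next state. Only three events are modeled.
TRANSITIONS = {
    (CallState.IDLE, "Delivered"): CallState.RINGING,
    (CallState.RINGING, "Established"): CallState.CONNECTED,
    (CallState.RINGING, "ConnectionCleared"): CallState.IDLE,
    (CallState.CONNECTED, "ConnectionCleared"): CallState.IDLE,
}


def next_state(state, event_name):
    """Return the state after an event; unknown combinations change nothing."""
    return TRANSITIONS.get((state, event_name), state)


def replay(event_names):
    """Apply a sequence of event names, starting from an idle telephone."""
    state = CallState.IDLE
    for name in event_names:
        state = next_state(state, name)
    return state
```

Replaying Delivered followed by Established leaves the telephone in the connected state; a subsequent ConnectionCleared returns it to idle.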

CSTA is extremely comprehensive; we used only a small selection of its services and events. It is also easy to extend — the transfer of non-standardized key/value pairs within the data part of services and events is explicitly provided for.

CSTA provides an ASN.1 and an XML encoding. Writing an ASN.1 parser in JavaScript was obviously not a good idea, and even the XML mapping is quite heavyweight. So we decided to design our own transport encoding on top of JSON and built a REST-inspired web service as a gateway to the PBX.
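
A JSON transport encoding of this kind could look roughly like the following Python sketch. The field names (kind, name, params, privateData) and the example parameter callingDevice are hypothetical and chosen for illustration only; they are not the actual wire format of our gateway.

```python
import json


def encode_message(kind, name, params, private_data=None):
    """Serialize a service request or an event as one compact JSON line.

    All field names are assumptions made for this sketch.
    """
    if kind not in ("service", "event"):
        raise ValueError("kind must be 'service' or 'event'")
    message = {"kind": kind, "name": name, "params": params}
    if private_data:
        # Non-standardized key/value pairs travel in their own section.
        message["privateData"] = private_data
    return json.dumps(message, separators=(",", ":"), sort_keys=True)


def decode_message(line):
    """Parse one JSON line; a missing privateData section becomes empty."""
    message = json.loads(line)
    message.setdefault("privateData", {})
    return message
```

Keeping the non-standardized key/value pairs in a separate section means that a receiver can ignore them without having to understand them.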


The customer’s technology framework requirements were:

  • Internet Explorer 8 as the browser for the call center agents
  • Wicket as web application framework for the call center application
  • Tomcat 7 as the web application server for both the call center web app and the gateways.

The technical requirements were minimal latency, high throughput, and high availability. For latency, an average delay below 150 ms was required, i.e. a value slightly below the attention threshold. For the callers, very low latency is not crucial: most of them will have waited in the queue for minutes rather than seconds to reach a free agent anyway. But the new web application should, if at all possible, not worsen the ergonomics for the call center agents. In the end, this wasn’t a problem: in tests under moderate load, we achieved latencies below 80 ms.

High availability is an obvious requirement: if a call center with about 1000 agents fails, there will be many unhappy customers. On an unlucky day, the failure will even be reported in the news. We solved the problem by designing for redundant server components and a low-latency failover protocol. The actual web application uses Tomcat’s built-in clustering mechanism. We couldn’t reuse this for the telephony gateway, because the relevant state is distributed across the switches anyway.

The gateway has two essential reliability requirements:

  • Commands to the telephone system have to be retried quickly if a gateway fails.
  • Telephony events must not be lost.

The functional requirements were straightforward:

  • Incoming and outgoing calls (simple call control)
  • Call forwarding (single-step/two-step transfer)
  • Forwarding to the IVR (Interactive Voice Response), including customer-specific data, as well as routing back to the same agent that originally took the call
  • Setting and displaying the agent status


The architecture consists of several interconnected systems as shown in the diagram below:

  • The call center agents’ browser with the JavaScript/HTML,
  • Telephony-related systems (left): the gateway (a server-side web application running in Tomcat) and the PBX,
  • Call center web application (right): Wicket-based web application and its database(s).

The integration of telephony and web application happens in the browser. The web application includes our JavaScript library and a telephony control panel in an iframe.

The architecture of our CTI solution based on web technology.

Sending Server Events to the Browsers

For redundancy, every client connects to both gateways and keeps the TCP connections open. This means that every gateway application server (Tomcat) has to hold nearly 1000 open connections. We use the AIO (Advanced IO) interface of Tomcat 7, so all these connections can be processed by a single thread. This greatly reduces memory requirements and scheduling overhead.

Overview of our system architecture with redundant gateways and PBX systems.

Server-sent events (a.k.a. server push) were recently standardized as part of HTML5 in the EventSource interface. Another convenient way to implement bidirectional communication is WebSockets. But we couldn’t use either of these because we had to support legacy browsers; we were glad we didn’t have to support IE6 and could rely on at least IE8. So we implemented a COMET variant, which essentially consists of long-running XMLHttpRequests through which events are sent as chunked responses.
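
On the receiving side, a long-running response grows over time, and the client has to extract the events that have become complete since it last looked. The following Python sketch shows one way to do this. It assumes that every event is a single JSON object terminated by a newline; this framing is an assumption made for the example, not a description of our actual protocol.

```python
import json


class EventStreamParser:
    """Extracts complete events from a response body that keeps growing."""

    def __init__(self):
        # Position in the response text up to which events were consumed.
        self._offset = 0

    def feed(self, response_text):
        """Return the events completed since the last call.

        response_text is the whole response received so far, similar to
        the responseText of a long-running request in the browser.
        """
        events = []
        while True:
            end = response_text.find("\n", self._offset)
            if end == -1:
                break  # the remainder is an incomplete event; wait for more
            line = response_text[self._offset:end].strip()
            self._offset = end + 1
            if line:  # skip empty lines and lines of whitespace only
                events.append(json.loads(line))
        return events
```

Because the parser only remembers an offset, a partially received event is simply left alone until the next chunk completes it.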

The asynchronous events from the gateways are decoded by our JavaScript library, which updates the telephony control and forwards the events to the interface part of the browser, which in turn may trigger a server interaction.

Cross-Domain COMET with IE8

The customer wanted to be able to run the web application and the gateways on different application servers. This means that the JavaScript XMLHttpRequests are cross-domain, which turned out to be a small challenge on IE8.

Mozilla Firefox, Safari and Google Chrome all support the CORS (Cross-Origin Resource Sharing) specification of the W3C. IE8 supports it as well; however, with IE8 one must use XDomainRequest instead of XMLHttpRequest, and the API is slightly different. There is also a subtle buffering bug in IE8 that makes it necessary to send 2 KB of fill characters at the start of every new COMET connection to ensure that the next event is delivered to the application immediately.
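
The padding workaround can be sketched on the server side as follows (in Python for brevity; our gateway itself runs in Tomcat). The choice of spaces as fill characters and the newline that ends the padding are assumptions made for this example; only the size of 2 KB comes from the description above.

```python
PADDING_BYTES = 2048  # 2 KB of fill characters


def connection_prelude():
    """Build the first chunk written on every new streaming connection.

    Spaces are used as fill characters (an assumption), followed by a
    newline so that a line-based client sees the padding as one line.
    """
    return (" " * PADDING_BYTES + "\n").encode("ascii")


def is_padding(line):
    """A client can recognize padding as a line without visible content."""
    return line.strip() == ""
```

Whitespace is a convenient filler because a client that trims each line can discard the padding without any special case.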

Redundant Gateways and PBX

Each browser keeps two connections to two different gateways. One is active, and the other is a hot standby. When the connection to the active gateway breaks, the hot standby gateway is activated immediately. If necessary, the last failed command is retried. Because the hot standby gateway has been sending events the whole time as well, no event is lost. After this failover, the browser tries to reconnect to the failed gateway. Once that connection is re-established, the previously failed gateway becomes the new hot standby.
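
The client-side bookkeeping for this failover can be sketched as follows. This is a simplified Python model, not our JavaScript library. In particular, the sequence number used to discard duplicate events is an assumption: the text above does not say how events arriving over both connections are matched, and the sketch further assumes that both gateways report the same number for the same event. The list named sent merely stands in for real HTTP requests.

```python
class FailoverClient:
    """Tracks the active and standby gateway for one browser (sketch)."""

    def __init__(self, gateways):
        if len(gateways) != 2:
            raise ValueError("exactly two gateways are expected")
        self.active, self.standby = gateways
        self.last_seq = 0            # highest event number seen (assumption)
        self.pending_command = None  # last command not yet acknowledged
        self.sent = []               # (gateway, command) log, replaces HTTP

    def send(self, command):
        """Send a command to the active gateway and remember it for retry."""
        self.pending_command = command
        self.sent.append((self.active, command))

    def acknowledge(self):
        """The gateway confirmed the pending command; nothing to retry."""
        self.pending_command = None

    def on_event(self, seq, event):
        """Pass each event on once, whichever gateway delivered it first."""
        if seq <= self.last_seq:
            return None  # duplicate already received via the other gateway
        self.last_seq = seq
        return event

    def on_connection_lost(self, gateway):
        """Swap roles if the active gateway failed and retry if necessary."""
        if gateway != self.active:
            return  # only the standby failed; it will simply be reconnected
        self.active, self.standby = self.standby, self.active
        if self.pending_command is not None:
            self.sent.append((self.active, self.pending_command))
```

After on_connection_lost, the failed gateway already occupies the standby role in this model; in a real client it would only count as a usable hot standby once its connection has been re-established.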

Loss of a gateway does not lead to the loss of essential state, because the gateways hold as little state as possible. All relevant state was either pushed up into the JavaScript library or down into the PBX integration layer. The gateways are also independent of each other and interchangeable. This makes the solution inherently scalable: more gateways can be added at any time.

The PBX (a Genesys installation) itself is also redundant. The fallback on this level is hidden by the Genesys API and the gateway doesn’t have to handle it.


Our solution was tested with three different methods:

  • JavaScript unit tests with QUnit,
  • A simulator that implements the gateway’s HTTP services and simulates a single agent telephone (with a Swing GUI), and
  • Load tests.

Writing the simulator was a substantial effort, but it helped in two ways:

  • It made development without the telephony hardware possible.
  • It made it easy to test scenarios that were not reliably testable with real hardware (like deliberate race conditions).

In an ideal world, the load tests would have been performed with an external load test tool. We didn’t have one available, so we wrote our own load test generator using the CSTA API to generate and receive calls.


Our solution is light-weight, conceptually simple and scalable. The simplicity is the result of two development iterations and rather long design phases.

The decision to use CSTA as the blueprint for the communication protocol worked well, too. It was helpful that we did not have to re-invent two-step transfer for the umpteenth time. Also, the CSTA vocabulary (which goes down to the text in the log messages) can be understood by personnel that are familiar with CTI.

In the Footsteps of Arnold Schwarzenegger

Call center applications always remind me of a slightly silly movie starring Arnold Schwarzenegger as an undercover agent and Jamie Lee Curtis as his unsuspecting wife. His cover story for her is that he is doing something with IT and in one scene she inquires about his day at work. He starts telling her enthusiastically and quite elaborately about a call center integration — and she nearly falls asleep.

I, however, think the combination of a call center and a web application is technically quite fascinating.