Ensuring performance requires test and measurement of all traffic, and listening to calls.
by Assaji Aluwihare, Gary Meyer, and Thad Ward
Voice over Internet Protocol (VoIP) may be viewed by accountants and business owners as a relatively new, cost-saving technology for the enterprise. For those commissioning and managing the data network transport of IP voice over the local area network/wide area network (LAN/WAN), it may seem like just another application to manage, like e-mail or http.
Yet the nature of the payload—voice, where there is no retransmission of time-sensitive packets—makes VoIP testing and troubleshooting to maintain a high quality of experience (QoE) an entirely different effort.
Start with basics
When a VoIP call is set up, speech is encapsulated in Real-time Transport Protocol (RTP), which is in turn encapsulated in User Datagram Protocol (UDP); both are carried in an IP packet. Each RTP packet contains a small portion of the voice conversation. The size of the voice sample depends on the codec used to compress the digital bit stream at the endpoint, such as an IP phone.
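As a rough illustration of that encapsulation, the sketch below decodes the 12-byte fixed RTP header defined in RFC 3550 from a UDP payload. The function name and the sample packet are hypothetical; the sample assumes a G.711 call with 20 ms of voice (160 bytes) per packet.

```python
import struct


def parse_rtp_header(udp_payload: bytes) -> dict:
    """Decode the 12-byte fixed RTP header (RFC 3550) from a UDP payload."""
    if len(udp_payload) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit synchronization source (SSRC).
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", udp_payload[:12])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,  # 0 is PCMU (G.711 mu-law)
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Hypothetical packet: RTP version 2, payload type 0, 160 bytes of voice.
sample = struct.pack("!BBHII", 0x80, 0, 4711, 160, 0x1234ABCD) + bytes(160)
header = parse_rtp_header(sample)
voice_bytes = len(sample) - 12
```

The sequence number and timestamp fields are what test tools rely on to detect loss, reordering, and jitter.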
Using network-analysis tools such as this one, the DA-3400, you can test and troubleshoot Voice over IP performance characteristics, including Mean Opinion Score.
Three common codecs are:
- G.711—A high-bandwidth, high-quality, lowest-delay 64-kbit/sec version;
- G.729A/B/C—A low-bandwidth (8 kbits/sec) codec common across the WAN;
- G.723.1—Low-bandwidth (5.3 or 6.3 kbits/sec) but not widely used for VoIP due to long delay.
While a higher-bandwidth codec more accurately reproduces the analog input signal, it requires a higher bit rate, which generates more network traffic and reduces the network’s overall call capacity. Using a lower-bit-rate codec sacrifices quality yet uses less bandwidth.
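The codec bit rate is only part of the load each call places on the network, because every packet also carries IP, UDP, and RTP headers. The sketch below estimates one-way, per-call bandwidth at the IP layer. It assumes IPv4 without options (20 bytes), UDP (8 bytes), and the fixed RTP header (12 bytes), ignores Layer 2 framing, and uses a 20 ms packet interval by default; the function name is illustrative.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, no options or extensions


def call_bandwidth_kbps(codec_kbps: float, packet_ms: float = 20.0) -> float:
    """Estimate one-way IP-layer bandwidth for a single call, in kbits/sec."""
    payload_bytes = codec_kbps * 1000 / 8 * packet_ms / 1000
    packets_per_second = 1000 / packet_ms
    return (payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8 * packets_per_second / 1000


g711 = call_bandwidth_kbps(64)  # 160-byte payload per packet
g729 = call_bandwidth_kbps(8)   # 20-byte payload per packet
```

Under these assumptions, header overhead is a much larger share of a low-bit-rate call, so the bandwidth saving on the wire is smaller than the codec rates alone suggest.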
RTP Control Protocol (RTCP) allows the endpoints to communicate directly concerning the quality of the RTP packet stream. The control plane provides signaling protocols that perform such functions as registering VoIP phones and connecting phone calls.
Measuring a subjective experience
Applications such as e-mail and file transfers tolerate packet delays and use retransmission of bad or missing packets to achieve error-free performance at the application level. Because VoIP service cannot tolerate retransmissions and demands priority routing of packets, it places more-stringent requirements on IP data networks. Ultimately, as with video, VoIP service quality is determined subjectively by the end users.
For voice, unlike data, the key measures of quality are intelligibility and identification. Intelligibility is the ability to understand what is being said. Identification is the ability to recognize the voice of familiar callers, such as a family member or the boss.
Objective and subjective measurements exist to judge the performance and QoE of VoIP service. These form the basis of good VoIP installation and troubleshooting procedures.
Active tests, such as Perceptual Speech Quality Measurement (PSQM), Perceptual Evaluation of Speech Quality (PESQ), and Perceptual Analysis and Measurement System (PAMS), which uses an analog input signal, each send known voice samples across the network to a receiving endpoint, where the degraded sample is compared with the original. These are not tones, but rather actual prerecorded WAV files available in different languages. This type of test requires two devices (one at each end) and is often used to evaluate the ability of the existing network to handle VoIP by generating and assessing calls. (Active tests are not intended for in-service monitoring, analysis and troubleshooting.)
The Mean Opinion Score (MOS) is a passive test that calculates voice quality without a reference voice sample, measuring IP transport quality of actual VoIP calls. Used most commonly to turn up, test and troubleshoot networks, the MOS assigns a value between 1.00 (bad) and 5.00 (excellent) to the overall quality of delivered voice through a network. MOS does not look at encoded voice, but rather, it rates the IP transport quality of the packets carrying the encoded voice. Delay, jitter, loss, and sequence of packets are measured.
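One common way to turn packet measurements into a MOS estimate is a simplified form of the ITU-T G.107 E-model, sketched below. This is not necessarily how any particular test set computes its score. The R-to-MOS mapping follows G.107; the base value of 93.2 and the delay and loss impairment terms are simplified curve fits taken from the literature and should be treated as assumptions. Note that this mapping tops out at about 4.5 rather than 5.00.

```python
import math


def r_to_mos(r: float) -> float:
    """Map an E-model transmission rating R to an estimated MOS (ITU-T G.107)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return max(1.0, 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r))


def estimate_r(one_way_delay_ms: float, loss_fraction: float, codec: str = "G.711") -> float:
    """Simplified R factor: base rating minus delay and loss impairments.

    The impairment terms are approximate fits (assumptions), not measured values.
    """
    d = one_way_delay_ms
    delay_impairment = 0.024 * d + (0.11 * (d - 177.3) if d > 177.3 else 0.0)
    if codec == "G.711":
        loss_impairment = 30 * math.log(1 + 15 * loss_fraction)
    elif codec == "G.729":
        loss_impairment = 11 + 40 * math.log(1 + 10 * loss_fraction)
    else:
        raise ValueError(f"no impairment fit for codec {codec!r}")
    return 93.2 - delay_impairment - loss_impairment


clean_call = r_to_mos(estimate_r(one_way_delay_ms=50, loss_fraction=0.0))
lossy_call = r_to_mos(estimate_r(one_way_delay_ms=50, loss_fraction=0.05))
```

Even in this rough model, a few percent of packet loss pulls the score down far more than a few tens of milliseconds of added delay.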
Common degradation problems
Several common effects impair voice quality on a VoIP system. The test and measurement values important for managing QoE on a VoIP system are:
- Latency. Because IP networks operate on statistical multiplexing technologies, latency in IP networks is usually higher than with analog transmission. Any delay in end-to-end transmission of voice from speaker to listener impedes voice quality. IP networks will have varying latency times over a single path depending upon the level of traffic on the network. In general, a lower-bit-rate codec increases delay for VoIP calls.
- Packet loss. This can occur in many ways. A router or switch queue may be full and unable to hold any more packets, causing arriving packets (segments of a voice transmission) to be discarded. Bit errors may exceed correctable levels, or a packet may be misrouted or exceed its time-to-live quota due to network topology changes or network congestion. Whatever the cause, packet loss harms VoIP quality.
- Jitter. Packets that arrive at the destination at irregular intervals or out of sequence can make voice choppy and difficult to understand. Out-of-sequence packets often occur due to multiple routing paths to the same destination. If packets are out of sequence by only one or two sequence numbers, jitter buffers on receiving devices can place packets back into order before voice playback. Packets that cannot be placed into proper order are discarded by the receiving device, reducing voice quality. Jitter is also induced when switch or router packet-processing speed varies or if network conditions change due to congestion or route changes. This causes variation in packet spacing, which degrades voice quality. Jitter buffers on receiving devices, which themselves cause delay, can only compensate for mild jitter.
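Jitter, as described in the list above, is usually reported as a running estimate rather than a single value. The sketch below follows the interarrival jitter calculation in RFC 3550, which smooths the variation in packet spacing with a gain of 1/16, but works in milliseconds for readability. The function name and sample data are illustrative; the 8,000-Hz RTP clock is the rate used by narrowband codecs such as G.711.

```python
def interarrival_jitter_ms(arrivals_ms, rtp_timestamps, clock_hz=8000):
    """Running interarrival jitter estimate in the style of RFC 3550, in ms."""
    jitter = 0.0
    for i in range(1, len(arrivals_ms)):
        # Spacing the sender intended, derived from the RTP timestamps.
        sent_delta_ms = (rtp_timestamps[i] - rtp_timestamps[i - 1]) * 1000 / clock_hz
        # Spacing actually observed at the receiver.
        recv_delta_ms = arrivals_ms[i] - arrivals_ms[i - 1]
        deviation = abs(recv_delta_ms - sent_delta_ms)
        jitter += (deviation - jitter) / 16
    return jitter


# Hypothetical stream: 20 ms packets, with the third packet arriving 5 ms late.
timestamps = [0, 160, 320, 480]
steady = interarrival_jitter_ms([0.0, 20.0, 40.0, 60.0], timestamps)
disturbed = interarrival_jitter_ms([0.0, 20.0, 45.0, 60.0], timestamps)
```

Because of the smoothing, a single late packet moves the estimate only slightly, while sustained variation drives it steadily upward.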
Depending on the test instrument’s capabilities, you can measure and generate reports on VoIP-related issues, such as (clockwise from top right) signaling, packet loss, and jitter.
While echo is a common complaint with VoIP systems, in reality, it is an analog problem, usually the result of an impedance mismatch where two wires convert to four somewhere in the network. While echo is not induced on the IP network, increased latency on voice packets will aggravate echo.
Before a VoIP system can be installed at the premises, the network must be assessed for suitability for transporting VoIP traffic. Testing and measurement confirm the ability of the LAN (switches, routers, and cabling) to handle delay-sensitive VoIP traffic. This step also helps determine load planning for the LAN/WAN and reveals the bandwidth of premises cabling.
This phase of testing is typically carried out with handheld equipment, software or passive test devices, each of which looks at all the traffic on the LAN/WAN or home network to assess the quality of the network before VoIP service is added. Testing tools determine whether packets are being dropped or lost as they traverse the network.
It is important to note that the network can provide error-free data service while performing very poorly in terms of lost and dropped packets. High throughput can be used to make up for problems such as jitter, latency and packet loss on data transmission. A network that carries data perfectly may not be capable of providing satisfactory quality for time-sensitive, higher-priority packets of VoIP traffic.
Before service turn-up, it is important to verify connectivity to signaling gateways and provision service, and determine call quality. Terminal adapters and VoIP phones or IP phones plug into the LAN, and IP addresses are provisioned. Handheld test sets are normally used because they can mimic an end device in the network. Handheld test sets also help isolate problems. For example, a handheld device can be used to determine whether a specific end device has been provisioned correctly or to identify errors in provisioning network equipment during installation.
At this stage of assessment, voice quality issues are examined. Calls are placed and received through the network to ensure that links are provisioned with the correct signaling protocol. Calls are placed within the LAN/WAN and from the LAN/WAN to the public switched telephone network (PSTN). Here, technicians can confidently confirm that signaling within the LAN/WAN is operational. If voice quality is poor, place test calls from one side of the router to the other and to the ingress gateway to isolate the problem.
While checking VoIP call options, the quality of the RTP stream can be monitored by listening to calls via handheld units, with the MOS for each connection then measured and logged.
You should perform the following service turn-up and provisioning tests:
- Ping to test registration with proxy server;
- Place calls on and off the network;
- Trace the call route;
- Assess MOS for call quality.
Troubleshooting and maintenance
There are two reasons for troubleshooting: failure or intermittent issues. When the issue is failure, troubleshooting is similar to turn-up. You verify connectivity to local elements through ping, trace routes and call placement. Calls are placed on and off the network, perhaps to the technician’s mobile phone. Test files are exchanged to validate jitter and packet loss. The problem can be sectionalized to CPE or carrier.
This display from a VoIP butt set reports on a call that was monitored live.
Intermittent issues are more challenging. The circuit must be monitored in-service to determine which calls are experiencing troubles and when exactly the trouble occurs. Here, it is important to know the profile of all traffic riding on the network at the time of the trouble. For example, other applications may use more router processing time or bandwidth, which can cause VoIP calls to drop or lose voice quality as packets are lost. The only means to determine whether CPE traffic is the cause is to monitor the whole circuit while the problem is occurring.
Another example of an intermittent problem that can only be uncovered by monitoring all traffic is a broadcast storm, which occurs as network elements are reconfigured or moved and broadcast domains are misconfigured. This storm causes broadcast packets, such as those from a network printer or other network elements that advertise their existence, to flood the network and steal bandwidth, impairing voice quality.
Broadcast storms are easily detected with analysis tools that show steep spikes in broadcast packets at the same time as reported VoIP quality problems. These tools also provide guidance on what actions to take, and include a list of offending stations or devices that are allowing packets to broadcast outside of a specified domain, such as a printer in Hong Kong trying to tell everyone in New York that it is available. The VoIP problem can be resolved by reconfiguring routers to not send broadcast packets into subnets where they don’t belong.
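The correlation these tools perform can be sketched in a few lines. The example below flags measurement intervals where the broadcast packet count jumps well above its typical level, then keeps only the spikes that coincide with reported voice problems. The function names, the factor-of-10 threshold, and the sample counts are all hypothetical; a real analyzer would apply its own thresholds.

```python
from statistics import median


def broadcast_spikes(counts_per_interval, factor=10.0):
    """Return indexes of intervals whose broadcast count exceeds factor x the median."""
    baseline = max(median(counts_per_interval), 1)  # avoid a zero threshold
    threshold = baseline * factor
    return [i for i, count in enumerate(counts_per_interval) if count > threshold]


def spikes_matching_complaints(counts_per_interval, complaint_intervals, factor=10.0):
    """Return spike intervals that line up with reported VoIP quality problems."""
    spikes = set(broadcast_spikes(counts_per_interval, factor))
    return sorted(spikes & set(complaint_intervals))


# Hypothetical broadcast packets seen in eight consecutive one-minute intervals.
counts = [40, 35, 50, 4200, 45, 38, 3900, 42]
spikes = broadcast_spikes(counts)
suspects = spikes_matching_complaints(counts, complaint_intervals=[3, 5])
```

A complaint that lines up with a spike points toward the storm; a complaint with no matching spike suggests looking elsewhere.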
Capture agents on router/switch ports can capture and forward every call that falls below a threshold score established through MOS or PESQ. When a phone is configured to carry 20 milliseconds (ms) of voice in every packet, a packet should arrive every 20 ms. Variation in arrival time or dropped packets are detected by the capture agent. If the test shows packets with errors, it helps pinpoint VoIP quality issues such as routers dropping packets. If errors are localized, packet errors that degrade voice quality can be stopped. For example, RF, EMI and other noise can cause packet errors. With proper testing, you may find a cable run adjacent to a bell that is the source of massive packet errors every time the bell goes off.
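A minimal sketch of that kind of check follows, assuming each captured packet is reduced to its RTP sequence number and arrival time. It counts gaps in the 16-bit sequence numbers as lost packets and flags packets whose spacing departs from the expected 20 ms. The function name, the 5 ms tolerance, and the sample capture are illustrative assumptions, not the behavior of any specific capture agent.

```python
def audit_stream(packets, packet_ms=20.0, tolerance_ms=5.0):
    """Audit (sequence_number, arrival_ms) pairs listed in arrival order."""
    lost = 0
    reordered = 0
    off_schedule = []  # sequence numbers that arrived early or late
    highest_seq = None
    prev_arrival = None
    for seq, arrival_ms in packets:
        if highest_seq is None:
            highest_seq = seq
        else:
            step = (seq - highest_seq) & 0xFFFF  # sequence numbers wrap at 16 bits
            if 0 < step < 0x8000:
                lost += step - 1  # skipped numbers are presumed lost for now
                highest_seq = seq
            else:
                reordered += 1  # duplicate, or older than a packet already seen
                if step != 0 and lost > 0:
                    lost -= 1  # a late arrival fills a gap counted earlier
            if abs((arrival_ms - prev_arrival) - packet_ms) > tolerance_ms:
                off_schedule.append(seq)
        prev_arrival = arrival_ms
    return {"lost": lost, "reordered": reordered, "off_schedule": off_schedule}


# Hypothetical capture: packet 103 never arrives and the stream then stalls.
capture = [(100, 0.0), (101, 20.0), (102, 40.5), (104, 80.0), (105, 131.0), (106, 140.0)]
report = audit_stream(capture)
```

The packet that follows a loss is also flagged as off schedule, since the gap doubles its spacing; that is usually the desired behavior when the goal is to locate the moment trouble began.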
To isolate VoIP problems, the first step is determining whether the problem is in the network, a provisioning issue, CPE-related, or a customer issue. It is always important to evaluate quality from the end user’s perspective, so the first step is to perform an IP phone emulation test. By listening for the reported complaint (bad voice quality, echo, garbled speech, clicking sounds), you can immediately determine if the CPE is the source of the problem.
In the next step, attempt registration of the test device with the gateway/proxy server. The test set conducts signaling and displays signaling error messages, as well as proving connectivity to the gatekeeper/proxy server and verifying provisioning of the customer’s unique alias. You can also use this test device to place test calls on and off the network. If off-net calls fail, gateway provisioning may be causing the trouble, or there may be a connectivity issue that can be identified by pinging the gateway device IP address.
If the ping test fails, perform a set of trace route tests, which helps isolate path/device connectivity problems. A handheld test set allows tests from multiple locations to isolate the source of the trouble. For example, calls can be placed from the next router down to the ingress router to diagnose whether the problem is on the LAN or WAN link.
Common troubleshooting tests include handset volume, processor, microphone, earpiece, echo canceller performance; TDM voice quality, echo; network packet performance; packet throughput; packet loss; packet delay/latency; packet jitter. Yet even when jitter and packet loss values are fine, there may still be problems reported. This is why it’s important to listen to calls. Objective voice quality is established by generating a MOS by taking measurements at the packet interface of live calls.
Of course, customer quality is based upon an experience. Your MOS values are based upon values generated on a test device; even if the test shows a good MOS score, the call must be captured and listened to using a test set with the ability to play back the voice and emulate the actual VoIP phone.
The converged enterprise IP network shares many applications: e-mail, data, instant messaging, Internet access and, increasingly, voice. In addition, new applications, server moves, and adding/deleting workstations, IP phones, and printers are happening all the time. With such an unstable environment, the load on the network is always changing and can present problems for the VoIP application running on the same network.
When server consolidation occurs, there is now new traffic over the WAN that never existed and possibly never left the building. More often than not, VoIP is a victim of changing IP traffic. Upon testing, you may find jitter, which is usually caused by excessive traffic loads on the network. The first rule of VoIP troubleshooting is to ask: What else is going on in the network? How come excessive jitter wasn’t there before?
Adding VoIP can cause its own network problems. If e-mail and data have the same priority as voice, then no application has priority. Packets must be tagged properly by CPE and honored by network elements. For example, tests may show that priority is only set in one direction for a particular router, slowing data traffic. With lots of new VoIP calls on the WAN, when someone tries to do a file transfer (which then takes a back seat to voice), the network slows and the help desk phone rings. More often than not, more bandwidth is the solution.
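On IP networks, priority tagging is commonly done with the Differentiated Services Code Point (DSCP) carried in the IP header. The sketch below shows how an endpoint application might mark its voice packets. It assumes the Expedited Forwarding class (DSCP 46), a value widely used for voice but not one specified here, and the helper names are hypothetical. Some operating systems ignore or restrict this socket option.

```python
import socket

EF_DSCP = 46  # Expedited Forwarding; assumed here as the voice class


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DSCP value in the upper bits of the IP TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2


def open_marked_udp_socket(dscp: int = EF_DSCP) -> socket.socket:
    """Open a UDP socket whose outgoing IPv4 packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


voice_tos = dscp_to_tos(EF_DSCP)  # the TOS byte value a capture should show
```

Marking at the endpoint is only half of the job: a packet capture on each side of a router shows whether the marking survives and is honored in both directions.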
When in doubt, listen
When testing VoIP, it is most important to remember that this is a voice service. Measuring jitter and delay provides important clues for troubleshooting. Yet at some point, resolving a VoIP problem requires that you follow the second rule of VoIP troubleshooting:
Listen to calls to maintain and improve QoE for VoIP services.
Assaji Aluwihare is general manager/network enterprise and test, and Gary Meyer and Thad Ward are product managers with JDSU's Communications Test and Measurement business (www.jdsu.com)