Twelve years ago I developed a system with a couple of other folks to determine 
and report on the quality of individual VoIP calls in real time. I really wish I 
could show it to you, but the work was done under contract, and I was never 
released to offer the code as open source or to any other party. The software 
would need some work to bring it up to date anyway, but I can give you the gist 
of it, which might help with your evaluation, or if you decide to build 
something of your own.

The GeoPacket CQA (Call Quality Analyzer) software would sit on a system with a 
view of all the VoIP traffic to monitor (using port mirroring). A call 
recognizer process would passively watch the signaling packets for device 
registration (so it could match up calls to devices/customers) and call setup. 
When it saw call setup, it would spawn a call analyzer process to sample 
sequences of RTP packets, using packet characteristics to calculate a quality 
score (by default about every 30 seconds during the call). The score data would 
be sent to the GeoPacket CQD (Call Quality Display), which incorporated a MySQL 
database for storage, plus a custom-developed web UI. When the recognizer saw 
the call teardown, it would reap the analyzer process for that call.
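
A rough sketch of that recognizer/analyzer lifecycle, in Python rather than the 
original Perl. This is not the CQA code: the class and method names are made up 
for illustration, the signaling events are assumed to be parsed already, and 
plain objects stand in for the separate analyzer processes.

```python
from dataclasses import dataclass, field


@dataclass
class CallAnalyzer:
    """Stand-in for the per-call analyzer process (hypothetical)."""
    call_id: str
    device: str
    scores: list = field(default_factory=list)


class CallRecognizer:
    """Tracks registrations and live calls from already-parsed signaling events."""

    def __init__(self):
        self.devices = {}    # source address -> device/customer name
        self.analyzers = {}  # call ID -> CallAnalyzer

    def on_register(self, address, device):
        # Remember which device registered from which address.
        self.devices[address] = device

    def on_setup(self, call_id, address):
        # Start one analyzer per call; a repeated setup for the same call is ignored.
        if call_id not in self.analyzers:
            device = self.devices.get(address, "unknown")
            self.analyzers[call_id] = CallAnalyzer(call_id, device)
        return self.analyzers[call_id]

    def on_teardown(self, call_id):
        # Reap the analyzer; returns None if the call was never seen.
        return self.analyzers.pop(call_id, None)


recognizer = CallRecognizer()
recognizer.on_register("10.0.0.5", "desk-phone-17")
recognizer.on_setup("call-abc", "10.0.0.5")
active_during_call = len(recognizer.analyzers)
reaped = recognizer.on_teardown("call-abc")
```

Keying the analyzers by call ID is what lets the teardown find the right one to 
reap, and the registration table is what ties a call back to a device.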

This was mostly implemented in Perl, using libpcap. It was designed to monitor 
up to 3000 simultaneous calls on a 1U rackmount server of the time; modern 
systems with multiple cores and more memory should be able to handle 10-20x 
that number of calls.

The general formula used to calculate the score is patented; you can view that 
here:

http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=7,430,179.PN.&OS=PN/7,430,179&RS=PN/7,430,179

The score was made to match up to PSQM[1]. Determining the formula required a 
lot of calibration using a device called a Telegra from HP/Agilent, which would 
generate audio on one end of a line, and process the audio on the other end to 
determine how much it had degraded and generate a score. I monitored the 
artificial calls with the CQA, and fiddled with different approaches until I 
got something that generated very similar scores. As disclosed in the patent, 
this took the general form:

        score = A + ln(B + CJ) + exp(DL)

where A, B, C, and D are constants determined during the calibration process, J 
is jitter, and L is packet loss. High scores are bad. (I tried a lot of things 
before settling on that formula. It's interesting what it shows: increasing 
jitter degrades quality quickly at first, but the effect levels off 
logarithmically, while a small amount of packet loss is tolerable, but quality 
gets exponentially worse as loss increases.)
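
To see the shape of that formula, here is a small Python sketch. The constants 
are placeholders I picked so the curves are easy to read; they are NOT the 
calibrated values, and the units (jitter in milliseconds, loss in percent) are 
likewise assumptions for the example.

```python
import math

# Placeholder constants for illustration only; the real calibrated
# values are not given in this message.
A, B, C, D = 0.0, 1.0, 0.5, 0.2


def quality_score(jitter, loss):
    """score = A + ln(B + C*J) + exp(D*L); higher is worse."""
    return A + math.log(B + C * jitter) + math.exp(D * loss)


baseline = quality_score(0.0, 0.0)

# Vary jitter with no loss, then loss with no jitter.
for jitter, loss in [(0, 0), (10, 0), (40, 0), (0, 5), (0, 20)]:
    print(f"J={jitter:>3} L={loss:>3} score={quality_score(jitter, loss):.2f}")
```

With any positive C and D the pattern is the same: each extra step of jitter 
adds less to the score than the step before, while each extra step of loss adds 
more.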

As you know, you generally look at three things that impact the quality of 
calls:

1) Latency: the amount of time it takes a packet to get from one end to the 
other.
2) Jitter: variations in inter-packet arrival times.
3) Packet loss: the rate of packets that get dropped along the way (or that are 
excessively delayed, which results in them being ignored).

Packet loss is easy to measure; RTP packets are stamped with sequence numbers, 
so you can see if any are missing. Jitter is also simple: just time the 
arrivals of packets with sufficient precision, and compare the deltas between 
arrival times to see how inconsistent they are.
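
A minimal Python sketch of both measurements, under some assumptions: the 
sequence numbers and arrival times have already been pulled out of the captured 
packets, packets arrive in order, and "inconsistency" is taken as the mean 
absolute deviation of the inter-arrival gaps. That last choice is just one 
simple statistic, not necessarily what the CQA used (RFC 3550, for instance, 
defines a different, smoothed jitter estimate). The function names are 
hypothetical.

```python
from statistics import mean


def count_lost(seqs):
    """Count missing packets from 16-bit RTP sequence numbers.

    Assumes in-order arrival; duplicate packets (gap of 0) are ignored.
    """
    lost = 0
    for prev, cur in zip(seqs, seqs[1:]):
        gap = (cur - prev) % 65536  # modulo handles the wrap from 65535 to 0
        if gap > 1:
            lost += gap - 1
    return lost


def arrival_jitter(arrivals):
    """Mean absolute deviation of inter-arrival gaps, in the input's time unit."""
    deltas = [b - a for a, b in zip(arrivals, arrivals[1:])]
    if len(deltas) < 2:
        return 0.0
    avg = mean(deltas)
    return mean(abs(d - avg) for d in deltas)


# Four packets received across a sequence-number wrap; 65535 never arrived.
seqs = [65533, 65534, 0, 1]
lost = count_lost(seqs)
loss_pct = 100.0 * lost / (lost + len(seqs))

# Arrival times in milliseconds for a nominal 20 ms packet interval.
steady = arrival_jitter([0, 20, 40, 60])
uneven = arrival_jitter([0, 20, 50, 60])
```

Handling out-of-order packets would need a bit more bookkeeping (tracking the 
highest sequence number seen and a received count), which I've left out here.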

Latency is not as easy, since you kinda need to be at both ends to measure it. 
You could send out probing ICMP packets or something to try getting an idea of 
latency, and we considered that. However, an important outcome of our research 
was that it didn't matter. Jitter served as such an effective stand-in that we 
could ignore latency. IOW, jitter and latency are not generally independent; 
situations of high latency usually result in the most jitter, and low latency 
usually means little jitter. We demonstrated this with a wide variety of 
customers, from those in offices only miles from the vendor's POP, to those 
using satellite links in the hinterlands of Afghanistan.

Anyway, I had a lot of fun with this project. I'd be happy to follow up on- or 
offlist if you'd like.

[1] http://en.wikipedia.org/wiki/PSQM

- Leon

On May 8, 2015, at 5:10 AM, Evan Pettrey wrote:

> Greetings folks,
> 
> I'm currently in the process of trying to put in place better proactive 
> monitoring of our VoIP environment and I'm hoping to tap into the wisdom of 
> some of you that have more experience with this than myself.
> 
> The challenge that we're running into is that there are simply so many 
> different things to monitor and it's not something that is on/off or that 
> relies solely on hardware utilization to fire off alerts. Currently we 
> typically don't know there is a problem until users report that audio on 
> their calls is dropping in and out.
> 
> Essentially we need to be able to monitor SIP and RTP traffic flow end-to-end 
> to be alerted when there is jitter or anything else that could affect call 
> quality. This should include (but may not be limited to):
>   • Internal Network - Juniper hardware (I've come across some really good 
>     VoIP monitoring tools like the one available from Solarwinds but they 
>     require Cisco hardware)
>   • VoIP Servers - Running Asterisk 1.8
>   • SIP Trunks - We have limited ability to monitor traffic once it hits our 
>     SIP trunks so this poses a challenge but it needs to be monitored 
>     nonetheless
> 
> We've set things up like Homer and while that is a great tool for 
> retroactively troubleshooting issues reported it does not do much to alert us 
> to problems proactively.
> 
> 
> Any guidance here would be greatly appreciated. Thanks in advance!
> 
> 
> Best,
> Evan
_______________________________________________
Discuss mailing list
Discuss@lists.lopsa.org
https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
 http://lopsa.org/
