Twelve years ago I developed a system with a couple of other folks to determine and report on the quality of individual VoIP calls in real time. I really wish I could show it to you, but the work was done under contract, and I've never been released to offer the code as open source or to any other party. In any case, it would take a little work to update the software, but I can give you the gist of it, which might help with your evaluation, or in case you decide to build something on your own.

The GeoPacket CQA (Call Quality Analyzer) software would sit on a system with a view of all the VoIP traffic to monitor (using port mirroring). A call recognizer process would passively watch the signaling packets for device registration (so it could match up calls to devices/customers) and call setup. When it saw call setup, it would spawn a call analyzer process to sample sequences of RTP packets, using packet characteristics to calculate a quality score (by default about every 30 seconds during the call). The score data would be sent to the GeoPacket CQD (Call Quality Display), which incorporated a MySQL database for storage, plus a custom-developed web UI. When the recognizer saw the call teardown, it would reap the analyzer process for that call.

This was mostly implemented in Perl, using libpcap. It was designed to monitor up to 3000 simultaneous calls on a 1U rackmount server of the time; modern systems with multiple cores and more memory should allow for at least 10-20x the number of calls.

The general formula used to calculate the score is patented; you can view that here:

http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=7,430,179.PN.&OS=PN/7,430,179&RS=PN/7,430,179

The score was made to match up to PSQM[1]. Determining the formula required a lot of calibration using a device called a Telegra from HP/Agilent, which would generate audio on one end of a line, and process the audio on the other end to determine how much it had degraded and generate a score. I monitored the artificial calls with the CQA, and fiddled with different approaches until I got something that generated very similar scores.

As disclosed in the patent, this took the general form:

    score = A + ln(B + CJ) + exp(DL)

where A, B, C, and D are constants determined during the calibration process, J is jitter, and L is packet loss. High scores are bad. (I tried a lot of things before settling on that formula.
It's interesting what it shows: increasing jitter makes things bad quickly, although the rate of increase slows down as the natural logarithm, while a small amount of packet loss is tolerable, but makes things exponentially worse as it increases.)

As you know, you generally look at three things that impact the quality of calls:

1) Latency: the amount of time it takes a packet to get from one end to the other.
2) Jitter: variations in inter-packet arrival times.
3) Packet loss: the rate of packets that get dropped along the way (or that are excessively delayed, which results in them being ignored).

Packet loss is easy to measure; RTP packets are stamped with sequence numbers, so you can see if any are missing. Jitter is also simple: just time the arrivals of packets with sufficient precision, and compare the deltas between arrival times to see how inconsistent they are.

Latency is not as easy, since you kinda need to be at both ends to measure it. You could send out probing ICMP packets or something to try getting an idea of latency, and we considered that. However, an important outcome of our research was that it didn't matter. Jitter served as such an effective stand-in that we could ignore latency. IOW, jitter and latency are not generally independent; situations of high latency usually result in the most jitter, and low latency usually means little jitter. We demonstrated this with a wide variety of customers, from those in offices only miles from the vendor's POP, to those using satellite links in the hinterlands of Afghanistan.

Anyway, I had a lot of fun with this project. I'd be happy to follow up on- or offlist if you'd like.

[1] http://en.wikipedia.org/wiki/PSQM

- Leon

On May 8, 2015, at 5:10 AM, Evan Pettrey wrote:

> Greetings folks,
>
> I'm currently in the process of trying to put in place better proactive
> monitoring of our VoIP environment and I'm hoping to tap into the wisdom of
> some of you that have more experience with this than myself.
>
> The challenge that we're running into is that there are simply so many
> different things to monitor and it's not something that is on/off or that
> relies solely on hardware utilization to fire off alerts. Currently we
> typically don't know there is a problem until users report that audio on
> their calls is dropping in and out.
>
> Essentially we need to be able to monitor SIP and RTP traffic flow end-to-end
> to be alerted when there is jitter or anything else that could affect call
> quality. This should include (but may not be limited to):
> • Internal Network - Juniper hardware (I've come across some really
> good VoIP monitoring tools like the one available from Solarwinds but they
> require Cisco hardware)
>
> • VoIP Servers - Running Asterisk 1.8
>
> • SIP Trunks - We have limited ability to monitor traffic once it hits
> our SIP trunks so this poses a challenge but it needs to be monitored
> nonetheless
>
> We've set things up like Homer and while that is a great tool for
> retroactively troubleshooting issues reported it does not do much to alert us
> to problems proactively.
>
>
> Any guidance here would be greatly appreciated. Thanks in advance!
>
>
> Best,
> Evan

_______________________________________________
Discuss mailing list
Discuss@lists.lopsa.org
https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
http://lopsa.org/
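
A minimal sketch of the measurements described in the reply above, for anyone who wants to experiment: packet loss from gaps in RTP sequence numbers, jitter from the spread of inter-arrival gaps, and a score of the general form A + ln(B + CJ) + exp(DL). This is not the GeoPacket code (which was mostly Perl and was never released). The function names are made up for illustration, the jitter measure (mean absolute deviation of the gaps) is only one simple way to quantify "how inconsistent they are", and the constants A, B, C, and D are placeholders, since the calibrated values are not given in the message. It also assumes packets arrive in order with no duplicates.

```python
import math

SEQ_MOD = 1 << 16  # RTP sequence numbers are 16 bits and wrap around


def packet_loss_rate(seq_numbers):
    """Fraction of packets missing from an in-order run of RTP sequence numbers."""
    if len(seq_numbers) < 2:
        return 0.0
    # Sum the forward steps modulo 2**16 so a wrap (65535 -> 0) counts as 1.
    span = sum((cur - prev) % SEQ_MOD
               for prev, cur in zip(seq_numbers, seq_numbers[1:]))
    expected = span + 1
    return 1.0 - len(seq_numbers) / expected


def interarrival_jitter(arrival_times):
    """Mean absolute deviation of the gaps between packet arrival times."""
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    if len(gaps) < 2:
        return 0.0
    avg = sum(gaps) / len(gaps)
    return sum(abs(g - avg) for g in gaps) / len(gaps)


def quality_score(jitter, loss, a=0.0, b=1.0, c=1.0, d=1.0):
    """score = A + ln(B + C*J) + exp(D*L); higher is worse.

    The default constants are placeholders, NOT the calibrated values.
    """
    return a + math.log(b + c * jitter) + math.exp(d * loss)


if __name__ == "__main__":
    # Hypothetical sample: sequence numbers with one gap across a wrap,
    # and arrival times in seconds for a nominal 20 ms packet interval.
    seqs = [65534, 65535, 0, 2, 3]
    arrivals = [0.000, 0.020, 0.050, 0.060, 0.080]
    loss = packet_loss_rate(seqs)
    jitter = interarrival_jitter(arrivals)
    print(loss, jitter, quality_score(jitter, loss))
```

With real calibrated constants, the units of J and L would have to match whatever was used during calibration; the sketch leaves that open.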