CDR capture via TCP/IP on Linux host 16

telcomwork · Aug 25, 2005

Anyone have a new download path for netcat. The one posted here looks to be gone...

I would liek to setup CDR on a windows XP workstation.

telcomwork · Aug 25, 2005

Never mond just saw it in about 5 posts up... skipped right over it

84Mike · Dec 8, 2005

I'm seconding LittleHopper's post from a while back.

About every minute, I get a beep and a line of smiley faces, hearts and spades like this show up (????? ??????). If I look a the "my-billing.txt" file, it shows a bunch of squares ( ).

84Mike · Dec 9, 2005

Problem solved! Page 3 of 'cha ip-services' set Reliable Protocol to 'n'. Fixed my problem with invalid characters.

VOIPeng · Dec 9, 2005

Then is there a way I could set up this Linux box to server the records to multiple other hosts, either send them to multicast IP address or allow other box to "poll" this Linux server.

Thank you,
VOIPEng

JeffSaxe · Feb 2, 2006

My first post on this site, so forgive any social faux pas.

First, thanks so much to J.C. for posting this in the first place, and thanks so much to Tek-Tips for hosting this forum and also for having an easy-to-robotically-crawl link structure so that Google would find it. J.C., I had a feeling this capturing call data could not be rocket science, and I was wondering why my boss had asked me to download a demo of a $3,000 (!!) package to do this; your brief instructions and a little bit of Perl hackery has saved our company a bunch of money. You rock, sir.

Second, here are a couple of "gotchas" of which I will remind other readers, to save you some time:

* If you have previously been using Avaya's crappy "Definity Reliable Transport Tool", which runs as a GUI interactive, has to be started manually on restart of the machine, crashes and burns all the time, and doesn't recover after it crashes and burns and hangs 10 windows open (can you see our frustration with this tool?), then your Avaya switch was already set to use "reliable" transport mode. I'm guessing Avaya implemented a reliable acknowledge-and-retransmit protocol for their original serial port data output, which was a good idea, but then when they started sending the data over a TCP/IP socket connection (which is already reliable by design!!), they just layered the same protocol on top of that. Having it go over TCP eliminates the need for Avaya's own reliable transport. So, as 84Mike notes a couple posts ago, you must go into screen 3 of "change ip-services" and tell the CDR1 service to not use its reliable protocol. Then when it makes the outbound connection to your Netcat listener, it just pours out data, no funny stuff, and no responses required from Netcat other than the TCP ACKs. Mine stays up for hours and hours (our whole working day) on one socket; it might stay up for days but I haven't yet increased my netcat timeout.

* The call data you receive is precisely what you ask for on page 2 of "change system-parameters cdr". Choose your fields there, then take note of their lengths and design your SQL tables to match. If you're going to chop it up and process or store it in a database for analysis, then you don't truly need spaces between the fields, as long as your program can count characters accurately.

If you're just archiving the records into text files that you're going to grep through and read as a human, you probably want the spaces. Also, the line endings are your choice: If you're spitting it out toward a Windows machine running Netcat, you probably want a "return" and a "line-feed" at the end, while going to Linux you probably just want a "line-feed".

* At our company we want the call data in a SQL server, so I started to write a Perl script that directly receives the data from Netcat and writes it to SQL in real time, with ODBC calls. This should work fine, but it's risky, since if the SQL server flakes out or is down for maintenance, we run the risk of losing data. I thought it would be more reliable to capture each line into a local hard drive text file, then after each file is cleanly cut off, attempt to insert them into the SQL tables, but if there's any error returned from the SQL calls, just hold the file and try again in a little while. Since my Netcat is never being stopped and started under normal circumstances, and the Perl script is actually creating the split-up files, I believe this is slightly better than "killall" every hour in terms of never losing call records, but I have to hope the PBX has some kind of small buffer so it wouldn't lose them anyway when it needs to reestablish the TCP session.

I have this basically working; it's just two independently-running Perl scripts running forever as services. If anyone has a good place to host them (a nice, stable Web site that's not likely to disappear in a year or two), I'll gladly hand them up for public posting, and we can link to them from here. Or I can copy-and-paste the entire code right into this thread; it's not that long. Give back to the community!

-- Jeff Saxe, Network Engineer
Crutchfield Corp.
Charlottesville, Virginia
(my license plate reads "PERL ROX")

FortKnox · Feb 2, 2006

JeffSaxe,

A few years ago I found a guy on irc that helped me develop a listener and an app to dump my call data into SQL server. Since then, I have done a lot with the data.

For example, I wrote web pages that display call data based on criteria that you specify and it also shows you the cost of the call. If I know that my call rate on trunk 12 is $.05/min and and trunk 15 is $.08/min then I can tell who is costing us the most amount of money. Also, in my web page it shows the location being called/called from. I rounded up some data on the internet that I eventually got all loaded into SQL server with the NPA and NXX, so for example, 615 is TN and 295 is Nashville, then I know the call was place to or from Nashville, TN. So far, I have 606,000 calls in my database.

The fella that wrote the app for me wrote it in .Net

BHodgins · Feb 2, 2006

I would love to see the perl scripts after they have been uploaded.

Cheers,
Bill

JeffSaxe · Feb 6, 2006

OK, theoretically these are a work product that belongs to Crutchfield Corporation, my employer, but my boss says I may post them. I hereby give them away to the public domain.

Here is the first program, AvayaCDRCapturetoSQL.pl. This sits around forever, starting a Netcat listener on a specific port, waiting for the output from the PBX, lightly processing it (throwing away too-short lines, rearranging dumb date/time formats into superior ones), and tossing out SQL stored-procedure-execute statements into files. The files are named by a date/time stamp, so every few minutes, the file will get cut off and the statements will go into a new file. This is extremely clean, no loss of data between files. It does involve a rapid-fire open-append-close for every row, but this is not a problem for any modern OS and caching filesystem, and the rate of output of call data records is pretty slow (depending, of course, on how big your PBX is and how many calls you take). So this creates a series of files in the GoingToSQL subdirectory, each with 5 minutes' worth of activity. If the other service dies or hangs for a while, these files will just stack up and wait.

Oh, obviously you will have to customize the EXPECTEDLENGTH number! It's going to be one or two less than the record length quoted at the bottom of "change sys cdr", depending on your line ending characters. Also, you will have to customize the regular expression that slices up the line based on which fields you are asking the PBX to spit out, and what their field lengths are. It's very easy. Also note that I'm outputting mostly character fields for storage, but one field, call duration, is an integer, so I'm outputting that without apostrophes in the SQL statement. If it will make it clearer, I'm asking for the following fields: date, space, time, space, duration, space, calling-num, space, vdn, space, dialed-num, space, in-trk-code, space, in-crt-id, space, out-crt-id, return, line-feed.

Code:

my $EXPECTEDLENGTH = 77;	# this MUST be set to the record length as reported by the "change system-parameters cdr" screen, NOT INCLUDING any line-ending characters
my $PORTNUMBER = 9000;	# TCP port number on which to listen -- must be set to whatever "remote port" the PBX will try to connect to (in "change ip-services")
my $IDLETIMEOUT = 900;	# a safety measure in case a particular Netcat listener process goes off into the ozone; it shouldn't wait forever when it doesn't hear anything, it should quit itself within 15 minutes
my $COLLECTIONPERIOD = 5;	# after this many minutes, start a new output file

mkdir "./GoingToSQL";

while (1) {	# forever, until this entire program is killed...

my $childPID = open FH, "nc -l -p $PORTNUMBER -w $IDLETIMEOUT |"
	or die "Could not open a subshell piped from a netcat listener: $!\n";

my $currentTime = localtime();
print "\n\n-- $currentTime   New child PID is $childPID\n";

while (<FH>) {
	chomp;	# records arrive correctly CR-LF terminated, because we asked for those characters in the PBX's "change system-parameters cdr" screen
	# print length($_), '  ', $_, "\n";

	if (length($_) < $EXPECTEDLENGTH) {
		# print "Above line discarded (should be a date/time stamp).\n";
		next;
	}

	if (length($_) == $EXPECTEDLENGTH + 3) {
		# print "Chop off first three chars.\n";
		$_ = substr($_, 3);
		# print "New line:\n";
		# print length($_), '  ', $_, "\n";
		#    ...and fall through to next case to process this line
	}

	if (length($_) == $EXPECTEDLENGTH) {
		# print "Process as normal.\n";
		$_ =~ /^(.{2})(.{2})(.{2}) (.{2})(.{2}) (.{4}) (.{15}) (.{7}) (.{23}) (.{4}) (.{3}) (.{3})$/;

		($second, $minute, $hour, $dayofmonth, $month, $year) = localtime;
		$minute -= ($minute % $COLLECTIONPERIOD);	# round down to nearest 5-minute boundary
		my $filename = sprintf('./GoingToSQL/AvayaCDR-%04d-%02d-%02d-%02d%02d.sql',
			$year + 1900, $month + 1, $dayofmonth, $hour, $minute);
		open FHoutput, ">>$filename"
			or die "Could not open $filename for append: $!\n";
		print FHoutput "exec CallAccountingDataAdd_tw '20$3-$1-$2 $4:$5',$6,'$7','$8','$9','$10','$11','$12'\n";
		close FHoutput;
		next;
	}

	print "Unknown line length!\n";
}


close FH;
}

Here is the second program, AvayaCDRPerformAndArchive.pl. This watches in the GoingToSQL directory for files, waits until any given file is at least a bit more than 5 minutes old (so the first program is guaranteed not to write any more rows into it, i.e., the file is done), then tries to run it against the SQL server. Note that in this case I'm on a Windows platform, "osql" is the SQL batch execute command, our server is named "dgoldmine", we use a trusted Windows login to SQL, the database is called CMS, and -- quite importantly -- I used the "-b" option to osql, which causes any problem or error in the execution to stop and return a non-zero result code from the system() call. So if the server is down for a while, or the database drive fills up, or someone changes the stored procedure, it gives an error, and this program leaves the file around to try again in 30 seconds. If the file processes fine, then it's moved to a Done folder. The output of each given SQL script is captured to a similarly-named file, so if it worked, you have confirmation later, and if it's getting problems, you have the output so you can troubleshoot.

Code:

my $REQUIREDAGE = 5 * 60 + 10;	# 5 minutes plus a "slop factor" of 10 seconds; file must be this many seconds old before it's considered "cut off" -- must be at least a bit larger than the cutoff boundary of the AvayaCDRCapturetoSQL program!

my $documentName;
my $modTime;
my $secondsOld;


mkdir "./Done";

while (1) {	# forever, until we are killed...

my $currentTime = time();
my @potentialFiles = glob "./GoingToSQL/*.sql";
#    print join("\n",@potentialFiles),"\n\n";

for ($i = 0; $i < (scalar @potentialFiles); $i++) {
	print "Found $potentialFiles[$i]: ";
	$modTime = (stat ($potentialFiles[$i]))[9];
	$secondsOld = $currentTime - $modTime;
	print "$secondsOld seconds old; ";
	if ($secondsOld > $REQUIREDAGE) {
		print "Process\n";
		$newFile = $oldFile = $potentialFiles[$i];
		$newFile =~ s:^\./GoingToSQL/:./Done/:;
		$resultCode = system("osql -S dgoldmine -E -d CMS -b -i $oldFile -w 8000 -o $oldFile.log");
		if ($resultCode == 0) {
			rename "$oldFile", "$newFile";
			rename "$oldFile.log", "$newFile.log";
		} else {
			print "Error encountered trying to run SQL script $oldFile; retrying later\n";
		}
	} else {
		print "Skip it\n";
	}
}

print "\n";
sleep 30;

}

That's about it. I have both of these running as console-interactive Scheduled Jobs in Windows (i.e., not as official Service Control Manager services, but as jobs that run constantly and start themselves every 10 minutes if they die for any reason), but you may use your favorite method to keep them running in the background. I'm not particularly logging the scrolling output, since the programs are pretty bulletproof; if the machine's on and the programs are running, and nobody else started a listener on port 9000, there's not a whole lot of ways it can fail. When run as scheduled jobs, they show up on screen, so you can watch them for any needed troubleshooting output, but they can also be remotely stopped and started by administrators without logging into the machine directly.

The 5-minute timeout does create lots of small files; I tried to strike a balance between number of little files and latency for a phone call to get into the database. In my case, it's nice to know that a call will be in the database at most 5-6 minutes after it ends. If your requirements are more relaxed, you could change COLLECTIONPERIOD to 10 minutes or even 60 minutes, then correspondingly increase the REQUIREDAGE so the processing program waits the required amount of time before considering a file done.

Enjoy! Hope this gives everyone a leg up on the process.

-- Jeff Saxe
JSaxe --at-- Crutchfield.com, in case you want to email me offline without posting here

JeffSaxe · Feb 6, 2006

Dang it. I didn't read the documentation carefully, so I didn't realize that default "duration" field was not directly in seconds, but hours-minutes-tenths-of-minute. Thanks to Eliav (above) for the descriptions. Here is a revised edition of the capturing program, with the "sec-dur" field (which the PBX spits out as 1-digit hour, 2-digit minute, 2-digit second) multiplied out into a simple integer number of seconds. This struck me as a more useful measurement to insert into the database, since we could calculate on it more flexibly.

Code:

my $EXPECTEDLENGTH = 78;	# this MUST be set to the record length as reported by the "change system-parameters cdr" screen, NOT INCLUDING any line-ending characters
my $PORTNUMBER = 9000;	# TCP port number on which to listen -- must be set to whatever "remote port" the PBX will try to connect to (in "change ip-services")
my $IDLETIMEOUT = 900;	# a safety measure in case a particular Netcat listener process goes off into the ozone; it shouldn't wait forever when it doesn't hear anything, it should quit itself within 15 minutes
my $COLLECTIONPERIOD = 5;	# after this many minutes, start a new output file

mkdir "./GoingToSQL";

while (1) {	# forever, until this entire program is killed...

my $childPID = open FH, "nc -l -p $PORTNUMBER -w $IDLETIMEOUT |"
	or die "Could not open a subshell piped from a netcat listener: $!\n";

my $currentTime = localtime();
print "\n\n-- $currentTime   New child PID is $childPID\n";

while (<FH>) {
	chomp;	# records arrive correctly CR-LF terminated, because we asked for those characters in the PBX's "change system-parameters cdr" screen
	# print length($_), '  ', $_, "\n";

	if (length($_) < $EXPECTEDLENGTH) {
		# print "Above line discarded (should be a date/time stamp).\n";
		next;
	}

	if (length($_) == $EXPECTEDLENGTH + 3) {
		# print "Chop off first three chars.\n";
		$_ = substr($_, 3);
		# print "New line:\n";
		# print length($_), '  ', $_, "\n";
		#    ...and fall through to next case to process this line
	}

	if (length($_) == $EXPECTEDLENGTH) {
		# print "Process as normal.\n";
		$_ =~ /^(.{2})(.{2})(.{2}) (.{2})(.{2}) (.{1})(.{2})(.{2}) (.{15}) (.{7}) (.{23}) (.{4}) (.{3}) (.{3})$/;

		($second, $minute, $hour, $dayofmonth, $month, $year) = localtime;
		$minute -= ($minute % $COLLECTIONPERIOD);	# round down to nearest 5-minute boundary
		my $filename = sprintf('./GoingToSQL/AvayaCDR-%04d-%02d-%02d-%02d%02d.sql',
			$year + 1900, $month + 1, $dayofmonth, $hour, $minute);
		open FHoutput, ">>$filename"
			or die "Could not open $filename for append: $!\n";
		$durationInSeconds = ($6 * 3600) + ($7 * 60) + ($8);
		print FHoutput "exec CallAccountingDataAdd_tw '20$3-$1-$2 $4:$5',$durationInSeconds,'$9','$10','$11','$12','$13','$14'\n";
		close FHoutput;
		next;
	}

	print "Unknown line length!\n";
}


close FH;
}

Sorry for my confusion.

dvc1 · May 22, 2006

I have a definity g3 system, and seem to be missing the change ip-services command, is there a work around?

thanks

alberton · Jul 5, 2006

Could anyone show me how to configure the Netcat with Windows. When I execute the .exe file it gime me a Cmd Line. If I type the nc sentence the black screen desappeared and it give me a my-billing.txt emptly.
What is wrong?

Alberton

FortKnox · Jul 5, 2006

First,

Click Start/Run/Cmd

Now you should be at the command line.

CD to the folder where netcat is.

So like, now you should be at something like this:

C:\Netcat

Now, type it just like this:

C:\Netcat\netcat -l -p 50000 >> my-billing.txt &

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

CDR capture via TCP/IP on Linux host 16

telcomwork

Technical User

telcomwork

Technical User

84Mike

Technical User

84Mike

Technical User

VOIPeng

IS-IT--Management

JeffSaxe

MIS

FortKnox

Technical User

BHodgins

IS-IT--Management

JeffSaxe

MIS

JeffSaxe

MIS

dvc1

Technical User

alberton

Technical User

FortKnox

Technical User

Similar threads

Part and Inventory Search

Sponsor