After yet another few days working out how to calculate an accurate average of the timestamps of the packets in a pcap, I found out that I was using the wrong timestamp values. Duh! The packet.timestamp values coming out of the Radiotap information for the beacons were likely just synchronisation info with little objective basis in real life. The time of arrival, however, was simply to be found in the tcpdump (pcap) packet header using:
<><><>
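import pcap

p = pcap.pcapObject()
p.open_offline("/home/file")
# next() returns a (length, data, timestamp) tuple for the next packet
packet = p.next()
# the third element is the capture (arrival) time from the pcap header
timestamp = packet[2]
<><><>
You can see why the Duh! then.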
With that out of the way, I needed to organise how to start calculating statistics (specifically the mean and standard deviation) from the data, remembering of course that I couldn't cycle over the data a second time to calculate them. Well, I could, but that would at least double the time spent in calculation. Admittedly this particular application isn't time sensitive, but with something like 25 million packets to look through, any doubling needs to be avoided!
With a quick google and more than a nod in this direction, I found a C implementation. That's perfect, only I'm writing in Python. Not too bad, but it did take me a while to get it converted, so I thought I would share it.
<><><>
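from math import sqrt

def std_dev_run(a):
    """Population standard deviation of the sequence a, in a single pass."""
    n = len(a)
    if n == 0:
        return 0
    total = 0.0    # running sum of the values
    sq_sum = 0.0   # running sum of the squared values
    for x in a:
        total += x
        sq_sum += x * x
    mean = total / n
    variance = sq_sum / n - mean * mean
    # abs() stops sqrt() raising a domain error when variance comes out negative
    return sqrt(abs(variance))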
<><><>
The only appreciable difference is the inclusion of the abs() call. The math module kept throwing domain errors at me, and it turns out the variance was coming out negative. And of course square roots of negative numbers tend to make computers hate you a little.
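A likely cause of the negative variance (my assumption, not something checked against the original capture) is floating-point cancellation: with epoch timestamps around 10^9, sq_sum / n and mean * mean are two huge, nearly equal numbers, so their difference is mostly rounding error. Wrapping it in abs() stops the exception but does not make the result accurate. One common workaround is to subtract a constant close to the data (here simply the first value) before accumulating. The function name std_dev_shifted is made up for this sketch.

```python
from math import sqrt

def std_dev_shifted(a):
    """Single-pass population standard deviation on shifted values.

    Subtracting a constant does not change the variance, but it keeps
    the accumulated sums small, which limits cancellation error.
    """
    n = len(a)
    if n == 0:
        return 0.0
    shift = a[0]           # assumption: the first sample is close to the mean
    total = 0.0
    sq_sum = 0.0
    for x in a:
        d = x - shift
        total += d
        sq_sum += d * d
    mean_d = total / n
    variance = sq_sum / n - mean_d * mean_d
    # clamp any tiny negative round-off to zero instead of flipping its sign
    return sqrt(max(variance, 0.0))

# made-up values in the range of epoch timestamps
example = [1000000001.0, 1000000002.0, 1000000003.0, 1000000004.0, 1000000005.0]
result = std_dev_shifted(example)
```

The mean of the original data, if needed, is just shift + mean_d.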
Oh yeah, and THEN I realised that a single pass over an array isn't actually what I need. After all, I don't want to store 25 million packets in an array, or even the values from them, so a single pass over stored data won't help. My intention is to produce a single value from each packet and have that update the necessary stats before the packet is discarded. The upshot is that what I actually need is an 'online' or 'running' standard deviation calculation. Best get started on that one now...
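For reference, one well-known way to do a running calculation like this is Welford's method, which updates the mean and a sum of squared deviations one value at a time and never stores the values themselves. The sketch below is just one way it could look, not necessarily where this ends up. The class name RunningStats and the sample values are invented for illustration.

```python
from math import sqrt

class RunningStats:
    """Running mean and population standard deviation (Welford's method)."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new value into the statistics, then forget it."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std_dev(self):
        """Population standard deviation of everything seen so far."""
        if self.n == 0:
            return 0.0
        return sqrt(self.m2 / self.n)

stats = RunningStats()
# made-up values standing in for one number extracted from each packet
for value in (1.0, 2.0, 3.0, 4.0, 5.0):
    stats.update(value)
```

In the packet loop, each call to update() would take the value derived from the current packet, so memory use stays constant however many packets there are. Because m2 is built from differences against the running mean rather than from raw squares, it should also be much less prone to the negative-variance problem.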