Wanted: programming help in sorting out data

Bob Mehew

I am looking for someone to help write a program, preferably in Python, to sort two sets of time-based data into order. There are (at least) two complicating factors. The first is that the date and time format is likely to differ between the two sets of data (e.g. 22/08/2019 09:32:23 and 22/08/19 9:33, and no doubt other variants). The second is that the data is usually in txt or csv format but with different column delimiters (e.g. tab, comma, space). One data set also contains the occasional null value for CO2, plus a host of program progress lines which need ignoring.
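As a rough illustration of the parsing problem described above, here is a minimal sketch using only the standard library. The names `FORMATS`, `parse_timestamp`, `split_fields` and `parse_line` are made up for this example, and the column layout is an assumption (timestamp first, date and time separated by a space, CO2 as the next value); the real logger files may well differ.

```python
import re
from datetime import datetime

# Candidate timestamp layouts; extend this list as new variants turn up.
FORMATS = [
    "%d/%m/%Y %H:%M:%S",  # 22/08/2019 09:32:23
    "%d/%m/%y %H:%M",     # 22/08/19 9:33
    "%d/%m/%Y %H:%M",
    "%d/%m/%y %H:%M:%S",
]

# A date and time at the start of the line (ASSUMPTION: space between them).
STAMP_RE = re.compile(r"\s*(\d{1,2}/\d{1,2}/\d{2,4}\s+\d{1,2}:\d{2}(?::\d{2})?)")


def parse_timestamp(text):
    """Return a datetime for the first layout that fits, or None."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


def split_fields(rest):
    """Split whatever follows the timestamp on comma, tab or spaces."""
    rest = rest.strip("\r\n")
    for sep in (",", "\t"):
        if sep in rest:
            # The first piece is the empty text before the separator
            # that follows the timestamp, so drop it.
            return [f.strip() for f in rest.split(sep)][1:]
    return rest.split()


def parse_line(line):
    """Return (datetime, co2), or None for lines that should be ignored."""
    m = STAMP_RE.match(line)
    if not m:
        return None  # progress message, header or blank line
    when = parse_timestamp(m.group(1))
    fields = split_fields(line[m.end():])
    if when is None or not fields:
        return None
    try:
        return when, float(fields[0])  # ASSUMPTION: CO2 is the first value
    except ValueError:
        return None  # null, empty or non-numeric CO2
```

Lines that do not start with a recognisable timestamp are simply dropped, which is one way of skipping the progress lines; rows whose CO2 field will not convert to a number are dropped too.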

The purpose of the program is to take two sets of in-cave CO2 and other data (pressure, temperature and relative humidity), obtained at different time intervals over a common time period, and extract from the more frequent data set the CO2 reading timed closest to each sample in the less frequent data set. I then want to do some statistical checking of how well the two sets of CO2 readings correlate. That might get complicated, as a trial stab at it has indicated there could be a time lag between the two data sets.
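The "closest in time" extraction could be sketched along these lines, assuming both data sets have already been parsed into Python datetimes and the frequent set is sorted by time. The function name `nearest_readings` and the optional `max_gap` cut-off are illustrative choices, not part of the original request.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def nearest_readings(frequent, sparse_times, max_gap=None):
    """Match each sparse time to the closest reading in `frequent`.

    frequent     : list of (datetime, value) pairs, sorted by time
    sparse_times : list of datetimes from the less frequent logger
    max_gap      : optional timedelta; matches further away are dropped
    Returns a list of (sparse_time, matched_time, value) tuples.
    """
    times = [t for t, _ in frequent]
    matches = []
    for target in sparse_times:
        i = bisect_left(times, target)
        # Only the readings either side of the insertion point can be closest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        if not candidates:
            continue
        best = min(candidates, key=lambda j: abs(times[j] - target))
        if max_gap is not None and abs(times[best] - target) > max_gap:
            continue
        matches.append((target, times[best], frequent[best][1]))
    return matches
```

If two readings are exactly equally close, this sketch picks the earlier one.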

If you are interested in helping develop a CO2 logger for use in caves, then please PM me.

Many thanks in anticipation.
 
This sort of thing is easy in Python (although everything is great in Python  :ang: ); send me some data and I can knock something up. Is use of numpy (the standard package for doing proper number work in Python) OK? (It can easily be done without it.)

It's also fairly easy to interpolate a time sequence on one set of intervals (e.g. every 4 mins past the hour) onto a different time sequence (e.g. on the hour) for comparison (I have done this for logger data before for someone) and you can either average to longer intervals or interpolate to shorter intervals (although obviously you don't get new data by doing that!).
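A minimal sketch of that kind of interpolation, assuming numpy is available and the timestamps are already datetimes in increasing order. `interpolate_onto` is a made-up name for illustration.

```python
from datetime import datetime

import numpy as np


def interpolate_onto(src_times, src_values, target_times):
    """Linearly interpolate a logged series at a different set of times.

    src_times must be increasing. Target times outside the logged period
    get the first or last logged value, which is how np.interp behaves.
    """
    origin = src_times[0]
    # Convert datetimes to seconds since the first sample.
    x = np.array([(t - origin).total_seconds() for t in src_times])
    xi = np.array([(t - origin).total_seconds() for t in target_times])
    return np.interp(xi, x, np.asarray(src_values, dtype=float))
```

For example, readings of 400 at 09:00 and 440 at 09:04 interpolate to roughly 410 at 09:01. As noted above, this fills in values between samples but does not create new information.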

Having done the mapping onto equivalent sampling intervals you could get the cross-correlation as a function of time delay to estimate any lag.
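One way the lag estimate might look, as a plain-Python sketch: it assumes both series are already on the same sampling interval and tries every shift up to `max_lag` samples, keeping the one with the highest Pearson correlation. The names `pearson` and `best_lag` are illustrative, and this brute-force scan is just one simple way of doing a cross-correlation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat series has no defined correlation
    return sxy / sqrt(sxx * syy)


def best_lag(a, b, max_lag):
    """Return (lag, r) where pairing a[i] with b[i + lag] correlates best.

    A positive lag means series b trails series a by that many samples.
    """
    best = (0, -2.0)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = a[:len(a) - lag], b[lag:]
        else:
            x, y = a[-lag:], b[:len(b) + lag]
        n = min(len(x), len(y))
        if n < 3:
            continue  # too little overlap to say anything
        r = pearson(x[:n], y[:n])
        if r > best[1]:
            best = (lag, r)
    return best
```

Multiplying the returned lag by the sampling interval gives the delay in time units.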

(Although by Python I assume you mean Python 3 these days.)
 
Lexik has already started on the task so thanks for the offer.  I have PMed you.

Yes, Python 3 would be preferable and numpy is OK. (I run Spyder under Anaconda3, though that is causing me some irritation these days by failing to update.)

Thanks for the thoughts about interpolation; I might come back to you. But it looks like the lag which has been seen is a mixture of a BST v GMT setting plus a mismatch of a number of minutes between the clocks in each logger  :-[
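If the offset does turn out to be a fixed clock setting plus a known clock error, both can be removed before matching. A small sketch, where `to_common_clock` and its parameters are hypothetical names and the size of the clock error has to be measured separately:

```python
from datetime import datetime, timedelta


def to_common_clock(times, logged_in_bst=False, clock_error=timedelta(0)):
    """Shift logger timestamps onto GMT and remove a known clock error.

    logged_in_bst : True if the logger clock was set to BST (GMT + 1 hour)
    clock_error   : how far the logger clock ran ahead of the reference
                    clock (use a negative timedelta if it ran behind)
    """
    shift = clock_error
    if logged_in_bst:
        shift = shift + timedelta(hours=1)
    return [t - shift for t in times]
```

For instance, a logger set to BST and running three minutes fast would have 10:35 corrected to 09:32 GMT.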
 