Compare Profile
Common practice in data analysis is compare an activity for a specific time period against the average activity(profile) over a wider time * We will compare trading volumes on a specific date against the average daily volume (ADV) * first, let's create hdb for 60 days
q)\l fakedb.q
USAGE: makedb[NUM QUOTES;NUM TRADES] eg makedb[100000;10000]
makedb1[NUM QUOTES;NUM TRADES;DATE;RANDOMISED COUNT FACTOR] eg makedb1[100000;10000;.z.d;.3]
makehdb[HDBDIR;NUM DAYS;APPROXIMATE NUM QUOTES PER DAY; APPROXIMATE NUM TRADES PER DAY] eg makehdb[`:hdb; 5; 100000; 10000]
makecsv[CSVDIR;NUM DAYS;NUM QUOTES;NUM TRADES]
q)makehdb[`:hdb;60;100000;10000]
2022.12.26T12:29:09.312 saving data for date 2014.04.21 to :hdb
2022.12.26T12:29:09.380 saving data for date 2014.04.22 to :hdb
..
..
2022.12.26T12:29:13.521 saving data for date 2014.07.10 to :hdb
2022.12.26T12:29:13.580 saving data for date 2014.07.11 to :hdb
q)\l hdb
q)dts:(2014.04.22;2014.05.22)
q)p: select sum size by date,sym, 5 xbar time.minute from trades where date within dts
q)p
date sym minute| size
----------------------| -----
2014.04.22 AAPL 08:00 | 8536
2014.04.22 AAPL 08:05 | 15896
2014.04.22 AAPL 08:10 | 18935
2014.04.22 AAPL 08:15 | 6874
sums
q)p: update sums size by date,sym from p
q)p
date sym minute| size
----------------------| ------
2014.04.22 AAPL 08:00 | 8536
2014.04.22 AAPL 08:05 | 24432
2014.04.22 AAPL 08:10 | 43367
2014.04.22 AAPL 08:15 | 50241
2014.04.22 AAPL 08:20 | 65972
2014.04.22 AAPL 08:25 | 86996
2014.04.22 AAPL 08:30 | 97990
2014.04.22 AAPL 08:35 | 116046
q)p1: select avgSize: (sum size)%count distinct date by sym,minute from p
q)p1
sym minute| avgSize
-----------| --------
AAPL 08:00 | 25892.09
AAPL 08:05 | 51668.17
AAPL 08:10 | 82826.09
AAPL 08:15 | 107311.2
AAPL 08:20 | 135899.5
q)rack: (select distinct sym from p) cross ([]minute: start+00:05*til `int$end-start)
q)rack
sym minute
-----------
AAPL 08:00
AAPL 08:05
AAPL 08:10
AAPL 08:15
AAPL 08:20
AAPL 08:25
q)rack lj p1
sym minute avgSize
--------------------
AAPL 08:00 25892.09
AAPL 08:05 51668.17
AAPL 08:10 82826.09
AAPL 08:15 107311.2
AAPL 08:20 135899.5
AAPL 08:25 163998.3
AAPL 08:30 192402.9
AAPL 08:35 216513.4
AAPL 08:40 244737.7
AAPL 08:45 270592.9
AAPL 08:50 294260
- Now let's create a script for same
q)adv:{[startdate;enddate;bucketsize] // extract the bucketed totals p: select sum size by date, sym, bucketsize xbar time.minute from trades where date within (startdate;enddate); // make total cumulative p: update sums size by date, sym from p; // calculate avergae daily volume across all dates p: select avgSize: (sum size)% count distinct date by sym,minute from p; // rack the data start: min exec minute from p; end: max exec minute from p; rack: (select distinct sym from p) cross ([]minute: start+bucketsize*til `int$(end-start)%bucketsize); // join and fill racked data update fills avgSize by sym from rack lj p } //
- now we have loaded the function and hdb in same session
- we can compare by joining on cumulative size data for specified date
- we can also fill forward to remove any blank value
q)update sums size from select sum size by sym,5 xbar time.minute from trades where date=2014.05.23 sym minute| size -----------| ------ AAPL 08:00 | 31991 AAPL 08:05 | 53513 AAPL 08:10 | 88931 AAPL 08:15 | 125699 AAPL 08:20 | 151826 q)adv[2014.04.22;2014.05.22;5] lj update sums size from select sum size by sym,5 xbar time.minute from trades where date=2014.05.22 sym minute avgSize size --------------------------- AAPL 08:00 25892.09 17140 AAPL 08:05 51668.17 45260 AAPL 08:10 82826.09 58595 AAPL 08:15 107311.2 87317 q)update fills size from adv[2014.04.22;2014.05.22;5] lj update sums size from select sum size by sym,5 xbar time.minute from trades where date=2014.05.22 sym minute avgSize size --------------------------- AAPL 08:00 25892.09 17140 AAPL 08:05 51668.17 45260 AAPL 08:10 82826.09 58595 AAPL 08:15 107311.2 87317 AAPL 08:20 135899.5 113833 AAPL 08:25 163998.3 131745 AAPL 08:30 192402.9 176591 AAPL 08:35 216513.4 184667 AAPL 08:40 244737.7 197525
- now we can compare 1 date's data to adv for date range
q)update sizeComp: 100*size%avgSize from update fills size from adv[2014.04.22;2014.05.22;5] lj update sums size from select sum size by sym,5 xbar time.minute from trades where date=2014.05.22
sym minute avgSize size sizeComp
------------------------------------
AAPL 08:00 25892.09 17140 66.19783
AAPL 08:05 51668.17 45260 87.59744
AAPL 08:10 82826.09 58595 70.74462
AAPL 08:15 107311.2 87317 81.36804
AAPL 08:20 135899.5 113833 83.76265
AAPL 08:25 163998.3 131745 80.33317
AAPL 08:30 192402.9 176591 91.7819
q)comparetoAdv:{[startdate;enddate;bucketsize;comparisonDate]
res: update sums size from select sum size by sym,bucketsize xbar time.minute from trades where date=comparisonDate;
res1: update fills size from adv[2014.04.22;2014.05.22;5] lj res;
update sizeComp: 100*size%avgSize from res1
};
q)compareToAdv[2014,04,22;2014.05.22;5;2014.05.22]
sym minute avgSize size sizeComp
------------------------------------
AAPL 08:00 25892.09 17140 66.19783
AAPL 08:05 51668.17 45260 87.59744
AAPL 08:10 82826.09 58595 70.74462
AAPL 08:15 107311.2 87317 81.36804
AAPL 08:20 135899.5 113833 83.76265