A quick update. I’m working on Rod’s Eddington Number (as it pertains to cycling, not astrophysics) leaderboard. I have a suspicion we may have multiple people with the same value, but maybe there will be a clear winner. (Actually, I’m not sure what the prize criteria were.) Of course, I would assume if the challenge is for highest E number the winner would be Eric …
I am dedicating myself to do this purely in SQL which is proving challenging to me — especially with the limitations of MySQL [which is my only option on the shared hosting provider].
I started by creating a view of the data that simply takes the truncated distance — floor(ride.distance) — and counts the number of times each person rode rides of that distance.
So, for a specific athlete, we might have a table like:
Dist Freq
8 7
6 10
3 5
1 2
Then I took that data and changed the frequency to the sum of frequency for all distances >= each distance. Meaning that really, in the chart above the 1 mile ride was done 24 times, not 2.
Dist Freq
8 7
6 17
3 22
1 24
And then, the tricky part (for MySQL especially) is realizing that the E number for this rider (assuming I’ve understood this correctly) is actually a number that falls between the distance values — it is 7. Because they rode [at least] 7 miles 7 times.
So to do that, I have to fill in the gaps in the distances, joining against a table of incrementing numbers. In other database (like postgres) this is easy to do at runtime; in MySQL you just have to create a table or view of just pure numbers and then join that up. So the result will look like:
Dist Freq
8 7
7 7
6 17
5 17
4 17
3 22
2 22
1 24
That part isn’t actually quite done, though I finally had a breakthrough before going to bed on how I’d accomplish this. It would have been so trivial to just pull the values into a python script and do it the naive way. But this is much more fun.
Oh, my E number for FS2015 period is 18. Made me realize how few long rides I’ve done this winter. Thanks to not being on a real team. 