pacman::p_load(arrow, lubridate, tidyverse, tmap, sf)
Victoria Grace ANN
January 15, 2024
January 22, 2024
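Packages that will be used:
arrow, to read and write Parquet files (the format the data is stored in)
lubridate, to work with time-related data more easily
tidyverse, to import, wrangle and visualise data
tmap, to create thematic maps
sf, to import and handle geospatial (simple features) data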
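As a minimal sketch of how arrow reads a Parquet file into a tibble, the block below writes a few rows to a temporary file and reads them back. The sample rows and the temporary path are hypothetical stand-ins, since the path of the actual data file is not shown here; only the column names are taken from the dataset.

```r
library(arrow)
library(tibble)

# Hypothetical sample rows; column names mirror the dataset
sample_df <- tibble(
  trj_id = c("70014", "73573"),
  pingtimestamp = c(1554943236L, 1555582623L),
  speed = c(18.9, 17.7)
)

# Hypothetical path: a temporary file stands in for the real Parquet file
parquet_path <- tempfile(fileext = ".parquet")
write_parquet(sample_df, parquet_path)

# read_parquet() returns a tibble by default
df <- read_parquet(parquet_path)
head(df)
```

With the real data, only the path passed to read_parquet() would change.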
# A tibble: 6 × 9
trj_id driving_mode osname pingtimestamp rawlat rawlng speed bearing accuracy
<chr> <chr> <chr> <int> <dbl> <dbl> <dbl> <int> <dbl>
1 70014 car android 1554943236 1.34 104. 18.9 248 3.9
2 73573 car android 1555582623 1.32 104. 17.7 44 4
3 75567 car android 1555141026 1.33 104. 14.0 34 3.9
4 1410 car android 1555731693 1.26 104. 13.0 181 4
5 4354 car android 1555584497 1.28 104. 14.8 93 3.9
6 32630 car android 1555395258 1.30 104. 23.2 73 3.9
One trajectory ID, trj_id, represents one Grab ride.
The same trj_id may appear in multiple rows, because ride data is collected every minute.
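In the table above, pingtimestamp is stored as an integer, which reads as a count of seconds since 1970-01-01. A minimal sketch of converting it to a date-time with lubridate is shown below, assuming the values are epoch seconds in UTC. The two-row tibble is a stand-in built from values in the table above.

```r
library(lubridate)
library(tibble)

# Stand-in for the full dataset, using two rows from the table above
df <- tibble(
  trj_id = c("70014", "73573"),
  pingtimestamp = c(1554943236L, 1555582623L)
)

# as_datetime() treats numbers as seconds since 1970-01-01
# and uses the UTC time zone by default
df$pingtimestamp <- as_datetime(df$pingtimestamp)

head(df)
```

If local time is needed instead of UTC, the tz argument of as_datetime() can be set, for example tz = "Asia/Singapore".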
Check the updated df after converting pingtimestamp to a date-time:
# A tibble: 6 × 9
trj_id driving_mode osname pingtimestamp rawlat rawlng speed bearing
<chr> <chr> <chr> <dttm> <dbl> <dbl> <dbl> <int>
1 70014 car android 2019-04-11 00:40:36 1.34 104. 18.9 248
2 73573 car android 2019-04-18 10:17:03 1.32 104. 17.7 44
3 75567 car android 2019-04-13 07:37:06 1.33 104. 14.0 34
4 1410 car android 2019-04-20 03:41:33 1.26 104. 13.0 181
5 4354 car android 2019-04-18 10:48:17 1.28 104. 14.8 93
6 32630 car android 2019-04-16 06:14:18 1.30 104. 23.2 73
# ℹ 1 more variable: accuracy <dbl>
pingtimestamp looks better now.
wday gives the day of the week.
The original dataset takes up a lot of space.
In future, the files can be read as such:
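The code for these steps is not shown here, so the block below is only a sketch under stated assumptions: it keeps the first ping of each ride, derives the day of the week with wday(), saves the smaller table and reads it back. The RDS format, the temporary path, the sample rows and the column names weekday and start_hr are all hypothetical choices for illustration.

```r
library(dplyr)
library(lubridate)
library(readr)
library(tibble)

# Hypothetical sample: two pings for one ride, one ping for another
df <- tibble(
  trj_id = c("70014", "70014", "73573"),
  pingtimestamp = as_datetime(c(1554943236, 1554943296, 1555582623))
)

# Keep the earliest ping of each ride and derive time fields
origin_df <- df %>%
  group_by(trj_id) %>%
  arrange(pingtimestamp) %>%
  filter(row_number() == 1) %>%
  ungroup() %>%
  mutate(
    weekday = wday(pingtimestamp, label = TRUE, abbr = TRUE), # day of the week
    start_hr = hour(pingtimestamp)                             # hour of the day
  )

# Assumed format: save the smaller table as RDS, then read it back later
rds_path <- tempfile(fileext = ".rds")
write_rds(origin_df, rds_path)
origin_df <- read_rds(rds_path)
```

Saving a reduced table like this means later sessions can skip reading the full dataset again.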
Homework: Hands-on Ex 3 and Data Preparation for Take-home Ex 1