DP16 Transform, Filter and Recap of Date Fields
Description
This post shows practical applications of the lubridate package: converting between date formats, extracting the year and month from date values, and summarizing interaction dates by time period, such as weeks and weekdays. It also covers adding weekdays and week numbers to give the data more temporal context.
Link to the Complete Script in Github
R Script - Date type recap by time periods
Initial dataframe with POSIXct data formats
The next screenshot shows the structure of the initially uploaded data.
Structure and Head Initial Data
Batch transformation of all POSIXct date format fields to standard date
The following snippet converts every date-time column (class POSIXct) to the standard Date class with lubridate::as_date(). Assigning to dfINT[] keeps the object a data frame, and columns of any other class are returned unchanged.
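# Convert every date-time column (POSIXct or POSIXlt) to class Date; leave other columns as they are
dfINT[] <- lapply(dfINT, function(x) {
  if (inherits(x, "POSIXt")) lubridate::as_date(x) else x
})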
Structure and Head Transformed Date Fields
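To make the batch conversion easier to try out, here is a minimal, self-contained sketch on a toy data frame. The data frame toy and its columns are hypothetical stand-ins for dfINT, and the UTC time zone is an assumption made only to keep the example reproducible.

```r
library(lubridate)

# Hypothetical toy data: one POSIXct column between two non-date columns
toy <- data.frame(
  id      = 1:3,
  created = as.POSIXct(c("2020-07-01 09:30:00",
                         "2020-12-31 23:59:59",
                         "2021-06-30 00:00:00"), tz = "UTC"),
  note    = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Same pattern as in the post: only date-time columns are converted
toy[] <- lapply(toy, function(x) {
  if (inherits(x, "POSIXt")) as_date(x) else x
})

# First class of each column after the conversion
col_classes <- sapply(toy, function(x) class(x)[1])
print(col_classes)
```

The time-of-day part is dropped by the conversion, so this pattern only fits when the calendar date is all that the later summaries need.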
Random Sampling: 10 Records
Out of a dataset comprising 9,000 records of interaction dates, a random sample of 10 records was chosen.
sampledates <- dfINT[sample(1:9000, 10), "Date.of.Interaction"]
asinteger <- as.integer(sampledates)                   # as integer: days since 1970-01-01
asdate <- sampledates                                  # already of class Date (converted from POSIXct above)
ascharmdy <- format(sampledates, format = "%m/%d/%Y")  # as character, month/day/year
asdatelubri <- lubridate::mdy(ascharmdy)               # back to Date using lubridate
data.frame(asinteger, asdate, ascharmdy, asdatelubri)
str(data.frame(asinteger, asdate, ascharmdy, asdatelubri))
Data frame with changes in date formats and structure of this data frame
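The round trip in the sample above (Date to integer, Date to character, character back to Date) can be checked on two fixed dates. This is a small sketch with made-up values; the variable names are illustrative and not part of the original script.

```r
library(lubridate)

d <- as.Date(c("2020-07-01", "2021-06-30"))   # illustrative dates only

n    <- as.integer(d)                  # number of days since 1970-01-01
s    <- format(d, format = "%m/%d/%Y") # character, e.g. "07/01/2020"
back <- mdy(s)                         # parsed back to class Date

# The integer form can also be turned back into a Date with an explicit origin
from_int <- as.Date(n, origin = "1970-01-01")

str(data.frame(n, d, s, back, from_int))
```

Both routes return the original dates, which is a quick way to confirm that the format string passed to format() matches the parser used (here, month/day/year with mdy()).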
Recap as a table Years vs Months
Year-by-month recap of interactions for the period July 2020 to June 2021, using lubridate:
# Recap year ~ month with Lubridate:
table(lubridate::year(dfINT$Date.of.Interaction),
      lubridate::month(dfINT$Date.of.Interaction), useNA = "ifany")
Recap of number of Interactions per Year/Month as 2D Table
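The shape of this two-way table is easier to see on a handful of dates. The sketch below uses invented dates, including one missing value, to show how useNA = "ifany" adds an NA row and column only when missing dates are present. The dimension names year and month are added here for readability and are not in the original call.

```r
library(lubridate)

# Invented interaction dates, with one missing value
d <- as.Date(c("2020-07-15", "2020-07-20", "2020-12-01",
               "2021-01-05", "2021-06-30", NA))

# Rows are years, columns are month numbers
tab <- table(year = year(d), month = month(d), useNA = "ifany")
print(tab)

# Counts can be read by name, for example July 2020
july_2020 <- tab["2020", "7"]
```

Because the months are plain numbers, the columns are ordered 1 to 12 regardless of which year they fall in, so a period that spans two calendar years (such as July to June) is split across two rows.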
Recap year/month using format Date:
## Recap year ~ month using format Date:
table(paste(format.Date(dfINT$Date.of.Interaction, "%Y"),
            format.Date(dfINT$Date.of.Interaction, "%m"), sep = "/"))
Array Recap of number of Interactions per Year/Month
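As a side note on the format-based recap, the sketch below compares the paste() approach with a single format() call on invented dates. It is base R only; the variable names are hypothetical. The two versions agree on valid dates but treat missing values differently, which is worth knowing before relying on the counts.

```r
# Invented dates, with one missing value
d <- as.Date(c("2020-07-15", "2020-07-20", "2020-12-01", "2021-01-05", NA))

# Approach from the post: format year and month separately, then paste
ym_paste <- paste(format(d, "%Y"), format(d, "%m"), sep = "/")

# Alternative: one format string producing the same "YYYY/MM" label
ym_single <- format(d, "%Y/%m")

tab_paste  <- table(ym_paste)                     # missing date becomes the label "NA/NA"
tab_single <- table(ym_single, useNA = "ifany")   # missing date stays a real NA

print(tab_paste)
print(tab_single)
```

Since the month is zero-padded, the "YYYY/MM" labels sort chronologically even though they are character strings, so a July-to-June period appears in calendar order in the one-dimensional table.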
End of Post