Milestone 1.1 - Coding and cleaning time use data

Project: AgeingTimeUse, Horizon 2020 ERC

If you've ever spent time analyzing survey data, you know that getting to usable results is not always easy. Often the data is dirty and needs to be cleaned before it can be used. Coding is another process that can take a lot of time, especially if you're coding by hand. Fortunately, there are ways to speed up both of these processes, particularly if you can build on previous research and on what others have already done. In this blog post, I'll talk about some difficulties with cleaning and wrangling time-use data and about what I did for this milestone. I'll also share some code that might help make your time-use data analysis faster and easier.

Data cleaning is essential to the process of coding!

If you are working with the American Time Use Survey (ATUS), as I am, you may be in luck: ATUS is also available via IPUMS, which provides a much cleaner version of the data to its users. It is easy to set up an account and start creating your own ATUS-X extracts.

Coding activities can be arbitrary

For the broader categories of activities, there are some standard conventions. For instance, it is usually easier to identify paid work activities than other categories of activities. Sometimes, however, coding is not as straightforward as we expect. In unpaid work, some categories are murkier. Take shopping and services as an example. Not all of the activities within these categories can be classified as unpaid work, and much of that classification is decided at the coding stage.

Often, there isn't a clear distinction between unpaid work and care-related activities. Some people consider commuting just another work activity, whereas others might consider it a care activity, particularly when the trip is followed by adult care. At the coding stage, you might struggle to decide which activities should count as unpaid work, paid work, leisure, or self-care. The best approach is to rely on current conventions, that is, on how most time-use researchers code these activities. This keeps research results comparable across studies.
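To make the coding problem concrete, here is a minimal sketch of an explicit coding scheme. It is written in Python rather than Stata and is not taken from my tutorial. The activity labels, the category names, and the rule that a travel episode takes the category of the episode that follows it are all assumptions chosen for illustration; they are not the ATUS lexicon or an established convention.

```python
# Hypothetical activity labels and categories, for illustration only.
# They are NOT the ATUS lexicon or any published coding scheme.
CODING_SCHEME = {
    "main_job": "paid work",
    "grocery_shopping": "unpaid work",
    "adult_care": "care",
    "sleeping": "self-care",
    "watching_tv": "leisure",
}


def code_diary(episodes, scheme=CODING_SCHEME):
    """Assign a broad category to each episode of one diary (in time order).

    Assumed rule: a travel episode takes the category of the next
    non-travel episode. This is one possible convention among several.
    """
    coded = []
    for i, activity in enumerate(episodes):
        if activity == "travel":
            # Look ahead to the next non-travel episode, if there is one.
            following = next((a for a in episodes[i + 1:] if a != "travel"), None)
            category = scheme.get(following, "uncoded")
        else:
            # Anything missing from the scheme is flagged, not guessed.
            category = scheme.get(activity, "uncoded")
        coded.append((activity, category))
    return coded


diary = ["sleeping", "travel", "main_job", "travel", "adult_care", "watching_tv"]
for activity, category in code_diary(diary):
    print(f"{activity:12s} -> {category}")
```

Keeping the scheme in one lookup table makes every coding decision visible and easy to swap for a different convention, and the "uncoded" fallback shows which activities still need a decision.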

My code to wrangle and visualize ATUS diaries using Stata

For this milestone, I created a tutorial that describes how to visualize diaries as tempograms using Stata. You can find the code at this link to OSF Preprints. In this preprint, I used the original ATUS data from the Bureau of Labor Statistics