How to Covert Casual Users into Subscribers?
Backstory: In this scenario, the stakeholder is convinced that the best approach to increase revenue for the bike rental company is to convert casual users into subscribers. To that end, I decided to start by investigating the patterns both group of users are exhibiting.
Business Questions
-
How do subscribed users (Members) use bikes differently than non-subscribers (Casuals) ?
-
Why would casual riders want to buy membership?
Result
-
"Casuals" and "Members" have different but consistent patterns in behaviour.
-
"Casuals" match "Members'" usage count on Saturday and Sunday.
-
"Casuals" ride more than double the average duration of "Members" (16 mins).
-
Both user groups have same favourite bike stations AND starting hours
Suggestions
-
Device a different membership plan catering to "Casuals' " usage habit
-
Promote weekend and long traveling distance membership
-
Promote membership benefit at popular stations by "Casuals"
DASHBOARDS (TABLEAU)
-"Members" are the more frequent user group but "Casuals" has more than double average ride length than "Members". "Members" also display a consistent riding habit. The pattern is consistent throughout the year.
-
"Casuals' " using pattern doesn't change but their usage count picks up greatly during Weekend.
​
​
-
Both groups have same preferred starting hours (1pm to 3pm), a pattern that is also consistent throughout the year.
​
​
-
Adjustable geographic plot to show location and popularity of favourite stations, on different days and hours.
​
​
Data Cleaning Journey
The data set used were all coming from the same source, made available by Motivate International Inc. The data sets were separated into different CSV file with same columns in each table.
Because of the vast amount of data in this project, I had chosen to use MySQL to handle the bulk of the data processing, in favour of the software's great variety of functions and speed.
Excel
In order to import CSV spreadsheets into MySQL, the first thing to do is to eliminate NULL cells. That was done by using Excel's "Go To Special..." option to locate all the blank cells and replaced with placebo values with the intention to exclude later on.
MySQL
MySQL can also help picking up data type related data discrepancy while importing data. In example, for the first 3 quarters of spreadsheet I had gone through, there was only one station_ID system and it had been purely in INT. However, as new year comes, a new station_ID system was added to the system.
The discovery gave me the opportunity to re-examine my strategy. Once it became apparent that the changes were too drastic and thus inefficient to convert either system into another, I opted to consider using station names and latitude, longitudes as foreign keys instead.
MySQL allows me to UNION ALL the spreadsheets to quickly look for context related data discrepancy. I had planned to calculate the average use time for each user group and the first step is to check if there are any outliers that really stand out.
Given that normal riders don't ride bike for 50k minutes and there are only 700 riding records (of 3.7 million) which exceeds 3 days (4320 minutes) of successive ride time, I opted to exclude these data.
Visualising software can also help with spotting data discrepancy.
In this scenario, Tableau had helped me to pick up some incorrectly logged timestamps. Speculation was that the mistake happened due to a newly introduced electric bike type. However, the nature of the error was unknown and the numbers did not look normal. I opted to exclude the data given its small number of rows.
Once I was sure all the necessary data were clean, I then ran a query which would answer all the business questions, and exported a concise table for plotting in Tableau.
Looking for more? Feel free to check out my other portfolios.