Grab’s Head of Data Science, Kong-wei Lye, gives an insight into how the platform works

From the time of its launch in 2012, ride-hailing giant Grab has been on a ceaseless quest to scale its services to new markets and build new products to become not just a solution for the transport sector, but also an e-commerce and fintech player.

It’s no surprise that the company, which has attracted investments of over a billion dollars, has, in a span of five years, managed to build a fleet of over 2.4 million drivers, serving various Southeast Asian markets including Indonesia, Malaysia, Myanmar, Philippines, Cambodia, Singapore, Thailand and Vietnam.

But while strong growth and massive scale are positive signs of a thriving business, these also present challenges: How does a company ensure that its platform is constantly optimised as more customers pile on to its service?

For Grab, it has to figure out how it can match drivers to riders in the quickest way possible. It also has to ensure its drivers remain in good standing. This is no easy task.

Every day, the Grab team has to process over 10 terabytes of data. So how does it make sense of this enormous volume and use it to its advantage? Kong-wei Lye, Grab’s Head of Data Science, gave e27 a peek at what goes on under the hood of the data science team, to demonstrate how and why data is integral to providing a smooth driver/rider experience.

Give me an insight into what goes on inside Grab’s command centre. What happens when data is being processed?

We have automated models that are trained, simulated and optimized to process all data from the Grab app’s users and their environment. As soon as data is picked up by the Grab app, two things happen:

  • They are fed into Grab’s computer model, which predicts and decides how to offer the best experience to our users at any given time;
  • They are processed by a machine learning algorithm that continues to adapt and improve the computer model.

The Grab platform is constantly learning from all the data processed by our app. It does that by having multiple computer models and algorithms performing tasks across different Grab services and responding to what works best for our users. This is necessary because the whole ecosystem in Southeast Asia (road network, traffic, passenger behaviours, driver behaviours, etc) evolves continuously.

How and what kind of data is analysed? How do you extract meaningful data? 

To extract and work with the most meaningful data, we have automated our analysis and predictions through computer models. These models either self-improve automatically via ‘online’ machine learning algorithms or get re-worked and improved ‘offline’ by our data scientists.
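As a rough illustration of what ‘online’ learning means here, the sketch below nudges a simple linear model after every new observation instead of retraining it from scratch. The function name, the features and the learning rate are assumptions made for this example; the interview does not describe Grab’s actual models.

```python
def online_update(weights, features, target, learning_rate=0.01):
    """One stochastic-gradient step for a linear model with squared error.

    Hypothetical helper: the model is adjusted slightly after each new
    observation, which is the basic idea behind 'online' learning.
    """
    prediction = sum(w * x for w, x in zip(weights, features))
    error = prediction - target
    return [w - learning_rate * error * x for w, x in zip(weights, features)]


# Invented example: features might be (bias term, trip distance in km),
# and the target the observed trip duration in minutes.
weights = [0.0, 0.0]
observation = ([1.0, 2.0], 10.0)

first_error = abs(sum(w * x for w, x in zip(weights, observation[0])) - observation[1])
for _ in range(50):
    weights = online_update(weights, *observation)
last_error = abs(sum(w * x for w, x in zip(weights, observation[0])) - observation[1])
```

Repeating the update on the same observation shrinks the prediction error each time; an ‘offline’ rework, by contrast, would rebuild the model from the stored historical data.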

We use Grab’s historical data, more than a petabyte from Grab’s five years of operation, for trend analysis. Simply put, our analysts study the data, find correlations, and build and refine models, then make predictions and refinements to Grab’s model in each country.

For example, based on location, time, and past booking history, our algorithm predicts when your next booking will be, along with its origin and destination.

However, this varies across all regions and cities, and each model must learn from online data individually. This has allowed us to identify congestion patterns for each city and adjust pricing and driver availability to ensure all passengers get a car on-time and at a reasonable price.

Our models are built and used for different time scales. We do not just use historical data, but also have models to assess and react accordingly to market dynamics in real-time. If today, there is a traffic accident, our model picks this up immediately and reworks driver availability and pricing in such a way that it helps our passengers.

Our app needs to work for our users at any time of day. That’s why we also track the performance of our products 24/7. There are real-time dashboards that allow the different analysts and operations people to keep a close watch on the performance of our systems and products in each city and country. In case anything happens, our teams are able to respond at a moment’s notice.

At the driver level, how does this data help him/her drive better?

There are a few ways we use data to enhance our drivers’ experience. We’ve optimized our driver booking system by combining our localized data set with machine learning. We have learned our drivers’ preferences and behaviours, enabling us to predict which jobs drivers will take and assign targeted jobs. Bookings are then sent to the drivers most likely to accept them.

This helps drivers be more engaged by giving them jobs they like. For instance, many bike drivers in Jakarta have a ‘home-base’ which they don’t want to stray too far from, no matter how profitable a ride might be.
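A minimal sketch of this kind of matching, assuming a hypothetical scoring function in place of a trained model: each candidate driver gets an estimated probability of accepting the job, and the booking goes to the highest scorer. The `Driver` fields and the weights are invented for illustration and are not taken from Grab’s system.

```python
from dataclasses import dataclass


@dataclass
class Driver:
    driver_id: str
    km_from_pickup: float     # distance from the driver to the passenger
    km_from_home_base: float  # distance from the drop-off to the driver's preferred area


def acceptance_probability(driver: Driver) -> float:
    """Toy stand-in for a learned model: nearby pickups and drop-offs close
    to the driver's home base score higher. The weights are made up."""
    score = 1.0 - 0.05 * driver.km_from_pickup - 0.03 * driver.km_from_home_base
    return max(0.0, min(1.0, score))


def rank_drivers(drivers: list[Driver]) -> list[Driver]:
    """Order candidates from most to least likely to accept the booking."""
    return sorted(drivers, key=acceptance_probability, reverse=True)


candidates = [
    Driver("A", km_from_pickup=1.0, km_from_home_base=12.0),
    Driver("B", km_from_pickup=2.0, km_from_home_base=1.0),
    Driver("C", km_from_pickup=6.0, km_from_home_base=3.0),
]
best = rank_drivers(candidates)[0]
```

In this example driver A is closest to the passenger, but the drop-off is far from A’s home base, so the booking is offered to driver B first.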

In terms of utilization, the data science team analyses real-time and historical demand data to alert drivers to anticipated demand hotspots. Our model is sensitive enough to account for a large range of factors, including the arrival of ferries, weather and days of the week.
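One simple way to picture hotspot detection, under assumptions of our own: average past bookings for each zone, weekday and hour, then flag zones where that expected demand clearly outstrips the drivers currently online. The zone names, counts and the `ratio` threshold below are invented; the factors Grab’s model actually weighs are only described in general terms above.

```python
from collections import defaultdict
from statistics import mean

# (zone, weekday, hour, bookings) from past weeks; all values are made up.
history = [
    ("harbour", "Sat", 18, 120),
    ("harbour", "Sat", 18, 140),
    ("suburb", "Sat", 18, 30),
    ("suburb", "Sat", 18, 34),
]


def expected_demand(records):
    """Average historical bookings for each (zone, weekday, hour) slot."""
    buckets = defaultdict(list)
    for zone, weekday, hour, bookings in records:
        buckets[(zone, weekday, hour)].append(bookings)
    return {key: mean(counts) for key, counts in buckets.items()}


def hotspots(expected, drivers_online, weekday, hour, ratio=2.0):
    """Zones where expected bookings exceed `ratio` times the drivers online."""
    return sorted(
        zone
        for (zone, day, hr), demand in expected.items()
        if day == weekday and hr == hour
        and demand > ratio * drivers_online.get(zone, 0)
    )


expected = expected_demand(history)
alerts = hotspots(expected, {"harbour": 40, "suburb": 20}, "Sat", 18)
```

Here only the harbour zone is flagged: its expected 130 bookings exceed twice the 40 drivers online, while the suburb’s 32 do not exceed twice its 20.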

SMSes are sent to drivers to notify them of high-demand areas, enabling them to be more efficient in serving passenger demand and earn more in the process – our internal study conducted in June 2017 shows Grab drivers earn on average one-third (32 per cent) more per hour than the average worker in their country.

Based on testing, we also noted a 35 per cent decrease in speeding incidents when we monitored speeds and notified our bike drivers in Indonesia.

How does Grab work with government agencies to optimize cities’ traffic flow, and perhaps even city planning?

We believe we can play a role in shaping what Southeast Asia’s cities will look like 10-20 years down the road and have invested heavily in our data science, machine learning and AI capabilities. Today, we have more than 30 talented people in our data science team, making an impact on our business equivalent to that of data science teams several times our size elsewhere.

We’re continuing to invest rapidly in our capabilities and expect to expand the team by more than 50 per cent by the end of 2018.

Our simulation and optimization team is the subject of a strong hiring drive; the capabilities of this team allow us to build models of Southeast Asia’s road network, manage demand and supply in real-time and also simulate a range of variables and scenarios.

We believe that our investments in these capabilities, and our collaboration with governments across the region, will do two things:

  • Optimize our supply and demand today, offering drivers a way to earn more, faster, and offering passengers the safest, most affordable and most convenient mode of transportation.
  • In the longer term, make our cities safer, cleaner and less congested. With fewer private cars on the road, more city parking lots will be freed up for recreational activities. Our roads will be less congested as rides will be shared.

Grab also provides our driver location data to the OpenTraffic platform, a collaboration with the World Bank to provide Southeast Asian governments (Malaysia, Indonesia, Philippines) access to real-time traffic information.

We are now exploring how to use Grab’s data to help governments directly with transport planning, complement unmet demand in transport and map out how car growth affects cities.

One result of such a collaboration is the launch of GrabShuttle in March 2017: in collaboration with the Government Technology Agency of Singapore (GovTech) and using our data and simulation and optimization expertise, we launched this shuttle bus service mobile application to move larger groups of individual commuters affordably from door to door.

What are the challenges faced by Grab’s data scientists as the platform continues to scale and more markets and services are added on to the platform?

First, scaling our systems is a ‘Grab-wide’ effort. Just in 2017, we launched Grab in two more countries (Myanmar and Cambodia) and launched a range of new services, including payments in restaurants and shops, street-hailing with GrabNow or shuttle services with GrabShuttle, just to name a few.

With each new service and market, the levels of complexity in our data also grow almost exponentially. Each new data set adds a new dimension to explore, and enables us to derive even more insights.

Our biggest challenge, therefore, is to be smart in how to use our data. We need to minimise time and effort as we explore and experiment with the data, while maximizing the meaningful and impactful insights we gather.

This is like ‘connecting the dots’ and telling a new story. Luckily, to do that, we’re also able to work with a talented team of product managers, designers, engineers as well as the business managers in each country.

Does Grab work with third-party telematic providers?

For data analytics purposes, Grab uses in-house telematics software to monitor driving speeds and patterns. With a combination of machine learning and predictive analytics, Grab’s telematics can identify rides that exhibit safe or dangerous driving behaviour.

On a weekly basis, we share a report with our drivers highlighting the instances in which they exhibited dangerous driving behaviour: hard braking, dangerous cornering and swerving.

We’ve seen the following results from this initiative:

  • 15 per cent reduction in speeding (measured as the proportion of speeding trips out of total weekly trips per driver);
  • 11 per cent fewer instances of hard braking (calculated based on heavy brakes per trip);
  • 18 per cent fewer instances of sudden hard acceleration (calculated based on heavy acceleration per trip).
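The metrics behind these figures can be sketched roughly as follows. This is an illustration under stated assumptions, not Grab’s telematics software: it assumes one speed reading per second, and the speed limit and braking threshold are invented values.

```python
def speeding_share(trip_max_speeds_kmh, limit_kmh=60.0):
    """Proportion of a driver's weekly trips whose top speed exceeded the limit."""
    speeding = sum(1 for speed in trip_max_speeds_kmh if speed > limit_kmh)
    return speeding / len(trip_max_speeds_kmh)


def hard_brake_count(speeds_kmh, threshold_kmh_per_s=12.0):
    """Count one-second intervals where speed drops by more than the threshold.

    Assumes one speed sample per second; the threshold is illustrative.
    """
    return sum(
        1
        for before, after in zip(speeds_kmh, speeds_kmh[1:])
        if before - after > threshold_kmh_per_s
    )


def percent_reduction(before, after):
    """Relative drop from a baseline rate, expressed as a percentage."""
    return (before - after) / before * 100.0


trip = [40.0, 42.0, 41.0, 25.0, 24.0, 10.0, 9.0]
brakes = hard_brake_count(trip)           # 41 -> 25 and 24 -> 10 exceed the threshold
share = speeding_share([50.0, 70.0, 55.0, 65.0])
change = percent_reduction(0.20, 0.17)    # e.g. speeding share before and after the reports
```

Hard acceleration could be counted the same way by looking for speed increases rather than drops between consecutive readings.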

[Note: A previous version of the article stated that Grab has 23 million drivers. This is incorrect; it has 2.4 million drivers. We apologise for the error.]

Image Credit: Grab

The post Looking under the hood: How Grab’s data science team optimises a fleet of 2.4 million drivers appeared first on e27.