Using Python to analyze how Luckin and Starbucks stores are distributed nationwide

Will Luckin shake Starbucks’ position in the industry?

Last month, Luckin Coffee’s "sauce-flavored latte" (its Moutai liquor collaboration) went viral, putting Luckin in the spotlight again. The last time it drew this much attention was over its financial fraud.

The domestic coffee market has been booming in recent years, driving the rapid development of many coffee brands, Luckin included. From 2013 to 2023, China’s per capita coffee consumption is expected to increase by 238%. The total number of coffee stores nationwide now exceeds 100,000 and is growing by tens of thousands every year.

The rise of Luckin Coffee brings to mind Starbucks, the benchmark of the coffee industry. Over the past ten years Starbucks has been almost synonymous with coffee and has become part of the lifestyle of urban white-collar workers.

Today, almost wherever there is a Starbucks store there is a Luckin store within a few hundred meters, and some Starbucks stores are surrounded by two or three.

The following uses a visual dashboard and Python data analysis to compare Starbucks and Luckin stores by store count and regional distribution, and to examine how their locations relate.

There are two main findings:

1. Starbucks is more concentrated in economically developed coastal areas such as the Yangtze River Delta, Pearl River Delta, and Beijing-Tianjin-Hebei, especially first- and second-tier cities. Luckin is more dispersed than Starbucks and has stores in many third- and fourth-tier cities and below.

2. Luckin store locations are concentrated around Starbucks. Data shows that within a radius of 500 meters, there are an average of 0.6 Luckin stores around each Starbucks store nationwide.

Preparation phase

The tools used in this analysis include Xiamiao Data Robot, Python, and shapely.

Xiamiao Data Robot is a cloud data platform that combines data sets, data cleaning, data analysis, data visualization, and dashboard building. The Starbucks and Luckin store data sets used in this analysis are both stored there.

We will build a data dashboard on top of these data sets, and also call the platform's API from Python for further data analysis and visualization.

Platform link:

http://nexadata.cn/mobileSetMessage

Python is used to connect to the platform's data API and to process and analyze the data.

Shapely is a third-party Python library for geometric objects. Here it handles the longitude and latitude data and tests distance and containment between geographical coordinates.

Dataset

Because we need to compare and analyze the number and location of Starbucks and Luckin stores, the main fields of the data set include store name, longitude, latitude, and city.

Note: The data set dates from 2022, and the store counts carry an error of about 20%.

Data sets used: the national Starbucks store data set and the national Luckin Coffee store data set.

Both data sets are stored on the Xiamiao Data Robot platform. You can view and use them directly through data views, which we will use to build the dashboard later.

Because the data will be processed in Python later, it also needs to be fetched through the platform's API. This is easy to do, and the result can be saved for later use.

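import requests
# Authenticate with your own platform token
headers = { "x-token": "your authentication token" }
# Request the first record of the Starbucks sheet (size = records per page)
response = requests.get("http://app.chafer.nexadata.cn/openapi/v1/sheet/sht22nId5uouP2/records?size=1&page=1", headers = headers)
print(response.json())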

Build an analysis dashboard

Building a dashboard on Xiamiao Data Robot is fairly simple. First, create a process task and select the two data views, Starbucks and Luckin.

Then create a dashboard and design the charts, much as you would in BI software.

More than a dozen chart types are available, enough to cover most visualization scenarios.

Comparison of the number of Starbucks and Luckin stores nationwide

As of the data set date (2022), there were an estimated 4,442 Starbucks stores and 3,904 Luckin Coffee stores nationwide, so Starbucks had about 14% more stores than Luckin.
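
The 14% figure follows directly from the two store counts. As a quick check (the function name `percent_more` is only illustrative):

```python
def percent_more(a, b):
    """Return how much larger a is than b, as a percentage of b."""
    return (a - b) / b * 100


starbucks_count = 4442  # Starbucks stores in the 2022 data set
luckin_count = 3904     # Luckin stores in the 2022 data set

# (4442 - 3904) / 3904 is about 0.138, which rounds to 14%
print(round(percent_more(starbucks_count, luckin_count), 1))
```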

In terms of scale the two are very close, and Luckin is opening stores at a startling pace. In the business district near my home, for example, there was one Luckin store last year and there are three this year.

Starbucks is more demanding than Luckin in terms of location, store opening cost, store area, and number of store employees. Luckin focuses on delivery and pickup orders. Alongside market demand, this is another reason Luckin can expand so quickly.

Top 20 cities by number of Starbucks stores

The top five cities by number of Starbucks stores are Shanghai, Beijing, Hangzhou, Shenzhen, and Guangzhou. Among the top 20 cities, 6 are in the Yangtze River Delta, 5 in the Pearl River Delta, and 2 in the Beijing-Tianjin-Hebei region.

Shanghai has 668 Starbucks stores, twice as many as second-place Beijing. Shanghai is also the city with the most Starbucks stores in the world. Its reputation as a coffee-loving city seems well deserved.

Hangzhou's Starbucks count is second only to Shanghai's and Beijing's, and higher than Shenzhen's and Guangzhou's. It seems Hangzhou's internet and e-commerce workers favor Starbucks too.

Top 20 cities by number of Luckin stores

The top five cities by number of Luckin stores are Shanghai, Beijing, Guangzhou, Shenzhen, and Hangzhou. These are the same five cities as for Starbucks, in a slightly different order.

Among the top 20 cities, there are 6 in the Yangtze River Delta, 2 in the Pearl River Delta, and 2 in the Beijing-Tianjin-Hebei region.

Starbucks is mainly concentrated in first- and second-tier coastal cities, while Luckin is rapidly taking market share in inland cities. Luckin’s top 20 cities already include Hefei, Kunming, and Zhengzhou, three provincial capitals that are not in Starbucks’ top 20.

Therefore, the distribution of Luckin stores is more dispersed and not overly concentrated in first-tier cities.
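
The city rankings above come from the dashboard. A rough pandas equivalent might look like the sketch below. It assumes each data set has been loaded into a DataFrame with a `city` column, and the helper names `top_cities` and `only_in_first` are hypothetical, not part of the original analysis.

```python
import pandas as pd


def top_cities(stores, n=20):
    """Return the n cities with the most stores, most stores first."""
    return stores["city"].value_counts().head(n).index.tolist()


def only_in_first(first, second, n=20):
    """Cities in the first brand's top n that are missing from the second's."""
    other = set(top_cities(second, n))
    return [city for city in top_cities(first, n) if city not in other]


# Tiny made-up frames, purely to show the call pattern
luckin_demo = pd.DataFrame({"city": ["Shanghai"] * 3 + ["Hefei"] * 2 + ["Beijing"]})
starbucks_demo = pd.DataFrame({"city": ["Shanghai"] * 3 + ["Beijing"] * 2 + ["Hefei"]})

print(top_cities(luckin_demo, n=2))
print(only_in_first(luckin_demo, starbucks_demo, n=2))
```

Run against the real data with `n=20`, `only_in_first(luckin, starbucks)` should list the cities that appear only in Luckin's top 20.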

Starbucks nationwide distribution heat map

The heat map of Starbucks stores shows the same pattern: the red high-density areas are mainly along the coast, while inland stores appear as scattered points and are relatively sparse.

Luckin nationwide distribution heat map

The distribution of Luckin stores is more even. In addition to coastal areas, there are also many stores in central China, in provinces such as Hunan, Anhui, and Hubei.

Starbucks Shanghai distribution heat map

Shanghai is the city with the largest coffee consumption demand in the country. Let’s take a look at the distribution of Starbucks stores in Shanghai.

Generally speaking, Starbucks stores cluster inside Shanghai's Inner Ring and spread outward as scattered points. The five new cities in the suburbs, Pudong Airport, and the Hongqiao hub are also relatively dense areas.

Luckin Shanghai distribution heat map

The concentration of Luckin in the inner ring of Shanghai is not as obvious as that of Starbucks, and the overall number is also much smaller.

Python data analysis

Earlier, we analyzed the distribution of Starbucks and Luckin stores across the country by building a visual dashboard on Xiamiao Data Robot. The regional differences are quite obvious.

Let’s look further at the relationship between Starbucks and Luckin stores. Luckin Coffee is the newcomer, and it is said that many of its store locations are chosen mainly by whether a Starbucks is nearby.

So, on average, how many Luckin stores are there around each Starbucks store nationwide? Here we look at how Luckin stores cluster within a 500-meter radius of each Starbucks store.

We use Python and its third-party library shapely to process data. Shapely is mainly used to process geographical coordinate data.

Step 1: Import the required libraries

# Import related libraries
import pandas as pd
import requests
import time
from shapely.geometry import Point
from shapely.geometry.polygon import Polygon

Step 2: Extract data from the API

# Fetch Starbucks and Luckin store data through the Xiamiao Data Robot API
headers = { "x-token": "tk7a2980431688455e8976e4bad4d13d6a" }
# Fetch Starbucks store data: pages 1-9, 500 records per page
starbucks_list = []
for i in range(1,10):
    response_1 = requests.get("http://app.chafer.nexadata.cn/openapi/v1/sheet/sht22nId5uouP2/records?size=500&page={0}".format(i), headers = headers)
    starbucks = response_1.json()['data']['list']
    starbucks = pd.DataFrame(starbucks)
    starbucks_list.append(starbucks)
    time.sleep(1)  # pause between requests
# Combine all pages into one table with a fresh row index
starbucks = pd.concat(starbucks_list, ignore_index=True)
# Fetch Luckin store data: pages 1-8, 500 records per page
luckin_list =[]
for j in range(1,9):
    response_2 = requests.get("http://app.chafer.nexadata.cn/openapi/v1/sheet/sht22nIeomVmYy/records?size=500&page={0}".format(j), headers = headers)
    luckin = response_2.json()['data']['list']
    luckin = pd.DataFrame(luckin)
    luckin_list.append(luckin)
    time.sleep(1)  # pause between requests
luckin = pd.concat(luckin_list, ignore_index=True)
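
The page ranges above (9 pages for Starbucks, 8 for Luckin) are hard-coded to match the size of the 2022 data sets. The following is a minimal sketch of a loop that keeps requesting pages until one comes back short or empty. It rests on assumptions: that the API returns fewer than `size` rows (or an empty list) once the data runs out, which the source does not confirm, and that the response keeps the `['data']['list']` shape used above. It uses the standard library's `urllib` in place of `requests` only to stay dependency-free, and the names `build_url`, `fetch_page`, and `fetch_all` are hypothetical.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "http://app.chafer.nexadata.cn/openapi/v1/sheet/{sheet_id}/records"


def build_url(sheet_id, size, page):
    """Build the records URL with a properly encoded query string."""
    query = urllib.parse.urlencode({"size": size, "page": page})
    return BASE_URL.format(sheet_id=sheet_id) + "?" + query


def fetch_page(sheet_id, page, size, token):
    """Fetch one page of records (assumes the response shape used above)."""
    request = urllib.request.Request(
        build_url(sheet_id, size, page), headers={"x-token": token}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["data"]["list"]


def fetch_all(sheet_id, token, size=500, max_pages=50, pause=1.0, fetch=fetch_page):
    """Collect records page by page until a short or empty page is returned."""
    records = []
    for page in range(1, max_pages + 1):
        rows = fetch(sheet_id, page, size, token)
        records.extend(rows)
        # Assumption: a page with fewer than `size` rows is the last one
        if len(rows) < size:
            break
        time.sleep(pause)  # pause between requests, as in the loop above
    return records
```

With this in place, `pd.DataFrame(fetch_all("sht22nId5uouP2", token))` would stand in for the Starbucks loop. The `fetch` parameter exists so the paging logic can be tried with a stub instead of the live API.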

Step 3: Determine the number of Luckin stores within a 500-meter radius of a Starbucks store

# Build a circular area of the given radius (in meters) around a Starbucks store
def circle(data,radius):
    # radius is the area radius in meters
    # 'dimension' is the data set's latitude column
    center_latitude = float(data['dimension'])
    center_longitude = float(data['longitude'])
    center_point = Point(center_longitude, center_latitude)
    # Convert meters to degrees (1 degree is roughly 111,300 m), then buffer the point
    circle_area = center_point.buffer(radius/111300)
    # Wrap the buffer's outer ring as a polygon
    polygon = Polygon(circle_area.exterior)
    return polygon

# Construct a coordinate point from longitude and latitude
def point(data):
    # 'dimension' is the data set's latitude column
    center_latitude = float(data['dimension'])
    center_longitude = float(data['longitude'])
    center_point = Point(center_longitude, center_latitude)
    return center_point

# Count the Luckin stores that fall inside a Starbucks store's 500-meter area
def is_inside(data):
    polygon = data['Polygon']
    n = 0
    # Only compare against Luckin stores in the same city
    luckin_city = luckin[luckin['city']==data['city']]
    for luckin_point in luckin_city['Point']:
        # Count the point if it lies inside the area
        if polygon.contains(luckin_point):
            n = n + 1
    return n

# Build a 500-meter area around each Starbucks store
starbucks['Polygon'] = starbucks.apply(circle,axis=1,args=(500,))

# Construct coordinate points from the longitude and latitude of Luckin stores
luckin['Point'] = luckin.apply(point, axis=1)

# Count the Luckin stores within a 500-meter radius of each Starbucks store
starbucks['Luckin_numbers'] = starbucks.apply(is_inside, axis=1)
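
One caveat on the 500-meter circle: dividing the radius by 111300 treats a degree as about 111.3 km in every direction. That holds north-south, but a degree of longitude shrinks with the cosine of the latitude, so the buffered area is somewhat narrower east-west than 500 meters. As a cross-check, the distance can be computed directly with the haversine formula. This is a sketch with the standard library only, not the method used in this analysis; the names `haversine_m` and `count_within` are hypothetical, and the Earth is treated as a sphere of radius 6,371 km.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_within(center, points, radius_m=500):
    """Count the (lat, lon) points that lie within radius_m meters of center."""
    lat0, lon0 = center
    return sum(
        1 for lat, lon in points
        if haversine_m(lat0, lon0, lat, lon) <= radius_m
    )


# Made-up coordinates near 31 degrees north, purely for illustration
center = (31.0, 121.0)
neighbors = [(31.004, 121.0), (31.0, 121.005), (31.005, 121.0)]
print(count_within(center, neighbors))
```

In this made-up example, the second neighbor sits 0.005 degrees east of the center, roughly 477 meters away at this latitude, so the haversine check counts it. The degree-based circle has a radius of about 0.0045 degrees and would leave it out, which suggests the counts in this analysis are, if anything, slightly conservative.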

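After processing, the starbucks table has two new columns: Polygon, the 500-meter area around each store, and Luckin_numbers, the number of Luckin stores inside that area.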

Step 4: Analyze the data

Within a radius of 500 meters, there are an average of 0.6 Luckin stores around each Starbucks store nationwide.

# Within a radius of 500 meters, there are an average of 0.6 Luckin stores around each Starbucks store nationwide.
starbucks['Luckin_numbers'].mean()

Output: 0.6

Within a radius of 500 meters, the largest number of Luckin stores found around any single Starbucks store is 7.

# The largest number of Luckin stores within 500 meters of a single Starbucks store is 7
starbucks['Luckin_numbers'].max()

Output: 7

Next, cities are ranked by the average number of Luckin stores within a 500-meter radius of each Starbucks store. Linyi ranks first, with an average of 1.8 Luckin stores around each Starbucks store.

# Rank cities by the average number of Luckin stores within 500 meters of each Starbucks store
# Linyi ranks first, with an average of 1.8 Luckin stores around each Starbucks store
city_list = []
for city in pd.unique(starbucks['city']):
    avg_luckin_numbers = starbucks[starbucks['city']==city]['Luckin_numbers'].mean()
    starbucks_nums = starbucks[starbucks['city']==city]['name'].count()
    city_list.append([city,starbucks_nums,avg_luckin_numbers])

df = pd.DataFrame(city_list,columns=['city','starbucks_nums','avg_luckin_numbers'])
df.sort_values(by=['avg_luckin_numbers'],axis=0,ascending=False)

Output:
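
The per-city loop above can also be written with `groupby`. The sketch below assumes the same `city`, `name`, and `Luckin_numbers` columns. The function name `luckin_by_city` and the optional `min_stores` filter are additions for illustration, not part of the original analysis. The filter is there because a city with only a handful of Starbucks stores can easily top an average-based ranking.

```python
import pandas as pd


def luckin_by_city(starbucks, min_stores=1):
    """Per-city Starbucks count and mean nearby Luckin count, highest mean first."""
    summary = (
        starbucks.groupby("city")
        .agg(
            starbucks_nums=("name", "count"),
            avg_luckin_numbers=("Luckin_numbers", "mean"),
        )
        .reset_index()
    )
    # Optionally drop cities with too few Starbucks stores to average reliably
    summary = summary[summary["starbucks_nums"] >= min_stores]
    return summary.sort_values(
        "avg_luckin_numbers", ascending=False, ignore_index=True
    )


# Tiny made-up frame, purely to show the call pattern
demo = pd.DataFrame({
    "city": ["A", "A", "B"],
    "name": ["s1", "s2", "s3"],
    "Luckin_numbers": [0, 2, 3],
})
print(luckin_by_city(demo))
print(luckin_by_city(demo, min_stores=2))
```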

It seems Luckin's store locations really are closely tied to Starbucks. No wonder we see so many Luckin stores around Starbucks.

Starbucks stores have cultivated a coffee-drinking habit among nearby customers. Or, put the other way around, Starbucks opened there because many coffee drinkers were already there. Either way, opening a Luckin near a Starbucks reaches a large pool of potential customers at low cost, which makes it worthwhile despite the competition.

Summary

We used a visual dashboard and Python data analysis to show the regional distribution of Starbucks and Luckin Coffee stores and how their locations relate. There is much more worth analyzing and exploring.

For example, in some cities there are very few or almost no Luckin stores around Starbucks stores. What is the reason? Is it a potential opportunity or a pitfall to avoid?

If you are interested, you can give it a try.

Data and code address:

http://nexadata.cn/mobileSetMessage