pandas calculates the photovoltaic power generation time utilization rate of charging stations through charging orders

1. Case background

The existing electric vehicle charging station order record data set contains data from many stations and many days. The data intercepts data from 6 am to 19 pm. The fields include the name of the charging station, the name of the charging pile, the name of the charging gun, and the start time of charging. , ends the charging time. There are many charging guns in the charging station, and photovoltaic power generation is set up on the awning. All the electricity generated by photovoltaics is used for car charging. As long as there is more than one charging gun in the charging station, the electricity generated by photovoltaics will be used during charging. When none of the guns are charged, the electricity generated by the photovoltaics will be wasted. These data are used to calculate the proportion of time that each charging station has people charging every day, that is, the photovoltaic time utilization rate. Example data: Suppose there are two guns at the station, gun A and gun B. On a certain day, these guns have the following charging orders:

td>

Charging Gun Start charging time End charging time
A gun 2023-10-21 11:32:22 2023-10-21 12:22:20
B gun 2023-10-21 11:45:25 2023-10-21 13:15:24
A gun 2023-10-21 13:50:21 2023-10-21 14:30:21

The actual photovoltaic utilization time periods calculated by the above three calculations are [2023-10-21 11:32:22, 2023-10-21 13:15:24], [2023-10-21 13:50:21, 2023-10- 21 14:30:21]. Finally calculate the percentage of these photovoltaic utilization time periods in a day.

2. Problem-solving ideas

1. Convert the start charging time and end charging time to the number of seconds in a day. For example, the start time is 2023-10-21 10:01:21, which is converted to 10*60*60 + 1*60 + 21=36081, that is, this time is 36081 seconds of the day. There are a total of 60*60*25=86400 seconds in the whole day.

1. Date conversion
'''
datetime.strptime(start_time,"%Y-%m-%d %H:%M:%S"): This line parses the start_time string into a datetime object. %Y-%m-%d %H:%M:%S is Python’s date and time format, where:

%Y represents a four-digit year (for example, 2023)
%m represents a two-digit month (for example, 07)
%d represents a two-digit date (for example, 06)
%H represents the number of hours in 24-hour format (for example, 13)
%M represents the number of minutes (for example, 30)
%S represents the number of seconds (for example, 45)
'''
t1=datetime.strptime(start_time,"%Y-%m-%d %H:%M:%S") #2023-10-21 11:55:23

2. Date splicing
t2=datetime.combine(t1.date(),datetime.min.time()) #2023-10-21 00:00:00

3. Calculate seconds
'''
This line of code calculates the difference between two datetime objects, and the result is a timedelta object that contains the time difference between the two points in time. The difference is in seconds.
'''
total_seconds=int((t1-t2).total_seconds()) #total_seconds() method returns the total number of seconds

2. Write a function that accepts a list of time intervals as input and returns a merged list of time intervals. This may be useful in applications such as scheduling or time period merging.

#Input[[3,7],[2,6],[4,5],[12,19]]Output[[2,7],[12,19]]
def merge_time_intervals(intervals):
    if not intervals:
        return []

    # Sort according to the start time of the time interval
    intervals.sort(key=lambda x: x[0])

    merged = [intervals[0]]
    for interval in intervals[1:]:
        if interval[0] <= merged[-1][1]:
            print("merged[-1][1]:",merged[-1][1])
            merged[-1][1] = max(merged[-1][1], interval[1])
        else:
            merged.append(interval)
    return merged

input=[[3,7],[2,6],[4,5],[12,19]]
print(merge_time_intervals(input))

3. Calculate the total seconds of the merged time interval list

#Calculate the total number of seconds in the time interval list merged
#Input [[3,7],[2,6],[4,5],[12,19]] Output [[2,7],[12,19]]

def merge_time_intervals(intervals):
    if not intervals:
        return []

    # Sort according to the start time of the time interval
    intervals.sort(key=lambda x: x[0])

    merged = [intervals[0]]
    for interval in intervals[1:]:
        if interval[0] <= merged[-1][1]:
            #print("merged[-1][1]:",merged[-1][1])
            merged[-1][1] = max(merged[-1][1], interval[1])
        else:
            merged.append(interval)
    print(merged)
    return merged

l=[[3,7],[2,6],[4,5],[12,19]]

merged=merge_time_intervals(l)


'''
merged is a list of tuples [[2,7],[12,19]], each tuple contains two elements, which we assume are timestamps.
The tuples in the list represent a series of time periods.
end - start operates on the start and end timestamps in each tuple to get the duration of each time period.
sum(end - start for start, end in merged)
Python's generator expressions (also called "list comprehensions") are used to generate a list of all time differences,
Then use the sum() function to calculate the sum of all values in this list.
'''
total_duration = sum(end - start for start, end in merged)
print(total_duration)

#Output: [[2, 7], [12, 19]]
#12

#(7-2) + (19-12) =5 + 7=12

3. Complete code

import pandas as pd
from datetime import datetime
df = pd.read_excel(r'E:\SynologyDrive\python\Calculate photovoltaic time utilization\Data set for photovoltaic utilization calculation - send csdn.xlsx')

grouped = df.groupby(['station name',df['start charging time'].apply(lambda x: datetime.strptime(x, '%Y-%m-%d %H: %M:%S').date())])
# Calculate the start time and end time in seconds of the day
def get_time_difference(start_time, end_time):
    #start_time=2023-08-26 10:11:32
    r1=int((datetime.strptime(start_time,"%Y-%m-%d %H:%M:%S") - datetime.combine(datetime.strptime(start_time,"%Y-%m -%d %H:%M:%S").date(),datetime.min.time())).total_seconds())

    if datetime.strptime(start_time,"%Y-%m-%d %H:%M:%S").date()<datetime.strptime(end_time,"%Y-%m-%d % H:%M:%S").date():
        r2=86400 #The date of the start time is less than the date of the end time, that is, the charging order time spans one day.
    else:
        r2=int((datetime.strptime(end_time,"%Y-%m-%d %H:%M:%S") - datetime.combine(datetime.strptime(end_time,"%Y-%m -%d %H:%M:%S").date(),datetime.min.time())).total_seconds())
    #print(r1,r2)
    return [r1,r2]

# Put the start time and end time in each group into a list
result = {
    'Station name': [],
    'date': [],  
    'Starting time': [],  
    'End Time': [],
    'Fill interval':[],
    'Merge interval':[],
    'duration':[],
    'Photovoltaic time utilization rate':[]
}

#Input [[3,7],[2,6],[4,5],[12,19]] Output [[2,7],[12,19]]
def merge_time_intervals(intervals):
    if not intervals:
        return []

    # Sort according to the start time of the time interval
    intervals.sort(key=lambda x: x[0])

    merged = [intervals[0]]
    for interval in intervals[1:]:
        if interval[0] <= merged[-1][1]:
            #print("merged[-1][1]:",merged[-1][1])
            merged[-1][1] = max(merged[-1][1], interval[1])
        else:
            merged.append(interval)
    total_duration = sum(end - start for start, end in merged)
    Time utilization=total_duration/86400

    return merged,total_duration,time utilization

for name, group in grouped:
    result['station name'].append(name[0])
    result['date'].append(name[1])
    result['Start time'].append(group['Start charging time'].tolist())
    result['End time'].append(group['End charging time'].tolist())
    r=[]
    for i,item in enumerate(group['Start charging time'].tolist()):
        timerange=get_time_difference(group['Start charging time'].tolist()[i],group['End charging time'].tolist()[i])
        r.append(timerange)

    result['fill interval'].append(r)
    t=merge_time_intervals(r)
    result['Merge interval'].append(t[0])
    result['Duration'].append(t[1])
    result['Photovoltaic time utilization rate'].append(t[2])
  
# print results
for i, v in enumerate(result['station name']):
    #print(f'Site name: {v}, date: {result["date"][i]}, start time: {result["start time"][i]}, end time : {result["End time"][i]}, filling interval: {result["Filling interval"][i]}')
    print(f'station name: {v}, date: {result["date"][i]}, merge interval: {result["merge interval"][i]}, duration: { result["duration"][i]}, photovoltaic time utilization rate: {result["photovoltaic time utilization rate"][i]}')

df = pd.DataFrame(result)
  
#Write DataFrame to Excel file
df.to_excel('Photovoltaic time utilization calculation result result.xlsx', index=False)

4. Relevant data sets and calculation results

This article binds two resources:

1. Data set

2. Calculation results

The knowledge points of the article match the official knowledge archive, and you can further learn relevant knowledge. Python entry-level skill treeStructured data analysis tool PandasPandas overview 382,731 people are learning the system