Practical Python applications [NumPy special topic]: NumPy application cases (1) (with Python sample code)

Directory

Using Python to analyze the sales price of used cars

Building a GUI application for pencil sketches in Python

Required packages

Implementation steps

Full code


Using Python to analyze the sales price of used cars

Today, with advancements in technology, techniques like machine learning are being applied at scale in many organizations. These models usually work with a predefined set of data points, provided as datasets. These datasets contain past/previous information of a specific domain. It is very important to organize these data points before feeding them into the model. This is where we use data analytics. If the data fed to a machine learning model is not well organized, it can give wrong or unwanted output. This can be costly to the organization. Therefore, it is very important to utilize proper data analysis.

About the dataset:

In this example, the data we are going to use is about cars. In particular, it contains various data points about used cars, such as price, color, and so on. Simply collecting data is not enough: raw data on its own tells us very little. Data analysis is what unlocks the information we need and reveals new insights in the raw data.

Consider this scenario: our friend Otis wants to sell his car, but he has no idea how much it should sell for. He wants to maximize his profit, but he also wants the car to sell at a reasonable price to someone who wants to own it. This is where we, as data scientists, can help our friend Otis.

Let’s think like a data scientist and clearly define some of his problems. For example, is there data on the prices of other cars and their features? What features of a car affect its price? Color? Brand? Does horsepower also affect the selling price, or perhaps something else?

These are some of the questions we can start thinking about as a data analyst or data scientist. To answer them, we need some data. But the data exists in raw form, so we need to analyze it first. The data is provided to us in .csv or .data format.

The files used in this example are provided in .data format. Follow the procedure below to convert the .data file to a .csv file.

The process of converting a .data file to a .csv file:

1. Open MS Excel.
2. Go to the Data tab.
3. Select From Text and choose the .data file.
4. Tick Comma as the only delimiter.
5. Save the file to your desired location on your computer in .csv format.
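
If Excel is not available, the same conversion can be sketched in pandas. This is only a minimal sketch, assuming the .data file is plain comma-separated text with no header row. The function name data_to_csv and the file names in the usage comment are illustrative and do not come from the original example.

```python
import pandas as pd


def data_to_csv(src_path, dst_path):
    """Read a comma-separated .data file (no header row) and save it as .csv."""
    # header=None keeps the first line as data instead of using it as column names
    frame = pd.read_csv(src_path, header=None)
    # Write the rows back without an index column and without a header row
    frame.to_csv(dst_path, index=False, header=False)
    return frame


# Hypothetical usage; replace the paths with your own files:
# data_to_csv('cars.data', 'output.csv')
```

Because the file is written without a header row, the column names still have to be assigned after loading it.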

Required modules:

  • pandas: Pandas is an open source library that allows you to perform data manipulation in Python. Pandas provides an easy way to create, manipulate, and process data.
  • numpy: Numpy is the fundamental package for scientific computing in Python. Numpy can be used as an efficient multidimensional container for general data.
  • matplotlib: Matplotlib is a Python two-dimensional plotting library that can generate publication-quality figures in a variety of formats.
  • seaborn: Seaborn is a Python data visualization library based on matplotlib. Seaborn provides a high-level interface for drawing attractive and informative statistical graphics.
  • scipy: Scipy is a Python-based ecosystem of open source software for mathematics, science, and engineering.

Steps to install these packages:

  • If you use Anaconda (Jupyter/Spyder) or any other third-party software to write your Python code, make sure that the software’s “Scripts” folder is on your computer’s PATH, so that pip can be run from the command prompt.
  • Then enter – pip install package-name
    Example:
pip install numpy
  • Once the installation is complete (make sure you are connected to the internet), open your IDE and import these packages. To import, type – import package-name
    Example:
import numpy
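
To confirm that the installation worked before starting, a quick check like the following can help. It is a minimal sketch using only the standard library; the helper name missing_packages is an assumption made for illustration.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The five packages listed under "Required modules"
required = ['pandas', 'numpy', 'matplotlib', 'seaborn', 'scipy']
print('Missing packages:', missing_packages(required))
```

An empty list means every package can be imported; any name that is printed still needs a pip install.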

Steps used in the following code (short description):

  • Import the packages.
  • Set the path of the data file (.csv file).
  • Check whether the file contains any empty or NaN data, and remove it if it does.
  • Perform various data cleaning and data visualization operations on the data. Each step is explained in comments next to the relevant lines of code, because the explanation is easier to follow alongside the code.
  • Draw conclusions from the results.

Let’s start analyzing the data.

Step 1: Import the required modules.

# importing section
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import scipy as sp
import scipy.stats  # load the stats submodule so that sp.stats is available

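Step 2: Let’s examine the first five entries of the dataset.

# using the csv file; it has no header row, so header=None keeps
# the first car as data instead of treating it as column names
df = pd.read_csv('output.csv', header=None)

# Checking the first 5 entries of dataset
df.head()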

Output:

Step 3: Define the column headers for our dataset.

headers = ["symboling", "normalized-losses", "make",
           "fuel-type", "aspiration", "num-of-doors",
           "body-style", "drive-wheels", "engine-location",
           "wheel-base", "length", "width", "height", "curb-weight",
           "engine-type", "num-of-cylinders", "engine-size",
           "fuel-system", "bore", "stroke", "compression-ratio",
           "horsepower", "peak-rpm", "city-mpg", "highway-mpg", "price"]
  
df.columns = headers
df.head()

Output:

Step 4: Find missing values, if any.

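data = df

# Finding the missing values (NaN) in each column
data.isna().any()

# isnull() is an alias of isna() and returns the same result.
# Note: this dataset marks some missing entries with '?' (see the
# price column in Step 5), which isna() does not detect.
data.isnull().any()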

Output:
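
As a complement to this step: because the '?' placeholder is an ordinary string, isna() reports no missing values for it. One way to make such entries visible is to replace the placeholder with NaN first. The sketch below assumes '?' is the only placeholder; the helper name count_placeholders and the three sample rows are made up for illustration.

```python
import numpy as np
import pandas as pd


def count_placeholders(frame, placeholder='?'):
    """Replace `placeholder` with NaN; return the new frame and NaN counts per column."""
    cleaned = frame.replace(placeholder, np.nan)  # returns a new frame
    return cleaned, cleaned.isna().sum()


# Tiny illustrative frame (made-up values, not rows from the dataset)
sample = pd.DataFrame({'make': ['audi', 'bmw', 'honda'],
                       'price': ['13950', '?', '7295']})
cleaned, missing = count_placeholders(sample)
print(missing)
```

The original frame is left unchanged, so the raw values can still be inspected afterwards.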

Step 4 (continued): Convert mpg to L/100km and check the data type of each column.

# converting mpg to L/100km
data['city-mpg'] = 235 / df['city-mpg']
# the column is named 'city-mpg' (with a hyphen), so that is the key to rename
data.rename(columns={'city-mpg': 'city-L/100km'}, inplace=True)

print(data.columns)

# checking the data type of each column
data.dtypes

Output:
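
The constant 235 used above is a rounded conversion factor. As a short worked sketch, it can be derived from the definitions of the US gallon and the mile; the function name mpg_to_l_per_100km is an assumption made for illustration.

```python
LITERS_PER_US_GALLON = 3.785411784
KM_PER_MILE = 1.609344


def mpg_to_l_per_100km(mpg):
    """Convert miles per US gallon to liters per 100 km."""
    # liters used per km = liters per gallon / (miles per gallon * km per mile)
    return 100 * LITERS_PER_US_GALLON / (mpg * KM_PER_MILE)


# For 1 mpg the result is the conversion factor itself (about 235.2)
print(round(mpg_to_l_per_100km(1), 1))
```

Note that the relationship is inverse: a higher mpg value gives a lower L/100km value.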

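Step 5: Here, price has the object type (string). It should be int or float, so we need to change it.

data.price.unique()

# The column contains '?', so we drop those rows
data = data[data.price != '?'].copy()

# Convert the remaining prices from strings to integers
data['price'] = data['price'].astype(int)

# checking it again
data.dtypes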

Output:

Step 6: Normalize the values with a simple feature scaling method (shown for three columns; the rest can be done the same way) and group the prices into bins.

data['length'] = data['length']/data['length'].max()
data['width'] = data['width']/data['width'].max()
data['height'] = data['height']/data['height'].max()
  
# binning: group the prices into three equal-width ranges
bins = np.linspace(min(data['price']), max(data['price']), 4)
group_names = ['Low', 'Medium', 'High']
data['price-binned'] = pd.cut(data['price'], bins,
                              labels=group_names,
                              include_lowest=True)

print(data['price-binned'])
plt.hist(data['price-binned'])
plt.show()

Output:
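
As a complement to this step, the sketch below compares simple feature scaling with two other common scaling methods (min-max and z-score, neither of which is used in the original code) and shows how the bin edges assign labels. All numbers are made up for illustration, and the function names are assumptions.

```python
import numpy as np
import pandas as pd


def simple_scale(col):
    """Simple feature scaling: divide by the maximum (the method used above)."""
    return col / col.max()


def min_max_scale(col):
    """Min-max scaling: map the smallest value to 0 and the largest to 1."""
    return (col - col.min()) / (col.max() - col.min())


def z_score(col):
    """Z-score: subtract the mean and divide by the sample standard deviation."""
    return (col - col.mean()) / col.std()


# Made-up values, not rows from the dataset
lengths = pd.Series([150.0, 175.0, 200.0])
prices = pd.Series([5000, 9000, 20000, 35000, 45000])

# Four equally spaced edges give three equal-width bins
bins = np.linspace(prices.min(), prices.max(), 4)
price_groups = pd.cut(prices, bins, labels=['Low', 'Medium', 'High'],
                      include_lowest=True)
print(simple_scale(lengths).tolist())
print(list(price_groups))
```

Without include_lowest=True, the smallest price would fall outside the first bin and be labeled NaN.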

Step 7: Convert a categorical variable to numerical ones and run a descriptive analysis of the data.

# categorical to numerical variables
pd.get_dummies(data['fuel-type']).head()

# descriptive analysis
# NaN values are skipped
data.describe()

Output:
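
The get_dummies call above only displays the indicator columns; it does not add them to the data. A minimal sketch of one way to attach them is shown below. The helper name add_dummies and the sample values are assumptions made for illustration.

```python
import pandas as pd


def add_dummies(frame, column):
    """Replace a categorical column with one indicator column per category."""
    dummies = pd.get_dummies(frame[column], prefix=column)
    return pd.concat([frame.drop(columns=[column]), dummies], axis=1)


# Made-up values, not rows from the dataset
sample = pd.DataFrame({'fuel-type': ['gas', 'diesel', 'gas'],
                       'price': [13495, 7099, 16500]})
encoded = add_dummies(sample, 'fuel-type')
print(encoded.columns.tolist())
```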

Step 8: Plot the price distribution, then plot price against engine size.

# example of a box plot
plt.boxplot(data['price'])
plt.show()

# box plot of price per drive-wheels category, using seaborn
sns.boxplot(x='drive-wheels', y='price', data=data)
plt.show()

# Price versus engine size:
# the known variable (engine size) goes on x, the one to predict (price) on y
plt.scatter(data['engine-size'], data['price'])
plt.title('Scatterplot of Engine Size vs Price')
plt.xlabel('Engine size')
plt.ylabel('Price')
plt.grid()
plt.show()

Output:
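
A scatter plot shows the relationship visually. To put a number on it, one option is the Pearson correlation coefficient. The original code does not compute it, so the following is only a sketch; the function name pearson_r and the sample values are made up for illustration.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r
    return float(np.corrcoef(x, y)[0, 1])


# Made-up values, not rows from the dataset
engine_size = [90, 110, 130, 150, 180]
price = [6000, 8000, 11000, 15000, 21000]
print(round(pearson_r(engine_size, price), 3))
```

A value close to 1 indicates a strong positive linear relationship, close to -1 a strong negative one, and close to 0 no linear relationship.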

Step 9: Group the data by drive wheels and body style, and compute the average price of each group.

# Grouping Data
test = data[['drive-wheels', 'body-style', 'price']]
data_grp = test.groupby(['drive-wheels', 'body-style'],
                         as_index = False).mean()
  
data_grp

Output:

Step 10: Use the pivot method and draw a heatmap based on the pivoted data.

# pivot method
data_pivot = data_grp.pivot(index='drive-wheels',
                            columns='body-style')
data_pivot

# heatmap for visualizing data
plt.pcolor(data_pivot, cmap='RdBu')
plt.colorbar()
plt.show()

Output:
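
To see what grouping and pivoting produce, here is a minimal sketch on a few made-up rows. The function name mean_price_table is an assumption. Filling empty cells with 0 is also an assumption, made only so that the table has no gaps; a 0 here means "no cars in this group", not a real average.

```python
import pandas as pd

# Made-up values, not rows from the dataset
sample = pd.DataFrame({
    'drive-wheels': ['fwd', 'fwd', 'rwd', 'rwd', '4wd'],
    'body-style': ['sedan', 'sedan', 'sedan', 'hatchback', 'sedan'],
    'price': [8000, 10000, 20000, 15000, 9000],
})


def mean_price_table(frame):
    """Average price per (drive-wheels, body-style) pair as a 2-D table."""
    grouped = frame.groupby(['drive-wheels', 'body-style'],
                            as_index=False)['price'].mean()
    table = grouped.pivot(index='drive-wheels', columns='body-style',
                          values='price')
    # Combinations with no cars come out as NaN; fill them for display only
    return table.fillna(0)


table = mean_price_table(sample)
print(table)
```

Each row of the table is a drive-wheels category and each column a body style, which is the layout the heatmap colors cell by cell.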

Step 11: Run an analysis of variance and display the final result in a graph. Since the regression line slopes upward, engine size and price have a positive linear relationship.

# Analysis of Variance (ANOVA)
# returns the F-test score and the p-value
# F-test score = variation between sample group means divided by
# variation within the sample groups
# p-value = probability of seeing an F-test score at least this large
# if the group means were actually equal
data_anova = data[['make', 'price']]
# group by a single key (a string, not a list) so get_group accepts 'honda'
grouped_anova = data_anova.groupby('make')
anova_results = sp.stats.f_oneway(
    grouped_anova.get_group('honda')['price'],
    grouped_anova.get_group('subaru')['price'])
print(anova_results)

# a large F-test score together with a small p-value indicates a strong
# association between the categorical variable (make) and price

# Correlation measures dependency, not causation
sns.regplot(x='engine-size', y='price', data=data)
plt.ylim(0, )

Output:
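
To make the F-test score described in the comments above concrete, the following sketch computes it by hand with NumPy for a one-way ANOVA. It is an illustration of the formula, not a replacement for scipy's f_oneway. The function name one_way_f and the three small groups are made up.

```python
import numpy as np


def one_way_f(*groups):
    """F statistic: between-group mean square divided by within-group mean square."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    k = len(groups)        # number of groups
    n = all_values.size    # total number of observations

    # Variation of the group means around the grand mean
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return float(ms_between / ms_within)


# Made-up groups: the third group's mean is far from the other two
print(one_way_f([1, 2, 3], [2, 3, 4], [5, 6, 7]))
```

The further apart the group means are relative to the spread inside each group, the larger the score becomes.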

Building a GUI application for pencil sketches in Python

Artists in many disciplines use sketching and painting to preserve ideas, memories, and thoughts. From painting and sculpting to visiting art museums, experiencing the arts offers a variety of health benefits, including reduced stress and improved critical thinking skills. Drawing, sketching, and painting in particular have been linked to improved creativity and memory and to reduced stress, and are used in art therapy.

In this article, we build a web application that converts images directly into sketches using the Python framework Streamlit. Users can upload an image to be converted into a watercolor or pencil sketch, and can then download the converted picture. Before that, let’s go over some definitions that we will use in this article.

  • Streamlit – Streamlit is a popular open source web application framework among Python developers. It is interoperable and compatible with a range of commonly used libraries, including Keras, scikit-learn, NumPy, and pandas.
  • PIL – PIL is short for Python Imaging Library. It is a software package for image processing in the Python programming language. It includes lightweight image manipulation tools to help with picture editing, creation, and storage.
  • NumPy – NumPy is a widely used Python library for numerical computing on arrays. Here it is used to pass images to cv2 as arrays.
  • cv2 – This library (the Python interface to OpenCV) is used to solve computer vision problems.

Required packages

pip install streamlit
pip install opencv-python
pip install numpy
pip install Pillow

Implementation steps

Step 1: Install Streamlit

Likewise, install PIL (the Pillow package), NumPy, and cv2 (the opencv-python package) using the commands listed under Required packages.

Step 2: Test whether the installation was successful.

streamlit hello

Step 3: Now run the Streamlit web app by typing the following command:

streamlit run app.py

Step 4: The web application has now been launched successfully. You can access it through a local URL or a network URL.

Step 5: Create a new folder for the web app that converts images to sketches and name it, for example, Web App.

Step 6: Paste the web app code into the file ‘app.py’ and save the file.

At the top of the code, we import all the frameworks, packages, libraries, and modules that we use to build the web application. We then define our own functions: one converts an image to a watercolor sketch, one converts an image to a pencil sketch, and one loads an image using the PIL library. The main function contains the code of the web application itself. It starts with a title and a subheader that direct users to upload an image, and it uses Streamlit’s file uploader to receive the image. A dropdown menu lets the user choose between a watercolor sketch and a pencil sketch, and the result is rendered based on that choice. The original image and the image with the filter applied are shown side by side so that users can compare them. Finally, users can download the converted image to their local machine through Streamlit’s download button.
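
The download step relies on turning an image into raw bytes in memory. The sketch below isolates that step, assuming the image is a uint8 NumPy array. The helper name to_png_bytes is an assumption made for illustration and is not part of the app code.

```python
from io import BytesIO

import numpy as np
from PIL import Image


def to_png_bytes(pixels):
    """Encode a uint8 image array as PNG and return the raw bytes."""
    buf = BytesIO()
    # Saving to an in-memory buffer avoids writing a temporary file
    Image.fromarray(pixels).save(buf, format='PNG')
    return buf.getvalue()


# A small all-black grayscale image, used only as a stand-in
gray = np.zeros((4, 4), dtype=np.uint8)
png_bytes = to_png_bytes(gray)
print(len(png_bytes) > 0)
```

Bytes produced this way can be handed to a download widget, as long as the encoding format matches the file extension and MIME type offered to the user.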

Full code

# import the frameworks, packages and libraries
import streamlit as st
from PIL import Image
from io import BytesIO
import numpy as np
import cv2 # computer vision
  
# function to convert an image to a
# water color sketch
def convertto_watercolorsketch(inp_img):
    img_1 = cv2.edgePreservingFilter(inp_img, flags=2, sigma_s=50, sigma_r=0.8)
    img_water_color = cv2.stylization(img_1, sigma_s=100, sigma_r=0.5)
    return img_water_color

# function to convert an image to a pencil sketch
def pencilsketch(inp_img):
    # cv2.pencilSketch returns a grayscale sketch and a color sketch;
    # only the grayscale one is used here
    img_pencil_sketch, pencil_color_sketch = cv2.pencilSketch(
        inp_img, sigma_s=50, sigma_r=0.07, shade_factor=0.0825)
    return img_pencil_sketch

# function to load an image
def load_an_image(image):
    img = Image.open(image)
    return img
  
# the main function which has the code for
# the web application
def main():
    
    # basic heading and titles
    st.title('WEB APPLICATION TO CONVERT IMAGE TO SKETCH')
    st.write("This is an application developed for converting "
             "your ***image*** to a ***Water Color Sketch*** OR ***Pencil Sketch***")
    st.subheader("Please Upload your image")

    # image file uploader
    image_file = st.file_uploader("Upload Images", type=["png", "jpg", "jpeg"])
  
    # if the image is uploaded then execute these
    # lines of code
    if image_file is not None:
        
        # select box (drop down to choose between water
        # color / pencil sketch)
        option = st.selectbox('How would you like to convert the image',
                              ('Convert to water color sketch',
                               'Convert to pencil sketch'))
        if option == 'Convert to water color sketch':
            # convert to RGB so that images with an alpha channel
            # (some PNG files) can be processed by cv2
            image = Image.open(image_file).convert('RGB')
            final_sketch = convertto_watercolorsketch(np.array(image))
            im_pil = Image.fromarray(final_sketch)

            # two columns to display the original image and the
            # image after applying water color sketching effect
            col1, col2 = st.columns(2)
            with col1:
                st.header("Original Image")
                st.image(load_an_image(image_file), width=250)

            with col2:
                st.header("Water Color Sketch")
                st.image(im_pil, width=250)
                buf = BytesIO()
                img = im_pil
                # save as PNG so the bytes match the file name and MIME type below
                img.save(buf, format="PNG")
                byte_im = buf.getvalue()
                st.download_button(
                    label="Download image",
                    data=byte_im,
                    file_name="watercolorsketch.png",
                    mime="image/png"
                )
  
        if option == 'Convert to pencil sketch':
            # convert to RGB so that images with an alpha channel
            # (some PNG files) can be processed by cv2
            image = Image.open(image_file).convert('RGB')
            final_sketch = pencilsketch(np.array(image))
            im_pil = Image.fromarray(final_sketch)

            # two columns to display the original image
            # and the image after applying
            # pencil sketching effect
            col1, col2 = st.columns(2)
            with col1:
                st.header("Original Image")
                st.image(load_an_image(image_file), width=250)

            with col2:
                st.header("Pencil Sketch")
                st.image(im_pil, width=250)
                buf = BytesIO()
                img = im_pil
                # save as PNG so the bytes match the file name and MIME type below
                img.save(buf, format="PNG")
                byte_im = buf.getvalue()
                st.download_button(
                    label="Download image",
                    data=byte_im,
                    file_name="pencilsketch.png",
                    mime="image/png"
                )
  
  
if __name__ == '__main__':
    main()

Output: