Use C# to simplify your work (processing Excel tables)

Use C# to simplify work (processing Excel tables)

It is necessary to deduplicate the data in several Excel tables, and then save the data of each site as a Sheet.

Several tables are shown below:

The implementation effect is as follows:

Detailed implementation

Need to use EPPlus to operate Excel

Install EPPlus as follows:

For better demonstration and explanation, the steps are divided. First, import Excel data, then remove duplicates, then classify the data, and finally export it as Excel data. A form is designed, as shown below:

Import Excel data

First define a class to save relevant data. The design of the class is as follows:

 public class WaterData
 {<!-- -->
      public int Id {<!-- --> get; set; }
      public string? Name {<!-- --> get; set; }
      public string? WaterLevel {<!-- --> get; set; }
      public string? WaterChange {<!-- --> get; set; }
      public string? Source {<!-- --> get; set; }
     
 }

The code for clicking the Import Excel data button is as follows:

OpenFileDialog openFileDialog = new OpenFileDialog();
openFileDialog.Filter = "Excel Files (*.xlsx; *.xls;*.csv)|*.xlsx; *.xls;*.csv";
openFileDialog.FilterIndex = 1;
openFileDialog.Multiselect = false;

   if (openFileDialog.ShowDialog() == DialogResult.OK)
      {<!-- -->
          filePath = openFileDialog.FileName;
          ExcelPackage.LicenseContext = OfficeOpenXml.LicenseContext.NonCommercial;
          using (ExcelPackage package = new ExcelPackage(filePath))
           {<!-- -->

               ExcelWorksheet worksheet = package.Workbook.Worksheets[0];

                //Get the number of columns and rows of the table
                int rowCount = worksheet.Dimension.Rows;
                int colCount = worksheet.Dimension.Columns;

                for (int i = 0; i < rowCount - 1; i + + )
                {<!-- -->
                   //Create a realData class to save data
                   var data = new WaterData();
             data.Id = n;
             data.Name = (string)worksheet.Cells[i + 2, 3].Value;
             data.WaterLevel = Convert.ToString(worksheet.Cells[i + 2, 4].Value);
             data.WaterChange = Convert.ToString(worksheet.Cells[i + 2, 5].Value);
             data.Source = (string)worksheet.Cells[i + 2, 2].Value;
             waterList.Add(data);
                     n + + ;
                    }
             package.Save();

                }
            }
            else
            {<!-- -->

                MessageBox.Show("You did not select any files this time!!!");
            }
        }

The n above is a static int, and the initial value is 0.

The effect of importing Excel data is as follows:

Data deduplication

Data deduplication is performed based on the DistinctBy method in C# LINQ. In this example, id is not used because each piece of data has a different id. Even if it is duplicate data, the id is also different. This example is based on the Name attribute. And the Soure attribute, as long as the two data are the same, it is considered to be duplicate data.

The code for clicking the data deduplication button is as follows:

private void button2_Click(object sender, EventArgs e)
{<!-- -->
    distinctList = waterList.DistinctBy(x => new {<!-- --> x.Name, x.Source }).ToList();
}

All it takes is one line of code:

distinctList = waterList.DistinctBy(x => new {<!-- --> x.Name, x.Source }).ToList();

The effect of performing deduplication is as follows:

Perform data classification

In this example, it is required to put the data of the same site on the same Sheet, so first you need to know how many different site names there are. The code is as follows:

 var Names = distinctList.Select(x => x.Name).Distinct().ToList();

The effect is as follows:

Since these data are recognized by image text, the recognition may be incorrect. If there are less than 50 pieces of data for a website name, it is not needed. The code is as follows:

 for (int i = 0; i < Names.Count; i + + )
  {<!-- -->
     var nameList = distinctList.Where(x => x.Name == Names[i]).ToList();
     if (nameList.Count > 50)
         {<!-- -->
               list.Add(nameList);
          }
   }

Implementing data classification only requires one line of code:

 var nameList = distinctList.Where(x => x.Name == Names[i]).ToList();

The implementation effect is as follows:

Export to Excel file

In this example, the idea of exporting to an Excel file is to first let the user select a folder, and then save the exported Excel file under this folder. The code for exporting to an Excel file is as follows:

 private void button4_Click(object sender, EventArgs e)
        {<!-- -->
            //Create a FolderBrowserDialog object
            FolderBrowserDialog folderBrowserDialog = new FolderBrowserDialog();

            //Set the title of the dialog box
            folderBrowserDialog.Description = "Select the folder to save the data of each site";

            //Set the default root folder, if necessary
            // folderBrowserDialog.RootFolder = Environment.SpecialFolder.MyComputer;

            //Show the folder selection dialog
            DialogResult result = folderBrowserDialog.ShowDialog();

            if (result == DialogResult.OK)
            {<!-- -->
                //The user selected a folder
                selectedFolderPath = folderBrowserDialog.SelectedPath;
                richTextBox1.Text + = $"The selected Excel saving folder is: {<!-- -->selectedFolderPath}\r\\
";
                richTextBox1.Text + = "Executing export to Excel file...";
                using (ExcelPackage excelPackage = new ExcelPackage())
                {<!-- -->
                    for(int i =0; i < list.Count; i + + )
                    {<!-- -->
                        ExcelWorksheet worksheet = excelPackage.Workbook.Worksheets.Add(list[i][0].Name);
                        for (int j = 0; j < list[i].Count; j + + )
                        {<!-- -->
                            worksheet.Cells[j + 1, 1].Value = list[i][j].Id;
                            worksheet.Cells[j + 1, 2].Value = list[i][j].Name;
                            worksheet.Cells[j + 1, 3].Value = list[i][j].WaterLevel;
                            worksheet.Cells[j + 1, 4].Value = list[i][j].WaterChange;
                            worksheet.Cells[j + 1, 5].Value = list[i][j].Source;
                        }
                    }

                    //Save Excel file
                    FileInfo excelFile = new FileInfo($"{<!-- -->selectedFolderPath}\Each site data.xlsx");
                    excelPackage.SaveAs(excelFile);
                    richTextBox1.Text + = "Export to Excel file completed\r\\
";
                }
            }
        }

The implementation effect is as follows:

Finally

We generally use VBA and Python to operate Excel. This article demonstrates how to use C# to simplify our office work (processing Excel data) through an example.

The entire source code of this example is as follows:

using OfficeOpenXml;
using System.Collections;
using System.Collections.Generic;

namespace Excel data processing
{
    public partial class Form1 : Form
    {
        string filePath;
        string selectedFolderPath;
        static int n = 0;
        List waterList = new List();
        List distinctList = new List();
        List> list = new List>();
        public class WaterData
        {
            public int Id { get; set; }
            public string? Name { get; set; }
            public string? WaterLevel { get; set; }
            public string? WaterChange { get; set; }
            public string? Source { get; set; }

        }


        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            OpenFileDialog openFileDialog = new OpenFileDialog();
            openFileDialog.Filter = "Excel Files (*.xlsx; *.xls;*.csv)|*.xlsx; *.xls;*.csv";
            openFileDialog.FilterIndex = 1;
            openFileDialog.Multiselect = false;

            if (openFileDialog.ShowDialog() == DialogResult.OK)
            {
                filePath = openFileDialog.FileName;
                richTextBox1.Text + = $"The file path you selected is: {filePath}\r\\
";
                richTextBox1.Text + = $"Importing Excel data...\r\\
";
                ExcelPackage.LicenseContext = OfficeOpenXml.LicenseContext.NonCommercial;
                using (ExcelPackage package = new ExcelPackage(filePath))
                {

                    ExcelWorksheet worksheet = package.Workbook.Worksheets[0];

                    //Get the number of columns and rows of the table
                    int rowCount = worksheet.Dimension.Rows;
                    int colCount = worksheet.Dimension.Columns;

                    for (int i = 0; i < rowCount - 1; i + + )
                    {
                        //Create a realData class to save data
                        var data = new WaterData();
                        data.Id = n;
                        data.Name = (string)worksheet.Cells[i + 2, 3].Value;
                        data.WaterLevel = Convert.ToString(worksheet.Cells[i + 2, 4].Value);
                        data.WaterChange = Convert.ToString(worksheet.Cells[i + 2, 5].Value);
                        data.Source = (string)worksheet.Cells[i + 2, 2].Value;
                        waterList.Add(data);
                        n + + ;
                    }
                    richTextBox1.Text + = $"Imported Excel data successfully, the data volume is: {rowCount - 1}\r\\
";
                    package.Save();

                }
            }
            else
            {

                MessageBox.Show("You did not select any files this time!!!");
            }
        }

        private void button2_Click(object sender, EventArgs e)
        {
            richTextBox1.Text + = "Performing data deduplication...\r\\
";
            distinctList = waterList.DistinctBy(x => new {<!-- --> x.Name, x.Source }).ToList();
            richTextBox1.Text + = $"Data deduplication has been completed. The amount of data after deduplication is: {distinctList.Count}\r\\
";
        }

        private void button3_Click(object sender, EventArgs e)
        {
            richTextBox1.Text + = "Performing data classification...\r\\
";
            var Names = distinctList.Select(x => x.Name).Distinct().ToList();
            for (int i = 0; i < Names.Count; i + + )
            {
                var nameList = distinctList.Where(x => x.Name == Names[i]).ToList();
                if (nameList.Count > 50)
                {
                    list.Add(nameList);
                }
            }
            richTextBox1.Text + = $"The execution of data classification is completed, the number of categories is: {list.Count}\r\\
";
        }

        private void button4_Click(object sender, EventArgs e)
        {<!-- -->
            //Create a FolderBrowserDialog object
            FolderBrowserDialog folderBrowserDialog = new FolderBrowserDialog();

            //Set the title of the dialog box
            folderBrowserDialog.Description = "Select the folder to save the data of each site";

            //Set the default root folder, if necessary
            // folderBrowserDialog.RootFolder = Environment.SpecialFolder.MyComputer;

            //Show the folder selection dialog
            DialogResult result = folderBrowserDialog.ShowDialog();

            if (result == DialogResult.OK)
            {<!-- -->
                //The user selected a folder
                selectedFolderPath = folderBrowserDialog.SelectedPath;
                richTextBox1.Text + = $"The selected Excel saving folder is: {<!-- -->selectedFolderPath}\r\\
";
                richTextBox1.Text + = "Executing export to Excel file...";
                using (ExcelPackage excelPackage = new ExcelPackage())
                {<!-- -->
                    for(int i =0; i < list.Count; i + + )
                    {<!-- -->
                        ExcelWorksheet worksheet = excelPackage.Workbook.Worksheets.Add(list[i][0].Name);
                        for (int j = 0; j < list[i].Count; j + + )
                        {<!-- -->
                            worksheet.Cells[j + 1, 1].Value = list[i][j].Id;
                            worksheet.Cells[j + 1, 2].Value = list[i][j].Name;
                            worksheet.Cells[j + 1, 3].Value = list[i][j].WaterLevel;
                            worksheet.Cells[j + 1, 4].Value = list[i][j].WaterChange;
                            worksheet.Cells[j + 1, 5].Value = list[i][j].Source;
                        }
                    }

                    //Save Excel file
                    FileInfo excelFile = new FileInfo($"{<!-- -->selectedFolderPath}\Each site data.xlsx");
                    excelPackage.SaveAs(excelFile);
                    richTextBox1.Text + = "Export to Excel file completed\r\\
";
                }
            }
        }
    }
}

If you are also interested in C#, welcome to follow the WeChat official account, DotNet to learn and communicate, and grow and progress together~