Implementation of a (console) keyword retrieval system based on C++

Keyword search system

1 Analysis

1.1 Background Analysis

Counting the occurrences of a keyword in a text is a common operation in word processing, text statistics, and similar work. Because it is a low-level operation that is invoked frequently, its execution efficiency and implementation complexity are important considerations for the designer. Both also vary with the language of the text. In English, words are clearly separated by spaces, so the search is relatively easy to implement; in Chinese, the words in a sentence are written without separators, so it is more complicated to implement.

1.2 Functional Analysis

As a keyword retrieval system, the program lets the user enter several English paragraphs (that is, multiple lines of English text) in the terminal. To mark the end of the input unambiguously, a line containing only “0” serves as the terminator. The program reads the paragraphs entered by the user and stores them in a file whose name the user specifies. This is the first step.

Next, the program reads this file back and searches it for the keyword. This is the second step, and it is required by the assignment. Although the round trip through a file seems redundant and cumbersome, it exercises both storing the input in a file and reading content from a file, which makes it good practice in file input and output. It also preserves the text entered by the user, killing two birds with one stone.
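The first step can be sketched as a small function. This is only a minimal sketch under stated assumptions, not the project's own code: the name copyUntilTerminator and its stream parameters are hypothetical, and a generic std::ostream stands in for the output file so the logic can be tried without touching the disk.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: copies lines from `in` to `out` until a line that is
// exactly equal to `terminator` (or until the input ends).
// Returns the number of lines that were copied.
std::size_t copyUntilTerminator(std::istream& in, std::ostream& out,
                                const std::string& terminator = "0")
{
    std::size_t lines = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (line == terminator) {
            break;  // the terminator line is consumed but not stored
        }
        out << line << '\n';
        ++lines;
    }
    return lines;
}
```

Because the comparison is against the whole line, a line such as "10 apples" or "0 errors" is stored like any other line; only a line consisting of the single character 0 ends the input. In the real program the arguments would presumably be std::cin and the opened std::ofstream.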

2 Design

2.1 Class Structure Design

We designed two classes. The file initialization class InitFile reads the English paragraphs entered in the terminal and generates the file. The keyword search class KeywordSearch reads the file generated by the first class and counts the occurrences of the keyword in it; the result is finally output in the main function.

2.2 Member and Operation Design

File initialization class (InitFile)

public:
    // Constructor: initialize the file name and create (or open) the file with ios::out.
    // With ios::out, the file is created if it does not exist; if it exists, it is opened and its contents are cleared.
    InitFile();

    // Destructor: close the file
    ~InitFile();

    // Read the text entered by the user (terminated by a line containing only 0)
    void InputText();

    // Get the file name
    string getFilename() const;
private:
    // _name is the file name, _input is the unprocessed input read directly from the terminal
    // _fout is the file output stream
    string _name, _input;
    ofstream _fout;

Keyword Search Class (KeywordSearch)

public:
    // Constructor: takes the keyword to search for and opens the file to be searched
    KeywordSearch(const string& keyword, const string& filename);

    // Destructor: close the file
    ~KeywordSearch();

    // Get the number of occurrences of the keyword
    int getCount();
private:
    // _filename is the file to search, _keyword is the keyword to count
    // _fin is the file input stream; _count caches the result (-1 means not yet counted)
    string _filename, _keyword;
    ifstream _fin;
    int _count;

2.3 System Logic Design

First, create an object init_file of the InitFile class and call its InputText() function, which prompts the user and reads the input, thereby creating a file that contains the text the user entered. Then read the keyword to be retrieved and convert it to lowercase. Finally, use the keyword and the file name init_file.getFilename() to construct a KeywordSearch object kwSearch, and call its getCount() function to get the number of occurrences of the keyword.
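The overall flow can be condensed into one function for illustration. The sketch below is an assumption-laden simplification rather than the article's class design: storeThenCount and toLower are hypothetical names, the input comes from any std::istream instead of the terminal, and tokens are compared as whole whitespace-separated strings without any punctuation handling.

```cpp
#include <algorithm>
#include <cctype>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns a lowercase copy of s.
std::string toLower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper combining both steps:
// 1) store the lowercased input lines in `filename` until a line "0" is read;
// 2) reopen the file and count the tokens equal to the lowercased keyword.
int storeThenCount(std::istream& input, const std::string& filename, std::string keyword)
{
    {
        std::ofstream fout(filename, std::ios::out);
        std::string line;
        while (std::getline(input, line) && line != "0") {
            fout << toLower(line) << '\n';
        }
    }  // fout is closed here, so the file is complete before it is read

    keyword = toLower(keyword);
    std::ifstream fin(filename);
    int count = 0;
    std::string word;
    while (fin >> word) {
        if (word == keyword) {
            ++count;
        }
    }
    return count;
}
```

Lowercasing both the stored text and the keyword is what makes the search case-insensitive: a search for "The" finds "the" and "THE" alike.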

3 Implementation

3.1 Implementation of File Generation

3.1.1 Implementation method

In the InitFile constructor, the user is first asked for a file name. If the input is not empty, it is used as the file name; otherwise the default file name Default.txt is used. The file is then opened with ios::out.
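The effect of the open mode is easy to check in isolation. The following is a minimal sketch with hypothetical helper names (chooseFilename, writeText, readAll); it assumes the working directory is writable and contrasts plain ios::out, which clears an existing file, with ios::out | ios::app, which appends to it.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: falls back to the default name when the input is empty.
std::string chooseFilename(const std::string& userInput)
{
    return userInput.empty() ? "Default.txt" : userInput;
}

// Hypothetical helper: writes `text` to `path` using the given open mode.
void writeText(const std::string& path, const std::string& text,
               std::ios_base::openmode mode)
{
    std::ofstream fout(path, mode);
    fout << text;
}

// Hypothetical helper: reads the whole file into a string.
std::string readAll(const std::string& path)
{
    std::ifstream fin(path);
    return std::string(std::istreambuf_iterator<char>(fin),
                       std::istreambuf_iterator<char>());
}
```

Writing twice with std::ios::out leaves only the second text in the file, whereas a later write with std::ios::out | std::ios::app keeps the existing content and adds to it. Opening with plain ios::out therefore means each run of the program starts from an empty file.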

In InitFile::InputText(), the user is asked to enter English paragraphs, with a line containing only 0 as the end mark. Each line is read with getline, converted to lowercase, and written to the opened file.

This completes the generation of the file.

3.1.2 Core code

InitFile::InitFile()
{
    _input = "";
    cout << "Please enter a file name (Default.txt by default):";
    getline(cin, _input);
    _name = (_input == "" ? "Default.txt" : _input);
    _fout.open(_name, ios::out /*| ios::app*/);
}

void InitFile::InputText()
{
    cout << "Please enter an English paragraph (with 0 on a single line as the end mark):" << endl;
    while (getline(cin, _input))
    {
        if (_input == "0") {
            break;
        }
        // Convert the line to lowercase. The lambda avoids the overload ambiguity of passing
        // std::tolower directly and keeps its argument in the unsigned char range.
        transform(_input.begin(), _input.end(), _input.begin(),
            [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        _fout << _input << endl;
    }
}

3.2 Implementation of Keyword Search

3.2.1 Implementation method

The KeywordSearch object kwSearch is initialized with the keyword and the file name init_file.getFilename() (the file name should be obtained from the InitFile object). Its constructor assigns the private members _keyword and _filename and initializes the occurrence count _count to -1. It then opens _filename with ios::in and sets the read position to the beginning of the file.

Then kwSearch.getCount() is called in the main function. KeywordSearch::getCount() first checks whether _count is -1. If it is not -1, the file has already been searched and _count is returned directly. Otherwise _count is set to 0 and strings are read from the file one after another (English uses spaces as separators, and stream extraction into a string is also delimited by whitespace, so this models the reading of words well). Each string is compared with the keyword, and _count is incremented on a match.
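The whitespace-delimited reading can be observed on its own. In this minimal sketch the function name splitOnWhitespace is hypothetical and a std::istringstream can stand in for the file; the loop condition uses the extraction itself, so the loop ends as soon as no further token can be read.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects every whitespace-separated token from `in`.
std::vector<std::string> splitOnWhitespace(std::istream& in)
{
    std::vector<std::string> tokens;
    std::string word;
    // operator>> skips leading whitespace (spaces, tabs, newlines) and stops
    // at the next whitespace; the loop ends when extraction fails.
    while (in >> word) {
        tokens.push_back(word);
    }
    return tokens;
}
```

For the input "the cat, the dog." the tokens are "the", "cat,", "the" and "dog.", so any punctuation that touches a word stays attached to it.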

Note that punctuation marks may be read into the string together with an English word, so it is necessary to check whether the last character of the string is a letter. If it is not, the pure English substring is extracted and used for the comparison instead.
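One possible way to extract the pure English part of a token is shown below. It is a sketch that goes slightly beyond the check described above, since it also removes punctuation at the front of the token (for example an opening quotation mark); the name trimNonLetters is hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: removes leading and trailing characters that are not
// letters; characters inside the word (such as an apostrophe) are kept.
std::string trimNonLetters(const std::string& token)
{
    std::size_t first = 0;
    std::size_t last = token.size();
    while (first < last && !std::isalpha(static_cast<unsigned char>(token[first]))) {
        ++first;
    }
    while (last > first && !std::isalpha(static_cast<unsigned char>(token[last - 1]))) {
        --last;
    }
    return token.substr(first, last - first);
}
```

A token made up only of punctuation becomes an empty string, which the caller should skip rather than compare with the keyword.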

3.2.2 Core code

KeywordSearch::KeywordSearch(const string& keyword, const string& filename)
{
    _count = -1;
    _keyword = keyword;
    _filename = filename;
    _fin.open(_filename, ios::in); // Open the file with ios::in (read only)
    _fin.seekg(0, ios::beg); // Set the read position to the beginning of the file
}

int KeywordSearch::getCount()
{
    // _count != -1 means the file has already been searched; return _count directly
    if (_count == -1) {
        _count = 0;
        string word;
        // Extraction fails at the end of the file, which ends the loop
        // without processing the last word twice
        while (_fin >> word) {
            // Strip trailing characters that are not lowercase letters
            // (punctuation read together with the word)
            while (!word.empty() && (word.back() < 'a' || word.back() > 'z')) {
                word.pop_back();
            }
            // The trailing punctuation is gone; compare what remains with the keyword
            if (!word.empty() && word == _keyword) {
                _count++;
            }
        }
    }
    return _count;
}

4 Tests

4.1 Basic function test

4.2 Default filename test

4.3 Edge test

Resources

Size: 139KB
Resource download: https://download.csdn.net/download/s1t16/87472193