1071 Speech Patterns

People often have a preference among synonyms of the same word. For example, some may prefer “the police”, while others may prefer “the cops”. Analyzing such patterns can help to narrow down a speaker’s identity, which is useful when validating, for example, whether it’s still the same person behind an online avatar.

Now given a paragraph of text sampled from someone’s speech, can you find the person’s most commonly used word?

Input Specification:

Each input file contains one test case. For each case, there is one line of text no more than 1048576 characters in length, terminated by a carriage return \n. The input contains at least one alphanumerical character, i.e., one character from the set [0-9 A-Z a-z].

Output Specification:

For each test case, print in one line the most commonly occurring word in the input text, followed by a space and the number of times it has occurred in the input. If there is more than one such word, print the lexicographically smallest one. The word should be printed in all lower case. Here a “word” is defined as a continuous sequence of alphanumeric characters separated by non-alphanumeric characters or the line beginning/end.

Note that words are case insensitive.

Sample Input:

Can1: "Can a can can a can? It can!"

Sample Output:

can 5

People often have a preference for synonyms of the same word. For example, some people may prefer “the police” while others may prefer “the cops”. Analyzing such patterns can help narrow down the identity of the speaker, which is useful when verifying, for example, whether the same person is still behind an online profile picture. Now given a piece of text extracted from someone’s speech, can you find the word that person uses most frequently?

Input specification: Each input file contains one test case. In each case, the length of a line of text does not exceed 1048576 characters, terminated by a carriage return \n. The input contains at least one alphanumeric character, that is, a character from the set [0-9 A-Z a-z].

Output specification: For each test case, print on one line the most frequently occurring word in the input text, followed by a space and the number of times it occurs in the input. If there are multiple such words, print the one with the smallest lexicographical order. The word should be in all lowercase. A “word” here is defined as a sequence of consecutive alphanumeric characters separated by non-alphanumeric characters or line beginnings/ends. Note that words are not case sensitive.

Approach:

A string is also an array of characters, so we can walk through it by index, handling one character at a time to extract words, reset the buffer, and continue the traversal.

Should the first token ("Can1" in the sample) be skipped?

No, read it like any other token. Because it contains a digit, it becomes the separate word "can1", which occurs only once, so it will not be output anyway.

How do we extract each word?

Any character outside the set [0-9 A-Z a-z], such as a space, acts as a split point. When such a character appears, increment the count of the word collected so far, clear the buffer, and continue traversing.

In other words, a non-alphanumeric character marks the end of a word, and that is where the text is split.
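The splitting rule described above can be sketched as a small standalone function. This is only an illustration under stated assumptions: the name `splitWords` is hypothetical and does not appear in the solution below, and it relies on `std::isalnum` treating exactly [0-9 A-Z a-z] as word characters, which holds in the default "C" locale.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original solution): split `text`
// into lowercase words, using every non-alphanumeric character as a delimiter.
std::vector<std::string> splitWords(const std::string& text) {
    std::vector<std::string> words;
    std::string current;  // the word collected so far
    for (unsigned char ch : text) {
        if (std::isalnum(ch)) {
            // Word character: append its lowercase form to the buffer.
            current += static_cast<char>(std::tolower(ch));
        } else if (!current.empty()) {
            // Delimiter after a non-empty buffer: the word has ended.
            words.push_back(current);
            current.clear();
        }
    }
    // The line may end in the middle of a word, so flush the buffer once more.
    if (!current.empty()) words.push_back(current);
    return words;
}
```

For the sample input this yields nine words, the first of which is "can1" rather than "can", which is why the leading token does not need special treatment.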

//struct _data
//{
//    string word;
//    int cnt = 0;
//};
//
//set<_data> s;
//No need for a set of structs: the two fields can be stored directly with a hash (word -> count).
//Use a hash?
//Yes, that works.
//The general idea: use each word as the key of the hash table and store its number of occurrences as the value.
//Uppercase and lowercase letters count as the same word, so convert every letter to lowercase before using the word as a key.
//The idea is correct.
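As a sketch of the hashing idea in these notes, the version below counts words with `std::unordered_map`. It is not the solution used in this post, which relies on the ordered `std::map`; the function name `mostCommonWord` and the trailing sentinel space are assumptions made for illustration. Because a hash table has no defined iteration order, the tie-break on the lexicographically smallest word has to be written out explicitly.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper: returns {word, count} for the most frequent word,
// choosing the lexicographically smallest word when counts are tied.
std::pair<std::string, int> mostCommonWord(const std::string& text) {
    std::unordered_map<std::string, int> counts;  // word -> occurrences
    std::string current;
    // A sentinel space is appended so the last word is flushed like any other.
    for (unsigned char ch : text + ' ') {
        if (std::isalnum(ch)) {
            current += static_cast<char>(std::tolower(ch));
        } else if (!current.empty()) {
            ++counts[current];
            current.clear();
        }
    }
    assert(!counts.empty());  // the problem guarantees at least one word

    std::pair<std::string, int> best{"", 0};
    for (const auto& entry : counts) {
        // Iteration order is unspecified here, so ties are broken by hand.
        if (entry.second > best.second ||
            (entry.second == best.second && entry.first < best.first)) {
            best = {entry.first, entry.second};
        }
    }
    return best;
}
```

With `std::map` the explicit tie-break is unnecessary, because the keys are visited in ascending order and a strict `>` comparison keeps the first (smallest) word that reaches the maximum.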

problem solving process

#include <cctype>
#include <iostream>
#include <string>
#include <map>
#include <vector>
#include <queue>
#include <algorithm>
using namespace std;

map<string, int> mp; // word -> number of occurrences

// Returns true if s is a digit or a letter.
bool isright(char s)
{
    // Note: the digits 0-9 are word characters too!
    if ((s >= '0' && s <= '9') || (s >= 'a' && s <= 'z') || (s >= 'A' && s <= 'Z'))
        return true;
    return false;
}

int main()
{
    string str1, t;
    getline(cin, str1);
    for (int i = 0; i < str1.size(); i++)
    {
        if (isright(str1[i]))
        {
            // I first wrote islower here instead of tolower:
            // islower only tests whether a character is lowercase,
            // tolower converts it to lowercase.
            str1[i] = tolower(str1[i]);
            t += str1[i];
        }

        // A word ends at a non-alphanumeric character or at the last character of the line.
        if (!isright(str1[i]) || i == str1.size() - 1)
        {
            // Do not skip this check: without it the empty string would be counted
            // whenever two separators are adjacent, and the result would be wrong.
            if (t.size() != 0)
                mp[t]++;
            t = "";
        }
    }

    int max = 0;
    // map iterates its keys in ascending order, so with a strict > the
    // lexicographically smallest word wins a tie.
    for (auto it = mp.begin(); it != mp.end(); it++)
    {
        if (it->second > max)
        {
            t = it->first;
            max = it->second;
        }
    }
    cout << t << " " << max;
    return 0;
}

analyze

#include <stdio.h>
#include <cctype>
#include <map>
#include <iostream>
#include <string>

using namespace std;
bool isright(char s) {
    if ((s >= 'a' && s <= 'z') || (s >= 'A' && s <= 'Z') || (s >= '0' && s <= '9'))
        return true;
    return false;
}
int main() {
    string str1, t;
    map<string, int> mp;
    getline(cin, str1);
    //The input string is effectively an array: every character, including each space, occupies one position.
    for (int i = 0; i < str1.size(); i++) {
        if (isright(str1[i]) == true) {
            str1[i] = tolower(str1[i]);
            t += str1[i];
            //+= on a string appends the character to the end of t.
            //It does not add the two character codes together to produce a new letter:
            //for strings, + means concatenation, not arithmetic addition.
        }
        if (!isright(str1[i]) || i == str1.size() - 1) { //A separator, or the last character of the line
            if (t.size() != 0) mp[t]++;
            t = "";
            //Clear t right after each word is recorded, so the text is processed word by word
            //while the map accumulates the count of each word.
        }
    }
    int Max = 0;
    for (auto it = mp.begin(); it != mp.end(); it++)
    {
        if (it->second > Max) {
            t = it->first;
            Max = it->second;
        }
    }
    cout << t << " " << Max;
    return 0;
}