Semantic Network and Knowledge Graph (2) Knowledge Representation (Predicate Logic & Production Rule Representation & Frame Representation & Semantic Web Representation & XML & RDF & OWL)

Knowledge representation: describe the knowledge in the human brain with computer symbols, and simulate the brain's reasoning process through operations on those symbols.
Core of the Semantic Web: RDF and OWL.
Knowledge representation methods:
• First-order predicate logic representation
• Production rule representation
• Frame representation
• Semantic Web representation
• Distributed knowledge representation

Predicate logic

Very rigorous! Built on mathematical logic, predicate logic is so far the most precise formal language for expressing human thinking and reasoning.
Common predicate logic:



A predicate is an assertion!

Predicate logic can express factual knowledge, such as the states, attributes, and concepts of things, as well as rule-like knowledge, such as causal relationships between things.
General steps for representing knowledge using predicate logic:
1. Define predicates and individuals, and determine the exact meaning of each predicate and individual.
2. Assign specific values to the variables in each predicate according to the thing or concept to be expressed.
3. According to the semantics of the knowledge to be expressed, use appropriate logical connectives to connect various predicates to form a predicate formula.
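
The three steps above can be sketched in a few lines of Python. This is only an illustration over a tiny finite domain: the individuals, the predicates `Student`, `Course`, and `Takes`, and all facts are invented for the example, and quantifiers are evaluated by brute force with `all` and `any` rather than by a real inference engine.

```python
# Toy domain: every name and fact below is invented for illustration.
individuals = {"alice", "bob", "logic101"}

# Step 1: define predicates and individuals, fixing what each one means.
student_facts = {"alice", "bob"}
course_facts = {"logic101"}
takes_facts = {("alice", "logic101"), ("bob", "logic101")}

def Student(x):
    return x in student_facts

def Course(x):
    return x in course_facts

def Takes(x, y):
    return (x, y) in takes_facts

def implies(p, q):
    """Material implication: p -> q."""
    return (not p) or q

# Step 2: assign specific individuals to the variables of a predicate.
fact = Takes("alice", "logic101")

# Step 3: connect predicates with connectives and quantifiers.
# forall x. Student(x) -> exists y. (Course(y) and Takes(x, y))
every_student_takes_a_course = all(
    implies(Student(x), any(Course(y) and Takes(x, y) for y in individuals))
    for x in individuals
)
```

Enumerating the domain only works because it is finite and small; it is meant to show how a formula is assembled, not how reasoning would be done at scale.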
Advantages:
Accuracy: represents knowledge precisely and supports exact reasoning.
Universality: has general-purpose logical calculi and inference rules.
Naturalness: a formal language system that is close to human natural language.
Modularity: each piece of knowledge is relatively independent, with no direct connection to the others, so knowledge is easy to add, delete, and modify.
Disadvantages:
Limited expressiveness: it can only represent deterministic knowledge, not uncertain, procedural, or heuristic knowledge.
Hard to manage: there is no organizing principle for the knowledge, so the knowledge base is difficult to manage.
Low efficiency: inference is carried out separately from the meaning of the knowledge, which often makes the reasoning process lengthy and reduces system efficiency.

Production rule representation

A production system describes the reasoning process for a problem as a sequence of rules, forming a model of how the problem is solved.
Each rule in the system is called a production.

Case:


Advantages:
Effectiveness: it can represent both deterministic and uncertain knowledge, and is well suited to expressing heuristic and procedural knowledge.
Naturalness: knowledge is expressed as "if ..., then ...", which is intuitive and natural.
Consistency: all rules share the same format, and the database is accessible to every rule, so rules can be processed uniformly.
Modularity: rules interact only through the database and cannot call one another, which makes knowledge easy to add, delete, and modify.
Disadvantages:
Low efficiency: solving a problem is a repeated "match, conflict resolution, execute" cycle, so execution is slow.
Limited representation: it cannot represent structural or hierarchical knowledge!
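
The "match, conflict resolution, execute" cycle can be sketched as a small forward-chaining loop. This is a minimal illustration, not a reference implementation: the rules, the numeric priorities, and the highest-priority-first strategy are all assumptions made for the example.

```python
# Each production: (name, priority, conditions, conclusion).
# All rules and priorities are made up for illustration.
RULES = [
    ("r1", 1, {"has feathers"}, "is a bird"),
    ("r2", 2, {"is a bird", "cannot fly", "swims"}, "is a penguin"),
    ("r3", 1, {"is a bird", "flies well"}, "is an albatross"),
]

def forward_chain(initial_facts, rules):
    facts = set(initial_facts)  # the database shared by all rules
    fired = []                  # names of rules in the order they were executed
    while True:
        # Match: rules whose conditions all hold and whose conclusion is new.
        conflict_set = [r for r in rules if r[2] <= facts and r[3] not in facts]
        if not conflict_set:
            return facts, fired
        # Conflict resolution: highest priority wins (first rule wins ties).
        name, _, _, conclusion = max(conflict_set, key=lambda r: r[1])
        # Execute: write the conclusion back to the database.
        facts.add(conclusion)
        fired.append(name)

facts, fired = forward_chain({"has feathers", "cannot fly", "swims"}, RULES)
```

Note that the rules never call each other: `r2` can fire only because `r1` wrote "is a bird" into the shared database, which is the modularity property described above.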

Frame representation

Frame representation is a structured knowledge representation method developed from frame theory, and it is suitable for expressing many types of knowledge. Frame theory holds that people's understanding of things in the real world is stored in memory in frame-like structures. When facing something new, they retrieve a suitable frame from memory and modify and supplement its details according to the actual situation, thereby forming an understanding of the thing at hand.

There are two types of frames:
A class frame is used to describe a concept or a class of objects.
An instance frame is used to describe a specific object.

Frame hierarchy:
Subclass -subclass of-> parent class: the inclusion relationship between class frames.
Instance -instance of-> class: the membership relationship between an instance frame and a class frame.

Lower-level frames can inherit certain properties and values from upper-level frames. The rest of this article does not distinguish between the two relationships and refers to both collectively as “generic” relationships.
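
Slot inheritance along the frame hierarchy can be sketched as follows. The `Frame` class, its `get` method, and the example frames are hypothetical and exist only to illustrate the idea; as in the text above, a single `parent` link stands for both "subclass of" and "instance of".

```python
class Frame:
    """A frame with named slots and an optional upper-level frame."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent      # target of "subclass of" or "instance of"
        self.slots = dict(slots)  # slot name -> value defined on this frame

    def get(self, slot):
        # Look in this frame first, then climb toward the upper-level frames.
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(slot)

# Class frames (invented example data).
person = Frame("Person", legs=2, occupation="unknown")
teacher = Frame("Teacher", parent=person, occupation="teaching")
# Instance frame.
li = Frame("Li", parent=teacher, subject="logic")
```

Here `li.get("legs")` is inherited from `Person`, `li.get("occupation")` comes from `Teacher`, which overrides the default, and `li.get("subject")` is defined on the instance itself.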

Advantages:
Structure: a hierarchical, nested structure that can represent both the internal structure of knowledge and the connections between pieces of knowledge.
Inheritance: a lower-level frame can inherit certain attributes or values from an upper-level frame and can also supplement or modify them, which reduces redundancy and saves storage space.
Naturalness: frame theory matches the way human cognition works.
Modularity: each frame is a relatively independent data structure, which makes knowledge easy to add, delete, and modify.
Disadvantages:
Cannot represent procedural knowledge.
Lacks a clear reasoning mechanism.

Semantic Web Representation

The Semantic Web provides a set of representation languages and tools for describing data, used to formally describe the concepts, terms, and relationships of a knowledge domain.

OWL has more powerful semantic expressiveness than XML and RDF.

XML

XML (eXtensible Markup Language) is the earliest Semantic Web representation language. It drops HTML's ability to describe display style and layout, and instead emphasizes describing the semantics and element structure of data.
It is used to store and transmit data, focusing on how to describe information in a structured manner.
XML elements represent the “things” described by the XML document, such as books, authors, and publishers.
An element consists of a start tag, element content, and an end tag.

<author>Thomas B. Passin</author>

Users can choose tag names as they wish, with very few restrictions. Elements have a nested structure, and there is no constraint on the depth of nesting.

<author>
  <name>Thomas B. Passin</name>
  <gender>Male</gender>
  <phone>+61-7-3875 507</phone>
</author>

As in HTML, XML elements can also have attributes, that is, name-value pairs attached to an element, which can express the same semantics as child elements.

<author name="Thomas B. Passin"
        gender="Male"
        phone="+61-7-3875 507"/>

Attributes can also be combined with child elements, but attributes themselves cannot be nested.

<author name="Thomas B. Passin" gender="Male">
  <phone>+61-7-3875 507</phone>
</author>

Attributes are unique: an attribute name can appear only once on an element!
A child element, by contrast, can be repeated to hold multiple values.
XLink can be used to describe relationships.
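
The difference between attributes and child elements can be checked with Python's standard `xml.etree.ElementTree` parser. This is a small sketch; the second phone number in the document string is invented to show a repeated child element.

```python
import xml.etree.ElementTree as ET

# The second <phone> is a made-up value, added only to show repetition.
doc = """
<author name="Thomas B. Passin" gender="Male">
  <phone>+61-7-3875 507</phone>
  <phone>+61-7-0000 000</phone>
</author>
"""

author = ET.fromstring(doc.strip())

# Attributes are name-value pairs on the element itself.
name = author.get("name")

# A child element may occur several times, so it can carry multiple values.
phones = [p.text for p in author.findall("phone")]

# Repeating an attribute name is not well-formed XML, so parsing fails.
try:
    ET.fromstring('<author name="A" name="B"/>')
    duplicate_attribute_ok = True
except ET.ParseError:
    duplicate_attribute_ok = False
```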

XML advantages:
Structured: data is represented in a structured way that separates its content from its presentation.
Extensible: users can create and use their own tags, define specialized markup languages for an industry or field, and share and exchange data.
Portable: with a document type declaration included, the data can be extracted, analyzed, and processed by any XML parser, and is easy to use across platforms.
XML disadvantages:
XML is a meta-markup language that any organization or individual can use to define new tags and standards, which easily leads to conflicts and confusion.
When an XML document is used as a data collection it plays the role of a database, but it lacks the full functionality of a database management system.
Data is stored in a tree structure, which makes insertion and modification difficult.

RDF

RDF (Resource Description Framework) is a resource description language that draws on a variety of existing metadata standards to describe all kinds of network resources, producing documents that are readable by both humans and machines and can be processed automatically by machines.
The core idea of RDF: use Web identifiers (URIs) to identify things, and describe the properties of resources, or the relationships between resources, through specified properties and their values.
Data model: statements are triples of the form (subject, predicate, object), which together form a directed, labeled graph.
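
As a rough sketch of this data model, an RDF graph can be imitated with plain Python tuples. No RDF library is used here; the `http://example.org/` URIs, the property names, and the helper `objects` are all hypothetical and chosen only for illustration.

```python
# Hypothetical namespace; none of these URIs come from the original text.
EX = "http://example.org/"

# Each statement is a (subject, predicate, object) triple.
triples = {
    (EX + "book1", EX + "author", EX + "passin"),
    (EX + "book1", EX + "title", "An Example Book"),
    (EX + "passin", EX + "name", "Thomas B. Passin"),
}

def objects(subject, predicate):
    """Return every value of `predicate` for `subject`."""
    return {o for s, p, o in triples if s == subject and p == predicate}

# Follow two edges of the graph: book1 -author-> passin -name-> literal.
author_names = {
    n
    for a in objects(EX + "book1", EX + "author")
    for n in objects(a, EX + "name")
}
```

Because the object of one triple (`passin`) is the subject of another, the set of triples behaves like a graph that can be traversed edge by edge.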

Case:

RDF Schema

RDFS is an extension of RDF. It provides a set of modeling primitives on top of RDF to describe classes, properties, and the relationships between them.
Class, subClassOf: describe the class hierarchy.
Property, subPropertyOf: describe the property hierarchy.
domain, range: declare the class of resources a property applies to and the class of its values.
type: declares that a resource is an instance of a class.
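
A small schema built from these primitives might look like the sketch below. It again uses plain tuples rather than an RDF library, and the `ex:` names are invented. The helper functions only walk the declared `rdfs:subClassOf` links; they are not a complete RDFS implementation.

```python
# Schema and data as (subject, predicate, object) triples; "ex:" names are invented.
triples = {
    ("ex:Teacher", "rdfs:subClassOf", "ex:Person"),
    ("ex:Professor", "rdfs:subClassOf", "ex:Teacher"),
    ("ex:teaches", "rdfs:domain", "ex:Teacher"),
    ("ex:teaches", "rdfs:range", "ex:Course"),
    ("ex:li", "rdf:type", "ex:Professor"),
}

def superclasses(cls):
    """All classes reachable from `cls` by following rdfs:subClassOf."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                todo.append(o)
    return found

def classes_of(resource):
    """Declared types of `resource` plus all of their superclasses."""
    direct = {o for s, p, o in triples if s == resource and p == "rdf:type"}
    return direct.union(*(superclasses(c) for c in direct))
```

With this data, `classes_of("ex:li")` contains `ex:Professor`, `ex:Teacher`, and `ex:Person`, which shows how `type` and `subClassOf` combine to place an instance in the class hierarchy.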

Case:


Characteristics of RDF(S)
Advantages:
Simple: resources are described as triples, which is simple and easy to work with.
Easy to extend: descriptions are separate from vocabularies, giving good extensibility.
Inclusive: users can define their own vocabularies and seamlessly combine several vocabularies to describe resources.
Easy to integrate: RDF treats everything as a resource, so things are easy to describe in an integrated way.
Disadvantages:
Semantics cannot be described precisely: one concept can be expressed by several terms, and one term can have several meanings (concepts).
Limited reasoning: RDF(S) lacks a rich reasoning model, so its reasoning ability is very limited.

An ontology fixes the precise meaning of concepts through strict definitions of the concepts and of the relationships between them, and represents commonly accepted, shareable knowledge.
In an ontology, author, creator, and writer are the same concept, while doctor denotes two different concepts in universities and in hospitals. The ontology therefore plays a very important role in the Semantic Web: it is the basis for sharing and exchanging Web information at the semantic level.

Studer's definition: an ontology is an explicit, formal specification of a shared conceptualization.
Features:
Conceptualization: an ontology is a model obtained by abstracting concepts of the objective world; its meaning is independent of any specific environmental state.
Explicit: the concepts an ontology uses, and the constraints on their use, are clearly defined and unambiguous.
Formal: an ontology is machine-processable rather than written in natural language.
Shared: an ontology captures commonly accepted knowledge and reflects the set of concepts recognized in a field. It is aimed at groups rather than individuals.

OWL

OWL (Web Ontology Language) is the language recommended for representing ontologies on the Semantic Web. As an extension of RDF(S), it provides more primitives in order to support richer semantics and to support reasoning.
Three sub-languages of OWL:
Lite: provides a classification hierarchy and simple property constraints.
DL: provides a reasoning system that guarantees computational completeness and decidability.
Full: allows completely free use of RDF syntax, but offers no computational guarantees.
Expressive power: OWL Lite < OWL DL < OWL Full

Summary:

Distributed knowledge representation

Core idea: map samples (symbolic entities and relationships) into a low-dimensional dense space through a transformation, and use low-dimensional vectors to represent the original samples. This simplifies computation while preserving the original graph structure as far as possible.
Key point: mapping!
Representation learning: automatically learn useful features from data and use them in subsequent tasks!
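
To make the idea of mapping entities and relations to vectors concrete, here is a toy sketch. The translation-style score (head + relation should land near tail) is one common choice and is an assumption here, since the text above does not name a specific model. The 2-D vectors are hand-picked for illustration; a real system would learn them from data.

```python
import math

# Hand-picked 2-D vectors for illustration; a real model would learn them.
entity = {
    "Beijing": (1.0, 0.0),
    "China": (1.0, 1.0),
    "Paris": (3.0, 0.0),
    "France": (3.0, 1.0),
}
relation = {"capital_of": (0.0, 1.0)}

def score(head, rel, tail):
    """Negative distance between head + rel and tail; closer to 0 is more plausible."""
    h, r, t = entity[head], relation[rel], entity[tail]
    translated = [a + b for a, b in zip(h, r)]
    return -math.dist(translated, t)

plausible = score("Beijing", "capital_of", "China")
implausible = score("Beijing", "capital_of", "France")
```

With these vectors the true triple scores higher than the false one, so a symbolic question (is this triple in the graph?) becomes a cheap numeric comparison.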

Statistical language model
– Treats language (a sequence of words) as a random event and assigns it a probability that describes how likely it is to belong to a given language.
– Given a vocabulary V, for a sequence S = ⟨w1, …, wT⟩ ∈ V^T made up of words in V, the statistical language model assigns a probability P(S) that measures the confidence that S conforms to the syntactic and semantic rules of natural language.
– The higher the probability assigned to a sentence, the more natural the sentence is, that is, the more it sounds like something a person would say.
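
One simple way to assign P(S) is a bigram model: by the chain rule, P(S) is a product of conditional word probabilities, and the bigram approximation conditions each word only on the previous one. The sketch below uses maximum-likelihood counts on a tiny invented corpus, with no smoothing; the choice of a bigram model and the `<s>` and `</s>` boundary markers are assumptions made for this example.

```python
from collections import Counter

# Tiny invented corpus; <s> and </s> mark sentence boundaries.
corpus = [
    ["<s>", "the", "cat", "sat", "</s>"],
    ["<s>", "the", "dog", "sat", "</s>"],
    ["<s>", "the", "cat", "ran", "</s>"],
]

context_counts = Counter()  # how often each word appears as a left context
bigram_counts = Counter()   # how often each (previous, word) pair appears
for sent in corpus:
    for prev, word in zip(sent, sent[1:]):
        context_counts[prev] += 1
        bigram_counts[(prev, word)] += 1

def sentence_probability(words):
    """P(S) under a bigram model with maximum-likelihood estimates."""
    sent = ["<s>"] + words + ["</s>"]
    prob = 1.0
    for prev, word in zip(sent, sent[1:]):
        if context_counts[prev] == 0:
            return 0.0  # unseen context: no estimate without smoothing
        prob *= bigram_counts[(prev, word)] / context_counts[prev]
    return prob

p_natural = sentence_probability(["the", "cat", "sat"])
p_scrambled = sentence_probability(["sat", "the", "cat"])
```

The word order seen in the corpus receives a positive probability, while the scrambled order receives zero, which matches the intuition that more natural sentences score higher.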



Core: Words with similar contexts have similar semantics.

Continuous bag-of-words (CBOW) model:
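
CBOW predicts a center word from the words around it. The sketch below only shows how the (context, target) training pairs are built from a sentence; the neural network that consumes them is omitted. The function name `cbow_pairs` and the `window` parameter are hypothetical.

```python
def cbow_pairs(tokens, window=2):
    """Yield (context_words, target_word) pairs for a CBOW-style model."""
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]    # up to `window` words before
        right = tokens[i + 1:i + 1 + window]   # up to `window` words after
        context = left + right
        if context:
            yield context, target

pairs = list(cbow_pairs(["the", "cat", "sat", "down"], window=1))
```

Each target word is paired with its neighbours, so words that occur in similar contexts end up being trained toward similar vectors.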






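A is the adjacency matrix.
A + I adds the identity matrix (self-loops), so that each node's own information is counted as well.
D is the degree matrix (diagonal).
DAD, that is, scaling A on both sides by the inverse square root of D, is equivalent to normalizing A.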
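
Assuming the symbols above refer to the symmetric normalization commonly used in graph convolutional networks, the normalized matrix D^(-1/2) (A + I) D^(-1/2) can be computed as in the sketch below. Taking D as the degree matrix of A + I rather than of A is part of that assumption, and the small path graph is a made-up example.

```python
import math

def normalized_adjacency(A):
    """Return D^(-1/2) (A + I) D^(-1/2) for a square adjacency matrix A."""
    n = len(A)
    # A + I: add self-loops so each node keeps its own information.
    A_hat = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    # Diagonal entries of the degree matrix of A + I.
    degree = [sum(row) for row in A_hat]
    inv_sqrt = [1 / math.sqrt(d) for d in degree]
    # Scale entry (i, j) by 1/sqrt(d_i) and 1/sqrt(d_j).
    return [[inv_sqrt[i] * A_hat[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]

# Path graph 0 - 1 - 2 (made-up example).
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
N = normalized_adjacency(A)
```

The self-loops guarantee that every degree is at least 1 for a non-negative A, so the inverse square root is always defined, and the result stays symmetric when A is symmetric.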