1. file: view file information
The file command identifies the type and encoding of a file.

```shell
# Syntax
file filename
# Output
filename: file type and encoding
```
- `-b` do not print filenames.
- `-i` print the file's MIME type (Multipurpose Internet Mail Extensions).
- `-F` change the separator between the file name and the file information. This helps avoid colon collisions when parsing the output.
- `-L` show information about the target of a soft link. A plain file command reports on the soft link file itself; with `-L`, it reports on the file the link points to.
- `-f` read filenames from a text file. When inspecting many files, save their names in a text file and run `file -f listfile`; file will report on each listed file in turn.
- `-z` look inside compressed files; for a .gz archive containing a single file, this reports on the inner file.
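As a quick sketch of the options above (assuming the file utility is installed; the file name is made up for illustration):

```shell
# Work in a throwaway directory
cd "$(mktemp -d)"
printf 'hello\n' > note.txt

file note.txt             # e.g. "note.txt: ASCII text"
file -b note.txt          # type only, filename suppressed
file -i note.txt          # MIME type, e.g. "text/plain; charset=us-ascii"
file -F ' => ' note.txt   # custom separator instead of the colon
```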
2. ln: create links
Soft link: the content of a soft link file is the path of another file; when you open it, the system resolves that path and opens the target file.
Hard link: a hard link is an additional name (directory entry) for the same inode, and the inode points to the data blocks holding the file content. Creating a hard link increments the reference count of that content. A modification made through any hard link is visible through all of them. Deleting one link does not affect the others; it only decrements the reference count of the data blocks, and the system frees the blocks only when the count reaches 0.
- Creating links

```shell
# Create a hard link
ln source_file hard_link_name
# View a file's inode number
ls -i filename
# Create a soft link
ln -s source_file soft_link_name
```

The first character of a soft link's permission string is l.
- Two limitations of hard links
- They cannot span filesystems.
- Ordinary users cannot create hard links to directories.
Linking to a directory can create cycles in the directory tree, which lead to infinite loops when traversing it. When traversing directories, the operating system can recognize soft links and stop following them; for hard links there is no comparable safeguard, so such loops cannot be prevented.
```shell
# View file information; the second column is the file's hard link count
ls -l filename
```
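A minimal sketch of the behaviour described above, using throwaway files in a temp directory (the file names are made up for illustration):

```shell
cd "$(mktemp -d)"
printf 'data\n' > a.txt

ln a.txt b.txt       # hard link: a.txt and b.txt share one inode
ls -i a.txt b.txt    # both lines show the same inode number
stat -c %h a.txt     # hard link count is now 2

ln -s a.txt c.txt    # soft link pointing at the name a.txt
rm a.txt             # removes one name; the data blocks survive via b.txt
cat b.txt            # still prints "data"
cat c.txt 2>/dev/null || echo 'dangling soft link'
```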
- `-n` treat a soft link to a folder like a soft link to a file
When creating a soft link over an existing soft link name: if the existing link points to a file, ln reports an error because the name already exists; if it points to a folder, the new link is created inside the target folder instead, which can produce looping links under it.
With the `-n` option, ln also reports an error when the existing name is a soft link to a folder.
When such an error is reported, the `-f` option can be used to forcibly repoint the soft link.
3. find: search for files
Directory search and file location are provided by the findutils package, which contains four useful commands: find, xargs, locate (quickly look up file names), and updatedb (update the file name database).
```shell
# Syntax
find [path...] -name [pattern]
```
- `-type` specify the type of object to search for:
  d folder; f regular file; l symbolic link; b block device;
  c character device; p named pipe; s socket.
- `-regex` match with a regular expression.
- `-user` match by owner; `-group` match by group.
- `-perm` match by permission bits.
- `-exec` run a command on each search result.
  `-exec` executes the given shell command once for every object find locates.
  In the command, `{}` stands for the matched result, and an escaped semicolon `\;` marks the end of the command; the escape is required.
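A small sketch of `-type`, `-name`, and `-exec` working together, using made-up files in a temp directory:

```shell
cd "$(mktemp -d)"
mkdir logs
touch logs/a.log logs/b.log notes.txt

# Run a command once per matched regular file;
# {} is replaced by the path, \; terminates the command
find . -type f -name '*.log' -exec echo found: {} \;
```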
- Search files by time
There are three kinds of timestamps: access time (a), modification time (m), and status change time (c).
An argument of `-n` means within the last n units; `+n` means more than n units ago.

```shell
# Files modified within the last n minutes
find -mmin -n
# Files modified more than n minutes ago
find -mmin +n
# Files whose status changed
find -cmin ±n
# Files that were accessed
find -amin ±n
# The same tests with n measured in days
find -mtime ±n
find -ctime ±n
find -atime ±n
```
Time relative to another file:

```shell
# Files whose modification/access/status-change time is newer than
# the modification time of the given file
find -newer file
find -anewer file
find -cnewer file
# Files whose X time is newer than the Y time of the given file
find -newerXY file
# e.g. files whose access time is newer than file's modification time
find -neweram file
# Files accessed more recently than '2022-12-01 10:00:00'
find -newerat '2022-12-01 10:00:00'
```
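A sketch of the time tests on throwaway files (the backdating uses GNU touch's `-d` option; file names are made up):

```shell
cd "$(mktemp -d)"
touch old.txt
touch -d '2 days ago' old.txt   # GNU touch: backdate the timestamp
touch new.txt

find . -type f -mtime -1        # modified within the last day: only new.txt
find . -type f -newer old.txt   # modified more recently than old.txt
```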
- Search by size

```shell
# Files larger than 40M in the directory and its subdirectories
find -size +40M
# Files smaller than 40M
find -size -40M
# Files exactly 40M
find -size 40M
```

Other units supported by `-size`:
b: 512-byte blocks; c: bytes; w: two-byte words;
k: KB; M: MB; G: GB.
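A quick sketch of the size tests (truncate is GNU coreutils; `-size` tests the file's apparent size, so sparse files work here):

```shell
cd "$(mktemp -d)"
truncate -s 1M big.file     # apparent size 1 MB
truncate -s 10k small.file  # apparent size 10 KB

find . -type f -size +100k   # only ./big.file
find . -type f -size -100k   # only ./small.file
```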
- `-maxdepth` limit the folder depth searched.
- Logical operations on find expressions
Every condition given to find is an expression, and expressions can be combined logically:
- `\( expr \)` parentheses raise the precedence of the enclosed expression; they must be escaped;
- `! expr` negation: matches files that do not satisfy expr;
- `expr1 expr2` implicit AND (the default);
- `expr1 -a expr2` explicit AND;
- `expr1 -o expr2` OR;
- `expr1, expr2` both expressions are evaluated, but the overall result is always that of expr2.
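The operators above can be combined; a minimal sketch on made-up files:

```shell
cd "$(mktemp -d)"
touch a.log b.txt c.md

# (.log OR .txt) AND NOT names starting with b
find . -type f \( -name '*.log' -o -name '*.txt' \) ! -name 'b*'
```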
4. Regular expressions in find
- Regular expression flavors
Regular expression syntax is not fully standardized. The find command accepts a `-regextype` option to choose the flavor used.
The available values are emacs, posix-awk, posix-basic, posix-egrep, and posix-extended.
find uses the emacs flavor by default.
In emacs syntax:
- `.`, `*`, `+`, `?`, `[0-9]`, `[a-z]`, `^`, `$` are all supported as usual;
- alternation requires the escaped form `\|`;
- grouping requires escaped parentheses `\( \)`, which themselves mark the group;
- interval (repetition-count) matching is not supported.
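A sketch of the emacs-flavor escapes (note that `-regex` matches against the whole path, not just the file name; file names are made up):

```shell
cd "$(mktemp -d)"
touch app1.log app2.log notes.txt

# Alternation must be escaped as \|
find . -regex '.*\.log\|.*\.txt'        # matches all three files
# Grouping must use escaped parentheses \( \)
find . -regex '.*app\([0-9]\)\.log'     # matches the two app*.log files
```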
5. du: disk usage
du reports the disk usage of files and folders, while df reports usage at the filesystem level.
- Common du options
- `-h`, `--human-readable` print sizes in human-readable units;
- `-s`, `--summarize` report only a total for each argument;
- `-c`, `--total` add a grand total of the displayed results;
- `-d`, `--max-depth` limit the depth of folder nesting; a value of 0 gives the same result as `-s`;
- `-a`, `--all` report files as well as folders;
- `--exclude=pattern` exclude entries matching the pattern.
- Sorting the output with sort

```shell
du -sh * | sort -hr
```

sort's `-h` option compares sizes with their units taken into account; the `-n` option compares only the numeric value and does not recognize units.
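A sketch of the pipeline above, on two made-up directories filled with real (non-sparse) data so du has something to count:

```shell
cd "$(mktemp -d)"
mkdir big small
dd if=/dev/zero of=big/f bs=1k count=2048 status=none   # ~2 MB of data
dd if=/dev/zero of=small/f bs=1k count=16 status=none   # ~16 KB of data

du -sh * | sort -hr   # largest first: big is listed before small
```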
- du's unit
du's default unit is KB, i.e. 1024 bytes. The unit is affected by the following settings, in turn:
- if a block size is set with the `--block-size` option, du uses that size as its unit;
- if one of the environment variables DU_BLOCK_SIZE, BLOCK_SIZE, or BLOCKSIZE is set, du uses that size as its unit;
- if the environment variable POSIXLY_CORRECT is set, du's unit is 512 bytes.
The precedence is `--block-size` > DU_BLOCK_SIZE > BLOCK_SIZE > BLOCKSIZE > POSIXLY_CORRECT > the default KB.
- Why du and ls differ
du shows disk usage; ls shows file size.
A disk is divided into fixed-size data blocks, usually 4KB each.
Most filesystems store at most one file's content in a given block; the unused remainder of a block cannot be used by other files, and a large file occupies multiple blocks.
As a result, the size reported by ls is generally smaller than the usage reported by du. The exception is sparse files, which contain holes that occupy no disk space, so ls can report more than du.
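Both cases can be seen directly (truncate is GNU coreutils; the exact du figures depend on the filesystem's block size):

```shell
cd "$(mktemp -d)"
printf 'abc' > tiny.txt
ls -l tiny.txt     # size: 3 bytes
du -h tiny.txt     # usage: a whole block, often 4.0K

truncate -s 100M sparse.img   # a sparse file, i.e. mostly holes
ls -lh sparse.img  # apparent size: 100M
du -h sparse.img   # disk usage: close to zero
```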
6. gzip compression
Compression can be used to improve data transmission efficiency, reduce transmission bandwidth, and manage backup data.
- Compressing files with gzip

```shell
# Compress (replaces each file with a .gz file)
gzip source_file1 source_file2
# Decompress
gzip -d compressed_file1 compressed_file2
# Keep the source file while compressing
gzip -c source_file > compressed_file.gz
```
- Packing with tar
tar packs multiple files and folders into a single archive.
gzip can only compress and decompress regular files; it does not handle folders or symbolic links, and it cannot bundle multiple files together. In practice it is therefore usually combined with the tar command.

```shell
# Pack and compress
tar -czvf archive.tar.gz file1 file2
# Unpack
tar -xzvf archive.tar.gz
# Extract only the specified file
tar -xzvf archive.tar.gz path/inside/archive
```
Common tar options:
- `-c` create an archive; `-x` extract one;
- `-z` compress or decompress with gzip;
- `-j` compress or decompress with bzip2;
- `-v` list the files as they are packed or unpacked;
- `-f` the archive file to operate on; it must come last among combined options.
Other options:
- `-t` list the contents of an archive without extracting it.
tar can detect the compression method from the archive itself, so the compression option can be omitted when extracting.
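A round trip with the options above (the `-C` flag, which extracts into a given directory, is an extra convenience not mentioned in the list; names are made up):

```shell
cd "$(mktemp -d)"
mkdir proj
printf 'hello\n' > proj/readme.txt

tar -czf proj.tar.gz proj        # pack and gzip-compress
tar -tzf proj.tar.gz             # list contents without extracting
mkdir extract
tar -xzf proj.tar.gz -C extract  # -C: extract into the given directory
cat extract/proj/readme.txt      # prints "hello"
```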
- Compression level versus speed
gzip offers nine compression levels, 1 through 9: higher levels compress more tightly but run more slowly. The default level is 6.

```shell
# Fastest, weakest compression
gzip -1 source_file
```
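The trade-off can be observed on a compressible test file (exact sizes vary, but level 9 should not produce a larger file than level 1 here):

```shell
cd "$(mktemp -d)"
seq 1 100000 > data.txt          # highly compressible input

gzip -1 -c data.txt > fast.gz    # level 1: fastest
gzip -9 -c data.txt > best.gz    # level 9: strongest

ls -l data.txt fast.gz best.gz   # best.gz is no larger than fast.gz
```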
7. bzip2 compression
bzip2 serves the same purpose as gzip but, compared with gzip, is more stable and compresses better.

```shell
# Compress
bzip2 source_file1 source_file2
# Decompress
bzip2 -d compressed_file1 compressed_file2
bunzip2 compressed_file1 compressed_file2
```

bzip2 and bunzip2 are the same program: it compresses by default, and decompresses when the name it was invoked with contains unzip or UNZIP.
8. zip compression
zip keeps the source files when compressing.

```shell
# Compress files and folders
zip -r archive.zip source_file_or_folder
# Unzip into a specified folder
unzip -d target_folder archive.zip
```

- Common options
- `unzip -v` view the files in an archive without extracting them;
- `unzip -t` verify the integrity of an archive;
- `zip -d` delete a file from an archive.

```shell
# Delete a file from an archive
zip archive.zip -d file
```
9. dd: copy raw data
dd reads the content of a device or file and copies it verbatim to a specified destination.
Using dd to read from /dev/null produces an empty file.

```shell
# Syntax
dd if=input_file_or_device of=output_file_or_device
# Back up disk vda
dd if=/dev/vda of=/app/vda.img
# Restore the backup to vdb
dd if=/app/vda.img of=/dev/vdb
```
- Compressing while backing up

```shell
# Back up
dd if=/dev/vda | gzip > /app/vda.img.gz
dd if=/dev/vda | bzip2 > /app/vda.img.bz2
# Restore
gzip -dc /app/vda.img.gz | dd of=/dev/sdb
bzip2 -dc /app/vda.img.bz2 | dd of=/dev/sdb
```

When dd is run without if=, it reads from standard input.
- Backing up a partition, memory, or removable media
The commands are the same as for backing up a whole disk:

```shell
# Partition, memory
dd if=/dev/vda2 of=/app/vda2.img
dd if=/dev/mem of=/app/mem.img
# Floppy disk, CD
dd if=/dev/fd0 of=/app/fd0.img
dd if=/dev/cdrom of=/app/cd.img
```
- Other options
- `bs=N` set the block size for each read and write; `ibs=` and `obs=` set the input and output block sizes separately;
- `count=N` copy a total of N blocks.

```shell
# Back up the disk's MBR
dd if=/dev/vda of=/app/vda_mbr.img count=1 bs=512
# Write the MBR back
dd if=/app/vda_mbr.img of=/dev/vda
```

The Master Boot Record (MBR) is the hard disk's master boot record; if it is damaged, the partition table is lost and the system cannot boot normally. The MBR occupies the 512 bytes of the disk's first sector.
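The effect of bs= and count= can be sketched safely on an ordinary file rather than a disk (status=none is a GNU dd option that suppresses the transfer statistics):

```shell
cd "$(mktemp -d)"
printf 'abcdefghijklmnopqrstuvwxyz' > src.bin

# Copy four 4-byte blocks, i.e. only the first 16 bytes
dd if=src.bin of=head.bin bs=4 count=4 status=none
cat head.bin   # prints "abcdefghijklmnop"
```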
- Testing a disk with /dev/null and /dev/zero
`/dev/null`: the null device; any data written to it is discarded.
`/dev/zero`: produces an endless stream of zero bytes; it is used to write null data to a device or file, typically to initialize it.

```shell
# Test write performance
dd if=/dev/zero bs=1024 count=1000000 of=/app/1GB.file
# Test read performance
dd if=/app/1GB.file bs=64k | dd of=/dev/null
```
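A scaled-down sketch of both special devices (4 KB instead of 1 GB, in a temp directory; status=none is GNU dd):

```shell
cd "$(mktemp -d)"
# Build a 4 KB zero-filled file from /dev/zero
dd if=/dev/zero of=zeros.bin bs=1k count=4 status=none
stat -c %s zeros.bin   # 4096

# Anything written to /dev/null simply disappears
dd if=zeros.bin of=/dev/null status=none
```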
- Wiping a disk with /dev/urandom
/dev/urandom generates random data; writing it to a disk completely overwrites the original contents.

```shell
dd if=/dev/urandom of=/dev/sda
```