Differential incremental upgrades with bsdiff and bspatch (on Windows and Linux)

Table of Contents

background

content

Preparation

on windows platform

on linux platform

Formal work

Ideas for generating differential files

The idea of using differential files

Perform differential incremental upgrade while maintaining the same directory structure

Server (generate differential file)

Client (apply differential file)

Updated on December 6th (bug fixed)

Execution analysis

Notes and areas of concern:


background

This method is commonly used for application, game, and system updates on Android and Linux platforms.

As I understand it, the method is somewhat like version management in git: only the differences between versions are recorded, so an update costs as little time and data as possible. The core lies in how to record file differences.

Server:

Run

 bsdiff old new patchfile_path 

to generate a difference file, which is usually given a .patch extension.

Client: apply the patch file to the old file with

bspatch oldfile newfile patchfile_path
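
As a minimal sketch of the round trip described above, the following creates two small files, generates a patch, applies it, and checks that the rebuilt file matches the new one. It assumes bsdiff and bspatch are installed and on PATH; the function name and file names are hypothetical, and the function prints "skipped" if the tools are missing.

```shell
# Round trip: old + patch should reproduce new byte for byte.
roundtrip_demo() (
  work=$(mktemp -d) || exit 1
  printf 'version 1 of the firmware\n' > "$work/old.bin"
  printf 'version 2 of the firmware\n' > "$work/new.bin"

  # Assumption: both tools are on PATH; otherwise do nothing.
  if ! command -v bsdiff >/dev/null 2>&1 || ! command -v bspatch >/dev/null 2>&1; then
    echo "skipped"
    rm -rf "$work"
    exit 0
  fi

  if bsdiff  "$work/old.bin" "$work/new.bin"     "$work/update.patch" &&
     bspatch "$work/old.bin" "$work/rebuilt.bin" "$work/update.patch" &&
     cmp -s  "$work/new.bin" "$work/rebuilt.bin"; then
    echo "ok"
  else
    echo "mismatch"
  fi
  rm -rf "$work"
)

roundtrip_demo
```

Note the argument order: bsdiff takes old, new, patch, while bspatch takes old, new (the output), patch.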

At first I thought the upgrade could be done by diffing the compressed packages directly. .apk files on the Android platform are acceptable, as are executable images such as .hex files for microcontrollers. However, diffing compressed packages has hidden dangers. Through discussions at meetings and my own research, I found that differences in the compression algorithm and in the order of the files inside the archive may cause problems with the differential package.

The main reasons are:

1. Different compression algorithms produce different compressed data. Even if the original data is the same, the results from different algorithms are not identical. This directly affects the comparison results of bsdiff.

2. Even if the same compression algorithm is used, changing the order of the original data inside the archive may change the compressed output. Compression algorithms exploit repeating patterns, and changing the order disrupts these patterns.

3. bsdiff diffs the byte streams it is given. Therefore, even if the original data is the same, a change in its order inside the archive causes bsdiff to generate a different patch.

4. Compression algorithms rely on dictionaries and data order to improve the compression ratio, so a small change in the input can alter much of the compressed output. This works against bsdiff, which produces small patches only when the old and new files share long runs of bytes.

In summary, to generate consistent bsdiff patches, the same algorithm and a stable file order must be used when building compressed packages of the same data. Otherwise the differential results may vary significantly. When compressed data needs to be diffed, either control these two factors or diff the original files after decompression.
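
The effect of file order can be checked with a small experiment. The sketch below packs the same two files into two gzip-compressed tar archives, once in each order, and compares the results. The function name and file names are made up for illustration; it assumes tar, gzip, and cmp are available.

```shell
# Same members, opposite order: are the compressed archives identical?
archive_order_demo() (
  work=$(mktemp -d) || exit 1
  printf 'alpha\n' > "$work/a.txt"
  printf 'beta\n'  > "$work/b.txt"

  # gzip -n omits the file name and timestamp from the gzip header,
  # so only the member order differs between the two archives.
  tar -C "$work" -cf - a.txt b.txt | gzip -n > "$work/ab.tar.gz"
  tar -C "$work" -cf - b.txt a.txt | gzip -n > "$work/ba.tar.gz"

  if cmp -s "$work/ab.tar.gz" "$work/ba.tar.gz"; then
    echo "identical"
  else
    echo "different"
  fi
  rm -rf "$work"
)

archive_order_demo
```

The two archives contain exactly the same files, yet their bytes differ, which is why a patch built from one packaging run may not match an archive built by another.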

Therefore, consider diffing the decompressed files while keeping the directory structure, so that the generated .patch files sit in the same directory layout as the original project.

This requires a script that generates a differential folder (directory) with the same directory structure as the original project.

The upgrade is then performed from this differential folder: each patch file is applied to its original file to produce the new file, and the new target has the same layout as the original target. In this way, the differential package is generated on the server and applied on the client. The differential package can be compressed on the server and decompressed on the client, which is faster and more reasonable.

In total, two bash scripts are needed: one runs on the server to generate the differential package, and the other runs on the client to perform the local upgrade after receiving the differential package.

content

As of October 27, 2023, the source code cannot be downloaded from the official bsdiff and bspatch website (access is denied), so you have to look for the source code elsewhere.

Preparation

on windows platform

See:

whistle713/bsdiff-win: bsdiff Windows binaries and Visual Studio 2015/2019 project. (github.com)

It provides .exe executables that run on the Windows platform.

on linux platform

See:

Red Orange Darren video notes bsdiff bspatch use (under Linux)_Li Bing’s blog-CSDN blog

Follow it to complete the compilation.

Formal work

Several special cases for the old and new targets need to be taken into account here.

  1. The new target has new files.
  2. The new target has deleted some of the old files.
  3. The directories and files of the new target and the old target all match, but some contents have changed.
  4. The old target or the new target has files with a size of 0 bytes (bsdiff fails).

I believe a typical upgrade will encounter all of cases 1 to 4.

Regarding the fourth case, I don’t know whether it is a problem with the bsdiff version or with the Linux system. I don’t have this problem on my local Linux machine.

On the affected Linux system, bsdiff reports an error when processing a file with a size of 0 bytes:

bsdiff:mmap() xxx:Invalid argument
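
One possible way to sidestep this error is to check for empty files before calling bsdiff. The sketch below is an assumption, not part of the scripts in this article: `can_bsdiff` and `make_patch_or_copy` are hypothetical helpers, and shipping the whole new file as `.full` would also need matching support in the client script.

```shell
# True only when both files exist and are non-empty.
can_bsdiff() {
  [ -s "$1" ] && [ -s "$2" ]
}

# Hypothetical wrapper: $1 = old file, $2 = new file, $3 = output path prefix.
# Writes "$3.patch" when bsdiff can run, otherwise copies the new file to "$3.full".
make_patch_or_copy() (
  old=$1
  new=$2
  out=$3
  if can_bsdiff "$old" "$new"; then
    bsdiff "$old" "$new" "$out.patch"
  else
    cp "$new" "$out.full"
  fi
)
```

The `-s` test is false for missing files as well as for 0-byte files, so both situations take the fallback branch.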

Ideas:

For the first and second cases:

The new target adds a file: generate a file with the same name in the old target, with a size of 0 bytes.

The new target deletes an old file: generate a file with the same name in the new target, also with a size of 0 bytes.

With these placeholders, as long as problem 4 does not occur, a patch file can be generated for every file through bsdiff.
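
To see which files fall under cases 1 and 2 before creating any placeholders, the two trees can be compared by relative path. This is a sketch only: `list_rel` and `tree_changes` are hypothetical helper names, and paths containing newlines are not handled.

```shell
# Print every regular file and symlink under $1 as a sorted relative path.
list_rel() (
  cd "$1" || exit 1
  find . \( -type f -o -type l \) | LC_ALL=C sort
)

# Report paths that exist on only one side: $1 = old tree, $2 = new tree.
tree_changes() (
  tmp=$(mktemp -d) || exit 1
  list_rel "$1" > "$tmp/old.list"
  list_rel "$2" > "$tmp/new.list"
  # comm -23: lines only in the first list; comm -13: lines only in the second.
  LC_ALL=C comm -23 "$tmp/old.list" "$tmp/new.list" | sed 's/^/removed /'
  LC_ALL=C comm -13 "$tmp/old.list" "$tmp/new.list" | sed 's/^/added /'
  rm -rf "$tmp"
)
```

Calling `tree_changes ./old ./new` prints one "removed" line per file that needs a placeholder in the new target and one "added" line per file that needs a placeholder in the old target.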

Ideas for generating differential files

1. Synchronize the old target (add placeholders for files that were added in the new target)

2. Synchronize the new target (add placeholders for files that the new target deleted)

3. Recursively traverse each file in one target, look up the corresponding file in the other target, and generate a differential file directly through bsdiff.

Even for two identical files a patch file is generated, and applying that patch with bspatch simply reproduces the same file. This is convenient because no comparison is needed, and every file has a corresponding difference file. (My code would need to be modified to work this way.)

I did not do this below. Instead, I compare the md5 values of the two files and generate a patch file only when they differ.
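
As an alternative sketch to comparing md5 values, `cmp -s` compares two files byte by byte and reports the result only through its exit status, which avoids computing and parsing checksums. The helper name `patch_needed` is hypothetical.

```shell
# Print "skip" when the two files are byte-identical, "patch" otherwise.
patch_needed() {
  if cmp -s "$1" "$2"; then
    echo "skip"
  else
    echo "patch"
  fi
}
```

A missing file also yields "patch", because cmp exits with a non-zero status when it cannot read one of its inputs.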

The idea of using differential files

Traverse the generated differential directory and call bspatch on each patch file.
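
Because the differential directory mirrors the project layout, the file that a patch belongs to can be derived from the patch path alone. A minimal sketch of that mapping, using shell parameter expansion (the function name and the example paths are hypothetical):

```shell
# Map a patch path back to the file it upgrades.
#   $1 = patch directory, $2 = target directory, $3 = full path of one .patch file
patch_to_target() (
  rel=${3#"$1"/}        # drop the patch-directory prefix
  rel=${rel%.patch}     # drop the .patch suffix
  printf '%s\n' "$2/$rel"
)

patch_to_target ./patch_old ./old ./patch_old/lib/libfoo.so.patch
# prints ./old/lib/libfoo.so
```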

Perform differential incremental upgrade while maintaining the same directory structure

Server (Generate Difference File)

Invocation:

 ./gen.sh ./old ./new (script name, old directory, new directory)

This generates a directory of differential files whose name ends in a timestamp and which keeps the same directory structure as the original directory.

#!/bin/bash

# check if two arguments are given
if [ $# -ne 2 ]; then
  echo "Usage: $0 oldfolder newfolder"
  exit 1
fi

# check if the arguments are valid directories
if [ ! -d "$1" ] || [ ! -d "$2" ]; then
  echo "Invalid directories"
  exit 2
fi

# strip a single trailing '/' from each argument, if present
s1=${1%/}
s2=${2%/}

 


# create a new directory for patch files
patch_dir="patch_$(date +%Y%m%d%H%M%S)"
mkdir -p "$patch_dir"

# sync the new target
find "$s1" -type f | while IFS= read -r oldfile; do
  # get the path of the file relative to the old directory
  rel_path=${oldfile#"$s1"/}
  # get the corresponding file in the new directory
  newfile="$s2/$rel_path"
  # the file exists in old but not in new: create an empty file with the same name in new
  if [ ! -f "$newfile" ]; then
    echo -e "\033[0;36m [missing in new]: $newfile, generating a 0-byte placeholder in the new target \033[0m"
    mkdir -p "$(dirname "$newfile")"
    : > "$newfile"
  fi

done


# sync the old target
find "$s2" -type f | while IFS= read -r newfile; do
  # get the path of the file relative to the new directory
  rel_path=${newfile#"$s2"/}
  # get the corresponding file in the old directory
  oldfile="$s1/$rel_path"
  # the file exists in new but not in old: create an empty file with the same name in old
  if [ ! -f "$oldfile" ]; then
    echo -e "\033[0;36m [missing in old]: $oldfile, generating a 0-byte placeholder in the old target \033[0m"
    # create the parent directory if needed
    mkdir -p "$(dirname "$oldfile")"
    : > "$oldfile"
  fi

done


# generate the patches
find "$s1" -type f | while IFS= read -r oldfile; do

  # get the path of the file relative to the old directory
  rel_path=${oldfile#"$s1"/}
  # get the corresponding file in the new directory
  newfile="$s2/$rel_path"
  # both trees are in sync now; build the patch file name
  patch_file="$patch_dir/$rel_path.patch"
  # create the parent directory if needed
  mkdir -p "$(dirname "$patch_file")"
  # compare checksums so that bsdiff only runs on files that changed
  oldmd5=$(md5sum "$oldfile" | awk '{print $1}')
  newmd5=$(md5sum "$newfile" | awk '{print $1}')

  if [ "$oldmd5" = "$newmd5" ]; then
      echo -e "\033[0;32m Don't Need to Change \033[0m"
  else
      bsdiff "$oldfile" "$newfile" "$patch_file"
      echo -e "\033[0;33mGenerated patch for $rel_path \033[0m"
  fi

done

echo "Done. Patch files are in $patch_dir"

Client (apply difference file)

Invocation:

Arguments: script name, old target, new target, difference directory. The new target can be the same as the old target, which is equivalent to replacing the old target in place.

#!/bin/bash


# check if three arguments are given
if [ $# -ne 3 ]; then
  echo "Usage: $0 oldfolder newfolder patchfolder"
  exit 1
fi

# create the new target directory if it does not exist yet
if [ ! -e "$2" ]; then
  mkdir -p "$2"
fi

# check if the arguments are valid directories
if [ ! -d "$1" ] || [ ! -d "$3" ] ; then
  echo "Invalid directories"
  exit 2
fi

# strip a single trailing '/' from each argument, if present
s1=${1%/}
s2=${2%/}
s3=${3%/}






# loop over every patch file in the patch directory
find "$s3" -type f -name "*.patch" | while IFS= read -r patch_item; do
    temp=${patch_item#"$s3"/}
    temp=${temp%.patch} # same as temp=${temp:0:${#temp}-6}
    oldfile="$s1/$temp"
    newfile="$s2/$temp"
    mkdir -p "$(dirname "$newfile")"
    echo -e "\033[0;32m Generate $oldfile $newfile \033[0m"

    # execute bspatch
    bspatch "$oldfile" "$newfile" "$patch_item"

done

How to use

General calling process:

diff -rq ./old ./new (this lists the file differences before the upgrade)

./gen ./old ./new

./upgrate ./old ./old ./patch_xxx

diff -rq ./old ./new (no output means the update is complete)
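
The final check can be wrapped in a small helper so that a script can act on the result. This is a sketch; the function name `verify_upgrade` is an assumption, and it relies only on the exit status of `diff -rq`, which is 0 when the two trees have identical contents.

```shell
# Print "up to date" when two directory trees have identical contents.
verify_upgrade() {
  if diff -rq "$1" "$2" >/dev/null 2>&1; then
    echo "up to date"
  else
    echo "differs"
  fi
}
```

For example, `verify_upgrade ./old ./new` after running the upgrade script should print "up to date".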

Updated on December 6 (bug fixed)

A bug appeared: symbolic link files were not found, so the differential upgrade could not be applied to existing symbolic links.

The scripts were updated as follows:

  • Fixed the problem of symbolic link files not being found; now all files are found (including symbolic links and hidden files)
  • Empty directories and empty files (0-byte files) created during synchronization are now deleted
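
One way to enumerate regular files, symbolic links, and hidden files in a single pass is sketched below. `find` already descends into hidden entries; `-type f` alone does not match symbolic links, hence the extra `-type l`. The paths are handed to the inner shell as arguments, so names containing spaces survive. The function name `list_entries` is hypothetical, and this is not the approach used in gen.sh below.

```shell
# Print the path, relative to $1, of every regular file and symlink under $1.
list_entries() (
  root=$1
  find "$root" \( -type f -o -type l \) -exec sh -c '
    root=$1
    shift
    for path do
      printf "%s\n" "${path#"$root"/}"
    done' sh "$root" {} +
)
```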

gen.sh

#!/bin/bash


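# name of the index file that records directories waiting to be deleted
wddi="waiting_for_dindex"
# print every symlink and regular file under $1, separated by spaces
# (paths that contain whitespace are not supported)
GetAllfile()
{
        local ret=()
        local links=$(find "$1" -type l)
        local normal_files=$(find "$1" -type f)
        while read -r item
        do
            ret+=($item)
        done <<< "$links"
        while read -r item
        do
            ret+=($item)
        done <<< "$normal_files"
        echo ${ret[@]}
}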


# check if two arguments are given
if [ $# -ne 2 ]; then
  echo "Usage: $0 oldfolder newfolder"
  exit 1
fi

# check if the arguments are valid directories
if [ ! -d "$1" ] || [ ! -d "$2" ]; then
  echo "Invalid directories"
  exit 2
fi

# strip a single trailing '/' from each argument, if present
s1=${1%/}
s2=${2%/}

 


# create a directory for the patch files next to the old directory
patch_dir="$(dirname "$s1")"/patch_"$(echo "$s1" | awk -F'/' '{print $NF}')"
echo "patch_path: $patch_dir"
mkdir -p "$patch_dir"

# the deletion index lives next to the patch directory; create it or empty it
wddi="$(dirname "$s1")"/"$wddi"
: > "$wddi"

# sync the new target
all_files=`GetAllfile "$s1"`
all_files=$(echo -n "$all_files" | tr ' ' '\n')
echo "$all_files" | while read -r oldfile; do
  # skip the empty line produced when the directory has no files
  [ -n "$oldfile" ] || continue
  # get the path of the file relative to the old directory
  rel_path=${oldfile#"$s1"/}
  # get the corresponding file in the new directory
  newfile="$s2/$rel_path"
  # the file exists in old but not in new: create an empty file with the same name in new
  if [ ! -f "$newfile" ]; then
    echo -e "\033[0;36m [missing in new]: $newfile, generating a 0-byte placeholder in the new target \033[0m"

    # record parent directories that do not exist in new so they can be deleted later
    if [ ! -d "$(dirname "$newfile")" ]; then
         mkdir -p "$(dirname "$newfile")"
         echo "$(dirname "$rel_path")" >> "$wddi"
    fi
    # write the 0-byte placeholder
    : > "$newfile"
  fi

done


# sync the old target
all_files=`GetAllfile "$s2"`
all_files=$(echo -n "$all_files" | tr ' ' '\n')
echo "$all_files" | while read -r newfile; do
  # skip the empty line produced when the directory has no files
  [ -n "$newfile" ] || continue
  # get the path of the file relative to the new directory
  rel_path=${newfile#"$s2"/}
  # get the corresponding file in the old directory
  oldfile="$s1/$rel_path"
  # the file exists in new but not in old: create an empty file with the same name in old
  if [ ! -f "$oldfile" ]; then
    echo -e "\033[0;36m [missing in old]: $oldfile, generating a 0-byte placeholder in the old target \033[0m"
    # create the parent directory if needed
    mkdir -p "$(dirname "$oldfile")"
    : > "$oldfile"
  fi

done


# generate the patches
# at this point the old target has the same set of files as the new target
all_files=`GetAllfile "$s1"`
all_files=$(echo -n "$all_files" | tr ' ' '\n')
echo "$all_files" | while read -r oldfile; do
  # skip the empty line produced when the directory has no files
  [ -n "$oldfile" ] || continue
  # get the path of the file relative to the old directory
  rel_path=${oldfile#"$s1"/}
  # get the corresponding file in the new directory
  newfile="$s2/$rel_path"
  # both trees are in sync now; build the patch file name
  patch_file="$patch_dir/$rel_path.patch"
  # create the parent directory if needed
  mkdir -p "$(dirname "$patch_file")"
  # compare checksums so that bsdiff only runs on files that changed
  oldmd5=$(md5sum "$oldfile" | awk '{print $1}')
  newmd5=$(md5sum "$newfile" | awk '{print $1}')

  if [ "$oldmd5" = "$newmd5" ]; then
      echo -e "\033[0;36m Don't Need to Change \033[0m"
  else
      bsdiff "$oldfile" "$newfile" "$patch_file"
      echo -e "\033[0;36mGenerated patch for $rel_path \033[0m"
  fi
done


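# after generating the patch files, delete the placeholder directories listed in the deletion index
if [ ! -f "$wddi" ]; then
    exit 1
fi
while read -r del_item; do
    realpath="$s2/$del_item"
    # ignore empty lines so that the new directory itself is never removed
    if [ -n "$del_item" ] && [ -d "$realpath" ]; then
        echo "$realpath"
        rm -rf "$realpath"
    fi
done < "$wddi"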

echo "Done. Patch files are in $patch_dir"

upgrate.sh

#!/bin/bash

# wdfi="waiting_for_delete"
wddi="waiting_for_dindex"
# check if two arguments are given
if [ $# -ne 2 ]; then
  echo "Usage: $0 oldfolder patch_path"
  echo "example: $0 ./test/old/ ./test/"
  exit 1
fi



# check if the arguments are valid directories
if [ ! -d "$1" ] || [ ! -d "$2" ]; then
  echo "Invalid directories"
  exit 2
fi

# strip a single trailing '/' from the old folder, if present
s1=${1%/}

# strip a single trailing '/' from the parent path of the patch directory, if present
patch_path=${2%/}

wddi="$patch_path/$wddi"
echo wddi : "$wddi"

patch_path="$patch_path"/patch_"$(echo "$s1" | awk -F'/' '{print $NF}')"
echo find patch_path: $patch_path

 

# loop over every patch file in the patch directory
find "$patch_path" -type f -name "*.patch" | while IFS= read -r patch_item; do
    temp=${patch_item#"$patch_path"/}
    temp=${temp%.patch} # same as temp=${temp:0:${#temp}-6}
    oldfile="$s1/$temp"
    newfile="$s1/$temp"
    mkdir -p "$(dirname "$newfile")"
    # a file that is new in this version has no old copy: start from an empty file
    if [ ! -f "$newfile" ]; then
        : > "$newfile"
    fi
    echo -e "\033[0;36m Generate $oldfile $newfile \033[0m"

    # execute bspatch (the old file is upgraded in place)
    bspatch "$oldfile" "$newfile" "$patch_item"
done


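# delete the directories that no longer exist in the new version, as listed in the deletion index
if [ ! -f "$wddi" ]; then
    exit 1
fi
while read -r del_item; do
    realpath="$s1/$del_item"
    # ignore empty lines so that the old directory itself is never removed
    if [ -n "$del_item" ] && [ -d "$realpath" ]; then
        echo "$realpath"
        rm -rf "$realpath"
    fi
done < "$wddi"
echo "done"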

Execution analysis

Run this on the server:

./gen ./test/old ./test/new (old is the old version, new is the new version). This generates patch_old and waiting_for_dindex under test.

Then run this on the client:

./upgrade <old directory> <parent path of the patch directory> (excluding the patch directory name)

Notes and areas of concern:
  1. When generating patch files, the ./gen script first synchronizes the old directory and the new directory, then generates a patch file for each file and places these patch files in a separate directory named patch_<old directory name>. This directory has the same directory structure as the old directory (and the new directory).
  2. In the ./gen script, because of synchronization, a directory that exists in the old version but was deleted in the new version is recreated in the new directory. Such directories are recorded in a list (the wddi file), and the recreated content is deleted on the server side after the patches have been generated.
  3. The ./upgrade script accepts an old directory and the parent path of a patch directory (excluding the patch directory name). It looks in that parent path for the patch directory whose name ends with the old directory's name, then traverses the patch directory and calls bspatch on each patch to perform the upgrade. Finally, the client deletes the directories recorded in the list, together with the 0-byte files that synchronization created inside them.
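
Since the deletion list drives `rm -rf`, it is worth guarding against entries that would remove the root directory or escape from it. The following is a defensive sketch, not the logic used in the scripts above; `delete_listed_dirs` is a hypothetical name, and the list is assumed to hold one relative directory per line.

```shell
# Remove the directories named in a list file, relative to a root.
#   $1 = root directory, $2 = list file (one relative directory per line)
delete_listed_dirs() (
  root=$1
  while IFS= read -r item; do
    case $item in
      # skip empty lines, ".", absolute paths, and anything containing ".."
      ''|.|/*|..|../*|*/..|*/../*) continue ;;
    esac
    if [ -d "$root/$item" ]; then
      rm -rf "$root/$item"
    fi
  done < "$2"
)
```

An empty line is the most dangerous case: without the guard it would expand to the root directory itself.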