2020 bzfilelist executable

(written 8/16/2020)

by Brian Wilson


 

    bzfilelist - walks the entire file system on the SSD on the customer's laptop looking for new and changed files to back up.  bzfilelist is always launched by bzserv.  bzfilelist creates lists of files, but absolutely does not transmit them anywhere.  See this parent 2020 Backblaze Personal Backup architecture page for terminology, and some context for what this VERY SPECIFIC web page is about.

     

    NOTE: this page is currently a repeat of the content on the parent page mentioned above.  THIS PAGE IS A PLACEHOLDER that BrianW needs to fill out even more.

     

    bzfilelist completely lacks the ability to do network HTTPS communication.  It profoundly cannot do anything but create lists of files for other executables to consume.  bzfilelist has no UI components, and is therefore largely cross platform between Windows and Macintosh.  bzfilelist runs as the user "SYSTEM" on Windows, and as the user "root" on Macintosh, because it is always launched by the parent process "bzserv" (see above).
     
    Location on Disk Windows: C:\Program Files (x86)\Backblaze\bzfilelist.exe
    Location on Disk Macintosh: /Library/Backblaze.bzpkg/bzfilelist
     
    Purpose of bzfilelist: The primary purpose of "bzfilelist" is to create the complete list of filenames with associated modification dates for each attached SSD or hard drive.  Each SSD's (each "volume's") list of files is stored in a separate file.  These lists of filenames with their last modification dates are found at C:\ProgramData\Backblaze\bzdata\bzfilelists\ on Windows, and /Library/Backblaze.bzpkg/bzdata/bzfilelists/ on the Macintosh.

    The name of each list of files starts with the BzVolumeGuid.  For the primary boot (system) volume the BzVolumeGuid begins with the letters "v000", then subsequent drives start with "v001" and then "v002" and so on.  Here is an example of the list's filename from Windows: C:\ProgramData\Backblaze\bzdata\bzfilelists\v000c0101f6fb58de90a713a0e19_c____filelist.dat and you can open that with WordPad on Windows, or TextEdit on the Macintosh.  The name "v000c0101f6fb58de90a713a0e19_c____filelist.dat" always starts with the "volume guid", then has an underbar, then a friendly description of the drive (in my example above this is "_c____" to indicate this is the "C:\" Windows drive; on Macintosh the system boot drive would have the string "_root_"), then always ends with "filelist.dat".

    These "per drive lists of files" are produced approximately once per hour, but it might be once every two hours, or even longer for customers with extremely large volumes.  There is a guarantee that the list of files with the name above is ALWAYS VALID and ALWAYS PRESENT for other programs to read and use, but it might be 1 or 2 hours "out of date" waiting for the next list of files to be produced.  While bzfilelist is producing a new list of files, the new INCOMPLETE list has the same name with "_future" appended at the end of it.
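
    The naming rules above can be sketched in a few lines of Python.  This is only an illustration, not Backblaze's code: the function and type names are made up, and it assumes the volume guid itself never contains an underbar and that the three digits after the "v" are a decimal drive index.

```python
from typing import NamedTuple, Optional

SUFFIX = "filelist.dat"   # every list name ends with this (from the text above)
FUTURE = "_future"        # appended while a new list is still being written


class FileListName(NamedTuple):
    volume_guid: str      # e.g. "v000c0101f6fb58de90a713a0e19"
    description: str      # friendly drive description, e.g. "_c____"
    volume_index: int     # 0 for "v000" (boot volume), 1 for "v001", ...
    in_progress: bool     # True for the incomplete "_future" list


def parse_filelist_name(name: str) -> Optional[FileListName]:
    """Split a bzfilelists file name into its parts, or return None."""
    in_progress = name.endswith(FUTURE)
    if in_progress:
        name = name[: -len(FUTURE)]
    if not name.endswith(SUFFIX):
        return None
    stem = name[: -len(SUFFIX)]
    # ASSUMPTION: the guid has no underbar, so the first one ends the guid.
    guid, sep, rest = stem.partition("_")
    if not sep or len(guid) < 4 or guid[0] != "v" or not guid[1:4].isdigit():
        return None
    return FileListName(guid, "_" + rest, int(guid[1:4]), in_progress)
```

    For the example name above this yields the guid "v000c0101f6fb58de90a713a0e19", the description "_c____", and drive index 0.  A reader that only wants finished lists would skip any name where in_progress is true.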

    Inside one of these lists, named things like "v000c0101f6fb58de90a713a0e19_c____filelist.dat", the very first line records when that list of files was created.  It looks like:
    # GmtMillisThisListWasStarted: 00000173f855ec2b, GmtDateTime: 20200816173407
    Those are actually the identical date and time.  The first one is the number of milliseconds since 1970, written in hexadecimal, and the second one is human readable (in GMT) and says it is year "2020", month "08", day "16", then hours "17", minutes "34", and seconds "07".
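
    To see that the two values really are the same moment, here is a small Python sketch that parses the header line shown above.  The function name and the regular expression are my own illustration, not anything shipped in the product.

```python
import re
from datetime import datetime, timezone

# Matches the first line of a filelist, as shown in the example above.
HEADER_RE = re.compile(
    r"^# GmtMillisThisListWasStarted: ([0-9a-fA-F]+), GmtDateTime: (\d{14})$"
)


def parse_header(line: str):
    """Return (milliseconds since 1970, GMT datetime) from the header line."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError("not a filelist header line")
    millis = int(match.group(1), 16)  # the first value is hexadecimal
    stamp = datetime.strptime(match.group(2), "%Y%m%d%H%M%S")
    return millis, stamp.replace(tzinfo=timezone.utc)


millis, stamp = parse_header(
    "# GmtMillisThisListWasStarted: 00000173f855ec2b, GmtDateTime: 20200816173407"
)
# Drop the milliseconds and convert back to a date for comparison.
from_millis = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
print(millis)                # 1597599247403
print(from_millis == stamp)  # True
```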

    After that first line, the rest of the contents are pretty self explanatory.  The first letter on each line is an "f" for a file, followed by a <tab> character, then the last modified timestamp (in milliseconds since 1970), then another <tab> character, then the number of bytes contained in the file, then another <tab> character, then the filename in completely pure (non-encoded) UTF-8.  When the character '\n' (end of line) is encountered, that marks the end of that one filename.  Because this UTF-8 is not escaped or encoded in any other way, reading it is extremely fast: there is no encode or decode step.
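
    A reader for these lines can be very small.  The Python sketch below is only an illustration with made-up names.  It keeps the timestamp and size as raw text because this page does not say whether those fields are written in decimal or hexadecimal, and it splits on only the first three tabs so the filename is taken whole.

```python
from typing import Iterable, Iterator, NamedTuple


class FileRecord(NamedTuple):
    kind: str          # "f" for a file
    mtime_field: str   # last modified, milliseconds since 1970 (raw text)
    size_field: str    # number of bytes in the file (raw text)
    path: str          # filename, plain UTF-8


def parse_record(raw: bytes) -> FileRecord:
    """Parse one '\\n'-terminated line of a filelist."""
    text = raw.decode("utf-8").rstrip("\n")
    # Split on the first three tabs only; everything after is the filename.
    kind, mtime, size, path = text.split("\t", 3)
    return FileRecord(kind, mtime, size, path)


def iter_records(lines: Iterable[bytes]) -> Iterator[FileRecord]:
    """Yield records, skipping the '#' header line and blank lines."""
    for raw in lines:
        if raw.startswith(b"#") or not raw.strip():
            continue
        yield parse_record(raw)
```

    Opening the list in binary mode and passing the file object to iter_records walks it one line at a time.  Any values you feed it for testing, such as a path like C:\Users\example\notes.txt, are hypothetical and not taken from a real list.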

    Resource Load bzfilelist puts on customer laptop: bzfilelist only runs for maybe 10 minutes once an hour on most customers' laptops.  It is designed to use less than 1% of one core of CPU (bzfilelist is one single thread), and less than 1% extra load on the SSD.  While it is running, bzfilelist might use about 20 MBytes of RAM or less (0.25% of an 8 GByte RAM computer - one fourth of 1% of the customer RAM).
     
     

All done.
