Split jobs (to save memory) after x million entries and realtime sync

English Support for Syncovery on Windows.
ruudelux
Posts: 9
Joined: Tue Oct 12, 2021 7:34 am

Split jobs (to save memory) after x million entries and realtime sync

Post by ruudelux »

I have to copy/mirror about 13 million files and 8000 folders to another drive.
I want to first run the profile once in attended mode, with the option "Cache Destination File list/Re-read the destination file list completely after every" set to 1 and "Next re-reading will happen after" set to 0.
The idea is to build up the file-list database during that attended run, then disable the option "Cache Destination File list/Re-read the destination file list completely after every" and enable real-time synchronization, so that real-time sync will not have to rebuild the whole file-list database every time it detects a file change.

However, when I used attended mode, I noticed that the file database became too big (8 GB memory limit), so I had to enable the "Split jobs (to save memory)" option.
This worked; however, real-time synchronization gave an error after I enabled it.
I am just wondering: if the file-list database has been built up in 3 parts using the "Split Jobs" option, will it actually contain the full list of files and folders, or just the last part?

If it contains only the last part, then I guess real-time synchronization will not be able to use that file-list database.

tobias
Posts: 1589
Joined: Tue Mar 31, 2020 7:37 pm

Re: Split jobs (to save memory) after x million entries and realtime sync

Post by tobias »

Hello,
even for split jobs, the database will contain all files.

Normally you don't need to worry about these things, as Syncovery handles them automatically. Your plan will work even if you don't take the special steps you mentioned.

If there is no database, it will do a normal scan and add all entries into the database. You don't need to set "Re-read destination file list" for that purpose.

The 8GB limit is not the database. It's just the in-memory file list. Splitting the job is not a problem, but you can also increase the memory limit by adding the following line to the [Main] section of Syncovery.ini:
AbsMaxMemoryMB=24000

(for example)
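
For reference, the relevant fragment of Syncovery.ini would then look like this (24000 MB is just an example value; adjust it to the machine's available RAM):

```ini
; Syncovery.ini
[Main]
; Raise the in-memory file list limit (value in MB; example only)
AbsMaxMemoryMB=24000
```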

ruudelux
Posts: 9
Joined: Tue Oct 12, 2021 7:34 am

Re: Split jobs (to save memory) after x million entries and realtime sync

Post by ruudelux »

So basically, I just need to enable "Real Time Synchronization" and disable the option "Cache Destination File list/Re-read the destination file list completely after every"?

tobias
Posts: 1589
Joined: Tue Mar 31, 2020 7:37 pm

Re: Split jobs (to save memory) after x million entries and realtime sync

Post by tobias »

Yes, that will be fine.

However, please do the first complete run in attended, unattended, or even background mode, so that all files are copied and scanned initially and the database is initialized.

ruudelux
Posts: 9
Joined: Tue Oct 12, 2021 7:34 am

Re: Split jobs (to save memory) after x million entries and realtime sync

Post by ruudelux »

Does the Unattended mode also need to be done in parts? Or does the split option only apply to the attended mode?

tobias
Posts: 1589
Joined: Tue Mar 31, 2020 7:37 pm

Re: Split jobs (to save memory) after x million entries and realtime sync

Post by tobias »

It applies to unattended mode as well, but there are sometimes fewer parts in unattended mode because files that don't need to be copied aren't kept in the list.

ruudelux
Posts: 9
Joined: Tue Oct 12, 2021 7:34 am

Re: Split jobs (to save memory) after x million entries and realtime sync

Post by ruudelux »

OK, I tried the unattended option, but for some reason it seems to be running out of memory even though the split option is set to 9 million items. Why does it say it is splitting after 9 million entries when it has already scanned more than 10 million entries? It seems to be ignoring the split configuration.

Left Side Volume Name: data3

9:26:58 AM.422: Opening Database: C:\ProgramData\Syncovery\Database\F_to_Objectstore.x.nl.syncfdb
9:26:58 AM.485: Connecting With Database
9:26:58 AM.549: Database Connected
9:26:58 AM.581: Database Opening Complete
Connecting with fschijf@objectstore.x.nl
Using OpenSSL 1.1.1i 8 Dec 2020: libssl-1_1-x64.dll, libcrypto-1_1-x64.dll

TLS Certificate verified successfully.
This server does not support empty folders.
Copying Direction : Left To Right
Exact Mirror Mode is being used.

DatabaseProbablyOutdated: FALSE, PrevPartWasNotUsingCache: FALSE
Not using cache for scanning destination.
Using reference database in file C:\ProgramData\Syncovery\Database\F_to_Objectstore.x.nl.syncfdb (0 Folders, 0 Entries, DB size: 2,928 kB)

Scanning Folders Started At 10/13/2021 9:26:58 AM
Still Scanning Folders At 10/13/2021 9:36:59 AM, Entry Count:4997588, splitting after 9000000 entries, using 4,503 MB of max 12 GB RAM
Still Scanning Folders At 10/13/2021 9:46:59 AM, Entry Count:10418821, splitting after 9000000 entries, using 8,572 MB of max 12 GB RAM


Stopping because memory usage is over 12 GB

Usage now: 12583676032 / 12583676032

See memory report in C:\ProgramData\Syncovery\Logs\F_to_Objectstore.x.nl - 2021-10-13 09.26.58.logmem.txt

tobias
Posts: 1589
Joined: Tue Mar 31, 2020 7:37 pm

Re: Split jobs (to save memory) after x million entries and realtime sync

Post by tobias »

Just set it to 4 million.

The splitting value is more of a hint, not necessarily precise.

ruudelux
Posts: 9
Joined: Tue Oct 12, 2021 7:34 am

Re: Split jobs (to save memory) after x million entries and realtime sync

Post by ruudelux »

I already tried going down to 2 million; it still seems to be ignoring the split value:
Log for profile F_to_Objectstore.x.nl, started at 10:40:24 AM on 10/13/2021

LEFT: F:\
RIGHT: S3://fschijf@objectstore.x.nl

The profile is run in attended mode.

Previous run was at 10/13/2021 10:34:36 AM: INCOMPLETE: 222 copied (139.1MB) of 1984506

Left Side Volume Name: data3

10:40:24 AM.966: Opening Database: C:\ProgramData\Syncovery\Database\F_to_Objectstore.x.nl.syncfdb
10:40:24 AM.971: Connecting With Database
10:40:25 AM.34: Database Connected
10:40:25 AM.39: Database Opening Complete
Connecting with fschijf@objectstore.x.nl
Using OpenSSL 1.1.1i 8 Dec 2020: libssl-1_1-x64.dll, libcrypto-1_1-x64.dll

TLS Certificate verified successfully.
This server does not support empty folders.
Copying Direction : Left To Right
Exact Mirror Mode is being used.

Not using cache for scanning destination.
Using reference database in file C:\ProgramData\Syncovery\Database\F_to_Objectstore.x.nl.syncfdb (10 Folders, 231 Entries, DB size: 2,928 kB)

Scanning Folders Started At 10/13/2021 10:40:25 AM
Still Scanning Folders At 10/13/2021 10:50:25 AM, Entry Count:5036657, splitting after 2000000 entries, using 4,538 MB of max 12 GB RAM
Still Scanning Folders At 10/13/2021 11:00:25 AM, Entry Count:11052007, splitting after 2000000 entries, using 9,122 MB of max 12 GB RAM


Stopping because memory usage is over 12 GB

Usage now: 12583676032 / 12583676032

See memory report in C:\ProgramData\Syncovery\Logs\F_to_Objectstore.x.nl - 2021-10-13 10.40.24.logmem.txt

tobias
Posts: 1589
Joined: Tue Mar 31, 2020 7:37 pm

Re: Split jobs (to save memory) after x million entries and realtime sync

Post by tobias »

Possibly the splitting is not possible with S3-compatible storage unless you remove the checkmark "Recursive Listing" from the second tab sheet of the Internet dialog.

Can you let me know which S3 provider or software it is? Maybe I can improve compatibility.
